session_1.Rmd

---
title: "Session 1"
author: "Antoine Vernet"
subtitle: Statistics and Programming with R
output:
  ioslides_presentation:
    css: style.css
    keep_md: true
  beamer_presentation:
    slide_level: 2
---

```{r setup, echo = FALSE}
library(knitr)
```


# 

```{r, echo = FALSE, out.width = '100%', fig.retina = NULL, fig.align = 'center'}
knitr::include_graphics("./img/bridge.jpg")
```


## Data Spark

Within Imperial Business Analytics, we are running [Data Sparks](https://www.imperialbusinessanalytics.co.uk/data-spark/) which are a great way of getting hands-on experience with real-life data analytics projects.

There will be an information session later in the year.


## Important information

There are three parts to the evaluation:

- Exam: 50%
- Group project: 30%
- Assignments: 20%

Good practice:

- When submitting code, please put your name in the file name, so I can keep track of your submission easily.
- Code should be in Rmd files that compile, remember to use `echo = TRUE` in your code chunks


## Lesson plan
Today we will cover:

- Fundamentals of linear algebra
    + Matrix algebra
    + Systems of equations
    + Further topics in matrix algebra


You will have covered most of those things during highschool, so you should be pretty familiar with everything.

## Definitions

- Algebra: the study of mathematical symbol and the rules to manipulate them

- Linear algebra: the study of vector spaces and linear mapping between those spaces

- Calculus: the study of change

- Geometry: the study of shape, size and relative position

(Source: Wikipedia)


# Matrix algebra


## What is a matrix?

> - A matrix is a rectangular array of numbers.
> - An $m \times n$ matrix has $m$ rows and $n$ columns.

$${\bf A} = [a_{ij}] = 
\begin{bmatrix} 
    a_{11} & a_{12} & a_{13} & \dots  & a_{1n} \\
    a_{21} & a_{22} & a_{23} & \dots  & a_{2n} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    a_{m1} & a_{m2} & a_{m3} & \dots  & a_{mn}
\end{bmatrix}$$

> - $i$ is the index for rows, $j$ is the index for columns.

## Definition: square matrix

- A square matrix has the same number of rows and columns (i.e $n \times n$) 

$${\bf A} = [a_{ij}] = 
\begin{bmatrix} 
    a_{11} & a_{12} & \dots  & a_{1n} \\
    a_{21} & a_{22} & \dots  & a_{2n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{n1} & a_{n2} & \dots  & a_{nn}
\end{bmatrix}$$

## Definition: row  and column vector

- A row vector is a $1 \times m$ matrix, it can be written as ${\bf x} \equiv (x_1, x_2, ..., x_m)$ 

- A column vector is a $n \times 1$ matrix, it can be written as $${\bf y} \equiv \begin{bmatrix}
        y_1 \\
        y_2 \\
        \vdots\\
        y_n \\
        \end{bmatrix}$$ 

## Definition: diagonal matrix

A square matrix ${\bf A}$ is a diagonal matrix if all the elements off diagonal are 0 ($a_{ij} = 0$, for all $i \neq j$).

$${\bf A} = [a_{ij}] = 
\begin{bmatrix} 
    a_{11} & 0 & \dots  & 0 \\
    0 & a_{22} & \dots  & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \dots  & a_{nn}
\end{bmatrix}$$


## Definition: identity and zero matrix

The $n \times n$ __identity matrix__, denoted ${\bf I}$ (or ${\bf I_n}$)  is the diagonal matrix with unity in each diagonal position and zero elsewhere:

$${\bf I} = 
\begin{bmatrix} 
    1 & 0 & \dots  & 0 \\
    0 & 1 & \dots  & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \dots  & 1
\end{bmatrix}$$

The $m \times n$ __zero matrix__, denoted ${\bf 0}$ is the $m \times n$ matrix with zero for all entries, this need not be a square matrix.

## Matrix addition

Two matrices of the same dimensions, for example two $m \times n$ matrices can be added element by element: ${\bf A} + {\bf B} = [a_{ij} + b_{ij}]$.

$${\bf A} + {\bf B} =
\begin{bmatrix} 
    a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} & \dots  & a_{1n} + b_{1n} \\
    a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} & \dots  & a_{2n} + b_{2n} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    a_{m1} + b_{m1} & a_{m2} + b_{m2} & a_{m3} + b_{m3} & \dots  & a_{mn} + b_{mn}
\end{bmatrix}$$

## Scalar multiplication

Given any real number $\gamma$, __scalar multiplication__ is defined as $\gamma{\bf A} \equiv [\gamma a_{ij}]$ or:

$$\gamma{\bf A} = [\gamma a_{ij}] = 
\begin{bmatrix} 
    \gamma a_{11} & \gamma a_{12} & \gamma a_{13} & \dots  & \gamma a_{1n} \\
    \gamma a_{21} & \gamma a_{22} & \gamma a_{23} & \dots  & \gamma a_{2n} \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    \gamma a_{m1} & \gamma a_{m2} & \gamma a_{m3} & \dots  & \gamma a_{mn}
\end{bmatrix}$$

## Matrix multiplication

In order for matrix multiplication of matrix ${\bf A}$ and ${\bf B}$ to be possible, the column dimension of ${\bf A}$ must equal the row dimension of ${\bf B}$. If ${\bf A}$ is an $m \times n$ matrix and ${\bf B}$ an $n \times p$ matrix, matrix multiplication is defined as:
$$
{\bf AB} = \left[\sum\limits_{k = 1}^n a_{ik} b_{kj}\right]
$$

for example:

$$
\begin{bmatrix} 
    2 & -1 & 0 \\
    -4 & 1 & 0
\end{bmatrix}
\begin{bmatrix} 
    0 & 1 & 6 & 0 \\
    -1 & 2 & 0 & 1 \\
    3 & 0 & 0 & 0
\end{bmatrix} = \begin{bmatrix} 
    1 & 0 & 12 & -1 \\
    -1 & -2 & -24 & 1
\end{bmatrix}
$$


## Transpose

Let ${\bf A} = [a_{ij}]$ be an $m \times n$ matrix. The transpose of $\bf A$ denoted $\bf A'$ is the $n \times m$ matrix obtained by interchanging the rows and columns of $\bf A$, so ${\bf A'} \equiv [a_{ji}]$.
For example:
$$
{\bf A} = \begin{bmatrix} 
    1 & 0 & 12 & -1 \\
    -1 & -2 & -24 & 1
\end{bmatrix}, \, 
{\bf A'} = \begin{bmatrix} 
    1 & -1 \\
    0 & -2 \\
    12 & -24\\
    -1 & 1
\end{bmatrix}
$$

```{r, echo = FALSE, eval = FALSE} 
## Partitioned matrix multiplication
```

## Properties of transpose

- $(A')' = A$
- If $A$ and $B$ are conformable, $(A + B)' = A' + B'$ and $(A - B)' = A' - B'$
- $(AB)' = B'A'$ and $(ABC)' = C'B'A'$

## Symmetric matrix

A symmetric matrix is a matrix which is it's own transpose:

$$A = A'$$

The product of a matrix by its transpose always yields a symmetric matrix:
$$
Q = AA'\\
Q' = (AA')' = (A')'A' = AA' = Q \\
$$

## Trace

The trace of a matrix is defined for square matrices. 
For any $n \times n$ matrix $\bf A$, the trace of this matrix, tr$\left(\bf A\right)$ is the sum of its diagonal elements: 

$$
\mathrm{tr}({\bf A}) = \sum\limits_{i = 1}^n a_{ii}
$$

## Inverse

An $n \times n$ matrix ${\bf A}$ has an inverse, denoted ${\bf A^{-1}}$, provided that ${\bf A^{-1}A} = {\bf I_n}$ and ${\bf AA^{-1}} = {\bf I_n}$. In this case ${\bf A}$ is said to be _invertible_ or _nonsingular_, otherwise it is said to be _noninvertible_ or _singular_.

## Linear independence

Let $\{{\bf x_1}, {\bf x_2}, ..., {\bf x_r}\}$ be a set of $n \times 1$ vectors. Those are linearly independent vectors if and only if, 
$${ \alpha_1 {\bf x_1} + \alpha_2 {\bf x_2} + ... + \alpha_r {\bf x_r} = {\bf 0} }$$
implies that ${\alpha_1 = \alpha_2 = ... = \alpha_r = 0}$

In case the set of vector is linearly dependent it means that at least one of the vectors can be written as a linear combination of the others.

## Rank

The rank of a matrix ${\bf A}$ is the maximum number of linearly independent columns of $\bf A$.

For example, 

$$
\begin{bmatrix} 
    1 & 20 \\
    5 & 100 \\
    3 & 60\\
    0 & 0
\end{bmatrix}
$$

> - can, at most, be of rank 2. Its rank is one because the second columns is 20 times the first one.

# Matrices in R

## A first example

```{r}
A <- matrix(data = c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = FALSE)

A
```

Scalar multiplication

```{r}
A * 2
```

## A second example

```{r}
B <- matrix(data = c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = FALSE)

B

A %*% B
```

## A third example

```{r}
C <- matrix(data = c(1, 7, 4, 6, 8, 2, 4, 9, 0), nrow = 3, byrow = FALSE)

Transpose
t(C)

Inverse
solve(C)
```

# Systems of equations

## A simple system

$$
\begin{cases} x + y + z = 2  \quad \qquad\qquad(1)\\ 5x - 8y + 2z = 2 \ \ \quad \qquad(2)\\ -2x + 3y - 5z = -3 \qquad (3)\end{cases}
$$

Solve by addition (or method of elimination) and substitution.

First step, add $2 \times (1)$ to $(3)$ and add $-5 \times (1)$ to $(2)$.

$$
\begin{cases} x + y + z = 2 \\ 
              - 13y - 3z = -8\\
              5y - 3z = 1
\end{cases}
$$


## Solving a system of equations (cont'd)


Second step, it looks like we can eliminate $z$ easily by substracting $(3)$ in $(2)$

$$
\begin{cases} x + y + z = 2 \\ 
              y = \frac{-8 + 3 z}{-13}\\
              5y - 3z = 1
\end{cases}
$$

You can then replace in $(3)$

$$
5 \left(\frac{-8 + 3 z}{-13}\right) - 3z = 1
$$

$$
z = \frac{1}{2}
$$


## Solving a system of equations (cont'd 2)

$$
\begin{cases} x + y + z = 2 \\ 
              y = \frac{-8 + 3 z}{-13}\\
              z = \frac{1}{2}
\end{cases}
$$

You can then replace  $z$ in $(2)$

$$ y = \frac{1}{2}$$

And finally we can replace in $(1)$ to obtain

$$x = 1$$

## Solving a system of equation: the matrix method

It is very similar to the previous method

Write down the system of equation as a matrix

$$\begin{bmatrix}
1 & 1 & 1 & | & 2\\
5 & -8 & 2 & | & 2\\
-2 & 3 & -5 & | & -3
\end{bmatrix}
$$

What we want to do is to transform the previous matrix into the following one:

$$\begin{bmatrix}
1 & 0 & 0 & | & a\\
0 & 1 & 0 & | & b\\
0 & 0 & 1 & | & c
\end{bmatrix}
$$

where $a$, $b$ and $c$ are the roots of the system.

## Solving the system

The first row of the first column is already where we want it with $x = 1$. Let's make the second and third row of the first column 0 by substracting $5 \times$ row 1 into row 2 and adding $2 \times$ row 1 into row 3.

$$\begin{bmatrix}
1 & 1 & 1 & | & 2\\
0 & -13 & -3 & | & -8\\
0 & 5 & -3 & | & 1
\end{bmatrix}
$$

Let's now make the second row of the second column 1 by dividing row 2 by -13.

$$\begin{bmatrix}
1 & 1 & 1 & | & 2\\
0 & 1 & \frac{3}{13} & | & \frac{8}{13}\\
0 & 5 & -3 & | & 1
\end{bmatrix}
$$

## 

Now let's make the third row of the second column of row 1 and row 3 zeros using the new row 2 we created (we substract row 2 from row 1 and substract $5 \times$ row 2 from row 3).

$$\begin{bmatrix}
1 & 0 & \frac{10}{13} & | & \frac{18}{13}\\
0 & 1 & \frac{3}{13} & | & \frac{8}{13}\\
0 & 0 & -\frac{54}{13} & | & -\frac{27}{13}
\end{bmatrix}
$$

## 

Now let's make row 3 of column 3 1 by multiplying it by $-\frac{13}{54}$

$$\begin{bmatrix}
1 & 0 & \frac{10}{13} & | & \frac{18}{13}\\
0 & 1 & \frac{3}{13} & | & \frac{8}{13}\\
0 & 0 & 1 & | & \frac{1}{2}
\end{bmatrix}
$$

Now we can replace create a new row 1 and 2 eliminating the values in the third column by substracting $\frac{3}{13} \times$ row 3 from row 2 and substracting $\frac{10}{13} \times$ row 3 from row 1.

## Solution

$$\begin{bmatrix}
1 & 0 & 0 & | & 1\\
0 & 1 & 0 & | & \frac{1}{2}\\
0 & 0 & 1 & | & \frac{1}{2}
\end{bmatrix}
$$


## Solving equations in R


```{r}
a <- matrix(data = c(1, 1, 1, 5, -8, 2, -2, 3, -5), 
            nrow = 3, byrow = TRUE)

b <- c(2, 2, -3)

a

solve(a, b)

```


# Further topics in matrix algebra

## Determinant

The determinant is defined for square matrices and is a scalar number.

Let $A = a_{ij}$ be a square matrix, with $i, j = 1,2, ..., n$

When $n = 2$, $$\det(A) = \begin{bmatrix}
a_{11} & a_{12}\\
a_{21} & a_{22}
\end{bmatrix} = a_{11}a_{22} - a_{21}a_{12}$$

Things get more complicated for larger matrix, thankfully for us, `R` has a function for it.

## Determinant (example)

```{r}
a <- matrix(data = c(5, 3, 2, 7), 
            nrow = 2, byrow = TRUE)
a 
det(a)

```


## Determinant (example 2)

```{r}
a <- matrix(data = c(5, 3, 2, 7, 9, 13, 15, 4, 6), 
            nrow = 3, byrow = TRUE)
a 
det(a)

```

## Determinant: properties

1. $\det(A) = \det(A')$
2. $\det(AB) = \det(A) \det(B)$ if $A$ and $B$ are square matrices of the same size
3. If $B = kA$, $\det(B) = k^n \det(A)$ for $n \times n$ matrices
4. $\det(A^{-1}) = \frac{1}{\det(A)} = \det(A)^{-1}$
5. the determinant of a triangular matrix is the product of its diagonal elements
6. $\det(I_n) = 1$ where $I_n$ is the identity matrix of dimension $n$
7. the determinant is a product of eigenvalues: $\det (A) = \prod_{i=1}^n \lambda_i(A)$
8. if $\det (A) \neq 0$ the rows and columns of $A$ are linearly independent

## Eigenvalue and eigenvector

If $A$ is an $n \times n$ square matrix, the eigenvalue $\lambda$ and the eigenvector $x$ of dimension $n \times 1$ are defined by:

$$ Ax = \lambda x
$$

with $x \neq 0$

This requires:

$$
(A - \lambda I_n)x = 0
$$
which leads to:

$$
\det(A - \lambda I_n) = 0
$$

## Example

If 
$$A = \begin{bmatrix} 4 & 5\\ 6 & 9 \end{bmatrix}$$ 

then 

$$(A - \lambda I_n) = \begin{bmatrix} 4 - \lambda & 5\\ 6 & 9 - \lambda \end{bmatrix}$$

so $$\det(A - \lambda I_n) = (4 - \lambda) (9 - \lambda) - 30 = 0$$

we have two eigenvalues that are the roots of this equation: $0.4792$ and $12.5208$, rounded to the 4^th^ decimal.


## Eigenvectors

Using the previous result, we can now calculate values for $x$ because:

$$
(A - \lambda I_n) x = \begin{bmatrix} (4 - \lambda) x_1 + 5 x_2\\ 
6 x_1 + (9 - \lambda) x_2
\end{bmatrix} = B x = \begin{bmatrix} 0\\ 
0
\end{bmatrix}
$$

However, there is no unique solution for this system.
It is usual to require that the vector $x$ lie on the unit circle in order to circumvent the non-uniqueness issue.

There will be as many eigenvectors as there are eigenvalues. 


## Positive definite and positive semi-definite matrices

we first need to define the quadratic form of a matrix:

Let $A$ be an $n \times n$ matrix, the quadratic form is the function defined for all $n \times 1$ vectors $x$:

$$
f(x)= x'Ax = \sum_{i=1}^n a_{ii} x_i^2 + 2\sum_{i=1}^n \sum_{j>i}^n a_{ij} x_i x_j
$$

## Positive definite and positive semi-definite matrices

A symmetric matrix A is positive definite if: 
$$x' A x > 0$$ for all $n \times 1$ vectors $x$ except $x = 0$

A symmetric matrix A is positive semi-definite if:
$$x' A x \geq 0$$ for all $n \times 1$ vectors


# {.flexbox .vcenter}

![](img/fin.png)