{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear Systems\n", "\n", "## Overview, Objectives, and Key Terms\n", " \n", "In this lecture, we consider the linear system $\\mathbf{Ax}=\\mathbf{b}$, where $\\mathbf{A}$ is a square $n\\times n$ matrix, while both $\\mathbf{x}$ and $\\mathbf{b}$ are vectors of $n$ elements. We'll examine the basic *elimination* algorithm and review how to set up and solve such systems with NumPy.\n", "\n", "### Objectives\n", "\n", "By the end of this lesson, you should be able to\n", "\n", "- Explain what it means for $\\mathbf{Ax}=\\mathbf{b}$ to have a solution.\n", "- Define $\\mathbf{Ax}=\\mathbf{b}$ from engineering or other inputs.\n", "- Solve linear systems using basic Gaussian elimination.\n", "- Solve linear systems using NumPy.\n", "\n", "### Key Terms\n", "\n", "- matrix\n", "- vector\n", "- matrix-vector product\n", "- column space\n", "- Gaussian elimination\n", "- `np.linalg.solve`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What Is A Linear System, and When Can It Be Solved?\n", "\n", "From an [earlier reading](ME400_Lecture_4.ipynb), the basic mechanics of vectors, matrices, and their products was reviewed using NumPy arrays. In addition, a very quick glance at solving small system was given. Here, it helps to review some fundamentals in a bit more depth before heading on to solving systems by elimination.\n", "\n", "Consider the matrix-vector product $w=\\mathbf{Av}$. To illustrate, let element $(i, j)$ of $\\mathbf{A}$ be $a_{ij}$ and element $i$ of the vector $\\mathbf{v}$ be $v_i$. Then, for the $3\\times 3$ case, we have\n", "\n", "$$\n", "\\mathbf{Av} = \n", "\\begin{bmatrix} \n", " a_{00} & a_{01} & a_{02} \\\\\n", " a_{10} & a_{11} & a_{12} \\\\\n", " a_{20} & a_{21} & a_{22} \\\\\n", "\\end{bmatrix} \n", "\\times\n", "\\left[ \\begin{array}{c} \n", " v_0 \\\\ \n", " v_1 \\\\\n", " v_2\n", " \\end{array} \\right] =\n", "\\left[ \\begin{array}{c} \n", " a_{00} v_0 + a_{01} v_1 + a_{02} v_2 \\\\ \n", " a_{10} v_0 + a_{11} v_1 + a_{12} v_2 \\\\ \n", " a_{20} v_0 + a_{21} v_1 + a_{22} v_2 \\\\ \n", " \\end{array} \\right] \n", "= \\mathbf{w} \\, .\n", "$$\n", "\n", "There are *two ways* to interpret the matrix-vector product $\\mathbf{w}$. The common way is to note that\n", "\n", "$$\n", " w_i = a_{i0} v_0 + a_{i1} v_1 + a_{i2} v_2 = \\mathbf{a}_{i:}^T \\mathbf{v} = \\mathbf{a}_{i:}\\cdot \\mathbf{v} \\, .\n", "$$\n", "\n", "Here, I'm leveraging the idea of *slicing*, so that $\\mathbf{a}_{i:}$ represents the $i$th row of the original matrix $\\mathbf{A}$. Viewed this way, the matrix-vector product is actually a series of dot products. Specifically, each element of $\\mathbf{w}$ is the dot product of one row of $\\mathbf{A}$ with the vector $\\mathbf{v}$, just like in the [previous reading](ME400_Lecture_4.ipynb). \n", "\n", "However, note the following way to represent $\\mathbf{w}$:\n", "\n", "$$\n", " \\mathbf{w} = \n", "v_0 \\left[ \\begin{array}{c} \n", " a_{00} \\\\ \n", " a_{10} \\\\\n", " a_{20}\n", " \\end{array} \\right] +\n", "v_1 \\left[ \\begin{array}{c} \n", " a_{01} \\\\ \n", " a_{11} \\\\\n", " a_{21}\n", " \\end{array} \\right] +\n", "v_2 \\left[ \\begin{array}{c} \n", " a_{02} \\\\ \n", " a_{12} \\\\\n", " a_{22}\n", " \\end{array} \\right] \n", "= v_0 \\mathbf{a}_{:0} + v_1\\mathbf{a}_{:1} + v_2\\mathbf{a}_{:2} \\, .\n", "$$\n", "\n", "Again, a slicing-like notation is used with which $\\mathbf{a}_{:j}$ represents the $j$th column of the original matrix $\\mathbf{A}$. With this view of $\\mathbf{w}$, I hope you see that $\\mathbf{w}$ is a *linear combination of the columns* of $\\mathbf{A}$. All possible linear combinations of $\\mathbf{A}$'s columns has a special name: the *column space* of $\\mathbf{A}$.\n", "\n", "> The *column space* of a matrix $\\mathbf{A}$ represents all possible linear combinations of its columns.\n", "\n", "Now, what has the column space to do with linear systems? A lot, it turns out. Consider the linear system $\\mathbf{Ax} = \\mathbf{b}$. We'll take $\\mathbf{A}$ to be a square, $m\\times m$ matrix, with $\\mathbf{x}$ and $\\mathbf{b}$ each having $m$ elements. That's $m$ equations for $m$ unknowns, and in theory, it can be solved. However, let's view the system explicitly:\n", "\n", "$$\n", "\\begin{split}\n", "\\mathbf{Ax} & = \n", "\\begin{bmatrix} \n", " a_{00} & a_{01} & \\cdots & a_{0,n-1} \\\\\n", " a_{10} & a_{11} & \\cdots & a_{1,n-1} \\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots \\\\\n", " a_{m-1,0} & a_{m-1,2} & \\cdots & a_{m-1,n-1} \n", "\\end{bmatrix} \n", "\\times \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \\\\\n", " \\vdots \\\\\n", " x_{m-1} \n", " \\end{array} \\right] \\\\\n", "& = x_0 \\mathbf{a}_{:, 0} + x_1 \\mathbf{a}_{:, 1} + \\ldots + x_{m-1} \\mathbf{a}_{:,m-1} \\\\\n", "& = \\mathbf{b}\n", "\\end{split}\n", "$$\n", "\n", "In other words, the left and right sides can only be equal if $\\mathbf{b}$ is a linear combination of the columns of $\\mathbf{A}$. In other words, $\\mathbf{b}$ must be in the columns space of $\\mathbf{A}$. \n", "\n", "If it turns out that $\\mathbf{b}$ is *not* in the column space of $\\mathbf{A}$, then that column space is *not big enough* and $\\mathbf{A}$ has at least one column that is a linear combination of its other columns. That is to say, the columns of such an $\\mathbf{A}$ are dependent. Moreover, $\\mathbf{A}$ is *not invertible*, and $\\mathbf{Ax}=\\mathbf{b}$ cannot be solved. Let's hope for good $\\mathbf{A}$'s, and we'll handle cases when that's not true as needed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "\n", "**Exercise**: Why does \n", "\n", "$$\n", "\\begin{bmatrix} \n", " 1 & 2 \\\\\n", " 1 & 2 \\\\\n", "\\end{bmatrix} \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \n", "\\end{array} \\right] = \n", "\\left[ \\begin{array}{c} \n", " 1 \\\\\n", " 1\n", "\\end{array} \\right]\n", "$$\n", "\n", "have a solution but\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 1 & 2 \\\\\n", " 1 & 2 \\\\\n", "\\end{bmatrix} \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \n", "\\end{array} \\right] = \n", "\\left[ \\begin{array}{c} \n", " 1 \\\\\n", " 0\n", "\\end{array} \\right]\n", "$$\n", "\n", "does not?\n", "\n", "**Solution**: In the first, the vector $\\mathbf{b}=[1, 1]^T$ (where $T$ means transpose) is exactly equal to the first column of the matrix, so $x_0$ could be 1 and $x_1$ could be zero. If you look closely, $x_0 = 0$ and $x_1 = 1/2$ is also a solution, and there are infinitely more. The important thing to note is that the second column of $\\mathbf{A}$ is equal to two times its first column---they are dependent, and in such cases, there are either an infinite number of solutions (as for $\\mathbf{b}=[1, 1]^T$) or no solution (as for $\\mathbf{b}=[1, 0]^T$). For the latter case, no combination of $[1, 1]^T$ and $[2, 2]^T$ can yield $[1, 0]^T$!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solving Linear Systems by Elimination\n", "\n", "Would you agree that solving the following system is straightforward to do?\n", "\n", "$$\n", "\\mathbf{Ux} = \n", "\\begin{bmatrix} \n", " u_{00} & u_{01} & u_{02} \\\\\n", " 0 & u_{11} & u_{12} \\\\\n", " 0 & 0 & u_{22} \\\\\n", "\\end{bmatrix} \n", "\\times\n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \\\\\n", " x_2\n", " \\end{array} \\right] =\n", "\\left[ \\begin{array}{c} \n", " c_0 \\\\ \n", " c_1 \\\\\n", " c_2\n", " \\end{array} \\right]\n", "$$\n", "\n", "If not, note that we can immediately find $x_2 = c_2 / u_{22}$, from which\n", "\n", "$$\n", " x_1 = (c_1 - u_{12} x_2)/u_{11} \\, ,\n", "$$\n", "\n", "and\n", "\n", "$$\n", " x_0 = (c_0 - u_{01} x_2 - u_{02} x_2)/u_{00} \\, .\n", "$$\n", "\n", "This was all made easy because $\\mathbf{U}$ is an *upper triangular* matrix, i.e., it has zeros below the main diagonal for which $i = j$. The goal of elimination is to turn $\\mathbf{A}$ into $\\mathbf{U}$ by eliminating entries below the main diagonal using linear operations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember, one can modify linear systems by adding multiples of one row to another without changing the solution. If that bell is too distant, consider the following system,\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & 4 \\\\\n", " 3 & 5 \\\\\n", "\\end{bmatrix} \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \n", "\\end{array} \\right] = \n", "\\left[ \\begin{array}{c} \n", " 6 \\\\\n", " 8\n", "\\end{array} \\right]\n", "$$\n", "\n", "for which the solution is $\\mathbf{x}=[1, 1]^T$. This is easily confirmed by noting $2\\cdot 1 + 4\\cdot 1=6$ and $3\\cdot 1+5 \\cdot 1 = 8$. \n", "\n", "Now, if we add the first row of the system to the second row (that must include the right-hand side of the equation!), we find\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & 4 \\\\\n", " 3+2 & 5+4 \\\\\n", "\\end{bmatrix} \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \n", "\\end{array} \\right] = \n", "\\left[ \\begin{array}{c} \n", " 6 \\\\\n", " 8+6\n", "\\end{array} \\right]\n", "$$\n", "\n", "which has the *same* solution: $2\\cdot 1 + 4\\cdot 1$ is still $6$, and now $(3+2)\\cdot 1 + (5+4)\\cdot 1 = 5+9 = 14$. The same $\\mathbf{x}$ works. This is not a proof by any means, but a good sanity check. Of course, the *right* modification would be to subtract $3/2$ times that first row from the second, which leads to\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & 4 \\\\\n", " 3+(-3/2)2 & 5+(-3/2)4 \\\\\n", "\\end{bmatrix} \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \n", "\\end{array} \\right] = \n", "\\left[ \\begin{array}{c} \n", " 6 \\\\\n", " 8+(-3/2)6\n", "\\end{array} \\right]\n", "$$\n", "\n", "or\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & 4 \\\\\n", " 0 & -1 \\\\\n", "\\end{bmatrix} \n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \n", "\\end{array} \\right] = \n", "\\left[ \\begin{array}{c} \n", " 6 \\\\\n", " -1\n", "\\end{array} \\right]\n", "$$\n", "\n", "Note that $\\mathbf{A}$ turned into $\\mathbf{U}$, which is called the [row echelon form](https://en.wikipedia.org/wiki/Row_echelon_form) in fancy linear algebra circles. From this form, we find immediately that $x_1 = (-1)/(-1) = 1$, and that $x_0 = (6 - 4x_1)/2 = 1$. The key was the multiple of the first row we selected, i.e., -3/2. Why -3/2? Here, 3 is the value we want to knock out. Moreover, 2 is the value in the same column as the element to be eliminated that lives in the row we are adding to others. Often, 2 is called the *pivot*. More generally, the pivot is on the main diagonal, and everything below it in the same column is to be eliminated. Things go bad when the pivot is zero because anything divided by zero is undefined. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's formalize this elimination process using (some rather) loose pseudocode for a square system of any size. Here, the output will be $\\mathbf{U}$ (from $\\mathbf{A})$ and the adjust right-hand side vector $\\mathbf{c}$ (from $\\mathbf{b}$):\n", "\n", "```\n", "\"\"\"Algorithm to turn A into U and b into c by elimination\"\"\"\n", " 1. Input: A, b # matrix A and right-hand side b\n", " 2. Set U to A\n", " 3. Set c to b\n", " 4. For all columns j in U\n", " 5. # Set the pivot, below which we want to eliminate all values.\n", " 6. Set pivot to U[j, j]\n", " 7. If pivot is zero then\n", " 8. Scream for help!\n", " 9. # Eliminate/adjust values below and to the right of the pivot.\n", "10. For all rows i > j in U\n", "11. Set multiple to U[i, j]/pivot\n", "12. # Knock out the element to be eliminated\n", "13. U[i, j] = U[i, j] - U[j, j]*multiple \n", "14. # Adjust all the other elements to the right\n", "15. Set c[i] = c[i] - c[j]*multiple\n", "16. For all columns k > j in U\n", "17. U[i, k] = U[i, k] - U[j, k]*multiple \n", "18. Output: U and c\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "**Exercise**: Trace the elimination algorithm *by hand* for the following linear systems:\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & -1 & 0 \\\\\n", " -1 & 2 & -1 \\\\\n", " 0 & -1 & 2 \\\\\n", "\\end{bmatrix} \n", "\\times\n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \\\\\n", " x_2\n", " \\end{array} \\right] =\n", "\\left[ \\begin{array}{c} \n", " 1 \\\\ \n", " 1 \\\\ \n", " 1 \\\\ \n", " \\end{array} \\right] \n", "$$\n", "\n", "\n", "***\n", "**Exercise**: Trace the elimination algorithm *by hand* for the following linear system:\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 1 & 2 & 3 \\\\\n", " 4 & 5 & 6 \\\\\n", " 7 & 8 & 9 \\\\\n", "\\end{bmatrix} \n", "\\times\n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \\\\\n", " x_2\n", " \\end{array} \\right] =\n", "\\left[ \\begin{array}{c} \n", " 1 \\\\ \n", " 1 \\\\ \n", " 1 \\\\ \n", " \\end{array} \\right] \n", "$$\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In Python, the following function implements the same elimination algorithm:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "def eliminate(A, b):\n", " U, c = np.copy(A), np.copy(b)\n", " for j in range(len(U)):\n", " pivot = U[j, j]\n", " assert pivot != 0.0, \"arggggh!\"\n", " for i in range(j+1, len(A)):\n", " multiple = U[i, j]/pivot\n", " c[i] = c[i] - c[j]*multiple\n", " for k in range(j, len(A)):\n", " U[i, k] = U[i, k] - U[j, k]*multiple\n", " return U, c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "\n", "**Exercise**: Use the function `eliminate` to reduce the linear system\n", "\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & -1 & 0 \\\\\n", " -1 & 2 & -1 \\\\\n", " 0 & -1 & 2 \\\\\n", "\\end{bmatrix} \n", "\\times\n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \\\\\n", " x_2\n", " \\end{array} \\right] =\n", "\\left[ \\begin{array}{c} \n", " 1 \\\\ \n", " 1 \\\\ \n", " 1 \\\\ \n", " \\end{array} \\right] \n", "$$\n", "\n", "**Solution**:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 2. , -1. , 0. ],\n", " [ 0. , 1.5 , -1. ],\n", " [ 0. , 0. , 1.33333333]])" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[ 2.0, -1.0, 0.0],\n", " [-1.0, 2.0, -1.0],\n", " [ 0.0, -1.0, 2.0]])\n", "b = np.array([1.0, 1.0, 1.0])\n", "U, c = eliminate(A, b)\n", "U" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Is this value for $\\mathbf{U}$ what you found by hand for the first exercise above? If not, go back and try it (again).\n", "\n", "***\n", "\n", "**Exercise**: Perform a numerical experiment to determine the *order* of the elimination algorithm. \n", "\n", "**Solution (Partial)**: Remember, the elapsed time of a function call can be computed using, e.g.,\n", "```python\n", "from time import time\n", "t0 = time() # initial time\n", "some_function_call() \n", "te = time() - t0 # elapsed time (in seconds)\n", "```\n", "\n", "Moreover, the algorithm does not care about the values of $\\mathbf{A}$ (assuming no zero pivots), just how many there are. Hence, a good way to make a matrix of size $n$ is to use `np.random.rand(n, n)`. That should get you moving in the right direction.\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Elimination is the hard part, but we do not yet have the solution $\\mathbf{x}$. However, we know from $\\mathbf{U}$ what the last element of $\\mathbf{x}$, and then we work *backward* by *substituting* each known value into the equation for the next unknown value. Here is that approach in pseudcode, where $\\mathbf{U}$ and $\\mathbf{c}$ are, e.g., outputs from the elimination process:\n", "\n", "```\n", "\"\"\"Algorithm for back substitution of an eliminated system\"\"\"\n", "Input: U, c\n", "Set m to the size of c\n", "Set x to an array with m zeros\n", "Set i = n - 1\n", "While i >= 0\n", " # Add up all the pieces that go on the right hand side\n", " Set rhs = c[i] \n", " Set j = i + 1 \n", " While j < n\n", " Set rhs = rhs - U[i, j]*x[j] # Recall that x[j] is known!\n", " Set j = j + 1\n", " Set x[i] = rhs / U[i, i] # Solve for the next unknown\n", " Set i = i - 1\n", "Output: x\n", "```\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "\n", "**Exercise**: Trace the back-substitution algorithm *by hand* for the following system:\n", "\n", "$$\n", "\\begin{bmatrix} \n", " 2 & -1 & 0 \\\\\n", " 0 & 3/2 & -1 \\\\\n", " 0 & 0 & 4/3 \\\\\n", "\\end{bmatrix} \n", "\\times\n", "\\left[ \\begin{array}{c} \n", " x_0 \\\\ \n", " x_1 \\\\\n", " x_2\n", " \\end{array} \\right] =\n", "\\left[ \\begin{array}{c} \n", " 1 \\\\ \n", " 3/2 \\\\ \n", " 2 \\\\ \n", " \\end{array} \\right] \n", "$$\n", "\n", "For your trace, record the values of `i`, `j`, `rhs`, `x[0]`, `x[1]`, and `x[2]`.\n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise**: Implement the back-substitution algorithm as the Python function `back_substition(U, c)`.\n", "\n", "***\n", "\n", "**Exercise**: Perform a numerical experiment to determine the *order* of the back-substitution algorithm. \n", "\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Together, the *elimination* and *back-subsitution* algorithms form what is traditionally called *Gaussian elimination* for the solution of $\\mathbf{Ax}=\\mathbf{b}$. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Elimination via NumPy and SciPy\n", "\n", "NumPy has a built in capability to do elimination directly for solving $\\mathbf{Ax}=\\mathbf{b}$:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 1.5, 2. , 1.5])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[ 2.0, -1.0, 0.0],\n", " [-1.0, 2.0, -1.0],\n", " [ 0.0, -1.0, 2.0]])\n", "b = np.array([1.0, 1.0, 1.0])\n", "np.linalg.solve(A, b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we can get the LU factorization of $\\mathbf{A}$ by using SciPy. Recall that the LU factorization of $\\mathbf{A}$ is \n", "\n", "$$\n", "\\mathbf{A} = \\mathbf{LU} \\, ,\n", "$$\n", "\n", "where $\\mathbf{U}$ is an upper triangle matrix and $\\mathbf{L}$ is a lower triangular matrix with ones on the main diagonal. It turns out, $\\mathbf{U}$ is what we've computed all along during elimination, while the `multiple` defined in elimination used to adjust element $u_{ij}$ is actually $1/l_{ij}$. Hence, we could define $\\mathbf{L}$ as part of the elimination algorithm if desired.\n", "\n", "Using SciPy, we find" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1. 0. 0.]\n", " [ 0. 1. 0.]\n", " [ 0. 0. 1.]]\n", "[[ 1. 0. 0. ]\n", " [-0.5 1. 0. ]\n", " [ 0. -0.66666667 1. ]]\n", "[[ 2. -1. 0. ]\n", " [ 0. 1.5 -1. ]\n", " [ 0. 0. 1.33333333]]\n" ] } ], "source": [ "import scipy as sp\n", "import scipy.linalg\n", "P, L, U = sp.linalg.lu(A)\n", "print(P)\n", "print(L)\n", "print(U)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note, that $\\mathbf{U}$ is identical to what was computed above using our own `eliminate` function. Moreover, the product $\\mathbf{LU}$ gives us back the matrix $\\mathbf{A}$:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 2., -1., 0.],\n", " [-1., 2., -1.],\n", " [ 0., -1., 2.]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L.dot(U)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What's that matrix $\\mathbf{P}$? It's a permutation matrix. Here, it's just the identity, but watch what happens when we compute the LU factorization of this matrix:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "B = np.array([[0, -1, 2], \n", " [-1, 2 , -1], \n", " [2, -1, 0]])" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0. 0. 1.]\n", " [ 0. 1. 0.]\n", " [ 1. 0. 0.]]\n", "[[ 1. 0. 0. ]\n", " [-0.5 1. 0. ]\n", " [ 0. -0.66666667 1. ]]\n", "[[ 2. -1. 0. ]\n", " [ 0. 1.5 -1. ]\n", " [ 0. 0. 1.33333333]]\n" ] } ], "source": [ "P, L, U = sp.linalg.lu(B)\n", "print(P)\n", "print(L)\n", "print(U)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, $\\mathbf{P}$ is not the identify, but $\\mathbf{L}$ and $\\mathbf{U}$ are the same as they were for $\\mathbf{A}$. Hence, we still have $\\mathbf{A}=\\mathbf{LU}$, and then $\\mathbf{P}$ reorders the rows of $\\mathbf{A}$ to get $\\mathbf{B}$:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0., -1., 2.],\n", " [-1., 2., -1.],\n", " [ 2., -1., 0.]])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "P.dot(L.dot(U))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You'll see why we needed permutation of $\\mathbf{B}$. Look at that very first element: a zero pivot. (Try it yourself by modifying the code from above!). The trick is to find a better row. If there is a nonzero element below the zero pivot, then that row can be swapped with the row in question. If no such rows can be found, the game is over! No more pivots means not invertible. \n", "\n", "Even when a pivot exists, it's not always the best one possible. Imagine a pivot equal to $10^{-12}$. That means division by a small number, which leads to a big number. Addition of a big number to smaller numbers leads to something call *round-off error*, due to which precision of a result is lost. Hence, a different pivot might be better, and choosing the one of largest magnitude is the process called *partial pivoting*, which is pretty common in production-level solvers (like SciPy's LU implementation)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further Reading\n", "\n", "None for now." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }