{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data Analysis via Interpolation \n",
    "\n",
    "## Overview, Objectives, and Key Terms\n",
    " \n",
    "In this lecture, we continue the theme of data analysis initiated in [Lecture 24](ME400_Lecture_24.ipynb), focusing now on the process of **interpolation**.  Whereas the fitted models of [Lecture 24](ME400_Lecture_24.ipynb) allow one to produce simple models from noisy (or otherwise complicated) data, interpolation allows one to estimate new values from known discrete points that are assumed to be noise free.\n",
    "\n",
    "\n",
    "### Objectives\n",
    "\n",
    "By the end of this lesson, you should be able to\n",
    "\n",
    "- Apply linear interpolation to evaluate $f(x)$ given discrete points $(x_i, f(x_i))$ for $i = 0, 1 \\ldots$.\n",
    "- Apply linear interpolation to evaluate $f(x, y)$ given discrete points $(x_i, y_j, f(x_i, y_j))$ for $\\, i = 0, 1, \\ldots$ and $j = 0, 1,\\ldots$.\n",
    "- Apply tools from NumPy and SciPy to perform high-order (e.g., cubine-spline) interpolation \n",
    "\n",
    "### Key Terms\n",
    "\n",
    "- linear interpolation\n",
    "- `np.interp`\n",
    "- `scipy.interpolate`\n",
    "- `interp1d`\n",
    "- `interp2d`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Simple Linear Interpolation\n",
    "\n",
    "A common task in many thermal engineering courses is to use tabulated physical properties.  Often, the value needed is not in the table, and *linear interpolation* is required to estimate the missing value.  \n",
    "\n",
    "Let the quantity of interest be $y(x)$, for which values $y_0, y_1, \\ldots$ are provided at $x_0, x_1, \\ldots$.  Suppose the value of $y$ at $x = \\tilde{x}$ is desired.  The first step is to determine which two values $x_i$ and $x_{i+1}$ contain the point $\\tilde{x}$.  This step is accomplished using the linear or binary search algorithms developed in [Lecture 14](ME400_Lecture_14.ipynb). \n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**:  Apply linear search to find $i$ such that $\\tilde{x} \\in [x_i, x_{i+1}]$ for $\\tilde{x} = 3.5$ and $x = [0, 1, 2, 3, 4, 5]$.\n",
    "\n",
    "***\n",
    "\n",
    "\n",
    "**Exercise**: Repeat the exercise using binary search. \n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**: For the same $\\tilde{x}$ and points $x$, evaluate `np.array(x) < x_tilde` and `sum(np.array(x) < x_tilde)`.  What can you conclude?   \n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**: Use `help` to learn about `np.searchsorted`. Then, for the same $\\tilde{x}$ and points $x$, evaluate `np.searchsorted(x, x_tilde)`.  What can you conclude?\n",
    "\n",
    "***\n",
    "\n",
    "Once $i$ is determined, $(x_i, y(x_i))$ and $(x_{i+1}, y(x_{i+1}))$ are connected by a line of the form $ax + b$, and $y(\\tilde{x})$ is evaluated from $a\\tilde{x} + b$.  The trick is to determine $a$ and $b$, but that's easy: we have two equations $y_i = ax_i + b$ and $y_{i+1} = a x_{i+1} + b$.  Just solve for $a$ and $b$:\n",
    "\n",
    "$$\n",
    "   a = \\frac{y_{i+1}-y_i}{x_{i+1}-x_i} \\quad \\text{and} \\quad b = y_i - \\frac{y_{i+1}-y_i}{x_{i+1}-x_i} x_i \\, .\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "***\n",
    "\n",
    "**Exercise**:  Consider the following table of thermodynamic data for saturated steam (saved as [thermo_data.txt](thermo_data.txt)):\n",
    "\n",
    "```\n",
    "  T [K]   P [MPa]  h_l [kJ/kg] h_v [kJ/kg]\n",
    " 373.150   0.101    2675.572     419.099\n",
    " 389.817   0.179    2701.054     489.625\n",
    " 406.483   0.298    2724.633     560.634\n",
    " 423.150   0.476    2745.919     632.252\n",
    " 439.817   0.730    2764.528     704.626\n",
    " 456.483   1.082    2780.059     777.934\n",
    " 473.150   1.555    2792.062     852.393\n",
    " 489.817   2.175    2800.004     928.269\n",
    " 506.483   2.972    2803.247    1005.893\n",
    " 523.150   3.976    2801.012    1085.687\n",
    " 539.817   5.222    2792.314    1168.202\n",
    " 556.483   6.745    2775.820    1254.194\n",
    " 573.150   8.588    2749.574    1344.771\n",
    "```\n",
    "\n",
    "Evaluate $h_l$ at 515 K using linear interpolation.\n",
    "\n",
    "*Solution*\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2802.1048931421374"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# first load the data\n",
    "import numpy as np\n",
    "T, P, h_l, h_v = np.loadtxt('thermo_data.txt', unpack=True, skiprows=1)\n",
    "# find i such that T[i-1] <= 500 <= T[i]\n",
    "T_tilde = 515\n",
    "for i in range(len(T)):\n",
    "    if T[i] > T_tilde:\n",
    "        break\n",
    "# compute slope and intercept\n",
    "a = (h_l[i]-h_l[i-1])/(T[i]-T[i-1]) \n",
    "b = h_l[i-1] - a*T[i-1]\n",
    "# compute approximation\n",
    "h_l_tilde = a*T_tilde + b\n",
    "h_l_tilde"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "***\n",
    "\n",
    "**Exercise**: Adapt the solution above to produce a function that returns $h_l$ at *any* temperature *T*.  However, add assertions to ensure the user does not request $h_l$ at temperatures outside the range of the data.  In other words, limit the function to *interpolation* and exclude any *extrapolation*.\n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**:  Suppose $\\tilde{x}$ is beyond $x_{n-1}$, the last of the $n$ values of $x$ in the tabulated data.  How would one linearly *extrapolate* the data to obtain $y(\\tilde{x})$?\n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**:  Execute `help(np.interp)`.  Use it to interpolate $x = [1, 2, 3]$ and $y = [2, 8, 18]$ for $\\tilde{x} = 1.5$ and $\\tilde{x} = 2.5$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Interpolation In Two Dimensions\n",
    "\n",
    "Some applications require interpolation of two-dimensional data evaluated at $(x_i, y_j)$ for $i=0,1,\\ldots$ and $j=0,1,\\ldots$.  Linear interpolation of such data is based on the model $f(x, y) \\approx ax + by + c$ for $x \\in [x_i, x_{i+1}]$ and $y \\in [y_j, y_{j+1}]$.  \n",
    "\n",
    "The trick now is to find $i$ and $j$ for values $\\tilde{x}$ and $\\tilde{y}$ for which $f(x, y)$ is desired.  However, the *same* ideas applied for one dimension work for this two-dimensional problem. To do so requires an extra step: \n",
    "\n",
    "  1. Again, first assume that $i$ and $j$ are found by an appropriate search.\n",
    "  2. Then, find $f(\\tilde{x}, y_{j})$ and $f(\\tilde{x}, y_{j+1})$ via *two separate* interpolations.  \n",
    "  3. Finally, find $f(\\tilde{x}, \\tilde{y})$ by linearly interpolating the two values $f(\\tilde{x}, y_{j})$ and $f(\\tilde{x}, y_{j+1})$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "***\n",
    "\n",
    "**Exercise**: Given $f(1, 1) = 6$, $f(1, 2) = 9$, $f(2, 1) = 8$, and $f(2, 2) = 11$, estimate $f(4/3, 7/4)$ by linear interpolation.  \n",
    "\n",
    "*Solution*:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8.9166666666666661"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# define the given data\n",
    "x = [1, 2]\n",
    "y = [1, 2]\n",
    "f = np.array([[6, 9],\n",
    "              [8, 11]])\n",
    "xtilde = 4/3\n",
    "ytilde = 7/4\n",
    "# interpolate to get f(xtilde, 1)\n",
    "a = (f[1, 0]-f[0, 0])/(x[1] - x[0]) \n",
    "b = f[0, 0] - a*x[0]\n",
    "f_xt_1 = a*xtilde + b\n",
    "# interpolate to get f(xtilde, 2)\n",
    "a = (f[1, 1]-f[0, 1])/(x[1] - x[0]) \n",
    "b = f[0, 1] - a*x[0]\n",
    "f_xt_2 = a*xtilde + b\n",
    "# interpolate to get f(xtilde, ytilde)\n",
    "a = (f_xt_2 - f_xt_1)/(y[1] - y[0])\n",
    "b = f_xt_1 - a*y[0]\n",
    "f_xt_yt = a*ytilde + b\n",
    "f_xt_yt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "***\n",
    "\n",
    "**Exercise**: It turns out the previous exercise produces a perfect value for $f(4/3, 7/4)$ when $f(x, y) = 2x + 3y + 1$.  Can you recover $a=2$, $b=3$, and $c=1$ from your interpolation process?\n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**: Suppose $f(1, 1) = 7$, $f(1, 2) = 11$, $f(2, 1)=10$, and $f(2, 2)=15$.  Compute $f(4/3, 7/4)$.\n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**: Consider the following table of water densities ($\\rho$) for different temperatures [K] and pressures [MPa] (from [thermodata_2.txt](thermodata_2.txt)):\n",
    "\n",
    "```\n",
    "  T [K]   P [MPa]  rho [kg/m^3]\n",
    " 293.150   2.000    999.073\n",
    " 293.150   4.000    999.982\n",
    " 293.150   6.000   1000.888\n",
    " 293.150   8.000   1001.791\n",
    " 313.150   2.000    993.053\n",
    " 313.150   4.000    993.924\n",
    " 313.150   6.000    994.790\n",
    " 313.150   8.000    995.653\n",
    " 333.150   2.000    984.037\n",
    " 333.150   4.000    984.903\n",
    " 333.150   6.000    985.766\n",
    " 333.150   8.000    986.625\n",
    " 353.150   2.000    972.650\n",
    " 353.150   4.000    973.538\n",
    " 353.150   6.000    974.422\n",
    " 353.150   8.000    975.301\n",
    " 373.150   2.000    959.242\n",
    " 373.150   4.000    960.172\n",
    " 373.150   6.000    961.097\n",
    " 373.150   8.000    962.017\n",
    "```\n",
    "Use linear interpolation to estimate $\\rho(300, 3)$, $\\rho(330, 5)$, and $\\rho(360, 7)$. \n",
    "\n",
    "***\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Higher-Order Interpolation\n",
    "\n",
    "Linear interpolation is conceptually straightforward, but it helps to use tools like `np.interp` (see the exercises above).  However, interpolation that is *better* than linear is more challenging to implement, and we'll turn to the `scipy.interpolate` module and its `interp1d` and `interp2d` utilities.  \n",
    "\n",
    "The syntax for `interp1d` is quite simple: `f = interp1d(x, y, kind)`, where `x` and `y` are the points to interpolate, and `kind` is a `str` argument equal to `'linear'` (default), `'quadratic'`, `'cubic'`, among other possibilities.  Here, `linear` leads to linear interpolation equivalent to the method introduced above.  Quadratic and cubic interpolation involve defined quadratic curves over sets of three points or cubic functions over sets of four points.  Of course, if the underlying data were actually quadratic or cubic, these higher-order interpolation schemes should be expected to do better.  The resulting `f` is a callable function that accepts $\\tilde{x}$, the point at which the data is to be estimated. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "***\n",
    "\n",
    "**Example**: Interpolate $x = [1, 2, 3, 4, 5]$ and $y = [1, 4, 9, 16, 25]$ for $\\tilde{x} = 4.5$ \n",
    "\n",
    "*Solution*:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array(20.5), array(20.25), 20.25)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from scipy.interpolate import interp1d\n",
    "f_lin = interp1d([1,2,3,4,5], [1,4,9,16,25], kind='linear')\n",
    "f_quad = interp1d([1,2,3,4,5], [1,4,9,16,25], kind='quadratic')\n",
    "f_lin(4.5), f_quad(4.5), 4.5**2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "***\n",
    "\n",
    "The syntax for `interp2d` is equally simple: `f = interp2d(x, y, f, kind)`.  Here, `x` and `y` are individual arrays, while `f` is a 2-D array representing $f(x_i, y_j)$ for $i = 0, 1, \\ldots$ and $j = 0, 1, \\ldots$.  Any easy way to evaluate `f` is to use `np.meshgrid`.  Finally, `kind` is a `str` argument that can be `'linear'` and`'cubic'` (among others, but *not* `'quadratic'`).\n",
    "\n",
    "***\n",
    "\n",
    "**Exercise**:  Given `x = np.linspace(0, 10, 11)`, `y = np.linspace(0, 10, 11)`, and `z = lambda x, y: 1 + x + y + x**2 + y**2 + x**4*y**4` (a decidely nonlinear function), evaluate `f(4.5, 6.5)` using linea and cubic interpolation.\n",
    "\n",
    "*Solution*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([ 814339.25]), array([ 731923.8169771]), 732061.69140625)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.linspace(0, 10, 11)\n",
    "y = np.linspace(0, 10, 11)\n",
    "xx, yy  = np.meshgrid(x, y)\n",
    "z = lambda x, y: 1 + x + y + x**2 + y**2 + x**4*y**4\n",
    "from scipy.interpolate import interp2d\n",
    "f_lin = interp2d(x, y, z(xx, yy), kind='linear')\n",
    "f_cub = interp2d(x, y, z(xx, yy), kind='cubic')\n",
    "f_lin(4.5, 6.5), f_cub(4.5, 6.5), z(4.5, 6.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Further Reading\n",
    "\n",
    "In this lecture, interpolation has been limited to regular grids of points along the $x$ and $y$ axes.  When points are more than one dimension are *not* on such grids, different methods are needed.  [Barycentric interpolation](https://en.wikipedia.org/wiki/Barycentric_coordinate_system) is one option.  In 2-D, points can be organized as triangles, and linear interpolation can be applied within a single triangle.  In 3-D, triangles become pyramids, and in higher dimensions, one interpolates over general *n-simplices*."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}