{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basics of Functions\n", "\n", "## Overview, Objectives, and Key Terms\n", " \n", "With some simple programs under our belts, it is time to *modularize* our programs by using *functions*. In this lecture and [Lecture 13](ME400_Lecture_13.ipynb), you will learn how to define your own functions to meet a variety of needs. \n", "\n", "### Objectives\n", "\n", "By the end of this lesson, you should be able to\n", "\n", "- Define a function that accepts (zero or more) input arguments and returns (zero or more) values.\n", "- Explain the meaning of a named and default argument.\n", "- Use *unpacking* to define multiple variable in a single statement.\n", "- Include functions in flowcharts.\n", "\n", "\n", "### Key Terms\n", "\n", "- function\n", "- `def`\n", "- call\n", "- argument\n", "- return value\n", "- unpacking\n", "- named (or keyword) argument\n", "- default value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is a Function?\n", "\n", "For our purposes, a function is something that is executed (possibly with input) and provides some sort of output. Often, this output will be a value (or several values) explicitly returned by the function. However, functions can also be used to modify the very data given to them as input. Although you’ve been using functions all along (e.g., those provided by the math module like `math.cos`), you'll understand how to define and use your own functions by the end of this lesson.\n", "\n", "Let's start by example. Consider the sum of an array $x$, defined mathematically as $s = \\sum^n_{i=1} x_i$ and computed in Python via " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x = [1, 3, 4, 2, 4] \n", "s = 0\n", "for i in range(len(x)):\n", " s += x[i]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This short program is specific to the value of `x` defined. If it were to be applied for any value of `x`, it must first be turned into a function. Functions, like conditional statements and loops, are defined in Python using a special *keyword*. The basic structure is as follows:\n", "\n", "```python\n", "def function_name(arg1, arg2, ...):\n", " # do something to define rval1\n", " # do something else to define rval2\n", " # and so on...\n", " return rval1, rval2, ...\n", "```\n", "\n", "Following the `def` keyword is the name of the function and a pair of parentheses. Inside these parentheses are the names of zero or more input *arguments*. Each argument name (e.g., `arg1`) can be used within the function like normal variables. After all computations are performed, the function can include a `return` statement with zero or more *return values* (e.g., `rval1`). As observed for `if`, `while`, and `for` statements, the block of code following the `def` line must be indented. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's adapt the structure shown above to the summation problem. The input required is the sequence of numbers `x`, and the output is the sum of `x`. Hence, we need one input *argument* and one *return value*:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def compute_sum(x):\n", " s = 0\n", " for i in range(len(x)):\n", " s += x[i]\n", " return s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, just like the other functions we've used (e.g.,`math.cos`), the function `compute_sum` can be **called** repeatedly with different arguments:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compute_sum([1, 2, 3, 4])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "600" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compute_sum([100, 200, 300])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even though the initial `x` was a list, the input to `compute_sum` need not be a list: it just has to support indexing via `[]` and have values that are numbers. \n", "\n", "> **Exercise**: See what happens when `(1, 2, 3)` and `np.linspace(0, 10)` are given to `compute_sum`. \n", "\n", "> **Exercise**: Write a function named `factorial` that computes $n!$.\n", "\n", "> **Exercise**: Write a function that determines whether a positive integer `n` is prime. A prime number is any number divisible only by one and itself. By definition, one is not considered prime. The function should return `True` or `False`.\n", "\n", "> **Exercise**: Write a function with no arguments and no return values that prints out the day of the week. Hint: look up the module `datetime`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One potential pitfall when defining functions is to forget to return a value. Consider this alternative (and *wrong*) version of the summation function and how it can lead to unexpected `TypeError` errors:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def compute_sum_wrong(x):\n", " s = 0\n", " for i in range(len(x)):\n", " s += x[i]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "skip_exceptions": true }, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for *=: 'NoneType' and 'int'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0ms\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcompute_sum_wrong\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;31m# and then double it\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0ms\u001b[0m \u001b[0;34m*=\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for *=: 'NoneType' and 'int'" ] } ], "source": [ "# define a sum\n", "s = compute_sum_wrong([1, 2, 3])\n", "# and then double it\n", "s *= 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The problem appears to be the statement `s *= 2`, but `s` should be an `int` value, right? Nope:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "NoneType" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Any function without explicit return values returns `None`. \n", "\n", "> **Warning**: Be sure to check that functions return any values expected when called." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions with Multiple Return Values\n", "\n", "In some applications, programs can be simplified by using functions with multiple return values. For example, consider the problem of computing the mean and standard deviation of an array $x$ with $n$ elements. Recall, the mean of $x$ is $\\bar{x} = \\frac{1}{n} \\sum^n_{i=1} x_i$, while the variance is defined by $\\mu = \\frac{1}{n}\\sum^n_{i=1} (x_i -\\bar{x})^2$. Surely, one could implement two separate functions (like `np.mean` and `np.var`), but they can be combined in one function:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def mean_and_var(x):\n", " mu = 0\n", " for i in range(len(x)):\n", " mu += x[i]\n", " mu /= len(x)\n", " var = 0\n", " for i in range(len(x)):\n", " var += (x[i] - mu)**2\n", " var /= len(x)\n", " return mu, var" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, calling this function with `x = [1, 2, 3]` leads to" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2.0, 0.6666666666666666)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vals = mean_and_var([1, 2, 3])\n", "vals" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output (here, `vals`) is a `tuple`. Actually, this is expected: the statement `return mu, var` is equivalent to `return (mu, var)`. With `vals` defined, one could set `mu = vals[0]` and `var = vals[1]`, but it is simpler to use" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.0 0.6666666666666666\n" ] } ], "source": [ "mu, val = mean_and_var([1, 2, 3])\n", "print(mu, val)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In fact, this shows a neat feature of Python: the elements of a `tuple` (or `list`) can be assigned to several names all at once using a process called **unpacking**. This feature of Python can yield very compact, multiple assignments and lets one swap two values with one statement. For example:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 2\n", "2 1\n", "1 2\n", "2 1\n" ] } ], "source": [ "a, b = (1, 2) # a and b from tuple\n", "print(a, b)\n", "a, b = [2, 1] # a and b from list\n", "print(a, b)\n", "c, d = b, a # c and d from b and a\n", "print(c, d)\n", "c, d = d, c # swap c and d\n", "print(c, d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just as one must be careful to ensure the correct values are returned by a function, one must also be sure to capture that output correctly. If, for example, a function returns three values, then one cannot set two values equal to that function's output, e.g.," ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "ename": "ValueError", "evalue": "too many values to unpack (expected 2)", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mreturn_three\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mreturn_three\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mValueError\u001b[0m: too many values to unpack (expected 2)" ] } ], "source": [ "def return_three():\n", " return 1, 2, 3\n", "a, b = return_three()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, a `ValueError` pops up, now because we attempted to unpack three return values into two variables." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions with Multiple Arguments\n", "\n", "Functions like `plt.plot` and `np.dot` naturally accept two (or more) input arguments to produce a particular output (be it a visualization or the dot product of two arrays). There is no limit to the number of arguments a Python function can accept. Furthermore, some arguments can be made **optional** by providing **default values**.\n", "\n", "Consider the following function `foo`, which accepts three input arguments `x`, `y`, and `z`, and prints them in a particular format:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def foo(x, y, z) :\n", " print(\"(x, y, z) = ({:.4f}, {:.4f}, {:.4f})\".format(x, y, z))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As defined, one *must* pass three values to `foo` for `x`, `y`, and `z`, e.g.," ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(x, y, z) = (1.0000, 2.0000, 3.0000)\n" ] } ], "source": [ "foo(1, 2, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, suppose `y` and `z` represent values that do not often change. That is, perhaps there are good **default values** that can be assigned to `y` and `z` for most applications. Here, assume those values are `y = 1.1` and `z = 2.2`. A new version of `foo` may then be defined with these default values:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def foo_with_defaults(x, y = 1.1, z = 2.2):\n", " print(\"(x, y, z) = ({:.4f}, {:.4f}, {:.4f})\".format(x, y, z))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This new function can be called just like the original version, i.e., with all three input arguments defined:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(x, y, z) = (1.0000, 2.0000, 3.0000)\n" ] } ], "source": [ "foo_with_defaults(1, 2, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, with defaults defined for `y` and `z`, `foo_with_defaults` can also be called in the following ways:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(x, y, z) = (1.0000, 3.0000, 2.2000)\n", "(x, y, z) = (1.0000, 1.1000, 2.2000)\n" ] } ], "source": [ "foo_with_defaults(1, 3) # let z be its default value\n", "foo_with_defaults(1) # let y and z be their default values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result of the first call shows that `y` is `3`. In other words, the arguments passed are assigned to the names in the order they appear in the function definition. Passing just one value (the second call) leads to default values for both `y` and `z`.\n", "\n", "Now, what if we wanted to pass a value for `z` but are fine with the default value for `y`? In Python, that’s easy, just **name** the argument `z` when calling `foo`:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(x, y, z) = (1.0000, 1.1000, 3.0000)\n" ] } ], "source": [ "foo_with_defaults(1, z=3) # y is still defaulted to 1.1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In fact, arguments can always be named explicitly:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(x, y, z) = (1.0000, 2.0000, 3.0000)\n" ] } ], "source": [ "foo_with_defaults(x=1, y=2, z=3) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, once an argument has been named, all subsequent arguments must also be named. For example, the following fails:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "ename": "SyntaxError", "evalue": "positional argument follows keyword argument (, line 1)", "output_type": "error", "traceback": [ "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m foo_with_defaults(1, y=2, 3)\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m positional argument follows keyword argument\n" ] } ], "source": [ "foo_with_defaults(1, y=2, 3) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This particular `SyntaxError` points out that arguments can be **positional** or **keyword** arguments. Any variable we name explicitly is a **keyword** argument. We'll learn more about these terms in [Lecture 13](ME400_Lecture_13.ipynb).\n", "\n", "With named arguments, one can pass arguments in any order. For example, consider the following three calls of `foo_with_defaults`:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(x, y, z) = (1.0000, 2.0000, 3.0000)\n", "(x, y, z) = (1.0000, 2.0000, 3.0000)\n", "(x, y, z) = (1.0000, 2.0000, 3.0000)\n" ] } ], "source": [ "foo_with_defaults(1, z=3, y=2)\n", "foo_with_defaults(z=3, y=2, x=1)\n", "foo_with_defaults(y=2, x=1, z=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each call yields exactly the same output." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Function Documentation\n", "\n", "From the very beginning, the use of internal documentation via `help` has been emphasized. The value of `help`, however, depends on developers to provide the information displayed. That responsibility now turns to you, the function writer. \n", "\n", "Suppose you've provided your friend the `compute_sum` function, and she did was not quite sure how to use it. If she read [Lecture 1](ME400_Lecture_1.ipynb), she might try" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function compute_sum in module __main__:\n", "\n", "compute_sum(x)\n", "\n" ] } ], "source": [ "help(compute_sum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Well, that output is not very helpful. What `compute_sum` is missing is a [docstring](https://www.python.org/dev/peps/pep-0257/). A **docstring** is a string placed immediately below the first line of a function (i.e., after the `def function_name(arg1, ...):` line. Ideally, this string says something about the function, including how to use it and what it does. \n", "\n", "For `compute_sum`, one possible docstring is `\"\"\"Returns the sum of an array x.\"\"\"`. Inserting this string below the `def` line of `compute_sum` leads to the following, documented function:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def compute_sum_with_docstring(x):\n", " \"\"\"Returns the sum of an array x.\"\"\"\n", " s = 0\n", " for i in range(len(x)):\n", " s += x[i]\n", " return s" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function compute_sum_with_docstring in module __main__:\n", "\n", "compute_sum_with_docstring(x)\n", " Returns the sum of an array x.\n", "\n" ] } ], "source": [ "help(compute_sum_with_docstring)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "More complicated functions require more detailed documentation, and with the triple, double-quote strings, multi-line docstring values are possible. A version of `compute_sum` with more detailed documentation is the following:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def compute_sum_with_verbose_docstring(x):\n", " \"\"\"Returns the sum of an array x.\n", " \n", " Inputs\n", " x: sequential type with numerical elements\n", " \n", " Returns\n", " s: the sum of the elements of x\n", " \"\"\"\n", " s = 0\n", " for i in range(len(x)):\n", " s += x[i]\n", " return s" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function compute_sum_with_verbose_docstring in module __main__:\n", "\n", "compute_sum_with_verbose_docstring(x)\n", " Returns the sum of an array x.\n", " \n", " Inputs\n", " x: sequential type with numerical elements\n", " \n", " Returns\n", " s: the sum of the elements of x\n", "\n" ] } ], "source": [ "help(compute_sum_with_verbose_docstring)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now your friend has all she needs to use the function!\n", "\n", "> **Note**: Always provide a **docstring** for functions you define." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions and Flowcharts\n", "\n", "Fundamentally, functions do not add anything new to our basic logical structures. Functions can perform computations involving sequences, selection (`if`), and iteration (`while` or `for`), just like our programs have done up to this point. The difference in practice is that a function lets one define such operations once and use them repeatedly within a program without writing the same lines of code repeatedly. Ultimately, substituting a function name in place of a large chunk of code can improve readability and one's ability to debug the program.\n", "\n", "Pseudocode is flexible, and function calls can be described pretty directly (e.g., `Set s to the sum of array a`). To include functions in flowcharts requres a new shape. Before, a simple rectangle had been used to define processes (usually, single statements like `i = i + 1`). Another rectangle, with double vertical bars, can be used to represent functions, as in the following figure:\n", "\n", "![Program to compute sum that uses the `compute_sum` function.](img/sum_function_program.png)\n", "\n", "Surely, this flowchart is simpler than it would be were the function not used. When developing flowcharts for programs that use functions, it is recommended that separate flow charts are provided for those functions that you wrote. For others, you may not know enough about the particular implementation to create the flowchart. For example, `np.sum` isn't as easy as you would think!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Further Reading\n", "\n", "None at this time." ] } ], "metadata": { "celltoolbar": "Edit Metadata", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }