|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "code", |
| 5 | + "execution_count": 1, |
| 6 | + "metadata": { |
| 7 | + "collapsed": true |
| 8 | + }, |
| 9 | + "outputs": [], |
| 10 | + "source": [ |
| 11 | + "%load_ext watermark" |
| 12 | + ] |
| 13 | + }, |
| 14 | + { |
| 15 | + "cell_type": "code", |
| 16 | + "execution_count": 2, |
| 17 | + "metadata": { |
| 18 | + "collapsed": false |
| 19 | + }, |
| 20 | + "outputs": [ |
| 21 | + { |
| 22 | + "name": "stdout", |
| 23 | + "output_type": "stream", |
| 24 | + "text": [ |
| 25 | + "Sebastian Raschka \n", |
| 26 | + "last updated: 2016-06-23 \n", |
| 27 | + "\n", |
| 28 | + "CPython 3.5.1\n", |
| 29 | + "IPython 4.2.0\n" |
| 30 | + ] |
| 31 | + } |
| 32 | + ], |
| 33 | + "source": [ |
| 34 | + "%watermark -a 'Sebastian Raschka' -u -d -v" |
| 35 | + ] |
| 36 | + }, |
| 37 | + { |
| 38 | + "cell_type": "markdown", |
| 39 | + "metadata": {}, |
| 40 | + "source": [ |
| 41 | + "# Introduction to Divide-and-Conquer Algorithms" |
| 42 | + ] |
| 43 | + }, |
| 44 | + { |
| 45 | + "cell_type": "markdown", |
| 46 | + "metadata": {}, |
| 47 | + "source": [ |
| 48 | + "*Divide-and-conquer* is one of the main paradigms of algorithmic problem solving, next to *dynamic programming* and *greedy algorithms*. The main goal of divide-and-conquer algorithms is to provide an efficient alternative to computationally expensive, often infeasible brute-force methods such as exhaustive search: a task is split into subtasks that can be solved independently, and potentially in parallel; later, the solutions to the subtasks are combined to yield the final result.\n", |
| 49 | + "\n" |
| 50 | + ] |
| 51 | + }, |
| 52 | + { |
| 53 | + "cell_type": "markdown", |
| 54 | + "metadata": {}, |
| 55 | + "source": [ |
| 56 | + "## Example 1 -- Binary Search" |
| 57 | + ] |
| 58 | + }, |
| 59 | + { |
| 60 | + "cell_type": "markdown", |
| 61 | + "metadata": {}, |
| 62 | + "source": [ |
| 63 | + "Let's say we want to implement an algorithm that returns the index position of an item that we are looking for \n", |
| 64 | + "in an array. Here, we assume that the array is already sorted. The simplest (and computationally most expensive) approach would be to check each element in the array iteratively until we either find the desired match or reach the end of the array and return -1:" |
| 65 | + ] |
| 66 | + }, |
| 67 | + { |
| 68 | + "cell_type": "code", |
| 69 | + "execution_count": 3, |
| 70 | + "metadata": { |
| 71 | + "collapsed": false |
| 72 | + }, |
| 73 | + "outputs": [], |
| 74 | + "source": [ |
| 75 | + "def linear_search(lst, item):\n", |
| 76 | + "    for i in range(len(lst)):\n", |
| 77 | + "        if lst[i] == item:\n", |
| 78 | + "            return i\n", |
| 79 | + "    return -1" |
| 80 | + ] |
| 81 | + }, |
| 82 | + { |
| 83 | + "cell_type": "code", |
| 84 | + "execution_count": 4, |
| 85 | + "metadata": { |
| 86 | + "collapsed": false |
| 87 | + }, |
| 88 | + "outputs": [ |
| 89 | + { |
| 90 | + "name": "stdout", |
| 91 | + "output_type": "stream", |
| 92 | + "text": [ |
| 93 | + "2\n", |
| 94 | + "0\n", |
| 95 | + "-1\n", |
| 96 | + "-1\n" |
| 97 | + ] |
| 98 | + } |
| 99 | + ], |
| 100 | + "source": [ |
| 101 | + "lst = [1, 5, 8, 12, 13]\n", |
| 102 | + "\n", |
| 103 | + "for k in [8, 1, 23, 11]:\n", |
| 104 | + "    print(linear_search(lst=lst, item=k))" |
| 105 | + ] |
| 106 | + }, |
| 107 | + { |
| 108 | + "cell_type": "markdown", |
| 109 | + "metadata": {}, |
| 110 | + "source": [ |
| 111 | + "The runtime of linear search is $O(n)$ since, in the worst case, we check every element in the array -- remember that big-Oh is our upper bound. A cleverer way of implementing a search algorithm is *binary search*, which is a simple yet nice example of a *divide-and-conquer* algorithm.\n", |
| 112 | + "\n", |
| 113 | + "The idea behind divide-and-conquer algorithms is to break a problem down into non-overlapping subproblems of the original problem, which we can then solve recursively. Once we have processed these recursive subproblems, we combine the solutions into the end result.\n", |
| 114 | + "\n", |
| 115 | + "Using a divide-and-conquer approach, binary search runs in $O(\\log n)$ time, since each comparison halves the remaining search range." |
| 116 | + ] |
| 117 | + }, |
| 118 | + { |
| 119 | + "cell_type": "markdown", |
| 120 | + "metadata": {}, |
| 121 | + "source": [ |
| 122 | + "The idea behind binary search is quite simple:\n", |
| 123 | + "\n", |
| 124 | + "1. We take the element at the midpoint of the (sorted) array and compare it to the search key\n", |
| 125 | + "2. If the search key is equal to the midpoint element, we are done, else\n", |
| 126 | + "    3. search key < midpoint element?\n", |
| 127 | + "    4. Yes: repeat search (back to step 1) with subarray that ends at index position `midpoint - 1`\n", |
| 128 | + "    5. No: repeat search (back to step 1) with subarray that starts at index position `midpoint + 1`\n", |
| 129 | + " \n", |
| 130 | + " \n", |
| 131 | + "Assuming that we are looking for the search key *k=5*, the individual steps of binary search can be illustrated as follows:\n", |
| 132 | + "\n", |
| 133 | + "" |
| 134 | + ] |
| 135 | + }, |
| 136 | + { |
| 137 | + "cell_type": "markdown", |
| 138 | + "metadata": {}, |
| 139 | + "source": [ |
| 140 | + "Below is our Python implementation of this idea:" |
| 141 | + ] |
| 142 | + }, |
| 143 | + { |
| 144 | + "cell_type": "code", |
| 145 | + "execution_count": 5, |
| 146 | + "metadata": { |
| 147 | + "collapsed": false |
| 148 | + }, |
| 149 | + "outputs": [], |
| 150 | + "source": [ |
| 151 | + "def binary_search(lst, item):\n", |
| 152 | + "    first = 0  # left boundary of the search range\n", |
| 153 | + "    last = len(lst) - 1  # right boundary of the search range\n", |
| 154 | + "    found = False\n", |
| 155 | + "\n", |
| 156 | + "    while first <= last and not found:\n", |
| 157 | + "        midpoint = (first + last) // 2\n", |
| 158 | + "        if lst[midpoint] == item:\n", |
| 159 | + "            found = True\n", |
| 160 | + "        else:\n", |
| 161 | + "            if item < lst[midpoint]:\n", |
| 162 | + "                last = midpoint - 1  # continue in the left half\n", |
| 163 | + "            else:\n", |
| 164 | + "                first = midpoint + 1  # continue in the right half\n", |
| 165 | + "\n", |
| 166 | + "    if found:\n", |
| 167 | + "        return midpoint\n", |
| 168 | + "    else:\n", |
| 169 | + "        return -1" |
| 170 | + ] |
| 171 | + }, |
| 172 | + { |
| 173 | + "cell_type": "code", |
| 174 | + "execution_count": 6, |
| 175 | + "metadata": { |
| 176 | + "collapsed": false |
| 177 | + }, |
| 178 | + "outputs": [ |
| 179 | + { |
| 180 | + "name": "stdout", |
| 181 | + "output_type": "stream", |
| 182 | + "text": [ |
| 183 | + "2\n", |
| 184 | + "0\n", |
| 185 | + "-1\n", |
| 186 | + "-1\n" |
| 187 | + ] |
| 188 | + } |
| 189 | + ], |
| 190 | + "source": [ |
| 191 | + "for k in [8, 1, 23, 11]:\n", |
| 192 | + "    print(binary_search(lst=lst, item=k))" |
| 193 | + ] |
| 194 | + }, |
| 195 | + { |
| 196 | + "cell_type": "markdown", |
| 197 | + "metadata": {}, |
| 198 | + "source": [ |
| 199 | + "# ... to be continued" |
| 200 | + ] |
| 201 | + }, |
| 202 | + { |
| 203 | + "cell_type": "code", |
| 204 | + "execution_count": null, |
| 205 | + "metadata": { |
| 206 | + "collapsed": true |
| 207 | + }, |
| 208 | + "outputs": [], |
| 209 | + "source": [] |
| 210 | + } |
| 211 | + ], |
| 212 | + "metadata": { |
| 213 | + "kernelspec": { |
| 214 | + "display_name": "Python 3", |
| 215 | + "language": "python", |
| 216 | + "name": "python3" |
| 217 | + }, |
| 218 | + "language_info": { |
| 219 | + "codemirror_mode": { |
| 220 | + "name": "ipython", |
| 221 | + "version": 3 |
| 222 | + }, |
| 223 | + "file_extension": ".py", |
| 224 | + "mimetype": "text/x-python", |
| 225 | + "name": "python", |
| 226 | + "nbconvert_exporter": "python", |
| 227 | + "pygments_lexer": "ipython3", |
| 228 | + "version": "3.5.1" |
| 229 | + } |
| 230 | + }, |
| 231 | + "nbformat": 4, |
| 232 | + "nbformat_minor": 0 |
| 233 | +} |
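
The notebook describes divide-and-conquer as solving subproblems recursively, while its `binary_search` is written as a loop. As a minimal sketch (not part of the notebook itself), the same halving idea could be expressed recursively. The function name `binary_search_recursive` and the `first`/`last` keyword parameters are assumptions made for illustration only:

```python
def binary_search_recursive(lst, item, first=0, last=None):
    """Return the index of `item` in the sorted list `lst`, or -1 if absent.

    Hypothetical recursive variant of the notebook's iterative binary_search.
    """
    if last is None:
        last = len(lst) - 1

    # Base case: the search range is empty, so the item is not present.
    if first > last:
        return -1

    midpoint = (first + last) // 2
    if lst[midpoint] == item:
        return midpoint
    if item < lst[midpoint]:
        # Subproblem: the left half, ending just before the midpoint.
        return binary_search_recursive(lst, item, first, midpoint - 1)
    # Subproblem: the right half, starting just after the midpoint.
    return binary_search_recursive(lst, item, midpoint + 1, last)


lst = [1, 5, 8, 12, 13]
results = [binary_search_recursive(lst, k) for k in [8, 1, 23, 11]]
print(results)  # prints [2, 0, -1, -1]
```

Each call discards half of the remaining range, so the recursion depth grows with $\log_2 n$. In this sketch the "combine" step is trivial, because only one of the two halves is ever searched.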