forked from pllk/cphb
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathchapter02.tex
538 lines (460 loc) · 14.6 KB
/
chapter02.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
\chapter{Time complexity}
\index{time complexity}
The efficiency of algorithms is important in competitive programming.
Usually, it is easy to design an algorithm
that solves the problem slowly,
but the real challenge is to invent a
fast algorithm.
If the algorithm is too slow, it will get only
partial points or no points at all.
The \key{time complexity} of an algorithm
estimates how much time the algorithm will use
for some input.
The idea is to represent the efficiency
as a function whose parameter is the size of the input.
By calculating the time complexity,
we can find out whether the algorithm is fast enough
without implementing it.
\section{Calculation rules}
The time complexity of an algorithm
is denoted $O(\cdots)$
where the three dots represent some
function.
Usually, the variable $n$ denotes
the input size.
For example, if the input is an array of numbers,
$n$ will be the size of the array,
and if the input is a string,
$n$ will be the length of the string.
\subsubsection*{Loops}
A common reason why an algorithm is slow is
that it contains many loops that go through the input.
The more nested loops the algorithm contains,
the slower it is.
If there are $k$ nested loops,
the time complexity is $O(n^k)$.
For example, the time complexity of the following code is $O(n)$:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
// code
}
\end{lstlisting}
And the time complexity of the following code is $O(n^2)$:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= n; j++) {
// code
}
}
\end{lstlisting}
\subsubsection*{Order of magnitude}
A time complexity does not tell us the exact number
of times the code inside a loop is executed,
but it only shows the order of magnitude.
In the following examples, the code inside the loop
is executed $3n$, $n+5$ and $\lceil n/2 \rceil$ times,
but the time complexity of each code is $O(n)$.
\begin{lstlisting}
for (int i = 1; i <= 3*n; i++) {
// code
}
\end{lstlisting}
\begin{lstlisting}
for (int i = 1; i <= n+5; i++) {
// code
}
\end{lstlisting}
\begin{lstlisting}
for (int i = 1; i <= n; i += 2) {
// code
}
\end{lstlisting}
As another example,
the time complexity of the following code is $O(n^2)$:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
for (int j = i+1; j <= n; j++) {
// code
}
}
\end{lstlisting}
\subsubsection*{Phases}
If the algorithm consists of consecutive phases,
the total time complexity is the largest
time complexity of a single phase.
The reason for this is that the slowest
phase is usually the bottleneck of the code.
For example, the following code consists
of three phases with time complexities
$O(n)$, $O(n^2)$ and $O(n)$.
Thus, the total time complexity is $O(n^2)$.
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
// code
}
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= n; j++) {
// code
}
}
for (int i = 1; i <= n; i++) {
// code
}
\end{lstlisting}
\subsubsection*{Several variables}
Sometimes the time complexity depends on
several factors.
In this case, the time complexity formula
contains several variables.
For example, the time complexity of the
following code is $O(nm)$:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= m; j++) {
// code
}
}
\end{lstlisting}
\subsubsection*{Recursion}
The time complexity of a recursive function
depends on the number of times the function is called
and the time complexity of a single call.
The total time complexity is the product of
these values.
For example, consider the following function:
\begin{lstlisting}
void f(int n) {
if (n == 1) return;
f(n-1);
}
\end{lstlisting}
The call $\texttt{f}(n)$ causes $n$ function calls,
and the time complexity of each call is $O(1)$.
Thus, the total time complexity is $O(n)$.
As another example, consider the following function:
\begin{lstlisting}
void g(int n) {
if (n == 1) return;
g(n-1);
g(n-1);
}
\end{lstlisting}
In this case each function call generates two other
calls, except for $n=1$.
Let us see what happens when $g$ is called
with parameter $n$.
The following table shows the function calls
produced by this single call:
\begin{center}
\begin{tabular}{rr}
function call & number of calls \\
\hline
$g(n)$ & 1 \\
$g(n-1)$ & 2 \\
$g(n-2)$ & 4 \\
$\cdots$ & $\cdots$ \\
$g(1)$ & $2^{n-1}$ \\
\end{tabular}
\end{center}
Based on this, the time complexity is
\[1+2+4+\cdots+2^{n-1} = 2^n-1 = O(2^n).\]
\section{Complexity classes}
\index{complexity classes}
The following list contains common time complexities
of algorithms:
\begin{description}
\item[$O(1)$]
\index{constant-time algorithm}
The running time of a \key{constant-time} algorithm
does not depend on the input size.
A typical constant-time algorithm is a direct
formula that calculates the answer.
\item[$O(\log n)$]
\index{logarithmic algorithm}
A \key{logarithmic} algorithm often halves
the input size at each step.
The running time of such an algorithm
is logarithmic, because
$\log_2 n$ equals the number of times
$n$ must be divided by 2 to get 1.
\item[$O(\sqrt n)$]
A \key{square root algorithm} is slower than
$O(\log n)$ but faster than $O(n)$.
A special property of square roots is that
$\sqrt n = n/\sqrt n$, so the square root $\sqrt n$ lies,
in some sense, in the middle of the input.
\item[$O(n)$]
\index{linear algorithm}
A \key{linear} algorithm goes through the input
a constant number of times.
This is often the best possible time complexity,
because it is usually necessary to access each
input element at least once before
reporting the answer.
\item[$O(n \log n)$]
This time complexity often indicates that the
algorithm sorts the input,
because the time complexity of efficient
sorting algorithms is $O(n \log n)$.
Another possibility is that the algorithm
uses a data structure where each operation
takes $O(\log n)$ time.
\item[$O(n^2)$]
\index{quadratic algorithm}
A \key{quadratic} algorithm often contains
two nested loops.
It is possible to go through all pairs of
the input elements in $O(n^2)$ time.
\item[$O(n^3)$]
\index{cubic algorithm}
A \key{cubic} algorithm often contains
three nested loops.
It is possible to go through all triplets of
the input elements in $O(n^3)$ time.
\item[$O(2^n)$]
This time complexity often indicates that
the algorithm iterates through all
subsets of the input elements.
For example, the subsets of $\{1,2,3\}$ are
$\emptyset$, $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$,
$\{1,3\}$, $\{2,3\}$ and $\{1,2,3\}$.
\item[$O(n!)$]
This time complexity often indicates that
the algorithm iterates through all
permutations of the input elements.
For example, the permutations of $\{1,2,3\}$ are
$(1,2,3)$, $(1,3,2)$, $(2,1,3)$, $(2,3,1)$,
$(3,1,2)$ and $(3,2,1)$.
\end{description}
\index{polynomial algorithm}
An algorithm is \key{polynomial}
if its time complexity is at most $O(n^k)$
where $k$ is a constant.
All the above time complexities except
$O(2^n)$ and $O(n!)$ are polynomial.
In practice, the constant $k$ is usually small,
and therefore a polynomial time complexity
roughly means that the algorithm is \emph{efficient}.
\index{NP-hard problem}
Most algorithms in this book are polynomial.
Still, there are many important problems for which
no polynomial algorithm is known, i.e.,
nobody knows how to solve them efficiently.
\key{NP-hard} problems are an important set
of problems, for which no polynomial algorithm
is known\footnote{A classic book on the topic is
M. R. Garey's and D. S. Johnson's
\emph{Computers and Intractability: A Guide to the Theory
of NP-Completeness} \cite{gar79}.}.
\section{Estimating efficiency}
By calculating the time complexity of an algorithm,
it is possible to check, before
implementing the algorithm, that it is
efficient enough for the problem.
The starting point for estimations is the fact that
a modern computer can perform some hundreds of
millions of operations in a second.
For example, assume that the time limit for
a problem is one second and the input size is $n=10^5$.
If the time complexity is $O(n^2)$,
the algorithm will perform about $(10^5)^2=10^{10}$ operations.
This should take at least some tens of seconds,
so the algorithm seems to be too slow for solving the problem.
On the other hand, given the input size,
we can try to \emph{guess}
the required time complexity of the algorithm
that solves the problem.
The following table contains some useful estimates
assuming a time limit of one second.
\begin{center}
\begin{tabular}{ll}
input size & required time complexity \\
\hline
$n \le 10$ & $O(n!)$ \\
$n \le 20$ & $O(2^n)$ \\
$n \le 500$ & $O(n^3)$ \\
$n \le 5000$ & $O(n^2)$ \\
$n \le 10^6$ & $O(n \log n)$ or $O(n)$ \\
$n$ is large & $O(1)$ or $O(\log n)$ \\
\end{tabular}
\end{center}
For example, if the input size is $n=10^5$,
it is probably expected that the time
complexity of the algorithm is $O(n)$ or $O(n \log n)$.
This information makes it easier to design the algorithm,
because it rules out approaches that would yield
an algorithm with a worse time complexity.
\index{constant factor}
Still, it is important to remember that a
time complexity is only an estimate of efficiency,
because it hides the \emph{constant factors}.
For example, an algorithm that runs in $O(n)$ time
may perform $n/2$ or $5n$ operations.
This has an important effect on the actual
running time of the algorithm.
\section{Maximum subarray sum}
\index{maximum subarray sum}
There are often several possible algorithms
for solving a problem such that their
time complexities are different.
This section discusses a classic problem that
has a straightforward $O(n^3)$ solution.
However, by designing a better algorithm, it
is possible to solve the problem in $O(n^2)$
time and even in $O(n)$ time.
Given an array of $n$ numbers,
our task is to calculate the
\key{maximum subarray sum}, i.e.,
the largest possible sum of
a sequence of consecutive values
in the array\footnote{J. Bentley's
book \emph{Programming Pearls} \cite{ben86} made the problem popular.}.
The problem is interesting when there may be
negative values in the array.
For example, in the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$-1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$-3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$-5$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}
\begin{samepage}
the following subarray produces the maximum sum $10$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (6,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$-1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$-3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$-5$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}
\end{samepage}
We assume that an empty subarray is allowed,
so the maximum subarray sum is always at least $0$.
\subsubsection{Algorithm 1}
A straightforward way to solve the problem
is to go through all possible subarrays,
calculate the sum of values in each subarray and maintain
the maximum sum.
The following code implements this algorithm:
\begin{lstlisting}
int best = 0;
for (int a = 0; a < n; a++) {
for (int b = a; b < n; b++) {
int sum = 0;
for (int k = a; k <= b; k++) {
sum += array[k];
}
best = max(best,sum);
}
}
cout << best << "\n";
\end{lstlisting}
The variables \texttt{a} and \texttt{b} fix the first and
last index of the subarray,
and the sum of values is calculated to the variable \texttt{sum}.
The variable \texttt{best} contains the maximum sum found during the search.
The time complexity of the algorithm is $O(n^3)$,
because it consists of three nested loops
that go through the input.
\subsubsection{Algorithm 2}
It is easy to make Algorithm 1 more efficient
by removing one loop from it.
This is possible by calculating the sum at the same
time when the right end of the subarray moves.
The result is the following code:
\begin{lstlisting}
int best = 0;
for (int a = 0; a < n; a++) {
int sum = 0;
for (int b = a; b < n; b++) {
sum += array[b];
best = max(best,sum);
}
}
cout << best << "\n";
\end{lstlisting}
After this change, the time complexity is $O(n^2)$.
\subsubsection{Algorithm 3}
Surprisingly, it is possible to solve the problem
in $O(n)$ time\footnote{In \cite{ben86}, this linear-time algorithm
is attributed to J. B. Kadane, and the algorithm is sometimes
called \index{Kadane's algorithm} \key{Kadane's algorithm}.}, which means
that just one loop is enough.
The idea is to calculate, for each array position,
the maximum sum of a subarray that ends at that position.
After this, the answer for the problem is the
maximum of those sums.
Consider the subproblem of finding the maximum-sum subarray
that ends at position $k$.
There are two possibilities:
\begin{enumerate}
\item The subarray only contains the element at position $k$.
\item The subarray consists of a subarray that ends
at position $k-1$, followed by the element at position $k$.
\end{enumerate}
In the latter case, since we want to
find a subarray with maximum sum,
the subarray that ends at position $k-1$
should also have the maximum sum.
Thus, we can solve the problem efficiently
by calculating the maximum subarray sum
for each ending position from left to right.
The following code implements the algorithm:
\begin{lstlisting}
int best = 0, sum = 0;
for (int k = 0; k < n; k++) {
sum = max(array[k],sum+array[k]);
best = max(best,sum);
}
cout << best << "\n";
\end{lstlisting}
The algorithm only contains one loop
that goes through the input,
so the time complexity is $O(n)$.
This is also the best possible time complexity,
because any algorithm for the problem
has to examine all array elements at least once.
\subsubsection{Efficiency comparison}
It is interesting to study how efficient
algorithms are in practice.
The following table shows the running times
of the above algorithms for different
values of $n$ on a modern computer.
In each test, the input was generated randomly.
The time needed for reading the input was not
measured.
\begin{center}
\begin{tabular}{rrrr}
array size $n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
\hline
$10^2$ & $0.0$ s & $0.0$ s & $0.0$ s \\
$10^3$ & $0.1$ s & $0.0$ s & $0.0$ s \\
$10^4$ & > $10.0$ s & $0.1$ s & $0.0$ s \\
$10^5$ & > $10.0$ s & $5.3$ s & $0.0$ s \\
$10^6$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\
$10^7$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\
\end{tabular}
\end{center}
The comparison shows that all algorithms
are efficient when the input size is small,
but larger inputs bring out remarkable
differences in the running times of the algorithms.
Algorithm 1 becomes slow
when $n=10^4$, and Algorithm 2
becomes slow when $n=10^5$.
Only Algorithm 3 is able to process
even the largest inputs instantly.