Mathematical Growth Rates / Analytic Time Complexity

In [3]:
import numpy as np
import matplotlib.pyplot as plt

Timing Function

Def. Timing Function $T(N)$: The amount of time elapsed, in some unit of time (e.g., seconds), for an algorithm to run on a problem of size $N$.

  • Everyone's computer goes at a different speed
  • Depends on implementation details
  • How much time each operation takes relative to the others
In [ ]:
a, b, c = 1, -3, 2  # example coefficients of ax^2 + bx + c (placeholder values)
root1 = (-b + (b**2 - 4*a*c)**0.5)/(2*a)

+, -, *, / are much faster than the square root

  • Variability in timing based on whatever else is running in the background (see the timing sketch below)
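
A rough sketch of measuring $T(N)$ directly (the summing workload and problem size here are placeholders; the numbers will vary from machine to machine and run to run):

In [ ]:
import time

def total(arr):
    # A simple O(N) workload: sum up N numbers
    s = 0
    for x in arr:
        s += x
    return s

arr = list(range(1000000))
start = time.perf_counter()
total(arr)
print("Elapsed:", time.perf_counter() - start, "seconds")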

Big O Notation for Functions

A mathematical construct to help us understand "blurry upper bounds" of functions. In particular, it describes how a function scales as we give it larger and larger inputs, ignoring constants and non-dominant terms.

Def. Big O Notation

A function $f(N)$ is $O(g(N))$ if there exist constants $c$ and $n_0$ such that

$f(N) \leq c*g(N)$ for all $N \geq n_0$

Ex) $f(N) = 5N^2 + 3N + 2$

Claim: $f(N)$ is $O(N^2)$

We need to find constants $c$ and $n_0$, with $g(N) = N^2$.

Choose $c = 5 + 3 + 2 = 10$.

Notice that $N^2 \geq N$ for $N \geq 1$, so $3N^2 \geq 3N$:

$5N^2 + 3N^2 + 2 \geq 5N^2 + 3N + 2, \quad N \geq 1$

Likewise $N^2 \geq 1$ for $N \geq 1$, so $2N^2 \geq 2$, and

$5N^2 + 3N^2 + 2N^2 \geq 5N^2 + 3N + 2, \quad N \geq 1$

The left-hand side is exactly $10N^2$, so

$f(N) \leq 10N^2, \quad N \geq 1$

and we can take $n_0 = 1$.

In [8]:
N = np.arange(6)
f = 5*N**2 + 3*N + 2
cg = 10*N**2
plt.plot(N, f)   # f(N) = 5N^2 + 3N + 2
plt.plot(N, cg)  # c*g(N) = 10N^2
plt.legend(["f", "c*g"])
Out[8]:
<matplotlib.legend.Legend at 0x1cd51daef98>

Big O in Algorithms T(N)

Think about the best case $T(N)$, the average case $T(N)$, and the worst case $T(N)$.

We focus on the worst case.

Ex) Finding duplicate elements in a list

Naive algorithm:

For all i in range(N) and j in range(N), compare the elements at positions i and j. This goes through $N*N$ iterations. Let's assume the most time an iteration can take is some constant $a$.

$T(N) \leq a*N^2$

$T(N)$ is $O(N^2)$
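
A minimal sketch of the naive algorithm (the function name and return convention are my own choices here):

In [ ]:
def has_duplicates_naive(arr):
    N = len(arr)
    # Compare every pair (i, j): N*N loop iterations
    for i in range(N):
        for j in range(N):
            if i != j and arr[i] == arr[j]:
                return True
    return False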

Improved algorithm:

Insist that $i > j$; then we go through only $N(N-1)/2$ iterations.

This is still $O(N^2)$:

$N(N-1)/2 = N^2/2 - N/2$

The $1/2$ is a constant, and the $-N/2$ term gets dominated by $N^2/2$, so the saving doesn't help much as $N$ grows.
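
The same sketch with the $i > j$ restriction:

In [ ]:
def has_duplicates_improved(arr):
    N = len(arr)
    # Only compare pairs with i > j: N(N-1)/2 iterations, still O(N^2)
    for i in range(N):
        for j in range(i):  # j runs over 0..i-1, so i > j
            if arr[i] == arr[j]:
                return True
    return False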

Best algorithm: sort the list first

$T(N) = a*N*\log_2(N) + b*N$

Claim: $T(N)$ is $O(N \log_2(N))$

This holds as long as $N \geq 2$, because then $\log_2(N) \geq 1$, so $b*N \leq b*N*\log_2(N)$:

$a*N*\log_2(N) + b*N \leq a*N*\log_2(N) + b*N*\log_2(N), \quad N \geq 2$

$T(N) \leq (a+b)*N*\log_2(N), \quad N \geq 2$
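
One way to realize this bound (a sketch, not necessarily the version from class): sort first, after which any duplicates must be adjacent, so a single linear scan finds them. Python's built-in sorted supplies the $N \log N$ term.

In [ ]:
def has_duplicates_sorted(arr):
    arr = sorted(arr)  # the a*N*log_2(N) term
    # After sorting, duplicates sit next to each other: the b*N scan
    for i in range(len(arr) - 1):
        if arr[i] == arr[i + 1]:
            return True
    return False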

Base of logarithm doesn't matter

A change of base comes out as a constant, which can always be rolled into our constant choice in big O

$\log_a(N) = \log_b(N)/\log_b(a)$

So algorithmists usually don't specify a base when talking about logarithms
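
A quick numerical check of the identity with $a = 2$ and $b = 10$ (reusing the numpy import from above):

In [ ]:
N = np.arange(2, 100)
# log_2(N) equals log_10(N)/log_10(2) up to floating point error
print(np.allclose(np.log2(N), np.log10(N) / np.log10(2)))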
