The List Data Structure And Amortized Cost

Roughly equivalent to Java's ArrayList and C++'s stl vector

In [1]:
arr = [1, 2, 3]
arr.append(10)
arr.append(20)
print(arr.pop())
print(arr)
20
[1, 2, 3, 10]

For an actual array in Java/C++, if you want to remove the last element, you have to make an entirely new array one size smaller, copy everything over except for the last element.

if you want to add something to the end, you have to make an entirely new array one size larger, copy everything over, and then put in that new element

In [19]:
class ArrayList:
    def __init__(self, capacity = 10):
        self._capacity = capacity
        self._arr = [0]*capacity # Have an array in the background, and I'm only allowed to
        # make a new array of a fixed length, or to modify the current array's elements
        self._N = 0 # How many elements we have
    
    def add_slow(self, x):
        """
        Add an element at the end of the list, adding 1
        to the capacity if needed
        Parameters
        ----------
        x: The element to add
        """
        if self._N == self._capacity:
            self._capacity += 1
            new_arr = [0]*self._capacity
            for i in range(self._N):
                new_arr[i] = self._arr[i]
            self._arr = new_arr
        # Now I know I have enough room no matter what
        self._arr[self._N] = x
        self._N += 1

    def add_fast(self, x):
        """
        Add an element at the end of the list, doubling
        capacity if needed
        (NOTE: We could do more work to share between add_slow)
        Parameters
        ----------
        x: The element to add
        """
        if self._N == self._capacity:
            self._capacity *= 2
            new_arr = [0]*self._capacity
            for i in range(self._N):
                new_arr[i] = self._arr[i]
            self._arr = new_arr
        # Now I know I have enough room no matter what
        self._arr[self._N] = x
        self._N += 1
    
    def at(self, i):
        return self._arr[i]

    def __str__(self):
        s = ""
        for i in range(self._N):
            s += "{}".format(self._arr[i])
            if i < self._N-1:
                s += ", "
        return s
    
    def __len__(self):
        return self._N
In [20]:
L = ArrayList(20)
for i in range(30):
    L.add_slow(i*2)
print(L)
0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58

Ex) capacity is 10, and we add 15 elements.

First 10 adds are going $O(1)$ After 10, all adds are going $O(N)$

Assume that increasing the capacity by 1 takes $a*(N+1)$,

and adding an element takes $b$ operations

$10*b$ for the first 10 $a*(11 + 12 + 13 + 14 + 15) + b*5$

Total cost is $15*b + a*(11+12+13+14+15) = 15b + 65a$

What was the amortized cost per operation in these first 15 adds. What is the average time it takes per add

In [21]:
(15b + 65a)/15 ~ b + 4a
Out[21]:
65
In [23]:
65/15
Out[23]:
4.333333333333333

Amortized Analysis of add_slow

Ex) Add $N$ elements, where N is very large

$T(N)$ is the time to add $N$ elements

$T(N) \leq Nb + (1 + 2 + 3 + ... + N)a$

There's a identity that we can use

$ 1 + 2 + 3 + ... + N = N(N+1)/2 $

So we can simplify to

$T(N) \leq bN + aN(N+1)/2 $

The amortized cost, or average time per $N$ operations is

$\frac{T(N)}{N} \leq b + (N+1)/2 $

Amortized cost $T(N) / N$ is $O(N)$

Just to say it again, this means that on average, it takes $O(N)$ to add a single element! This is really slow.

In [ ]: