Amortized Cost of List Doubling

In [ ]:
class ArrayList:
    def __init__(self, capacity = 10):
        self._capacity = capacity
        self._arr = [0]*capacity # Have an array in the background, and I'm only allowed to
        # make a new array of a fixed length, or to modify the current array's elements
        self._N = 0 # How many elements we have

    def add_fast(self, x):
        """
        Add an element at the end of the list, doubling
        capacity if needed
        (NOTE: We could do more work to share between add_slow)
        Parameters
        ----------
        x: The element to add
        """
        if self._N == self._capacity:
            self._capacity *= 2
            new_arr = [0]*self._capacity
            for i in range(self._N):
                new_arr[i] = self._arr[i]
            self._arr = new_arr
        # Now I know I have enough room no matter what
        self._arr[self._N] = x
        self._N += 1

    def __len__(self):
        return self._N

Consider adding $N$ elements. Furthermore, let's suppose that when we add the Nth element, we double right at that moment. This is really the worst case, because we just spent all of this time doubling the capacity, but then we walked away

The total time to add $N$ elements, $T(N)$ consists of two parts:

  1. The cost to change an element at a particular index in the array and to update the array length.
In [ ]:
self._arr[self._N] = x
self._N += 1

Suppose that these operations take some constant $a$

Overall, this contributes a total of $T_a(N) = aN$ to the total cost


  1. The underlying array will double. Let's assume that to create a new array of size $K$ and to copy in the elements takes $bK$

Let's work backwards to figure out how much we have to pay over all of the doublings

  • The last doubling happened at $N$, so we doubled up to $2N$, which took $b2N$

  • Second to last doubling, doubled to a size of $N$, took $bN$

  • Third to last doubling, double to a size of $N/2$, took $bN/2$

  • ...

Total contribution over all doublings is

$ T_d(N) = 2bN + bN + bN/2 + bN/4 + ... + n_0$ ($n_0$ initial capacity)

$ T_d(N) \leq 2bN(1 + 1/2 + 1/4 + 1/8 + ... + 0) $

We can use the identity that

1 + 1/2 + 1/4 + 1/8 + ... 0 = 2. Therefore

$T_d(N) \leq 2bN*2$

$T_d(N) \leq 4bN$

Total time $T(N) = T_a(N) + T_d(N) \leq aN + 4bN$

$T(N) \leq (a + 4b) N$

Let $c = a + 4b$

$T(N) \leq cN \implies T(N)$ is $O(N)$

Amortized (Average) cost $T(N) / N$ is $c$, which is $O(1)$

In [ ]: