William Liu

Technical Coding Interview Cheat Sheet

Summary

I got into software development through a fairly non-traditional path (Peace Corps, nonprofits, then software engineering). When I first started interviewing, I naively thought it would be all behavioral questions (since those were the only questions I had seen). Needless to say, I failed my first few interviews.

I'm writing this post as what I would tell my younger self about how to study for a technical coding / skills assessment.

  1. Coding for interviews is a very different skill set from what you do at work. You need to practice specifically for it (sorry, that's just the reality at most companies; I'll save my thoughts on this for another post)
  2. Pick one language (the simpler and less verbose the better; I use Python)
  3. Read the first few chapters of Cracking the Coding Interview (note: solutions are in Java). Optionally pick up a language-specific book (e.g. Elements of Programming Interviews in Python) to get the gist of what questions/solutions look like
  4. Learn the data structures and algorithms with Neetcode.io, a more targeted set of questions pulled from places like Leetcode or other variations like HackerRank
  5. Interview

Question Types

The sections below cover the types of questions you'll see.

Arrays

RAM

RAM / Memory is where data is stored.

Example array: [1, 3, 5]

Some rough numbers:

* 8 GB of RAM = 8 * 10^9 bytes
* 1 byte = 8 bits
* A bit is a single position that can hold either a 0 or a 1
* A single character can take 4 bytes (i.e. 32 bits)

Arrays will always be a contiguous allocation of RAM Addresses.

Static Arrays

Static Arrays have a fixed size and are generally not that useful (see Dynamic Arrays instead). Some programming languages (e.g. Python and JavaScript) don't offer static arrays at all and instead only provide Dynamic Arrays.

If you want to insert or remove a value anywhere other than the end of a static array, you need to shift every value after it, which costs O(n) in the worst case.

| Operation | Big-O Time |
| ----------------------- | ---------- |
| Read/Write i-th element | O(1) |
| Insert/Remove Beginning | O(n) |
| Insert/Remove Middle | O(n) |
| Insert/Remove End | O(1) |
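
To make the O(n) shift concrete, here's a minimal sketch (using a plain Python list as a stand-in for a fixed-capacity static array; the insert_middle helper is just for illustration):

# Insert into a fixed-capacity array by shifting elements right: O(n)
def insert_middle(arr, length, index, value):
    # Shift every element from the end down to `index` one slot to the right
    for i in range(length, index, -1):
        arr[i] = arr[i - 1]
    arr[index] = value
    return length + 1  # the new logical length

arr = [1, 3, 5, 0]  # capacity 4, logical length 3
length = insert_middle(arr, 3, 1, 2)
print(arr[:length])  # [1, 2, 3, 5]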

Dynamic Array

Dynamic Arrays do not have a fixed size; they usually start with a default capacity (and a length of 0) and grow as needed. Programming languages like Python use this as their default list.

Note: in Big-O notation, we ignore constants that multiply or add to a variable (e.g. 20 * n and 2 + n are both O(n)). We do care when the variable is raised to a power (e.g. n^2).

| Operation | Big-O Time |
| ----------------------- | ---------- |
| Read/Write i-th element | O(1) |
| Insert/Remove Beginning | O(n) |
| Insert/Remove Middle | O(n) |
| Insert/Remove End | O(1) (insert is amortized, due to occasional resizing) |
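
As a rough sketch of what languages do under the hood (assuming the common capacity-doubling strategy; real implementations like CPython's list over-allocate differently), appends are amortized O(1):

# Minimal dynamic array sketch: append doubles the capacity when full
class DynamicArray:
    def __init__(self):
        self.capacity = 2
        self.length = 0
        self.arr = [0] * self.capacity

    def append(self, value):
        if self.length == self.capacity:
            self.resize()  # occasional O(n) copy, amortized O(1) overall
        self.arr[self.length] = value
        self.length += 1

    def resize(self):
        # Allocate a new array twice as large and copy the values over: O(n)
        self.capacity *= 2
        new_arr = [0] * self.capacity
        for i in range(self.length):
            new_arr[i] = self.arr[i]
        self.arr = new_arr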

Stacks

Stacks are basically a Dynamic Array where we only push (add to the end) and pop (remove from the end), both O(1).

Example:

class Stack:
    def __init__(self):
        self.stack = []

    def push(self, n):
        self.stack.append(n)

    def pop(self):
        return self.stack.pop()

Linked List

Singly Linked List

A Singly Linked List is a node that points to other node(s).

E.g. ListNode1 (value, pointer to ListNode2), ListNode2 (value, pointer to ListNode3)

class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None

# Singly Linked List
class LinkedList:
    def __init__(self):
        # init the list with a 'dummy' node
        self.head = ListNode(-1)
        self.tail = self.head

    def insertEnd(self, value):
        self.tail.next = ListNode(value)
        self.tail = self.tail.next

    def remove(self, index):
        i = 0
        current = self.head
        while i < index and current:
            i += 1
            current = current.next

        # Remove the node after current; .next.next skips over the removed node
        if current and current.next:
            current.next = current.next.next

    def print(self):
        current = self.head.next
        while current:
            print(current.value, ' -> ', end="")
            current = current.next
        print()

| Operation | Big-O Time |
| ----------------------- | ---------- |
| Read/Write i-th element | O(n) |
| Insert/Remove Beginning | O(1) |
| Insert/Remove Middle | O(n) |
| Insert/Remove End | O(n) (insert is O(1) with a tail pointer, as above) |

Doubly Linked List

A Doubly Linked List is similar to a Singly Linked List, except each node has two pointers (prev, next)

Example:

* ListNode1 (prev = null, value, next = pointer to ListNode2)
* ListNode2 (prev = pointer to ListNode1, value, next = pointer to ListNode3)
* ListNode3 (prev = pointer to ListNode2, value, next = null)

| Operation | Big-O Time |
| ----------------------- | ---------- |
| Read/Write i-th element | O(n) |
| Insert/Remove Beginning | O(1) |
| Insert/Remove Middle | O(n) |
| Insert/Remove End | O(1) |

class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.prev = None

# Implementation for Doubly Linked List
class LinkedList:
    def __init__(self):
        # Init the list with 'dummy' head and tail nodes
        # which makes edge cases for insert and remove easier
        self.head = ListNode(-1)
        self.tail = ListNode(-1)
        self.head.next = self.tail
        self.tail.prev = self.head

    def insertFront(self, value):
        newNode = ListNode(value)
        newNode.prev = self.head
        newNode.next = self.head.next

        self.head.next.prev = newNode
        self.head.next = newNode

    def insertEnd(self, value):
        newNode = ListNode(value)
        newNode.next = self.tail
        newNode.prev = self.tail.prev

        self.tail.prev.next = newNode
        self.tail.prev = newNode

    # Remove first node after dummy head (assume it exists)
    def removeFront(self):
        self.head.next.next.prev = self.head
        self.head.next = self.head.next.next

    # Remove last node before dummy tail (assume it exists)
    def removeEnd(self):
        self.tail.prev.prev.next = self.tail
        self.tail.prev = self.tail.prev.prev

    def print(self):
        curr = self.head.next
        while curr != self.tail:
            print(curr.value, " -> ")
            curr = curr.next
        print()

Queues

Queues are a data structure similar to Stacks, except they're FIFO: they support Enqueue (add to the end) and Dequeue (remove from the beginning).

| Operation | Big-O Time |
| --------- | ---------- |
| Enqueue | O(1) |
| Dequeue | O(1) |

class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None

class Queue:
    def __init__(self):
        self.left = self.right = None

    def enqueue(self, value):
        newNode = ListNode(value)

        # Queue is non-empty
        if self.right:
            self.right.next = newNode
            self.right = self.right.next

        # Queue is empty
        else:
            self.left = self.right = newNode

    def dequeue(self):
        # Queue is empty
        if not self.left:
            return None

        # Remove left node and return value
        value = self.left.value
        self.left = self.left.next
        return value

    def print(self):
        current = self.left
        while current:
            print(current.value, ' -> ', end ="")
            current = current.next
        print()
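
In practice you rarely hand-roll this in Python; collections.deque (used elsewhere in this post for BFS) already gives O(1) operations at both ends:

from collections import deque

queue = deque()
queue.append(1)         # enqueue: O(1)
queue.append(2)
print(queue.popleft())  # dequeue: O(1), prints 1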

Recursion

One-Branch Recursion

n! = n * (n-1) * (n-2) * … * 1

Example: 5! = 5 * 4 * 3 * 2 * 1

Notice: 5! = 5 * 4! = 5 * (5-1)!

Pattern: n! = n * (n-1)!

We take one big problem and turn it into smaller subproblems, recursing until we reach the base case (e.g. 1).

Example:

# Recursive implementation of n! (n-factorial) calculation
def factorial(n):
    # Base case: n = 0 or 1
    if n <= 1:
        return 1

    # Recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)

Sorting

Insertion Sort

Insertion Sort is a sorting algorithm similar to how we sort playing cards in our hands. Use it when the data is almost entirely sorted or there are only a small number of elements; if there are a lot of elements, use Quicksort instead.

* Worst-case time: O(n^2)
* Best-case time: O(n)
* Average-case time: O(n^2)
* Space: O(1)

Code:

""" Insertion Sort """

def insertionSort(mylist):
    for index in range(1, len(mylist)):
        print("Index is ", index)  # 1, 2, 3, 4, 5, 6, 7, 8; this is the outer loop

        # setup first case (only one item)
        currentvalue = mylist[index]
        position = index

        # this is the inner loop, loops through the sorted list backwards and compares values
        while position > 0 and mylist[position-1] > currentvalue:
            mylist[position] = mylist[position-1]
            position = position - 1

        mylist[position] = currentvalue  # found spot in inner sorted loop to place item

if __name__ == '__main__':
    mylist = [54,26,93,17,77,31,44,55,20]
    print("Original: ", mylist)
    insertionSort(mylist)
    print("Insertion Sorted: ", mylist)

Merge Sort

Mergesort is an algorithm based on a divide and conquer strategy that operates well on any type of dataset, large or small.

* Worst-case time: O(n log n)
* Best-case time: O(n log n)
* Average-case time: O(n log n)
* Space: O(n)

Merge Sort:

""" Merge Sort """
def mergeSort(mylist):
    print("Splitting", mylist)

    if len(mylist) > 1:
        mid = len(mylist) // 2
        lefthalf = mylist[:mid]
        print("Left half ", lefthalf)
        righthalf = mylist[mid:]
        print("Right half ", righthalf)

        mergeSort(lefthalf)
        mergeSort(righthalf)

        # below code merges the two smaller sorted lists to larger sorted list
        i = 0  # left half index
        j = 0  # right half index
        k = 0  # main / large sorted list

        while i < len(lefthalf) and j < len(righthalf):

            # take the smallest value from either left or right half
            if lefthalf[i] < righthalf[j]:
                mylist[k] = lefthalf[i]  # smaller value on lefthalf
                i += 1
            else:
                mylist[k] = righthalf[j]  # smaller value on righthalf
                j += 1
            k += 1

        # insert remaining values from lefthalf
        while i < len(lefthalf):
            mylist[k] = lefthalf[i]
            i += 1
            k += 1

        # insert remaining values from righthalf
        while j < len(righthalf):
            mylist[k] = righthalf[j]
            j += 1
            k += 1

    print("Merging", mylist)


if __name__ == '__main__':
    mylist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
    print("Original: ", mylist)
    mergeSort(mylist)
    print("Merge Sorted: ", mylist)

Quick Sort

Quicksort is an in-place, O(1)-extra-memory sorting algorithm (ignoring the recursion stack) based on a divide-and-conquer strategy. It's generally more efficient for small datasets or where the elements are fairly evenly distributed over the range.

* Worst-case time: O(n^2)
* Best-case time: O(n log n)
* Average-case time: O(n log n)
* Space: O(1)

Code:

""" Quick Sort in Python3
Quick sort uses divide and conquer to gain the same advantages as merge sort,
with the benefit of using less storage, but at the cost of a worse worst case runtime
O(n^2) if the pivot values are bad.
"""


def quickSort(mylist):
    """ Initialize our recursive function """
    quickSortHelper(mylist, 0, len(mylist)-1)

def quickSortHelper(mylist, first, last):
    """ Recursive function to split up """
    if first < last:  # check if need to sort still

        splitpoint = partition(mylist, first, last)

        # now that we know our splitpoint, we can then recursively run quicksort on the list's bottom half and top half
        quickSortHelper(mylist, first, splitpoint-1)
        quickSortHelper(mylist, splitpoint+1, last)

def partition(mylist, first, last):
    """ Partition Process, made up of:
    * Pick a pivot value (i.e. what we'll compare our unsorted numbers to)
    Based off this value, we'll compare our unsorted values and either move
    our items to the left of the pivot or to the right of the pivot.
    * """
    pivotvalue = mylist[first]  # get the first value as pivotvalue

    leftmark = first + 1
    rightmark = last

    done = False
    while not done:

        # Go from leftmost side onwards (to right) and try to find a value
        # that is greater than the pivot value (i.e. left side of pivot should be
        # smaller values than pivot value, if we found one that is greater, we
        # stop at leftmark, saying we need to do a swap to the right side)
        while leftmark <= rightmark and mylist[leftmark] <= pivotvalue:
            leftmark += 1

        # Go from rightmost side inwards (to left) and try to find a value
        # that is less than the pivot value (i.e. right side of pivot should be
        # greater values than pivot value, if we found one that is smaller, we
        # stop at rightmark, saying we need to do a swap to the left side)
        while rightmark >= leftmark and mylist[rightmark] >= pivotvalue:
            rightmark -= 1

        if rightmark < leftmark:
            done = True  # we're done sorting through this list because we've crossed
        else:
            # we have a swap between a value in the left list and a value in the right list
            mylist[leftmark], mylist[rightmark] = mylist[rightmark], mylist[leftmark]

    # Once rightmark is less than leftmark, then rightmark is now the split point.
    # That means what we picked as the pivot value can now be exchanged with the
    # contents of the split point and the pivot value is now in the correct place
    # Note: remember that our pivot value was the first value in our list
    mylist[first], mylist[rightmark] = mylist[rightmark], mylist[first]

    return rightmark


if __name__ == '__main__':
    mylist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
    print("Original: ", mylist)
    quickSort(mylist)
    print("Quick Sorted: ", mylist)

Bucket Sort
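
Bucket Sort distributes values into buckets and then reads the buckets back in order. It can run in O(n) time, but only when the values fall into a small, known range. A minimal counting-style sketch, assuming values are limited to 0-2:

def bucketSort(arr):
    # One bucket (count) per possible value; assumes values are 0, 1, or 2
    counts = [0, 0, 0]
    for n in arr:
        counts[n] += 1

    # Overwrite the input array by reading the buckets back in order
    i = 0
    for n in range(len(counts)):
        for _ in range(counts[n]):
            arr[i] = n
            i += 1
    return arr

print(bucketSort([2, 0, 2, 1, 1, 0]))  # [0, 0, 1, 1, 2, 2]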

Binary Search

Binary Search works on a sorted list of items by repeatedly dividing in half to narrow down your search.

Example: guess a number from 1-100.

* You pick 50; it's lower than 50, so you pick 25 next
* It's higher than 25, so you pick 37 (midway between 26 and 49), and so on

arr = [1, 3, 3, 4, 5, 6, 7, 8]

# Python implementation of Binary Search
def binarySearch(arr, target):
    L, R = 0, len(arr) - 1

    while L <= R:
        mid = (L + R) // 2

        if target > arr[mid]:
            L = mid + 1
        elif target < arr[mid]:
            R = mid - 1
        else:
            return mid
    return -1

| Operation | Big-O Time |
| --------- | ---------- |
| Memory | O(1) |
| Time | O(log n) |

Search Range

Search Range is a slight variation of Binary Search where you aren't given a sorted array. Instead of an array of 1-100 and a target (e.g. 10), you might be given only a search range (low to high), where n is the secret number you have to guess.

# low = 1, high = 100

# Binary search on some range of values
def binarySearch(low, high):

    while low <= high:
        mid = (low + high) // 2

        if isCorrect(mid) > 0:
            high = mid - 1
        elif isCorrect(mid) < 0:
            low = mid + 1
        else:
            return mid
    return -1

# Return 1 if n is too big, -1 if too small, 0 if correct
def isCorrect(n):
    if n > 10:
        return 1
    elif n < 10:
        return -1
    else:
        return 0

Trees

Binary Trees

Binary Trees are similar in structure to Linked Lists in that there are Nodes, which can hold values (e.g. numbers, strings), and two (left and right due to binary) nodes that are typically pointed down.

Example Code:

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

Binary Search Tree (BST)

Binary Search Tree (BST) is a special type of Binary Tree that has a sorted property. These are not sorted like an array.

Example Code:

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def search(root, target):
    # Need a root
    if not root:
        return False

    if target > root.value:
        return search(root.right, target)
    elif target < root.value:
        return search(root.left, target)
    else:
        return True

Why create a Binary Search Tree when we have sorted arrays? If you want to add or remove a value from a sorted array, it's O(n) due to shifting values over. The advantage of a Binary Search Tree is that inserting and deleting values can also be O(log n).

Assuming the Binary Search Tree is roughly balanced (for every subtree, the heights of the left and right sides differ by at most 1):

| Operation | Big-O Time |
| ----------------------- | ---------- |
| Read/Write i-th element | O(log n), i.e. O(h) where h is the height |

If the Binary Search Tree is not balanced (e.g. one subtree has a height of 10 and another has a height of 2):

| Operation | Big-O Time |
| ----------------------- | ---------- |
| Read/Write i-th element | O(n) |


BST Insert and Remove

Assuming the tree is roughly balanced, inserting traverses the height h of the tree, which is O(log n) for a balanced tree.


Note: There are more advanced cases of inserting to create a balanced AVL tree.

For removal, we have a couple of cases:

* The node has 0 or 1 children: replace the node with its non-null child (if any)
* The node has 2 children: replace the node's value with the minimum value of its right subtree, then remove that minimum node

The worst case is O(log n) for a balanced tree: we traverse the height once to find the node we're removing/replacing, then traverse the height again to remove the replacement node.

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

# Insert a new node and return the root of the BST
def insert(root, value):
    if not root:
        return TreeNode(value)

    if value > root.value:
        root.right = insert(root.right, value)
    elif value < root.value:
        root.left = insert(root.left, value)
    return root

# Return the minimum value node of the BST
def minValueNode(root):
    current = root
    while current and current.left:
        current = current.left
    return current

# Remove a node and return the root of the BST
def remove(root, value):
    if not root:
        return None

    if value > root.value:
        root.right = remove(root.right, value)
    elif value < root.value:
        root.left = remove(root.left, value)
    else:
        if not root.left:
            return root.right
        elif not root.right:
            return root.left
        else:
            minNode = minValueNode(root.right)
            root.value = minNode.value
            root.right = remove(root.right, minNode.value)
    return root

With a sorted array, you can iterate through that array by going left to right. One algorithm for traversing a tree is Depth-First-Search (DFS), where you go as deep (height wise) in one subtree as possible before going to another subtree.

Within a tree, you can do many types of traversals (recursively), including inorder, reverse inorder, preorder, and postorder:

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

# normally you want to go in order (lowest to highest)
def inorder(root):
    if not root:
        return
    inorder(root.left)
    print(root.value)  # or do something here
    inorder(root.right)

# if you want to go in reverse order (highest to lowest)
def reverseorder(root):
    if not root:
        return
    inorder(root.right)
    print(root.value)  # or do something here
    inorder(root.left)

# print/do something before we go to the left subtree
def preorder(root):
    if not root:
        return
    print(root.value)  # or do something here
    preorder(root.left)
    preorder(root.right)

# print/do something after we go to the right subtree
def postorder(root):
    if not root:
        return
    postorder(root.left)
    postorder(root.right)
    print(root.value)  # or do something here

Breadth-First Search (BFS)

Breadth-First Search (BFS) can be applied to any tree, traversing layer by layer (instead of depth first). BFS is a little different from DFS: it isn't recursive and doesn't go height/depth first. BFS is also known as level order traversal: after we process a level of nodes, we process their children. BFS is O(n) since it visits each node once.

from collections import deque  # a double ended queue


class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def bfs(root):
    queue = deque()

    if root:
        queue.append(root)

    level = 0
    while len(queue) > 0:
        print("level: ", level)
        for i in range(len(queue)):
            current = queue.popleft()
            print(current.value)

            # this order determines if going left to right (or swap if you want right to left)
            if current.left:
                queue.append(current.left)
            if current.right:
                queue.append(current.right)
        level += 1

BST Sets and Maps

You can implement an Ordered (sorted) Set as a tree. Python doesn't have a native tree-backed set or map; collections.OrderedDict only preserves insertion order.

For a native Tree Map, the closest Python substitute is sortedcontainers.SortedDict, which keeps keys in sorted order.

from sortedcontainers import SortedDict

tree_map = SortedDict(
  {'c': 3,
   'a': 1,
   'b': 2
   })
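
Iterating a SortedDict walks the keys in sorted order (a plain dict would give insertion order):

for key in tree_map:
    print(key, tree_map[key])  # a 1, b 2, c 3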

Backtracking

Backtracking is an algorithm to recursively build a solution incrementally, one piece at a time, removing solutions that fail to meet our constraints. Common backtracking problem types include subsets, combinations, and permutations.

Example Problem: determine if a path exists from the root of the tree to a leaf node, where the path may not contain any zeroes.

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def canReachLeaf(root):

    # If the first/root node is 0, we can't reach anything downstream
    if not root or root.value == 0:
        return False

    if not root.left and not root.right:
        return True
    if canReachLeaf(root.left):
        return True
    if canReachLeaf(root.right):
        return True
    return False

def leafPath(root, path):
    if not root or root.value == 0:
        return False
    path.append(root.value)

    if not root.left and not root.right:
        return True
    if leafPath(root.left, path):
        return True
    if leafPath(root.right, path):
        return True
    path.pop()
    return False

Heap / Priority Queue

Heap Properties

A Heap / Priority Queue is different than a regular queue in that instead of FIFO, we can order based off a priority value. What gets popped is based on a priority property, which can pop the minimum or maximum of the priority value. A Priority Queue uses a Heap under the hood.

The Heap below is a Binary Heap (can be a minimum binary heap or maximum binary heap).

Heap represented as an Array

# Note: the heap is 1-indexed (index 0 goes unused) to keep the math clean
# leftChild of i = heap[2 * i]
# rightChild of i = heap[(2 * i) + 1]
# parent of i = heap[i // 2]  # note: we round down

Push and Pop

Pushing to a heap means inserting a value into the heap. To maintain the order property of a min heap, each parent must be smaller than its children, so we append the new value at the end and percolate up: compare the value with its parent and swap if the child is smaller, stopping once the parent is smaller than the child.

Popping from a heap means removing the root value. We can't just remove a value from the middle of a heap without losing the structure. The trick to maintaining the order is to replace the root with the last value in the heap, then percolate down, swapping with the smaller child until the order property holds.

# Min Heap
class Heap:
    def __init__(self):
        self.heap = [0]

    def push(self, value):
        self.heap.append(value)
        i = len(self.heap) - 1

        # Percolate up
        while i > 1 and self.heap[i] < self.heap[i // 2]:
            tmp = self.heap[i]
            self.heap[i] = self.heap[i // 2]
            self.heap[i // 2] = tmp
            i = i // 2

    def pop(self):
        if len(self.heap) == 1:
            return None
        if len(self.heap) == 2:
            return self.heap.pop()

        res = self.heap[1]
        # Move last value to root
        self.heap[1] = self.heap.pop()
        i = 1
        # Percolate down
        while 2 * i < len(self.heap):
            if (2 * i + 1 < len(self.heap) and \
                self.heap[2 * i + 1] < self.heap[2 * i] and \
                self.heap[i] > self.heap[2 * i + 1]):

                # Swap right child
                tmp = self.heap[i]
                self.heap[i] = self.heap[2 * i + 1]
                self.heap[2 * i + 1] = tmp
                i = 2 * i + 1
            elif self.heap[i] > self.heap[2 * i]:
                # Swap left child
                tmp = self.heap[i]
                self.heap[i] = self.heap[2 * i]
                self.heap[2 * i] = tmp
                i = 2 * i
            else:
                break
        return res

Pushing and popping are O(log n) (from percolating). We can get the min or max in O(1) time (instead of a Binary Search Tree's O(log n) time).

Heapify

There is a special algorithm called heapify that takes an arbitrary array and turns it into a heap in O(n) time.

Quick tip: taking the length of the array // 2 gives you the last node that has children (everything after it is a leaf). We go through every node that has children and percolate it down.

Continue moving to the left (up the tree) and then percolate / swap.

def heapify(self, arr):
    # 0-th position is moved to the end
    arr.append(arr[0])

    self.heap = arr
    cur = (len(self.heap) -1) // 2
    while cur > 0:
        # Percolate down
        i = cur
        while 2 * i < len(self.heap):
            if (2 * i + 1 < len(self.heap) and \
                self.heap[2 * i + 1] < self.heap[2 * i] and \
                self.heap[i] > self.heap[2 * i + 1]):
                # Swap right child
                tmp = self.heap[i]
                self.heap[i] = self.heap[2 * i + 1]
                self.heap[2 * i + 1] = tmp
                i = 2 * i + 1
            elif self.heap[i] > self.heap[2 * i]:
                # Swap left child
                tmp = self.heap[i]
                self.heap[i] = self.heap[2 * i]
                self.heap[2 * i] = tmp
                i = 2 * i
            else:
                break
        cur -= 1

Heaps are very important for algorithms.

Heapify runs in O(n) time. You can push and pop from the heap in O(log n) time. You can get the min or max in O(1) time.
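
In Python interviews you'd usually reach for the built-in heapq module (a min heap over a plain list) instead of writing your own:

import heapq

arr = [5, 3, 7, 1]
heapq.heapify(arr)         # O(n)
heapq.heappush(arr, 2)     # O(log n)
print(arr[0])              # peek the min in O(1): prints 1
print(heapq.heappop(arr))  # O(log n): prints 1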

Hash Sets/Maps

The most important data structure (in real life and leetcode).

Sets

E.g. a collection of unique keys (no duplicates).

Maps

Key-value pairs.

Properties:

* Each key is unique and maps to exactly one value
* Insert, remove, and search are O(1) on average
* Keys are not kept in sorted order

Hash Usage

| Operation | TreeMap | HashMap |
| --------- | ------- | ------- |
| Insert | O(log n) | O(1) |
| Remove | O(log n) | O(1) |
| Search | O(log n) | O(1) |
| Inorder (sorted traversal) | O(n) | O(n log n) |

Example: Count how often a name appears.

names = ["alice", "brad", "collin", "brad", "dylan", "kim"]

countMap = {}

for name in names:
    # if countMap does not contain name
    if name not in countMap:
        countMap[name] = 1
    else:
        countMap[name] += 1

Hash Implementation

A HashMap is implemented under the hood as an array.

hashmap.put("Alice", "NYC")
hashmap.put("Brad", "Chicago")
hashmap.put("Collin", "Seattle")

We take the key (e.g. "Alice") and hash that string into an integer (usually based on the ASCII values of its characters). E.g. a simple hashing algorithm could add up all the characters' values, then mod by how much space you have.

a = 0   # position of 'a' in the alphabet (you could also use its ASCII value)
l = 11  # position of 'l'
...     # add all of these up

We can then compute: index = (hash total) % (number of index spaces), which gives the slot where the key-value pair goes.

HashMap Example

| Index | Key, Value |
| ----- | ---------- |
| 0 | |
| 1 | "Alice", "NYC" |

Hashmap Collisions

As we insert values, we check when we reach 50% usage; we then double the space and rehash all of the key-value pairs to compute their new indexes. E.g. "Alice", "NYC" might be at index 3 instead of index 1 after we resize. This process is called rehashing the array.

Ideally the size of the array should be a prime number (instead of exactly doubling, roughly double until you hit a prime), which helps spread keys more evenly.

HashMap Example After Resizing

| Index | Key, Value |
| ----- | ---------- |
| 0 | |
| 1 | "Alice", "NYC" |
| 2 | |
| 3 | "Brad", "Chicago" |

Chaining vs Open Addressing

With Chaining, we store multiple key-value pairs at the same index by creating a linked list (or list) at that slot. The disadvantage is that we may need to traverse the chain.

With Open Addressing, we compute the index, and if that slot is full, we move on to the next index until we find an empty one. This is a naive form of open addressing (linear probing) that can cluster when there are a lot of collisions; there are smarter probing schemes (e.g. squaring the step). A minimal chaining sketch follows, before the open-addressing implementation.
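
A minimal sketch of the chaining approach (the ChainedHashMap name and structure here are just for illustration):

# Chaining sketch: each index holds a list of [key, value] pairs,
# so collisions just extend the chain at that index.
class ChainedHashMap:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.buckets = [[] for _ in range(self.capacity)]

    def hash(self, key):
        return sum(ord(c) for c in key) % self.capacity

    def put(self, key, value):
        bucket = self.buckets[self.hash(key)]
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value  # key already exists, update its value
                return
        bucket.append([key, value])

    def get(self, key):
        # Worst case we traverse the entire chain at this index
        for k, v in self.buckets[self.hash(key)]:
            if k == key:
                return v
        return None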

class Pair:
    def __init__(self, key, value):
        self.key = key
        self.value = value

class HashMap:
    def __init__(self):
        self.size = 0
        self.capacity = 2
        self.map = [None, None]

    def hash(self, key):
        index = 0
        for c in key:
            index += ord(c)
        return index % self.capacity  # need to mod in case of out-of-bounds

    def get(self, key):
        index = self.hash(key)

        # This implementation is using a naive Open Addressing (instead of Chaining)
        while self.map[index] != None:
            if self.map[index].key == key:
                return self.map[index].value
            index += 1
            index = index % self.capacity
        return None

    def put(self, key, value):
        index = self.hash(key)

        while True:
            if self.map[index] == None:
                self.map[index] = Pair(key, value)
                self.size += 1
                if self.size >= self.capacity // 2:
                    self.rehash()
                return
            elif self.map[index].key == key:
                self.map[index].value = value
                return

            index += 1
            index = index % self.capacity

    def remove(self, key):
        if not self.get(key):
            return

        index = self.hash(key)
        while True:
            if self.map[index].key == key:
                # Removing an element using open-addressing causes a bug,
                # because we may create a hole in the list, and our get() may
                # stop searching early when it reaches this hole.
                self.map[index] = None
                self.size -= 1
                return
            index += 1
            index = index % self.capacity

    def rehash(self):
        # Ideally we want prime numbers for capacity, but this naive way just doubles
        self.capacity = 2 * self.capacity
        newMap = []
        for i in range(self.capacity):
            newMap.append(None)

        oldMap = self.map
        self.map = newMap
        self.size = 0
        for pair in oldMap:
            if pair:
                self.put(pair.key, pair.value)

    def print(self):
        for pair in self.map:
            if pair:
                print(pair.key, pair.value)

Graphs

There are three common ways to represent graphs: a matrix, an adjacency matrix, and an adjacency list (each covered below).

We've already seen restricted kinds of graphs: Linked Lists and Trees are both graphs.

What makes up a graph? Nodes (also called vertices) and the edges that connect them.

Types of graphs

Matrix

A Matrix is a two dimensional array that can be used to represent a graph.

Example:

grid = [[0, 0, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 1, 0, 0]]

print(grid[1][2])  # gives us row 1, column 2 (`0`). Remember starts at row 0, col 0

Hint for defining the dimensions: ROWS, COLS = len(grid), len(grid[0])

Adjacency Matrix

Less common than regular matrices, adjacency matrices are usually square (e.g. the same shape as the grid above); each entry marks whether an edge exists from one vertex to another.

adjMatrix[v1][v2] = 1 # an edge exists from v1 to v2 (i.e. a directed edge)
adjMatrix[v2][v1] = 1 # an edge exists from v2 to v1 (i.e. another directed edge)
adjMatrix[v2][v3] = 0 # an edge does not exist from v2 to v3 (i.e. not a directed edge)

It's rarely used because we need an entire matrix, meaning O(V^2) space no matter how many edges we have. We can usually reduce this information to O(V + E) (number of nodes + number of edges) with an adjacency list.

Adjacency List

Adjacency Lists are the most common ways of representing graphs (especially during coding interviews). Example use cases are say a social network (who follows who).

# Used for Adjacency Lists

class GraphNode:
    def __init__(self, value):
        self.value = value
        self.neighbors = []

This is more space efficient (only declaring an array with the pointers we need)

Matrix DFS

Matrix Depth First Search (DFS)

Example question with DFS (and backtracking): count the unique paths from the top left to the bottom right, where a single path may only move along 0's and can't visit the same cell more than once (otherwise we could loop forever).

# Matrix (2D Grid)
grid = [[0, 0, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 1, 0, 0]]

# Count paths (backtracking)
def dfs(grid, r, c, visit):
    ROWS, COLS = len(grid), len(grid[0])

    # Don't go out of bounds or if we've been there already (visit)
    if (min(r, c) < 0 or
        r == ROWS or c == COLS or
        (r, c) in visit or grid[r][c] == 1):  # visit is a hash set
        return 0

    # Found a single path, assuming there's at least one valid path
    if r == ROWS - 1 and c == COLS - 1:
        return 1

    # Adding to a hash set is O(1)
    # Other options is to mark as a `1` if you visited already
    # Another option is to create a duplicate of the grid and modify the duplicated grid
    visit.add((r, c))

    count = 0  # how many ways can you reach the result
    count += dfs(grid, r + 1, c, visit)  # goes down first
    count += dfs(grid, r - 1, c, visit)  # goes up
    count += dfs(grid, r, c + 1, visit)  # goes right
    count += dfs(grid, r, c - 1, visit)  # goes left

    visit.remove((r, c))
    return count

print(dfs(grid, 0, 0, set()))

Say we have a 4x4 matrix: from each cell the worst case is 4 choices, then 4 more, and so on down each path/branch, which gives O(4^(r * c)) time. The memory complexity is O(r * c) (the recursion stack and the visit set).

Matrix BFS

BFS is a common algorithm that is a little simpler to understand (visually) than DFS and is also more time efficient, with complexity O(r * c). Instead of exhausting one path at a time (DFS), we expand outward layer by layer.

Question: Find the length of the shortest path from the top left of the grid to the bottom right.

# Shortest path from top left to bottom right
from collections import deque

def bfs(grid):
    ROWS, COLS = len(grid), len(grid[0])
    visit = set()
    queue = deque()
    queue.append((0, 0))
    visit.add((0, 0))

    length = 0
    while queue:
        for i in range(len(queue)):
            r, c = queue.popleft()
            # check if it's the destination
            if r == ROWS -1 and c == COLS - 1:
                return length  # when we reach the result

            # go through all four directions
            neighbors = [[0, 1], [0, -1], [1, 0], [-1, 0]]
            for dr, dc in neighbors:
                # e.g. first pass is dr=0, dc=1

                # check if out of bounds, if we visited already, and if it's an invalid location (1)
                if (min(r + dr, c + dc) < 0 or
                    r + dr == ROWS or c + dc == COLS or
                    (r + dr, c + dc) in visit or grid[r + dr][c + dc] == 1):
                    continue
                queue.append((r + dr, c + dc))
                visit.add((r + dr, c + dc))
        length += 1

print(bfs(grid))

Adjacency List

An Adjacency List is much easier to run an algorithm on instead of a Matrix. To implement an adjacency list, you can use a GraphNode (value with list of neighbors) or a HashMap.

# GraphNode for adjacency list
class GraphNode:
    def __init__(self, value):
        self.value = value
        self.neighbors = []

# Or use a HashMap
adjList = {"A": [], "B": []}

# Given directed edges, build an adjacency list
edges = [["A", "B"], ["B", "C"], ["B", "E"], ["C", "E"], ["E", "D"]]

adjList = {}  # Using a hashmap for this example

for src, dst in edges:
    if src not in adjList:
        adjList[src] = []
    if dst not in adjList:
        adjList[dst] = []
    adjList[src].append(dst)

DFS with an Adjacency List

DFS with Backtracking: O(N^V), not very efficient (exponential)

# Count paths (backtracking)
def dfs(node, target, adjList, visit):
    if node in visit:
        return 0
    if node == target:
        return 1

    count = 0
    visit.add(node)
    for neighbor in adjList[node]:
        count += dfs(neighbor, target, adjList, visit)
    visit.remove(node)

    return count

print(dfs("A", "E", adjList, set()))

BFS with an Adjacency List

O(V + E) time complexity and O(V) space complexity, where V is the number of vertices and E is the number of edges

# Shortest path from node to target
def bfs(node, target, adjList):
    length = 0
    visit = set()
    visit.add(node)
    queue = deque()
    queue.append(node)

    while queue:
        for i in range(len(queue)):
            current = queue.popleft()
            if current == target:
                return length

            for neighbor in adjList[current]:
                if neighbor not in visit:
                    visit.add(neighbor)
                    queue.append(neighbor)
        length += 1
    return length

print(bfs("A", "E", adjList))

Graphs can get more complicated once edges carry weights (e.g. if our graph is cities, each edge could have an actual weight like the miles/distance between two cities)

1-D Dynamic Programming

Example Problem: Fibonacci Sequence

Brute Force with Recursion

# Brute Force
def bruteForce(n):
    if n <= 1:
        return n
    return bruteForce(n - 1) + bruteForce(n - 2)

Memoization (aka Top Down Dynamic Programming)

Once you calculate something, you can store that result somewhere (that way you don’t create the subtrees that were already calculated). Basically adds caching.

For the cache, you can either use a hashmap or an array.

Takes O(n) time and O(n) space.


# Memoization (an optimization technique to make applications more efficient)
def memoization(n, cache):
    if n <= 1:
        return n
    if n in cache:
        return cache[n]  # here the cache is a hashmap

    # Cache so we don't have to make the same recursive calls
    cache[n] = memoization(n - 1, cache) + memoization(n - 2, cache)
    return cache[n]

Dynamic Programming (aka Bottom Up Dynamic Programming)

The true dynamic programming approach (aka bottom up dynamic programming) where we don’t use recursion. Instead of top down, we can start at the base case immediately and calculate up.

Time complexity is O(n) and memory is O(1) (since we only save the last two previous values). This is the most efficient way to solve the problem.

# Dynamic Programming
def dp(n):
    if n < 2:
        return n

    dp = [0, 1]
    i = 2
    while i <= n:
        tmp = dp[1]
        dp[1] = dp[0] + dp[1]
        dp[0] = tmp
        i += 1
    return dp[1]

This can be represented mathematically: F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2)

The main idea is that we take a big problem, break it into subproblems, and solve those subproblems.

2-D Dynamic Programming

Example Problem: Count the number of unique paths from the top left to the bottom right.

  1. You are only allowed to move down or to the right
  2. There are no blockers (i.e. every path to the result is the same length)

Brute Force

* Time: O(2^(n + m)) (each cell branches into two recursive calls)
* Space: O(n + m) (recursion depth)

# Brute Force
def bruteForce(r, c, rows, cols):
    if r == rows or c == cols:
        return 0
    if r == rows - 1 and c == cols - 1:
        return 1

    return (bruteForce(r + 1, c, rows, cols) +
            bruteForce(r, c + 1, rows, cols))

print(bruteForce(0, 0, 4, 4))

Memoization

* Time: O(n * m)
* Space: O(n * m)

# Memoization
def memoization(r, c, rows, cols, cache):
    if r == rows or c == cols:
        return 0
    if cache[r][c] > 0:
        return cache[r][c]
    if r == rows - 1 and c == cols -1:
        return 1

    cache[r][c] = (memoization(r + 1, c, rows, cols, cache) +
        memoization(r, c + 1, rows, cols, cache))
    return cache[r][c]

print(memoization(0, 0, 4, 4, [[0] * 4 for i in range(4)]))

Dynamic Programming

* Time: O(n * m)
* Space: O(m), where m is the number of columns

# Dynamic Programming
def dp(rows, cols):
    prevRow = [0] * cols

    for r in range(rows -1, -1, -1):
        curRow = [0] * cols
        curRow[cols - 1] = 1
        for c in range(cols -2, -1, -1):
            curRow[c] = curRow[c + 1] + prevRow[c]
        prevRow = curRow
    return prevRow[0]

Bit Manipulation

Bit Manipulation covers operations like AND, OR, XOR, NOT (negation), and bit shifting.

Bit Operations:

# AND - both bits need to be 1, otherwise will be 0
n = 1 & 1

# OR - just one bit needs to be 1, otherwise will be 0 if none are
n = 1 | 0

# XOR - exclusive OR, result will be 1 ONLY IF ONE of the bits is 1 (not both, or none)
n = 0 ^ 1

# NOT (negation) - opposite
n = ~n

# Bit Shifting
n = 1
n = n << 1
n = n >> 1

Truth Tables

AND

| Expression | Result |
| ---------- | ------ |
| 0 & 0 | 0 |
| 0 & 1 | 0 |
| 1 & 0 | 0 |
| 1 & 1 | 1 |

OR

| Expression | Result |
| ---------- | ------ |
| 0 \| 0 | 0 |
| 0 \| 1 | 1 |
| 1 \| 0 | 1 |
| 1 \| 1 | 1 |

XOR

| Expression | Result |
| ---------- | ------ |
| 0 ^ 0 | 0 |
| 0 ^ 1 | 1 |
| 1 ^ 0 | 1 |
| 1 ^ 1 | 0 |

Bases (e.g. Base 2, Base 10)

Exponents

The exponent of a number shows how many times the number is multiplied by itself. The zero-exponent property says any base raised to the power of 0 equals 1 (e.g. 10^0 = 1).

Base 10

In base 10, `1011` means 1 * 10^0 (the first digit from the right), 1 * 10^1 (the second digit), 0 * 10^2 (the third digit), and 1 * 10^3 (the fourth digit).

For base 10:
    5     4     3     2
    10^3  10^2  10^1  10^0
    1000s 100s  10s   1s

Bit Shifting for Base 10

Basically just multiplies by 10

e.g. 5432 would add a 0, so it would be 54320 (i.e. multiplies by 10)

Base 2 (aka Binary)

In base 2, `1011` means the first digit on the right marks the presence of 1 (present here). The next digit marks the presence of 2 (present). The third digit marks the presence of 4 (absent here), and the fourth the presence of 8 (present).

It's 2^0 for the rightmost digit, 2^1 for the next digit, 2^2 for the next, 2^3 for the next, and so on.

For binary (base 2):
    1    0    1    1
    2^3  2^2  2^1  2^0

Bit Shifting for Base 2

Shift left by one, e.g. `001 << 1` gives `010` (i.e. 1 * 2 = 2)
* If we shift left by 1 again, we now have `100`. We're multiplying by 2 again.
* If we shift left again (`100 << 1`), the leading `1` drops off (in a 3-bit register) and we have `000`.

Shift right by one, e.g. `100 >> 1` gives `010` (i.e. 4 / 2 = 2)
* If we shift right by 1 again, we now have `001`. We're dividing by 2 again.
* If the number is odd, shifting right rounds down.

Code:

Counting Bits

def countBits(n):
    count = 0
    while n > 0:
        if n & 1 == 1:  # AND with 1 checks the lowest bit
            count += 1
        n = n >> 1  # same as n // 2
    return count

# 23 = 10111 in binary (four 1 bits)
print(countBits(23))  # 4



Advanced Algorithms

Overview

Arrays

* Kadane's Algorithm
* Sliding Window Fixed Size
* Sliding Window Variable Size
* Two Pointers
* Prefix Sums

Linked Lists

* Fast and Slow Pointers

Trees

* Trie
* Union-Find
* Segment Tree
* Iterative DFS

Heaps

* Two Heaps

Graphs

* Dijkstra's
* Prim's
* Kruskal's

Dynamic Programming

* 0/1 Knapsack
* Unbounded Knapsack
* LCS
* Palindromes

Arrays

Kadane's Algorithm

Question: Given an array of positive or negative numbers, return a non-empty (contiguous) subarray with the largest sum.

Brute Force

O(n^2)

def bruteForce(nums):
    maxSum = nums[0]

    for i in range(len(nums)):
        curSum = 0
        for j in range(i, len(nums)):
            curSum += nums[j]
            maxSum = max(maxSum, curSum)

    return maxSum

Kadane’s Algorithm

# Kadane's Algorithm: O(n)
def kadanes(nums):
    maxSum = nums[0]
    curSum = 0

    for n in nums:
        curSum = max(curSum, 0)
        curSum += n
        maxSum = max(maxSum, curSum)
    return maxSum

Kadane’s Algorithm has a lot of overlap under the Sliding Window pattern.

Sliding Window

# Return the left and right index of the max subarray sum,
# Assuming there's exactly one result (no ties).
# Sliding window variation of Kadane's: O(n)

def slidingWindow(nums):
    maxSum = nums[0]
    curSum = 0
    maxL, maxR = 0, 0  # positions of window with the maxSum
    L = 0

    for R in range(len(nums)):
        if curSum < 0:
            curSum = 0
            L = R

        curSum += nums[R]
        if curSum > maxSum:
            maxSum = curSum
            maxL, maxR = L, R

    return [maxL, maxR]

Sliding Window Fixed Size

Question: Given an array, return true if there are two elements within a window of k that are equal

Example 1:

k = 2
array = [1, 2, 3, 2, 3, 3]
check 1,2
check 2,3
check 3,2
check 3,3 returns True

CheckKNearbyDuplicatesBruteForce

# Check if the array contains a pair of duplicate values,
# where the two duplicates are no farther than k positions from
# each other (i.e. nums[L] == nums[R] within a window of size k).
# O(n * k)
def checkKNearbyDuplicatesBruteForce(nums, k):
    for L in range(len(nums)):
        for R in range(L + 1, min(len(nums), L + k)):
            if nums[L] == nums[R]:
                return True
    return False

We use a hashset to optimize: keep the current window's values in the hashset, removing anything that falls outside the window.

CheckKNearbyDuplicatesOptimized

# Same problem using a sliding window
# O(n)
def checkKNearbyDuplicatesOptimized(nums, k):
    window = set()  # Cur window of size <= k
    L = 0

    for R in range(len(nums)):
        if R - L + 1 > k:
            window.remove(nums[L])
            L += 1
        if nums[R] in window:
            return True
        window.add(nums[R])

    return False

Sliding Window Variable Size

Example 1:

Question: Find the length of the longest subarray, with the same value in each position

Array = [4, 2, 2, 3, 3, 3]
4 = L = R
Increment R, check if they're the same value (4 and 2), they're not
Increment L until L = R (both at 2)
Increment R until it's a new value
...

longestSubarray

# Find the length of longest subarray with the same
# value in each position
# O(n)
def longestSubarray(nums):
    length = 0
    L = 0

    for R in range(len(nums)):
        if nums[L] != nums[R]:
            L = R
        length = max(length, R - L + 1)
    return length

Example 2:

Question: Find the minimum length subarray, where the sum is greater than or equal to the target. Assume all values are positive.

Target = 6
# Find length of minimum size subarray where the sum is
# greater than or equal to the target:
# O(n)
def shortestSubarray(nums, target):
    L, total = 0, 0
    length = float("inf")

    for R in range(len(nums)):
        total += nums[R]
        while total >= target:
            length = min(R - L + 1, length)
            total -= nums[L]
            L += 1
    return 0 if length == float("inf") else length

Two Pointers

Example 1:

Question: Check if an array is a palindrome

# Given a string of characters, return true if it's a palindrome,
# return false otherwise
# O(n)
def isPalindrome(word):
    L, R = 0, len(word) - 1
    while L < R:
        if word[L] != word[R]:
            return False
        L += 1
        R -= 1
    return True

Example 2:

Question: Given a sorted input array, return the two indices of two elements which sum up to the target value. Assume there’s exactly one solution.

Target = 7
Array = [-1, 2, 3, 4, 8, 9]

# Given a sorted array of integers, return the indices
# of two elements (in different positions) that sum up to the
# target value. Assume there is exactly one solution.
# O(n)

def targetSum(nums, target):
    L, R = 0, len(nums) - 1
    while L < R:
        if nums[L] + nums[R] > target:
            R -= 1
        elif nums[L] + nums[R] < target:
            L += 1
        else:
            return [L, R]

Prefix and Postfix Sums

Prefixes are contiguous blocks that start at the beginning.

array = [2, -1, 3, -3, 4]

Example Prefixes:
[2]
[2, -1]
[2, -1, 3]

A postfix would look like:

[4]
[-3, 4]
[3, -3, 4]

Example 1:

Question: Given an array of values, design a data structure that can query the sum of a subarray of the values

The idea is to precompute the work and save that data so we can use it in our calculations later.

class PrefixSum:

    def __init__(self, nums):
        self.prefix = []
        total = 0
        for n in nums:
            total += n
            self.prefix.append(total)

    def rangeSum(self, left, right):
        preRight = self.prefix[right]
        preLeft = self.prefix[left - 1] if left > 0 else 0  # handle out of bounds
        return (preRight - preLeft)

Linked Lists

Fast and Slow Pointers (aka Floyd’s tortoise and hare) algorithm

Example 1:

Question: Find the middle of a linked list

You could technically just count the length and then walk to the middle, which is also O(n).

The idea is that we have two pointers, a fast and a slow, both starting at the front of the linked list. The fast pointer moves 2 positions per step while the slow pointer moves 1. When the fast pointer reaches the end of the linked list, the slow pointer is at the middle, at least when there's an odd number of items. If there's an even number, the fast pointer hits null and the slow pointer lands on the later of the two middle values (if you want the earlier value on a tie, just initialize the fast pointer one ahead of the slow pointer).

# Find the middle of a linked list with two pointers
# Time: O(n), Space: O(1)

def middleOfList(head):
    slow, fast = head, head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    return slow

Example 2: Linked List with a cycle

Question: Determine if a linked list has a cycle

We can use a HashSet to add the pointer that we visited into the HashSet, but that costs O(n) memory space.

If the slow pointer and fast pointer intersect, then there is a cycle (i.e. a loop/circle).

# Determine if the linked list contains a cycle
# Time: O(n), Space: O(1)

def hasCycle(head):
    slow, fast = head, head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            return True
    return False

Example 3:

Question: Determine if a linked list has a cycle and return the beginning of the cycle if there is (or None if no cycle)

# Determine if the linked list contains a cycle and
# return the beginning of the cycle, otherwise return null.
# Time: O(n), Space: O(1)
# 2 * slow = fast
# if there is a cycle, slow and fast pointers will intersect at some arbitrary location

def cycleStart(head):
    slow, fast = head, head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow == fast:
            break

    if not fast or not fast.next:
        return None

    slow2 = head
    while slow != slow2:
        slow = slow.next
        slow2 = slow2.next
    return slow

Tree

Trie (Prefix Tree)

A trie (aka prefix tree) is a tree of characters (e.g. a-z, A-Z). It's a data structure where you can insert a word, search for a word, and search for a prefix efficiently (proportional to the length of the word).

We could use a HashSet instead, but it doesn't support prefix search efficiently: with a hashmap we can search for the word "apple", but to find everything with the prefix "ap" we would have to search through every word.

We have a single root node that is empty, then there are children for the characters (e.g. a-z). Each of those children has another set of characters (e.g. a-z). So the word 'apple' would be 'a', 'p', 'p', 'l', 'e' with each character as a TrieNode. To keep track of a TrieNode's children, we use a hashmap.

class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = False  # boolean value to determine if it's the end of the word we're looking for

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """ If the character is not there, then create a new TrieNode """
        curr = self.root
        for c in word:
            if c not in curr.children:
                curr.children[c] = TrieNode()
            curr = curr.children[c]
        curr.word = True

    def search(self, word):
        curr = self.root
        for c in word:
            if c not in curr.children:
                return False
            curr = curr.children[c]
        return curr.word

    def startsWith(self, prefix):
        curr = self.root
        for c in prefix:
            if c not in curr.children:
                return False
            curr = curr.children[c]
        return True
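
Example usage (the words are just for illustration):

trie = Trie()
trie.insert("apple")
print(trie.search("apple"))   # True
print(trie.search("ap"))      # False, 'ap' is not a complete word
print(trie.startsWith("ap"))  # True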

Union-Find (aka Disjoint sets)

Union-Find (aka Disjoint sets) is a tree data structure, but can be applied to any generic graphs.

Purpose: efficiently determine whether two nodes are in the same connected component, and union (merge) components together (e.g. to detect cycles as you add edges).

Implementation: below, using a parent map plus union by rank and path compression.

Example 1:

nodes = [1, 2, 3, 4]
edges = [[1,2], [4,1], [2,4]]

# all nodes are disconnected
# can randomly go to any node


class UnionFind:
    def __init__(self, n):
        self.par = {}  # keep track of the parent, can use as a hash or array
        self.rank = {}  # by rank we mean the 'height', we want our tree to be as small as possible (more efficient for find)

        for i in range(1, n+ 1):
            self.par[i] = i  # initialize the parent as the root itself
            self.rank[i] = 0  # set the rank / height to 0 as a default, can also be 1

    def find(self, n):
        """ Given some node, we want to find the parent (i.e. the root parent) """
        p = self.par[n]
        while p != self.par[p]:
            self.par[p] = self.par[self.par[p]]  # path compression; shorten the chain by setting the parent to its grandparent
                                                 # in case we run find again
            p = self.par[p]
        return p

    def union(self, n1, n2):
        """ Union the two nodes together """
        p1, p2 = self.find(n1), self.find(n2)
        if p1 == p2:  # if the root parents are the same, cannot union since they're the same component
            return False

        # union by rank, aka union by height
        if self.rank[p1] > self.rank[p2]:  # p1 higher height than p2, p1 should be parent
            self.par[p2] = p1
        elif self.rank[p1] < self.rank[p2]:  # p2 is higher height than p1, p2 should be parent
            self.par[p1] = p2
        else:  # heights are equal
            self.par[p1] = p2
            self.rank[p2] += 1  # need to increase the height by one
        return True
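
Example usage with the nodes and edges above (union returns False when the two nodes are already connected, i.e. the edge would create a cycle):

uf = UnionFind(4)
print(uf.union(1, 2))  # True
print(uf.union(4, 1))  # True
print(uf.union(2, 4))  # False; 2 and 4 are already connected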

Segment Tree

Note: Segment Trees are not as common in coding interviews. Say we're given an array of values and we want to support two main operations: update the value at an index, and query the sum over a range of values.

Approach 1 (with an Array): updates are O(1), but a range-sum query is O(n).

Approach 2 (with a Segment Tree): both updates and range queries are O(log n) (with an O(n) build).

How does this work? Each node covers a range of indexes [L, R] and stores the sum over that range; the root covers the whole array and each child covers half of its parent's range.

Example 1:

Index   0 1 2 3 4 5
Values  5 3 7 1 4 2

               [0,5]
                22
              /    \
        [0,2]       [3,5]
         15           7
        /    \      /   \
    [0,1]   [2,2] [3,4]  [5,5]
      8       7     5      2
    /    \        /    \
[0,0]   [1,1]   [3,3]  [4,4]
  5       3       1      4

class SegmentTree:
    def __init__(self, total, L, R):  # e.g. total is the 22 at the root node
        self.sum = total  # e.g. 22 at the root [0,5] (total of everything added up)
        self.left = None  # pointer to the left child
        self.right = None  # pointer to the right child
        self.L = L  # references index of the left boundary of the node e.g. 0 of the [0,5]
        self.R = R  # references index of the right boundary of the node e.g. 5 of the [0,5]

    # O(n)
    @staticmethod
    def build(nums, L, R):
        """ Build a Segment Tree """
        if L == R:
            return SegmentTree(nums[L], L, R)

        M = (L + R) // 2
        root = SegmentTree(0, L, R)
        root.left = SegmentTree.build(nums, L, M)  # build left side of the segment tree
        root.right = SegmentTree.build(nums, M + 1, R)  # build right side of the segment tree
        root.sum = root.left.sum + root.right.sum  # calculate the sum of the root node
        return root

    # O(log n)
    def update(self, index, value):
        """

        """
        if self.L == self.R:
            self.sum = value
            return

        M = (self.L + self.R) // 2  # similar to how we built the segment tree, find where this value is
        if index > M:
            self.right.update(index, value)
        else:
            self.left.update(index, value)
        self.sum = self.left.sum + self.right.sum

    # O(log n)
    def rangeQuery(self, L, R):
        """
        think of ranges as a number line. the entire range can be one of the following scenarios:
          * left only
          * right only
          * left and right
        we go through each value and keep popping back up the actual sum/total
        """
        if L == self.L and R == self.R:
            return self.sum

        M = (self.L + self.R) // 2
        if L > M:
            return self.right.rangeQuery(L, R)
        elif R <= M:
            return self.left.rangeQuery(L, R)
        else:
            return (self.left.rangeQuery(L, M) +
                    self.right.rangeQuery(M + 1, R))
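
Example usage with the array above:

arr = [5, 3, 7, 1, 4, 2]
root = SegmentTree.build(arr, 0, len(arr) - 1)
print(root.rangeQuery(0, 5))  # 22
root.update(2, 10)            # set index 2 to 10
print(root.rangeQuery(0, 2))  # 5 + 3 + 10 = 18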

Iterative DFS

Note: Recursive DFS is easier; use that if you can. We can implement DFS on a binary tree with recursion, but we can also do an Iterative DFS. We can do this three ways (inorder, preorder, and postorder), doing some type of operation at each node (print in the examples below):

Example 1:

    1
   / \
  2   3
 /      \
4        5

Inorder

We process the left children first: go left and keep going left until we can't anymore (a null node). In a recursive solution, we would just return (pop back up to the previous node on the call stack). When we do this iteratively, we do the same thing, except instead of the call stack we maintain our own stack data structure and save the nodes onto it.

Add to the stack

Stack
 1

Stack
 2
 1

Stack
 4
 2
 1

Reach a null node, then pop from the stack

Stack
  popped 4 (next step is to keep popping because nothing on the right)
 2
 1

Stack
  popped 4
  popped 2 (next step is to keep popping because nothing on the right)
 1

Stack
  popped 4
  popped 2
  popped 1 (start traversing the right subtree)
  3

Stack
  popped 4
  popped 2
  popped 1
  popped 3
  5

Stack
  popped 5, no more nodes to traverse, can stop algorithm

Preorder

We print or process the current node BEFORE we process the children node.

Code:

# Definition for a binary tree node
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

# Time and space: O(n), really O(height) of the tree
def inorder(root):
    """ We're given the root node and we create our own stack """
    stack = []
    curr = root

    while curr or stack:  # if our curr pointer is null and stack is empty, we reached the end
        if curr:
            stack.append(curr)  # save node by pushing to the stack
            curr = curr.left  # traverse left
        else:
            curr = stack.pop()  # pop back to the parent node
            print(curr.val)
            curr = curr.right  # traverse right

# Time and space: O(n)
def preorder(root):
    """
    We print or process the current node BEFORE we process the children node
    We process the left subtree immediately (e.g. 1, 2, 4) then right subtree after (3, 5)
    At the root node, add left and right node to the stack, then keep traversing left nodes and push to the stack.
    """
    stack = []
    curr = root
    while curr or stack:
        if curr:
            print(curr.val)
            if curr.right:
                stack.append(curr.right)
            curr = curr.left
        else:
            curr = stack.pop()

# Time and space: O(n)
def postorder(root):
    """
    We have a stack and a visit flag stack
    * Start with adding the root node (1) and visit flag (False)
    * Then add next level (2, False) and (3, False)
    * Now add the original/root node (1, True) but with a True since it's been visited
    * We add 3 to the stack first (False), then 2 to the stack (False)
    * We don't maintain our pointer, we just pop from the stack
    * We then visit 2 and add that to the stack (2, True), then add its children to the stack
    * We add the right child (Null) to the stack (visit = False)
    * We then add 4 to the stack (visit = False)
    * When we pop a value with a visit = True, we print it because all of its children have been added
    * etc
    """
    stack = [root]
    visit = [False]
    while stack:
        curr, visited = stack.pop(), visit.pop()  # going to be the same size stacks
        if curr:
            if visited:
                print(curr.val)  # if visited already, we already added its descendants
            else:
                stack.append(curr)  # take the current node and add it to the stack
                visit.append(True)

                # important to add right child first so left child is popped first
                stack.append(curr.right)
                visit.append(False)  # we set False because we haven't added its children yet
                stack.append(curr.left)
                visit.append(False)  # we set False because we haven't added its children yet

Two Heaps

Example 1 (Basic):

Two heaps are useful when you want to find the median.

Odd
[4, 7, 3, 5, 1]
Median would be the middle value = 4

Even
[1, 3, 4, 5]
Median would be (3 + 4) / 2 = 3.5

Example 2 (Stream): instead of getting all of those values at once, say we get a stream of values:

[4]
[4, 7]
[4, 7, 10]

If we keep the values in sorted order, inserting a new value is O(n); getMedian is then an O(1) operation.

Question: Implement a Median finder, where new values are inserted into the set, and you have to getMedian from that set.

We want to insert with O(log n) and getMedian with O(1).

Solution:

import heapq


class Median:
    def __init__(self):
        self.small, self.large = [], []

    def insert(self, num):
        # Push to the max heap and swap with min heap if needed
        heapq.heappush(self.small, -1 * num)
        if (self.small and self.large and
                (-1 * self.small[0]) > self.large[0]):
            val = -1 * heapq.heappop(self.small)
            heapq.heappush(self.large, val)

        # Handle uneven size
        if len(self.small) > len(self.large) + 1:
            val = -1 * heapq.heappop(self.small)
            heapq.heappush(self.large, val)
        if len(self.large) > len(self.small) + 1:
            val = heapq.heappop(self.large)
            heapq.heappush(self.small, -1 * val)

    def getMedian(self):
        if len(self.small) > len(self.large):
            return -1 * self.small[0]
        elif len(self.large) > len(self.small):
            return self.large[0]

        # Even # of elements, return avg of two middle nums
        return (-1 * self.small[0] + self.large[0]) / 2
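
Example usage (the values are just for illustration):

m = Median()
for num in [4, 7, 3, 5]:
    m.insert(num)
print(m.getMedian())  # (4 + 5) / 2 = 4.5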

Backtracking

Subsets

Combinations

Permutations

Graphs

Dijkstra’s

Prim’s

Kruskal’s

Topological Sort

Dynamic Programming

0/1 Knapsack

Unbounded Knapsack

LCS

Palindromes

Competitive Programming Algorithms and Data Structures

More advanced/competitive programming algorithms can be found here