William Liu

Algorithms and Data Structures


##Table of Contents

TODO:

DATA STRUCTURES

##Summary

In computer science, a data structure is a data organization, management and storage format that enables efficient access and modification. A data structure is a collection of data values, the relationships among them, and the functions or operations that can be applied to the data.

####Structure of Data

An individual data piece (i.e. a record) is usually part of a bigger collection of data. Each record has a key (e.g. a value to be sorted) and the rest of the record has satellite data (data carried around with the key).

So what goes on behind the scenes? When we move data around, we apply a sorting algorithm; usually there's a large amount of satellite data that makes records slow to physically move (e.g. erasing from the hard drive, writing to another place on the hard drive), so instead we normally just move pointers to the records.

A sorting algorithm is the method by which we determine the sorted order (regardless of individual fields or large satellite data). There are various algorithms we can apply and various ways we can store our data.

####Abstract Data Type (ADT)

An abstract data type is a mathematical model for data types, where a data type is defined by its behavior from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations.

We call these 'abstract' because they give an implementation-independent view (hiding the details of the implementation). For example, we have data types like int, float, and char with the knowledge of what values can be assigned and what operations can be performed, without knowing the implementation details.

One issue with Python compared to other programming languages is naming. In Java, the naming scheme of a standard data structure is pretty straightforward: there isn't just a list; it's either a LinkedList or an ArrayList. In Python, we have a more 'human' naming scheme, meaning we just call it a list, but the name alone doesn't tell you whether it's a linked list or a dynamic array.

What?

So let's try to break this down a bit. Say we swapped out 'data type' with vehicle. What would be an 'abstract vehicle'? You would know some of the actions you could do with it (e.g. drive the vehicle, sit in the vehicle), but it doesn't say exactly how you would do that (whether it's an automatic or manual vehicle, whether you need to open a door because it's a car rather than a motorcycle). Some of the data stored might be the color of the vehicle, the number of wheels, etc. The idea is that we get an implementation-independent view. We hide the details so that it's an abstraction. We don't care about details like the number of wheels, the number of doors (if any), or how to apply an action (e.g. drive stick or automatic); it's all a black box. We do this to create an encapsulation around the data.

So we might have an Abstract Data Type like an Integer (e.g. -2, -1, 0, 1, 2) where we can do operations like addition, subtraction, multiplication, division, and comparisons like equals, greater than, less than. We don’t care how the computer moves the data around, whether it’s a binary number or a binary-coded decimal. We simply say ‘hey, add these two numbers together and give me the result’.

Some common abstract data types include:

User vs Implementor

Imagine you're the user who has to interact with the interface, using the operations that have been specified by the abstract data type. You might call a get_max method or a delete_last method, but not be concerned about how it's implemented, since there can be many different ways to implement an abstract data type; i.e. we create an implementation-independent view of the data.

The implementation of an abstract data type is often called the data structure.

Programming Specific

In a programming language like Python, we have objects like Abstract Base Class. Here we see what methods are available, but they’re not yet implemented.
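
For example, here is a minimal sketch using Python's abc module (the Stack ADT and ListStack implementation are illustrative, not from a particular library):

from abc import ABC, abstractmethod

class Stack(ABC):
    """ The abstract data type: declares what a stack can do,
        but not how it is done """

    @abstractmethod
    def push(self, item):
        pass

    @abstractmethod
    def pop(self):
        pass

class ListStack(Stack):
    """ One possible implementation (the data structure),
        backed by a Python list """

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()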

#####Fundamental Abstract Data Type (ADT)

There are a few fundamental abstract data types:

Containers

Containers are abstract data types that hold stuff. We use the container to hold data and retrieve data items independently of content. Some popular types of containers are:

Dictionaries

Dictionaries are a form of container that gives you access to data (values) by keys.

Binary Search Trees

A Binary Search Tree is a binary tree where each node contains at most two children (each child identified as either a left or right child), and where each node's key is larger than all keys in its left subtree and smaller than all keys in its right subtree.

Priority Queues

Priority Queues are lists with priorities (i.e. the items inside are kept sorted by priority). Priority Queue operations can be implemented in O(log n) time by representing the heap with a binary search tree.

####Abstract Data Types and Classes

So how does an Abstract Data Type work with a programming language’s Class? We define a class to be a description of what:

Data items are called objects in this object-oriented paradigm. An object is an instance of a class.

Python specific Classes

Python has a number of built-in classes, from:

####Using Lists to build data structures

A lot of data structures can be implemented with the help of arrays and lists. In Python, the built-in list plays the role of an array (under the hood it's a dynamic array), while other list variants (e.g. linked lists) are written as custom classes.

You might see the following methods for the data structure of alist.

Operations include:

Methods include:

####<a id="stringfordatastructures">Using Strings to build data structures</a>

You might see the following methods for the data structure of astring.

Methods include:

####<a id="setsfordatastructures">Using Sets to build data structures</a>

Use Sets with:

engineers = set(['John', 'Jane', 'Jack', 'Janice'])

Operations include:

Methods include:

####<a id="dictsfordatastructures">Using Dictionaries to build data structures</a>

Operators include:

Methods include:

#####Modeling a Problem

Modeling is the art of formulating your application in terms of precisely described, well understood problems. Modeling is the key to applying algorithmic design techniques to real-world problems. It’s a way of relating your application to what has been done before.

Most algorithms are designed to work on a rigorously defined abstract structure, such as a permutation or graph or set. To utilize algorithms to their fullest extent, you have to be able to describe your problem abstractly in terms of procedures on fundamental structures.

What this means is that if you’re in a company building software for “widgets”, you can’t look up a solution to a “widget optimization problem”. Instead, you have to describe how your widget fits into common data structures (to then leverage the common methods of those data structures).

#####Fundamental Data Structures

So with your widget, you can see if it fits any of these fundamental data structures:

These fundamental data structures are important because they provide the language to use to model applications. Study the input and output of each problem and you’ll know where to look when the problem arises in your application.

Modeling is only the first step in designing an algorithm for a problem, but it's also the most important step towards a solution. Make sure you model the problem correctly before trying to solve it.

###Contiguous vs Linked Data Structures

Data structures can be either contiguous or linked, depending on whether they are based on arrays or pointers.

####Comparison: Linked Lists versus static Arrays

Advantages of Linked List over Arrays

Advantages of Arrays over Linked Lists

##What is an Algorithm?

An algorithm is a procedure to accomplish a specific task. An algorithm must solve a general, well-specified problem. An algorithmic problem is specified by describing the complete set of instances it must work on and its required output after running on one of these instances.

We take an input (i.e. some value(s)) and produce an output (some value(s)). For example, we might encounter a sorting problem where we want to sort a sequence of numbers into nondecreasing order.

An instance of sorting might be an array of names like ['Mike', 'Bob', 'Sally', 'Jill', 'Jan'] whereas an algorithm takes any of the possible input instances and transforms it to the desired output.

###Expressing Algorithms

You can express algorithms in three common forms: English, pseudocode, and a real programming language.

The best notation depends on which method you are most comfortable with. Usually I prefer to express the ideas of an algorithm in English, then move to a more formal programming language like Python. If your idea is not clearly revealed when you express an algorithm, then you are using too low-level a notation to describe it.

###What’s the problem?

Before even looking at an algorithm, ask yourself whether you have a careful description of the problem that the algorithm is trying to solve. Problem specifications usually have two parts:

  1. The set of allowed input instances
  2. The required properties of the algorithm’s output

It is impossible to prove the correctness of an algorithm for a fuzzily-stated problem. Ask the wrong problem and you will get the wrong answer.

It's very important that you try to narrow the set of allowable instances until there is a correct and efficient algorithm.

For example, what is the best route between these two places? We need to define what ‘best’ means, whether that is shortest in distance, time, or minimizing the number of turns.

###<a id="thinkrecursively">Recursive Objects</a>

When you're trying to find what the problem is, learning to think recursively is important so that you know how to look for big things that are made from smaller things of exactly the same type as the big thing. Think of houses as sets of rooms; add or delete a room and you still have a house. If we look at our fundamental data structures, we have:

These recursive descriptions of objects require both decomposition rules and basis cases, namely the specification of the smallest and simplest objects where the decomposition stops.

###What makes a good algorithm?

In a good algorithm, you want to look for:

  1. Is the algorithm correct?
  2. Is the algorithm efficient?
  3. Is the algorithm easy to implement?

Sometimes you can’t achieve all three.

####Is the algorithm correct?

An algorithm is correct if every input instance halts with the correct output. A single input is an instance of a problem. For example:

// an instance of a problem
// Input Sequence: {31, 41, 59, 26, 41, 58}
// Output Sequence: {26, 31, 41, 41, 58, 59}

It is usually not obvious whether a given algorithm correctly solves a given problem.

Correct algorithms usually come with a proof of correctness, which is an explanation of why we know that the algorithm must take every instance of the problem to the desired result. We can’t just say “it’s obvious”, because usually it’s not obvious.

There’s a fundamental difference between algorithms, which always produce a correct result, and heuristics, which may usually do a good job, but without providing any guarantee.

#####<a id="disprovealgorithms">Disproving algorithms</a>

Searching for counterexamples is the best way to disprove the correctness of a heuristic.

Good counterexamples have all unnecessary details boiled away. Make it clear why the proposed algorithm fails. Try to:

#####<a id="provealgorithms">Proving algorithms</a>

A proof or demonstration of correctness is needed. Usually this is done through mathematical induction. So what is it? Think of mathematical induction like recursion: you prove a base case, then show that if the claim holds for a smaller instance, it also holds for the next larger one.

####Is the algorithm efficient?

There are different algorithms to solve the same problem and they are often drastically different in efficiency. We normally measure efficiency by looking at running time, which is mainly affected by the size of input. However, Big O notation is also used to check space complexity. See Big O for further details.

  1. the size of input normally means looking at the number of items in the input; for other problems like multiplying integers, we look at the total number of bits used.
  2. the running time is the number of ‘steps’ (i.e. the number of times something is executed).

####Is a sorting algorithm stable?

For sorting algorithms, you might hear that one type is 'stable' while others are 'unstable'. A stable sort preserves the relative order of records with equal keys; an unstable sort may reorder them.

##Walking through an Algorithm Problem

  1. Listen Carefully
  2. Draw an Example
  3. State a Brute Force
  4. Optimize
  5. Walk Through
  6. Implement

###Listen Carefully

Mentally record any unique information in the problem. For example:

Ask yourself if you’ve used all the information in the problem, especially if you get stuck.

###Draw an Example

Do not skip this step. Draw a good example, meaning watch out for:

Make sure you really understand what you’re trying to solve. You can’t come up with the right solution if you’re solving the wrong problem!

Your example should help reinforce your understanding of the problem and whether any algorithms that you apply will work.

###State a Brute Force

After the example, state a brute force answer (even if it's obvious to you). It's okay that the initial solution is terrible. Explain the space and time complexity, then improve it. It's a great starting point for optimizations and it helps you wrap your head around the problem.

###Optimize

After the brute force algorithm, you’ll need to come up with an optimized algorithm.

###Walk Through

After you’ve gotten an optimal algorithm, don’t dive into coding. Take a moment to solidify your understanding of the algorithm.

Whiteboard coding is very slow; same with testing and fixing code. Make sure you have code that is close to perfect in the beginning.

If you don’t understand exactly what you’re about to write, you’ll struggle to code it. It will take longer to finish the code and you’ll make major errors.

###Implement

When you have an optimal algorithm and know exactly what you're going to write, then go ahead and implement it. Start coding in the top left corner, avoid "line creep" (where each line of code is at a slant), and remember spacing.

Write beautiful code, meaning:

###Test

Don’t submit code in an interview without testing it.

When you find bugs, don’t just make the first correction you can think of. Analyze why the bug occurred and ensure your fix is the best one.

##Optimize and Solve Technique 1: BUD

Look for BUD:

###Bottlenecks

A bottleneck is a part of your algorithm that slows down the overall runtime. This can occur when:

###Unnecessary Work

###Duplicated Work

##Optimize and Solve Technique 2: DIY

Try to solve the problem manually with a real-life example: instead of abstractly designing an algorithm (e.g. a binary search), think about how you would naturally locate a specific student's paper in an alphabetized stack.

##Optimize and Solve Technique 3: Simplify and Generalize

We'll implement a multi-step approach:

##Optimize and Solve Technique 4: Base Case and Build

We solve the problem first for a base case (e.g. n=1) and then try to build up from there. When we get to more complex cases (e.g. n=3, n=4), we try to build those using the prior solutions. Now that we understand the pattern, try to use say a recursive algorithm.

##Optimize and Solve Technique 5: Data Structure Brainstorm

Run through the list of data structures and try to apply each one. This might help if you’re stuck on a problem and realize that if you used say a tree, it’s a trivial problem.

Use data structures generously!!!

##Best Conceivable Runtime (BCR)

Consider the best conceivable runtime and what that solution might look like in Big O.

##Good Code

Try to write good code, meaning:

##Big O

We use Big O notation to give an estimated running time based on the input size (as it tends toward infinity). Big O is basically a math formula that counts how many steps of computations get executed given a specified input size; the more computational steps the slower it is. We characterize these functions according to their rate of growth (aka growth rate), which is represented using the O Notation. We also care about space (so see Space Complexity below).

Note: We also use Big O for measuring space complexity (not just time)!

###Big O Operations

When we say calculate Big O, we often have some scenarios we want to look at:

Note: Access vs Search - Access is like if I knew your mailbox, I would go there directly. Search would be: I know your mailbox is somewhere on this block, so I start searching through them. You might not always have constant Access (e.g. a linked list might take you through multiple items before you find the one you're looking for).

####Big O Example

Say you want to get a hard drive of information to your friend across the country. We can do an 'electronic transfer' through email or a 'physical transfer' by flying out there with the hard drive. Each method has a different time complexity:

  1. Electronic Transfer: O(s), where s is the size of the file; the time scales with the amount of data
  2. Airplane Transfer: O(1) with respect to the data; the flight takes the same time no matter how big the file is

If we compare these two scenarios, the ‘Airplane Transfer’ time is slower for small files, but at some point beats out the ‘Electronic Transfer’. This general idea is that the running time of an algorithm is a function of the size of its inputs.

###Algorithm Analysis Tools

Big O is used to compare the efficiency of algorithms without implementing them. We have a couple tools for this:

  1. RAM model of computation
  2. Asymptotic analysis of worst-case complexity

####RAM Model of Computation

Machine-independent algorithm design depends upon a hypothetical computer called the Random Access Machine (RAM). This computer strikes a balance of capturing essential behavior of computers while being simple to work with.

We measure run time by counting up the number of steps an algorithm takes on a given problem. Our imaginary machine runs a given number of steps per second.

With RAM, we can count how many steps our algorithm takes on any given input instance by executing it. We look at the best-case, average-case, and worst-case complexity. Usually we use the worst-case complexity.

####Common O Notations

Common O Notation runtimes are below (from the fastest first to slowest last):

A good cheatsheet is here

####Evaluating Runtimes

When evaluating time, we can look at the O Notation a few different ways.

####Asymptotic Notation

What really happens is that we’re interested in the asymptotic notation runtime, which is how the algorithm runtime scales for very large inputs and not in the minute details or small inputs. This means we:

Add or Multiply

If we do code chunk A completely then start code chunk B, the notation should look like O(A + B). If we do code chunk A and code chunk B is in code A (e.g. for-loop in a for-loop), the notation should look like O(A * B).
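
A small sketch of the difference (chunks A and B here are just loops over two input lists):

def add_runtimes(a, b):
    # O(A + B): chunk A finishes, then chunk B runs
    for x in a:       # O(A)
        print(x)
    for y in b:       # O(B)
        print(y)

def multiply_runtimes(a, b):
    # O(A * B): chunk B runs once for every element of chunk A
    for x in a:
        for y in b:
            print(x, y)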

####Amortized Time

Amortized time is like doing an operation a million times. You don't care about the worst case or best case of a single operation once in a while; you only care about the total time taken to do the million operations.

####Log N Runtimes

O(log N) is a common runtime because of the elements being halved scenario. For example, in a binary search for x in an N-element sorted array, the options are:

  1. if x == middle, then return middle
  2. if x < middle, then search left
  3. if x > middle, then search right

The runtime for this looks like this (say given an N-element array of 16):

  1. N=16  # divide by 2
  2. N=8   # divide by 2
  3. N=4   # divide by 2
  4. N=2   # divide by 2
  5. N=1   # divide by 2

We could reverse this to say how many times can we multiply by 2 until we get N?

  1. N=1 # multiply by 2
  2. N=2 # multiply by 2
  3. N=4 # multiply by 2
  4. N=8 # multiply by 2
  5. N=16 # multiply by 2

We then get 2^k = N, which is what log expresses. For example, 2^4 = 16, so log2(16) = 4.
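
As a sketch, here's the iterative binary search this describes (returns the index of x, or -1 if not found):

def binary_search(items, x):
    """ O(log N): halve the search range each iteration """
    low, high = 0, len(items) - 1
    while low <= high:
        middle = (low + high) // 2
        if items[middle] == x:
            return middle
        elif x < items[middle]:
            high = middle - 1  # search left half
        else:
            low = middle + 1   # search right half
    return -1  # not found

print(binary_search([2, 3, 5, 7, 11, 13], 7))  # 3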

####Recursive Runtimes

When you have a recursive function that makes multiple calls, the runtime will often (not always) look like O(branches^depth), where 'branches' is the number of times each recursive call branches.

####Space Complexity

Besides time, other considerations in real life (that we won’t consider for now) are things like space (RAM, hard drive), bandwidth speed, caches, and parallelism (single, multiple cores).

The general rule is if we need to create an array of size n, we would require O(n) space.

##Primitive Types

A program updates variables in memory according to code. Variables come in types, a classification of data saying what the possible values are and what operations can be performed on it (e.g. add a number, check the length of a string). In Python, everything is an object, including different types like:

###Bits and Bytes

A bit is atomic, the smallest unit of storage, storing just a 0 or 1. If we have just one bit, the pattern can be either 0 or 1. If we have two bits, we can have patterns like 00, 01, 10, 11. If we have three bits, we can have patterns like 000, 001, 010, etc. With each additional bit, we double the number of previous patterns.

1 bit = 2 patterns
2 bits = 4 patterns
3 bits = 8 patterns
4 bits = 16 patterns
5 bits = 32 patterns
6 bits = 64 patterns
7 bits = 128 patterns
8 bits = 256 patterns (one byte)

Mathematically: n bits yields 2^n patterns

A byte is a collection of 8 bits. One byte can store one character (e.g. ‘A’, ‘$’)

Kilobyte (KB) = 1 thousand bytes
Megabyte (MB) = 1 million bytes
Gigabyte (GB) = 1 billion bytes
Terabyte (TB) = 1 trillion bytes

Examples of ASCII

A is 65
B is 66
space is 32

###Bitwise Operators

Bitwise operators act on operands as if they were strings of binary digits. They operate bit by bit, meaning we have O(1) computation per bit and a time complexity of O(n) for an n-bit word. Common bitwise operators in Python are:

x = 10  # binary 0000 1010
y = 4   # binary 0000 0100

&  # bitwise AND, e.g. x & y = 0     (0000 0000)
|  # bitwise OR, e.g. x | y = 14     (0000 1110)
~  # bitwise NOT, e.g. ~x = -11      (0000 0101)
^  # bitwise XOR, e.g. x ^ y = 14    (0000 1110)
>> # bitwise right shift, x >> 2 = 2 (0000 0010)
<< # bitwise left shift, x << 2 = 40 (0010 1000)

So what exactly does a bitwise operator do?

These can appear as a truth table:

AND | 0 1     OR | 0 1     XOR | 0 1    NOT | 0 1
----+-----    ---+----     ----+----    ----+----
 0  | 0 0      0 | 0 1       0 | 0 1        | 1 0
 1  | 0 1      1 | 1 1       1 | 1 0

An example would be if you only wanted the lower 4 bits of an integer, you AND it with 15 (binary 1111). The zero bits act as a filter, forcing the bits in the result to be zero as well.

    201: 1100 1001
AND  15: 0000 1111
------------------
 IS   9  0000 1001

In addition, >> and << are often included as bitwise operators; they "shift" a value respectively right and left by a certain number of bits, throwing away bits that roll off the end you're shifting towards, and feeding in zero bits at the other end.

1001 0101 >> 2 gives 0010 0101
1111 1111 << 4 gives 1111 0000

Note that with Python, a left shift simply expands the width of the number for the extra bits (whereas other languages drop off the extra bits).

So why is this important? Think of when you chmod a file to give permissions. Each file and folder has permission bits (read=4, write=2, execute=1, grouped for owner, group, and others) that control the permissions.

000 means no permissions. To read and write it would be 6 (4+2). To read, write and execute it would be 7 (4+2+1). Common permissions are:

###Numeric Methods

Key methods for numeric types are:

It’s good to know what the max size is on your machine using sys.maxsize and sys.float_info

>>> import sys
>>> sys.maxsize
9223372036854775807
>>> sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308,
min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)

###Computing the Parity of a Word

The parity of a binary word is 1 if the number of 1's in the word is odd. Otherwise, it is 0. Parity checks are used to detect single-bit errors in data storage and communication.

x & (x-1)  # equals x with its lowest set bit erased
e.g. x = (00101100), then x - 1 = (00101011)
     x & (x-1) = (00101100) & (00101011) = (00101000)

    00101100  x
    00101011  x-1
    00101000  result
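
A short sketch that uses this trick to compute parity (each loop iteration erases the lowest set bit, so we only loop once per set bit):

def parity(x):
    """ Return 1 if x has an odd number of set bits, else 0 """
    result = 0
    while x:
        result ^= 1   # flip the parity for each set bit
        x &= x - 1    # erase the lowest set bit
    return result

print(parity(0b00101100))  # 3 set bits -> 1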

###Built-in Types

Python has a few built-in types. This is for Python 3.6:

https://docs.python.org/3.6/library/index.html

##Must Knows

###Data Structures

###Algorithms

###Concepts

###Powers of 2 Table

Use this table below for questions about scalability or memory limitation. For example, if we have a bit vector mapping of every 32-bit integer to a boolean value, we need 2^32 bits (or 2^29 bytes) to store this mapping; that's approximately 512 MB of memory. Remember that 1 bit = .125 bytes (8 bits = 1 byte).

Power of 2  |   Exact Value (X)     |   Approx Value    |   X Bytes in MB, GB, TB
        7   |               128     |                   |
        8   |               256     |                   |
       10   |              1024     |   1 thousand      |           1 KB
       16   |             65536     |                   |          64 KB
       20   |           1048576     |    1 million      |           1 MB
       30   |        1073741824     |    1 billion      |           1 GB
       32   |        4294967296     |                   |           4 GB
       40   |     1099511627776     |    1 trillion     |           1 TB

##Arrays and Strings

###Arrays

The array is the simplest data structure; it is a contiguous block of memory, usually used to represent sequences. If we have an array A, then we can say that A[i] is the (i + 1)th object stored in the array (because indexing starts at 0). Retrieving and updating A[i] takes O(1) time.

####Array Advantages and Disadvantages

Advantages of a contiguously-allocated arrays (i.e. a ‘list’ in Python) are:

Disadvantages are:

We can efficiently enlarge arrays as we need them through dynamic arrays. As soon as we need a larger array, we double the size and copy the contents of the smaller array into the larger array. The primary thing lost using dynamic arrays is the guarantee that each array access takes constant time in the worst case. All queries will be fast, except those relatively few queries triggering array doubling.
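
A minimal sketch of the doubling strategy (illustrative only; Python's built-in list already does this internally):

class DynamicArray(object):
    """ Grow-by-doubling array: append is amortized O(1) """

    def __init__(self):
        self.capacity = 1
        self.length = 0
        self.data = [None] * self.capacity

    def append(self, item):
        if self.length == self.capacity:
            self._resize(2 * self.capacity)  # double when full
        self.data[self.length] = item
        self.length += 1

    def _resize(self, new_capacity):
        """ Copy contents of the smaller array into the larger one """
        bigger = [None] * new_capacity
        for i in range(self.length):
            bigger[i] = self.data[i]
        self.data = bigger
        self.capacity = new_capacity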

####Array CRUD Operations

The time for operations really depends on whether an array is sorted or unsorted.

####Sorted Arrays

Inserting and Deleting

For a sorted array, deleting an element from an array means moving all successive elements one over to the left to fill the vacated space. For example, if an array is: [2, 3, 5, 7, 9, 11, 13, 17], then deleting an element at index 4 results in [2, 3, 5, 7, 11, 13, 17, 0]. The time complexity of deleting an element at index i from an array of length n is O(n-i). The same is true for inserting a new element (as opposed to updating an existing entry). The reason for this is that we now need to move elements over.

####Array Problem 1

Say we have an input array consisting of integers. Your job is to reorder its entries so that the even entries appear first. We can do this easily in O(n) space where n is the length of the array. However, this is more difficult to solve without allocating additional storage.
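
A sketch of the O(1)-space idea: partition in place with two pointers, writing evens from the front and odds from the back (note this does not preserve the original order within each group):

def even_odd(A):
    """ Reorder A in place so even entries appear first """
    next_even, next_odd = 0, len(A) - 1
    while next_even < next_odd:
        if A[next_even] % 2 == 0:
            next_even += 1
        else:
            A[next_even], A[next_odd] = A[next_odd], A[next_even]
            next_odd -= 1

nums = [1, 2, 3, 4, 5]
even_odd(nums)
print(nums)  # [4, 2, 3, 5, 1]; evens before odds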

##Hashing and Strings

Hash tables are a very practical way to maintain a dictionary. They exploit the fact that looking up an item in an array takes constant time once you have its index. A hash function is a mathematical function that maps keys to integers. We use the value of our hash function as an index into an array, and store our item at that position. Here are the steps:

  1. Map each key to a big integer
  2. The result is unique identifier numbers, but they are so large they exceed the number of slots in the hash table. We take the remainder (e.g. same principle as a roulette wheel)
  3. The resulting hash values should be fairly uniformly distributed

##Hashing

We do hashing because of its speed. If we know information about the structure of our data (e.g. if a list is ordered or not), we can use this knowledge to do efficient searches (e.g. in O(log N) runtime using a binary search instead of O(N)). We can get this even faster with a technique called hashing.

If we know where every item should be, then our search can do a single comparison to find the item. This concept is called hashing and it has a really efficient runtime. The idea is that when we input a key (e.g. Will) through a hash function, this returns a map to where the data is stored (aka the hash values, hash codes, hash sums, hashes).

####Hash Table

A hash table (aka hash map) is a collection of items that are stored in a way where we can find the items very quickly. Each position in the hash table is called a slot (sometimes the entire slot is referred to as a bucket) and can hold an item.

What is a real-life example of this? Say a teacher sorts their students' papers into categories of A-Z based on first names (or specific buckets like W-Z if there aren't many names starting with those characters, as long as we get a roughly even distribution). If there are multiple people whose names start with the same letter (e.g. Laura, Lisa), we have a hash collision that can be fixed with a few different methods, with chaining being the simplest (i.e. multiple names are added to the same category).

####Hash Function

A hash function is the function that distributes key/value pairs across an array of slots. A sample flow would look like hash = hashfunc(key) followed by index = hash % array_size. The goal of the hash function is to:

####Hash Time and Space

A perfect hash function is the case where every item is mapped to a unique slot, which then runs in O(1) time. We don’t need a perfect hash function in order for the hash to be efficient.

For worst case scenario (high number of collisions), the worst case runtime is O(N) where N is the number of keys.

Another implementation of the hash table is with a balanced binary search tree, which will give us O(log N) lookup time. This will use less space since we no longer allocate a large array.

####Hash Example

Say we have a small hash table represented by a list of 11 empty slots (m=11), each holding the Python value None. Note that the number of slots should ideally be prime (so we have an easier way of resolving hash collisions; more on that later).

slots       0    1    2    3    4    5    6    7    8    9    10
values      None None None None None None None None None None None

Say we need to put in a set of integer items, like this: 54, 26, 93, 73

We can create a hash function that takes an item in the collection and returns an integer that tells what slot (between 0 and m-1) to place the item. A common hash function is the remainder method, where we take an item and divide by the table size, returning the remainder as the hash value.

54 % 11 = remainder of 10
26 % 11 = remainder of 4
93 % 11 = remainder of 5
73 % 11 = remainder of 7

After we calculate the hash values, we enter them into the hash table. The load factor is numberofitems / tablesize (e.g. 4 items / 11 slots). Conceptually, if the load factor is small, there are potentially fewer collisions; if the load factor is large, there are potentially more collisions (which also means collision resolution is more difficult). Usually a load factor of .8 or higher is where linear probing's performance really degrades, and at that point you should definitely do chaining. Here we have our updated hash table:

slots       0    1    2    3    4    5    6    7    8    9    10
values      None None None None 26   93   None 73   None None 54

####Hash Collision

So what happens in our example if we add a new item of 71? We would get a remainder of 5, which would create a hash collision with 93; this means at least two values are trying to fit into the same bucket (in our example, the values of 93 and 71). We have two goals:

  1. minimize hash collisions - what can we do to prevent hash collisions?
  2. hash resolution - despite our best efforts to prevent hash collision, we’ll probably run into them; what can we do once a hash collision happens?

####Minimizing Hash Collision

As we look at different ways to minimize hash collision, we should keep in mind that you want a function to uniformly distribute hash values. These can be tested using statistical tests (e.g. Pearson’s chi-squared test for discrete uniform distributions)

####Minimizing Hash Collision with the Folding Method

One way of trying to minimize hash collision is with the folding method.

  1. The idea is that we divide our items into equal-sized pieces (with the last piece probably not the same size). e.g. if we have a phone number of 436-555-4601, we take the numbers and divide them into groups of 2, like 43, 65, 55, 46, 01.
  2. We then add all the pieces together to get a resulting hash value. e.g. 43+65+55+46+01 = 210.
  3. With the resulting hash value (e.g. 210), we take the modulus to get a remainder of 1 (210 % 11 = 1) and place the value in position 1.
  4. Optionally: some folding methods reverse every other piece before the addition. e.g. 43+65+55+46+01 turns into 43+56+55+64+01 = 219, and 219 % 11 = 10.

####Minimizing Hash Collision with the Mid-Square Method

Another way of trying to minimize hash collision is with the mid-square method.

  1. The idea is that we first square the item, then extract some portion of the resulting digits. e.g. if the item is 44, we calculate 44^2 = 1936.
  2. The next step is extracting the middle two digits (e.g. 93 from 1936) and taking the modulus to get the remainder (93 % 11 = 5).

####Minimizing Hash Collision when Hashing Characters

We can create hash functions for character-based items like strings. For example, for the word 'cat' we have 99+97+116 = 312, then 312 % 11 = 4.

ord('c')  # 99
ord('a')  # 97
ord('t')  # 116

If we do a hash function that simply adds up all the characters and takes the modulus, we end up with hash collisions for anagrams (e.g. 'dad' and 'add'). We can fix this by adding a weight based on the character's position (e.g. multiply by 1 for the first char, multiply by 2 for the second char, etc). For example:

def hash(astring, tablesize):
    total = 0
    for position in range(len(astring)):
        # weight each character by its (1-based) position
        total = total + (position + 1) * ord(astring[position])
    return total % tablesize

###Collision Resolution

Collision resolution is the method for resolving hash collision, which means what do we do when two or more items are hashed to the same slot in the hash table. There are many ways of attempting to address this including open addressing and separate chaining (chaining).

####Resolving Hash Collision with Open Addressing

Open Addressing (aka closed hashing) is a method to resolve hash collisions by trying to find the next open slot in the hash table through some probe sequence, which systematically visits each slot until an open slot is found. The name comes from the fact that the location (‘address’) of the item is not determined by its ‘hash value’. The downside of open addressing is that the number of stored entries cannot exceed the number of slots.

There are many probe sequences, each with a different take on trying to minimize clustering of hash values. These probing sequences include:

Deletion in an open addressing scheme could get ugly since removing one element might break a chain of insertions. We have no alternative except to reinsert all the items in the run following the new hole.

            Hash Table      Hash Table
            (Expected)      (Worst Case)
Search      O(n/m)          O(n)
Insert      O(1)            O(1)
Delete      O(1)            O(1)
Successor   O(n + m)        O(n + m)
Predecessor O(n + m)        O(n + m)
Minimum     O(n + m)        O(n + m)
Maximum     O(n + m)        O(n + m)

####Resolving Hash Collision with Open Addressing and Linear Probing

When we visit each bucket one at a time, we are using a technique called linear probing. With the given example above, if we added 71, we would have a hash collision since 93 is already in position 5 (71 % 11 = 5 and 93 % 11 = 5). Computing a new position for 71 is called rehashing.

slots       0    1    2    3    4    5    6    7    8    9    10
values      None None None None 26   93   None 73   None None 54

####Resolving Hash Collision with Open Addressing and Quadratic Probing

Instead of making the skip a constant value (e.g. 1, 2), we can use a rehash function that increments the hash value by 1, 3, 5, 7, etc. This means that if the first hash value is h, the successive hash values are h+1, h+4, h+9, h+16, h+25 etc. This ‘skip’ value is successive perfect squares.

####Resolving Hash Collision with Separate Chaining

Separate chaining (aka chaining) allows many items to exist at the same location in the hash table. When a hash collision happens, the item does not do probing and instead is placed in the proper slot. This is the easiest resolution, but devotes a considerable amount of memory to pointers. This is space that could be used to make the table larger and the ‘lists’ smaller.

When we search for an item, we use the hash function to generate the slot where it should be. Each slot has a collection so we use a searching technique to decide if the item is present. The advantage is that on average, there are likely to be very few items in each slot so the search should be more efficient. If there are many items in each slot, we have a few ways to handle this:

Here is some sample code for a hash table implementation:

HashTable Implementation

class SlotItem(object):
    """ An items in the same hash table slot """
    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __str__(self):
        return self.key

class HashTable(object):
    """
        Implementation of a hash table using chaining instead of
        open addressing.  Each slot has a list of items that are
        appended when there is a hash collision.
    """

    def __init__(self, size):
        self.size = size
        # use a comprehension, not [[]] * size (which would alias one list)
        self.table = [[] for _ in range(self.size)]

    def hash_function(self, key):
        """
            Return the modulus (i.e. remainder) and handle two scenarios:
            1 - numbers only (int or long); just take the modulus
            2 - characters involved; create hash from the ordinal value
                of each character times its position (the position
                weighting handles anagrams)
        """
        if type(key) is int or type(key) is long:  # numbers only
            return key % self.size
        else:  # characters involved
            total = 0
            for position in xrange(len(key)):
                total = total + (ord(key[position]) * (position + 1))
            return total % self.size

    def set(self, key, value):
        """
            Finds slot with hash_function, then saves key-value pair.
            If there is an existing key, we overwrite that value.
        """
        index = self.hash_function(key)  # find correct slot

        for slot in self.table[index]:  # look inside slot for key
            if slot.key == key:  # key found, replace current value
                slot.value = value
                return
        self.table[index].append(SlotItem(key, value))  # key not found, add

        self.check_load_factor(upper_limit=.8, resize=2)

    def get(self, key):
        """ Finds slot with hash_function, returns slot value or
        else None """
        index = self.hash_function(key)

        for slot in self.table[index]:  # look inside slot for key
            if slot.key == key:  # key found, return current value
                return slot.value
        return None  # no key found

    def remove(self, key):
        """ Given a key, remove the key-value pair """
        index = self.hash_function(key)
        for i, slot in enumerate(self.table[index]):
            if slot.key == key:  # key found, return current value
                del self.table[index][i]

    def check_load_factor(self, upper_limit=.8, resize=2):
        """
            Check load factor, if limit reached, warn to resize
            larger table
        """ load = 0 for i in xrange(self.size):
            for slot in self.table[i]:
                if slot:
                    load += 1

        load_factor = float(load)/float(self.size)
        #print "Load factor is ", load_factor

        if load_factor > upper_limit:  # need to resize for larger hash table
            print "Load Factor is past upper limit, you should resize"
            # TODO: Create deepcopy, dynamically resize for high and low limit

        else:
            pass  # load_factor is in acceptable limits

Tests for Hashtable Implementation

""" Implement a hashmap (with amortized constant time look-ups)
without using a hashmap primitive.  Includes an executable testing
framework for the data structure """


import unittest
from nose.tools import assert_equal, assert_not_equal, \
    assert_true, assert_false


class TestUnitHashTable(unittest.TestCase):
    """ Unit Tests - Test each piece in isolation """

    def setUp(self):
        self.ht = HashTable(11)
        self.a = SlotItem('hello', 1112223333)  # char key, int value scenario
        self.b = SlotItem(1, 1)                 # int key, int value scenario
        self.c = SlotItem('f', 'morestuff')     # char key, char value scenario
        self.ht.table = [[self.a], [self.b], [], [self.c],
                         [], [], [], [], [], [], []]

    def test_keys_can_be_integers(self):
        try:
            self.ht.set(11, 100)
        except TypeError:
            self.fail("UT: Hashing func cannot handle key with ints")

    def test_keys_can_be_characters(self):
        try:
            # hello = 104*1 + 101*2 + 108*3 + 108*4 + 111*5 = 1617 => 1617%11=0
            self.ht.set('hello', 'world')  # pos: 'hello'(0), value: 'world'
        except TypeError:
            self.fail("UT: Hashing func cannot handle key with chars")

    def test_keys_can_be_spaces(self):
        try:
            self.ht.set('  ', 'somevalue')
        except TypeError:
            self.fail("UT: Hashing func cannot handle key with spaces")

    def test_keys_can_be_mix_chararacters_and_integers(self):
        try:
            self.ht.set('a1b3e', 'somevalue')
        except TypeError:
            self.fail("UT: Hashing func cannot handle key with ints & chars")

    def test_hashes_anagram_keys_to_different_buckets(self):
        """ Ensure keys that are anagrams hash to different buckets """
        self.ht.set('elvis', 'samestuff')  # 1666 => %11=5
        self.ht.set('lives', 'samestuff')  # 1651 => %11=1

        bucket5 = self.ht.table[5]  # elvis bucket goes to slot 5
        bucket1 = self.ht.table[1]  # lives bucket goes to slot 1

        for _ in bucket5:
            if _.key == 'elvis':
                value5 = _.value

        for _ in bucket1:
            if _.key == 'lives':
                value1 = _.value

        assert_equal(value5, value1)

    def test_identify_empty_hash_table_index(self):
        # Check that the list in this bucket is empty
        bucket2 = self.ht.table[2]
        assert_false(bucket2)

    def test_identify_filled_hash_table_index(self):
        # Check that the list in this bucket is filled
        bucket1 = self.ht.table[1]
        assert_true(bucket1)

    def test_get_existing_hash_table_item(self):
        value = self.ht.get('hello')
        assert_equal(value, 1112223333)

    def test_get_nonexisting_hash_table_item(self):
        value = self.ht.get('nothere')
        assert_equal(value, None)

    def test_set_data_on_empty_hash_table_slot(self):
        assert_equal(self.ht.get(16), None)
        self.ht.set(16, 'abc')
        assert_equal(self.ht.get(16), 'abc')

    def test_set_data_on_existing_hash_table_slot(self):
        assert_equal(self.ht.get('f'), 'morestuff')
        self.ht.set('f', 'differentstuff')
        assert_equal(self.ht.get('f'), 'differentstuff')

    def test_remove_data_on_existing_key_with_value(self):
        assert_equal(self.ht.get('hello'), 1112223333)
        self.ht.remove('hello')
        assert_equal(self.ht.get('hello'), None)

    def test_remove_data_on_nonexisting_key(self):
        try:
            self.ht.remove('notthere')
        except:
            print "Issue with trying to remove a nonexisting key"
            raise

    def test_set_load_factor_high(self):
        assert_equal(self.ht.size, 11)
        self.ht.table = [[self.a], [self.b], [self.a], [self.c], [self.a],
                         [self.c], [], [self.a], [self.a], [self.b], []]
        self.ht.set(10, 'here')
        # TODO: if more time, trigger auto resize
        #assert_equal(self.ht.size, 22)
        #assert_equal(self.ht.get(10), 'here')  # check value here after resize
        # TODO: check hash value not too high
        # TODO: check uniform distribution of values

class TestFunHashTable(object):
    """ Functional Tests - Test all pieces together from end to end
    """

    def test_end_to_end(self):
        hash_table = HashTable(11)
        print "FT: Created Hash Table of size, ", hash_table.size

        print "FT: Default value type at slot 0 is None"
        assert_equal(hash_table.get(0), None)

        print "FT: Setting value in slot 13 (i.e. 13%11=2)
        with value "dee":
            hash_table.set(13, 'dee')
        assert_equal(hash_table.get(13), 'dee')

        print "FT: Adding value in slot 3 with value dee"
        hash_table.set(3, 'dee') assert_equal(hash_table.get(3), 'dee')

        print "FT: Checking that two values are the same"
        assert_equal(hash_table.get(13), hash_table.get(3))

        print "FT: Deleting value in slot 2" hash_table.remove(2)
        assert_equal(hash_table.get(2), None)


if __name__ == '__main__':

    # run unit tests
    suite = unittest.TestLoader().loadTestsFromTestCase(TestUnitHashTable)
    unittest.TextTestRunner(verbosity=2).run(suite)

    A = TestFunHashTable()
    A.test_end_to_end()

###String Matching via Hashing

Strings are sequences of characters where the order of the characters matters. The primary data structure for representing strings is an array of characters. We get constant-time access to the ith character of the string.

You'll often run into a substring search. There are different approaches, including brute-force comparison at every offset and hashing-based methods like Rabin-Karp.
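
Here's a simplified Rabin-Karp sketch of the hashing approach (the base and modulus constants are illustrative):

def rabin_karp(text, pattern):
    """ Return index of first occurrence of pattern in text, else -1 """
    n, m = len(text), len(pattern)
    if m > n:
        return -1
    base, mod = 128, 999983  # illustrative constants
    p_hash = t_hash = 0
    for i in range(m):  # hash the pattern and the first text window
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    high = pow(base, m - 1, mod)  # weight of a window's leading char
    for i in range(n - m + 1):
        # only compare characters directly when the hashes match
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:  # roll the hash: drop text[i], add text[i + m]
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1

print(rabin_karp('the quick brown fox', 'brown'))  # 10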

###Duplicate Detection Via Hashing

The key idea of hashing is to represent a large object (e.g. a key, a string, a substring) using a single number. The point of this representation is to be able to manipulate large objects in constant time.

Hashing is used most often in speeding up search, including:

Hashing is a fundamental idea in randomized algorithms, yielding linear expected-time algorithms.

##<a id="specializeddatastructures">Specialized Data Structures</a>

We have basic data structures that represent an unstructured set of items so we can run retrieval operations.

There’s also another set of data structures that represent more structured or specialized kinds of objects, like points in space, strings, and graphs. These include:

##Linked Lists

So remember that a list is a mutable data structure that can hold different data types in the same list (e.g. an int, then a str). And remember that an array is a mutable data structure that can only hold the same data type in the same array (e.g. only strings). So with an array, memory allocation is done for all of its elements as one block of memory (easier since it's all the same data type). A linked list instead does memory allocation so that each element (aka linked list element, or node) gets its own memory block. The linked list gets its overall structure by using pointers to connect all its nodes together.

So what are some examples of linked lists?

In terms of implementation, we can search for an item in a linked list either iteratively or recursively.

###Pointers

Before we get too in-depth with linked list/structures, we need to get into detail about pointers. Pointers are the connections that hold the pieces of linked structures together.

Pointers represent the address of a location in memory. A variable that stores a pointer to a given data item can provide more freedom than storing a copy of the item itself.

A real-life example: a cell-phone number can be thought of as a pointer to its owner as they move around.

In programming languages like C, we have:

###Creating a Linked List

Each node contains two fields: a 'data' field (the value stored at that node) and a 'next' field (a pointer to the next node, or None at the end of the list).
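
Here's a minimal sketch of a node and the two search styles mentioned above (names are illustrative):

class Node(object):
    def __init__(self, data):
        self.data = data  # the value stored at this node
        self.next = None  # pointer to the next node (None at the tail)

def search_iterative(head, target):
    current = head
    while current is not None:
        if current.data == target:
            return current
        current = current.next
    return None

def search_recursive(node, target):
    if node is None:  # base case: ran off the end
        return None
    if node.data == target:
        return node
    return search_recursive(node.next, target)

# build: 1 -> 2 -> 3
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
print(search_iterative(head, 3).data)  # 3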

##Stacks and Queues

A container is a data structure that lets you store and retrieve data items independent of content. Containers are in contrast to dictionaries, which are abstract data types that retrieve based on key values or content. There are a few types of containers and they’re identified by the retrieval order. The two most important are:

You can implement a stack or queue using either arrays or linked lists.

###Stacks and Queues in Python

In Python, you can use a list as a stack very easily (just run mylist.append(myitem) to push and mylist.pop() to pop).

In Python, if you need a queue, just use the collections.deque. You can use a list as a queue, but it’s not efficient (due to inserts or pops from the beginning of a list since all the other elements have to be shifted by one). With a deque, you can run get and put with: myqueue.popleft() and myqueue.append(myitem)
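
A quick sketch of both:

from collections import deque

stack = []              # LIFO: last in, first out
stack.append('a')
stack.append('b')
print(stack.pop())      # 'b' (the most recently added)

queue = deque()         # FIFO: first in, first out
queue.append('a')
queue.append('b')
print(queue.popleft())  # 'a' (the oldest item)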

##Dictionaries

The dictionary data type allows access to data items by content. The idea is that you stick an item into a dictionary so you can find it later. The primary operations on a dictionary are:

###Dictionary Examples

A lot of common data processing tasks take dictionaries. For example, if we want to remove all duplicate names from a mailing list, we can:

  1. Initialize an empty dictionary where the search key is the record name
  2. Iterate through the mailing list
  3. For each record, search to see if a name exists in the dictionary
  4. If there is no record, then insert the record
  5. Once we finish iterating through the mailing list, we need to extract the names out of the dictionary
  6. We start from the first item with Min, then repeatedly call Successor until we obtain Max.
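
A sketch of these steps with a plain Python dict (the (name, address) record format here is hypothetical):

def remove_duplicates(mailing_list):
    """ mailing_list: list of (name, address) tuples """
    seen = {}
    for name, address in mailing_list:
        if name not in seen:       # search the dictionary for the name
            seen[name] = address   # no record found, so insert it
    # extract the names back out, from Min to Max (sorted key order)
    return [(name, seen[name]) for name in sorted(seen)]

records = [('Jan', '1 A St'), ('Bob', '2 B St'), ('Jan', '3 C St')]
print(remove_duplicates(records))  # [('Bob', '2 B St'), ('Jan', '1 A St')]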

When we define a problem in terms of an abstract dictionary's operations, we can avoid the details of the data structure's representation and instead focus on the task.

###Dictionary Uses

There are various dictionaries based on:

###Dictionary Runtimes

Remember that a data representation can have some efficient operations while other operations are more expensive. A lot of these operation costs depend on whether the data structure is sorted or unsorted.

####Dictionary Runtimes (Sorted Array and Unsorted Arrays)

Dictionary runtimes as a sorted array and unsorted array are:

                Unsorted array  Sorted array
Search          O(n)            O(log n)
Insert          O(1)            O(n)
Delete          O(1)*           O(n)
Successor       O(n)            O(1)
Predecessor     O(n)            O(1)
Minimum         O(n)            O(1)
Maximum         O(n)            O(1)

You can see that the implementation of a dictionary using a sorted array completely reverses our notations of what is easy and what is hard.

In a sorted array, insertion and deletion become more expensive because making room for a new item or filling a hole requires moving around a lot of items, while query methods like search, successor, and minimum/maximum become quick operations.

When designing data structures, make sure to balance all the different operations it supports. The fastest data structure to support both operations A and B might not be the fastest data structure to support either operation A or B.

####Dictionary Runtimes (Lists)

A dictionary can be implemented as a sorted or unsorted list. A dictionary can also be implemented as a singly-linked or doubly-linked list.

We then have:

Dictionary runtimes as the above are:

                Singly      Double      Singly      Doubly
                Unsorted    Unsorted    Sorted      Sorted
Search          O(n)        O(n)        O(n)        O(n)
Insert          O(1)        O(1)        O(n)        O(n)
Delete          O(n)*       O(1)        O(n)*       O(1)
Successor       O(n)        O(n)        O(1)        O(1)
Predecessor     O(n)        O(n)        O(n)*       O(1)
Minimum         O(n)        O(n)        O(1)        O(1)
Maximum         O(n)        O(n)        O(1)*       O(1)

As with unsorted arrays, search operations are slow when the list is unsorted, while maintenance operations are fast.

###Dictionary Runtimes (Binary Search Trees)

So far we’ve seen data structures that allow either fast search OR flexible updates, but not both fast search AND flexible updates.

Binary search requires that we have fast access to two elements, namely the median elements above and below the given node. We create this by using a 'linked list' with two pointers per node, and that gives us binary search trees.

There’s a rooted binary tree that is recursively defined as either:

  1. empty
  2. consisting of a node called the root, together with two rooted binary trees called the left and right subtree (The order matters among ‘brother’ nodes in rooted trees)

A binary search tree labels each node in a binary tree with a single key (say x). The labeling scheme is special: all nodes in the left subtree of x have keys < x, and all nodes in the right subtree of x have keys > x.

####Binary Search Tree Examples

For example, on a three node tree, there are five distinct binary search trees:

  3     3       2      1       1
 2     1       1 3      3       2
1       2              2         3

####Binary Search Tree Implementation and Operations

Binary tree nodes have a key field, plus left-child, right-child, and (often) parent pointer fields.

Binary trees have the following basic operations: searching, traversal, insertion, and deletion (see the sketch below).
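
A minimal sketch of a node plus insert and search (deletion omitted for brevity):

class TreeNode(object):
    def __init__(self, key):
        self.key = key
        self.left = None   # subtree with keys < self.key
        self.right = None  # subtree with keys > self.key

def insert(root, key):
    """ Return the root of the tree with key inserted """
    if root is None:
        return TreeNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicates are ignored

def search(root, key):
    """ O(h), where h is the height of the tree """
    if root is None or root.key == key:
        return root
    if key < root.key:
        return search(root.left, key)
    return search(root.right, key)

root = None
for k in [8, 3, 10, 1, 6]:
    root = insert(root, k)
print(search(root, 6).key)  # 6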

####Binary Search Tree Runtimes

All three dictionary operations take O(h) time, where h is the height of the tree. The smallest height we can hope for is when the tree is perfectly balanced, where h = log n. The issue is that the tree has to be perfectly balanced.

With binary search trees, the order that we insert data greatly affects runtimes (since that’s the shape/height we create for our tree)

For example, if we insert data that has already been sorted, our n list items produce a tree of height n. Say we have a list of [5, 9, 11, 12]. This creates a binary tree of:

5
 9
  11
   12

So with the above example, we have O(n) runtime because we only have right insertions, creating a skinny tree. Our implementation depends a lot on randomization, hoping the insertion order gives good results.

There are different implementations of binary search trees called balanced binary search trees, such as red-black trees and splay trees, that guarantee the height will always be O(log n) (by adjusting the tree a little after each insertion, keeping it close enough to balanced).

##Priority Queues

Many algorithms process items in a specific order. Priority Queues are data structures that provide more flexibility than simple sorting; they allow new elements to enter a system at arbitrary intervals. It's more cost-effective to insert a new job into a priority queue than to re-sort everything on each new item's arrival. So the idea is that you can add items at any time, but only the item with the highest priority can be removed first.

Priority Queues are often implemented with heaps since we’re sure that the value of a node is less than all the values in its subtrees. The smallest element is the one with the highest priority.

We have three primary operations: insert, find-minimum (or find-maximum), and delete-minimum (or delete-maximum).

###Priority Queue Example

Examples of priority queues could be:

###Priority Queue Runtimes

When the data structure is: an unsorted array, a sorted array, or a balanced search tree, the runtimes are:

            Unsorted    Sorted      Balanced
            Array       Array       Tree
Insert      O(1)        O(n)        O(log n)
Find Min    O(1)        O(1)        O(1)
Delete Min  O(n)        O(1)        O(log n)

Notice how find minimum is in constant time for each of these data structures. The trick is to use an extra variable to store the pointer/index to the minimum entry in each of these structures so we can return this value whenever we’re asked to find the minimum.
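
As a quick illustration, Python's heapq module treats a plain list as a min-heap, which is a common way to get a priority queue:

import heapq

jobs = []
heapq.heappush(jobs, (2, 'write report'))  # (priority, task)
heapq.heappush(jobs, (1, 'fix outage'))
heapq.heappush(jobs, (3, 'read email'))

print(jobs[0])              # find min: (1, 'fix outage'), O(1)
print(heapq.heappop(jobs))  # delete min: (1, 'fix outage'), O(log n)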

##Designing Algorithms

We briefly cover the structure of data, then go into a couple of design approaches with incremental and divide and conquer, which are opposites of each other.

####Approach: Incremental

Incremental is the repetition of a block of statements. An example is:

a = 0
for i in range(10):  # 0, 1, 2 ... 8, 9
    a += 1
print a  # 10

####loop invariant

As we create our loops, we need to be aware of loop invariants (aka invariants), general conditions that have to remain true. We check three properties:

  1. initialization means it is true before the first iteration of the loop
  2. maintenance means it remains true before the next iteration
  3. termination means when the loop terminates, the invariant gives us a useful property that helps show that the algorithm is correct.

####Approach: Divide and Conquer

The divide and conquer approach is to break apart a large problem into several subproblems that are similar to the original problem but smaller in size. We solve the subproblems recursively (i.e. they call themselves) until they reach the base case, and then combine these solutions to create a solution to the original problem.

  1. divide means to split the problem into subproblems that are smaller instances of the same problem.
  2. conquer means solving the subproblems recursively. If the subproblem size is small enough, just solve the subproblems in a straightforward manner.
  3. combine means to combine the subproblem solutions into the solution for the original problem.

####recursion

Recursion is a method where the solution to a problem depends on solutions to smaller instances of the same problem. You can visually see this as a recursion tree, which is a tree diagram of recursive calls. Recursion has to obey three laws:

  1. A recursive algorithm must have a base case.
  2. A recursive algorithm must change its state and move toward the base case.
  3. A recursive algorithm must call itself, recursively.

Recursive Example of calculating the Fibonacci number

def fib(n):
    """ return the Fibonacci number """
    if n==0:
        return 0
    elif n==1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

Recursive Example of Removing Duplicate Letters that are next to each other

def removeDups(word):
    if len(word) <= 1:
        return word
    elif word[0] == word[1]:
        return removeDups(word[1:])
    else:
        return word[0] + removeDups(word[1:])

word = 'aaaabbbbbcccdd'
print word  # aaaabbbbbcccdd
print removeDups(word)  # abcd

Other examples of recursive problems include:

##Sorting and Searching

Sorting is the basic building block that many other algorithms are built around. By really understanding sorting, we can then understand how to solve other problems. A lot of the interesting ideas in the design of algorithms appear in the context of sorting (e.g. divide-and-conquer, data structures, randomized algorithms).

There are several fundamental sorting algorithms, including:

###Sorting Algorithms - Comparison Sorts

A basic computational problem is the sorting problem, where you sort a sequence of n numbers (aka keys). We apply the above general approaches (incremental, divide and conquer) using different types of algorithms. The following algorithms (insertion sort, bubble sort, selection sort, merge sort, heapsort, and quicksort) are all comparison sorts (i.e. they determine the order of an input array by comparing elements).

How to use Sorting:

A really important algorithm design technique is to use sorting as a basic building block because many other problems become easy once a set of items is sorted.

###Sorting Runtimes

Clever sorting algorithms exist that run in O(n log n) whereas naive sorting algorithms are really slow with O(n^2). Let’s look at some sample numbers for n:

n           n^2/4           n lg n
10          25              33
100         2500            664
1000        250000          9965
10000       25000000        132877
100000      2500000000      1660960

###Applications of Searching

Some sample applications of searching are:

####Sorting Example Problem

Say you have two sets (size m and size n) and need to determine if they’re disjoint (i.e. they have no element in common). Analyze the worst-case complexity in terms of m and n if m is substantially smaller than n. You have a few different algorithms that you can use; one reasonable approach is sketched below.
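A sketch of the “sort the smaller set” strategy, assuming m << n: sort the smaller set in O(m log m), then binary search it for each of the n elements of the bigger set, for O((n + m) log m) total:

from bisect import bisect_left

def disjoint(small, big):
    """ True if the two lists share no elements; sort the smaller
    set (O(m log m)), then binary search it for each of the n
    elements of the bigger set (O(n log m)) """
    small = sorted(small)
    for item in big:
        i = bisect_left(small, item)
        if i < len(small) and small[i] == item:
            return False  # found a common element
    return True

print(disjoint([3, 7], [1, 2, 4, 8, 9]))  # True
print(disjoint([3, 7], [1, 2, 3, 8, 9]))  # False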

####Pragmatics of Sorting

So now that we know the importance of sorting and how to start sorting, we need to ask ourselves what order do we want our items sorted? Some questions to ask are:

####Why learn Sorting?

Again, you might wonder why we want to learn the different ways of sorting when you’re usually better off not implementing them and instead using a built-in library function. The answer is that these design techniques are very important for other algorithm problems that you will encounter.

####Incremental: insertion sort

Insertion sort is a simple sorting algorithm based on the incremental approach; it sorts in place, is efficient at sorting a small number of elements, and is useful when items are already mostly sorted. The idea is to always maintain a sorted sublist in the lower portion of the list; what this means is we will have a ‘sorted’ and an ‘unsorted’ group. See example steps:

 sorted|unsorted
      0|5,3,-1
    0,5|3,-1
  0,3,5|-1
-1,0,3,5|

So what are we doing? We move items one at a time from the ‘unsorted group’ to the ‘sorted group’ (this is where the new item gets ‘inserted’). Notice we have a base case where the first item is sorted (because it is the only item in the sorted group). When we move an item from unsorted to sorted, we compare it with the sorted items from right to left, shifting each larger sorted item over by one until we find the spot where the new item belongs. Basically, we move the invisible line that divides the ‘sorted’ and ‘unsorted’ groups one item at a time.

For example, if we want to sort a hand of playing cards:

  1. Start with an empty left hand and all cards face down on the table
  2. We remove one card from the table and insert it into the correct position in the left hand (the first card is the base case; it's already sorted)
  3. To find the correct position, we compare it with each of the cards already in the left hand (from right to left); this way the left hand cards are always sorted

Example Code:

""" Insertion Sort """

def insertionSort(mylist):
    for index in range(1, len(mylist)):
        print "Index is ", index  # 1, 2, 3, 4, 5, 6, 7, 8; this is the outer loop

        # setup first case (only one item)
        currentvalue = mylist[index]
        position = index

        # this is the inner loop, loops through the sorted list backwards and compares values
        while position > 0 and mylist[position-1] > currentvalue:
            mylist[position] = mylist[position-1]
            position = position - 1

        # found spot in inner sorted loop to place item
        mylist[position] = currentvalue


if __name__ == '__main__':
    mylist = [54,26,93,17,77,31,44,55,20]
    print "Original: ", mylist
    insertionSort(mylist)
    print "Insertion Sorted: ", mylist

So why would you want one of the O(n^2) algorithms when there are O(n log n) algorithms? Well, insertion sort is good when:

####Incremental: bubble sort

Bubble sort (aka sinking sort, ripple sort) is a simple but inefficient sorting algorithm that repeatedly goes through the list to be sorted, compares each pair of adjacent items, and swaps them if they are in the wrong order.

Here is a sample run:

# first pass
5 1 4 2 8  # original
1 5 4 2 8  # swap the first pair (5, 1)
1 4 5 2 8  # swap the next pair (5, 4)
1 4 2 5 8  # swap 5 and 2
1 4 2 5 8  # check the last pair (5, 8), but no swap needed (5 < 8)

# second pass
1 4 2 5 8  # check the first pair (1, 4), no swap needed (1 < 4)
1 2 4 5 8  # swap 2 and 4
1 2 4 5 8  # check the next pair (4, 5), but no swap needed (4 < 5)
1 2 4 5 8  # check the last pair (5, 8), but no swap needed (5 < 8)

# third pass checks through each pair, but no swaps needed since it's sorted

For example, say we were sorting scrabble tiles into alphabetical order.

  1. Place letters on the tile holder and look at the first block.
  2. Look at the block to the right of it.
  3. If the block to the right should come before the block on the left, swap them.
  4. Compare the next block in line with the first and repeat step 3.
  5. Begin step 1 again with the second block.

The name bubble sort is because elements tend to move up into the correct order like bubbles rising to the surface and you see a rippling effect for the ones that are not in the correct order. After each pass of the bubble sort, one item is definitely sorted; a total of n-1 passes to sort n items. Big O Runtime is O(n^2).

Example Code

# Bubble Sort
def bubbleSort(mylist):
    for passnum in range(len(mylist)-1, 0, -1):
        #print passnum
        # backwards (8,7,6,..2,1) b/c other items are already sorted
        for i in range(passnum):
            if mylist[i] > mylist[i+1]:  # compare neighbors
                mylist[i], mylist[i+1] = mylist[i+1], mylist[i]  # swap

if __name__ == '__main__':
    mylist = [54,26,93,17,77,31,44,55,20]
    print "Original: ", mylist
    bubbleSort(mylist)
    print "Bubble Sorted: ", mylist

####Incremental: selection sort

Selection sort improves on bubble sort by making only one exchange for every pass through the list. The selection sort finds the largest value as it makes its pass and after completing the pass, places it in the correct location/order. What happens is that there is a ‘swap’ (where we put the largest value into the largest index; the item that was previously in the largest index is swapped over to where the previous largest value was).

Here is an example run:

# sorted | unsorted
64 25 12 22 11   # initial state
11|25 12 22 64   # find the smallest value (11) and swap it with the first element (64)
11 12|25 22 64   # find the next smallest value from unsorted (12) and swap with the second element (25)
11 12 22|25 64   # find the next smallest value from unsorted (22) and swap with the third element (25)
11 12 22 25|64   # find the next smallest value from unsorted (25); it's already in place, no swap needed
11 12 22 25 64|  # all sorted

Similar to bubble sort, after the initial pass, the largest item appears in place. The final item is in place after n-1 passes to sort n items. This is slightly faster than bubble sort since we don’t have to do as many exchanges. Big O Runtime is still O(n^2).

""" Selection Sort from largest to smallest """

def selectionSort(mylist):
    for fillslot in range(len(mylist)-1, 0, -1):
        #print fillslot
        # backwards (8,7,6,..2,1) b/c other items are already sorted
        positionOfMax = 0
        for i in range(1, fillslot+1):
            if mylist[i] > mylist[positionOfMax]:  # is value greater than value at max
                positionOfMax = i

        # to move the largest value to the largest index, we 'swap' the item
        # currently in the largest index position
        mylist[fillslot], mylist[positionOfMax] = mylist[positionOfMax], mylist[fillslot]


if __name__ == '__main__':
    mylist = [54,26,93,17,77,31,44,55,20]
    print "Original: ", mylist
    selectionSort(mylist)
    print "Selection Sorted: ", mylist

The algorithm can be changed to swap out the smallest item instead of the largest (depending on how you want to sort). If so, the idea is the same; we select the smallest unsorted item (instead of the largest) and then swap it with the item in the next position to be filled.

We look through the entire array for the smallest element, once you find it you swap it (smallest element with the first element of the array). Then you look for the smallest element in the remaining array (the array without the first element) and swap with the second element, etc.

Realistically, you wouldn’t normally need selection sort because it is O(n^2) and so is not an optimal solution for large lists.

####Divide and Conquer: merge sort

Merge sort is a recursive algorithm that uses the divide-and-conquer approach to keep splitting a list in half until we’re down to individual items. Don’t worry if the list has an odd number of items; just put the extra item in either half. We then go in reverse (instead of splitting), combining the items back together using a merge, which compares the items of each pair of sublists and builds a larger sorted list from them. The trick is that since all of our sublists are already sorted, when we combine a pair of lists, we only need to look at the first element of each list.

Use mergesort when quicksort is impractical (i.e. when you can’t risk quicksort’s worst-case runtime of O(n^2)). Mergesort is also better suited to parallel computing.

Here’s a chart:

Algorithm   Time Complexity                          | Space Complexity
            Best         Average      Worst          | Worst
Quicksort   Ω(n log(n))  Θ(n log(n))  O(n^2)         | O(log(n))
Mergesort   Ω(n log(n))  Θ(n log(n))  O(n log(n))    | O(n)

Here is an example run:

# position  0 1 2 3 4 5 6 7
            9 7 3 8 4 5 6 2   # original values in list

# start splitting up (divide by two)
9 7 3 8 | 4 5 6 2   # divide into two lists
9 7 | 3 8 | 4 5 | 6 2   # divide by two again into four lists
9 | 7 | 3 | 8 | 4 | 5 | 6 | 2   # divide until all items are separated

# now merge back together
7 9 | 3 8 | 4 5 | 2 6   # merge from single items into pairs that are sorted
3 7 8 9 | 2 4 5 6       # merge again; key is that we only compare the first item from each list (since they are sorted)
2 3 4 5 6 7 8 9         # merge again, get the final sorted list

Merge sort follows the divide and conquer paradigm:

  1. divide means splitting the n-element sequence into two subsequences of n/2 elements each
  2. conquer by sorting the two subsequences recursively using merge sort
  3. combine by merging the two sorted subsequences to produce the sorted answer

Merge sort is good for data that is too big to fit into memory at once because the pattern of storage is regular. It is especially good for data stored as linked lists.

Example Code:

""" Merge Sort """

def mergeSort(mylist):
    print "Splitting", mylist

    if len(mylist) > 1:
        mid = len(mylist) // 2
        lefthalf = mylist[:mid]
        righthalf = mylist[mid:]

        mergeSort(lefthalf)
        mergeSort(righthalf)

        # below code merges the two smaller sorted lists into the larger sorted list
        i = 0  # left half index
        j = 0  # right half index
        k = 0  # main / large sorted list index

        while i < len(lefthalf) and j < len(righthalf):
            # take the smallest value from either left or right half
            if lefthalf[i] < righthalf[j]:
                mylist[k] = lefthalf[i]  # smaller value on lefthalf
                i += 1
            else:
                mylist[k] = righthalf[j]  # smaller value on righthalf
                j += 1
            k += 1

        # insert remaining values from lefthalf
        while i < len(lefthalf):
            mylist[k] = lefthalf[i]
            i += 1
            k += 1

        # insert remaining values from righthalf
        while j < len(righthalf):
            mylist[k] = righthalf[j]
            j += 1
            k += 1
    print "Merging", mylist

if __name__ == '__main__':
    mylist = [54,26,93,17,77,31,44,55,20]
    print "Original: ", mylist
    mergeSort(mylist)
    print "Merge Sorted: ", mylist

####Divide and Conquer: quick sort

Quick sort is an efficient algorithm that does a sort in place by splitting the array into two smaller arrays, one with lower value elements and one with higher value elements based off a ‘pivot’ element.

Quicksort is my go-to sorting algorithm. This is a really good algorithm to know offhand. It performs fast on most real-world data (because its inner loop can be efficiently implemented on most architectures).

Example Run:

6 5 1 3 8 4 6 9 5  # initial list

# we pick a pivot (say for this example it is the last element of
# the list, which is 5; it can also be any item, e.g. the first item
# of the list, or the median)

|6 5 1 3 8 4 6 9 5
# we setup the pivot wall (separated by `|`);
# items to the left of the wall should be smaller than the pivot,
# items to the right of the wall are larger

|6 5 1 3 8 4 6 9 5
# we compare the pivot element 5 with the first element to the right
# of the wall (6); 6 > 5 so it stays on the right side, no swap needed

|6 5 1 3 8 4 6 9 5
# we compare the pivot element 5 with the next element (5);
# 5 == 5 so no swap needed

1|5 6 3 8 4 6 9 5
# our current element 1 is smaller than the pivot element 5, so we swap
# the current element (1) with the lowest-index item on the right side
# of the wall (6) and move the wall one spot to the right; this repeats
# until the pivot ends up in its final sorted position

A good video of quicksort is here: https://www.youtube.com/watch?v=kUon6854joI

It’s similar to merge sort in that it takes the divide and conquer approach. The advantage is that you do not need as much storage as merge sort, but the performance could possibly be diminished (depending on whether the pivot value selected is near the median).

  1. Pick an element from the array; this element is called a pivot
  2. We now want to do a partition operation; this means we want to reorder the array so that all elements with values less than the pivot are on one side while all elements with values greater than the pivot are on the other side (equal values can go either way). After this partitioning, the pivot element is in its final position.
  3. Partitions begin by locating two position markers (e.g. leftmark and rightmark) at the beginning and end of the remaining items in the list. The goal of the partition process is to move items that are on the wrong side with respect to the pivot value while also converging on the split point (i.e. the actual position where the pivot value belongs in the final sorted list)
  4. We increment leftmark until we locate a value that is greater than the pivot value. We then decrement rightmark until we locate a value less than the pivot value. When this happens, we exchange the two items and repeat the process again.

Example Code:

""" Quick Sort in Python3
Quick sort uses divide and conquer to gain the same advantages as merge sort,
with the benefit of using less storage, but at the cost of a worse worst case runtime
O(n^2) if the pivot values are bad.
"""


def quickSort(mylist):
    """ Initialize our recursive function """
    quickSortHelper(mylist, 0, len(mylist)-1)

def quickSortHelper(mylist, first, last):
    """ Recursive function to split up """
    if first < last:  # check if need to sort still

        splitpoint = partition(mylist, first, last)

        # now that we know our splitpoint, we can then recursively run quicksort on the list's bottom half and top half
        quickSortHelper(mylist, first, splitpoint-1)
        quickSortHelper(mylist, splitpoint+1, last)

def partition(mylist, first, last):
    """ Partition process:
    * Pick a pivot value (i.e. what we'll compare our unsorted numbers to)
    * Based on this value, compare our unsorted values and move each item
      to either the left of the pivot or the right of the pivot
    """
    pivotvalue = mylist[first]  # get the first value as pivotvalue

    leftmark = first + 1
    rightmark = last

    done = False
    while not done:

        # Go from leftmost side onwards (to right) and try to find a value
        # that is greater than the pivot value (i.e. left side of pivot should be
        # smaller values than pivot value, if we found one that is greater, we
        # stop at leftmark, saying we need to do a swap to the right side)
        while leftmark <= rightmark and mylist[leftmark] <= pivotvalue:
            leftmark += 1

        # Go from rightmost side inwards (to left) and try to find a value
        # that is less than the pivot value (i.e. right side of pivot should be
        # greater values than pivot value, if we found one that is smaller, we
        # stop at rightmark, saying we need to do a swap to the left side)
        while rightmark >= leftmark and mylist[rightmark] >= pivotvalue:
            rightmark -= 1

        if rightmark < leftmark:
            done = True  # we're done sorting through this list because we've crossed
        else:
            # we have a swap between a value in the left list and a value in the right list
            mylist[leftmark], mylist[rightmark] = mylist[rightmark], mylist[leftmark]

    # Once rightmark is less than leftmark, then rightmark is now the split point.
    # That means what we picked as the pivot value can now be exchanged with the 
    # contents of the split point and the pivot value is now in the correct place
    # Note: remember that our pivot value was the first value in our list
    mylist[first], mylist[rightmark] = mylist[rightmark], mylist[first]

    return rightmark


if __name__ == '__main__':
    mylist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
    print("Original: ", mylist)
    quickSort(mylist)
    print("Quick Sorted: ", mylist)

####Quickselect Algorithm

The quickselect algorithm is a selection algorithm, whereas the quicksort algorithm is a sorting algorithm. It’s similar to quicksort except that we only need to recursively call one side of the partition to be re-partitioned. Instead of quicksort’s average runtime of O(n log n), we get an average of O(n).

What this solves: given an unordered list of items, we can find the k-th smallest (or k-th largest) item in O(n) average time.

""" Python 3 implementation of a quickselect algorithm """
from typing import List
import unittest
import random


class Solution:

    def quickselect(self, items, item_index):
        if items is None or len(items) < 1:
            return None

        if item_index < 0 or item_index > len(items) - 1:
            raise IndexError()

        return self.select(items, 0, len(items) - 1, item_index)

    def select(self, lst, l, r, index):
        # base case
        if r == l:
            return lst[l]

        # choose random pivot
        pivot_index = random.randint(l, r)

        # move pivot to beginning of list
        lst[l], lst[pivot_index] = lst[pivot_index], lst[l]

        # partition
        i = l
        for j in range(l+1, r+1):
            if lst[j] < lst[l]:
                i += 1
                lst[i], lst[j] = lst[j], lst[i]

        # move pivot to correct location
        lst[i], lst[l] = lst[l], lst[i]

        # recursively partition one side only
        if index == i:
            return lst[i]
        elif index < i:
            return self.select(lst, l, i-1, index)
        else:
            return self.select(lst, i+1, r, index)


class SolutionTest(unittest.TestCase):

    def test_quickselect(self):
        s = Solution()
        response = s.quickselect([12, 2, 4, 3, 5], 2)
        assert response == 4

        response = s.quickselect([12, 2, 4, 3, 5], 0)
        assert response == 2
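
# to run the test case above (standard unittest runner)
if __name__ == '__main__':
    unittest.main()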

####Divide and Conquer: heap sort

Heap sort takes the best properties of merge sort (the run time) and the efficiency of insertion sort’s ability to sort in place (i.e. only a constant number of elements are stored outside the input array at any time). What makes heapsort unique is that it uses a data structure called a heap to help manage information (instead of a linear-time search); heapsort divides its input into a sorted and an unsorted region and it iteratively shrinks the unsorted region by extracting the largest element and moving that to the sorted region. The heap is also really efficient at managing priority queues.

Really, heapsort is just an implementation of selection sort using the right data structure.

Note: In programming languages like Java or Lisp, heap refers to ‘garbage-collected storage’; this is not what we’re talking about here.

Given an array A that represents a heap, we can look at the two attributes (length, heap-size) and determine what part is valid (i.e. the correctly sorted region) and what is still the heap (i.e. unsorted).

#####Heaps

Heaps are a simple data structure for efficiently working with priority queue operations insert and extract-min. They work by maintaining a partial order on the set of elements that are weaker than the sorted order (i.e. efficient to maintain), yet stronger than random order (so finding minimum element is quick)

Power in any hierarchically-structured organization is reflected by a tree, where each node represents a person. Each edge (x, y) implies that x directly supervises y (aka x __dominates__ y). The top person is at the root of the heap / top of the heap.

There are a couple of variations of the heap-labeled tree; in each we have a binary tree where the key labeling each node dominates the key labeling of each of its children.

A binary tree might look like:

                 1492
              /        \
           1783        1776
          /     \     /    \
        1804   1865 1945  1963
        /
      1918

This binary tree might be represented with each node holding pointers to its two children. However, the memory used by the pointers can easily outweigh the size of the keys.

The heap is a data structure that will let us represent binary trees without using any pointers. We store data as an array of keys, and the important difference is that we will use the position of the keys to implicitly satisfy the role of the pointers.

We store the root of the tree in the first position of the array, the left child in the second position, and the right child in the third. In general the positions can be calculated: for a key at position k, the left child sits at position 2k and the right child at position 2k+1, while the parent of k sits at position k/2 (using integer division). We can then move around the tree without any pointers.

1   1492
2   1783
3   1776
4   1804
5   1865
6   1945
7   1963
8   1918
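In code, the navigation is just index arithmetic (a sketch assuming the 1-indexed layout above, with index 0 left unused):

heap = [None, 1492, 1783, 1776, 1804, 1865, 1945, 1963, 1918]  # index 0 unused

def parent(k):
    return k // 2

def left_child(k):
    return 2 * k

def right_child(k):
    return 2 * k + 1

print(heap[left_child(2)])  # 1804, the left child of 1783
print(heap[parent(8)])      # 1804, the parent of 1918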

####Heap Issues

Space

We can store any binary tree in an array without pointers, but the issue is that we might get a tree that is sparse. If we were given a tree with height h that was sparse (number of nodes n < 2^h), then all the missing internal nodes would still take up space in our structure, since we need to represent the full binary tree to maintain the positional mapping between parents and children. Space efficiency demands that we not allow holes in our tree (i.e. each level should be packed as much as it can be).

Search for a particular key

We can’t search for a particular key in a heap because we don’t know anything about the relative order of the n/2 leaf elements in a heap. We have to do a linear search through them. In order to do a binary search (i.e. we have a sorted list, then divide by 2 to see if we’re looking at the upper or lower half, etc.), we need a binary search tree. We’re missing the order of elements in the heap.

####Constructing a Heap

Heaps are constructed incrementally, by inserting each new element into the left-most open spot in the array, namely the (n+1)st position of a previously n-element heap. We get a balanced shape out of a heap-labeled tree, but note that we don’t necessarily maintain the dominance ordering of the keys.

The new key might be less than its parent in a min-heap or greater than its parent in a max-heap. We’ll need to swap any dissatisfied element with its parent. We do this process repeatedly, essentially bubbling up the new key to its right position. The swap process takes constant time. With an n element heap, we have a height of lg n, which makes each insertion take at most O(log n) time. That means an initial construction of the heap takes O(n log n) time.
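A minimal sketch of insert with bubble_up (my own illustration; a min-heap stored 1-indexed as above):

def bubble_up(heap, k):
    # swap the new key with its parent until the parent dominates it
    while k > 1 and heap[k] < heap[k // 2]:
        heap[k], heap[k // 2] = heap[k // 2], heap[k]
        k = k // 2

def insert(heap, key):
    heap.append(key)                 # left-most open spot, the (n+1)st position
    bubble_up(heap, len(heap) - 1)   # at most O(log n) swaps

heap = [None]  # index 0 unused
for key in [1783, 1492, 1865, 1776]:
    insert(heap, key)
print(heap)  # [None, 1492, 1776, 1865, 1783]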

We can make faster heaps using the bubble_down procedure, which will create heaps in linear time instead of O(n log n) time. However, note that faster construction time does not help improve worst-case performance.

####Extracting the Minimum from a Heap

We can identify the dominant element of a heap since it sits at the top of the heap in the first position of the array. Removing the top element leaves a hole in the array. We can fill the hole by moving the element from the right-most leaf (at the nth position of the array) into the first position.

Heapify with bubble_down

The shape of the tree is restored, however the labeling of the root might not satisfy the heap property (e.g. in a min-heap, the new root may be dominated by both of its children). We can fix this by bubbling the element down the heap until it dominates all its children; this percolate-down operation is called heapify because it merges two heaps (the subtrees below the original root) with a new key.
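A matching bubble_down sketch (again a 1-indexed min-heap; extract_min is my own illustrative name):

def bubble_down(heap, k):
    n = len(heap) - 1  # number of real elements (index 0 unused)
    while True:
        smallest = k
        for child in (2 * k, 2 * k + 1):
            if child <= n and heap[child] < heap[smallest]:
                smallest = child
        if smallest == k:
            return  # k dominates both children; heap property restored
        heap[k], heap[smallest] = heap[smallest], heap[k]
        k = smallest

def extract_min(heap):
    minimum = heap[1]
    heap[1] = heap[-1]  # move the right-most leaf into the hole at the root
    heap.pop()
    if len(heap) > 1:
        bubble_down(heap, 1)
    return minimum

heap = [None, 1492, 1776, 1865, 1783]
print(extract_min(heap))  # 1492
print(heap)               # [None, 1776, 1783, 1865]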

####Heapsort

When we exchange the maximum element with the last element and call heapify repeatedly, we get an O(n log n) sorting algorithm called Heapsort. Heapsort is easy to program, runs an in-place sort (i.e. uses no extra memory) and has a worst case of O(n log n) time, which is the best that you can expect from any comparison-based sorting algorithm.

There are some other algorithms that are slightly faster in practice, but you can’t go wrong with heapsort for sorting data that sits in your computer’s memory.
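As a compact illustration of the idea (using Python’s built-in heapq rather than a hand-rolled heap, so unlike textbook heapsort this version is not in place):

import heapq

def heapsort(items):
    heap = list(items)
    heapq.heapify(heap)  # builds the heap in linear time
    # repeatedly extract the minimum; each extraction is O(log n),
    # so the whole sort is O(n log n)
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heapsort([54, 26, 93, 17, 77, 31, 44, 55, 20]))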

####(Binary) Heap

A (binary) heap data structure is an array object that we can view as a binary tree. Think of this algorithm as two parts:

  1. We have some data (e.g. a list of [6, 5, 3, 1, 8, 7, 2, 4]) that we use to create the heap, a data structure that looks like a binary tree.

A heap is a complete binary tree in which each node is less than all the values in its subtrees (or greater, if reversed). An example looks like:

           3
          /  \
        9     4
       / \   / \
      26 10 18 20

As we’re building this binary tree, the heap swaps elements depending on the type (min or max) of the binary heap (e.g. when sorting smallest to largest with a max-heap, larger nodes don’t stay below smaller node parents and end up swapping; 8 can’t stay below 5 in the heap). Once the binary tree is built, we have a tree where each array index represents a node and also encodes the index of the node’s parent, left child branch, and right child branch.

  2. We then create a sorted array by repeatedly removing the largest element from the root of the heap and inserting it into the array. The heap is updated after each removal to maintain the heap property. The heap incrementally gets smaller until all objects have been removed, leaving only a sorted array.

####Priority Queue

As mentioned earlier, heap sort is great for creating priority queues, which is a data structure for maintaining a set S of elements, each with an associated value called a key. There’s max-priority queue (e.g. used to schedule jobs on a server and set their relative priorities) and a min-priority queue (e.g. used for discrete event-driven simulation models like determining how many patients can be served from changing 8 hours to 9 hours of operation when avg surgery takes say 4 hours).

##Sorting Algorithms with Linear Time (Decision Tree Models)

The previously mentioned algorithms are comparison sorts, which determine the sort order based only on comparisons between the input elements. We can model comparison sorts with decision trees, which are full binary trees that represent the comparisons between elements (say elements i : j) made by a sorting algorithm; this model shows that any comparison sort needs Ω(n log n) comparisons in the worst case. If we can make assumptions about the input (e.g. the keys are integers in a known range), we can beat that bound and sort in linear time without comparing elements, as the following algorithms do.

####Counting Sort

Placeholder for smart stuff.
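In the meantime, here’s a minimal counting sort sketch (the standard version, assuming non-negative integer keys):

def counting_sort(items):
    """ O(n + k) where k is the range of key values; no comparisons """
    counts = [0] * (max(items) + 1)
    for item in items:
        counts[item] += 1  # tally how many times each key appears
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)  # emit each key in sorted order
    return result

print(counting_sort([5, 1, 4, 2, 8, 2]))  # [1, 2, 2, 4, 5, 8]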

####Radix Sort

Placeholder for smart stuff.

####Bucket Sort

Placeholder for smart stuff.

##Probabilistic Analysis

Probabilistic analysis is the use of probability in the analysis of problems. We can use this in analyzing the running time of an algorithm or we can use it to analyze other quantities, such as who to hire. We have some examples:

####The Hiring Problem

For this example, we want to hire an office assistant. We interview candidates and determine if they are better than the current assistant (if so, replace the current assistant right then and there). There is a cost to hiring and firing someone. We want to find the expected cost after interviewing everyone (which is a fixed n candidates).

Say we have a list and rank the candidates into an ordered list of best possible candidates: [rank1, rank2, rank3, ..., rankn]. Saying that applicants come in a random order is equivalent to saying there are n! permutations of the numbers 1 through n. We call this a uniform random permutation, which means that each of the possible n! permutations appears with equal probability.

We first assume (or make sure, by randomly permuting the input) that candidates arrive in a random order. We can check probabilities and expectations using an indicator random variable. For example, if we flip a coin, we compare the number of times it actually comes up heads (say, using a random number generator) to what we expect.

####The Birthday Paradox

How many people must there be in a room before there is a 50% chance that two of them are born on the same day of the year? We can compute this by multiplying the probabilities that each successive person avoids all the birthdays taken so far, as sketched below.
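A quick sketch of the computation (my own illustration; the answer works out to 23 people):

def people_needed(target=0.5):
    p_no_match, n = 1.0, 0
    while 1 - p_no_match < target:
        n += 1
        # the nth person must avoid the n-1 birthdays already taken
        p_no_match *= (365 - (n - 1)) / 365
    return n

print(people_needed())  # 23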

####Balls and Bins

If we randomly toss identical balls into ‘b’ bins (1 through ‘b’) and assuming the tosses are independent with an equal chance of ending up in any bin, we have the probability that a tossed ball lands in any given bin as 1/b of success (where success is falling into the given bin). The ball tossing can be seen as a sequence of Bernoulli trials (i.e. a binomial trial, a random experiment where there are exactly two possible outcomes; success and failure). This answers questions like:

####Streaks

If you flip a fair coin ‘n’ times, what is the longest streak of consecutive heads that you expect to see?

##Graph Theory

Graph theory is the study of graphs, which are mathematical structures used to model pairwise relations between objects. A graph is made up of vertices (aka nodes; note that a node is called a leaf if it has no children) and lines that connect them are called edges.

###Famous Graph Algorithms

####Dijkstra’s Shortest Path

Dijkstra’s shortest path algorithm is commonly used to find the shortest paths on a graph from a source to all other vertices.
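A sketch of the standard heap-based version (my own illustration; the adjacency-dict format mirrors the adjacency list shown later in the graph section):

import heapq

def dijkstra(graph, source):
    """ graph: {vertex: {neighbor: edge_weight}}; returns the shortest
    distance from source to every reachable vertex """
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, v = heapq.heappop(pq)
        if d > dist.get(v, float('inf')):
            continue  # stale entry; we already found a shorter path to v
        for neighbor, weight in graph.get(v, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float('inf')):
                dist[neighbor] = candidate
                heapq.heappush(pq, (candidate, neighbor))
    return dist

graph = {'a': {'b': 2, 'c': 5}, 'b': {'c': 1}, 'c': {}}
print(dijkstra(graph, 'a'))  # {'a': 0, 'b': 2, 'c': 3}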

####A* search algorithm

The A* (A Star) algorithm is used in pathfinding and graph traversal, which is the process of finding a path between multiple points (aka nodes). It’s used due to its performance and accuracy, but in practical travel-routing systems it is generally outperformed by algorithms that can pre-process the graph to attain better performance.

####Prim’s Algorithm for Minimum Spanning Tree

Prim’s Algorithm for Minimum Spanning Tree is a greedy algorithm that builds a spanning tree (all vertices must be connected) by repeatedly taking the cheapest edge that connects the two disjoint subsets of vertices (those already in the tree and those not yet in it).

TODO: How this algorithm works
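In the meantime, here’s a rough sketch of how a heap-based version can work (an illustration, not a full treatment):

import heapq

def prim(graph, start):
    """ graph: {vertex: {neighbor: edge_weight}} (undirected, so each
    edge appears in both directions); returns the MST's edges """
    visited = {start}
    edges = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(edges)
    mst = []
    while edges and len(visited) < len(graph):
        weight, u, v = heapq.heappop(edges)  # cheapest edge crossing the cut
        if v in visited:
            continue
        visited.add(v)
        mst.append((u, v, weight))
        for nxt, w in graph[v].items():
            if nxt not in visited:
                heapq.heappush(edges, (w, v, nxt))
    return mst

graph = {
    'a': {'b': 1, 'c': 4},
    'b': {'a': 1, 'c': 2},
    'c': {'a': 4, 'b': 2},
}
print(prim(graph, 'a'))  # [('a', 'b', 1), ('b', 'c', 2)]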

####Trees

Trees are a type of graph. They’re described as:

  * Each tree has a root node
  * The root node has zero or more child nodes
  * Each child node has zero or more child nodes, and so on

The tree cannot contain cycles. The nodes may or may not be in a particular order, they could have any data type as values, and they may or may not have links back to their parent nodes.

Real life examples of trees

Pieces of Trees

How to approach trees

We can think of trees two different ways, as a large set of rules or as a smaller recursive set of rules:

  1. A tree consists of a set of nodes and a set of edges that connect pairs of nodes. Trees have the following properties:
    • One node of the tree is designated as the root node
    • Every node n, except the root node, is connected by an edge from exactly one other node p, where p is the parent of n
  2. Think recursively; a tree is either empty or consists of a root and zero or more subtrees, each of which is also a tree.

Types of Trees

There are different types of trees that can take on different properties:

Traversing Trees

You can traverse binary trees using a variety of methods (the most common being ‘in-order’ traversal):

Tries (prefix trees)

A trie (aka prefix tree) is a special kind of tree; a trie is like an n-ary tree in which characters are stored at each node and each path down the tree may represent a word.

There are * nodes (aka null nodes) that are used to indicate complete words (e.g. Many* means many is a complete word or Ma* means there are lots of words that start with Ma).

A trie is commonly used to store the entire English language for quick prefix lookups. A hash table can quickly look up whether a string is a valid word, but it cannot tell us if a string is a prefix of any valid words.
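A minimal trie sketch (my own illustration, using a dict per node and a '*' key as the complete-word marker described above):

def make_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})  # walk/create one node per character
        node['*'] = True  # '*' marks a complete word
    return root

def is_word(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return '*' in node

def is_prefix(trie, prefix):
    node = trie
    for ch in prefix:
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = make_trie(['many', 'may', 'ma'])
print(is_word(trie, 'ma'))     # True
print(is_word(trie, 'man'))    # False, but...
print(is_prefix(trie, 'man'))  # True, 'man' is a prefix of 'many'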

#####Types of Trees

We have different types of trees including:

####Example Trees

Trees belong to the category of directed acyclic graphs (DAGs). Git’s commit history is a great example of a DAG (though not a tree, since merge commits have two parents); another example is an Airflow scheduled job.

####Graphs

A tree is a type of graph, but not all graphs are trees. A tree is a connected graph without cycles. A graph is a collection of nodes with edges between (some of) them. Graphs are used to model pairwise relations between objects.

####Real Life examples of Graphs

Computers can use graphs to find the shortest, quickest, or easiest path from one place to another. Some real life scenarios include:

####Classic examples of Graphs

####Vocabulary and Definitions for Graphs

A graph can be represented by G=(V,E), where for the graph G, V is a set of vertices and E is a set of edges. Each edge is a tuple (v, w), where v and w are vertices in V. Optionally, we can add a third component to the edge tuple to represent a weight.

####Graph Abstract Data Type (ADT)

This is how to setup a graph abstract data type (ADT) that we can build off of:

####Ways to store a graph (adjacency list and matrix)

You can store graphs as an adjacency list (the most common way) or as an adjacency matrix. When two vertices are connected by an edge, they are adjacent, hence the names. There are advantages and disadvantages to both.

Example adjacency matrix

    v0  v1  v2  v3  v4  v5
v0       5               2
v1           4
v2               9
v3                   7   3
v4   1
v5           1       8

Example adjacency list

v0: id=v0, adj={v1:5, v5:2}
v1: id=v1, adj={v2:4}
v2: id=v2, adj={v3:9}
v3: id=v3, adj={v4:7, v5:3}
v4: id=v4, adj={v0:1}
v5: id=v5, adj={v2:1, v4:8}

####Adjacency List Implementation

To implement an ‘adjacency list’, we need to implement two pieces: a Graph class, which holds the master list of vertices, and a Vertex class, which represents each vertex in the graph and tracks the vertices it is connected to (along with edge weights).
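A minimal sketch of those two classes (modeled on the common Vertex/Graph pattern; the method names are illustrative):

class Vertex:
    def __init__(self, key):
        self.id = key
        self.adj = {}  # neighbor Vertex -> edge weight

    def add_neighbor(self, neighbor, weight=0):
        self.adj[neighbor] = weight

class Graph:
    def __init__(self):
        self.vertices = {}  # key -> Vertex

    def add_vertex(self, key):
        self.vertices[key] = Vertex(key)

    def add_edge(self, src, dst, weight=0):
        # directed edge here; add the reverse edge too for undirected graphs
        for key in (src, dst):
            if key not in self.vertices:
                self.add_vertex(key)
        self.vertices[src].add_neighbor(self.vertices[dst], weight)

g = Graph()
g.add_edge('v0', 'v1', 5)
g.add_edge('v0', 'v5', 2)
print({v.id: w for v, w in g.vertices['v0'].adj.items()})  # {'v1': 5, 'v5': 2}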

####Ways to search a graph

After constructing a graph, the two most common ways to search it are depth-first search and breadth-first search.

The heart of a BFS is a queue, which decides which vertex to explore next.
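A BFS sketch using a queue (my own illustration with collections.deque; the graph is a simple adjacency dict):

from collections import deque

def bfs(graph, start):
    """ graph: {vertex: [neighbors]}; returns vertices in the order
    they are explored """
    visited = {start}
    queue = deque([start])  # the queue decides which vertex to explore next
    order = []
    while queue:
        v = queue.popleft()  # FIFO: oldest discovered vertex first
        order.append(v)
        for neighbor in graph.get(v, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(bfs(graph, 'a'))  # ['a', 'b', 'c', 'd']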

####Trees vs Graphs

You can think of a tree as just a restricted form of a graph. The main difference is that a tree is set up so that each node has only one parent, while nodes in a graph can have multiple predecessors (i.e. we don’t say ‘parent’ for graphs because there can be multiple connections). Trees have a root node while graphs don’t.

Trees also don’t contain cycles. Trees fit into a category called Directed Acyclic Graphs (DAG), meaning that there are no cycles (i.e. acyclic). Graphs can be cyclic or acyclic (remember that a tree is always a graph, but a graph is not always a tree).

An example of a tree would be a scheduled job that depends on previous jobs. An example of a graph would be a map with cities being connected.

##NP-Complete Problems

NP-Complete stands for nondeterministic polynomial time complete. For these problems there is no known way to find a solution quickly (i.e. the time required to solve the problem using any currently known algorithm increases rapidly as the size of the problem grows), but a proposed solution can be verified quickly. Note: by quickly, we mean in polynomial time.

Some famous NP-Complete problems are:

###NP-Hard Problems

An NP-Hard problem is one that is at least as hard as every problem in NP (i.e. every problem in NP can be reduced to it in polynomial time). Unlike NP-Complete problems, an NP-Hard problem does not itself have to be verifiable in polynomial time.

##Gotchas

Remember that whiteboarding is a bit different than coding because you can’t just run your code. Some things to watch out for include: