If you like this project, please give me a star. ★
This is my multi-month study plan for going from web developer (self-taught, no CS degree) to Google software engineer.
This long list has been extracted and expanded from Google's coaching notes, so these are the things you need to know. There are extra items I added at the bottom that may come up in the interview or be helpful in solving a problem.
Many items are from Steve Yegge's "Get that job at Google" and are reflected sometimes word-for-word in Google's coaching notes.
I'm following this plan to prepare for my Google interview. I've been building the web, building services, and launching startups since 1997. I have an economics degree, not a CS degree. I've been very successful in my career, but I want to work at Google. I want to progress into larger systems and get a real understanding of computer systems, algorithmic efficiency, data structure performance, low-level languages, and how it all works. And if you don't know any of it, Google won't hire you.
When I started this project, I didn't know a stack from a heap, didn't know Big-O anything, anything about trees, or how to traverse a graph. If I had to code a sorting algorithm, I can tell ya it wouldn't have been very good. Every data structure I've ever used was built into the language, and I didn't know how they worked under the hood at all. I've never had to manage memory, unless a process I was running would give an "out of memory" error, and then I'd have to find a workaround. I've used a few multi-dimensional arrays in my life and thousands of associative arrays, but I've never created data structures from scratch.
But after going through this study plan I have high confidence I'll be hired. It's a long plan. It's going to take me months. If you are familiar with a lot of this already it will take you a lot less time.
Everything below is an outline, and you should tackle the items in order from top to bottom.
I'm using Github's special markdown flavor, including tasks lists to check my progress.
I check each task box at the beginning of a line when I'm done with it. When all sub-items in a block are done, I put [x] at the top level, meaning the entire block is done. Sorry you have to remove all my [x] markings to use this the same way. If you search/replace, just replace [x] with [ ]. Sometimes I just put a [x] at top level if I know I've done all the subtasks, to cut down on clutter.
More about Github flavored markdown: https://guides.github.com/features/mastering-markdown/#GitHub-flavored-markdown
I have a friendly referral already to get my resume in at Google. Thanks JP.
Print out a "future Googler" sign (or two) and keep your eyes on the prize.
I'm on the journey. Follow along at GoogleyAsHeck.com
Some videos are available only by enrolling in a Coursera or EdX class. It is free to do so, but sometimes the classes are no longer in session so you have to wait a couple of months, so you have no access. I'm going to be adding more videos from public sources and replacing the online course videos over time. I like using university lectures.
-
Videos:
-
Articles:
- http://www.google.com/about/careers/lifeatgoogle/hiringprocess/
- http://steve-yegge.blogspot.com/2008/03/get-that-job-at-google.html
- all the things he mentions that you need to know are listed below
- (very dated) http://dondodge.typepad.com/the_next_big_thing/2010/09/how-to-get-a-job-at-google-interview-questions-hiring-process.html
- http://sites.google.com/site/steveyegge2/five-essential-phone-screen-questions
-
Additional (not suggested by Google but I added):
- https://medium.com/always-be-coding/abc-always-be-coding-d5f8051afce2#.4heg8zvm4
- https://medium.com/always-be-coding/four-steps-to-google-without-a-degree-8f381aa6bd5e#.asalo1vfx
- https://medium.com/@dpup/whiteboarding-4df873dbba2e#.hf6jn45g1
- http://www.kpcb.com/blog/lessons-learned-how-google-thinks-about-hiring-management-and-culture
- http://www.coderust.com/blog/2014/04/10/effective-whiteboarding-during-programming-interviews/
- Cracking The Coding Interview Set 1:
- How to Get a Job at the Big 4:
- http://alexbowe.com/failing-at-google-interviews/
This short section were prerequisites/interesting info I wanted to learn before getting started on the daily plan.
You can use a language you are comfortable in to do the coding part of the interview, but for Google, these are solid choices:
- C++
- Java
- Python
You need to be very comfortable in the language, and be knowledgeable. Read more (rescued from the lost web): - https://web.archive.org/web/20160204193730/http://blog.codingforinterviews.com/best-programming-language-jobs/
You'll see some C, C++, and Python learning included below, because I'm learning. There are a few books involved, see the bottom.
-
How computers process a program:
-
How floating point numbers are stored:
-
Computer Arch Intro: (first video only - interesting but not required) https://www.youtube.com/watch?v=zLP_X4wyHbY&list=PL5PHm2jkkXmi5CxxI7b3JCL1TWybTDtKq&index=1
-
C
-
C++
- C++ Cheat Sheet
- STL Cheat Sheet
- basics
- pointers
- functions
- references
- templates
- compilation
- scope & linkage
- namespaces
- OOP
- STL
- functors: http://www.cprogramming.com/tutorial/functors-function-objects-in-c++.html
- C++ at Google: https://www.youtube.com/watch?v=NOCElcMcFik
- Google C++ Style Guide: https://google.github.io/styleguide/cppguide.html
- Google uses clang-format (there is a command line "style" argument: -style=google)
- Efficiency with Algorithms, Performance with Data Structures: https://youtu.be/fHNmRkzxHWs
- review of C++ concepts: https://www.youtube.com/watch?v=Rub-JsjMhWY
-
Python
- Python Cheat Sheet
- Python in One Video: https://www.youtube.com/watch?v=N4mEzFDjqtA
- Series on 3.4: https://www.youtube.com/playlist?list=PL6gx4Cwl9DGAcbMi1sH6oAMk4JHw91mC_
- Zero to Hero: https://www.youtube.com/watch?v=emY34tSKXc4
- Statistics for Hackers: https://www.youtube.com/watch?v=Iq9DzN6mvYA
- Faster Python: https://www.youtube.com/watch?v=JDSGVvMwNM8
- CPython Walk: https://www.youtube.com/watch?v=LhadeL7_EIU&list=PLzV58Zm8FuBL6OAv1Yu6AwXZrnsFbbR0S&index=6
- 10 Tips for Pythonic Code: https://www.youtube.com/watch?v=_O23jIXsshs
- Beyond PEP 8 -- Best practices for beautiful intelligible code: https://www.youtube.com/watch?v=wf-BqAjZb8M
-
Compilers
- https://class.coursera.org/compilers-004/lecture/1
- https://class.coursera.org/compilers-004/lecture/2
- C++: https://www.youtube.com/watch?v=twodd1KFfGk
- Understanding Compiler Optimization (C++): https://www.youtube.com/watch?v=FnGCDLhaxKU
Each subject does not require a whole day to be able to understand it fully, and you can do multiple of these in a day.
Each day I take one subject from the list below, watch videos about that subject, and write an implementation in: C - using structs and functions that take a struct * and something else as args. C++ - without using built-in types C++ - using built-in types, like STL's std::list for a linked list Python - using built-in types (to keep practicing Python) and write tests to ensure I'm doing it right, sometimes just using simple assert() statements You may do Java or something else, this is just my thing.
Why code in all of these? Practice, practice, practice, until I'm sick of it, and can do it with no problem (some have many edge cases and bookkeeping details to remember) Work within the raw constraints (allocating/freeing memory without help of garbage collection (except Python)) Make use of built-in types so I have experience using the built-in tools for real-world use (not going to write my own linked list implementation in production)
I may not have time to do all of these for every subject, but I'll try.
You can see my code here:
- C: https://github.com/jwasham/practice-c
- C++: https://github.com/jwasham/practice-cpp
- Python: https://github.com/jwasham/practice-python
You don't need to memorize the guts of every algorithm.
Write code on a whiteboard, not a computer. Test with some sample inputs. Then test it out on a computer to make sure it's not buggy from syntax.
-
Before you get started:
- The myth of the Genius Programmer: https://www.youtube.com/watch?v=0SARbwvhupQ
- Google engineers are smart, but many have an insecurity that they aren't smart enough.
-
Algorithmic complexity / Big O / Asymptotic analysis
- nothing to implement
- Harvard CS50 - Asymptotic Notation: https://www.youtube.com/watch?v=iOq5kSKqeR4
- Big O Notations (general quick tutorial) - https://www.youtube.com/watch?v=V6mKVRU1evU
- Big O Notation (and Omega and Theta) - best mathematical explanation:
- Skiena:
- A Gentle Introduction to Algorithm Complexity Analysis: http://discrete.gr/complexity/
- Orders of Growth: https://class.coursera.org/algorithmicthink1-004/lecture/59
- Asymptotics: https://class.coursera.org/algorithmicthink1-004/lecture/61
- UC Berkeley Big O: https://youtu.be/VIS4YDpuP98
- UC Berkeley Big Omega: https://youtu.be/ca3e7UVmeUc
- Amortized Analysis: https://www.youtube.com/watch?v=B3SpQZaAZP4&index=10&list=PL1BaGV1cIH4UhkL8a9bJGG356covJ76qN
- Illustrating "Big O": https://class.coursera.org/algorithmicthink1-004/lecture/63
- Cheat sheet: http://bigocheatsheet.com/
If some of the lectures are too mathy, you can jump down to the bottom and watch the discrete mathematics videos to get the background knowledge.
-
Arrays: (Implement an automatically resizing vector)
- Description:
- Arrays: https://www.coursera.org/learn/data-structures/lecture/OsBSF/arrays
- Arrays: https://www.lynda.com/Developer-Programming-Foundations-tutorials/Basic-arrays/149042/177104-4.html
- Multi-dim: https://www.lynda.com/Developer-Programming-Foundations-tutorials/Multidimensional-arrays/149042/177105-4.html
- Dynamic Arrays: https://www.coursera.org/learn/data-structures/lecture/EwbnV/dynamic-arrays
- Jagged: https://www.lynda.com/Developer-Programming-Foundations-tutorials/Jagged-arrays/149042/177106-4.html
- Resizing arrays:
- Implement a vector (mutable array with automatic resizing):
- Practice coding using arrays and pointers, and pointer math to jump to an index instead of using indexing.
- new raw data array with allocated memory
- can allocate int array under the hood, just not use its features
- start with 16, or if starting number is greater, use power of 2 - 16, 32, 64, 128
- size() - number of items
- capacity() - number of items it can hold
- is_empty()
- at(index) - returns item at given index, blows up if index out of bounds
- push(item)
- insert(index, item) - inserts item at index, shifts that index's value and trailing elements to the right
- prepend(item) - can use insert above at index 0
- pop() - remove from end, return value
- delete(index) - delete item at index, shifting all trailing elements left
- remove(item) - looks for value and removes index holding it (even if in multiple places)
- find(item) - looks for value and returns first index with that value, -1 if not found
- resize(new_capacity) // private function
- when you reach capacity, resize to double the size
- when popping an item, if size is 1/4 of capacity, resize to half
- Time
- O(1) to add/remove at end (amortized for allocations for more space), index, or update
- O(n) to insert/remove elsewhere
- Space
- contiguous in memory, so proximity helps performance
- space needed = (array capacity, which is >= n) * size of item, but even if 2n, still O(n)
- Description:
-
Linked Lists
- Description:
- C Code: https://www.youtube.com/watch?v=QN6FPiD0Gzo - not the whole video, just portions about Node struct and memory allocation.
- Linked List vs Arrays:
- why you should avoid linked lists:
- Gotcha: you need pointer to pointer knowledge: (for when you pass a pointer to a function that may change the address where that pointer points) This page is just to get a grasp on ptr to ptr. I don't recommend this list traversal style. Readability and maintainability suffer due to cleverness.
- implement (I did with tail pointer & without):
- size() - returns number of data elements in list
- empty() - bool returns true if empty
- value_at(index) - returns the value of the nth item (starting at 0 for first)
- push_front(value) - adds an item to the front of the list
- pop_front() - remove front item and return its value
- push_back(value) - adds an item at the end
- pop_back() - removes end item and returns its value
- front() - get value of front item
- back() - get value of end item
- insert(index, value) - insert value at index, so current item at that index is pointed to by new item at index
- erase(index) - removes node at given index
- value_n_from_end(n) - returns the value of the node at nth position from the end of the list
- reverse() - reverses the list
- remove_value(value) - removes the first item in the list with this value
- Doubly-linked List
- Description: https://www.coursera.org/learn/data-structures/lecture/jpGKD/doubly-linked-lists
- No need to implement
-
Stack
- https://www.coursera.org/learn/data-structures/lecture/UdKzQ/stacks
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Using-stacks-last-first-out/149042/177120-4.html
- Will not implement. Implementing with array is trivial.
-
Queue
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Using-queues-first-first-out/149042/177122-4.html
- https://www.coursera.org/learn/data-structures/lecture/EShpq/queue
- Circular buffer/FIFO: https://en.wikipedia.org/wiki/Circular_buffer
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Priority-queues-deques/149042/177123-4.html
- Implement using linked-list, with tail pointer:
- enqueue(value) - adds value at position at tail
- dequeue() - returns value and removes least recently added element (front)
- empty()
- Implement using fixed-sized array:
- enqueue(value) - adds item at end of available storage
- dequeue() - returns value and removes least recently added element
- empty()
- full()
- Cost:
- a bad implementation using linked list where you enqueue at head and dequeue at tail would be O(n) because you'd need the next to last element, causing a full traversal each dequeue
- enqueue: O(1) (amortized, linked list and array [probing])
- dequeue: O(1) (linked list and array)
- empty: O(1) (linked list and array)
-
Hash table
-
Videos:
- Hashing with Chaining: https://www.youtube.com/watch?v=0M_kIqhwbFo&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=8
- Table Doubling, Karp-Rabin: https://www.youtube.com/watch?v=BRO7mVIFt08&index=9&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- Open Addressing, Cryptographic Hashing: https://www.youtube.com/watch?v=rvdJDijO2Ro&index=10&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- PyCon 2010: The Mighty Dictionary: https://www.youtube.com/watch?v=C4Kc8xzcA68
- (Advanced) Randomization: Universal & Perfect Hashing: https://www.youtube.com/watch?v=z0lJ2k0sl1g&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=11
- (Advanced) Perfect hashing: https://www.youtube.com/watch?v=N0COwN14gt0&list=PL2B4EEwhKD-NbwZ4ezj7gyc_3yNrojKM9&index=4
-
Online Courses:
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Understanding-hash-functions/149042/177126-4.html
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Using-hash-tables/149042/177127-4.html
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Supporting-hashing/149042/177128-4.html
- https://www.lynda.com/Developer-Programming-Foundations-tutorials/Language-support-hash-tables/149042/177129-4.html
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/m7UuP/core-hash-tables
- https://www.coursera.org/learn/data-structures/home/week/3
- https://www.coursera.org/learn/data-structures/lecture/NYZZP/phone-book-problem
- distributed hash tables:
-
implement with array using linear probing
- hash(k, m) - m is size of hash table
- add(key, value) - if key already exists, update value
- exists(key)
- get(key)
- remove(key)
-
-
Endianness
- https://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Data/endian.html
- https://www.youtube.com/watch?v=JrNF0KRAlyo
- https://www.youtube.com/watch?v=oBSuXP-1Tc0
- Very technical talk for kernel devs. Don't worry if most is over your head.
- The first half is enough.
-
Binary search:
- https://www.youtube.com/watch?v=D5SrAga1pno
- https://www.khanacademy.org/computing/computer-science/algorithms/binary-search/a/binary-search
- detail: https://www.topcoder.com/community/data-science/data-science-tutorials/binary-search/
- Implement:
- binary search (on sorted array of integers)
- binary search using recursion
-
Bitwise operations
- Bits cheat sheet - you should know many of the powers of 2 from (2^1 to 2^16 and 2^32)
- Get a really good understanding of manipulating bits with: &, |, ^, ~, >>, <<
- words: https://en.wikipedia.org/wiki/Word_(computer_architecture)
- Good intro: https://www.youtube.com/watch?v=7jkIUgLC29I
- https://www.youtube.com/watch?v=d0AwjSpNXR0
- https://en.wikipedia.org/wiki/Bit_manipulation
- https://en.wikipedia.org/wiki/Bitwise_operation
- https://graphics.stanford.edu/~seander/bithacks.html
- http://bits.stephan-brumme.com/
- http://bits.stephan-brumme.com/interactive.html
- 2s and 1s complement
- count set bits
- round to next power of 2:
- swap values:
- absolute value:
-
Notes & Background:
- Series: https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/ovovP/core-trees
- Series: https://www.coursera.org/learn/data-structures/lecture/95qda/trees
- basic tree construction
- traversal
- manipulation algorithms
- BFS (breadth-first search)
- MIT: https://www.youtube.com/watch?v=s-CYnVz-uh4&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=13
- level order (BFS, using queue) time complexity: O(n) space complexity: best: O(1), worst: O(n/2)=O(n)
- DFS (depth-first search)
- MIT: https://www.youtube.com/watch?v=AfSk24UTFS8&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=14
- notes: time complexity: O(n) space complexity: best: O(log n) - avg. height of tree worst: O(n)
- inorder (DFS: left, self, right)
- postorder (DFS: left, right, self)
- preorder (DFS: self, left, right)
-
Binary search trees: BSTs
- Binary Search Tree Review: https://www.youtube.com/watch?v=x6At0nzX92o&index=1&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6
- Series: https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/p82sw/core-introduction-to-binary-search-trees
- starts with symbol table and goes through BST applications
- https://www.coursera.org/learn/data-structures/lecture/E7cXP/introduction
- MIT: https://www.youtube.com/watch?v=9Jry5-82I68
- C/C++:
- https://www.youtube.com/watch?v=COZK7NATh4k&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P&index=28
- https://www.youtube.com/watch?v=hWokyBoo0aI&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P&index=29
- https://www.youtube.com/watch?v=Ut90klNN264&index=30&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P
- https://www.youtube.com/watch?v=_pnqMz5nrRs&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P&index=31
- https://www.youtube.com/watch?v=9RHO6jU--GU&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P&index=32
- https://www.youtube.com/watch?v=86g8jAQug04&index=33&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P
- https://www.youtube.com/watch?v=gm8DUJJhmY4&index=34&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P
- https://www.youtube.com/watch?v=yEwSGhSsT0U&index=35&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P
- https://www.youtube.com/watch?v=gcULXE7ViZw&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P&index=36
- https://www.youtube.com/watch?v=5cPbNCrdotA&index=37&list=PL2_aWCzGMAwI3W_JlcBbtYTwiQSsOTa6P
- Implement:
- insert // insert value into tree
- get_node_count // get count of values stored
- print_values // prints the values in the tree, from min to max
- delete_tree
- is_in_tree // returns true if given value exists in the tree
- get_height // returns the height in nodes (single node's height is 1)
- get_min // returns the minimum value stored in the tree
- get_max // returns the maximum value stored in the tree
- is_binary_search_tree
- delete_value
- get_successor // returns next-highest value in tree after given value, -1 if none
-
Heap / Priority Queue / Binary Heap:
- visualized as a tree, but is usually linear in storage (array, linked list)
- https://en.wikipedia.org/wiki/Heap_(data_structure)
- https://www.coursera.org/learn/data-structures/lecture/2OpTs/introduction
- https://www.coursera.org/learn/data-structures/lecture/z3l9N/naive-implementations
- https://www.coursera.org/learn/data-structures/lecture/GRV2q/binary-trees
- https://www.coursera.org/learn/data-structures/supplement/S5xxz/tree-height-remark
- https://www.coursera.org/learn/data-structures/lecture/GRV2q/binary-trees
- https://www.coursera.org/learn/data-structures/lecture/0g1dl/basic-operations
- https://www.coursera.org/learn/data-structures/lecture/gl5Ni/complete-binary-trees
- https://www.coursera.org/learn/data-structures/lecture/HxQo9/pseudocode
- Heap Sort - jumps to start: https://youtu.be/odNJmw5TOEE?list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&t=3291
- Heap Sort: https://www.coursera.org/learn/data-structures/lecture/hSzMO/heap-sort
- Building a heap: https://www.coursera.org/learn/data-structures/lecture/dwrOS/building-a-heap
- MIT: Heaps and Heap Sort: https://www.youtube.com/watch?v=B7hVxCmfPtM&index=4&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- CS 61B Lecture 24: Priority Queues: https://www.youtube.com/watch?v=yIUFT6AKBGE&index=24&list=PL4BBB74C7D2A1049C
- Linear Time BuildHeap (max-heap): https://www.youtube.com/watch?v=MiyLo8adrWw
- Implement a max-heap:
- insert
- sift_up - needed for insert
- get_max - returns the max item, without removing it
- get_size() - return number of elements stored
- is_empty() - returns true if heap contains no elements
- extract_max - returns the max item, removing it
- sift_down - needed for extract_max
- remove(i) - removes item at index x
- heapify - create a heap from an array of elements, needed for heap_sort
- heap_sort() - take an unsorted array and turn it into a sorted array in-place using a max heap
- note: using a min heap instead would save operations, but double the space needed (cannot do in-place).
-
Tries
- Note there are different kinds of tries. Some have prefixes, some don't, and some use string instead of bits to track the path.
- I read through code, but will not implement.
- http://www.cs.yale.edu/homes/aspnes/classes/223/notes.html#Tries
- Short course videos:
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/08Xyf/core-introduction-to-tries
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/PvlZW/core-performance-of-tries
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/DFvd3/core-implementing-a-trie
- The Trie: A Neglected Data Structure: https://www.toptal.com/java/the-trie-a-neglected-data-structure
- TopCoder - Using Tries: https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/
- Stanford Lecture (real world use case): https://www.youtube.com/watch?v=TJ8SkcUSdbU
- MIT, Advanced Data Structures, Strings (can get pretty obscure about halfway through): https://www.youtube.com/watch?v=NinWEPPrkDQ&index=16&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf
-
Balanced search trees
-
Know least one type of balanced binary tree (and know how it's implemented):
-
"Among balanced search trees, AVL and 2/3 trees are now passé, and red-black trees seem to be more popular. A particularly interesting self-organizing data structure is the splay tree, which uses rotations to move any accessed key to the root." - Skiena
-
Of these, I chose to implement a splay tree. From what I've read, you won't implement a balanced search tree in your interview. But I wanted exposure to coding one up and let's face it, splay trees are the bee's knees. I did read a lot of red-black tree code.
- splay tree: insert, search, delete functions If you end up implementing red/black tree try just these:
- search and insertion functions, skipping delete
-
I want to learn more about B-Tree since it's used so widely with very large data sets.
-
Self-balancing binary search tree: https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree
-
AVL trees
- In practice: From what I can tell, these aren't used much in practice, but I could see where they would be: The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly balanced than red–black trees, leading to slower insertion and removal but faster retrieval. This makes it attractive for data structures that may be built once and loaded without reconstruction, such as language dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter).
- MIT AVL Trees / AVL Sort: https://www.youtube.com/watch?v=FNeL18KsWPc&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=6
- https://www.coursera.org/learn/data-structures/lecture/Qq5E0/avl-trees
- https://www.coursera.org/learn/data-structures/lecture/PKEBC/avl-tree-implementation
- https://www.coursera.org/learn/data-structures/lecture/22BgE/split-and-merge
-
Splay trees
- In practice: Splay trees are typically used in the implementation of caches, memory allocators, routers, garbage collectors, data compression, ropes (replacement of string used for long text strings), in Windows NT (in the virtual memory, networking, and file system code) etc.
- CS 61B: Splay Trees: https://www.youtube.com/watch?v=Najzh1rYQTo&index=23&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd
- MIT Lecture: Splay Trees: - Gets very mathy, but watch the last 10 minutes for sure. - https://www.youtube.com/watch?v=QnPl_Y6EqMo
-
2-3 search trees
- In practice: 2-3 trees have faster inserts at the expense of slower searches (since height is more compared to AVL trees). You would use 2-3 tree very rarely because its implementation involves different types of nodes. Instead, people use Red Black trees.
- 23-Tree Intuition and Definition: https://www.youtube.com/watch?v=C3SsdUqasD4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=2
- Binary View of 23-Tree: https://www.youtube.com/watch?v=iYvBtGKsqSg&index=3&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6
- 2-3 Trees (student recitation): https://www.youtube.com/watch?v=TOb1tuEZ2X4&index=5&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
-
2-3-4 Trees (aka 2-4 trees)
- In practice: For every 2-4 tree, there are corresponding red–black trees with data elements in the same order. The insertion and deletion operations on 2-4 trees are also equivalent to color-flipping and rotations in red–black trees. This makes 2-4 trees an important tool for understanding the logic behind red–black trees, and this is why many introductory algorithm texts introduce 2-4 trees just before red–black trees, even though 2-4 trees are not often used in practice.
- CS 61B Lecture 26: Balanced Search Trees: https://www.youtube.com/watch?v=zqrqYXkth6Q&index=26&list=PL4BBB74C7D2A1049C
- Bottom Up 234-Trees: https://www.youtube.com/watch?v=DQdMYevEyE4&index=4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6
- Top Down 234-Trees: https://www.youtube.com/watch?v=2679VQ26Fp4&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=5
-
B-Trees
- fun fact: it's a mystery, but the B could stand for Boeing, Balanced, or Bayer (co-inventor)
- In Practice: B-Trees are widely used in databases. Most modern filesystems use B-trees (or Variants). In addition to its use in databases, the B-tree is also used in filesystems to allow quick random access to an arbitrary block in a particular file. The basic problem is turning the file block i address into a disk block (or perhaps to a cylinder-head-sector) address.
- B-Tree: https://en.wikipedia.org/wiki/B-tree
- Introduction to B-Trees: https://www.youtube.com/watch?v=I22wEC1tTGo&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6&index=6
- B-Tree Definition and Insertion: https://www.youtube.com/watch?v=s3bCdZGrgpA&index=7&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6
- B-Tree Deletion: https://www.youtube.com/watch?v=svfnVhJOfMc&index=8&list=PLA5Lqm4uh9Bbq-E0ZnqTIa8LRaL77ica6
- MIT 6.851 - Memory Hierarchy Models: https://www.youtube.com/watch?v=V3omVLzI0WE&index=7&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf - covers cache-oblivious B-Trees, very interesting data structures - the first 37 minutes are very technical, may be skipped (B is block size, cache line size)
-
Red/black trees
- In practice: Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time. Not only does this make them valuable in time-sensitive applications such as real-time applications, but it makes them valuable building blocks in other data structures which provide worst-case guarantees; for example, many data structures used in computational geometry can be based on red–black trees, and the Completely Fair Scheduler used in current Linux kernels uses red–black trees. In the version 8 of Java, the Collection HashMap has been modified such that instead of using a LinkedList to store identical elements with poor hashcodes, a Red-Black tree is used.
- Aduni - Algorithms - Lecture 4 link jumps to starting point: https://youtu.be/1W3x0f_RmUo?list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&t=3871
- Aduni - Algorithms - Lecture 5: https://www.youtube.com/watch?v=hm2GHwyKF1o&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=5
- https://en.wikipedia.org/wiki/Red%E2%80%93black_tree
- https://www.topcoder.com/community/data-science/data-science-tutorials/an-introduction-to-binary-search-and-red-black-trees/
-
-
N-ary (K-ary, M-ary) trees
- note: the N or K is the branching factor (max branches)
- binary trees are a 2-ary tree, with branching factor = 2
- 2-3 trees are 3-ary
- https://en.wikipedia.org/wiki/K-ary_tree
- note: the N or K is the branching factor (max branches)
-
Notes:
- Implement sorts & know best case/worst case, average complexity of each:
- no bubble sort - it's terrible - O(n^2), except when n <= 16
- stability in sorting algorithms ("Is Quicksort stable?")
- Which algorithms can be used on linked lists? Which on arrays? Which on both?
- I wouldn't recommend sorting a linked list, but merge sort is doable.
- http://www.geeksforgeeks.org/merge-sort-for-linked-list/
- Implement sorts & know best case/worst case, average complexity of each:
-
For heapsort, see Heap data structure above. Heap sort is great, but not stable.
-
Bubble Sort: https://www.youtube.com/watch?v=P00xJgWzz2c&index=1&list=PL89B61F78B552C1AB
-
Analyzing Bubble Sort: https://www.youtube.com/watch?v=ni_zk257Nqo&index=7&list=PL89B61F78B552C1AB
-
Insertion Sort, Merge Sort: https://www.youtube.com/watch?v=Kg4bqzAqRBM&index=3&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
-
Insertion Sort: https://www.youtube.com/watch?v=c4BRHC7kTaQ&index=2&list=PL89B61F78B552C1AB
-
Merge Sort: https://www.youtube.com/watch?v=GCae1WNvnZM&index=3&list=PL89B61F78B552C1AB
-
Quicksort: https://www.youtube.com/watch?v=y_G9BkAm6B8&index=4&list=PL89B61F78B552C1AB
-
Selection Sort: https://www.youtube.com/watch?v=6nDMgr0-Yyo&index=8&list=PL89B61F78B552C1AB
-
Stanford lectures on sorting:
-
Shai Simonson, Aduni.org:
-
Steven Skiena lectures on sorting:
- lecture begins at 26:46: https://youtu.be/ute-pmMkyuk?list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&t=1600
- lecture begins at 27:40: https://www.youtube.com/watch?v=yLvp-pB8mak&index=8&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
- lecture begins at 35:00: https://www.youtube.com/watch?v=q7K9otnzlfE&index=9&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
- lecture begins at 23:50: https://www.youtube.com/watch?v=TvqIGu9Iupw&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&index=10
-
UC Berkeley:
- CS 61B Lecture 29: Sorting I: https://www.youtube.com/watch?v=EiUvYS2DT6I&list=PL4BBB74C7D2A1049C&index=29
- CS 61B Lecture 30: Sorting II: https://www.youtube.com/watch?v=2hTY3t80Qsk&list=PL4BBB74C7D2A1049C&index=30
- CS 61B Lecture 32: Sorting III: https://www.youtube.com/watch?v=Y6LOLpxg6Dc&index=32&list=PL4BBB74C7D2A1049C
- CS 61B Lecture 33: Sorting V: https://www.youtube.com/watch?v=qNMQ4ly43p4&index=33&list=PL4BBB74C7D2A1049C
-
- Merge sort code:
-
- Quick sort code:
-
Implement:
- Mergesort: O(n log n) average and worst case
- Quicksort O(n log n) average case
- Selection sort and insertion sort are both O(n^2) average and worst case
- For heapsort, see Heap data structure above.
-
For curiosity - not required:
- Radix Sort: http://www.cs.yale.edu/homes/aspnes/classes/223/notes.html#radixSort
- Radix Sort: https://www.youtube.com/watch?v=xhr26ia4k38
- Radix Sort, Counting Sort (linear time given constraints): https://www.youtube.com/watch?v=Nz1KZXbghj8&index=7&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- Randomization: Matrix Multiply, Quicksort, Freivalds' algorithm: https://www.youtube.com/watch?v=cNB2lADK3_s&index=8&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
- Sorting in Linear Time: https://www.youtube.com/watch?v=pOKy3RZbSws&list=PLUl4u3cNGP61hsJNdULdudlRL493b-XZf&index=14
Graphs can be used to represent many problems in computer science, so this section is long, like trees and sorting were.
-
Notes from Yegge:
- There are three basic ways to represent a graph in memory:
- objects and pointers
- matrix
- adjacency list
- Familiarize yourself with each representation and its pros & cons
- BFS and DFS - know their computational complexity, their tradeoffs, and how to implement them in real code
- When asked a question, look for a graph-based solution first, then move on if none.
- There are three basic ways to represent a graph in memory:
-
Skiena Lectures - great intro:
- CSE373 2012 - Lecture 11 - Graph Data Structures: https://www.youtube.com/watch?v=OiXxhDrFruw&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&index=11
- CSE373 2012 - Lecture 12 - Breadth-First Search: https://www.youtube.com/watch?v=g5vF8jscteo&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&index=12
- CSE373 2012 - Lecture 13 - Graph Algorithms: https://www.youtube.com/watch?v=S23W6eTcqdY&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&index=13
- CSE373 2012 - Lecture 14 - Graph Algorithms (con't): https://www.youtube.com/watch?v=WitPBKGV0HY&index=14&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
- CSE373 2012 - Lecture 15 - Graph Algorithms (con't 2): https://www.youtube.com/watch?v=ia1L30l7OIg&index=15&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
- CSE373 2012 - Lecture 16 - Graph Algorithms (con't 3): https://www.youtube.com/watch?v=jgDOQq6iWy8&index=16&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
-
Graphs (review and more):
- 6.006 Single-Source Shortest Paths Problem: https://www.youtube.com/watch?v=Aa2sqUhIn-E&index=15&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- 6.006 Dijkstra: https://www.youtube.com/watch?v=2E7MmKv0Y24&index=16&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- 6.006 Bellman-Ford: https://www.youtube.com/watch?v=ozsuci5pIso&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=17
- 6.006 Speeding Up Dijkstra: https://www.youtube.com/watch?v=CHvQ3q_gJ7E&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=18
- Aduni: Graph Algorithms I - Topological Sorting, Minimum Spanning Trees, Prim's Algorithm - Lecture 6: https://www.youtube.com/watch?v=i_AQT_XfvD8&index=6&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm
- Aduni: Graph Algorithms II - DFS, BFS, Kruskal's Algorithm, Union Find Data Structure - Lecture 7: https://www.youtube.com/watch?v=ufj5_bppBsA&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=7
- Aduni: Graph Algorithms III: Shortest Path - Lecture 8: https://www.youtube.com/watch?v=DiedsPsMKXc&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=8
- Aduni: Graph Alg. IV: Intro to geometric algorithms - Lecture 9: https://www.youtube.com/watch?v=XIAQRlNkJAw&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=9
- CS 61B 2014 (starting at 58:09): https://youtu.be/dgjX4HdMI-Q?list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd&t=3489
- CS 61B 2014: Weighted graphs: https://www.youtube.com/watch?v=aJjlQCFwylA&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd&index=19
- Greedy Algorithms: Minimum Spanning Tree: https://www.youtube.com/watch?v=tKwnms5iRBU&index=16&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
- Strongly Connected Components Kosaraju's Algorithm Graph Algorithm: https://www.youtube.com/watch?v=RpgcYiky7uw
-
Full Coursera Course:
- Algorithms on Graphs: https://www.coursera.org/learn/algorithms-on-graphs/home/welcome
-
Yegge: If you get a chance, try to study up on fancier algorithms:
- Dijkstra's algorithm - see above - 6.006
- A*
- https://en.wikipedia.org/wiki/A*_search_algorithm
- A* Pathfinding Tutorial: https://www.youtube.com/watch?v=KNXfSOx4eEE
- A* Pathfinding (E01: algorithm explanation): https://www.youtube.com/watch?v=-L-WgKMFuhE
-
I'll implement:
- DFS with adjacency list (recursive)
- DFS with adjacency list (iterative with stack)
- DFS with adjacency matrix (recursive)
- DFS with adjacency matrix (iterative with stack)
- BFS with adjacency list
- BFS with adjacency matrix
- single-source shortest path (Dijkstra)
- minimum spanning tree
- DFS-based algorithms (see Aduni videos above):
- check for cycle (needed for topological sort, since we'll check for cycle before starting)
- topological sort
- count connected components in a graph
- list strongly connected components
- check for bipartite graph
You'll get more graph practice in Skiena's book (see Books section below) and the interview books
-
Recursion
- Stanford lectures on recursion & backtracking:
- https://www.youtube.com/watch?v=gl3emqCuueQ&list=PLFE6E58F856038C69&index=8
- https://www.youtube.com/watch?v=uFJhEPrbycQ&list=PLFE6E58F856038C69&index=9
- https://www.youtube.com/watch?v=NdF1QDTRkck&index=10&list=PLFE6E58F856038C69
- https://www.youtube.com/watch?v=p-gpaIGRCQI&list=PLFE6E58F856038C69&index=11
- when it is appropriate to use it
- how is tail recursion better than not?
- Stanford lectures on recursion & backtracking:
-
Dynamic Programming
- This subject can be pretty difficult, as each DP soluble problem must be defined as a recursion relation, and coming up with it can be tricky.
- I suggest looking at many examples of DP problems until you have a solid understanding of the pattern involved.
- Videos:
- the Skiena videos can be hard to follow since he sometimes uses the whiteboard, which is too small to see
- Skiena: CSE373 2012 - Lecture 19 - Introduction to Dynamic Programming: https://youtu.be/Qc2ieXRgR0k?list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&t=1718
- Skiena: CSE373 2012 - Lecture 20 - Edit Distance: https://youtu.be/IsmMhMdyeGY?list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&t=2749
- Skiena: CSE373 2012 - Lecture 21 - Dynamic Programming Examples: https://youtu.be/o0V9eYF4UI8?list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&t=406
- Skiena: CSE373 2012 - Lecture 22 - Applications of Dynamic Programming: https://www.youtube.com/watch?v=dRbMC1Ltl3A&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&index=22
- Simonson: Dynamic Programming 0 (starts at 59:18): https://youtu.be/J5aJEcOr6Eo?list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&t=3558
- Simonson: Dynamic Programming I - Lecture 11: https://www.youtube.com/watch?v=0EzHjQ_SOeU&index=11&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm
- Simonson: Dynamic programming II - Lecture 12: https://www.youtube.com/watch?v=v1qiRwuJU7g&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=12
- List of individual DP problems (each is short): https://www.youtube.com/playlist?list=PLrmLmBdmIlpsHaNTPP_jHHDx_os9ItYXr
- Yale Lecture notes:
- Coursera:
- The RNA secondary structure problem: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/80RrW/the-rna-secondary-structure-problem
- A dynamic programming algorithm: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/PSonq/a-dynamic-programming-algorithm
- Illustrating the DP algorithm: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/oUEK2/illustrating-the-dp-algorithm
- Running time of the DP algorithm: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/nfK2r/running-time-of-the-dp-algorithm
- DP vs. recursive implementation: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/M999a/dp-vs-recursive-implementation
- Global pairwise sequence alignment: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/UZ7o6/global-pairwise-sequence-alignment
- Local pairwise sequence alignment: https://www.coursera.org/learn/algorithmic-thinking-2/lecture/WnNau/local-pairwise-sequence-alignment
-
Combinatorics (n choose k) & Probability
- Math Skills: How to find Factorial, Permutation and Combination (Choose): https://www.youtube.com/watch?v=8RRo6Ti9d0U
- Make School: Probability: https://www.youtube.com/watch?v=sZkAAk9Wwa4
- Make School: More Probability and Markov Chains: https://www.youtube.com/watch?v=dNaJg-mLobQ
- Khan Academy:
- Course layout:
- Just the videos - 41 (each are simple and each are short):
-
NP, NP-Complete and Approximation Algorithms
- Know about the most famous classes of NP-complete problems, such as traveling salesman and the knapsack problem, and be able to recognize them when an interviewer asks you them in disguise.
- Know what NP-complete means.
- Computational Complexity: https://www.youtube.com/watch?v=moPtwq_cVH8&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=23
- Simonson:
- https://youtu.be/qcGnJ47Smlo?list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&t=2939
- https://www.youtube.com/watch?v=e0tGC6ZQdQE&index=16&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm
- https://www.youtube.com/watch?v=fCX1BGT3wjE&index=17&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm
- https://www.youtube.com/watch?v=NKLDp3Rch3M&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=18
- Skiena:
- CSE373 2012 - Lecture 23 - Introduction to NP-Completeness: https://youtu.be/KiK5TVgXbFg?list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&t=1508
- CSE373 2012 - Lecture 24 - NP-Completeness Proofs: https://www.youtube.com/watch?v=27Al52X3hd4&index=24&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
- CSE373 2012 - Lecture 25 - NP-Completeness Challenge: https://www.youtube.com/watch?v=xCPH4gwIIXM&index=25&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b
- Complexity: P, NP, NP-completeness, Reductions: https://www.youtube.com/watch?v=eHZifpgyH_4&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=22
- Complexity: Approximation Algorithms: https://www.youtube.com/watch?v=MEz1J9wY2iM&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=24
- Complexity: Fixed-Parameter Algorithms: https://www.youtube.com/watch?v=4q-jmGrmxKs&index=25&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
- Peter Norvik discusses near-optimal solutions to traveling salesman problem:
- Pages 1048 - 1140 in CLRS if you have it.
-
Garbage collection
- Garbage collection (Java); Augmenting data str: https://www.youtube.com/watch?v=StdfeXaKGEc&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd&index=25
- Compilers: https://www.youtube.com/playlist?list=PLO9y7hOkmmSGTy5z6HZ-W4k2y8WXF7Bff
- GC in Python: https://www.youtube.com/watch?v=iHVs_HkjdmI
- Deep Dive Java: Garbage Collection is Good!: https://www.infoq.com/presentations/garbage-collection-benefits
- Deep Dive Python: Garbage Collection in CPython: https://www.youtube.com/watch?v=P-8Z0-MhdQs&list=PLdzf4Clw0VbOEWOS_sLhT_9zaiQDrS5AR&index=3
-
Caches
- LRU cache:
- The Magic of LRU Cache (100 Days of Google Dev): https://www.youtube.com/watch?v=R5ON3iwx78M
- CPU cache:
- MIT 6.004 L15: The Memory Hierarchy: https://www.youtube.com/watch?v=vjYF_fAZI5E&list=PLrRW1w6CGAcXbMtDFj205vALOGmiRc82-&index=24
- MIT 6.004 L16: Cache Issues: https://www.youtube.com/watch?v=ajgC3-pyGlk&index=25&list=PLrRW1w6CGAcXbMtDFj205vALOGmiRc82-
- LRU cache:
-
String searching & manipulations
- Search pattern in text: https://www.coursera.org/learn/data-structures/lecture/tAfHI/search-pattern-in-text
- Rabin-Karp: https://www.coursera.org/learn/data-structures/lecture/c0Qkw/rabin-karps-algorithm https://www.youtube.com/watch?v=BRO7mVIFt08&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=9
- Precomputing: https://www.coursera.org/learn/data-structures/lecture/nYrc8/optimization-precomputation
- Optimization: Implementation and Analysis: https://www.coursera.org/learn/data-structures/lecture/h4ZLc/optimization-implementation-and-analysis
- Knuth-Morris-Pratt (KMP):
- Boyer–Moore string search algorithm
- Coursera: Algorithms on Strings:
-
Design patterns
- description:
- strategy
- singleton
- adapter
- prototype
- decorator
- visitor
- factory
-
Operating Systems (25 videos):
- Computer Science 162: https://www.youtube.com/watch?v=-KWd_eQYLwY&index=2&list=PL-XXv-cvA_iBDyz-ba4yDskqMDY6A1w_c
- https://www.quora.com/What-is-the-difference-between-a-process-and-a-thread
- Covers:
- Processes, Threads, Concurrency issues
- difference between processes and threads
- processes
- threads
- locks
- mutexes
- semaphores
- monitors
- how they work
- deadlock
- livelock
- CPU activity, interrupts, context switching
- Modern concurrency constructs with multicore processors
- Process resource needs (memory: code, static storage, stack, heap, and also file descriptors, i/o)
- Thread resource needs (shares above with other threads in same process but each has its own pc, stack counter, registers and stack)
- Forking is really copy on write (read-only) until the new process writes to memory, then it does a full copy.
- Context switching
- How context switching is initiated by the operating system and underlying hardware
- Processes, Threads, Concurrency issues
- threads in C++:
- concurrency in Python:
- Python Threads: https://www.youtube.com/watch?v=Bs7vPNbB9JM
- Understanding the Python GIL: https://www.youtube.com/watch?v=Obt-vMVdM8s
- David Beazley - Python Concurrency From the Ground Up: LIVE! - PyCon 2015: https://www.youtube.com/watch?v=MCs5OvhV9S4
- Keynote David Beazley - Topics of Interest (Python Asyncio): https://www.youtube.com/watch?v=ZzfHjytDceU
-
Scale & Data Handling:
- see Scalability section below and Papers
- Distill large data sets to single values
- Transform one data set to another
- Handling obscenely large amounts of data
- https://www.youtube.com/watch?v=9nWyWwY2Onc
- https://www.youtube.com/watch?v=H4vMcD7zKM0
-
System design
- https://www.quora.com/How-do-I-prepare-to-answer-design-questions-in-a-technical-interview?redirected_qid=1500023
- features sets
- interfaces
- class hierarchies
- designing a system under certain constraints
- simplicity and robustness
- tradeoffs
- performance analysis and optimization
- Lecture Videos - can skip through some if you already have a good OO background
- Chapter 1 - Software and Software Engineering: https://www.youtube.com/watch?v=maE3PxV4mk0
- Chapter 2 (Part 1) - Basics of Object-Orientation: https://www.youtube.com/watch?v=noe17Sg5Uas
- Chapter 2 (Part 2) - Inheritance, polymorphism and review of key Java concepts: https://www.youtube.com/watch?v=NSJ0zNQ2Ilk
- Chapter 3 (Part 1) - Reuse, Frameworks, and Basic Client-Server Concepts: https://www.youtube.com/watch?v=H7kLteC0vJY
- Chapter 3 (Part 2) - Client-Server Architecture, Network Concepts, and Networking in Java: https://www.youtube.com/watch?v=0W4iYsHjhlY
- Chapter 4 (Part 1) - Simplechat and Requirements: https://www.youtube.com/watch?v=_c-lYobWpjc
- Chapter 4 (Part 2) - Developing Requirements: https://www.youtube.com/watch?v=JDbZa5q4NzM
- Chapter 5 (Part 1) - Class Diagrams: https://www.youtube.com/watch?v=_c-lYobWpjc
- Chapter 5 (Part 2) - Aggregation, OCL, Genealogy Example, Process for Developing Diagrams: https://www.youtube.com/watch?v=ozHk4LEaBJs
- Chapter 5 (Part 3) - Developing a Class Diagram for the Bank Account Management System: https://www.youtube.com/watch?v=Kg1hL-etRwE
- Chapter 6 (Part 1) - Patterns: https://www.youtube.com/watch?v=LAP2A80Ajrg
- Chapter 6 (Part 2) - General Hierarchy, Player-Role, Singleton, Observer, Delegation: https://www.youtube.com/watch?v=U8-PGsjvZc4
- Chapter 6 (Part 3) - Adapter, Facade, Read-Only Interface, Proxy: https://www.youtube.com/watch?v=7sduBHuex4c
- Chapter 7 (Part 1) - Users, Use Cases, User Interface Design: https://www.youtube.com/watch?v=zoANpUOLCn0
- Chapter 7 (Part 2) - Evaluating a UI, Implementing a UI in Java: https://www.youtube.com/watch?v=Ip18LJdy6UY
- Chapter 8 - State and Activity Diagrams: https://www.youtube.com/watch?v=5IQgKtRuyiY
- Chapter 9 (Part 1) - Software Architecture and Design: https://www.youtube.com/watch?v=FMKv8Vozf5c
- Chapter 9 (Part 2) - Design Principles, Software Architecture: https://www.youtube.com/watch?v=XQnytAeZrWE
- Chapter 9 (Part 3) - Pipe-and-Filter Architecture and Design Documents: https://www.youtube.com/watch?v=ZmsUizg6gPY
- Chapter 10 and 11 - Testing process, Inspection, Process Models, Cost Estimation, Team Building: https://www.youtube.com/watch?v=L8x3OuZcEsc
-
Familiarize yourself with a unix-based code editor: emacs & vi(m)
- suggested by Yegge, from an old Amazon recruiting post
- vi(m):
- emacs:
- https://www.youtube.com/watch?v=hbmV1bnQ-i0
- set of 3:
- https://www.youtube.com/watch?v=JWD1Fpdd4Pc
- http://www.cs.yale.edu/homes/aspnes/classes/223/notes.html#Writing_C_programs_with_Emacs
- (maybe) Org Mode In Depth: Managing Structure: https://www.youtube.com/watch?v=nsGYet02bEk
-
Be able to use unix command line tools:
- suggested by Yegge, from an old Amazon recruiting post. I filled in the list below from good tools.
- bash
- cat
- grep
- sed
- awk
- curl or wget
- sort
- tr
- uniq
-
Testing
- how unit testing works
- what are mock objects
- what is integration testing
- what is dependency injection
-
Scheduling
- in an OS, how it works
-
Implement system routines
- understand what lies beneath the programming APIs you use
- can you implement them?
Read and do exercises:
-
The Algorithm Design Manual (Skiena)
- Book (can rent on kindle):
- Half.com is a great resource for textbooks at good prices.
- Answers:
- Errata:
Once you've understood everything in the daily plan, and read and done exercises from the the books above, read and do exercises from the books below. Then move to coding challenges (further down below)
Read first:
- Programming Interviews Exposed: Secrets to Landing Your Next Job, 2nd Edition: http://www.wiley.com/WileyCDA/WileyTitle/productCd-047012167X.html
Read second (recommended by many, but not in Google coaching docs):
- Cracking the Coding Interview, 6th Edition:
- http://www.amazon.com/Cracking-Coding-Interview-6th-Programming/dp/0984782850/
- If you see people reference "The Google Resume", it was a book replaced by "Cracking the Coding Interview".
-
C Programming Language, Vol 2
- answers to questions: https://github.com/lekkas/c-algorithms
-
C++ Primer Plus, 6th Edition
-
The Unix Programming Environment
-
Programming Pearls:
-
Algorithms and Programming: Problems and Solutions: http://www.amazon.com/Algorithms-Programming-Solutions-Alexander-Shen/dp/0817638474
If you have time:
-
Introduction to Algorithms
- https://www.amazon.com/Introduction-Algorithms-3rd-MIT-Press/dp/0262033844
- Half.com is a great resource for textbooks at good prices.
-
Elements of Programming Interviews
- https://www.amazon.com/Elements-Programming-Interviews-Insiders-Guide/dp/1479274836
- all code is in C++, if you're looking to use C++ in your interview
- good book on problem solving in general.
- How Search Works:
- https://www.topcoder.com/community/data-science/data-science-tutorials/the-importance-of-algorithms/
- http://highscalability.com/blog/2012/3/26/7-years-of-youtube-scalability-lessons-in-30-minutes.html
- http://highscalability.com/blog/2016/4/4/how-to-remove-duplicates-in-a-large-dataset-reducing-memory.html
- http://highscalability.com/blog/2016/3/23/what-does-etsys-architecture-look-like-today.html
- http://highscalability.com/blog/2016/3/21/to-compress-or-not-to-compress-that-was-ubers-question.html
- http://highscalability.com/blog/2016/3/3/asyncio-tarantool-queue-get-in-the-queue.html
- http://highscalability.com/blog/2016/2/25/when-should-approximate-query-processing-be-used.html
- http://highscalability.com/blog/2016/2/23/googles-transition-from-single-datacenter-to-failover-to-a-n.html
- http://highscalability.com/blog/2016/2/15/egnyte-architecture-lessons-learned-in-building-and-scaling.html
- http://highscalability.com/blog/2016/2/1/a-patreon-architecture-short.html
- http://highscalability.com/blog/2016/1/27/tinder-how-does-one-of-the-largest-recommendation-engines-de.html
- http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html
- http://highscalability.com/blog/2016/1/13/live-video-streaming-at-facebook-scale.html
- http://highscalability.com/blog/2016/1/11/a-beginners-guide-to-scaling-to-11-million-users-on-amazons.html
- http://highscalability.com/blog/2015/12/16/how-does-the-use-of-docker-effect-latency.html
- http://highscalability.com/blog/2015/12/14/does-amp-counter-an-existential-threat-to-google.html
- http://highscalability.com/blog/2015/11/9/a-360-degree-view-of-the-entire-netflix-stack.html
- http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
-
The Google File System:
-
MapReduce: Simplified Data Processing on Large Clusters:
-
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
-
How Developers Search for Code: A Case Study
-
Borg, Omega, and Kubernetes
-
Continuous Pipelines at Google
-
AddressSanitizer: A Fast Address Sanity Checker
-
Computing Weak Consistency in Polynomial Time
Once you've learned your brains out, put those brains to work. Take coding challenges every day, as many as you can.
-
How to Find a Solution: https://www.topcoder.com/community/data-science/data-science-tutorials/how-to-find-a-solution/
-
How to Dissect a Topcoder Problem Statement: https://www.topcoder.com/community/data-science/data-science-tutorials/how-to-dissect-a-topcoder-problem-statement/
-
Mathematics for Topcoders: https://www.topcoder.com/community/data-science/data-science-tutorials/mathematics-for-topcoders/
-
Dynamic Programming – From Novice to Advanced: https://www.topcoder.com/community/data-science/data-science-tutorials/dynamic-programming-from-novice-to-advanced/
-
https://courses.csail.mit.edu/iap/interview/materials.php
-
LeetCode: https://leetcode.com/
-
Project Euler: https://projecteuler.net/index.php?section=problems
-
TopCoder: https://www.topcoder.com/
-
HackerRank: https://www.hackerrank.com/
-
Codility: https://codility.com/programmers/
-
InterviewCake: https://www.interviewcake.com/
-
InterviewBit: https://www.interviewbit.com/invite/icjf
-
Exercises for getting better at a given language: http://exercism.io/languages
-
- Cracking The Coding Interview Set 2:
- http://steve-yegge.blogspot.co.uk/2007_09_01_archive.html
- Great stuff at the back of Cracking The Coding Interview
-
Think of about 20 interview questions you'll get, along the lines of the items below:
-
have 2-3 answers for each
-
Have a story, not just data, about something you accomplished
-
Why do you want this job?
-
What's a tough problem you've solved?
-
Biggest challenges faced?
-
Best/worst designs seen?
-
Ideas for improving an existing Google product.
-
How do you work best, as an individual and as part of a team?
-
Which of your skills or experiences would be assets in the role and why?
-
What did you most enjoy at [job x / project y]?
-
What was the biggest challenge you faced at [job x / project y]?
-
What was the hardest bug you faced at [job x / project y]?
-
What did you learn at [job x / project y]?
-
What would you have done better at [job x / project y]?
Some of mine (I already may know answer to but want their opinion or team perspective):
- How large is your team?
- What is your dev cycle look like? Do you do waterfall/sprints/agile?
- Are rushes to deadlines common? Or is there flexibility?
- How are decisions made in your team?
- How many meetings do you have per week?
- Do you feel your work environment helps you concentrate?
- What are you working on?
- What do you like about it?
- What is the work life like?
Everything below is my recommendation, not Google's, and you may not have enough time to
learn, watch or read them all. That's ok. I may not either.
-
Compression
- Compressor Head videos: https://www.youtube.com/playlist?list=PLOU2XLYxmsIJGErt5rrCqaSGTMyyqNt2H
-
Augmented Data Structures
- CS 61B Lecture 39: Augmenting Data Structures: https://www.youtube.com/watch?v=zksIj9O8_jc&list=PL4BBB74C7D2A1049C&index=39
-
Geometry, Convex hull
- Geometric Algorithms: Graham & Jarvis - Lecture 10: https://www.youtube.com/watch?v=J5aJEcOr6Eo&index=10&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm
- Divide & Conquer: Convex Hull, Median Finding: https://www.youtube.com/watch?v=EzeYI7p9MjU&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=2
-
Skip lists
- "These are somewhat of a cult data structure" - Skiena
- Randomization: Skip Lists: https://www.youtube.com/watch?v=2g9OSRKJuzM&index=10&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
-
Network Flows
- Network Flows: https://www.youtube.com/watch?v=i0q-Irlf4y4&index=5&list=PL2B4EEwhKD-NbwZ4ezj7gyc_3yNrojKM9
- Augmenting Path Algorithms: https://www.youtube.com/watch?v=7QPI3kBIKv4&index=6&list=PL2B4EEwhKD-NbwZ4ezj7gyc_3yNrojKM9
- Network Flow Algorithms: https://www.youtube.com/watch?v=5PR0ExrHO-Q&list=PL2B4EEwhKD-NbwZ4ezj7gyc_3yNrojKM9&index=7
-
Linear Programming
- Lecture 13 10/11 Linear Programming: https://www.youtube.com/watch?v=IOQApuleqvg&list=PL2B4EEwhKD-NbwZ4ezj7gyc_3yNrojKM9&index=11
- Lecture 14 10/16 Linear Programming: https://www.youtube.com/watch?v=vpX0TSAcdJY&list=PL2B4EEwhKD-NbwZ4ezj7gyc_3yNrojKM9&index=12
-
Disjoint Sets & Union Find
- https://en.wikipedia.org/wiki/Disjoint-set_data_structure
- UCB 61B - Disjoint Sets; Sorting & selection: https://www.youtube.com/watch?v=MAEGXTwmUsI&list=PL-XXv-cvA_iAlnI-BQr9hjqADPBtujFJd&index=21
- CS 61B Lecture 31: 5Disjoint Sets: https://www.youtube.com/watch?v=wSPAjGfDl7Q&list=PL4BBB74C7D2A1049C&index=31
- https://www.coursera.org/learn/data-structures/lecture/JssSY/overview
- https://www.coursera.org/learn/data-structures/lecture/EM5D0/naive-implementations
- https://www.coursera.org/learn/data-structures/lecture/Mxu0w/trees
- https://www.coursera.org/learn/data-structures/lecture/qb4c2/union-by-rank
- https://www.coursera.org/learn/data-structures/lecture/Q9CVI/path-compression
- https://www.coursera.org/learn/data-structures/lecture/GQQLN/analysis-optional
-
van Emde Boas Trees
- Divide & Conquer: van Emde Boas Trees: https://www.youtube.com/watch?v=hmReJCupbNU&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=6
-
Fast Fourier Transform
-
Integer Arithmetic, Karatsuba Multiplication: https://www.youtube.com/watch?v=eCaXlAaN2uE&index=11&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
-
Treap
- ?
-
Parity & Hamming Code
- Parity:
- Hamming Code:
- Error detection: https://www.youtube.com/watch?v=1A_NcXxdoCc
- Error correction: https://www.youtube.com/watch?v=JAMLuxdHH8o
- Error Checking:
-
Computer Security
-
Markov text generation
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/waxgx/core-markov-text-generation
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/gZhiC/core-implementing-markov-text-generation
- https://www.coursera.org/learn/data-structures-optimizing-performance/lecture/EUjrq/project-markov-text-generation-walk-through
-
Information theory
- Markov processes:
- https://www.khanacademy.org/computing/computer-science/informationtheory/moderninfotheory/v/symbol-rate-information-theory
- includes Markov chain
-
Bloom Filter
-
Machine Learning
- Why ML?
- Google Developers' Machine Learning Recipes (Scikit Learn & Tensorflow):
- Tensorflow: https://www.youtube.com/watch?v=oZikw5k_2FM
- great course (Stanford): https://www.coursera.org/learn/machine-learning
- Google's Deep Learning Nanodegree: https://www.udacity.com/course/deep-learning--ud730
- Google/Kaggle Machine Learning Engineer Nanodegree: https://www.udacity.com/course/machine-learning-engineer-nanodegree-by-google--nd009
- Self-Driving Car Engineer Nanodegree: https://www.udacity.com/course/self-driving-car-engineer-nanodegree--nd013
- Metis Online Course ($99 for 2 months): http://www.thisismetis.com/explore-data-science
- Practical Guide to implementing Neural Networks in Python (using Theano): http://www.analyticsvidhya.com/blog/2016/04/neural-networks-python-theano/
- Data School: http://www.dataschool.io/
- Vector calculus
-
Parallel Programming
-
Cryptography (also see videos below)
- Cryptography: Hash Functions: https://www.youtube.com/watch?v=KqqOXndnvic&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=30
- Cryptography: Encryptiom: https://www.youtube.com/watch?v=9TNI2wHmaeI&index=31&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
-
Entropy (see videos below)
-
Discrete math (see videos below)
-
Go:
- A Tour of Go: https://www.youtube.com/watch?v=ytEkHepK08c
- Go Programming: https://www.youtube.com/watch?v=CF9S4QZuV30
- Why Learn Go?: https://www.youtube.com/watch?v=FTl0tl9BGdc
--
I added these to reinforce some ideas already presented above, but didn't want to include them
above because it's just too much. It's easy to overdo it on a subject.
You want to get hired in this century, right?
-
More Dynamic Programming
- 6.006: Dynamic Programming I: Fibonacci, Shortest Paths: https://www.youtube.com/watch?v=OQ5jsbhAv_M&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=19
- 6.006: Dynamic Programming II: Text Justification, Blackjack: https://www.youtube.com/watch?v=ENyox7kNKeY&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=20
- 6.006: DP III: Parenthesization, Edit Distance, Knapsack: https://www.youtube.com/watch?v=ocZMDMZwhCY&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb&index=21
- 6.006: DP IV: Guitar Fingering, Tetris, Super Mario Bros.: https://www.youtube.com/watch?v=tp4_UXaVyx8&index=22&list=PLUl4u3cNGP61Oq3tWYp6V_F-5jb5L2iHb
- 6.046: Dynamic Programming & Advanced DP: https://www.youtube.com/watch?v=Tw1k46ywN6E&index=14&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp
- 6.046: Dynamic Programming: All-Pairs Shortest Paths: https://www.youtube.com/watch?v=NzgFUwOaoIw&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=15
- 6.046: Dynamic Programming (student recitation): https://www.youtube.com/watch?v=krZI60lKPek&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=12
-
Advanced Graph Processing
- Synchronous Distributed Algorithms: Symmetry-Breaking. Shortest-Paths Spanning Trees: https://www.youtube.com/watch?v=mUBmcbbJNf4&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=27
- Asynchronous Distributed Algorithms: Shortest-Paths Spanning Trees: https://www.youtube.com/watch?v=kQ-UQAzcnzA&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=28
-
MIT Probability (mathy, and go slowly, which is good for mathy things):
- MIT 6.042J - Probability Introduction: https://www.youtube.com/watch?v=SmFwFdESMHI&index=18&list=PLB7540DEDD482705B
- MIT 6.042J - Conditional Probability: https://www.youtube.com/watch?v=E6FbvM-FGZ8&index=19&list=PLB7540DEDD482705B
- MIT 6.042J - Independence: https://www.youtube.com/watch?v=l1BCv3qqW4A&index=20&list=PLB7540DEDD482705B
- MIT 6.042J - Random Variables: https://www.youtube.com/watch?v=MOfhhFaQdjw&list=PLB7540DEDD482705B&index=21
- MIT 6.042J - Expectation I: https://www.youtube.com/watch?v=gGlMSe7uEkA&index=22&list=PLB7540DEDD482705B
- MIT 6.042J - Expectation II: https://www.youtube.com/watch?v=oI9fMUqgfxY&index=23&list=PLB7540DEDD482705B
- MIT 6.042J - Large Deviations: https://www.youtube.com/watch?v=q4mwO2qS2z4&index=24&list=PLB7540DEDD482705B
- MIT 6.042J - Random Walks: https://www.youtube.com/watch?v=56iFMY8QW2k&list=PLB7540DEDD482705B&index=25
-
Simonson: Approximation Algorithms: https://www.youtube.com/watch?v=oDniZCmNmNw&list=PLFDnELG9dpVxQCxuD-9BSy2E7BWY3t5Sm&index=19
Sit back and enjoy. "netflix and skill" :P
-
List of individual Dynamic Programming problems (each is short):
-
MIT 18.06 Linear Algebra, Spring 2005 (35 videos):
-
Computer Science 70, 001 - Spring 2015 - Discrete Mathematics and Probability Theory:
-
Discrete Mathematics (19 videos):
-
CSE373 - Analysis of Algorithms (25 videos):
- Skiena lectures from Algorithm Design Manual
- https://www.youtube.com/watch?v=ZFjhkohHdAA&list=PLOtl7M3yp-DV69F32zdK7YJcNXpTunF2b&index=1
-
UC Berkeley 61B (Spring 2014): Data Structures (25 videos):
-
UC Berkeley 61B (Fall 2006): Data Structures (39 videos):
-
UC Berkeley 61C: Machine Structures (26 videos):
-
MIT 6.004: Computation Structures (49 videos):
-
MIT 6.006: Intro to Algorithms (47 videos):
-
MIT 6.033: Computer System Engineering (22 videos):
-
MIT 6.034 Artificial Intelligence, Fall 2010 (30 videos):
-
MIT 6.042J Mathematics for Computer Science, Fall 2010 (25 videos):
-
MIT 6.046: Design and Analysis of Algorithms (34 videos):
-
MIT 6.050J Information and Entropy, Spring 2008 ()
-
MIT 6.851: Advanced Data Structures (22 videos):
-
MIT 6.854 (Advanced Algorithms), Spring 2016 (24 videos):
-
MIT 6.858 Computer Systems Security, Fall 2014 ():
-
Stanford: Programming Paradigms (17 videos)
-
Introduction to Cryptography:
- https://www.youtube.com/watch?v=2aHkqB2-46k&feature=youtu.be
- more in series (not in order): https://www.youtube.com/channel/UC1usFRN4LCMcfIV7UjHNuQg
http://www.gainlo.co/ - Mock interviewers from big companies
Keep learning.
You're never really done.