Useful Algorithms
No | Algorithm | Description | Usecases | Useful Links |
---|---|---|---|---|
1 | Kalman filter | ◾ as a person who makes a correct guess about something's location. ◾ to find missing roads and update map data. ◾for map matching. | ||
2 | Viterbi algorithm | ◾ as a person who understands the correct story even if some words are spelled wrong. ◾for map matching. |
No | Algorithm | Description | Usecases | Useful Links |
---|---|---|---|---|
1 | 𝐁𝐥𝐨𝐨𝐦 𝐅𝐢𝐥𝐭𝐞𝐫 | ◾ A space-efficient probabilistic data structure used to test whether an element is a member of a set. ◾ False positives are possible, but false negatives are not. ◾ Eliminate costly lookup. | ◾ Checking if a username is available. ◾ Identifying potentially malicious URLs in a web crawler. | https://www.linkedin.com/posts/alex-xu-a8131b11_systemdesign-coding-interviewtips-activity-6917494340315463680-O0sG/ |
2 | 𝐅𝐫𝐮𝐠𝐚𝐥 𝐒𝐭𝐫𝐞𝐚𝐦𝐢𝐧𝐠 | ◾ A memory-efficient algorithm to compute quantiles (e.g., median, percentiles) over streaming data. | ◾ Real-time monitoring of website response times. ◾ Analyzing large-scale sensor data. | |
3 | 𝐆𝐞𝐨𝐡𝐚𝐬𝐡/𝐒2 𝐆𝐞𝐨𝐦𝐞𝐭𝐫𝐲 | ◾ Geospatial indexing systems that divide the Earth's surface into hierarchical grids for efficient spatial queries. | ◾ Location-based services (finding nearby restaurants, stores, etc.). ◾ Proximity searches (matching users based on location). | |
4 | 𝐇𝐲𝐩𝐞𝐫𝐋𝐨𝐠𝐋𝐨𝐠: | ◾ A probabilistic algorithm for estimating the number of distinct elements (cardinality) in a data set with high accuracy and low memory usage. ◾ Count unique values fast. | ◾ Counting unique visitors to a website. ◾ Analyzing user behavior patterns. | https://engineering.fb.com/2018/12/13/data-infrastructure/hyperloglog/ |
5 | 𝐋𝐞𝐚𝐤𝐲 𝐁𝐮𝐜𝐤𝐞𝐭/𝐓𝐨𝐤𝐞𝐧 𝐁𝐮𝐜𝐤𝐞𝐭 | ◾ Algorithms used to control the rate of traffic flow in a network or system, preventing overload. | ◾ Rate limiting API requests. ◾ Smoothing out bursty traffic. | |
6 | 𝐋𝐨𝐬𝐬𝐲 𝐂𝐨𝐮𝐧𝐭 | ◾ An algorithm for finding frequent items in a data stream, particularly when memory is limited. It allows for some error to save space. | ◾ Identifying trending topics on social media. ◾ Finding popular products in an e-commerce store. | |
7 | 𝐎𝐩𝐞𝐫𝐚𝐭𝐢𝐨𝐧𝐚𝐥 𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐚𝐭𝐢𝐨𝐧 (𝐎𝐓) | ◾ A technique for enabling concurrent editing of shared documents while maintaining consistency. | ◾ Collaborative text editors (like Google Docs). ◾ Real-time code collaboration tools. ◾ Collaborative drawing applications. | https://en.wikipedia.org/wiki/Operational_transformation |
8 | 𝐐𝐮𝐚𝐝𝐭𝐫𝐞𝐞/𝐑-𝐓𝐫𝐞𝐞 | ◾ Data structures for indexing multi-dimensional data (e.g., spatial data). | ◾ Spatial databases. ◾ Collision detection in video games. | |
9 | 𝐑𝐚𝐲 𝐂𝐚𝐬𝐭𝐢𝐧𝐠 | ◾ A technique in computer graphics used to determine which objects in a scene are intersected by a ray (e.g., from the viewer's eye). | ◾ Rendering 3D scenes. ◾ Line of sight calculations. ◾ Hidden surface removal. | |
10 | 𝐑𝐞𝐯𝐞𝐫𝐬𝐞 𝐈𝐧𝐝𝐞𝐱 | ◾ A data structure that maps terms (words or phrases) to the documents containing them. | ◾ Full-text search engines. ◾ Document retrieval systems. | |
11 | 𝐑𝐬𝐲𝐧𝐜 𝐀𝐥𝐠𝐨𝐫𝐢𝐭𝐡𝐦 | ◾ A file synchronization algorithm that minimizes data transfer by only sending differences between files. | ◾ File backups. ◾ Remote file synchronization. ◾ Version control systems. | |
12 | Geohash | ◾ Location based service. | https://www.pubnub.com/learn/glossary/what-is-geohashing/ | |
13 | Quadtree | ◾ Location based service. | https://engblog.yext.com/post/geolocation-caching | |
14 | Consistent hashing | ◾ Balance the load within a cluster of services. | https://www.toptal.com/big-data/consistent-hashing | |
15 | Leaky bucket | ◾ Rate Limiter. | https://www.quora.com/What-is-the-difference-between-token-bucket-and-leaky-bucket-algorithms | |
16 | token bucket | ◾ Rate Limiter. | https://www.quora.com/What-is-the-difference-between-token-bucket-and-leaky-bucket-algorithms | |
17 | Trie | ◾ Search Autocomplete. | https://en.wikipedia.org/wiki/Trie | |
18 | Rsync | ◾ File Transfers. | https://rsync.samba.org/tech_report/ | |
18 | Raft | ◾ Consensus Algorithm. | https://raft.github.io/ | |
19 | Paxos | ◾ Consensus Algorithm. | https://martinfowler.com/articles/patterns-of-distributed-systems/paxos.html | |
20 | Merkle tree | ◾ Identify Inconsistencies between nodes. | https://en.wikipedia.org/wiki/Merkle_tree | |
21 | Count-min sketch | ◾ Estimate frequencies of items. | https://florian.github.io/count-min-sketch/ | |
22 | Hierarchical timing wheels | ◾ Job scheduler. | https://www.cse.wustl.edu/~cdgill/courses/cs6874/TimingWheels.ppt |
Data Structures That Power Your Databases
No | Algorithm | Description | Usecases | Useful Links |
---|---|---|---|---|
1 | Skiplist | ◾ a common in-memory index type. Used in Redis. | ||
2 | Hash index | ◾ a very common implementation of the “Map” data structure (or “Collection”). | ||
3 | SSTable | ◾ immutable on-disk “Map” implementation. | ||
4 | LSM tree | ◾ Skiplist + SSTable. High write throughput. | ||
5 | B-tree | ◾ disk-based solution. Consistent read/write performance. | ||
6 | Inverted index | ◾ used for document indexing. Used in Lucene. | ||
7 | Suffix tree | ◾ for string pattern search. | ||
8 | R-tree | ◾ multi-dimension search, such as finding the nearest neighbor. |
Data structures
No | Algorithm | Description | Usecases | Useful Links |
---|---|---|---|---|
1 | list | ◾ keep your Twitter feeds. | ||
2 | stack | ◾ support undo/redo of the word editor. | ||
3 | queue | ◾ keep printer jobs, or send user actions in-game. | ||
4 | heap | ◾ task scheduling. | ||
5 | tree | ◾ keep the HTML document, or for AI decision. | ||
6 | suffix tree | ◾ for searching string in a document. | ||
7 | graph | ◾ for tracking friendship, or path finding. | ||
8 | r-tree | ◾ for finding the nearest neighbor. | ||
9 | vertex buffer | ◾ for sending data to GPU for rendering. | ||
10 | Breadth-First Search (BFS) | Explore a graph level by level, starting from the root, which is great for finding the shortest path in unweighted graphs. | ➡️ Useful when: You're designing web crawlers or analyzing social networks. | |
11 | Two Heaps | Uses a min-heap and max-heap to manage dynamic datasets efficiently, maintaining median and priority. | ➡️ Useful when: You need to manage a priority queue or dynamic datasets. | |
12 | Two Pointers | This technique takes 2 points in a sequence and performs logic based on the problem. | ➡️ Useful when: You are implementing sorting or searching functions. | |
13 | Sliding Window | Optimizes the computation by reusing the state from the previous subset of data. | ➡️ Useful when: You're handling network congestion or data compression. | |
14 | Depth-First Search (DFS) | Explores each path to the end, ideal for situations that involve exploring all options like in puzzles. | ➡️ Useful when: You're working with graph structures or need to generate permutations. | |
15 | Topological Sort | Helps in scheduling tasks based on their dependencies. | ➡️ Useful when: You are determining execution order in project management or compiling algorithms. | |
16 | Merge Intervals | Optimizes overlapping intervals to minimize the number of intervals. | ➡️ Useful when: Scheduling resources or managing calendars. | |
17 | Backtracking | It explores all potential solutions systematically and is perfect for solving puzzles and optimization problems. | ➡️ Useful when: Solving complex logical puzzles or optimizing resource allocations. | |
18 | Trie (Prefix Tree) | A tree-like structure that manages dynamic sets of strings efficiently, often used for searching. | ➡️ Useful when: Implementing spell-checkers or autocomplete systems. | |
19 | Flood Fill | It fills a contiguous area for features like the 'paint bucket' tool. | ➡️ Useful when: Working in graphics editors or game development. | |
20 | Segment Tree | Efficiently manages intervals or segments and is useful for storing information about intervals and querying over them. | ➡️ Useful when: Dealing with database range queries or statistical calculations. |