**Translator: [youyun](https://github.com/youyun)**

**Author: [labuladong](https://github.com/labuladong)**

# Detailed Analysis of LRU Algorithm

### 1. What is the LRU Algorithm?

It is simply a cache eviction strategy.

A computer has a limited memory cache. When the cache is full, some contents must be removed to make room for new content. But which part of the cache should be removed? We want to remove the less useful contents while leaving the useful contents in place for future use. So the question is: what is the criterion for deciding whether data is _useful_?

The LRU (Least Recently Used) cache eviction algorithm is a common strategy. As the name suggests, the most recently used data is considered _useful_. Hence, when the cache is full, we should prioritize removing the data that has gone unused for the longest time.

For example, an Android phone can run apps in the background. If I open, in sequence, Settings, Phone Manager, and Calendar, their order in the background will be as follows:

![jietu](../pictures/LRU%E7%AE%97%E6%B3%95/1.jpg)

If I switch to Settings now, Settings will be brought to the top:

![jietu](../pictures/LRU%E7%AE%97%E6%B3%95/2.jpg)

Assume my phone only allows me to keep 3 apps open simultaneously; the cache is now full. If I open another app, Clock, I have to close one app to free up space for Clock. Which one should be closed?

According to the LRU strategy, the bottom app, Phone Manager, should be closed, because it has gone unused the longest. Afterwards, the newly opened app will be on top:

![jietu](../pictures/LRU%E7%AE%97%E6%B3%95/3.jpg)

Now you should understand the LRU (Least Recently Used) strategy. There are other strategies available as well, such as LFU (Least Frequently Used); different strategies suit different use cases. We'll focus on LRU in this article.

### 2. LRU Algorithm Description

The LRU algorithm is really an exercise in data structure design:

1. Take a parameter, `capacity`, as the maximum size; then
2. Implement two APIs:
    * `put(key, val)`: store a key-value pair
    * `get(key)`: return the value associated with the key; return -1 if the key doesn't exist.
3. The time complexity of both `get` and `put` should be __O(1)__.

Let's use an example to understand how the LRU algorithm works.

```java
/* Cache capacity is 2 */
LRUCache cache = new LRUCache(2);
// Assume the cache is a queue
// The head is on the left, the tail is on the right
// The most recently used is at the head; the longest unused is at the tail
// A bracket pair represents a key-value pair, (key, val)

cache.put(1, 1);
// cache = [(1, 1)]
cache.put(2, 2);
// cache = [(2, 2), (1, 1)]
cache.get(1);       // returns 1
// cache = [(1, 1), (2, 2)]
// Remarks: because key 1 was accessed, move it to the head
// Return the value, 1, associated with key 1
cache.put(3, 3);
// cache = [(3, 3), (1, 1)]
// Remarks: the cache is full
// We need to remove some content to free up space
// Removal prioritizes the longest unused data, which is at the tail
// Afterwards, insert the new data at the head
cache.get(2);       // returns -1 (not found)
// cache = [(3, 3), (1, 1)]
// Remarks: key 2 does not exist in the cache
cache.put(1, 4);
// cache = [(1, 4), (3, 3)]
// Remarks: key 1 exists
// Overwrite it with the new value 4
// Don't forget to move the key to the head
```

### 3. LRU Algorithm Design

From the analysis above, if `put` and `get` must both run in O(1) time, we can summarize the required features of this cache data structure: fast search, fast insertion, fast deletion, and ordered.

- _Ordered_: Obviously, the data has to be ordered to distinguish the recently used from the longest unused.
- _Fast search_: We need to quickly find whether a key exists in the cache.
- _Fast deletion_: If the cache is full, we need to delete the last element.
- _Fast insertion_: We need to insert the data at the head upon each access.

Which data structure fulfills all of these requirements? A hash table offers fast search, but its data is unordered. A linked list is ordered, with fast insertion and deletion, but slow search. Combining the two, we arrive at a new data structure: the __hash linked list__.

The core data structure of the LRU cache algorithm is the hash linked list, a combination of a doubly linked list and a hash table. Here is what the data structure looks like:

![HashLinkedList](../pictures/LRU%E7%AE%97%E6%B3%95/5.jpg)

The idea is simple: use a hash table to give the linked list the ability of fast search. Think back to the previous example; isn't this data structure the perfect fit for an LRU cache?

Some readers may wonder: why a doubly linked list? Wouldn't a singly linked list work? And since the keys already exist in the hash table, why store key-value pairs in the linked list instead of values only?

The answers only surface when we actually do it. We can only understand the rationale behind the design after implementing the LRU algorithm ourselves. Let's look at the code.

### 4. Implementation

Many programming languages have a built-in hash linked list, or LRU-like functionality. To help understand the details of the LRU algorithm, let's use Java to reinvent the wheel.

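As an aside, Java's built-in hash linked list is `java.util.LinkedHashMap`: constructed with `accessOrder = true` and with `removeEldestEntry` overridden, it behaves as an LRU cache out of the box. A minimal sketch (the class name `BuiltInLRUCache` is just for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// An LRU cache built on Java's LinkedHashMap.
// accessOrder = true makes the map reorder entries on every access;
// removeEldestEntry evicts the longest-unused entry once the size
// exceeds the capacity.
class BuiltInLRUCache extends LinkedHashMap<Integer, Integer> {
    private final int cap;

    public BuiltInLRUCache(int capacity) {
        super(capacity, 0.75f, true); // third argument: accessOrder
        this.cap = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
        return size() > cap;
    }
}
```

We'll still build our own version below, since the goal is to understand what a structure like `LinkedHashMap` does under the hood.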
First, define the `Node` class of the doubly linked list. Assume both `key` and `val` are of type `int`.

```java
class Node {
    public int key, val;
    public Node next, prev;
    public Node(int k, int v) {
        this.key = k;
        this.val = v;
    }
}
```

Using our `Node` class, implement a doubly linked list with the necessary APIs (the time complexity of each of these functions is O(1)):

```java
class DoubleList {
    // Add x at the head, time complexity O(1)
    public void addFirst(Node x);

    // Delete node x from the linked list (x is guaranteed to exist)
    // Given a node in a doubly linked list, time complexity O(1)
    public void remove(Node x);

    // Delete and return the last node in the linked list, time complexity O(1)
    public Node removeLast();

    // Return the length of the linked list, time complexity O(1)
    public int size();
}
```

P.S. This is the typical interface of a doubly linked list. To keep the focus on the LRU algorithm, we'll skip the detailed implementation of the functions in this class.

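For readers who do want the details, here is one possible way to fill in `DoubleList`, using two sentinel (dummy) nodes so insertion and deletion never have to special-case an empty list. The sentinel layout is my own choice, not something this article prescribes; the `Node` class is repeated to keep the sketch self-contained.

```java
// Node as defined above
class Node {
    public int key, val;
    public Node next, prev;
    public Node(int k, int v) {
        this.key = k;
        this.val = v;
    }
}

// One possible DoubleList implementation with sentinel head/tail nodes
class DoubleList {
    private final Node head = new Node(0, 0); // sentinel before the first real node
    private final Node tail = new Node(0, 0); // sentinel after the last real node
    private int size = 0;

    public DoubleList() {
        head.next = tail;
        tail.prev = head;
    }

    // Splice x in right after the head sentinel, O(1)
    public void addFirst(Node x) {
        x.next = head.next;
        x.prev = head;
        head.next.prev = x;
        head.next = x;
        size++;
    }

    // Unlink x by rewiring its neighbors (x is guaranteed to exist), O(1)
    public void remove(Node x) {
        x.prev.next = x.next;
        x.next.prev = x.prev;
        size--;
    }

    // Remove and return the last real node, O(1); null if the list is empty
    public Node removeLast() {
        if (tail.prev == head) return null;
        Node last = tail.prev;
        remove(last);
        return last;
    }

    public int size() {
        return size;
    }
}
```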
Now we can answer the question of why we must use a doubly linked list. To delete a node, we not only need the pointer to the node itself, but also need to update the nodes before and after it. Only with a doubly linked list can we guarantee O(1) time complexity.

With the doubly linked list ready, we just need to combine it with a hash table in the LRU algorithm. Let's sort out the logic with pseudocode:

```java
// key maps to Node(key, val)
HashMap<Integer, Node> map;
// Node(k1, v1) <-> Node(k2, v2)...
DoubleList cache;

int get(int key) {
    if (key does not exist) {
        return -1;
    } else {
        bring (key, val) to the head;
        return val;
    }
}

void put(int key, int val) {
    Node x = new Node(key, val);
    if (key exists) {
        delete the old node;
        insert the new node x at the head;
    } else {
        if (cache is full) {
            delete the last node in the linked list;
            delete the associated entry in map;
        }
        insert the new node x at the head;
        associate the new node x with key in map;
    }
}
```

If you can understand the logic above, it's easy to translate it into code:

```java
class LRUCache {
    // key -> Node(key, val)
    private HashMap<Integer, Node> map;
    // Node(k1, v1) <-> Node(k2, v2)...
    private DoubleList cache;
    // Max capacity
    private int cap;

    public LRUCache(int capacity) {
        this.cap = capacity;
        map = new HashMap<>();
        cache = new DoubleList();
    }

    public int get(int key) {
        if (!map.containsKey(key))
            return -1;
        int val = map.get(key).val;
        // Use the put method to move it to the head
        put(key, val);
        return val;
    }

    public void put(int key, int val) {
        // Initialize the new node x
        Node x = new Node(key, val);

        if (map.containsKey(key)) {
            // Delete the old node, add the new one to the head
            cache.remove(map.get(key));
            cache.addFirst(x);
            // Update the corresponding record in map
            map.put(key, x);
        } else {
            if (cap == cache.size()) {
                // Delete the last node in the linked list
                Node last = cache.removeLast();
                map.remove(last.key);
            }
            // Add to the head
            cache.addFirst(x);
            map.put(key, x);
        }
    }
}
```

This answers the earlier question of why we store key-value pairs in the linked list rather than values only. Pay attention to this block of code:

```java
if (cap == cache.size()) {
    // Delete the last node
    Node last = cache.removeLast();
    map.remove(last.key);
}
```

When the cache is full, we not only need to delete the last node, but also need to delete its key from the map, and we can only obtain that key from the node. If a node stored only the value, we couldn't recover the key, and hence couldn't delete it from the map.

By now, you should understand the idea and implementation of the LRU algorithm. One common mistake is forgetting to update the associated entries in the hash table while manipulating nodes in the linked list.

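As a final sanity check, we can replay the trace from section 2 against the implementation. The sketch below bundles `Node`, a condensed sentinel-based `DoubleList` (one possible way to fill in the interface left unimplemented above), and the `LRUCache` class into a single self-contained snippet:

```java
import java.util.HashMap;

// Node and a sentinel-based DoubleList, condensed so the snippet is
// self-contained (the sentinel layout is one implementation choice,
// not the only one).
class Node {
    int key, val;
    Node next, prev;
    Node(int k, int v) { key = k; val = v; }
}

class DoubleList {
    private final Node head = new Node(0, 0), tail = new Node(0, 0);
    private int size = 0;
    DoubleList() { head.next = tail; tail.prev = head; }
    void addFirst(Node x) {
        x.next = head.next; x.prev = head;
        head.next.prev = x; head.next = x; size++;
    }
    void remove(Node x) { x.prev.next = x.next; x.next.prev = x.prev; size--; }
    Node removeLast() {
        if (tail.prev == head) return null;
        Node last = tail.prev; remove(last); return last;
    }
    int size() { return size; }
}

class LRUCache {
    private HashMap<Integer, Node> map = new HashMap<>();
    private DoubleList cache = new DoubleList();
    private int cap;

    public LRUCache(int capacity) { this.cap = capacity; }

    public int get(int key) {
        if (!map.containsKey(key)) return -1;
        int val = map.get(key).val;
        put(key, val); // move to the head
        return val;
    }

    public void put(int key, int val) {
        Node x = new Node(key, val);
        if (map.containsKey(key)) {
            cache.remove(map.get(key));
            cache.addFirst(x);
            map.put(key, x);
        } else {
            if (cap == cache.size()) {
                Node last = cache.removeLast();
                map.remove(last.key);
            }
            cache.addFirst(x);
            map.put(key, x);
        }
    }
}
```

Running the section-2 sequence (`put(1,1)`, `put(2,2)`, `get(1)`, `put(3,3)`, `get(2)`, `put(1,4)`) produces exactly the evictions and return values described there.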
**To make algorithms clear! Subscribe to my WeChat blog labuladong for more easy-to-understand articles.**
![labuladong](../pictures/labuladong.png)
