Commit 5a0bca8

add top k problem
1 parent 6ba3714 commit 5a0bca8

File tree

5 files changed

+405
-0
lines changed

zh-hans/SUMMARY.md

Lines changed: 4 additions & 0 deletions
@@ -221,6 +221,10 @@
  * [Big Data](bigdata/README.md)
  * [Top K Frequent Words (Map Reduce)](bigdata/top_k_frequent_words_map_reduce.md)
  * [Top K Frequent Words](bigdata/top_k_frequent_words.md)
+ * [Top K Frequent Words II](bigdata/top_k_frequent_words_ii.md)
+ * [K Closest Points](bigdata/k_closest_points.md)
+ * [Top k Largest Numbers](bigdata/top_k_largest_numbers.md)
+ * [Top k Largest Numbers II](bigdata/top_k_largest_numbers_ii.md)
  * [Problem Misc](problem_misc/README.md)
  * [Nuts and Bolts Problem](problem_misc/nuts_and_bolts_problem.md)
  * [String to Integer](problem_misc/string_to_integer.md)

zh-hans/bigdata/k_closest_points.md

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
---
difficulty: Medium
tags:
- Heap
- Amazon
- LinkedIn
title: K Closest Points
---

# K Closest Points

## Problem

### Metadata

- tags: Heap, Amazon, LinkedIn
- difficulty: Medium
- source(lintcode): <https://www.lintcode.com/problem/k-closest-points/>

### Description

Given some `points` and a point `origin` in two-dimensional space, find the `k` points that are nearest to `origin`.
Return these points sorted by distance; if two points are at the same distance, sort them by x-coordinate, and if those are also equal, by y-coordinate.

#### Example

Given points = `[[4,6],[4,7],[4,4],[2,5],[1,1]]`, origin = `[0, 0]`, k = `3`
return `[[1,1],[2,5],[4,4]]`

## Solution

Where the usual top-k problems compare strings or counts, this one compares distances.

### Java

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

/**
 * Definition for a point.
 * class Point {
 *     int x;
 *     int y;
 *     Point() { x = 0; y = 0; }
 *     Point(int a, int b) { x = a; y = b; }
 * }
 */

public class Solution {
    /**
     * @param points: a list of points
     * @param origin: a point
     * @param k: An integer
     * @return: the k closest points
     */
    public Point[] kClosest(Point[] points, Point origin, int k) {
        if (points == null || k <= 0) return new Point[0];

        // Max-heap: the farthest of the kept points sits at the top.
        Queue<Point> heap = new PriorityQueue<Point>(k, new DistanceComparator(origin));
        for (Point point : points) {
            heap.offer(point);
            // Evict the worst point by the full ordering (distance, x, y),
            // so that ties on distance are resolved correctly.
            if (heap.size() > k) {
                heap.poll();
            }
        }

        // Poll from farthest to nearest, filling the array back to front.
        int minK = heap.size();
        Point[] kClosestPoints = new Point[minK];
        for (int i = 1; i <= minK; i++) {
            kClosestPoints[minK - i] = heap.poll();
        }

        return kClosestPoints;
    }

    // Squared Euclidean distance; long avoids int overflow.
    public long distance(Point p, Point origin) {
        long dx = (long) p.x - origin.x;
        long dy = (long) p.y - origin.y;
        return dx * dx + dy * dy;
    }

    class DistanceComparator implements Comparator<Point> {
        private Point origin = null;
        public DistanceComparator(Point origin) {
            this.origin = origin;
        }

        // Reversed order (farthest first) so PriorityQueue acts as a max-heap.
        public int compare(Point p1, Point p2) {
            long d1 = distance(p1, origin);
            long d2 = distance(p2, origin);
            if (d1 != d2) {
                return Long.compare(d2, d1);
            } else if (p1.x != p2.x) {
                return Integer.compare(p2.x, p1.x);
            } else {
                return Integer.compare(p2.y, p1.y);
            }
        }
    }
}
```
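
### Source Code Analysis

The key points are how the `Comparator` is used and whether to choose a max-heap or a min-heap.

### Complexity Analysis

The heap holds at most $$k$$ elements, so each insertion or removal costs $$O(\log k)$$. The time complexity is therefore $$O(n \log k)$$ and the space complexity is $$O(k)$$.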
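
To make the heap choice concrete, here is a minimal self-contained sketch of the bounded max-heap idea. It is an illustration only: the class `KClosestSketch`, the `int[]` point representation, and the fixed origin `(0, 0)` are assumptions made for brevity and are not part of the LintCode template.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class; points are {x, y} arrays and the origin is (0, 0).
class KClosestSketch {
    // Squared distance from the origin; long avoids int overflow.
    static long dist(int[] p) {
        return (long) p[0] * p[0] + (long) p[1] * p[1];
    }

    // Ascending order required by the problem: distance, then x, then y.
    static final Comparator<int[]> NEAREST_FIRST = (a, b) -> {
        if (dist(a) != dist(b)) return Long.compare(dist(a), dist(b));
        if (a[0] != b[0]) return Integer.compare(a[0], b[0]);
        return Integer.compare(a[1], b[1]);
    };

    static List<int[]> kClosest(int[][] points, int k) {
        List<int[]> result = new ArrayList<>();
        if (k <= 0) return result;
        // Reversing the comparator turns PriorityQueue (a min-heap) into a
        // max-heap, so the worst kept point is always on top.
        PriorityQueue<int[]> heap = new PriorityQueue<>(k, NEAREST_FIRST.reversed());
        for (int[] p : points) {
            heap.offer(p);
            if (heap.size() > k) heap.poll(); // drop the current worst
        }
        while (!heap.isEmpty()) result.add(heap.poll()); // worst ... best
        Collections.reverse(result);                     // best ... worst
        return result;
    }

    public static void main(String[] args) {
        int[][] points = {{4, 6}, {4, 7}, {4, 4}, {2, 5}, {1, 1}};
        for (int[] p : kClosest(points, 3)) {
            System.out.println(p[0] + "," + p[1]);
        }
    }
}
```

A max-heap is used because the element that must be discarded once the heap exceeds `k` entries is the farthest one; with a min-heap the farthest element would not be reachable in $$O(\log k)$$.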
Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
---
difficulty: Hard
tags:
- Heap
- Data Structure Design
- Hash Table
title: Top K Frequent Words II
---

# Top K Frequent Words II

## Problem

### Metadata

- tags: Heap, Data Structure Design, Hash Table
- difficulty: Hard
- source(lintcode): <https://www.lintcode.com/problem/top-k-frequent-words-ii/>

### Description

Find the top *k* frequent words in a real-time data stream.

Implement three methods for the *TopK* class:

1. `TopK(k)`. The constructor.
2. `add(word)`. Add a new word.
3. `topk()`. Get the current top *k* frequent words.

#### Notice

If two words have the same frequency, rank them alphabetically.

#### Example

```
TopK(2)
add("lint")
add("code")
add("code")
topk()
>> ["code", "lint"]
```

## Solution

This problem is fairly hard. It closely resembles a Redis sorted set: combining a dictionary (hash map) with a sorted set solves it neatly.

### Java

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class TopK {
    private int k;
    private Map<String, Integer> wordFreq = null;
    private TreeSet<String> topkSet = null;

    // Orders words by descending frequency, then alphabetically.
    class TopkComparator implements Comparator<String> {
        public int compare(String s1, String s2) {
            int s1Freq = wordFreq.get(s1), s2Freq = wordFreq.get(s2);
            if (s1Freq != s2Freq) {
                return Integer.compare(s2Freq, s1Freq);
            } else {
                return s1.compareTo(s2);
            }
        }
    }

    /*
     * @param k: An integer
     */
    public TopK(int k) {
        // do initialization if necessary
        this.k = k;
        wordFreq = new HashMap<String, Integer>();
        topkSet = new TreeSet<String>(new TopkComparator());
    }

    /*
     * @param word: A string
     * @return: nothing
     */
    public void add(String word) {
        if (wordFreq.containsKey(word)) {
            // Remove before changing the count: the set locates elements
            // through the comparator, which reads the current frequency.
            topkSet.remove(word);
            wordFreq.put(word, wordFreq.get(word) + 1);
        } else {
            wordFreq.put(word, 1);
        }

        // Re-insert under the new frequency and keep at most k words.
        topkSet.add(word);
        if (topkSet.size() > k) {
            topkSet.pollLast();
        }
    }

    /*
     * @return: the current top k frequent words.
     */
    public List<String> topk() {
        // TreeSet iterates in comparator order, i.e. most frequent first.
        return new ArrayList<String>(topkSet);
    }
}
```

### Source Code Analysis

### Complexity Analysis

To be continued.
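
The dictionary-plus-sorted-set pairing can be illustrated in isolation. The following is a minimal sketch, not the solution itself: the class `SortedSetSketch` and its static fields are hypothetical names, and the size bound `k` is deliberately left out so that only the remove, update, re-insert sequence is shown.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical demo class: a map holds each word's score (its frequency),
// and a TreeSet keeps the words ordered by that score.
class SortedSetSketch {
    static final Map<String, Integer> freq = new HashMap<>();

    // Higher frequency first; ties are broken alphabetically.
    static final Comparator<String> BY_FREQ = (a, b) -> {
        int fa = freq.get(a), fb = freq.get(b);
        return fa != fb ? Integer.compare(fb, fa) : a.compareTo(b);
    };

    static final TreeSet<String> ranking = new TreeSet<>(BY_FREQ);

    static void add(String word) {
        if (freq.containsKey(word)) {
            // Remove while the old frequency is still in the map; after the
            // update the tree could no longer locate the stale entry.
            ranking.remove(word);
        }
        freq.merge(word, 1, Integer::sum); // insert 1 or increment
        ranking.add(word);                 // re-insert at the new position
    }

    public static void main(String[] args) {
        add("lint");
        add("code");
        add("code");
        System.out.println(ranking); // prints [code, lint]
    }
}
```

The ordering of the calls matters because `TreeSet` finds elements by walking the tree with the comparator. Changing a word's frequency while it is still in the set would leave it filed under its old position.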
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
---
difficulty: Medium
tags:
- Priority Queue
- Heap
title: Top k Largest Numbers
---

# Top k Largest Numbers

## Problem

### Metadata

- tags: Priority Queue, Heap
- difficulty: Medium
- source(lintcode): <https://www.lintcode.com/problem/top-k-largest-numbers/>

### Description

Given an integer array, find the top *k* largest numbers in it.

#### Example

Given `[3,10,1000,-99,4,100]` and *k* = `3`.
Return `[1000, 100, 10]`.

## Solution

An easy problem: a heap is all that is needed.

### Java

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class Solution {
    /**
     * @param nums: an integer array
     * @param k: An integer
     * @return: the top k largest numbers in array
     */
    public int[] topk(int[] nums, int k) {
        if (nums == null || nums.length == 0 || k <= 0) return new int[0];

        // Max-heap holding every element of the array.
        PriorityQueue<Integer> pq = new PriorityQueue<Integer>(nums.length, Collections.reverseOrder());
        for (int num : nums) {
            pq.offer(num);
        }

        // Poll at most nums.length values so poll() never returns null.
        int size = Math.min(k, nums.length);
        int[] maxK = new int[size];
        for (int i = 0; i < size; i++) {
            maxK[i] = pq.poll();
        }

        return maxK;
    }
}
```

### Source Code Analysis

### Complexity Analysis

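
The solution above pushes all $$n$$ elements into a max-heap, which costs $$O(n \log n)$$ time and $$O(n)$$ space. As a possible alternative (a different technique from the one used above, shown only as a sketch), a min-heap bounded to `k` elements brings this down to $$O(n \log k)$$ time and $$O(k)$$ space. The class name `TopKSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical demo class: bounded min-heap variant of top-k.
class TopKSketch {
    static int[] topk(int[] nums, int k) {
        if (nums == null || k <= 0) return new int[0];
        // Min-heap of the k largest values seen so far; the smallest is on top.
        PriorityQueue<Integer> heap = new PriorityQueue<>(k);
        for (int num : nums) {
            heap.offer(num);
            if (heap.size() > k) heap.poll(); // discard the smallest kept value
        }
        // Polling yields ascending order, so fill the array from the back
        // to return the values in descending order.
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }

    public static void main(String[] args) {
        int[] nums = {3, 10, 1000, -99, 4, 100};
        System.out.println(Arrays.toString(topk(nums, 3))); // prints [1000, 100, 10]
    }
}
```

The heap choice is the reverse of the one in the solution above: to keep the largest values, the element to evict is the smallest kept one, so a min-heap is the right structure here.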
