Commit daa9de2

authored Jul 25, 2019
Create README.md
1 parent 126ca4a commit daa9de2

1 file changed: +15 -0 lines changed

examples/README.md

@@ -0,0 +1,15 @@
# Criteo Data

## Download
Download the classic competition data [here]().
Download the 1TB data [here](https://labs.criteo.com/2013/12/download-terabyte-click-logs-2/).

## Processing
### Competition data
1. Split the file into manageable chunks that will fit in memory.
2. Process the data using the Spark script.
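
The splitting step can be sketched in Python as follows. This is only an illustration, not the script used in this repository: the function name `split_file`, the `chunk_00000.txt` naming scheme, and the default of one million lines per chunk are all assumptions, and the right chunk size depends on how much memory is available.

```python
from itertools import islice
from pathlib import Path


def split_file(src, out_dir, lines_per_chunk=1_000_000):
    """Split a text file into numbered chunks of at most `lines_per_chunk` lines.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    # Binary mode copies each line byte-for-byte, so no decoding is needed.
    with open(src, "rb") as f:
        while True:
            # Only one chunk's worth of lines is held in memory at a time.
            lines = list(islice(f, lines_per_chunk))
            if not lines:
                break
            chunk_path = out_dir / f"chunk_{len(chunk_paths):05d}.txt"
            chunk_path.write_bytes(b"".join(lines))
            chunk_paths.append(chunk_path)
    return chunk_paths
```

For example, `split_file("train.txt", "chunks", lines_per_chunk=5_000_000)` would write the chunks into a `chunks/` directory, where `train.txt` stands in for whatever the downloaded file is actually called.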

### 1TB data
1. Unzip each file.
2. (Optional) Store the files in a new location.
3. Process the data using the Spark script.
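
Steps 1 and 2 can be sketched in Python as below, assuming the downloaded files are gzip archives with a `.gz` suffix. The function name `decompress_all` and its parameters are hypothetical and not part of this repository; passing `dest_dir` covers the optional step of storing the files in a new location.

```python
import gzip
import shutil
from pathlib import Path


def decompress_all(src_dir, dest_dir=None):
    """Decompress every *.gz file in `src_dir`, writing the results to `dest_dir`.

    Hypothetical helper for illustration; assumes gzip-compressed inputs.
    """
    src_dir = Path(src_dir)
    # Storing the output elsewhere is optional: default to the source directory.
    dest_dir = Path(dest_dir) if dest_dir is not None else src_dir
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for archive in sorted(src_dir.glob("*.gz")):
        target = dest_dir / archive.stem  # e.g. "day_0.gz" -> "day_0"
        # Stream in blocks so a very large file never has to fit in memory.
        with gzip.open(archive, "rb") as fin, open(target, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        written.append(target)
    return written
```

Streaming with `shutil.copyfileobj` is chosen here because the decompressed files are far too large to read in one call.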
