UCI Machine Learning Repository
- Adult Data Set: adult.data.bz2 (bz2 compressed), adult.names (the description)
- Wine Quality Data Set: winequality-red.csv and winequality-white.csv
- Phase 3 Release: ALL.chr22.phase3_1000.vcf.bz2 (first 1000 variants), integrated_call_samples_v2.20130502.ALL.ped, 1000_gen_populations.txt
Others:
- BOM: About Air Temperature Data: bom_data_Note.txt, nsw_temp.csv
- Enron Spam Dataset: ham.zip and spam.zip (zip-compressed documents from the ham and spam folders in enron1.tar.gz)
- Project Gutenberg: "The Prince" by Machiavelli: prince_by_machiavelli.txt
- The Internet Classics Archive: "The Art of War" by Sun Tzu: artwar.1b.txt
- Twitter: tweets.json (a sample of tweets captured with the public API)