Stars
Open solution to the Home Credit Default Risk challenge 🏡
A highly opinionated framework for building shiny dashboards.
LAPOP dashboard with fake data based on Shiny R. An interactive Data Playground for strengthening understanding of Latin America’s politics, economy, culture for the general public.
hadoop_storm_spark结合实验的例子,模拟淘宝双11节,根据订单详细信息,汇总出总销售量,各个省份销售排行,以及后期的SQL分析,数据分析,数据挖掘等。 --------大概流程------- 第一阶段(storm实时报表) 第二阶段(离线报表)第三阶段(大规模订单即席查询,和多维度查询) 第四阶段(数据挖掘和图计算)
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials,…
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Everything about fintech: companies, technologies, libraries & packages, policies, jobs, milestones .
面向程序员的数据挖掘指南
Materials for IBM Spark contest. About the real-world application of big data and spark.
A stand alone, light-weight web server for building, sharing graphs created in ipython. Build for data science, data analysis guys. Aiming at building an interactive visualization, collaborated das…
CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.