Stars
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Reference implementation for Structured Prediction with Deep Value Networks
smh2019 / probcomp-stack
Forked from probcomp/probcomp-stack
MIT Probabilistic Computing Project software stack
Downloads and archives content from reddit
This repository accompanies the book "Grokking Deep Learning"
A curated list of awesome Bioinformatics libraries and software.
Resources for learning about Text Mining and Natural Language Processing
🌎 machine learning tutorials (mainly in Python3)
A repo for data science related questions and answers
120+ interactive Python coding interview challenges (algorithms and data structures). Includes Anki flashcards.
Examples of bad data, especially from government.
The scraper, parser, and database creation scripts for Financial Management Service daily U.S. Treasury statements.
A complete computer science study plan to become a software engineer.
A continuous stream of user activity events is generated by multiple users as they use our mobile Cube app. The objective is to implement a server to ingest these events. The server will expose…
Bootstrap Kubernetes the hard way. No scripts.
Simple code for extracting data from an Excel sheet and ingesting it into an AWS S3 bucket
Predicted Bay Area bike share demand with Spark MLlib and built a pipeline to bridge Amazon S3, MongoDB server, and Spark EC2 cluster for NoSQL data processing.
🐬 A comprehensive tutorial on getting started with Docker!
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials,…
A data pipeline that pulls public transport data daily from the opentransportdata.swiss portal. The pipeline has three tasks: pull the right data from opentransportdata.swiss, push the data to S3 for…
A serverless data processing pipeline to store Census data in AWS S3.
Scripts to download the U.S. Department of Justice's National Caseload Data and load it into Amazon Athena for querying
Data ingestion for Amazon Elasticsearch Service from S3 and Amazon Kinesis, using AWS Lambda: Sample code
Website to tell visitors whether a company is an MLM