Droplet is a Python library for sampling, sketching, and summarizing data from massive data streams.

YCHEN041/L0Sampler

 
 


Droplet

Introduction

Droplet is a Python library for sampling, sketching and summarizing massive data streams. More information can be found on the wiki.

Current status is PRE-ALPHA. Do not expect anything to work.

Contents

Samplers:

  • L0-sampler

Sketches:

  • Count-min (TODO)
  • Top-k (TODO)
  • HyperLogLog (TODO)

Summaries:

  • TBD

Installation guide

Pretty simple, really. From the terminal:

git clone https://github.com/venantius/droplet.git
cd droplet
python setup.py install 

Usage

Droplet is designed for use with massive data streams (GB+, TB+, etc.) that can be read only once.

EXAMPLE GOES HERE
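
To illustrate the read-once constraint, here is a minimal sketch of single-pass sampling over distinct items. It does not use Droplet's API, which is not documented here: `MinHashDistinctSampler` and `sample_stream` are hypothetical names, and the sketch uses `hashlib` from the standard library rather than mmh3 so that it runs without extra dependencies. It keeps only the item with the smallest hash value, so repeated items do not increase their chance of being picked. A full L0-sampler also copes with deletions from the stream; this simplified version assumes an insert-only stream.

```python
import hashlib


class MinHashDistinctSampler:
    """Hypothetical helper (not part of Droplet).

    Keeps a single item from the stream, chosen by smallest hash value.
    The choice depends only on the set of distinct items and the seed,
    not on how often or in what order items appear.
    """

    def __init__(self, seed=0):
        # The seed is used as a hash key so different seeds give
        # different (but reproducible) samples.
        self.seed = str(seed).encode()
        self.best_hash = None
        self.sample = None

    def _hash(self, item):
        # 64-bit keyed hash of the item's repr; an assumption made
        # for illustration, any well-mixed hash would do.
        digest = hashlib.blake2b(
            repr(item).encode(), key=self.seed, digest_size=8
        ).digest()
        return int.from_bytes(digest, "big")

    def update(self, item):
        # Constant memory: remember only the current minimum.
        h = self._hash(item)
        if self.best_hash is None or h < self.best_hash:
            self.best_hash = h
            self.sample = item


def sample_stream(stream, seed=0):
    """Consume an iterable exactly once and return the sampled item."""
    sampler = MinHashDistinctSampler(seed)
    for item in stream:
        sampler.update(item)
    return sampler.sample
```

Because the stream is consumed item by item, `sample_stream` works on generators and file iterators as well as lists, and returns `None` for an empty stream.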

Dependencies

PyPI:

  • mmh3

License

Droplet is licensed under the Apache license.
