RumbleDB

With RumbleDB, you can easily query many nested, heterogeneous data formats such as JSON, CSV, Parquet, Avro, LibSVM, and plain text.

RumbleDB exposes a query language rather than a DataFrame API. This gives you more flexibility and productivity, and it matters because a lot of data simply does not fit into DataFrames.
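To make the point about data that does not fit into DataFrames concrete, here is a minimal Python sketch (plain standard library, not RumbleDB or JSONiq code). The sample records and the `field_types` helper are hypothetical and exist only for illustration. The sketch scans a few JSON Lines records and reports which fields change type from one record to the next, which is the kind of heterogeneity a fixed-schema table struggles to represent.

```python
import json
from collections import defaultdict

# Hypothetical JSON Lines sample: one record per line, with fields that vary
# in presence, type, and nesting from record to record.
SAMPLE = """\
{"id": 1, "tags": "spark", "author": {"name": "Ada"}}
{"id": 2, "tags": ["json", "csv"], "author": "anonymous"}
{"id": "3", "author": {"name": "Lin", "orcid": null}}
"""


def field_types(lines):
    """Map each top-level field to the set of Python types observed for it."""
    seen = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


types = field_types(SAMPLE.splitlines())

# Fields observed with more than one type cannot be given a single column type.
mixed = sorted(key for key, kinds in types.items() if len(kinds) > 1)
print(mixed)
```

In this sample every field is affected: `id` is a number or a string, `tags` is a string, an array, or absent, and `author` is an object or a string. A query language designed for nested, heterogeneous data can address such records directly instead of first forcing them into uniform columns.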

You can query your data in place, whether it sits on a local file system or in a data lake (Azure Blob Storage, Amazon S3, HDFS, etc.).

You can prepare, clean up, and validate your data, then feed it directly into your machine learning pipelines with RumbleDB ML.

Getting started: you will find a Jupyter notebook that introduces the JSONiq language on top of RumbleDB here. You can also run it locally if you prefer.

The documentation also contains an introduction specific to RumbleDB that explains how to read input datasets, but we have not converted it to Jupyter notebooks yet (this will follow).

The documentation of the latest official release is available here.

The documentation of the current master (for the adventurous and curious) is available here.

RumbleDB is an effort involving many researchers and ETH Zurich students: code and support by Stefan Irimescu, Ghislain Fourny, Gustavo Alonso, Renato Marroquin, Rodrigo Bruno, Falko Noé, Ioana Stefan, Andrea Rinaldi, Stevan Mihajlovic, Mario Arduini, Can Berker Çıkış, Elwin Stephan, David Dao, Zirun Wang, Ingo Müller, Dan-Ovidiu Graur, Thomas Zhou, Olivier Goerens, Alexandru Meterez, Remo Röthlisberger, Dominik Bruggisser, David Loughlin.
