- Hyderabad
- @praveenr019
Starred repositories
Apache Spark - A unified analytics engine for large-scale data processing
Scala 2 compiler and standard library. Scala 2 bugs at https://github.com/scala/bug; Scala 3 at https://github.com/scala/scala3
A platform to build and run apps that are elastic, agile, and resilient. SDK, libraries, and hosted environments.
PredictionIO, a machine learning server for developers and ML engineers.
A Git platform powered by Scala with easy installation, high extensibility & GitHub API compatibility
A fault tolerant, protocol-agnostic RPC system
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Spark: The Definitive Guide's Code Repository
REST job server for Apache Spark
Streaming MapReduce with Scalding and Storm
a command line tool to apply templates defined on GitHub
Lightning-fast cluster computing in Java, Scala and Python.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
A connector for Spark that allows reading and writing to/from Redis cluster
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
BlinkDB: Sub-Second Approximate Queries on Very Large Data.
Redshift data source for Apache Spark
A library that provides useful extensions to Apache Spark and PySpark.
A Spark WordCountJob example as a standalone SBT project with Specs2 tests, runnable on Amazon EMR
Movie recommendations and more in MapReduce and Scalding
The official repository for the Rock the JVM Spark Optimization 2 course
kayousterhout / spark
Forked from mesos/spark
Scala framework for iterative and interactive cluster computing.