osopardo1

Organizations

@Qbeast-io


Starred repositories

Open, Multi-modal Catalog for Data & AI

Java 2,567 419 Updated Jan 7, 2025

The official home of the Presto distributed SQL query engine for big data

Java 16,144 5,405 Updated Jan 7, 2025

A simple macOS application that will prevent iTunes or Apple Music from launching.

Swift 4,019 65 Updated Aug 8, 2024

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Java 956 151 Updated Jan 7, 2025

This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…

Scala 718 146 Updated Aug 13, 2024

Visual-Flow main repository

Shell 457 4 Updated Dec 4, 2023

The Open-Source toolkit to build your own reliable and secure Industrial IoT platform.

Go 289 47 Updated Dec 20, 2024

Blazing-fast query execution engine speaks Apache Spark language and has Arrow-DataFusion at its core.

Rust 1,365 129 Updated Jan 8, 2025

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team co…

TypeScript 5,848 1,103 Updated Jan 8, 2025

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks

Python 412 237 Updated Jan 7, 2025

Upserts, Deletes And Incremental Processing on Big Data.

Java 5,547 2,443 Updated Jan 8, 2025

Full stack application platform for building stateful microservices, streaming APIs, and real-time UIs

Java 491 39 Updated Aug 9, 2024

MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.

Go 49,324 5,602 Updated Jan 8, 2025

QuestDB is a high performance, open-source, time-series database

Java 14,773 1,197 Updated Jan 8, 2025

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.

Java 275 76 Updated Oct 4, 2024

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 21,256 1,454 Updated Jan 8, 2025

Write data & AI pipelines in (SQL, Spark, Pandas) and deploy to the cloud, simplified

Python 34 8 Updated Jan 7, 2025

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 137,308 27,482 Updated Jan 8, 2025

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Java 10,667 3,058 Updated Jan 8, 2025

The missing star history graph of GitHub repos - https://star-history.com

TypeScript 6,825 266 Updated Jan 3, 2025

This repository started out as a learning in public project for myself and has now become a structured learning map for many in the community. We have 3 years under our belt covering all things Dev…

Shell 27,506 6,363 Updated Nov 12, 2024

A Scala API for Apache Beam and Google Cloud Dataflow.

Scala 2,569 513 Updated Jan 8, 2025

A GitHub API client to extract events and actions, and load them into a database

Jupyter Notebook 28 13 Updated Oct 22, 2021

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

Python 16,825 4,228 Updated Jan 8, 2025

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Python 4,467 961 Updated Dec 19, 2024

A sbt plugin for publishing Scala/Java projects to the Maven central.

Scala 337 65 Updated Dec 31, 2024

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!

Scala 222 20 Updated Dec 17, 2024

Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such …

JavaScript 793 54 Updated Aug 10, 2022

A simple Spark-powered ETL framework that just works 🍺

Scala 178 32 Updated Dec 7, 2023