View superctj's full-sized avatar


Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,397 109 Updated Oct 8, 2024

MLOS is a project to enable autotuning for systems.

Python 148 68 Updated Jan 16, 2025

Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers, with fully customizable and flexible rules.

Python 44 5 Updated Jul 8, 2024

Python library for embedding inference of relational tables.

Python 3 Updated Jul 8, 2024
Python 222 8 Updated Jan 4, 2025
Jupyter Notebook 8 1 Updated Jul 8, 2024
Python 1 Updated Dec 22, 2024

A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching methods.

Python 84 23 Updated Nov 28, 2024

Course notes for CS228: Probabilistic Graphical Models.

SCSS 1,923 479 Updated Mar 25, 2024

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

Python 1,100 230 Updated Jan 15, 2025

Loopy belief propagation for factor graphs on discrete variables in JAX

Jupyter Notebook 137 11 Updated Oct 18, 2024

Configuration Tuning for Postgres

Python 3 Updated Dec 13, 2024

Generalized and Efficient Blackbox Optimization System.

Python 81 81 Updated Feb 21, 2023

Supplementary Material for "LlamaTune: Sample-Efficient DBMS Configuration Tuning"

Python 34 9 Updated Jul 29, 2022

DuckDB is an analytical in-process SQL database management system

C++ 25,831 2,038 Updated Jan 16, 2025

A collection of research materials on SSL for non-sequential tabular data (SSL4NSTD)

169 11 Updated Nov 23, 2024

⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~

Vue 6,742 463 Updated Jan 15, 2025

Explains complex systems using visuals and simple terms. Helps you prepare for system design interviews.

67,531 7,118 Updated Aug 16, 2024

Characterization of relational table embeddings (VLDB 2024).

Python 25 1 Updated Jul 1, 2024

Codebase and data for our paper - Pylon: Semantic Table Union Search in Data Lakes.

Python 3 1 Updated Aug 14, 2023

Spider join dataset for our paper - WarpGate: A Semantic Join Discovery System for Cloud Data Warehouses (CIDR 2023).

Python 2 Updated Aug 4, 2023

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 21,366 1,466 Updated Jan 15, 2025

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Python 2,635 296 Updated Jun 4, 2024

https://csstipendrankings.org

HTML 202 54 Updated Jan 13, 2025

A library for efficient similarity search and clustering of dense vectors.

C++ 32,395 3,703 Updated Jan 15, 2025

PVLDB LaTeX style, based on acmart

TeX 13 2 Updated Apr 29, 2021

NextiaJD is a library that supports data discovery based on data profiles and machine learning algorithms to find joinable attributes in heterogeneous datasets

Scala 7 3 Updated Oct 16, 2023

ATTA (Efficient Adversarial Training with Transferable Adversarial Examples)

Python 35 5 Updated Aug 17, 2020

Tools for training schema-aware Web table embedding for unsupervised and supervised machine learning on tabular data

Python 18 1 Updated Apr 14, 2024

D3L dataset discovery framework - an implementation of the ICDE 2020 paper with the same name: https://arxiv.org/pdf/2011.10427.pdf

Python 20 3 Updated Nov 18, 2021