Tianji Cong (superctj)

PhD Candidate

11 followers · 11 following

CSE@University of Michigan
Ann Arbor, Michigan
https://superctj.github.io

Stars

McGill-NLP / llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Python 1,397 109 Updated Oct 8, 2024

microsoft / MLOS

MLOS is a project to enable autotuning for systems.

Python 148 68 Updated Jan 16, 2025

apicrafter / metacrafter

Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.

Python 44 5 Updated Jul 8, 2024

superctj / observatory-library

Python library for embedding inference of relational tables.

Python 3 Updated Jul 8, 2024

RLGen / LakeBench

Python 222 8 Updated Jan 4, 2025

amsterdata / schemapile

Jupyter Notebook 8 1 Updated Jul 8, 2024

superctj / openforge

Python 1 Updated Dec 22, 2024

delftdata / valentine

A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching methods.

Python 84 23 Updated Nov 28, 2024

ermongroup / cs228-notes

Course notes for CS228: Probabilistic Graphical Models.

SCSS 1,923 479 Updated Mar 25, 2024

automl / SMAC3

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

Python 1,100 230 Updated Jan 15, 2025

google-deepmind / PGMax

Loopy belief propagation for factor graphs on discrete variables in JAX

Jupyter Notebook 137 11 Updated Oct 18, 2024

superctj / cybernetics

Configuration Tuning for Postgres

Python 3 Updated Dec 13, 2024

thomas-young-2013 / open-box

Generalized and Efficient Blackbox Optimization System.

Python 81 81 Updated Feb 21, 2023

uw-mad-dash / llamatune

Supplementary Material for "LlamaTune: Sample-Efficient DBMS Configuration Tuning"

Python 34 9 Updated Jul 29, 2022

duckdb / duckdb

DuckDB is an analytical in-process SQL database management system

C++ 25,831 2,038 Updated Jan 16, 2025

wwweiwei / awesome-self-supervised-learning-for-tabular-data

A collection of research materials on SSL for non-sequential tabular data (SSL4NSTD)

169 11 Updated Nov 23, 2024

ccfddl / ccf-deadlines

⏰ Collaboratively track deadlines of conferences recommended by CCF (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~

Vue 6,742 463 Updated Jan 15, 2025

ByteByteGoHq / system-design-101

Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.

67,531 7,118 Updated Aug 16, 2024

superctj / observatory

Characterization of relational table embeddings (VLDB 2024).

Python 25 1 Updated Jul 1, 2024

superctj / pylon

Codebase and data for our paper - Pylon: Semantic Table Union Search in Data Lakes.

Python 3 1 Updated Aug 14, 2023

superctj / spider-join-data

Spider join dataset for our paper - WarpGate: A Semantic Join Discovery System for Cloud Data Warehouses (CIDR 2023).

Python 2 Updated Aug 4, 2023

qdrant / qdrant

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 21,366 1,466 Updated Jan 15, 2025

ekzhu / datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Python 2,635 296 Updated Jun 4, 2024

CSStipendRankings / CSStipendRankings

https://csstipendrankings.org

HTML 202 54 Updated Jan 13, 2025

facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.

C++ 32,395 3,703 Updated Jan 15, 2025

cwida / pvldbstyle

PVLDB LaTeX style, based on acmart

TeX 13 2 Updated Apr 29, 2021

dtim-upc / NextiaJD

NextiaJD is a library that supports data discovery based on data profiles and machine learning algorithms to find joinable attributes in heterogeneous datasets

Scala 7 3 Updated Oct 16, 2023

haizhongzheng / ATTA

ATTA (Efficient Adversarial Training with Transferable Adversarial Examples)

Python 35 5 Updated Aug 17, 2020

guenthermi / table-embeddings

Tools for training schema-aware Web table embedding for unsupervised and supervised machine learning on tabular data

Python 18 1 Updated Apr 14, 2024

alex-bogatu / d3l

D3L dataset discovery framework - an implementation of the ICDE 2020 paper with the same name: https://arxiv.org/pdf/2011.10427.pdf

Python 20 3 Updated Nov 18, 2021
