OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.

Ziy1-Tan/OpenMLDB

Chinese version

Introduction

FEDB is a NewSQL database optimised for real-time inference and decisioning applications.

  • High Performance

    Reduces data access latency with an in-memory storage engine and significantly improves execution performance through SQL compilation optimization.

  • SQL Compatible

    FEDB is compatible with most ANSI SQL syntax. You can build your applications with SQLAlchemy or JDBC.

  • Online-offline Consistency

    Machine learning applications developed with FEDB can be launched easily while ensuring online-offline consistency, which greatly reduces cost.

  • High Availability

    Supports automatic failover and horizontal scaling.

Note: The latest FEDB release is unstable and is not recommended for use in production environments.
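
To make the online-offline consistency idea above concrete, here is a minimal sketch in which one SQL feature definition serves both the training path and the serving path. It uses Python's built-in sqlite3 module purely as a stand-in, not FEDB or its client API. The table `txn`, its columns, the feature name `amount_sum_last3`, and the helper `compute_features` are all hypothetical, and the window query assumes the bundled SQLite is version 3.25 or newer.

```python
import sqlite3

# One feature definition shared by the offline and online paths (hypothetical schema).
FEATURE_SQL = """
SELECT id,
       SUM(amount) OVER (
           PARTITION BY card
           ORDER BY ts
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS amount_sum_last3
FROM txn
ORDER BY ts
"""


def compute_features(rows):
    """Run FEATURE_SQL over rows of (id, card, ts, amount); return {id: feature}."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for the real database
    conn.execute("CREATE TABLE txn (id INTEGER, card TEXT, ts INTEGER, amount REAL)")
    conn.executemany("INSERT INTO txn VALUES (?, ?, ?, ?)", rows)
    result = dict(conn.execute(FEATURE_SQL).fetchall())
    conn.close()
    return result


history = [
    (1, "a", 100, 10.0),
    (2, "a", 200, 20.0),
    (3, "b", 250, 5.0),
    (4, "a", 300, 30.0),
]

# Offline: compute the feature for every historical row (training data).
offline = compute_features(history)

# Online: a new request arrives and the same SQL is evaluated over history plus the request.
request = (5, "a", 400, 40.0)
online = compute_features(history + [request])[5]
```

Because both paths evaluate the same `FEATURE_SQL`, the feature seen at serving time is defined exactly as it was in training, so the logic does not have to be reimplemented for each path.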

Quick Start

Build

git clone --recurse-submodules https://github.com/4paradigm/fedb.git
cd fedb
docker run -v `pwd`:/fedb -it ghcr.io/4paradigm/centos6_gcc7_hybridsql:latest
cd /fedb
sh steps/init_env.sh
sh steps/install_hybridse.sh
mkdir -p build && cd build && cmake ../ && make -j5 fedb

Demo

  • Predict taxi trip duration
  • Detect the health of online transactions and raise alerts (upcoming)
  • Online real-time transaction fraud detection (upcoming)

Architecture

[Architecture diagram]

Roadmap

ANSI SQL Compatibility

FEDB is currently compatible with mainstream DDL and DML syntax and will gradually enhance its compatibility with ANSI SQL syntax.

  • [2021H1] Support the standard syntax of Window, Where, Group By, Join, etc.
  • [2021H1&H2] Expand AI-oriented SQL syntax and UDAFs (user-defined aggregate functions).
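
For readers unfamiliar with the standard clauses named in this roadmap, the sketch below shows a Where, Group By and Join query in plain ANSI-style SQL. It runs on Python's built-in sqlite3 module as a stand-in and makes no claim about which statements FEDB accepts today. The `users` and `orders` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for the real database
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
INSERT INTO users VALUES (1, 'Beijing'), (2, 'Shanghai');
INSERT INTO orders VALUES (10, 1, 12.5), (11, 1, 7.5), (12, 2, 3.0), (13, 2, 40.0);
""")

# Join orders to users, keep orders above 5, then aggregate per city.
query = """
SELECT u.city, COUNT(*) AS n_orders, SUM(o.amount) AS total
FROM orders AS o
JOIN users AS u ON u.user_id = o.user_id
WHERE o.amount > 5
GROUP BY u.city
ORDER BY u.city
"""
rows = conn.execute(query).fetchall()
conn.close()
```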

Features

To meet the high performance requirements of real-time inference and decisioning scenarios, FEDB uses memory as its storage medium. Memory storage engines currently used in the industry suffer from memory fragmentation and inefficient data recovery. FEDB plans to optimize its memory allocation algorithm to reduce fragmentation and to introduce PMEM (Intel Optane DC Persistent Memory Module) to improve data recovery efficiency.

  • [2021H1] Provide a new memory allocation strategy to reduce memory fragmentation.
  • [2021H2] Support a PMEM-based storage engine.

Build Ecosystem

FEDB has a Python client and a Java client that supports most of the JDBC API. FEDB will connect to the big data ecosystem so that it can be integrated easily with Flink, Kafka, and Spark.

  • [2021H1&H2] Support Flink/Kafka/Spark connectors.

Feedback and Getting involved

  • Report bugs, ask questions, or make suggestions through GitHub Issues.
  • Cannot find what you are looking for? Have a question or an idea? Please post your questions or comments on our Slack.

License

Apache License 2.0
