Commit

enable emojis for mkdocs
Signed-off-by: Shivdeep Singh <[email protected]>
shivdeep-singh-ibm committed May 2, 2024
1 parent 799d37f commit 5c31811
Showing 2 changed files with 28 additions and 24 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -14,6 +14,7 @@
Data Prep Lab is a community project to democratize and accelerate unstructured data preparation for LLM app developers.
With the explosive growth of LLM-enabled use cases, developers are faced with the enormous challenge of preparing use case-specific unstructured data to fine-tune or instruct-tune the LLMs.
As the variety of use cases grows, so does the need to support:

- New modalities of data (code, language, speech, visual)
- New ways of transforming the data to optimize the performance of the resulting LLMs for each specific use case.
- Large variety in the scale of data to be processed, from laptop-scale to datacenter-scale
@@ -35,6 +36,7 @@ These modules have been tested in producing pre-training datasets for the [Grani

The modules are built on common frameworks (for Spark and Ray), called the *data processing library* that allows the developers to build new custom modules that readily scale across a variety of runtimes.
Eventually, Data Prep Lab will offer consistent APIs and configurations across the following underlying runtimes.

1. Python runtime
2. Ray runtime (local and distributed)
3. Spark runtime (local and distributed)
@@ -59,6 +61,7 @@ Contributors are welcome to add new modules as well as add runtime support for e


Features of the toolkit:

- Aiming to accelerate unstructured data prep burden for the "long tail" of LLM use cases
- Growing set of module implementations across multiple runtimes and targeting laptop-scale to datacenter-scale processing
- A growing set of sample pipelines developed for real enterprise use cases
@@ -184,6 +187,3 @@ Thanks to the [BigCode Project](https://github.com/bigcode-project) that has bee






46 changes: 25 additions & 21 deletions mkdocs.yml
@@ -1,27 +1,31 @@
  site_name: "Data Prep LAB"
  docs_dir: .
  site_dir: ../site
  repo_url: https://github.com/IBM/data-prep-lab
  nav:
- - Home: README.md
- - Overview: data-processing-lib/doc/overview.md
- - Tutorials:
- - data-processing-lib/doc/transform-tutorials.md
- - Simple: data-processing-lib/doc/simplest-transform-tutorial.md
- - Advanced: data-processing-lib/doc/advanced-transform-tutorial.md
- - KFP Pipeline: kfp/doc/simple_transform_pipeline.md
+ - Home: README.md
+ - Overview: data-processing-lib/doc/overview.md
+ - Tutorials:
+ - data-processing-lib/doc/transform-tutorials.md
+ - Simple: data-processing-lib/doc/simplest-transform-tutorial.md
+ - Advanced: data-processing-lib/doc/advanced-transform-tutorial.md
+ - KFP Pipeline: kfp/doc/simple_transform_pipeline.md
  theme:
- name: 'material'
- favicon: 'data-processing-lib/doc/favicon.ico'
- logo: 'data-processing-lib/doc/logo-ibm.png'
- palette:
- primary: black
- # palette:
- # primary: 'blue grey'
- features:
- - navigation.tabs
+ name: "material"
+ favicon: "data-processing-lib/doc/favicon.ico"
+ logo: "data-processing-lib/doc/logo-ibm.png"
+ palette:
+ primary: black
+ # palette:
+ # primary: 'blue grey'
+ features:
+ - navigation.tabs
  plugins:
- - search
- - mkdocstrings
- - badges
- - same-dir

+ - search
+ - mkdocstrings
+ - badges
+ - same-dir
+ markdown_extensions:
+ - pymdownx.emoji:
+ emoji_index: !!python/name:material.extensions.emoji.twemoji
+ emoji_generator: !!python/name:material.extensions.emoji.to_svg
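
The diff view on this page does not preserve leading whitespace, so the nesting of the new `markdown_extensions` block is not visible. A minimal sketch of how it would likely read in `mkdocs.yml`, assuming two-space indentation and the layout shown in the Material for MkDocs documentation (the exact indentation in the commit is an assumption; the keys and `!!python/name` values are taken from the commit itself):

```yaml
# Sketch only: indentation is assumed, not recovered from the commit.
markdown_extensions:
  - pymdownx.emoji:
      # Map emoji shortcodes to the Twemoji set bundled with the Material theme
      emoji_index: !!python/name:material.extensions.emoji.twemoji
      # Render each emoji as inline SVG
      emoji_generator: !!python/name:material.extensions.emoji.to_svg
```

With this block in place, a shortcode such as `:smile:` in any Markdown page under `docs_dir` should render as an inline SVG emoji. The `material.extensions.emoji` module path is provided by recent releases of the Material theme, so an older installation may need upgrading before the build succeeds.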
