Streaming Data Pipelines: From Supernovas to LLMs (Circulars + RAG/LLMs)

Overview

This project accompanies the Data + AI Summit 2024 presentation "Streaming Data Pipelines: From Supernovas to LLMs", which is available here.

Note: This is neither a beginner tutorial nor a step-by-step guide. For easy-to-replicate tutorials, please visit databricks.com/demos.

Project Description

This hands-on, in-depth project demonstrates the use of 36,000 NASA circulars for a compound AI application (RAG + LLM).

Get the Circulars in packed JSON format from here
Upload them to a Databricks UC managed volume and extract them
Use the provided DLT pipeline to ingest the data. Make sure to specify the correct folder to read from
Chunk the data (as a reference, have a look at the provided data prep notebook)
Create a Vector DB endpoint and index using the UI.
Define a LangChain template with a question and RAG content (as a reference, have a look at the provided RAG chain notebook)
Examine the output and iterate on the prompt. Note that, unlike the response without RAG, this version contains precise data from 2024 and no hallucinations.
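
The chunking and prompt-template steps above can be illustrated with a minimal sketch in plain Python. This is not the code from the provided notebooks: the names `chunk_text`, `build_prompt`, and `PROMPT_TEMPLATE`, the chunk size, and the overlap are all hypothetical choices made for illustration, and the notebooks may chunk differently and use LangChain's own prompt classes instead of `str.format`.

```python
# Illustrative sketch only: names and sizes below are assumptions,
# not taken from the project's notebooks.


def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of at most max_chars characters."""
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    text = " ".join(text.split())  # collapse newlines and repeated whitespace
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
        start += max_chars - overlap  # step forward, keeping some overlap
    return chunks


# Hypothetical prompt wording: a question plus retrieved (RAG) context.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context is insufficient, say so.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Fill the template with the question and the retrieved chunks."""
    context = "\n---\n".join(retrieved_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

In the full pipeline, the chunks would be embedded and stored in the vector index, and the strings passed as `retrieved_chunks` would come from a similarity search against that index rather than from a hard-coded list.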

Additional Resources

Make sure to read the Databricks documentation about DLT and Vector DB.
Slides
Session

License

The code is provided "as is" without any warranty.

Contact

For questions about Databricks products, use our forum at community.databricks.com.

Acknowledgements

We would like to express our gratitude to the following individuals for their contributions and support:

Judith Rascusin (NASA)
Alex, Nicolas, Raghu, Praveen, Neil, Eric (Databricks)
