Skip to content

add data analysis agent powered by NVIDIA Llama 3.1 Nemotron Ultra #289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 7, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions community/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,10 @@ Community examples are sample code and deployments for RAG pipelines that are no

## Inventory

* [NVIDIA Data Analysis Agent](./data-analysis-agent/)

This example demonstrates an interactive, agentic data analysis application that leverages NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1 for advanced reasoning and data exploration. Users can upload CSV files, ask questions in natural language, and receive automated visualizations with clear, step-by-step reasoning. The implementation features a modular agent architecture for data insight, code generation, execution, and transparent reasoning.

* [NVIDIA RAG in 5 minutes](./5_mins_rag_no_gpu/)

This is a simple standalone implementation showing rag pipeline using Nvidia API Catalog models. It uses a simple Streamlit UI and one file implementation of a minimalistic RAG pipeline.
Expand Down
99 changes: 99 additions & 0 deletions community/data-analysis-agent/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
# Data Analysis Agent

An interactive, agentic data analysis application that leverages advanced LLM reasoning to help users explore, visualize, and understand their data using NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1.

## Overview

This repository contains a Streamlit application that demonstrates a complete workflow for data analysis:
1. **Data Upload**: Upload CSV files for analysis
2. **Natural Language Queries**: Ask questions about your data in plain English
3. **Automated Visualization**: Generate relevant plots and charts
4. **Transparent Reasoning**: Get detailed explanations of the analysis process

The implementation leverages the powerful Llama-3.1-Nemotron-Ultra-253B-v1 model through NVIDIA's API, enabling sophisticated data analysis and reasoning.

Learn more about the model [here](https://developer.nvidia.com/blog/build-enterprise-ai-agents-with-advanced-open-nvidia-llama-nemotron-reasoning-models/).

## Features

- **Agentic Architecture**: Modular agents for data insight, code generation, execution, and reasoning
- **Natural Language Queries**: Ask questions about your data—no coding required
- **Automated Visualization**: Instantly generate and display relevant plots
- **Transparent Reasoning**: Get clear, LLM-generated explanations for every result
- **Powered by NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1**: State-of-the-art reasoning and interpretability

![Workflow](./assets/workflow.png)

## Requirements

- Python 3.10+
- Streamlit
- NVIDIA API Key (see [Installation](#installation) section for setup instructions)
- Required Python packages:
- pandas
- matplotlib
- streamlit
- requests

## Installation

1. Clone this repository:
```bash
git clone https://github.com/NVIDIA/GenerativeAIExamples.git
cd GenerativeAIExamples/community/data-analysis-agent
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Set up your NVIDIA API key:
- Sign up or log in at [NVIDIA Build](https://build.nvidia.com/nvidia/llama-3_1-nemotron-ultra-253b-v1?integrate_nim=true&hosted_api=true&modal=integrate-nim)
- Generate an API key
- Set the API key in your environment:
```bash
export NVIDIA_API_KEY=your_nvidia_api_key_here
```
- Or add it to your `.env` file if you use one

## Usage

1. Run the Streamlit app:
```bash
streamlit run data_analysis.py
```

2. Download example dataset (optional):
```bash
wget https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv
```

3. Use the application:
- Upload a CSV file (e.g., the Titanic dataset)
- Ask questions in natural language
- View results, visualizations, and detailed reasoning

## Example

![App Demo](./assets/data_analysis_agent_demo.png)

## Model Details

The Llama-3.1-Nemotron-Ultra-253B-v1 model used in this project has the following specifications:
- **Parameters**: 253B
- **Features**: Advanced reasoning capabilities
- **Use Cases**: Complex data analysis, multi-agent systems
- **Enterprise Ready**: Optimized for production deployment

## Acknowledgments

- [NVIDIA Llama-3.1-Nemotron-Ultra-253B-v1](https://build.nvidia.com/nvidia/llama-3_1-nemotron-ultra-253b-v1)
- [Streamlit](https://streamlit.io/)
- [Pandas](https://pandas.pydata.org/)
- [Matplotlib](https://matplotlib.org/)


## Contributing

Contributions are welcome! Please open an issue or submit a pull request.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading