[pull] main from zylon-ai:main #34

Open · wants to merge 50 commits into main

50 commits:
19a7c06
feat(docs): update doc for ipex-llm (#1968)
shane-huang Jul 8, 2024
fc13368
feat(llm): Support for Google Gemini LLMs and Embeddings (#1965)
uw4 Jul 8, 2024
2612928
feat(vectorstore): Add clickhouse support as vectore store (#1883)
Proger666 Jul 8, 2024
067a5f1
feat(docs): Fix setup docu (#1926)
martinzrrl Jul 8, 2024
dde0224
fix(docs): Fix concepts.mdx referencing to installation page (#1779)
mtulio Jul 8, 2024
187bc93
(feat): add github button (#1989)
fern-support Jul 9, 2024
15f73db
docs: update repo links, citations (#1990)
jaluma Jul 9, 2024
01b7ccd
fix(config): make tokenizer optional and include a troubleshooting do…
jaluma Jul 17, 2024
4523a30
feat(docs): update documentation and fix preview-docs (#2000)
jaluma Jul 18, 2024
43cc31f
feat(vectordb): Milvus vector db Integration (#1996)
Jacksonxhx Jul 18, 2024
90d211c
Update README.md (#2003)
imartinez Jul 18, 2024
2c78bb2
docs: add PR and issue templates (#2002)
jaluma Jul 18, 2024
b626697
docs: update welcome page (#2004)
jaluma Jul 18, 2024
05a9862
Add proper param to demo urls (#2007)
imartinez Jul 22, 2024
dabf556
fix: ffmpy dependency (#2020)
jaluma Jul 29, 2024
20bad17
feat(llm): autopull ollama models (#2019)
jaluma Jul 29, 2024
d4375d0
fix(ui): gradio bug fixes (#2021)
jaluma Jul 29, 2024
d080969
added llama3 prompt (#1962)
hirschrobert Jul 29, 2024
65c5a17
chore(docker): dockerfiles improvements and fixes (#1792)
qdm12 Jul 30, 2024
1020cd5
fix: light mode (#2025)
jaluma Jul 31, 2024
e54a8fe
fix: prevent to ingest local files (by default) (#2010)
jaluma Jul 31, 2024
9027d69
feat: make llama3.1 as default (#2022)
jaluma Jul 31, 2024
40638a1
fix: unify embedding models (#2027)
jaluma Jul 31, 2024
8119842
feat(recipe): add our first recipe `Summarize` (#2028)
jaluma Jul 31, 2024
5465958
fix: nomic embeddings (#2030)
jaluma Aug 1, 2024
50b3027
docs: update docs and capture (#2029)
jaluma Aug 1, 2024
cf61bf7
feat(llm): add progress bar when ollama is pulling models (#2031)
jaluma Aug 1, 2024
e44a7f5
chore: bump version (#2033)
jaluma Aug 2, 2024
6674b46
chore(main): release 0.6.0 (#1834)
github-actions[bot] Aug 2, 2024
dae0727
fix(deploy): improve Docker-Compose and quickstart on Docker (#2037)
jaluma Aug 5, 2024
1d4c14d
fix(deploy): generate docker release when new version is released (#2…
jaluma Aug 5, 2024
1c665f7
fix: Adding azopenai to model list (#2035)
itsliamdowd Aug 5, 2024
f09f6dd
fix: add built image from DockerHub (#2042)
jaluma Aug 5, 2024
ca2b8da
chore(main): release 0.6.1 (#2041)
github-actions[bot] Aug 5, 2024
b16abbe
fix: update matplotlib to 3.9.1-post1 to fix win install
jaluma Aug 7, 2024
4ca6d0c
fix: add numpy issue to troubleshooting (#2048)
jaluma Aug 7, 2024
b1acf9d
fix: publish image name (#2043)
jaluma Aug 7, 2024
7fefe40
fix: auto-update version (#2052)
jaluma Aug 8, 2024
22904ca
chore(main): release 0.6.2 (#2049)
github-actions[bot] Aug 8, 2024
89477ea
fix: naming image and ollama-cpu (#2056)
jaluma Aug 12, 2024
7603b36
fix: Rectify ffmpy poetry config; update version from 0.3.2 to 0.4.0 …
arturmartins Aug 21, 2024
4262859
ci: bump actions/checkout to v4 (#2077)
trivikr Sep 9, 2024
77461b9
feat: add retry connection to ollama (#2084)
jaluma Sep 16, 2024
8c12c68
fix: docker permissions (#2059)
jaluma Sep 24, 2024
f9182b3
feat: Adding MistralAI mode (#2065)
itsliamdowd Sep 24, 2024
fa3c306
fix: Add default mode option to settings (#2078)
basicbloke Sep 24, 2024
5fbb402
fix: Sanitize null bytes before ingestion (#2090)
laoqiu233 Sep 25, 2024
5851b02
feat: update llama-index + dependencies (#2092)
jaluma Sep 26, 2024
940bdd4
fix: 503 when private gpt gets ollama service (#2104)
meng-hui Oct 17, 2024
b7ee437
Update README.md
imartinez Nov 13, 2024

Changes from 1 commit:

feat(llm): Support for Google Gemini LLMs and Embeddings (zylon-ai#1965)
* Support for Google Gemini LLMs and Embeddings

Initial support for Gemini: enables usage of Google LLMs and embedding models (see settings-gemini.yaml)

Install via
poetry install --extras "llms-gemini embeddings-gemini"

Notes:
* had to bump llama-index-core to later version that supports Gemini
* poetry --no-update did not work: Gemini/llama_index seem to require more (transitive) updates to make it work...

* fix: crash when gemini is not selected

* docs: add gemini llm

---------

Co-authored-by: Javier Martinez <[email protected]>
uw4 and jaluma authored Jul 8, 2024
commit fc13368bc72d1f4c27644677431420ed77731c03
33 changes: 33 additions & 0 deletions fern/docs/pages/manual/llms.mdx
@@ -199,3 +199,36 @@ Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
For a fully private setup on Intel GPUs (such as a local PC with an iGPU, or discrete GPUs like Arc, Flex, and Max), you can use [IPEX-LLM](https://github.com/intel-analytics/ipex-llm).

To deploy Ollama and pull models using IPEX-LLM, please refer to [this guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html). Then, follow the same steps outlined in the [Using Ollama](#using-ollama) section to create a `settings-ollama.yaml` profile and run the private-GPT server.

### Using Gemini

If you cannot run a local model (for example, because you don't have a GPU) or want to test quickly, you can
run PrivateGPT using Gemini as both the LLM and the embeddings model. You also gain multimodal inputs,
such as text and images, and a very large context window.

In order to do so, create a profile `settings-gemini.yaml` with the following contents:

```yaml
llm:
  mode: gemini

embedding:
  mode: gemini

gemini:
  api_key: <your_gemini_api_key> # You can skip this setting and use the GOOGLE_API_KEY env var instead
  model: <gemini_model_to_use> # Optional. Defaults to "models/gemini-pro"
  embedding_model: <gemini_embeddings_to_use> # Optional. Defaults to "models/embedding-001"
```

Then run PrivateGPT, loading the profile you just created:

`PGPT_PROFILES=gemini make run`

or

`PGPT_PROFILES=gemini poetry run python -m private_gpt`

Once the server has started, it will log *Application startup complete*.
Navigate to http://localhost:8001 to use the Gradio UI or to http://localhost:8001/docs (API section) to try the API.
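
Once the startup log appears, you can also smoke-test the Gemini-backed server from code. A minimal sketch, assuming the default port and PrivateGPT's OpenAI-style `/v1/completions` route (verify the exact schema under `/docs` before relying on it):

```python
import requests

# Hedged example: the endpoint shape follows PrivateGPT's OpenAI-style API;
# check http://localhost:8001/docs for the authoritative field names.
resp = requests.post(
    "http://localhost:8001/v1/completions",
    json={"prompt": "What can you do?"},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```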

332 changes: 275 additions & 57 deletions poetry.lock

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions private_gpt/components/embedding/embedding_component.py
@@ -99,6 +99,20 @@ def __init__(self, settings: Settings) -> None:
                    azure_endpoint=azopenai_settings.azure_endpoint,
                    api_version=azopenai_settings.api_version,
                )
            case "gemini":
                try:
                    from llama_index.embeddings.gemini import (  # type: ignore
                        GeminiEmbedding,
                    )
                except ImportError as e:
                    raise ImportError(
                        "Gemini dependencies not found, install with `poetry install --extras embeddings-gemini`"
                    ) from e

                self.embedding_model = GeminiEmbedding(
                    api_key=settings.gemini.api_key,
                    model_name=settings.gemini.embedding_model,
                )
            case "mock":
                # Not a random number, is the dimensionality used by
                # the default embedding model
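To sanity-check the embedding path outside the component, the same class can be driven directly. A hedged sketch, assuming the `embeddings-gemini` extra is installed (key and input text are illustrative):

```python
from llama_index.embeddings.gemini import GeminiEmbedding

# Illustrative only: mirrors what EmbeddingComponent constructs above.
embedding = GeminiEmbedding(
    api_key="your-api-key",            # or rely on the GOOGLE_API_KEY env var
    model_name="models/embedding-001",
)
vector = embedding.get_text_embedding("PrivateGPT now supports Gemini.")
print(len(vector))  # dimensionality of the returned embedding
```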
13 changes: 13 additions & 0 deletions private_gpt/components/llm/llm_component.py
@@ -190,5 +190,18 @@ def wrapper(*args: Any, **kwargs: Any) -> Any:
                    azure_endpoint=azopenai_settings.azure_endpoint,
                    api_version=azopenai_settings.api_version,
                )
            case "gemini":
                try:
                    from llama_index.llms.gemini import (  # type: ignore
                        Gemini,
                    )
                except ImportError as e:
                    raise ImportError(
                        "Google Gemini dependencies not found, install with `poetry install --extras llms-gemini`"
                    ) from e
                gemini_settings = settings.gemini
                self.llm = Gemini(
                    model_name=gemini_settings.model, api_key=gemini_settings.api_key
                )
            case "mock":
                self.llm = MockLLM()
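
The LLM side can be exercised the same way. A hedged sketch, assuming the `llms-gemini` extra is installed (key is illustrative):

```python
from llama_index.llms.gemini import Gemini

# Illustrative only: mirrors what LLMComponent constructs above.
llm = Gemini(model_name="models/gemini-pro", api_key="your-api-key")
response = llm.complete("Say hello in one word.")  # standard llama-index completion call
print(response.text)
```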
26 changes: 24 additions & 2 deletions private_gpt/settings/settings.py
@@ -82,7 +82,14 @@ class DataSettings(BaseModel):

class LLMSettings(BaseModel):
    mode: Literal[
-        "llamacpp", "openai", "openailike", "azopenai", "sagemaker", "mock", "ollama"
+        "llamacpp",
+        "openai",
+        "openailike",
+        "azopenai",
+        "sagemaker",
+        "mock",
+        "ollama",
+        "gemini",
    ]
    max_new_tokens: int = Field(
        256,
@@ -157,7 +164,9 @@ class HuggingFaceSettings(BaseModel):

class EmbeddingSettings(BaseModel):
-    mode: Literal["huggingface", "openai", "azopenai", "sagemaker", "ollama", "mock"]
+    mode: Literal[
+        "huggingface", "openai", "azopenai", "sagemaker", "ollama", "mock", "gemini"
+    ]
    ingest_mode: Literal["simple", "batch", "parallel", "pipeline"] = Field(
        "simple",
        description=(
@@ -220,6 +229,18 @@ class OpenAISettings(BaseModel):
    )


class GeminiSettings(BaseModel):
    api_key: str
    model: str = Field(
        "models/gemini-pro",
        description="Google Model to use. Example: 'models/gemini-pro'.",
    )
    embedding_model: str = Field(
        "models/embedding-001",
        description="Google Embedding Model to use. Example: 'models/embedding-001'.",
    )


class OllamaSettings(BaseModel):
    api_base: str = Field(
        "http://localhost:11434",
@@ -426,6 +447,7 @@ class Settings(BaseModel):
    huggingface: HuggingFaceSettings
    sagemaker: SagemakerSettings
    openai: OpenAISettings
    gemini: GeminiSettings
    ollama: OllamaSettings
    azopenai: AzureOpenAISettings
    vectorstore: VectorstoreSettings
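
Because the settings classes are pydantic models, a profile only needs to supply `api_key`; the remaining fields fall back to their declared defaults. A small sketch using the `GeminiSettings` definition above (descriptions omitted):

```python
from pydantic import BaseModel, Field

class GeminiSettings(BaseModel):
    api_key: str
    model: str = Field("models/gemini-pro")
    embedding_model: str = Field("models/embedding-001")

# Only api_key is provided; defaults fill in the rest.
settings = GeminiSettings(api_key="test-key")
print(settings.model)            # models/gemini-pro
print(settings.embedding_model)  # models/embedding-001
```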
1 change: 1 addition & 0 deletions private_gpt/ui/ui.py
@@ -444,6 +444,7 @@ def get_model_label() -> str | None:
            "sagemaker": config_settings.sagemaker.llm_endpoint_name,
            "mock": llm_mode,
            "ollama": config_settings.ollama.llm_model,
            "gemini": config_settings.gemini.model,
        }

        if llm_mode not in model_mapping:
7 changes: 7 additions & 0 deletions pyproject.toml
@@ -24,10 +24,12 @@
llama-index-llms-openai = {version = "^0.1.25", optional = true}
llama-index-llms-openai-like = {version ="^0.1.3", optional = true}
llama-index-llms-ollama = {version ="^0.1.5", optional = true}
llama-index-llms-azure-openai = {version ="^0.1.8", optional = true}
llama-index-llms-gemini = {version ="^0.1.11", optional = true}
llama-index-embeddings-ollama = {version ="^0.1.2", optional = true}
llama-index-embeddings-huggingface = {version ="^0.2.2", optional = true}
llama-index-embeddings-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-azure-openai = {version ="^0.1.10", optional = true}
llama-index-embeddings-gemini = {version ="^0.1.8", optional = true}
llama-index-vector-stores-qdrant = {version ="^0.2.10", optional = true}
llama-index-vector-stores-chroma = {version ="^0.1.10", optional = true}
llama-index-vector-stores-postgres = {version ="^0.1.11", optional = true}
@@ -50,6 +52,9 @@
sentence-transformers = {version ="^3.0.1", optional = true}
# Optional UI
gradio = {version ="^4.37.2", optional = true}

# Optional Google Gemini dependency
google-generativeai = {version ="^0.5.4", optional = true}

[tool.poetry.extras]
ui = ["gradio"]
llms-llama-cpp = ["llama-index-llms-llama-cpp"]
Expand All @@ -58,11 +63,13 @@ llms-openai-like = ["llama-index-llms-openai-like"]
llms-ollama = ["llama-index-llms-ollama"]
llms-sagemaker = ["boto3"]
llms-azopenai = ["llama-index-llms-azure-openai"]
llms-gemini = ["llama-index-llms-gemini", "google-generativeai"]
embeddings-ollama = ["llama-index-embeddings-ollama"]
embeddings-huggingface = ["llama-index-embeddings-huggingface"]
embeddings-openai = ["llama-index-embeddings-openai"]
embeddings-sagemaker = ["boto3"]
embeddings-azopenai = ["llama-index-embeddings-azure-openai"]
embeddings-gemini = ["llama-index-embeddings-gemini"]
vector-stores-qdrant = ["llama-index-vector-stores-qdrant"]
vector-stores-chroma = ["llama-index-vector-stores-chroma"]
vector-stores-postgres = ["llama-index-vector-stores-postgres"]
10 changes: 10 additions & 0 deletions settings-gemini.yaml
@@ -0,0 +1,10 @@
llm:
  mode: gemini

embedding:
  mode: gemini

gemini:
  api_key: ${GOOGLE_API_KEY:}
  model: models/gemini-pro
  embedding_model: models/embedding-001
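
The `${GOOGLE_API_KEY:}` value uses PrivateGPT's environment-variable placeholder syntax: the variable name before the colon, an optional default after it (empty here). A simplified sketch of that expansion — illustrative only, not the project's actual settings loader:

```python
import os
import re

# Simplified stand-in for ${VAR:default} expansion in the settings YAML.
PLACEHOLDER = re.compile(r"\$\{(\w+):([^}]*)\}")

def expand(raw: str) -> str:
    return PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), m.group(2)), raw)

print(expand("${GOOGLE_API_KEY:}"))          # value of GOOGLE_API_KEY, or ""
print(expand("${GOOGLE_API_KEY:fallback}"))  # value of GOOGLE_API_KEY, or "fallback"
```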
5 changes: 5 additions & 0 deletions settings.yaml
@@ -113,3 +113,8 @@ azopenai:
  api_version: "2023-05-15"
  embedding_model: text-embedding-ada-002
  llm_model: gpt-35-turbo

gemini:
  api_key: ${GOOGLE_API_KEY:}
  model: models/gemini-pro
  embedding_model: models/embedding-001