feat: MemoryVectorStore using mathjs cos (langchain-ai#753)
* feat: MemoryVectorStore using mathjs cos

* feat: lint and export inmemoryvectorstore

* Replace dependency, make it an int test, add entrypoint

* Add docs

---------

Co-authored-by: Nuno Campos <[email protected]>
jacobrosenthal and nfcampos authored Apr 12, 2023
1 parent b5edb5c commit b3fe15c
Showing 19 changed files with 308 additions and 1 deletion.
3 changes: 2 additions & 1 deletion docs/docs/modules/indexes/vector_stores/index.mdx
@@ -83,7 +83,8 @@ abstract class BaseVectorStore implements VectorStore {

Here's a quick guide to help you pick the right vector store for your use case:

- If you're after something that can just run inside your application, in-memory, without any other servers to stand up, then go for [HNSWLib](./integrations/hnswlib)
- If you're after something that can just run inside your Node.js application, in-memory, without any other servers to stand up, then go for [HNSWLib](./integrations/hnswlib)
- If you're looking for something that can run in-memory in browser-like environments, then go for [MemoryVectorStore](./integrations/memory)
- If you come from Python and you were looking for something similar to FAISS, pick [HNSWLib](./integrations/hnswlib)
- If you're looking for an open-source full-featured vector database that you can run locally in a docker container, then go for [Chroma](./integrations/chroma)
- If you're using Supabase already then look at the [Supabase](./integrations/supabase) vector store to use the same Postgres database for your embeddings too
31 changes: 31 additions & 0 deletions docs/docs/modules/indexes/vector_stores/integrations/memory.mdx
@@ -0,0 +1,31 @@
---
hide_table_of_contents: true
sidebar_label: Memory
sidebar_position: 1
---

import CodeBlock from "@theme/CodeBlock";

# `MemoryVectorStore`

MemoryVectorStore is an ephemeral vector store that keeps embeddings in memory and performs an exact, linear search for the most similar embeddings. The default similarity metric is cosine similarity, but it can be changed to any of the similarity metrics supported by [ml-distance](https://mljs.github.io/distance/modules/similarity.html).

## Usage

### Create a new index from texts

import ExampleTexts from "@examples/indexes/vector_stores/memory.ts";

<CodeBlock language="typescript">{ExampleTexts}</CodeBlock>

### Create a new index from a loader

import ExampleLoader from "@examples/indexes/vector_stores/memory_fromdocs.ts";

<CodeBlock language="typescript">{ExampleLoader}</CodeBlock>

### Use a custom similarity metric

import ExampleCustom from "@examples/indexes/vector_stores/memory_custom_similarity.ts";

<CodeBlock language="typescript">{ExampleCustom}</CodeBlock>
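To illustrate what an exact, linear search with cosine similarity amounts to, here is a minimal, self-contained sketch. It is not the library's code: `cosineSimilarity`, `topK`, and the sample vectors are hypothetical names and data introduced for this example, and scoring a zero vector as 0 is an assumption of the sketch.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption of this sketch: a zero vector is scored as 0 instead of NaN.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact linear search: score every stored vector, sort descending, keep the first k.
function topK(
  query: number[],
  storedVectors: number[][],
  k: number
): { index: number; score: number }[] {
  return storedVectors
    .map((vector, index) => ({ index, score: cosineSimilarity(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical two-dimensional "embeddings".
const storedVectors = [
  [1, 0],
  [0, 1],
  [1, 1],
];
const hits = topK([1, 0], storedVectors, 2);
console.log(hits);
```

Every query touches every stored vector, so the cost grows linearly with the number of embeddings. That is the trade-off for getting exact results without building an index.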
1 change: 1 addition & 0 deletions examples/package.json
@@ -30,6 +30,7 @@
"chromadb": "^1.3.0",
"js-yaml": "^4.1.0",
"langchain": "workspace:*",
"ml-distance": "^4.0.0",
"prisma": "^4.11.0",
"sqlite3": "^5.1.4",
"typeorm": "^0.3.12",
13 changes: 13 additions & 0 deletions examples/src/indexes/vector_stores/memory.ts
@@ -0,0 +1,13 @@
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

export const run = async () => {
  const vectorStore = await MemoryVectorStore.fromTexts(
    ["Hello world", "Bye bye", "hello nice world"],
    [{ id: 2 }, { id: 1 }, { id: 3 }],
    new OpenAIEmbeddings()
  );

  const resultOne = await vectorStore.similaritySearch("hello world", 1);
  console.log(resultOne);
};
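The `fromTexts` call above pairs each text with a metadata entry. The sketch below shows that pairing in isolation, assuming (as the `fromTexts` implementation in this commit does) that the metadata argument is either an array aligned with the texts or a single object shared by all of them. `SimpleDocument` and `pairTextsWithMetadata` are hypothetical names used only here, not part of the library.

```typescript
type Metadata = Record<string, unknown>;

interface SimpleDocument {
  pageContent: string;
  metadata: Metadata;
}

// Pair each text with its metadata: index-aligned when an array is given,
// otherwise the same object is attached to every document.
function pairTextsWithMetadata(
  texts: string[],
  metadatas: Metadata[] | Metadata
): SimpleDocument[] {
  return texts.map((pageContent, i) => ({
    pageContent,
    metadata: Array.isArray(metadatas) ? metadatas[i] : metadatas,
  }));
}

const alignedDocs = pairTextsWithMetadata(
  ["Hello world", "Bye bye", "hello nice world"],
  [{ id: 2 }, { id: 1 }, { id: 3 }]
);
const sharedDocs = pairTextsWithMetadata(["Hello world", "Bye bye"], {
  source: "example",
});
console.log(alignedDocs, sharedDocs);
```

In this sketch a metadata array shorter than the list of texts leaves the remaining documents with `undefined` metadata, so the two arrays are best kept the same length.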
15 changes: 15 additions & 0 deletions examples/src/indexes/vector_stores/memory_custom_similarity.ts
@@ -0,0 +1,15 @@
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { similarity } from "ml-distance";

export const run = async () => {
  const vectorStore = await MemoryVectorStore.fromTexts(
    ["Hello world", "Bye bye", "hello nice world"],
    [{ id: 2 }, { id: 1 }, { id: 3 }],
    new OpenAIEmbeddings(),
    { similarity: similarity.pearson }
  );

  const resultOne = await vectorStore.similaritySearch("hello world", 1);
  console.log(resultOne);
};
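The choice of metric can change which vector ranks first. The following self-contained sketch compares cosine similarity with a similarity derived from Euclidean distance. It assumes that a similarity function takes two number arrays and returns a number where higher means more similar. `SimilarityFn`, `euclideanSimilarity`, `mostSimilar`, and the sample vectors are hypothetical and are not taken from ml-distance.

```typescript
type SimilarityFn = (a: number[], b: number[]) => number;

// Cosine similarity: compares direction only, ignoring magnitude.
const cosine: SimilarityFn = (a, b) => {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
};

// Hypothetical metric: maps Euclidean distance d to 1 / (1 + d),
// so identical vectors score 1 and distant vectors approach 0.
const euclideanSimilarity: SimilarityFn = (a, b) => {
  let sum = 0;
  for (let i = 0; i < a.length; i += 1) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return 1 / (1 + Math.sqrt(sum));
};

// Return the index of the vector with the highest similarity to the query.
function mostSimilar(
  query: number[],
  vectors: number[][],
  similarity: SimilarityFn
): number {
  let best = -1;
  let bestScore = -Infinity;
  vectors.forEach((vector, index) => {
    const score = similarity(query, vector);
    if (score > bestScore) {
      bestScore = score;
      best = index;
    }
  });
  return best;
}

const candidates = [
  [10, 0], // same direction as the query, but far away
  [1, 0.5], // slightly different direction, but close by
];
const byCosine = mostSimilar([1, 0], candidates, cosine);
const byEuclidean = mostSimilar([1, 0], candidates, euclideanSimilarity);
console.log({ byCosine, byEuclidean });
```

Cosine similarity picks the first candidate because it points in the same direction as the query, while the distance-based metric picks the second because it is nearer.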
22 changes: 22 additions & 0 deletions examples/src/indexes/vector_stores/memory_fromdocs.ts
@@ -0,0 +1,22 @@
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { TextLoader } from "langchain/document_loaders/fs/text";

export const run = async () => {
  // Create docs with a loader
  const loader = new TextLoader(
    "src/document_loaders/example_data/example.txt"
  );
  const docs = await loader.load();

  // Load the docs into the vector store
  const vectorStore = await MemoryVectorStore.fromDocuments(
    docs,
    new OpenAIEmbeddings()
  );

  // Search for the most similar document
  const resultOne = await vectorStore.similaritySearch("hello world", 1);

  console.log(resultOne);
};
3 changes: 3 additions & 0 deletions langchain/.gitignore
@@ -67,6 +67,9 @@ vectorstores.d.ts
vectorstores/base.cjs
vectorstores/base.js
vectorstores/base.d.ts
vectorstores/memory.cjs
vectorstores/memory.js
vectorstores/memory.d.ts
vectorstores/chroma.cjs
vectorstores/chroma.js
vectorstores/chroma.d.ts
9 changes: 9 additions & 0 deletions langchain/package.json
@@ -79,6 +79,9 @@
"vectorstores/base.cjs",
"vectorstores/base.js",
"vectorstores/base.d.ts",
"vectorstores/memory.cjs",
"vectorstores/memory.js",
"vectorstores/memory.d.ts",
"vectorstores/chroma.cjs",
"vectorstores/chroma.js",
"vectorstores/chroma.d.ts",
@@ -389,6 +392,7 @@
"expr-eval": "^2.0.2",
"flat": "^5.0.2",
"jsonpointer": "^5.0.1",
"ml-distance": "^4.0.0",
"object-hash": "^3.0.0",
"openai": "^3.2.0",
"p-queue": "^6.6.2",
@@ -541,6 +545,11 @@
"import": "./vectorstores/base.js",
"require": "./vectorstores/base.cjs"
},
"./vectorstores/memory": {
"types": "./vectorstores/memory.d.ts",
"import": "./vectorstores/memory.js",
"require": "./vectorstores/memory.cjs"
},
"./vectorstores/chroma": {
"types": "./vectorstores/chroma.d.ts",
"import": "./vectorstores/chroma.js",
1 change: 1 addition & 0 deletions langchain/scripts/create-entrypoints.js
@@ -38,6 +38,7 @@ const entrypoints = {
// vectorstores
vectorstores: "vectorstores/index",
"vectorstores/base": "vectorstores/base",
"vectorstores/memory": "vectorstores/memory",
"vectorstores/chroma": "vectorstores/chroma",
"vectorstores/hnswlib": "vectorstores/hnswlib",
"vectorstores/pinecone": "vectorstores/pinecone",
107 changes: 107 additions & 0 deletions langchain/src/vectorstores/memory.ts
@@ -0,0 +1,107 @@
import { similarity as ml_distance_similarity } from "ml-distance";
import { VectorStore } from "./base.js";
import { Embeddings } from "../embeddings/base.js";
import { Document } from "../document.js";

interface MemoryVector {
  content: string;
  embedding: number[];
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  metadata: Record<string, any>;
}

export interface MemoryVectorStoreArgs {
  similarity?: typeof ml_distance_similarity.cosine;
}

export class MemoryVectorStore extends VectorStore {
  memoryVectors: MemoryVector[] = [];

  similarity: typeof ml_distance_similarity.cosine;

  constructor(
    embeddings: Embeddings,
    { similarity, ...rest }: MemoryVectorStoreArgs = {}
  ) {
    super(embeddings, rest);

    this.similarity = similarity ?? ml_distance_similarity.cosine;
  }

  async addDocuments(documents: Document[]): Promise<void> {
    const texts = documents.map(({ pageContent }) => pageContent);
    return this.addVectors(
      await this.embeddings.embedDocuments(texts),
      documents
    );
  }

  async addVectors(vectors: number[][], documents: Document[]): Promise<void> {
    const memoryVectors = vectors.map((embedding, idx) => ({
      content: documents[idx].pageContent,
      embedding,
      metadata: documents[idx].metadata,
    }));

    this.memoryVectors = this.memoryVectors.concat(memoryVectors);
  }

  async similaritySearchVectorWithScore(
    query: number[],
    k: number
  ): Promise<[Document, number][]> {
    const searches = this.memoryVectors
      .map((vector, index) => ({
        similarity: this.similarity(query, vector.embedding),
        index,
      }))
      .sort((a, b) => b.similarity - a.similarity)
      .slice(0, k);

    const result: [Document, number][] = searches.map((search) => [
      new Document({
        metadata: this.memoryVectors[search.index].metadata,
        pageContent: this.memoryVectors[search.index].content,
      }),
      search.similarity,
    ]);

    return result;
  }

  static async fromTexts(
    texts: string[],
    metadatas: object[] | object,
    embeddings: Embeddings,
    dbConfig?: MemoryVectorStoreArgs
  ): Promise<MemoryVectorStore> {
    const docs: Document[] = [];
    for (let i = 0; i < texts.length; i += 1) {
      const metadata = Array.isArray(metadatas) ? metadatas[i] : metadatas;
      const newDoc = new Document({
        pageContent: texts[i],
        metadata,
      });
      docs.push(newDoc);
    }
    return MemoryVectorStore.fromDocuments(docs, embeddings, dbConfig);
  }

  static async fromDocuments(
    docs: Document[],
    embeddings: Embeddings,
    dbConfig?: MemoryVectorStoreArgs
  ): Promise<MemoryVectorStore> {
    const instance = new this(embeddings, dbConfig);
    await instance.addDocuments(docs);
    return instance;
  }

  static async fromExistingIndex(
    embeddings: Embeddings,
    dbConfig?: MemoryVectorStoreArgs
  ): Promise<MemoryVectorStore> {
    const instance = new this(embeddings, dbConfig);
    return instance;
  }
}
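The ranking step in `similaritySearchVectorWithScore` depends on the comparator handed to `Array.prototype.sort`. A comparator is expected to return a negative number, zero, or a positive number, and to give opposite signs when its arguments are swapped. The sketch below contrasts a subtraction-based comparator with one that only ever returns -1 or 0. `Scored`, `bySimilarityDesc`, `minusOneOrZero`, and the sample scores are hypothetical names and data for illustration.

```typescript
interface Scored {
  index: number;
  similarity: number;
}

// Descending by similarity: negative when `a` should come first,
// positive when `b` should come first, zero on ties.
const bySimilarityDesc = (a: Scored, b: Scored): number =>
  b.similarity - a.similarity;

// A comparator that never returns a positive value. When `a` is smaller
// than `b` it reports "equal", so the ordering it describes is inconsistent.
const minusOneOrZero = (a: Scored, b: Scored): number =>
  a.similarity > b.similarity ? -1 : 0;

const low: Scored = { index: 0, similarity: 0.2 };
const high: Scored = { index: 1, similarity: 0.9 };

// Swapping the arguments flips the sign for the subtraction comparator.
console.log(bySimilarityDesc(low, high) > 0, bySimilarityDesc(high, low) < 0);
// The other comparator answers 0 ("equal") and -1 for the same pair.
console.log(minusOneOrZero(low, high), minusOneOrZero(high, low));

const scores: Scored[] = [low, high, { index: 2, similarity: 0.5 }];
// Copy before sorting so the input array is left untouched.
const ranked = scores.slice().sort(bySimilarityDesc);
console.log(ranked.map((entry) => entry.index));
```

With an inconsistent comparator the resulting order is implementation-defined, so it may happen to look correct on one engine and differ on another. The subtraction form avoids relying on that.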
28 changes: 28 additions & 0 deletions langchain/src/vectorstores/tests/memory.int.test.ts
@@ -0,0 +1,28 @@
import { test, expect } from "@jest/globals";

import { OpenAIEmbeddings } from "../../embeddings/openai.js";
import { Document } from "../../document.js";
import { MemoryVectorStore } from "../memory.js";

test("MemoryVectorStore with external ids", async () => {
  const embeddings = new OpenAIEmbeddings();

  const store = new MemoryVectorStore(embeddings);

  expect(store).toBeDefined();

  await store.addDocuments([
    { pageContent: "hello", metadata: { a: 1 } },
    { pageContent: "hi", metadata: { a: 1 } },
    { pageContent: "bye", metadata: { a: 1 } },
    { pageContent: "what's this", metadata: { a: 1 } },
  ]);

  const results = await store.similaritySearch("hello", 1);

  expect(results).toHaveLength(1);

  expect(results).toEqual([
    new Document({ metadata: { a: 1 }, pageContent: "hello" }),
  ]);
});
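The test above is an integration test because it calls the OpenAI embeddings API. As a rough sketch of how the same add-then-search flow could be exercised offline, the example below swaps in a deterministic letter-count embedding. Everything in it (`ToyDocument`, `toyEmbed`, `toyCosine`, `ToyMemoryStore`) is a hypothetical stand-in written for this illustration and does not use the library's classes.

```typescript
interface ToyDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical embedding: a 26-dimensional vector of letter counts (a to z).
function toyEmbed(text: string): number[] {
  const counts: number[] = [];
  for (let i = 0; i < 26; i += 1) counts.push(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i += 1) {
    const slot = lower.charCodeAt(i) - 97; // 97 is the char code of "a"
    if (slot >= 0 && slot < 26) counts[slot] += 1;
  }
  return counts;
}

function toyCosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Minimal in-memory store: embed on insert, linear scan on search.
class ToyMemoryStore {
  private entries: { doc: ToyDocument; embedding: number[] }[] = [];

  add(docs: ToyDocument[]): void {
    docs.forEach((doc) => {
      this.entries.push({ doc, embedding: toyEmbed(doc.pageContent) });
    });
  }

  search(query: string, k: number): ToyDocument[] {
    const queryEmbedding = toyEmbed(query);
    return this.entries
      .map((entry) => ({
        doc: entry.doc,
        score: toyCosine(queryEmbedding, entry.embedding),
      }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k)
      .map((entry) => entry.doc);
  }
}

const toyStore = new ToyMemoryStore();
toyStore.add([
  { pageContent: "hello", metadata: { a: 1 } },
  { pageContent: "hi", metadata: { a: 1 } },
  { pageContent: "bye", metadata: { a: 1 } },
  { pageContent: "what's this", metadata: { a: 1 } },
]);
const toyResults = toyStore.search("hello", 1);
console.log(toyResults);
```

A letter-count embedding carries no semantic meaning, so a sketch like this only checks the storage and ranking logic, not retrieval quality.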
1 change: 1 addition & 0 deletions langchain/tsconfig.json
@@ -54,6 +54,7 @@
"src/prompts/index.ts",
"src/prompts/load.ts",
"src/vectorstores/base.ts",
"src/vectorstores/memory.ts",
"src/vectorstores/chroma.ts",
"src/vectorstores/hnswlib.ts",
"src/vectorstores/pinecone.ts",
1 change: 1 addition & 0 deletions test-exports-cf/src/entrypoints.js
@@ -9,6 +9,7 @@ export * from "langchain/llms/base";
export * from "langchain/llms/openai";
export * from "langchain/prompts";
export * from "langchain/vectorstores/base";
export * from "langchain/vectorstores/memory";
export * from "langchain/vectorstores/prisma";
export * from "langchain/text_splitter";
export * from "langchain/memory";
1 change: 1 addition & 0 deletions test-exports-cjs/src/entrypoints.js
@@ -9,6 +9,7 @@ const llms_base = require("langchain/llms/base");
const llms_openai = require("langchain/llms/openai");
const prompts = require("langchain/prompts");
const vectorstores_base = require("langchain/vectorstores/base");
const vectorstores_memory = require("langchain/vectorstores/memory");
const vectorstores_prisma = require("langchain/vectorstores/prisma");
const text_splitter = require("langchain/text_splitter");
const memory = require("langchain/memory");
1 change: 1 addition & 0 deletions test-exports-cra/src/entrypoints.js
@@ -9,6 +9,7 @@ export * from "langchain/llms/base";
export * from "langchain/llms/openai";
export * from "langchain/prompts";
export * from "langchain/vectorstores/base";
export * from "langchain/vectorstores/memory";
export * from "langchain/vectorstores/prisma";
export * from "langchain/text_splitter";
export * from "langchain/memory";
1 change: 1 addition & 0 deletions test-exports-esm/src/entrypoints.js
@@ -9,6 +9,7 @@ import * as llms_base from "langchain/llms/base";
import * as llms_openai from "langchain/llms/openai";
import * as prompts from "langchain/prompts";
import * as vectorstores_base from "langchain/vectorstores/base";
import * as vectorstores_memory from "langchain/vectorstores/memory";
import * as vectorstores_prisma from "langchain/vectorstores/prisma";
import * as text_splitter from "langchain/text_splitter";
import * as memory from "langchain/memory";
1 change: 1 addition & 0 deletions test-exports-vercel/src/entrypoints.js
@@ -9,6 +9,7 @@ export * from "langchain/llms/base";
export * from "langchain/llms/openai";
export * from "langchain/prompts";
export * from "langchain/vectorstores/base";
export * from "langchain/vectorstores/memory";
export * from "langchain/vectorstores/prisma";
export * from "langchain/text_splitter";
export * from "langchain/memory";
1 change: 1 addition & 0 deletions test-exports-vite/src/entrypoints.js
@@ -9,6 +9,7 @@ export * from "langchain/llms/base";
export * from "langchain/llms/openai";
export * from "langchain/prompts";
export * from "langchain/vectorstores/base";
export * from "langchain/vectorstores/memory";
export * from "langchain/vectorstores/prisma";
export * from "langchain/text_splitter";
export * from "langchain/memory";