readme update

thewriteway · Nov 8, 2024 · e3e9167
1 parent 9468e1a
Showing 1 changed file with 148 additions and 33 deletions.
diff --git a/README.md b/README.md
@@ -2,40 +2,59 @@
 
 # Meme Search app, walkthrough, and demo
 
-Use Python and AI to index your memes by their content and text, making them easily retrievable for your meme warfare pleasures.
+Use AI to index your memes by their content and text, making them easily retrievable for your meme warfare pleasures.
+
+All processing - from image-to-text extraction, to vector embedding, to search - is performed locally.
 
 <p align="center">
-<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme_search.gif" height="325">
+<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme-search-pro-search-example.gif" height="325">
 </p>
 
+This repository contains code, a walkthrough notebook (`meme_search_walkthrough.ipynb`), and apps for indexing, searching, and easily retrieving your memes based on semantic search of their content and text.
+
 A table of contents for the remainder of this README:
 
-- [Introduction](#introduction)
-- [Pipeline overview](#pipeline-overview)
-- [Installation instructions (standard version)](#installation-instructions)
-  - [Start the streamlit server](#start-the-streamlit-server)
-  - [Index your own memes](#index-your-own-memes)
+- [Version overall comparison](#version-overall-comparison)
+- [Meme search - standard version](#meme-search---standard-version)
+
+  - [Features](#features---standard-version)
+  - [Installation instructions](#installation-instructions---standard-version)
+  - [Index your memes](#index-your-memes---standard-version)
+  - [Pipeline overview](#pipeline-overview---standard-version)
+  - [Running tests](#running-tests---standard-version)
+
+- [Meme search - pro version](#meme-search---pro-version)
+
+  - [Features](#features---pro-version)
+  - [Installation instructions](#installation-instructions---pro-version)
+  - [Index your memes](#index-your-memes---pro-version)
+  - [Pipeline overview](#pipeline-overview---pro-version)
+  - [Running tests](#running-tests---pro-version)
+
 - [Changelog](#changelog)
 - [Feature requests and contributing](#feature-requests-and-contributing)
-- [Running tests](#running-tests)
 
-## Introduction
+## Version overall comparison
 
-This repository contains code, a walkthrough notebook (`meme_search_walkthrough.ipynb`), and streamlit demo app for indexing, searching, and easily retrieving your memes based on semantic search of their content and text.
+This repo contains two versions of the meme search app. Both versions cover core meme organization, search, and retrieval; the pro version offers a significantly expanded feature set at the cost of a more complex architecture.
 
-All processing - from image-to-text extraction, to vector embedding, to search - is performed locally.
+1.  **[The standard version](#meme-search---standard-version):** a simple one-page app that contains all the base functionality you need. Simple to install and configure.
 
-## Pipeline overview
+2.  **[The pro version](#meme-search---pro-version):** a multi-page app with enhanced UI and additional features driven by the community - like description editing, meme tagging, and multi-path indexing. Requires a larger memory footprint.
 
-This meme search pipeline is built using the following open source components:
+## Meme search - standard version
 
-- [moondream](https://github.com/vikhyat/moondream): a tiny, kickass vision language model used for image captioning / extracting image text
-- [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2): a very popular text embedding model
-- [faiss](https://github.com/facebookresearch/faiss): a fast and efficient vector db
-- [sqlite](https://sqlite.org/): the greatest database of all time, used for data indexing
-- [streamlit](https://github.com/streamlit/streamlit): for serving up the app
+### Features - standard version
+
+The standard version of meme search is a simple one-page app that allows you to index a directory of memes and recover them via text-based search, as illustrated below.
+
+<p align="center">
+<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme_search.gif" height="325">
+</p>
+
+While not as feature rich as the [pro version of meme search](#meme-search---pro-version), the standard version provides all the base functionality you need to organize and recover your memes. The standard version is also simpler to install and configure, consisting of a single server / docker container.
 
-## Installation instructions (standard version)
+### Installation instructions - standard version
 
 To create a handy tool for your own memes pull the repo and install the requirements file
 
@@ -51,9 +70,7 @@ Alternatively you can install all the requirements you need using docker via the
 docker compose up
 ```
 
-### Start the streamlit server
-
-After indexing your memes you can then start the streamlit app, allowing you to semantically search for and retrieve your memes
+After indexing your memes you can then start the server (a streamlit app), allowing you to semantically search for and retrieve your memes
 
 ```sh
 python -m streamlit run meme_search/app.py
@@ -67,7 +84,7 @@ docker compose up
 
 Note: you can drag and drop any recovered meme directly from the streamlit app to any messaging app of your choice.
 
-### Index your own memes
+### Index your memes - standard version
 
 Place any images / memes you would like indexed for the search app in this repo's subdirectory
 
@@ -103,19 +120,17 @@ You will see printouts at the terminal indicating success of the 3 main stages f
 
 3.  **index**: index the embeddings in [faiss](https://github.com/facebookresearch/faiss), an open source and local vector database, and store the references connecting the embeddings to their images in the greatest little db of all time - [sqlite](https://sqlite.org/)
 
-## Changelog
+### Pipeline overview - standard version
 
-Meme Search is under active development! See the `CHANGELOG.md` in this repo for a record of the most recent changes.
+This meme search pipeline is written in pure Python and is built using the following open source components:
 
-## Feature requests and contributing
-
-Feature requests and contributions are welcome!
-
-See [the discussion section of this repository](https://github.com/neonwatty/meme_search/discussions) for suggested enhancements to contribute to / weight in on!
-
-Please see `CONTRIBUTING.md` for some boilerplate ground rules for contributing.
+- [moondream](https://github.com/vikhyat/moondream): a tiny, kickass vision language model used for image captioning / extracting image text
+- [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2): a very popular text embedding model
+- [faiss](https://github.com/facebookresearch/faiss): a fast and efficient vector db
+- [sqlite](https://sqlite.org/): the greatest database of all time, used for data indexing
+- [streamlit](https://github.com/streamlit/streamlit): for serving up the app
 
-## Running tests
+### Running tests - standard version
 
 Tests can be run by first installing the test requirements as
 
@@ -128,3 +143,103 @@ Then the test suite can be run as
 ```sh
 python -m pytest tests/
 ```
+
+## Meme search - pro version
+
+### Features - pro version
+
+The pro version of meme search builds on the standard version, adding an array of features requested by the community.
+
+<p align="center">
+<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme-search-pro-search-example.gif" height="325">
+<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme-search-pro-edit-example.gif" height="325">
+<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme-search-pro-filters-example.gif" height="325">
+<img align="center" src="https://github.com/jermwatt/readme_gifs/blob/main/meme-search-pro-search-example.gif" height="325">
+</p>
+
+These additional features include:
+
+1.  **Auto-Generate Meme Descriptions**
+
+    Target specific memes for auto-description generation (instead of applying to your entire directory).
+
+2.  **Manual Meme Description Editing**
+
+    Edit or add descriptions manually for better search results. There is no need to wait for auto-generation if you don't want to.
+
+3.  **Tags**
+
+    Create, edit, and assign tags to memes for better organization and search filtering.
+
+4.  **Faster Vector Search**
+
+    Powered by Postgres and pgvector, enjoy faster keyword and vector searches with streamlined database transactions.
+
+5.  **Keyword Search**
+
+    Pro adds traditional keyword search in addition to semantic/vector search.
+
+6.  **Directory Paths**
+
+    Organize your memes across multiple subdirectories—no need to store everything in one folder.
+
+7.  **New Organizational Tools**
+
+    Filter by tags, directory paths, and description embeddings, plus toggle between keyword and vector search for more control.
+
+### Installation instructions - pro version
+
+To start up the pro version of meme search pull this repository and start the server cluster with docker-compose
+
+```sh
+docker compose -f docker-compose-pro.yml up
+```
+
+This pulls and starts containers for the app, database, and auto description generator. The app itself will run on port `3000` and is available at
+
+```sh
+http://localhost:3000
+```
+
+To start the app alone, pull the repo and cd into the `meme_search/meme_search_pro/meme_search_app` directory. Once there, execute the following to start the app in development mode
+
+```sh
+./bin/dev
+```
+
+When doing this ensure you have an available Postgres instance running locally on port `5432`.
+
+### Index your memes - pro version
+
+With the pro version you can index your memes by creating your own descriptions, or by generating descriptions automatically, as illustrated below.
+
+### Pipeline overview - pro version
+
+The pro version pipeline contains many of the [components of the standard version](#pipeline-overview---standard-version), with some variations and several additional components.
+
+- the app - along with its enhanced features - is built using [Ruby on Rails](https://rubyonrails.org/)
+- a Ruby version of the same embedding model is used in place of the Pythonic version
+- a single Postgres database is used in place of the duo used with the standard version
+- the auto generator is isolated in its own image / container to allow for better maintenance, queueing, and cancellation
+
+### Running tests - pro version
+
+To run tests locally, pull the repo and cd into the `meme_search/meme_search_pro/meme_search_app` directory. Once there, tests can be executed with
+
+```sh
+rails test test/system
+```
+
+When doing this ensure you have an available Postgres instance running locally on port `5432`.
+
+## Changelog
+
+Meme Search is under active development! See the `CHANGELOG.md` in this repo for a record of the most recent changes.
+
+## Feature requests and contributing
+
+Feature requests and contributions are welcome!
+
+See [the discussion section of this repository](https://github.com/neonwatty/meme_search/discussions) for suggested enhancements to contribute to / weigh in on!
+
+Please see `CONTRIBUTING.md` for some boilerplate ground rules for contributing.
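
For orientation, the standard-version pipeline this README describes (description text is embedded, the vectors are indexed, and sqlite maps each vector back to its image) can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions, not the repository's code: `embed` is a toy word-count stand-in for all-MiniLM-L6-v2, a brute-force cosine scan stands in for faiss, and the table name, function names, and example paths are all hypothetical.

```python
import math
import sqlite3
from collections import Counter


def embed(text):
    # ASSUMPTION: toy word-count "embedding", standing in for a real
    # text embedding model such as all-MiniLM-L6-v2.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(descriptions):
    # descriptions: dict mapping an image path to its description text.
    # sqlite keeps the path for each row; the in-memory list keeps
    # (row id, vector) pairs. ASSUMPTION: hypothetical schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE memes (id INTEGER PRIMARY KEY, path TEXT, description TEXT)"
    )
    index = []
    for path, text in descriptions.items():
        cur = conn.execute(
            "INSERT INTO memes (path, description) VALUES (?, ?)", (path, text)
        )
        index.append((cur.lastrowid, embed(text)))
    conn.commit()
    return conn, index


def search(conn, index, query, k=1):
    # Brute-force nearest neighbours (a vector db would do this step),
    # then resolve the winning row ids back to image paths via sqlite.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    paths = []
    for row_id, _ in ranked[:k]:
        row = conn.execute("SELECT path FROM memes WHERE id = ?", (row_id,)).fetchone()
        paths.append(row[0])
    return paths


conn, index = build_index({
    "memes/cat.jpg": "a grumpy cat saying no",
    "memes/dog.jpg": "a dog sitting in a burning room saying this is fine",
})
print(search(conn, index, "this is fine dog"))  # ['memes/dog.jpg']
```

Keeping the vectors and the path lookup separate mirrors the two-store split the README describes for the standard version; the pro version folds both into a single Postgres database.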