[DOCS] Add full-text search overview #119462

Merged

leemthompo merged 11 commits into elastic:main from leemthompo:full-text-search

Jan 6, 2025

Contributor

leemthompo commented Jan 2, 2025

👁️ URL preview

Adds a full-text search overview/entrypoint in "Search your data". Eventually, with additional nesting capabilities, the Text analysis section should be a child of this.

Summary of changes:

full-text-search.asciidoc: Adds new section on full-text search:
- Explains core concepts and workflow
- Details the components: text analysis, inverted index, relevance scoring
- Copious sprinkling of links to related material
tokenizers.asciidoc: Clarifies the distinction between Elasticsearch's linguistic tokenization and neural tokenizers, because tokenizers in an ML context are different beasts.
full-text-search-overview.svg: Adds a new SVG diagram to help visualize the full-text search workflow in Elasticsearch.
analysis.asciidoc: Adds link to new FTS section
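
For readers skimming this thread, the three components named above (text analysis, inverted index, relevance scoring) can be sketched in a few lines of Python. This is a toy illustration only, not how Elasticsearch is implemented: the function names `analyze`, `build_index`, and `search` are hypothetical, the stop-word list is made up, and the scoring is a simplified TF-IDF sum standing in for BM25, Elasticsearch's default similarity.

```python
import math
import re
from collections import Counter, defaultdict


def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, drop a few stop words."""
    stop_words = {"a", "an", "the", "of", "to", "and"}  # illustrative list only
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]


def build_index(docs):
    """Build an inverted index mapping term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(analyze(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by a simplified TF-IDF sum (a stand-in for BM25)."""
    scores = defaultdict(float)
    for term in analyze(query):  # the query goes through the same analysis
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    1: "The quick brown fox",
    2: "Quick search of the index",
    3: "Searching text with an inverted index",
}
index = build_index(docs)
results = search(index, len(docs), "quick index")
print([doc_id for doc_id, _ in results])
```

Document 2 ranks first because it matches both query terms. Note that the toy analyzer does no stemming, so "Searching" in document 3 would not match a query for "search"; closing that gap is the job of the analysis chain.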

leemthompo added 2 commits

January 2, 2025 15:25


          [DOCS] Add full-text search overview

0ea93d2


          Typofix

f38e6bc

leemthompo added >docs auto-backport v8.16.0 v8.17.0 v8.18.0 labels

Contributor

github-actions bot commented Jan 2, 2025

Documentation preview:

✨ Changed pages

elasticsearchmachine added v9.0.0 Team:Docs labels

Collaborator

elasticsearchmachine commented Jan 2, 2025

Pinging @elastic/es-docs (Team:Docs)

leemthompo self-assigned this

leemthompo added 2 commits

January 2, 2025 15:34


          Add link

12c9de0


          Add link

1971c5e

leemthompo requested a review from shainaraskas

January 2, 2025 14:47

leemthompo added 2 commits

January 2, 2025 17:59


          Add another link

3725cb0


          Update intro.asciidoc

3184e2f

shainaraskas reviewed

docs/reference/analysis/tokenizers.asciidoc Outdated (resolved)

docs/reference/analysis/tokenizers.asciidoc Outdated (resolved)

docs/reference/analysis/tokenizers.asciidoc (resolved)

docs/reference/analysis/tokenizers.asciidoc Outdated

+              ====
+              {es}'s text analysis produces meaningful _linguistic_ tokens (like words and phrases) optimized for search relevance scoring.
+              This differs from neural tokenizers, which break text into smaller subword units and numerical vectors for machine learning models.
+              For example, "searching" becomes the searchable word token "search" in {es}, while a neural tokenizer might split it into ["sea", "##rch", "##ing"] for model consumption.

Contributor

shainaraskas Jan 2, 2025

these ## rendered into a highlight. not sure what your intent was here but you might have to escape the chars

Contributor Author

leemthompo Jan 3, 2025

need to use backticks

Contributor Author

leemthompo Jan 6, 2025

Removing the example as unnecessary detail
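
For context on the example discussed in this thread, the contrast between a linguistic token and subword pieces can be sketched as follows. This is a hedged toy: `linguistic_token` is a hypothetical one-rule suffix stripper standing in for a real stemmer, `wordpiece` is a minimal greedy longest-match splitter in the WordPiece style, and `toy_vocab` is invented so that the split matches the one quoted in the diff. Real analyzers and real model vocabularies behave differently.

```python
def linguistic_token(word):
    """Toy stemmer: lowercase and strip a trailing 'ing' (real stemmers are far richer)."""
    word = word.lower()
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    return word


def wordpiece(word, vocab):
    """Greedy longest-match-first subword split, WordPiece style.

    Pieces after the first carry a '##' prefix. Returns ['[UNK]'] when
    some part of the word cannot be covered by the vocabulary.
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


# Hypothetical vocabulary chosen to reproduce the split quoted in the diff.
toy_vocab = {"sea", "##rch", "##ing"}

print(linguistic_token("searching"))      # search
print(wordpiece("searching", toy_vocab))  # ['sea', '##rch', '##ing']
```

The first output is a whole word that can be matched against an inverted index; the second is a sequence of pieces meant to be mapped to ids for a model.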

shainaraskas reviewed

Contributor

shainaraskas left a comment

I really love this - it's so great at explaining the landscape and giving people the confidence to implement ft search.

wonder if we should update the quickstart w/ a link to this new overview as well (in the intro and also in "learn more").

docs/reference/images/search/full-text-search-overview.svg Outdated

Contributor

shainaraskas Jan 2, 2025

I think this diagram is very helpful, but it needs to be polished up so the text placement is more consistent / there's consistent padding in the cells. we could prob leverage the figma auto-layout tools for this.

We could also consider paring back colors that don't add a lot of meaning - I'd suggest doing greyscale for most of these and then maybe using a different shape for search results

Contributor Author

leemthompo Jan 6, 2025

💯

Not working in Figma because I'm visually illiterate but will try to fix those color/layout issues

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated

+              Built on decades of information retrieval research, full-text search in {es} is a compute-efficient, deterministic approach that scales predictably with data volume.
+              Full-text search is the cornerstone of production-grade search solutions.
+              Combine full-text search with <<semantic-search,semantic search using vectors>> to build modern hybrid search applications.

Contributor

shainaraskas Jan 2, 2025

Suggested change

-              Combine full-text search with <<semantic-search,semantic search using vectors>> to build modern hybrid search applications.
+              You can combine full-text search with <<semantic-search,semantic search using vectors>> to build modern hybrid search applications.

Contributor Author

leemthompo Jan 6, 2025

This was an ORDER!

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated

+              Documents and search queries are transformed to enable returning https://www.elastic.co/what-is/search-relevance[relevant] results instead of simply exact term matches.
+              Fields of type <<text-field-type,`text`>> are analyzed and indexed for full-text search.
+              Built on decades of information retrieval research, full-text search in {es} is a compute-efficient, deterministic approach that scales predictably with data volume.

Contributor

shainaraskas Jan 2, 2025

This sentence is pretty dense - deterministic is doing a lot of heavy lifting here. Can we be more explicit about the benefits, or alternatively, weigh the value of this sentence to the reader?

Contributor Author

leemthompo Jan 6, 2025

agree, rewording/reshaping

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated (resolved)

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated (resolved)

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated (resolved)

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated (resolved)

docs/reference/search/search-your-data/full-text-search.asciidoc

+              [discrete]
+              [[full-text-search-learn-more]]
+              === Learn more

Contributor

shainaraskas Jan 2, 2025

this section works as a v1 but it might be nice to guide people through what resources we want them to check out next, or help them to understand the context of a topic (e.g. "To learn how to optimize the relevance of your search results, refer to <<Search relevance optimizations>>")

would also consider pulling out the "get started" into its own CTA - it's the most important thing people should be looking at next. I'm also curious to know if there's a resource we can provide to move this into a prod world (guess that would be explained in our references to API clients)

Contributor Author

leemthompo Jan 6, 2025

Adding some more context.

I hinted at the prod world in the intro paragraph revision, to make the compute efficiency wording concrete, with a link to the moving to production section.

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated (resolved)

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated (resolved)

leemthompo added 4 commits

January 6, 2025 11:22


          Update diagram

d2adc5d


          Clarify tokenization disambiguation

bd4d131


          Edits per review

2f5152c


          Fix link, formatting, add link in quickstart

c334a3b

Contributor Author

leemthompo commented Jan 6, 2025

Thanks @shainaraskas! I think I've addressed most things

shainaraskas approved these changes

Contributor

shainaraskas left a comment

one final piece of feedback, then ready to go! that diagram is looking super fly now 💅

docs/reference/search/search-your-data/full-text-search.asciidoc Outdated

-              Combine full-text search with <<semantic-search,semantic search using vectors>> to build modern hybrid search applications.
+              Built on decades of information retrieval research, full-text search delivers reliable results that scale predictably as your data grows. Because it runs efficiently on CPUs, {es}'s full-text search requires minimal computational resources compared to GPU-intensive vector operations.
+              This translates to lower infrastructure costs and predictable scaling requirements. You can scale horizontally by adding more nodes with standard CPU cores and RAM - no specialized hardware needed. A typical deployment will start with 2-3 nodes and grow incrementally as search volume increases. Learn more about <<scalability, moving to production>>.

Contributor

shainaraskas Jan 6, 2025

I think this is a bit of a red herring in this doc. I'd just remove the whole paragraph. it also sends the wrong signals to people on serverless who use ft search (the paragraph immediately before it also has references to hardware but I'm less concerned about it because it mostly just sells that this is a performant design)

when I mentioned prod in this context, I mostly meant the idea of making these calls from an app or site (this comment likely also a red herring)

Contributor Author

leemthompo Jan 6, 2025

hmmm yes good point about serverless and the basic message is clear in preceding paragraph anyways


          Remove superfluous red herring paragraph

9eb6984

leemthompo enabled auto-merge (squash)

January 6, 2025 17:38

leemthompo merged commit c7b61bd into elastic:main

4 of 5 checks passed

This was referenced Jan 6, 2025

[8.16] [DOCS] Add full-text search overview (#119462) #119605

Merged

[8.17] [DOCS] Add full-text search overview (#119462) #119606

Merged

leemthompo mentioned this pull request

[8.x] [DOCS] Add full-text search overview (#119462) #119607

Merged

Collaborator

elasticsearchmachine commented Jan 6, 2025

💚 Backport successful

Status	Branch	Result
✅	8.16
✅	8.17
✅	8.x

leemthompo added a commit to leemthompo/elasticsearch that referenced this pull request


          [DOCS] Add full-text search overview (elastic#119462)

8a0604e

leemthompo added a commit to leemthompo/elasticsearch that referenced this pull request


          [DOCS] Add full-text search overview (elastic#119462)

c8bd9cf

leemthompo added a commit to leemthompo/elasticsearch that referenced this pull request


          [DOCS] Add full-text search overview (elastic#119462)

05b784f

leemthompo deleted the full-text-search branch

January 6, 2025 18:09

elasticsearchmachine pushed a commit that referenced this pull request


          [DOCS] Add full-text search overview (#119462) (#119606)

353cd26

elasticsearchmachine pushed a commit that referenced this pull request


          [DOCS] Add full-text search overview (#119462) (#119607)

446c756

elasticsearchmachine pushed a commit that referenced this pull request


          [8.16] [DOCS] Add full-text search overview (#119462) (#119605)

e3ea188

* [DOCS] Add full-text search overview (#119462)

* Fix info per 8.16

Labels

auto-backport >docs Team:Docs v8.16.0 v8.17.0 v8.18.0 v9.0.0