You have a Hibernate ORM-based application? You want to provide a full-featured full-text search to your users? You’re at the right place.
With this guide, you’ll learn how to synchronize your entities to an Elasticsearch or OpenSearch cluster in a heartbeat with Hibernate Search. We will also explore how you can query your Elasticsearch or OpenSearch cluster using the Hibernate Search API.
The application described in this guide allows you to manage a (simple) library: you manage authors and their books.
The entities are stored in a PostgreSQL database and indexed in an Elasticsearch cluster.
We recommend that you follow the instructions in the next sections and create the application step by step. However, you can go right to the completed example.
Clone the Git repository: git clone {quickstarts-clone-url}, or download an {quickstarts-archive-url}[archive].
The solution is located in the hibernate-search-orm-elasticsearch-quickstart directory.
Note: The provided solution contains a few additional elements such as tests and testing infrastructure.
First, we need a new project. Create a new project with the following command:
This command generates a Maven structure importing the following extensions:
- Hibernate ORM with Panache,
- the PostgreSQL JDBC driver,
- Hibernate Search + Elasticsearch,
- RESTEasy Reactive and Jackson.
If you already have your Quarkus project configured, you can add the hibernate-search-orm-elasticsearch extension to your project by running the following command in your project base directory:
This will add the following to your pom.xml:
<dependency>
<groupId>io.quarkus</groupId>
<artifactId>quarkus-hibernate-search-orm-elasticsearch</artifactId>
</dependency>
implementation("io.quarkus:quarkus-hibernate-search-orm-elasticsearch")
First, let’s create our Hibernate ORM entities Book and Author in the model subpackage.
package org.acme.hibernate.search.elasticsearch.model;
import java.util.List;
import java.util.Objects;
import jakarta.persistence.CascadeType;
import jakarta.persistence.Entity;
import jakarta.persistence.FetchType;
import jakarta.persistence.OneToMany;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
@Entity
public class Author extends PanacheEntity { // (1)
public String firstName;
public String lastName;
@OneToMany(mappedBy = "author", cascade = CascadeType.ALL, orphanRemoval = true, fetch = FetchType.EAGER) // (2)
public List<Book> books;
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (!(o instanceof Author)) {
return false;
}
Author other = (Author) o;
return Objects.equals(id, other.id);
}
@Override
public int hashCode() {
return 31;
}
}
1. We are using Hibernate ORM with Panache; it is not mandatory.
2. We load these elements eagerly so that they are present in the JSON output. In a real-world application, you should probably use a DTO approach.
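A side note on the constant hashCode() in these entities: it is a common pattern for JPA entities whose identifier is generated on persist, because a hash code derived from the mutable id would break hash-based collections. A minimal plain-Java sketch of the failure mode (BrokenEntity and HashCodeDemo are hypothetical illustration classes, not part of the application):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose hashCode() is derived from a mutable, DB-generated id.
class BrokenEntity {
    Long id; // null until "persisted"

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenEntity && Objects.equals(id, ((BrokenEntity) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // changes when the id is assigned
    }
}

public class HashCodeDemo {
    public static void main(String[] args) {
        BrokenEntity entity = new BrokenEntity();
        Set<BrokenEntity> set = new HashSet<>();
        set.add(entity);     // stored in the bucket for hashCode(null)
        entity.id = 42L;     // simulates the id assigned by persist()
        System.out.println(set.contains(entity)); // false: lookup now probes the wrong bucket
    }
}
```

A constant hashCode(), as in the entities above, avoids this at the cost of degraded HashSet/HashMap performance for large collections of entities.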
package org.acme.hibernate.search.elasticsearch.model;
import java.util.Objects;
import jakarta.persistence.Entity;
import jakarta.persistence.ManyToOne;
import com.fasterxml.jackson.annotation.JsonIgnore;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
@Entity
public class Book extends PanacheEntity {
public String title;
@ManyToOne
@JsonIgnore (1)
public Author author;
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (!(o instanceof Book)) {
return false;
}
Book other = (Book) o;
return Objects.equals(id, other.id);
}
@Override
public int hashCode() {
return 31;
}
}
1. We mark this property with @JsonIgnore to avoid infinite loops when serializing with Jackson.
While not everything is set up for our REST service yet, we can initialize it with the standard CRUD operations we will need.
Create the org.acme.hibernate.search.elasticsearch.LibraryResource class:
package org.acme.hibernate.search.elasticsearch;
import java.util.List;
import java.util.Optional;
import jakarta.enterprise.event.Observes;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.DELETE;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.PUT;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.core.MediaType;
import org.acme.hibernate.search.elasticsearch.model.Author;
import org.acme.hibernate.search.elasticsearch.model.Book;
import org.hibernate.search.mapper.orm.session.SearchSession;
import org.jboss.resteasy.reactive.RestForm;
import org.jboss.resteasy.reactive.RestQuery;
import io.quarkus.runtime.StartupEvent;
@Path("/library")
public class LibraryResource {
@PUT
@Path("book")
@Transactional
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
public void addBook(@RestForm String title, @RestForm Long authorId) {
Author author = Author.findById(authorId);
if (author == null) {
return;
}
Book book = new Book();
book.title = title;
book.author = author;
book.persist();
author.books.add(book);
author.persist();
}
@DELETE
@Path("book/{id}")
@Transactional
public void deleteBook(Long id) {
Book book = Book.findById(id);
if (book != null) {
book.author.books.remove(book);
book.delete();
}
}
@PUT
@Path("author")
@Transactional
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
public void addAuthor(@RestForm String firstName, @RestForm String lastName) {
Author author = new Author();
author.firstName = firstName;
author.lastName = lastName;
author.persist();
}
@POST
@Path("author/{id}")
@Transactional
@Consumes(MediaType.APPLICATION_FORM_URLENCODED)
public void updateAuthor(Long id, @RestForm String firstName, @RestForm String lastName) {
Author author = Author.findById(id);
if (author == null) {
return;
}
author.firstName = firstName;
author.lastName = lastName;
author.persist();
}
@DELETE
@Path("author/{id}")
@Transactional
public void deleteAuthor(Long id) {
Author author = Author.findById(id);
if (author != null) {
author.delete();
}
}
}
Nothing out of the ordinary here: it is just good old Hibernate ORM with Panache operations in a REST service.
In fact, the interesting part is that we will need to add very few elements to make our full text search application work.
Let’s go back to our entities.
Enabling full text search capabilities for them is as simple as adding a few annotations.
Let’s edit the Book entity again to include this content:
package org.acme.hibernate.search.elasticsearch.model;
import java.util.Objects;
import jakarta.persistence.Entity;
import jakarta.persistence.ManyToOne;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import com.fasterxml.jackson.annotation.JsonIgnore;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
@Entity
@Indexed // (1)
public class Book extends PanacheEntity {
@FullTextField(analyzer = "english") // (2)
public String title;
@ManyToOne
@JsonIgnore
public Author author;
// Preexisting equals()/hashCode() methods
}
1. First, let’s use the @Indexed annotation to register our Book entity as part of the full text index.
2. The @FullTextField annotation declares a field in the index specifically tailored for full text search. In particular, we have to define an analyzer to split and analyze the tokens (~ words) - more on this later.
Now that our books are indexed, we can do the same for the authors.
Open the Author class and include the content below.
Things are quite similar here: we use the @Indexed, @FullTextField and @KeywordField annotations.
There are a few differences/additions though. Let’s check them out.
package org.acme.hibernate.search.elasticsearch.model;
import java.util.List;
import java.util.Objects;
import jakarta.persistence.CascadeType;
import jakarta.persistence.Entity;
import jakarta.persistence.FetchType;
import jakarta.persistence.OneToMany;
import org.hibernate.search.engine.backend.types.Sortable;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.FullTextField;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.Indexed;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.IndexedEmbedded;
import org.hibernate.search.mapper.pojo.mapping.definition.annotation.KeywordField;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
@Entity
@Indexed
public class Author extends PanacheEntity {
@FullTextField(analyzer = "name") // (1)
@KeywordField(name = "firstName_sort", sortable = Sortable.YES, normalizer = "sort") // (2)
public String firstName;
@FullTextField(analyzer = "name")
@KeywordField(name = "lastName_sort", sortable = Sortable.YES, normalizer = "sort")
public String lastName;
@OneToMany(mappedBy = "author", cascade = CascadeType.ALL, orphanRemoval = true, fetch = FetchType.EAGER)
@IndexedEmbedded // (3)
public List<Book> books;
// Preexisting equals()/hashCode() methods
}
1. We use a @FullTextField similar to what we did for Book, but you’ll notice that the analyzer is different - more on this later.
2. As you can see, we can define several fields for the same property. Here, we define a @KeywordField with a specific name. The main difference is that a keyword field is not tokenized (the string is kept as one single token) but can be normalized (i.e. filtered) - more on this later. This field is marked as sortable, as our intention is to use it for sorting our authors.
3. The purpose of @IndexedEmbedded is to include the Book fields into the Author index. In this case, we just use the default configuration: all the fields of the associated Book entities are included in the index (i.e. the title field). The nice thing with @IndexedEmbedded is that it is able to automatically reindex an Author if one of its Books has been updated, thanks to the bidirectional relation. @IndexedEmbedded also supports nested documents (using the storage = NESTED attribute), but we don’t need it here. You can also specify the fields you want to include in your parent index using the includePaths attribute if you don’t want them all.
If, for some reason, adding Hibernate Search annotations to entities is not possible,
mapping can be applied programmatically instead.
Programmatic mapping is configured through the ProgrammaticMappingConfigurationContext that is exposed via a mapping configurer (HibernateOrmSearchMappingConfigurer).
Note: A mapping configurer (…)
Below is an example of a mapping configurer that applies programmatic mapping:
package org.acme.hibernate.search.elasticsearch.config;
import org.hibernate.search.mapper.orm.mapping.HibernateOrmMappingConfigurationContext;
import org.hibernate.search.mapper.orm.mapping.HibernateOrmSearchMappingConfigurer;
import org.hibernate.search.mapper.pojo.mapping.definition.programmatic.TypeMappingStep;
import io.quarkus.hibernate.search.orm.elasticsearch.SearchExtension;
@SearchExtension // (1)
public class CustomMappingConfigurer implements HibernateOrmSearchMappingConfigurer {
@Override
public void configure(HibernateOrmMappingConfigurationContext context) {
TypeMappingStep type = context.programmaticMapping() // (2)
.type(SomeIndexedEntity.class); // (3)
type.indexed() // (4)
.index(SomeIndexedEntity.INDEX_NAME); // (5)
type.property("id").documentId(); // (6)
type.property("text").fullTextField(); // (7)
}
}
1. Annotate the configurer implementation with the @SearchExtension qualifier to tell Quarkus it should be used by Hibernate Search in the default persistence unit. The annotation can also target a specific persistence unit (@SearchExtension(persistenceUnit = "nameOfYourPU")).
2. Access the programmatic mapping context.
3. Create a mapping step for the SomeIndexedEntity entity.
4. Define the SomeIndexedEntity entity as indexed.
5. Provide an index name to be used for the SomeIndexedEntity entity.
6. Define the document id property.
7. Define a full-text search field for the text property.
Tip: Alternatively, if for some reason you can’t or don’t want to annotate your mapping configurer with @SearchExtension, you can select it through a configuration property instead: quarkus.hibernate-search-orm.mapping.configurer=bean:myMappingConfigurer
Analysis is a big part of full text search: it defines how text will be processed when indexing or building search queries.
The role of analyzers is to split the text into tokens (~ words) and filter them (making it all lowercase and removing accents for instance).
Normalizers are a special type of analyzer that keeps the input as a single token. They are especially useful for sorting or indexing keywords.
There are a lot of bundled analyzers, but you can also develop your own for your own specific purposes.
You can learn more about the Elasticsearch analysis framework in the Analysis section of the Elasticsearch documentation.
When we added the Hibernate Search annotations to our entities, we defined the analyzers and normalizers used. Typically:
@FullTextField(analyzer = "english")
@FullTextField(analyzer = "name")
@KeywordField(name = "lastName_sort", sortable = Sortable.YES, normalizer = "sort")
We use:
-
an analyzer called
name
for person names, -
an analyzer called
english
for book titles, -
a normalizer called
sort
for our sort fields
but we haven’t set them up yet.
Let’s see how you can do it with Hibernate Search.
It is an easy task: we just need to create an implementation of ElasticsearchAnalysisConfigurer (and configure Quarkus to use it; more on that later).
To fulfill our requirements, let’s create the following implementation:
package org.acme.hibernate.search.elasticsearch.config;
import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurationContext;
import org.hibernate.search.backend.elasticsearch.analysis.ElasticsearchAnalysisConfigurer;
import io.quarkus.hibernate.search.orm.elasticsearch.SearchExtension;
@SearchExtension // (1)
public class AnalysisConfigurer implements ElasticsearchAnalysisConfigurer {
@Override
public void configure(ElasticsearchAnalysisConfigurationContext context) {
context.analyzer("name").custom() // (2)
.tokenizer("standard")
.tokenFilters("asciifolding", "lowercase");
context.analyzer("english").custom() // (3)
.tokenizer("standard")
.tokenFilters("asciifolding", "lowercase", "porter_stem");
context.normalizer("sort").custom() // (4)
.tokenFilters("asciifolding", "lowercase");
}
}
1. Annotate the configurer implementation with the @SearchExtension qualifier to tell Quarkus it should be used in the default persistence unit, for all Elasticsearch indexes (by default). The annotation can also target a specific persistence unit (@SearchExtension(persistenceUnit = "nameOfYourPU")), backend (@SearchExtension(backend = "nameOfYourBackend")), index (@SearchExtension(index = "nameOfYourIndex")), or a combination of those (@SearchExtension(persistenceUnit = "nameOfYourPU", backend = "nameOfYourBackend", index = "nameOfYourIndex")).
2. This is a simple analyzer: it separates the words on spaces, replaces non-ASCII characters with their ASCII counterparts (thus removing accents) and puts everything in lowercase. It is used in our examples for the authors’ names.
3. We are a bit more aggressive with this one and include some stemming: we will be able to search for mystery and get a result even if the indexed input contains mysteries. It is definitely too aggressive for person names, but it is perfect for the book titles.
4. Here is the normalizer used for sorting. It is very similar to our first analyzer, except we don’t tokenize the words, as we want one and only one token.
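To build intuition for what the asciifolding and lowercase filters do, here is a rough plain-Java approximation using only the standard library (AnalysisSketch is a hypothetical illustration class, not the Hibernate Search or Elasticsearch API; the real processing happens inside Elasticsearch):

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class AnalysisSketch {

    // Approximates asciifolding + lowercase via Unicode decomposition
    static String fold(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "")   // drop combining marks (accents)
                .toLowerCase(Locale.ROOT);
    }

    // Analyzer-like behavior: tokenize on whitespace, then filter each token
    static List<String> analyzeName(String text) {
        return Arrays.asList(fold(text.trim()).split("\\s+"));
    }

    // Normalizer-like behavior: same filters, but the input stays one single token
    static String normalizeForSort(String text) {
        return fold(text);
    }

    public static void main(String[] args) {
        System.out.println(analyzeName("Gabriel García Márquez")); // [gabriel, garcia, marquez]
        System.out.println(normalizeForSort("García Márquez"));    // garcia marquez
    }
}
```

Note how the analyzer produces several lowercase, accent-free tokens, while the normalizer keeps the whole string as one filtered token, which is what makes it suitable for sorting.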
Tip: Alternatively, if for some reason you can’t or don’t want to annotate your analysis configurer with @SearchExtension, you can select it through a configuration property instead: quarkus.hibernate-search-orm.elasticsearch.analysis.configurer=bean:myAnalysisConfigurer
In our existing LibraryResource, we just need to inject the SearchSession:
@Inject
SearchSession searchSession; // (1)
1. Inject a Hibernate Search session, which relies on the EntityManager under the hood. Applications with multiple persistence units can use the CDI qualifier @io.quarkus.hibernate.orm.PersistenceUnit to select the right one: see CDI integration.
And then we can add the following methods (and a few imports):
@Transactional // (1)
void onStart(@Observes StartupEvent ev) throws InterruptedException { // (2)
// only reindex if we imported some content
if (Book.count() > 0) {
searchSession.massIndexer()
.startAndWait();
}
}
@GET
@Path("author/search") // (3)
@Transactional
public List<Author> searchAuthors(@RestQuery String pattern, // (4)
@RestQuery Optional<Integer> size) {
return searchSession.search(Author.class) // (5)
.where(f ->
pattern == null || pattern.trim().isEmpty() ?
f.matchAll() : // (6)
f.simpleQueryString()
.fields("firstName", "lastName", "books.title").matching(pattern) // (7)
)
.sort(f -> f.field("lastName_sort").then().field("firstName_sort")) // (8)
.fetchHits(size.orElse(20)); // (9)
}
1. Important point: we need a transactional context for these methods.
2. As we will import data into the PostgreSQL database using an SQL script, we need to reindex the data at startup. For this, we use Hibernate Search’s mass indexer, which allows indexing a lot of data efficiently (you can fine-tune it for better performance). All the upcoming updates coming through Hibernate ORM operations will be synchronized automatically to the full text index. If you don’t import data manually in the database, you don’t need this: the mass indexer should then only be used when you change your indexing configuration (adding a new field, changing an analyzer’s configuration…) and you want the new configuration to be applied to your existing entities.
3. This is where the magic begins: just adding the annotations to our entities makes them available for full text search; we can now query the index using the Hibernate Search DSL.
4. Use the org.jboss.resteasy.reactive.RestQuery annotation type to avoid repeating the parameter name.
5. We indicate that we are searching for Authors.
6. We create a predicate: if the pattern is empty, we use a matchAll() predicate.
7. If we have a valid pattern, we create a simpleQueryString() predicate on the firstName, lastName and books.title fields matching our pattern.
8. We define the sort order of our results. Here we sort by last name, then by first name. Note that we use the specific fields we created for sorting.
9. Fetch the size top hits, 20 by default. Obviously, paging is also supported.
Note: The Hibernate Search DSL supports a significant subset of the Elasticsearch predicates (match, range, nested, phrase, spatial…). Feel free to explore the DSL using autocompletion. When that’s not enough, you can always fall back to defining a predicate using JSON directly.
As usual, we can configure everything in the Quarkus configuration file, application.properties.
Edit src/main/resources/application.properties and add the following configuration:
quarkus.ssl.native=false (1)
quarkus.datasource.db-kind=postgresql (2)
quarkus.hibernate-orm.sql-load-script=import.sql (3)
quarkus.hibernate-search-orm.elasticsearch.version=8 (4)
quarkus.hibernate-search-orm.indexing.plan.synchronization.strategy=sync (5)
%prod.quarkus.datasource.jdbc.url=jdbc:postgresql://localhost/quarkus_test (6)
%prod.quarkus.datasource.username=quarkus_test
%prod.quarkus.datasource.password=quarkus_test
%prod.quarkus.hibernate-orm.database.generation=create
%prod.quarkus.hibernate-search-orm.elasticsearch.hosts=localhost:9200 (6)
1. We won’t use SSL, so we disable it to have a more compact native executable.
2. Let’s create a PostgreSQL datasource.
3. We load some initial data on startup.
4. We need to tell Hibernate Search about the version of Elasticsearch we will use. It is important because there are significant differences in the Elasticsearch mapping syntax depending on the version. Since the mapping is created at build time to reduce startup time, Hibernate Search cannot connect to the cluster to automatically detect the version. Note that, for OpenSearch, you need to prefix the version with opensearch:; see OpenSearch compatibility.
5. This means that we wait for the entities to be searchable before considering a write complete. On a production setup, the write-sync default will provide better performance. Using sync is especially important when testing, as you need the entities to be searchable immediately.
6. For development and tests, we rely on Dev Services, which means Quarkus will start a PostgreSQL database and Elasticsearch cluster automatically. In production mode, however, you will want to start a PostgreSQL database and Elasticsearch cluster manually, which is why we provide Quarkus with this connection info in the prod profile (%prod. prefix).
Note: Because we rely on Dev Services, the database and Elasticsearch schema will automatically be dropped and re-created on each application startup in tests and dev mode (unless …).
If for some reason you cannot use Dev Services, you will have to set the following properties to get similar behavior:
%dev.quarkus.hibernate-orm.database.generation=drop-and-create
%test.quarkus.hibernate-orm.database.generation=drop-and-create
%dev.quarkus.hibernate-search-orm.schema-management.strategy=drop-and-create
%test.quarkus.hibernate-search-orm.schema-management.strategy=drop-and-create
Tip: For more information about the Hibernate Search extension configuration, please refer to the Configuration Reference.
Quarkus supports a feature called Dev Services that allows you to start various containers without any config.
In the case of Elasticsearch this support extends to the default Elasticsearch connection.
What that means practically is that, if you have not configured quarkus.hibernate-search-orm.elasticsearch.hosts, Quarkus will automatically start an Elasticsearch container when running tests or in dev mode, and automatically configure the connection.
When running the production version of the application, the Elasticsearch connection needs to be configured as normal, so if you want to include a production database config in your application.properties and continue to use Dev Services, we recommend that you use the %prod. profile to define your Elasticsearch settings.
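For instance, a production-only connection could look like the following (es1.mycompany.com is a placeholder host; in dev and test mode this property is absent, so Dev Services stays active):

```properties
# Applied only in the prod profile; dev/test keep using Dev Services containers
%prod.quarkus.hibernate-search-orm.elasticsearch.hosts=es1.mycompany.com:9200
```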
Note: Dev Services for Elasticsearch is currently unable to start multiple clusters concurrently, so it only works with the default backend of the default persistence unit: named persistence units or named backends won’t be able to take advantage of Dev Services for Elasticsearch.
For more information you can read the Dev Services for Elasticsearch guide.
Now let’s add a simple web page to interact with our LibraryResource.
Quarkus automatically serves static resources located under the META-INF/resources directory.
In the src/main/resources/META-INF/resources directory, overwrite the existing index.html file with the content from this {quickstarts-blob-url}/hibernate-search-orm-elasticsearch-quickstart/src/main/resources/META-INF/resources/index.html[index.html] file.
For the purpose of this demonstration, let’s import an initial dataset.
Let’s create a src/main/resources/import.sql file with the following content:
INSERT INTO author(id, firstname, lastname) VALUES (1, 'John', 'Irving');
INSERT INTO author(id, firstname, lastname) VALUES (2, 'Paul', 'Auster');
ALTER SEQUENCE author_seq RESTART WITH 3;
INSERT INTO book(id, title, author_id) VALUES (1, 'The World According to Garp', 1);
INSERT INTO book(id, title, author_id) VALUES (2, 'The Hotel New Hampshire', 1);
INSERT INTO book(id, title, author_id) VALUES (3, 'The Cider House Rules', 1);
INSERT INTO book(id, title, author_id) VALUES (4, 'A Prayer for Owen Meany', 1);
INSERT INTO book(id, title, author_id) VALUES (5, 'Last Night in Twisted River', 1);
INSERT INTO book(id, title, author_id) VALUES (6, 'In One Person', 1);
INSERT INTO book(id, title, author_id) VALUES (7, 'Avenue of Mysteries', 1);
INSERT INTO book(id, title, author_id) VALUES (8, 'The New York Trilogy', 2);
INSERT INTO book(id, title, author_id) VALUES (9, 'Mr. Vertigo', 2);
INSERT INTO book(id, title, author_id) VALUES (10, 'The Brooklyn Follies', 2);
INSERT INTO book(id, title, author_id) VALUES (11, 'Invisible', 2);
INSERT INTO book(id, title, author_id) VALUES (12, 'Sunset Park', 2);
INSERT INTO book(id, title, author_id) VALUES (13, '4 3 2 1', 2);
ALTER SEQUENCE book_seq RESTART WITH 14;
You can now interact with your REST service:
- start your Quarkus application with:
- open a browser to http://localhost:8080/
- search for authors or book titles (we initialized some data for you)
- create new authors and books and search for them too
As you can see, all your updates are automatically synchronized to the Elasticsearch cluster.
Hibernate Search is compatible with both Elasticsearch and OpenSearch, but it assumes it is working with an Elasticsearch cluster by default.
To have Hibernate Search work with an OpenSearch cluster instead, prefix the configured version with opensearch:, as shown below.
quarkus.hibernate-search-orm.elasticsearch.version=opensearch:1.2
All other configuration options and APIs are exactly the same as with Elasticsearch.
You can find more information about compatible distributions and versions of Elasticsearch in this section of Hibernate Search’s reference documentation.
With the Hibernate ORM extension, you can set up multiple persistence units, each with its own datasource and configuration.
If you do declare multiple persistence units, you will also need to configure Hibernate Search separately for each persistence unit.
The properties at the root of the quarkus.hibernate-search-orm. namespace define the default persistence unit.
For instance, the following snippet defines a default datasource and a default persistence unit, and sets the Elasticsearch host for that persistence unit to es1.mycompany.com:9200.
quarkus.datasource.db-kind=h2
quarkus.datasource.jdbc.url=jdbc:h2:mem:default;DB_CLOSE_DELAY=-1
quarkus.hibernate-search-orm.elasticsearch.hosts=es1.mycompany.com:9200
quarkus.hibernate-search-orm.elasticsearch.version=8
Using a map based approach, it is also possible to configure named persistence units:
quarkus.datasource."users".db-kind=h2 (1)
quarkus.datasource."users".jdbc.url=jdbc:h2:mem:users;DB_CLOSE_DELAY=-1
quarkus.datasource."inventory".db-kind=h2 (2)
quarkus.datasource."inventory".jdbc.url=jdbc:h2:mem:inventory;DB_CLOSE_DELAY=-1
quarkus.hibernate-orm."users".datasource=users (3)
quarkus.hibernate-orm."users".packages=org.acme.model.user
quarkus.hibernate-orm."inventory".datasource=inventory (4)
quarkus.hibernate-orm."inventory".packages=org.acme.model.inventory
quarkus.hibernate-search-orm."users".elasticsearch.hosts=es1.mycompany.com:9200 (5)
quarkus.hibernate-search-orm."users".elasticsearch.version=8
quarkus.hibernate-search-orm."inventory".elasticsearch.hosts=es2.mycompany.com:9200 (6)
quarkus.hibernate-search-orm."inventory".elasticsearch.version=8
1. Define a datasource named users.
2. Define a datasource named inventory.
3. Define a persistence unit called users pointing to the users datasource.
4. Define a persistence unit called inventory pointing to the inventory datasource.
5. Configure Hibernate Search for the users persistence unit, setting the Elasticsearch host for that persistence unit to es1.mycompany.com:9200.
6. Configure Hibernate Search for the inventory persistence unit, setting the Elasticsearch host for that persistence unit to es2.mycompany.com:9200.
For each persistence unit, Hibernate Search will only consider indexed entities that are attached to that persistence unit. Entities are attached to a persistence unit by configuring the Hibernate ORM extension.
You can inject Hibernate Search’s main entry points, SearchSession and SearchMapping, using CDI:
@Inject
SearchSession searchSession;
This will inject the SearchSession of the default persistence unit.
To inject the SearchSession of a named persistence unit (users in our example), just add a qualifier:
@Inject
@PersistenceUnit("users") (1)
SearchSession searchSession;
1. This is the @io.quarkus.hibernate.orm.PersistenceUnit annotation.
You can inject the SearchMapping of a named persistence unit using the exact same mechanism:
@Inject
@PersistenceUnit("users")
SearchMapping searchMapping;
You can build a native executable with the usual command:
Note: As usual with native executable compilation, this operation consumes a lot of memory. It might be safer to stop the two containers while you are building the native executable and start them again once you are done.
Running it is as simple as executing ./target/hibernate-search-orm-elasticsearch-quickstart-1.0.0-SNAPSHOT-runner.
You can then point your browser to http://localhost:8080/ and use your application.
Note: The startup is a bit slower than usual: it is mostly due to us dropping and recreating the database schema and the Elasticsearch mapping every time at startup. We also inject some data and execute the mass indexer. In a real-life application, it is obviously something you won’t do at startup.
By default, Hibernate Search sends a few requests to the Elasticsearch cluster on startup. If the Elasticsearch cluster is not necessarily up and running when Hibernate Search starts, this could cause a startup failure.
To address this, you can configure Hibernate Search to not send any request on startup:
1. Disable Elasticsearch version checks on startup by setting the configuration property quarkus.hibernate-search-orm.elasticsearch.version-check.enabled to false.
2. Disable schema management on startup by setting the configuration property quarkus.hibernate-search-orm.schema-management.strategy to none.
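Put together, those two settings look like this in application.properties (both property names appear above; whether you want one or both depends on your deployment):

```properties
# Don't contact the Elasticsearch cluster on startup
quarkus.hibernate-search-orm.elasticsearch.version-check.enabled=false
quarkus.hibernate-search-orm.schema-management.strategy=none
```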
Of course, even with this configuration, Hibernate Search still won’t be able to index anything or run search queries until the Elasticsearch cluster becomes accessible.
Important: If you disable automatic schema creation by setting quarkus.hibernate-search-orm.schema-management.strategy to none, you will have to create the schema manually.
See this section of the reference documentation for more information.
Caution: Coordination through outbox polling is considered preview. In preview, backward compatibility and presence in the ecosystem is not guaranteed. Specific improvements might require changing configuration or APIs, or even storage formats, and plans to become stable are under way. Feedback is welcome on our mailing list or as issues in our GitHub issue tracker.
While it’s technically possible to use Hibernate Search and Elasticsearch in distributed applications, by default they suffer from a few limitations.
These limitations are the result of Hibernate Search not coordinating between threads or application nodes by default.
In order to get rid of these limitations, you can use the outbox-polling coordination strategy.
This strategy creates an outbox table in the database to push entity change events to,
and relies on a background processor to consume these events and perform indexing.
To enable the outbox-polling coordination strategy, an additional extension is required:
Once the extension is there, you will need to explicitly select the outbox-polling strategy by setting quarkus.hibernate-search-orm.coordination.strategy to outbox-polling.
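In application.properties, that selection is a single line (the property name and value are those given above):

```properties
quarkus.hibernate-search-orm.coordination.strategy=outbox-polling
```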
Finally, you will need to make sure that the Hibernate ORM entities added by Hibernate Search (to represent the outbox and agents) have corresponding tables/sequences in your database:
-
If you are just starting with your application and intend to let Hibernate ORM generate your database schema, then no worries: the entities required by Hibernate Search will be included in the generated schema.
-
Otherwise, you must manually alter your schema to add the necessary tables/sequences.
Once you are done with the above, you’re ready to use Hibernate Search with an outbox. Don’t change any code, and just start your application: it will automatically detect when multiple applications are connected to the same database, and coordinate the index updates accordingly.
Note: Hibernate Search mostly behaves the same when using the outbox-polling coordination strategy.
However, there is one key difference: index updates are necessarily asynchronous; they are guaranteed to happen eventually, but not immediately. This means in particular that the configuration property …
This behavior is consistent with Elasticsearch’s near-real-time search and the recommended way of using Hibernate Search even when coordination is disabled.
For more information about coordination in Hibernate Search, see this section of the reference documentation.
For more information about configuration options related to coordination, see Configuration of coordination with outbox polling.
If you need to use Amazon’s managed Elasticsearch service, you will find it requires a proprietary authentication method involving request signing.
You can enable AWS request signing in Hibernate Search by adding a dedicated extension to your project and configuring it.
See the documentation for the Hibernate Search ORM + Elasticsearch AWS extension for more information.
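As a sketch, enabling request signing typically involves adding the quarkus-hibernate-search-orm-elasticsearch-aws extension and setting properties along these lines. The exact property names and credential options are described in that extension's documentation; the region and keys below are placeholders:

```properties
# Sign requests to the Elasticsearch endpoint using AWS request signing
quarkus.hibernate-search-orm.elasticsearch.aws.signing.enabled=true
quarkus.hibernate-search-orm.elasticsearch.aws.region=us-east-1
# Credentials may also come from the default AWS credentials chain
quarkus.hibernate-search-orm.elasticsearch.aws.credentials.type=static
quarkus.hibernate-search-orm.elasticsearch.aws.credentials.static-provider.access-key-id=...
quarkus.hibernate-search-orm.elasticsearch.aws.credentials.static-provider.secret-access-key=...
```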
Caution
|
Hibernate Search’s management endpoint is considered preview. In preview, backward compatibility and presence in the ecosystem is not guaranteed. Specific improvements might require changing configuration or APIs, or even storage formats, and plans to become stable are under way. Feedback is welcome on our mailing list or as issues in our GitHub issue tracker. |
The Hibernate Search extension provides an HTTP endpoint to reindex your data through the management interface. By default, this endpoint is not available. It can be enabled through configuration properties as shown below.
quarkus.management.enabled=true (1)
quarkus.hibernate-search-orm.management.enabled=true (2)
-
Enable the management interface.
-
Enable Hibernate Search specific management endpoints.
Once management is enabled, data can be re-indexed via /q/hibernate-search/reindex
, where /q
is the default management root path
and /hibernate-search
is the default Hibernate Search root management path.
The latter (/hibernate-search
) can be changed via a configuration property, as shown below.
quarkus.hibernate-search-orm.management.root-path=custom-root-path (1)
-
Use a custom path (custom-root-path
) for Hibernate Search’s management endpoint. With the default management root path, the reindex path then becomes/q/custom-root-path/reindex
.
This endpoint accepts POST
requests with application/json
content type only.
All indexed entities will be re-indexed if an empty request body is submitted.
If only a subset of entities must be re-indexed or
if there is a need to have a custom configuration of the underlying mass indexer
then this information can be passed through the request body as shown below.
{
    "filter": {
        "types": ["EntityName1", "EntityName2", "EntityName3", ...] (1)
    },
    "massIndexer": {
        "typesToIndexInParallel": 1 (2)
    }
}
-
An array of entity names that should be re-indexed. If unspecified or empty, all entity types will be re-indexed.
-
Sets the number of entity types to be indexed in parallel.
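For example, assuming the management interface listens on its default port (9000) and an indexed entity named Book (a placeholder), such a request could be sent with curl:

```shell
# Trigger reindexing of the Book entity through the management interface
curl -X POST -H 'Content-Type: application/json' \
  http://localhost:9000/q/hibernate-search/reindex \
  -d '{"filter": {"types": ["Book"]}}'
```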
The full list of possible filters and available mass indexer configurations is presented in the example below.
{
    "filter": { (1)
        "types": ["EntityName1", "EntityName2", "EntityName3", ...], (2)
        "tenants": ["tenant1", "tenant2", ...] (3)
    },
    "massIndexer": { (4)
        "typesToIndexInParallel": 1, (5)
        "threadsToLoadObjects": 6, (6)
        "batchSizeToLoadObjects": 10, (7)
        "cacheMode": "IGNORE", (8)
        "mergeSegmentsOnFinish": false, (9)
        "mergeSegmentsAfterPurge": true, (10)
        "dropAndCreateSchemaOnStart": false, (11)
        "purgeAllOnStart": true, (12)
        "idFetchSize": 100, (13)
        "transactionTimeout": 100000 (14)
    }
}
-
Filter object that limits the scope of reindexing.
-
An array of entity names that should be re-indexed. If unspecified or empty, all entity types will be re-indexed.
-
An array of tenant ids, in case of multi-tenancy. If unspecified or empty, all tenants will be re-indexed.
-
Mass indexer configuration object.
-
Sets the number of entity types to be indexed in parallel.
-
Sets the number of threads to be used to load the root entities.
-
Sets the batch size used to load the root entities.
-
Sets the cache interaction mode for the data loading tasks.
-
Whether each index is merged into a single segment after indexing.
-
Whether each index is merged into a single segment after the initial index purge, just before indexing.
-
Whether the indexes and their schema (if they exist) should be dropped and re-created before indexing.
-
Whether all entities are removed from the indexes before indexing.
-
Specifies the fetch size to be used when loading the primary keys of objects to be indexed.
-
Specifies the timeout of transactions for loading ids and entities to be re-indexed.
Note that all properties in the JSON are optional; include only those that are needed.
For more detailed information on mass indexer configuration see the corresponding section of the Hibernate Search reference documentation.
Submitting the reindexing request will trigger indexing in the background. Mass indexing progress will appear in the application logs.
For testing purposes, it might be useful to know when indexing finishes. Adding the wait_for=finished
query parameter to the URL
makes the management endpoint return a chunked response that reports when indexing starts and then when it finishes.
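For instance (again assuming the default management port), a request that reindexes everything and waits for completion could look like:

```shell
# An empty JSON body reindexes all entity types; wait_for=finished streams progress
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:9000/q/hibernate-search/reindex?wait_for=finished' \
  -d '{}'
```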
When working with multiple persistence units, the name of the persistence unit to reindex can be supplied through the
persistence_unit
query parameter: /q/hibernate-search/reindex?persistence_unit=non-default-persistence-unit
.
If you are interested in learning more about Hibernate Search 6, the Hibernate team publishes extensive reference documentation.
Hibernate Search supports both a Lucene backend and an Elasticsearch backend.
In the context of Quarkus and to build microservices, we thought the latter would make more sense. Thus, we focused our efforts on it.
We don’t have plans to support the Lucene backend in Quarkus for now.
Note
|
About bean references
When referencing beans using a string value in configuration properties, that string is parsed. Here are the most common formats:
Other formats are also accepted, but are only useful for advanced use cases. See this section of Hibernate Search’s reference documentation for more information. |
Note
|
These configuration properties require an additional extension. See Coordination through outbox polling. |