-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UNOMI-828] Support for OpenSearch persistence #715
base: master
Are you sure you want to change the base?
Conversation
…o be able to fix problems with deployments
…king, work on automated tests is still in progress.
# Conflicts: # graphql/graphql-playground/yarn.lock # itests/src/test/java/org/apache/unomi/itests/BaseIT.java # manual/src/main/asciidoc/graphql.adoc # persistence-elasticsearch/core/src/test/java/org/apache/unomi/persistence/elasticsearch/ElasticsearchPersistenceTest.java # persistence-spi/src/main/java/org/apache/unomi/persistence/spi/conditions/ConditionContextHelper.java # persistence-spi/src/main/java/org/apache/unomi/persistence/spi/conditions/ConditionEvaluatorDispatcherImpl.java # tools/shell-commands/src/main/java/org/apache/unomi/shell/services/internal/UnomiManagementServiceImpl.java
- Changed Unomi startup to use features instead of bundles - Added new build script that integrates all the functionality of the other build scripts - Fix IPv6 address parsing - Merge latest changes from master branch -
- Add option to new build script to be able to use opensearch with integration tests
- Replace hardcoded past event ES query builder in pasteventconditionbuilder to use a generic interface - Replace GeoDistance and DistanceUnit used directly from ElasticSearch with clean-room implementation that are validated through unit tests. Also added unit tests on the ElasticSearch implementation to be able to test any differences - Also added clean-room implementation of the DateMathParser so that it can be used with ElasticSearch or OpenSearch - Modified Unomi startup mechanism to use features instead of bundles. Unfortunately due to complex interdependencies between bundles the features could not be split as wanted, so there is some duplication in the list of bundles. - Removed last of inter-dependencies in the base plugin and the persistence-spi to try to resolve bundle inter-dependencies but that wasn't enough. - ElasticSearch integration tests are now execute without any errors !
- Removed elasticsearch-core from bundle watch requirements - Fix issues with date parsing due to case sensitivity - Improved test units for date parsing and date math handling - Modified HealthChecks to provide an OpenSearch check provider (not yet fully working) - Deactivate 1.x to 2.x migration integration test for OpenSearch (No OpenSearch users will be coming from 1.x) - Update OpenSearch past event query builder to latest changes done in ElasticSearch past event query builder - Various fixes in the integration tests to make them compatible with OpenSearch (removed hardcoded elasticsearch configuration and references) - Added new shell script in itests directory to make it easier to handle the dynamically generated Pax Exam Karaf test container directory. Documentation is also included in the README file inside the itests directory.
- Removed elasticsearch-core from bundle watch requirements - Fix issues with date parsing due to case sensitivity - Improved test units for date parsing and date math handling - Modified HealthChecks to provide an OpenSearch check provider (not yet fully working) - Deactivate 1.x to 2.x migration integration test for OpenSearch (No OpenSearch users will be coming from 1.x) - Update OpenSearch past event query builder to latest changes done in ElasticSearch past event query builder - Various fixes in the integration tests to make them compatible with OpenSearch (removed hardcoded elasticsearch configuration and references) - Added new shell script in itests directory to make it easier to handle the dynamically generated Pax Exam Karaf test container directory. Documentation is also included in the README file inside the itests directory.
…ess status in the integration tests - Make sure the Unomi Management Service is started in IT tests before starting the unomi:start command - Add support for minimal cluster state to allow to start an OpenSearch cluster with yellow status in IT tests - Fix OpenSearch configuration prefix - Modify HealthCheck providers to only be available depending on the availability of the persistence implementation. - Fix integration tests to work properly with OpenSearch. - Fix OpenSearch persistence initial startup - Restructure startFeatures configuration to use arrays instead of complex parsing - Modify OpenSearch custom object mapping to serialize map entries that have null values (which is the default for the ElasticSearch implementation). -
…ess status in the integration tests - Make sure the Unomi Management Service is started in IT tests before starting the unomi:start command - Add support for minimal cluster state to allow to start an OpenSearch cluster with yellow status in IT tests - Fix OpenSearch configuration prefix - Modify HealthCheck providers to only be available depending on the availability of the persistence implementation. - Fix integration tests to work properly with OpenSearch. - Fix OpenSearch persistence initial startup - Restructure startFeatures configuration to use arrays instead of complex parsing - Modify OpenSearch custom object mapping to serialize map entries that have null values (which is the default for the ElasticSearch implementation). - Make sure the OpenSearch docker container used for the IT tests is replaced when tests are restarted. - Fix the handling of the OffsetDateTime in the OpenSearch Property condition query builder - Fix the rule service IT to generate rules with proper conditions and actions
…sts output to be CSV compatible - Added a known issue in the itests README to reference the log issue on OpenSearch 2.18.
- Add docker compose support for OpenSearch - Fix startup issues with updates to UnomiManagementService - Documentation updates to add OpenSearch information (still to be completed)
…r both OpenSearch and ElasticSearch - Update Health check README to explain how it now works with both ElasticSearch and OpenSearch engines
- Add docker compose support for OpenSearch - Fix startup issues with updates to UnomiManagementService - Documentation updates to add OpenSearch information (still to be completed)
# Conflicts: # itests/src/test/java/org/apache/unomi/itests/migration/Migrate16xTo220IT.java
…tween the ElasticSearch and OpenSearch integration tests - Add documentation on how to migrate from ElasticSearch to OpenSearch (not tested yet)
Hello, Thanks a lot for the huge work on this PR, as mentioned during the demo, we plan to dedicate some time to review the pull request starting from January 31st (or earlier if we can). We are doing our best to chime in asap to avoid the PR dragging for too long. |
Thank you @Fgerthoffert that is much appreciated. Also, if I can help maybe by directly briefing reviewers on the changes in the code that's something I can do. Best regards, |
I started a review |
@@ -60,10 +57,21 @@ public class ElasticSearchHealthCheckProvider implements HealthCheckProvider { | |||
|
|||
private CloseableHttpClient httpClient; | |||
|
|||
@Reference(service = PersistenceService.class, cardinality = ReferenceCardinality.OPTIONAL, policy = ReferencePolicy.DYNAMIC, bind = "bind", unbind = "unbind") | |||
private volatile PersistenceService persistenceService; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to me, healthcheck is agnostic on unomi internal components because at a lower level (before unomi starts) Thus I think creating such coupling between healthcheck and unomi internal persistence layer (spi) does not comply with the expected component segregation.
Isn't there is any other way to define unomi topology elements/config (elasticsearch or opensearch) instead of reading it from persistence layer (moreover persistence layer availability is also part of another healthcheck probe) ?
I think using the existing healthcheck .cfg file defining which external services needs to be available should be enough.
* if OpenSearch is used instead as a persistence implementation. | ||
* @return true if the provider is available, false otherwise | ||
*/ | ||
default boolean isAvailable() { return true;} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding availability method, healthcheck provider could be inactivated (always return ok) when not taking part in the topology.
The unomi topology should be defined at health check config level, taking it from the environment variables.
|
||
# Security configuration | ||
authentication.realm = ${org.apache.unomi.security.realm:-karaf} | ||
|
||
# Health check configuration | ||
healthcheck.enabled = ${org.apache.unomi.healthcheck.enabled:-false} | ||
healthcheck.providers = ${org.apache.unomi.healthcheck.providers:-cluster,elasticsearch,unomi,persistence} | ||
healthcheck.providers = ${org.apache.unomi.healthcheck.providers:-cluster,elasticsearch,opensearch,unomi,persistence} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using the available() interface operation, using specific providers in different configuration could be enough.
@@ -40,7 +40,7 @@ protected Object doExecute() throws Exception { | |||
// the caller values easier to read. | |||
DefaultPrettyPrinter defaultPrettyPrinter = new DefaultPrettyPrinter(); | |||
defaultPrettyPrinter = defaultPrettyPrinter.withArrayIndenter(DefaultIndenter.SYSTEM_LINEFEED_INSTANCE); | |||
String jsonMetric = CustomObjectMapper.getObjectMapper().writer(defaultPrettyPrinter).writeValueAsString(metric); | |||
String jsonMetric = new ObjectMapper().writer(defaultPrettyPrinter).writeValueAsString(metric); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if creating ObjectMapper on each request is not too much expensive in terms of performances.
Most of the time there is a shared instance across the whole application.
Switching to a global instance may be more efficient I think
|
||
if (ids.size() > maximumIdsQueryCount) { | ||
// Avoid building too big ids query - throw exception instead | ||
throw new UnsupportedOperationException("Too many profiles"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As maximumIdsQueryCount is private and not configurable maybe returning the limit number in the exception could help
...java/org/apache/unomi/persistence/opensearch/conditions/PropertyConditionOSQueryBuilder.java
Outdated
Show resolved
Hide resolved
...java/org/apache/unomi/persistence/opensearch/conditions/PropertyConditionOSQueryBuilder.java
Outdated
Show resolved
Hide resolved
return Collections.singletonMap(property.getItemId(), definition); | ||
} | ||
|
||
private String convertValueTypeToESType(String valueTypeId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change ES to OS
.../src/main/java/org/apache/unomi/persistence/opensearch/OpenSearchPersistenceServiceImpl.java
Outdated
Show resolved
Hide resolved
String index = getIndex(itemType); | ||
client.indices().refresh(r->r.index(index)); | ||
} catch (IOException e) { | ||
e.printStackTrace();//TODO manage ES7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really ES 7 ?
…mi/persistence/opensearch/conditions/PropertyConditionOSQueryBuilder.java
…mi/persistence/opensearch/conditions/PropertyConditionOSQueryBuilder.java
…mi/persistence/opensearch/conditions/PropertyConditionOSQueryBuilder.java
…sistence/opensearch/OpenSearchPersistenceServiceImpl.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the works related to open search support looks good and promising.
I am a bit surprised by the quantity of addition/modifications that are not related to open search support directly. Due to the complexicity of the changes, and the tons of files it involved I would be in favor separating your PR in multiple ones, with dedicated tickets for fixes and quality of life improvements. And this for multiple reasons:
- Would be easier to review
- Would be easier to implem test for those new modified parts, I see the code contains some bug fixes, but I dont see any new integrations tests related to them.
- Would be easier to track in the history and link new features/commands/apis in dedicated ticket (since we are using JIRA ticket to generate change logs.)
One down side I see, is the refactoring of the packaging from kar to better decoupling in features, will impacts projects that are building/repackaging Unomi that rely on existing features/kar naming.
Finally I personally think that the configuration of the persistence layer, based on the unomi:start command is not adapted for configuring the persistence implementation that should be used. I would rather encourage for a configuration driven system.
Appart from that I added comments in few places, mostly regarding missing java docs or small improvements to consider.
Thanks a lot for the works ! 👍
// This is a fix to support proper IPv6 parsing | ||
if (ip != null) { | ||
if (ip.startsWith("[") && ip.endsWith("]")) { | ||
// This can happen with IPv6 addresses, we must remove the markers since our IPAddress library doesn't support them. | ||
ip = ip.substring(1, ip.length() - 1); | ||
} | ||
IPAddress ipAddress = new IPAddressString(ip).getAddress(); | ||
for (IPAddress serverIpAddress : server.getIpAddresses()) { | ||
if (serverIpAddress.contains(ipAddress)) { | ||
return server.getId(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fix looks ok, but what is the link with open search support ?
If it's fixing a bug or issue, would it be more appropriate to separate it in a dedicated PR/tests + tickets.
(current PR is already quiet huge, making it difficult to review, I think that additional fixes not related to open search implem should not be part of the current PR.)
|
||
/** | ||
* Delete an event by specifying its event identifier | ||
* @param eventIdentifier the unique identifier for the event | ||
*/ | ||
void deleteEvent(String eventIdentifier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those new API looks convenient, but I don't see the relation with open search implem.
What is the need to introduce it now ?
I dont see any test coverage for this new method.
|
||
/** | ||
* Delete a session using its identifier | ||
* @param sessionIdentifier the unique identifier for the session | ||
*/ | ||
void deleteSession(String sessionIdentifier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those new API looks convenient, but I don't see the relation with open search implem.
What is the need to introduce it now ?
I dont see any test coverage for this new method.
2. When building from source, use the appropriate Maven profile: | ||
* For ElasticSearch (default): no special profile needed | ||
* For OpenSearch: add `-Duse.opensearch=true` to your Maven command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clarified, mention that it have an impact on build, but this use.opensearch
seem's only for integration tests purpose. So I'm not sure it's interesting to document this option in this place.
## Environment Variables | ||
|
||
### Common Variables | ||
- `UNOMI_AUTO_START`: Specifies the search engine type (`elasticsearch` or `opensearch`, defaults to `elasticsearch`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UNOMI_AUTO_START
naming does not reflect that it's used for choosing which persistence implem gonna be use. I would suggest to look for a more meaningful naming for this option.
@@ -1307,7 +1317,7 @@ protected Boolean execute(Object... args) throws Exception { | |||
itemType = customItemType; | |||
} | |||
String documentId = getDocumentIDForItemType(itemId, itemType); | |||
String index = getIndexNameForQuery(itemType); | |||
String index = getIndex(itemType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this will works for rollover indices like sessions and events.
What if the to_be_removed_itemId is in a past rolled over indice ?
@Argument(name = "persistence", description = "Persistence implementation to use (elasticsearch/opensearch)", valueToShowInHelp = "elasticsearch") | ||
private String selectedPersistenceImplementation = "elasticsearch"; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still looks strnge that the start command is responsible of choosing the persistence implem, I still think this should be handled by configuration.
@Command(scope = "unomi", name = "action-remove", description = "This command will remove an action type.") | ||
@Service | ||
public class ActionRemove extends RemoveCommandSupport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition, but not related to open search support ?
@Command(scope = "unomi", name = "condition-remove", description = "This command will remove an condition type.") | ||
@Service | ||
public class ConditionRemove extends RemoveCommandSupport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition, but not related to open search support ?
@Command(scope = "unomi", name = "event-remove", description = "This command will remove an event.") | ||
@Service | ||
public class EventRemove extends RemoveCommandSupport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice addition, but not related to open search support ?
Hello @jkevan thank you for your review. I will look at your feedback but to answer some of your questions globally :
|
Dear Apache Unomi developers and users,
After 8 months of development, and while I'm at it to start 2025 on a high note, I am proud to share with you and submit for review my contribution on the support of OpenSearch as an alternative to ElasticSearch for Apache Unomi.
A lot of work has gone into this contribution, and I can fully understand that some of you will have questions. I propose to organize a meeting to go over the changes and also to answer any questions you might have.
LIVE DEMONSTRATION ANNOUNCEMENT
I'm pleased to invite you to a live presentation and demonstration of the OpenSearch integration and new development tools.
Date: January 13th, 2025
Time: 17:00 CET
Format: Online presentation with live demonstration
Link: https://calendar.app.google/7ZffzAgwPwhamAAU6
Agenda:
OPENSEARCH CONTRIBUTION RATIONALE
WHERE TO FIND IT
——————————
https://github.com/apache/unomi/tree/opensearch-persistence
WHAT’S INCLUDED
—————————
WHAT IS NOT INCLUDED
————————————
DETAILED CHANGES SUMMARY
I'd like to summarize the significant improvements and changes implemented in the opensearch-persistence branch. This work represents a major enhancement to Apache Unomi's persistence layer and development tooling.
Key Features and Improvements:
Testing and CI/CD Improvements
Architecture and Configuration
Development Tools and Scripts
Command Line Interface Improvements
Code Quality and Maintenance
Test Performance Insights:
The slowest test analysis reveals areas for potential optimization, with some tests taking up to 100 seconds to complete. This information will be valuable for future performance improvements. Also a intermediate progress report is now printed indicating how many tests have succeeded or failed, as well as a time estimation until they are completed and a progress bar.
Next Steps:
The changes represent a significant step forward in making Apache Unomi more flexible and maintainable, while providing developers with better tools for development and debugging.
Please following this checklist to help us incorporate your contribution quickly and easily:
for the change (usually before you start working on it). Trivial changes like typos do not
require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
[UNOMI-XXX] - Title of the pull request
significant new parts of code.
Copy the description to the related JIRA issue
mvn clean install -P integration-tests
to make sure basic checks pass. A more thorough check will beperformed on your pull request automatically.
Trivial changes like typos do not require a JIRA issue (javadoc, project build changes, small doc changes, comments...).
If this is your first contribution, you have to read the Contribution Guidelines
If your pull request is about ~20 lines of code you don't need to sign an Individual Contributor License Agreement
if you are unsure please ask on the developers list.
To make clear that you license your contribution under the Apache License Version 2.0, January 2004
you have to acknowledge this by using the following check-box.