[9.0] [ML] Show analysis not available for vector fields in Index Data Visualizer (#209945) (#210251)

# Backport

This will backport the following commits from `main` to `9.0`:
- [[ML] Show analysis not available for vector fields in Index Data
Visualizer (#209945)](#209945)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

## Summary

Source pull request: [[ML] Show analysis not available for vector fields in Index Data Visualizer (#209945)](https://github.com/elastic/kibana/pull/209945), committed to `main` on 2025-02-07 as `14eefced0fb7f36b609d7a643215b158211e1b91`. Suggested target branches: `9.0` and `8.18`.

In 9.0, vector fields such as vector embeddings or offsets are no longer exposed in the Elasticsearch API, which makes it impossible to sample the count and show examples. This PR makes the expanded rows for these fields indicate that analysis is not available.

Co-authored-by: Quynh Nguyen (Quinn) <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
3 people authored Feb 10, 2025
1 parent 0d4f41a commit 3421829
Showing 5 changed files with 56 additions and 2 deletions.
@@ -24,6 +24,7 @@ import type { FieldVisConfig } from '../stats_table/types';
 import type { CombinedQuery } from '../../../index_data_visualizer/types/combined_query';
 import { LoadingIndicator } from '../loading_indicator';
 import { ErrorMessageContent } from '../stats_table/components/field_data_expanded_row/error_message';
+import { NotSupportedContent } from '../not_in_docs_content/not_supported_content';

 export const IndexBasedDataVisualizerExpandedRow = ({
   item,
@@ -55,6 +56,10 @@ export const IndexBasedDataVisualizerExpandedRow = ({
   const dvExpandedRow = useExpandedRowCss();

   function getCardContent() {
+    if (type === 'unknown' || type.includes('vector') || item.secondaryType?.includes('vector')) {
+      return <NotSupportedContent />;
+    }
+
     if (existsInDocs === false) {
       return <NotInDocsContent />;
     }
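
The guard added to `getCardContent()` can be read as a standalone predicate. The following is a minimal sketch, not code from the commit: the `FieldLike` interface and the `isAnalysisUnsupported` name are hypothetical, and the sample type strings (`dense_vector`, `sparse_vector`) are assumed values used only for illustration.

```typescript
// Hypothetical shape: only the two properties the guard in the diff reads.
interface FieldLike {
  type: string;
  secondaryType?: string;
}

// Mirrors the condition in the diff: a field is treated as unsupported when
// its type is 'unknown' or when its type or secondary type contains 'vector'.
function isAnalysisUnsupported(field: FieldLike): boolean {
  return (
    field.type === 'unknown' ||
    field.type.includes('vector') ||
    (field.secondaryType?.includes('vector') ?? false)
  );
}

// Assumed sample inputs, for illustration only.
const examples: FieldLike[] = [
  { type: 'keyword' },
  { type: 'unknown' },
  { type: 'dense_vector' },
  { type: 'text', secondaryType: 'sparse_vector' },
];

for (const field of examples) {
  console.log(field.type, field.secondaryType ?? '-', isAnalysisUnsupported(field));
}
```

Because the check runs before the `existsInDocs === false` branch, an unsupported field shows the "analysis is not available" panel regardless of whether the field was found in any sampled documents.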

@@ -0,0 +1,24 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+import React, { Fragment } from 'react';
+import { FormattedMessage } from '@kbn/i18n-react';
+import { EuiIcon, EuiText } from '@elastic/eui';
+import type { FC } from 'react';
+
+export const NotSupportedContent: FC = () => (
+  <Fragment>
+    <EuiText textAlign="center">
+      <EuiIcon type="warning" />
+    </EuiText>
+    <EuiText textAlign="center" size={'xs'}>
+      <FormattedMessage
+        id="xpack.dataVisualizer.dataGrid.field.analysisNotSupportedLabel"
+        defaultMessage="Analysis is not available for this field."
+      />
+    </EuiText>
+  </Fragment>
+);

@@ -44,10 +44,13 @@ const EmbeddableFieldStatsTableWrapper = (
     searchString,
     extendedColumns,
     progress,
+    overallStats,
     overallStatsProgress,
     setLastRefresh,
   } = useDataVisualizerGridData(props, dataVisualizerListState);

+  const totalCount = overallStats?.totalCount;
+
   useEffect(() => {
     setLastRefresh(Date.now());
   }, [props?.lastReloadRequestTime, setLastRefresh]);
@@ -93,6 +96,7 @@ const EmbeddableFieldStatsTableWrapper = (
       onChange={onTableUpdate}
       loading={progress < 100}
       overallStatsRunning={overallStatsProgress.isRunning}
+      totalCount={totalCount}
       renderFieldName={props.renderFieldName}
     />
   );

@@ -31,6 +31,7 @@ import {
   isAggregatableFieldOverallStats,
   isNonAggregatableFieldOverallStats,
   isNonAggregatableSampledDocs,
+  isUnsupportedVectorField,
   processAggregatableFieldsExistResponse,
   processNonAggregatableFieldsExistResponse,
 } from '../search_strategy/requests/overall_stats';
@@ -214,6 +215,9 @@ export function useOverallStats<TParams extends OverallStatsSearchStrategyParams
       const nonAggregatableFields = hasPopulatedFieldsInfo
         ? originalNonAggregatableFields.filter((fieldName) => populatedFieldsInIndex.has(fieldName))
         : originalNonAggregatableFields;
+      const supportedNonAggregatableFields = nonAggregatableFields.filter((fieldName) => {
+        return !isUnsupportedVectorField(fieldName);
+      });

       const documentCountStats = await getDocumentCountStats(
         data.search,
@@ -227,7 +231,7 @@ export function useOverallStats<TParams extends OverallStatsSearchStrategyParams
         .search<IKibanaSearchRequest, IKibanaSearchResponse>(
           {
             params: getSampleOfDocumentsForNonAggregatableFields(
-              nonAggregatableFields,
+              supportedNonAggregatableFields,
               index,
               searchQuery,
               timeFieldName,
@@ -244,7 +248,7 @@ export function useOverallStats<TParams extends OverallStatsSearchStrategyParams
           })
         );

-      const nonAggregatableFieldsObs = nonAggregatableFields.map((fieldName: string) =>
+      const nonAggregatableFieldsObs = supportedNonAggregatableFields.map((fieldName: string) =>
         data.search
           .search<IKibanaSearchRequest, IKibanaSearchResponse>(
             {

@@ -252,6 +252,10 @@ export const checkNonAggregatableFieldExistsRequest = (

 const DEFAULT_DOCS_SAMPLE_OF_TEXT_FIELDS_SIZE = 1000;

+export const isUnsupportedVectorField = (fieldName: string) => {
+  return fieldName.endsWith('.chunks.embeddings') || fieldName.endsWith('.chunks.offset');
+};
+
 export const getSampleOfDocumentsForNonAggregatableFields = (
   nonAggregatableFields: string[],
   dataViewTitle: string,
@@ -305,6 +309,19 @@ export const processNonAggregatableFieldsExistResponse = (
       });
       return;
     }
+    if (isUnsupportedVectorField(fieldName)) {
+      stats.nonAggregatableExistsFields.push({
+        fieldName,
+        existsInDocs: true,
+        stats: {
+          count: undefined,
+          cardinality: undefined,
+          sampleCount: undefined,
+        },
+      });
+      return;
+    }
+
     const foundField = results.find((r) => r.rawResponse.fieldName === fieldName);
     const existsInDocs = foundField !== undefined && foundField.rawResponse.hits.total > 0;

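
Taken together, the last two files split non-aggregatable fields into two groups: fields that are still sampled, and vector sub-fields that skip sampling and receive a placeholder entry with undefined stats. The sketch below illustrates that split in isolation. The suffix check is copied from the diff; the `FieldExistsEntry` interface, the `partitionFields` helper, and the sample field names are hypothetical and exist only for this example.

```typescript
// Same suffix check as isUnsupportedVectorField in the diff.
const isUnsupportedVectorField = (fieldName: string): boolean =>
  fieldName.endsWith('.chunks.embeddings') || fieldName.endsWith('.chunks.offset');

// Hypothetical, simplified version of the entry pushed to
// stats.nonAggregatableExistsFields in the diff.
interface FieldExistsEntry {
  fieldName: string;
  existsInDocs: boolean;
  stats: {
    count: number | undefined;
    cardinality: number | undefined;
    sampleCount: number | undefined;
  };
}

// Hypothetical helper: separates fields that can be sampled from vector
// sub-fields, which get a placeholder entry instead of a search request.
function partitionFields(fieldNames: string[]): {
  toSample: string[];
  placeholders: FieldExistsEntry[];
} {
  const toSample: string[] = [];
  const placeholders: FieldExistsEntry[] = [];
  for (const fieldName of fieldNames) {
    if (isUnsupportedVectorField(fieldName)) {
      placeholders.push({
        fieldName,
        existsInDocs: true, // assumed present, as in the diff, since it cannot be checked
        stats: { count: undefined, cardinality: undefined, sampleCount: undefined },
      });
    } else {
      toSample.push(fieldName);
    }
  }
  return { toSample, placeholders };
}

// Assumed field names, for illustration only.
const result = partitionFields([
  'message',
  'content.chunks.embeddings',
  'content.chunks.offset',
]);
console.log(result.toSample);
console.log(result.placeholders.map((entry) => entry.fieldName));
```

Marking the placeholder as `existsInDocs: true` keeps the field out of the "not in docs" branch, so the expanded row falls through to the "analysis is not available" message instead.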
