FWIW, I posted this on the Elasticsearch forum but got no response thus far, so I am posting it here too, since it definitely is analysis combo related.
I'm seeing duplicate concatenated values when using the combo analyzer for _all using a multi-field defined in a dynamic template.
For example, instead of seeing "Foo Bar" when listing the _all terms aggregation, I'm seeing "Foo Bar Foo Bar" as the token, because my multi-field defines 2 sub-fields. If the multi-field is defined with 4 sub-fields, then "Foo Bar" is concatenated 4 times.
My setup is below.
Elasticsearch 1.0.0 on CentOS 6.4 with Java 1.7.0_51.
$ES_HOME/config/default-mapping.json:
{
  "default": {
    "_all": {
      "enabled": true,
      "analyzer": "combo",
      "store": false
    },
    "dynamic_templates": {
      "string_multifield_template": {
        "match": "*",
        "match_mapping_type": "string",
        "mapping": {
          "include_in_all": false,
          "fields": {
            "{name}": {
              "index": "not_analyzed",
              "store": true,
              "type": "string"
            },
            "lowercase": {
              "analyzer": "lowercase",
              "index": "analyzed",
              "store": false,
              "type": "string"
            }
          }
        }
      }
    }
  }
}
$ES_HOME/config/elasticsearch.yml:
...
index.analysis.analyzer.lowercase.type: custom
index.analysis.analyzer.lowercase.tokenizer: keyword
index.analysis.analyzer.lowercase.filter: [ lowercase ]
index.analysis.analyzer.combo.type: custom
index.analysis.analyzer.combo.sub_analyzers: [ keyword, lowercase ]
index.analysis.analyzer.combo.deduplication: true
index.analysis.analyzer.combo.tokenstream_reuse: false
...
The aggregation query I use is the following:
{
  "aggs": {
    "_all": {
      "terms": {
        "field": "_all"
      }
    }
  }
}
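For anyone trying to confirm the same symptom, here is a minimal sketch in Python that scans the buckets returned by the aggregation above and flags keys that look like one value repeated several times (e.g. "Foo Bar Foo Bar"). The helper names `repeated_unit` and `suspicious_buckets` are hypothetical and not part of Elasticsearch or the combo analyzer plugin. The check is only a heuristic: a value that legitimately repeats itself would also be flagged.

```python
def repeated_unit(term, sep=" "):
    """Return (unit, count) if `term` is one unit repeated count > 1 times,
    joined by `sep`; otherwise return None. Hypothetical helper."""
    parts = term.split(sep)
    n = len(parts)
    # Try every unit length that divides the number of parts evenly,
    # shortest first, so the smallest repeating unit is reported.
    for size in range(1, n // 2 + 1):
        if n % size:
            continue
        unit = parts[:size]
        if all(parts[i:i + size] == unit for i in range(0, n, size)):
            return sep.join(unit), n // size
    return None


def suspicious_buckets(response, agg_name="_all"):
    """Yield (key, unit, count) for terms-aggregation buckets whose key
    looks like a duplicated concatenation. Assumes the usual response
    shape: response["aggregations"][agg_name]["buckets"]."""
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hit = repeated_unit(bucket["key"])
        if hit is not None:
            yield bucket["key"], hit[0], hit[1]
```

Feeding the parsed JSON response of the aggregation into `suspicious_buckets` should list each affected term together with the repeat count, which in my case would be expected to match the number of sub-fields in the multi-field.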