Elasticsearch case-insensitive query_string query with wildcards

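In my ES mapping I have a "uri" field that is currently set to not_analyzed, and I am not allowed to change the mapping. I want to search for parts of the URI with a query_string query like this (the ES query is generated automatically, which is why it is a bit complicated, but let's focus on the query_string part):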

{
  "sort": [{"updated": {"order": "desc"}}],
  "query": {
    "bool": {
      "must": [{
        "query_string": {
          "query": "*w3.org/2014/01/a*",
          "lowercase_expanded_terms": true,
          "default_field": "uri"
        }
      }],
      "minimum_number_should_match": 1
    }
  },
  "size": 50
}

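This generally works fine, but I have the following (made-up) URL stored: http://w3.org/2014/01/Abc.html, and because of the A/a difference this query does not return it. Setting lowercase_expanded_terms to false does not solve the problem either. What should I do to make this query case-insensitive?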

Thanks in advance for your help.

Answers:

Try using a match query instead of a query_string query.

{
  "sort": [
    {
      "updated": {
        "order": "desc"
      }
    }
  ],
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "uri": "*w3\\.org\\/2014\\/01\\/a*"
          }
        }
      ]
    }
  },
  "size": 50
}

Wildcard terms in query_string queries are not analyzed, but match queries are analyzed.

From the docs, it seems like you need a new analyzer that first lowercases the text, so that the search can then run against lowercase terms. Have you tried that? See http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/sorting-collations.html
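
If re-creating the index were an option (the question says the mapping cannot be changed, so this is only a sketch of what the suggestion might look like), a custom analyzer could keep the whole URI as one token and lowercase it. The Python below only builds and prints such an index-creation body. The analyzer name lowercase_keyword and the type name doc are made up for illustration, and the string field type assumes an older Elasticsearch version in which not_analyzed still exists.

```python
import json

# Hypothetical index-creation body: the analyzer and type names are placeholders.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",   # keep the whole URI as one token
                    "filter": ["lowercase"],  # lowercase that token at index time
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "uri": {"type": "string", "analyzer": "lowercase_keyword"}
            }
        }
    },
}

# Print the body as JSON so it can be inspected or sent by a client of choice.
print(json.dumps(index_body, indent=2))
```

With the terms stored in lowercase, a wildcard pattern that lowercase_expanded_terms has already lowercased would be compared against lowercase text on both sides.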

As I read it, your option, lowercase_expanded_terms, only applies to expansions, not to regular words. From http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html:

lowercase_expanded_terms: Whether terms of wildcard, prefix, fuzzy, and range queries are to be automatically lower-cased or not (since they are not analyzed). Defaults to true.
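
To see why lowercasing the expanded term does not help in the question's case, a small Python sketch can mimic the situation. It assumes the not_analyzed field stores the URI as one exact, case-preserved term, and it uses the standard library's fnmatch.fnmatchcase as a rough stand-in for wildcard matching. The helper name wildcard_hits is made up for illustration, and this is not how Elasticsearch itself evaluates the query.

```python
from fnmatch import fnmatchcase

# Assumption: a not_analyzed field indexes the whole value as a single,
# case-preserved term.
stored_term = "http://w3.org/2014/01/Abc.html"


def wildcard_hits(pattern, term, lowercase_expanded_terms=True):
    """Rough stand-in for a wildcard term match (hypothetical helper)."""
    if lowercase_expanded_terms:
        # Only the query side is lowercased; the stored term is left untouched.
        pattern = pattern.lower()
    return fnmatchcase(term, pattern)


# The lowercase "a" in the query never meets the stored uppercase "A",
# whether or not the option is enabled.
print(wildcard_hits("*w3.org/2014/01/a*", stored_term))
print(wildcard_hits("*w3.org/2014/01/a*", stored_term, lowercase_expanded_terms=False))

# An exact-case pattern matches only when lowercasing is switched off.
print(wildcard_hits("*w3.org/2014/01/A*", stored_term, lowercase_expanded_terms=False))
```

In this model the option changes only the query text, so it cannot bridge a case difference that lives in the indexed term.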