Ignore_malformed enabled by default and cannot be disabled on fields of type completion #47166

Closed
andrejbl opened this issue Sep 26, 2019 · 6 comments · Fixed by #48206
Labels
>docs General docs changes :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch v7.6.0 v8.0.0-alpha1

andrejbl commented Sep 26, 2019

Elasticsearch version (bin/elasticsearch --version):
6.8.3

Plugins installed: [discovery-ec2, repository-s3]

JVM version (java -version):
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)

OS version (uname -a if on a Unix-like system):
Linux es-data-2 4.14.143-91.122.amzn1.x86_64 #1 SMP Wed Sep 11 00:43:34 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
We have an index with a field mapping of type 'completion' defined like so:

{
  "mappings" : {
    "_doc" : {
      "dynamic" : "false",
      "properties" : {
        "id" : {
          "type" : "keyword"
        },
        "term" : {
          "type" : "completion",
          "analyzer" : "analyzer_search_term_suggestions",
          "search_analyzer" : "standard",
          "preserve_separators" : true,
          "preserve_position_increments" : true,
          "max_input_length" : 50
        },
        "type" : {
          "type" : "keyword"
        }
      }
    }
  },
  "settings" : {
    "index" : {
      "number_of_shards" : "1",
      "analysis" : {
        "filter" : {
          "custom_stop_filter" : {
            "type" : "stop",
            "stopwords" : [
              "within",
              "without"
            ]
          }
        },
        "analyzer" : {
          "analyzer_search_term_suggestions" : {
            "filter" : [
              "standard",
              "lowercase",
              "stop",
              "custom_stop_filter"
            ],
            "char_filter" : [
              "html_strip"
            ],
            "type" : "custom",
            "tokenizer" : "standard"
          }
        }
      }
    }
  }
}

In ES version 6.3.0, which we had been running until now, indexing a malformed document failed, as expected:

curl -XPUT 'localhost:9200/app_store_suggested_search_term_test/_doc/app-name' -H 'Content-type: application/json' -d'{
   "id": "app-name",
   "term": "",
   "type" : "full-app-name"
}'

{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse"}],"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"illegal_argument_exception","reason":"value must have a length > 0"}},"status":400}

After upgrading ES to 6.8.3, the ingestion of such documents works, with the term field being flagged as _ignored:

curl -XPUT 'localhost:9200/app_store_suggested_search_term_test/_doc/full_app_name-' -H 'Content-type: application/json' -d'{
   "id": "full_app_name-",
   "term": "",
   "type" : "full-app-name"
}'
{"_index":"app_store_suggested_search_term_test","_type":"_doc","_id":"full_app_name-","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":2,"_primary_term":1}

...
{
        "_index" : "app_store_suggested_search_term_test",
        "_type" : "_doc",
        "_id" : "full_app_name-",
        "_score" : 1.0,
        "_ignored" : [
          "term"
        ],
        "_source" : {
          "id" : "full_app_name-",
          "term" : "",
          "type" : "full-app-name"
        }
      }
}

Documentation suggests that this behaviour should only happen when the ignore_malformed setting has been set to true on the index or the field itself. However, that setting is not set and we still see the behaviour. In fact, setting the ignore_malformed index level setting to either true or false on the index seems not to have any impact on the behaviour.
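
To see how many documents are affected, the `_ignored` metadata field can be searched directly with `exists` or `term` queries. The following is a minimal Python sketch that only builds the request bodies; the helper names are hypothetical, the index and field names are taken from the example above, and actually sending the request (with curl or a client library) is left out.

```python
import json


def ignored_any_query():
    """Body matching every document that has at least one ignored field."""
    return {"query": {"exists": {"field": "_ignored"}}}


def ignored_field_query(field_name):
    """Body matching documents where `field_name` was ignored at index time."""
    return {"query": {"term": {"_ignored": field_name}}}


if __name__ == "__main__":
    # Intended as the body for a search against
    # app_store_suggested_search_term_test/_search
    print(json.dumps(ignored_field_query("term"), indent=2))
```
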

Steps to reproduce:

  1. Create an index with the ignore_malformed explicitly set to false:
curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'  
{
  "settings" : {
    "index" : {
     "mapping" : {
        "ignore_malformed": false
      },
      "number_of_shards" : 1, 
      "number_of_replicas" : 1,
      "analysis" : {
        "filter" : {
          "custom_stop_filter" : {
            "type" : "stop",
            "stopwords" : [
              "within",
              "without"
            ]
          }
        },
        "analyzer" : {
          "analyzer_search_term_suggestions" : {
            "filter" : [
              "standard",
              "lowercase",
              "stop",
              "custom_stop_filter"
            ],
            "char_filter" : [
              "html_strip"
            ],
            "type" : "custom",
            "tokenizer" : "standard"
          }
        }
      }
    }
  }
}'

curl -XPUT 'localhost:9200/my_index/_mapping/_doc?pretty' -H 'Content-Type: application/json' -d'  
{
 "dynamic" : "false",
 "properties" : {
   "id" : {
     "type" : "keyword"
   },
   "term" : {
     "type" : "completion",
     "analyzer" : "analyzer_search_term_suggestions",
     "search_analyzer" : "standard",
     "preserve_separators" : true,
     "preserve_position_increments" : true,
     "max_input_length" : 50
   },
   "type" : {
     "type" : "keyword"
   }
 }
}'
  2. Index a document with the malformed field:
curl -XPUT 'localhost:9200/my_index/_doc/full_app_name-' -H 'Content-type: application/json' -d'{
  "id": "full_app_name",
  "term": "",
  "type" : "full-app-name"
}'
  3. Ingestion should result in a failure. Instead it succeeds.
@ebadyano ebadyano added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Sep 26, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@cbuescher cbuescher self-assigned this Oct 11, 2019
@cbuescher
Member

The change in behaviour is due to a bugfix for #23121 that went into 6.4.0 with #30713. The previous IAE behaviour has nothing to do with enabling or disabling the ignore_malformed option, which doesn't seem to have any effect on completion fields. In fact, when trying to set that option at the field level in newer versions, you will get an error stating that this option is unsupported.
Since the behaviour was changed in response to something that was considered a bug, I don't think we are going to change this, but out of interest, why would you need the "old" behaviour of getting an error on empty strings?

@andrejbl
Author

OK, this clarifies the behaviour a bit. We do have some code, mainly test code, that fetches all the documents from an index and therefore needed to be updated to handle documents with ignored fields (previously it would safely handle the IAE, so malformed documents would never end up in the index).
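
As an illustration of the kind of adjustment meant here, a reader of search results can separate hits that carry an `_ignored` entry from those that do not. This is only a sketch: `split_by_ignored` is a hypothetical helper, the hit layout follows the search response shown in the issue description, and the second sample hit is made up for the example.

```python
def split_by_ignored(hits):
    """Partition search hits into (clean, flagged) lists.

    A hit is 'flagged' when it carries a non-empty `_ignored` list,
    i.e. at least one field was skipped at index time.
    """
    clean, flagged = [], []
    for hit in hits:
        (flagged if hit.get("_ignored") else clean).append(hit)
    return clean, flagged


hits = [
    {"_id": "full_app_name-", "_ignored": ["term"],
     "_source": {"id": "full_app_name-", "term": "", "type": "full-app-name"}},
    # Made-up well-formed document for comparison.
    {"_id": "app-name",
     "_source": {"id": "app-name", "term": "app name", "type": "full-app-name"}},
]

clean, flagged = split_by_ignored(hits)
print([h["_id"] for h in clean])    # ['app-name']
print([h["_id"] for h in flagged])  # ['full_app_name-']
```
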

It would be good to update the documentation accordingly, as that is what confused us in the first place. We were under the impression that by setting the ignore_malformed setting to false we would get the old IAE behaviour instead of the new one.
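
Since the setting does not bring back the old failure, one way to approximate it is to reject empty values on the client before indexing. The sketch below is an assumption about how that could look, not something Elasticsearch provides: `check_completion_input` is a hypothetical helper, it only covers a plain string or a list of strings, and the error text simply mirrors the old server-side message.

```python
def check_completion_input(doc, field="term"):
    """Raise ValueError if `field` is present but contains an empty string.

    Returns the document unchanged when it passes the check.
    """
    if field not in doc:
        return doc  # an absent field is not treated as malformed
    value = doc[field]
    inputs = value if isinstance(value, list) else [value]
    for item in inputs:
        if item == "":
            raise ValueError(f"{field}: value must have a length > 0")
    return doc


try:
    check_completion_input(
        {"id": "full_app_name", "term": "", "type": "full-app-name"}
    )
except ValueError as err:
    print(f"rejected before indexing: {err}")
```

Doing the check on the client keeps malformed documents out of the index entirely, at the cost of duplicating a rule that the server used to enforce.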

@cbuescher
Member

cbuescher commented Oct 17, 2019

It would be good to update the documentation accordingly

That makes sense. I'll mark this as a documentation issue then, since I think the general behaviour is as intended.
We should look into updating this in the section about the ignore_malformed parameter.

@cbuescher cbuescher added >docs General docs changes and removed feedback_needed labels Oct 17, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-docs (>docs)

cbuescher pushed a commit to cbuescher/elasticsearch that referenced this issue Oct 17, 2019
The `ignore_malformed` setting only works on selected mapping types, otherwise
we throw a mapper_parsing_exception. We should add a list of all the mapping
types that support it, since the number of types not supporting it seems larger.

Closes elastic#47166
cbuescher pushed a commit that referenced this issue Oct 18, 2019
The `ignore_malformed` setting only works on selected mapping types, otherwise
we throw a mapper_parsing_exception. We should add a list of all the mapping
types that support it, since the number of types not supporting it seems larger.

Closes #47166
@marstalk

Any update to the documentation? There isn't any description of the completion type in this section: ignore_malformed parameter

@javanna javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024