Field level security for nested fields #2922

Closed
fdb3 opened this issue Jun 30, 2023 · 6 comments
Labels: bug, triaged, wontfix

Comments

@fdb3
fdb3 commented Jun 30, 2023

Hi there,

While configuring access to specific indexes for some users/roles, we have run into a problem. I currently work for Wageningen University & Research, team Integration Services, with Sander van de Geijn. Our situation is as follows: our team uses an OpenSearch cluster with a number of indexes. Some of these indexes contain a list of multiple identifiers, each with a type and a value. Because of the nature and purpose of these fields, it is important that a user can perform a nested query on them. Therefore, we have set this field as "nested" in its mapping.

However, not all users wish to use a nested query and prefer a normal query for these fields. To accommodate both preferences, we have tried applying the option ‘"include_in_parent": true’ for the index's nested field. According to the documentation and my interpretation of it, this option would enable both nested and non-nested queries for the respective nested field. We have tested this with our own admin rights and were indeed able to use both query types on a nested field that has this property.
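A minimal sketch of the setup described above, written as Python dictionaries that mirror the JSON request bodies. The field names `identifiers`, `type`, and `value`, and the sample values, are assumptions for illustration; only the `nested` type and the `include_in_parent` option come from the description.

```python
import json

# Hypothetical mapping: a nested "identifiers" field that is also
# copied into the parent document via include_in_parent.
mapping = {
    "mappings": {
        "properties": {
            "identifiers": {
                "type": "nested",
                "include_in_parent": True,
                "properties": {
                    "type": {"type": "keyword"},
                    "value": {"type": "keyword"},
                },
            }
        }
    }
}

# Nested query: both conditions must match within the same identifier object.
nested_query = {
    "query": {
        "nested": {
            "path": "identifiers",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"identifiers.type": "isbn"}},
                        {"term": {"identifiers.value": "12345"}},
                    ]
                }
            },
        }
    }
}

# Non-nested query: addresses the copy of the field in the parent document.
flat_query = {"query": {"term": {"identifiers.value": "12345"}}}

for label, body in [("mapping", mapping), ("nested", nested_query), ("flat", flat_query)]:
    print(label)
    print(json.dumps(body, indent=2))
```

With admin rights, both query bodies returned results against a field mapped this way.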

This behaviour, however, does not hold for other users. Clients who use our OpenSearch service will not have the same rights/roles as an admin user. For security purposes, we use field level security to restrict their access to the required fields. In our tests, we discovered that field level security combined with "include_in_parent" set to true does not yield the results we expected. We assumed that a user with field level security would also be able to use both nested and non-nested queries on a nested field for which "include_in_parent" is set to true, as was the case for admin users. Our tests showed that when "include_in_parent" is set to true, these users can only perform non-nested queries, whereas nested queries return no results and no error or warning either. On the other hand, when "include_in_parent" is set to false, the results seem to be reversed: nested queries return results, but non-nested queries do not. Additionally, when we remove the field level security restrictions for these users entirely, both nested and non-nested queries return results, thus showing the expected behaviour.
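For context, a sketch of what a role with field level security might look like, assuming the security plugin's role format with an `fls` list under `index_permissions`. The role name, index pattern, and field list are hypothetical and are not the actual configuration used in these tests.

```python
import json

# Hypothetical role name and index pattern, for illustration only.
role_name = "client_read_role"

role_body = {
    "index_permissions": [
        {
            "index_patterns": ["records-*"],
            # Field level security: only these fields are visible to the role.
            "fls": ["title", "identifiers.type", "identifiers.value"],
            "allowed_actions": ["read"],
        }
    ]
}

# Request line for the security plugin's roles API.
request_line = f"PUT _plugins/_security/api/roles/{role_name}"

print(request_line)
print(json.dumps(role_body, indent=2))
```

Removing the `fls` entry from such a role corresponds to the last test described above, where both query types returned results again.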

Therefore, I was wondering whether we have missed something. Is there an additional setting that must be added to enable a user/role with field level security to use both nested and non-nested queries on a nested field? Or might this indeed be erroneous behaviour? The fact that some queries will not return any results will inevitably lead to confusion and will probably mislead users into believing that the requested records do not exist.

@fdb3 fdb3 added bug Something isn't working untriaged Require the attention of the repository maintainers and may need to be prioritized labels Jun 30, 2023
@stephen-crawford
Collaborator

[Triage] Hi @fdb3, thank you for filing this issue. This is a request we have encountered in the past but one that is not feasible due to the free-style structure of the low level objects inside the fields. There is a workaround possible by raising the schema level of the fields to treat the internal objects as the fields themselves. Hopefully this will help with your use case.

@stephen-crawford stephen-crawford added triaged Issues labeled as 'Triaged' have been reviewed and are deemed actionable. wontfix This will not be worked on and removed untriaged Require the attention of the repository maintainers and may need to be prioritized labels Jul 10, 2023
@sandervandegeijn

Hi Stephen, that could be a viable solution. Not entirely sure how to implement this, can you give us a pointer?

Thanks!

Sander

@stephen-crawford
Collaborator

Hi @ict-one-nl,

I saw that you previously tried the combination of field mapping and nested fields that was suggested in #2834. What I had in mind was the use of nested fields, which effectively raises the level of the field objects: each object is stored as its own document, so you can apply FLS to the fields much as you would apply DLS. The example here shows the general usage. That said, I saw you mentioned it did not work for you before...

I would have expected something like:

PUT testindex1/_doc/100
{ 
  "patients": [ 
    {"name" : "John Doe", "age" : 56, "smoker" : true},
    {"name" : "Mary Major", "age" : 85, "smoker" : false}
  ] 
}

GET testindex1/_search
{
  "query": {
    "match": {
      "patients.name.keyword": "John Doe"
    }
  }
}

I expected this to return just the entry whose name matches John Doe.

Unfortunately, because of how the search operation works, we instead get back the whole document that contains the matching term, which includes both John and Mary.

I tried to investigate this further and went so far as to look into whether you could use the nested fields and fls to provide a specific match. Unfortunately, while you can hide specific fields from the result, you cannot do so based only on the entries with matching fields. That is, you cannot hide the patient name only if it matches Mary. Either you hide all patient names, or you don't...

It seems like this is a configuration we do not currently support, so your best bet would be either pre-filtering or post-filtering your data to remove the sensitive entries. It is also possible to separate the data into two different indices and then run your query only on the indices with the data you want accessed. Perhaps one of these options would support your use case?
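As a rough illustration of the post-filtering option, the sketch below trims the entries of each returned document on the client side. The helper `post_filter_entries` is hypothetical and not part of OpenSearch or any client library, and the response dictionary is hand-written to mirror the shape of a search response for document 100 above.

```python
from copy import deepcopy


def post_filter_entries(response, list_field, keep):
    """Return each hit's _source with list_field reduced to entries where keep(entry) is true."""
    filtered = []
    for hit in response.get("hits", {}).get("hits", []):
        # Copy so the original response is left untouched.
        source = deepcopy(hit.get("_source", {}))
        source[list_field] = [entry for entry in source.get(list_field, []) if keep(entry)]
        filtered.append(source)
    return filtered


# Hand-written stand-in for the search response; the search itself
# returns the whole document, including both patients.
response = {
    "hits": {
        "hits": [
            {
                "_id": "100",
                "_source": {
                    "patients": [
                        {"name": "John Doe", "age": 56, "smoker": True},
                        {"name": "Mary Major", "age": 85, "smoker": False},
                    ]
                },
            }
        ]
    }
}

result = post_filter_entries(response, "patients", lambda p: p["name"] == "John Doe")
print(result)
```

Note that this only hides entries after the fact: the full document still leaves the cluster, so it is not a substitute for a server-side restriction when the other entries are sensitive.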

@sandervandegeijn

Great thanks Stephen, will look into it this week. Everybody is on holiday so got some time on my hands :)

@davidlago

@ict-one-nl closing this for now. It seems like this will not be resulting in any fixes/enhancements (beyond perhaps the overlap with the conversation in #2834)

@sandervandegeijn

Thanks Dave, no problem. Sorry I haven't followed up at this time, will do that. Been busy investigating a bug in the s3 snapshot plugin :)
