Add mapped types to scripting fields api #79105

Open
23 of 49 tasks
jdconrad opened this issue Oct 13, 2021 · 5 comments
Labels
:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement Meta Team:Core/Infra Meta label for core/infra team

Comments

@jdconrad
Contributor

jdconrad commented Oct 13, 2021

We need to make all the mapped types (that make sense to expose) available for the scripting fields api. The following is a list of all mapped field types:

Second check box shows support for source fallback.

Done

Must

(Completed)

Nice to have

Unsupported

  • alias
  • aggregate_metric_double
  • match_only_text
  • annotated_text
  • completion
  • search_as_you_type
  • token_count
  • rank_feature
  • rank_features
  • percolator

Discuss

The following links have tests that may be re-usable as we add in certain basic field types:

@jdconrad jdconrad added >enhancement :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache labels Oct 13, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Oct 13, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@stu-elastic
Contributor

Next steps:

  • figure out which fields we want along w/ what additional info we need
  • implement discussed fields with getValue and getValues for their native type

stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Oct 28, 2021
Add support for boolean fields with the following public API:

* `boolean getValue(boolean)`
  Return the first boolean or the given default

* `boolean getValue(int, boolean)`
  Return the boolean at index or the given default

* `boolean[] getValues()`
  Return a copy of the array of booleans, may return a zero length array.

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Nov 4, 2021
Add support for boolean fields with the following public API:

* `boolean get(boolean)`
  Return the first boolean or the given default

* `boolean get(int, boolean)`
  Return the boolean at index or the given default

* `Iterator<Boolean> iterator()`
  Return an iterator over the booleans, which may be empty.

Refs: #79105
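
A minimal sketch of the getter shape described in this commit message, assuming a plain `boolean[]` as the backing store. The class name `BooleanFieldSketch` is hypothetical and is not the Elasticsearch implementation; it only illustrates the "value or default" contract and the iterator.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a boolean script field; not the Elasticsearch class.
final class BooleanFieldSketch implements Iterable<Boolean> {
    private final boolean[] values;

    BooleanFieldSketch(boolean... values) {
        this.values = values.clone();
    }

    // First value, or the default when the document has no values.
    boolean get(boolean defaultValue) {
        return get(0, defaultValue);
    }

    // Value at index, or the default when the index is out of range.
    boolean get(int index, boolean defaultValue) {
        if (index < 0 || index >= values.length) {
            return defaultValue;
        }
        return values[index];
    }

    @Override
    public Iterator<Boolean> iterator() {
        return new Iterator<Boolean>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < values.length;
            }

            @Override
            public Boolean next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return values[next++];
            }
        };
    }
}
```

Because the getters never throw on a missing value, a script can read a field without first checking whether the document has a value.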
stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Nov 29, 2021
* Adds DocValuesField for byte, double, float, integer, long and short

The integral types implement ScriptDocValues.Supplier<Long> while the
floating point types implement ScriptDocValues.Supplier<Double>.

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Nov 30, 2021
)

Adds DocValuesField for byte, double, float, integer, long and short.

The integral types implement ScriptDocValues.Supplier<Long> while the
floating point types implement ScriptDocValues.Supplier<Double>.

Refs: #79105
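
To illustrate the split between integral and floating point types, here is a rough sketch. `ValueSupplier<T>` is a hypothetical simplification of `ScriptDocValues.Supplier<T>`, whose real methods may differ, and both field classes are invented for illustration.

```java
// Hypothetical simplification of ScriptDocValues.Supplier<T>; the real interface may differ.
interface ValueSupplier<T> {
    int size();

    T getInternal(int index);
}

// Integral types (byte, short, integer, long) widen to Long.
final class ShortFieldSketch implements ValueSupplier<Long> {
    private final short[] values;

    ShortFieldSketch(short... values) {
        this.values = values.clone();
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public Long getInternal(int index) {
        return (long) values[index];
    }
}

// Floating point types (float, double) widen to Double.
final class FloatFieldSketch implements ValueSupplier<Double> {
    private final float[] values;

    FloatFieldSketch(float... values) {
        this.values = values.clone();
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public Double getInternal(int index) {
        return (double) values[index];
    }
}
```

Widening to `Long` and `Double` would let a single consumer handle every integral or floating point field, while each field class can still expose getters in its native type.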
stu-elastic added a commit that referenced this issue Dec 2, 2021
Add plumbing for ordinal field data for the field API.

The scripting fields API needs to know the mapped type of each field in the document. This is ensured by passing a `ToScriptField` method reference from the `MappedField`, through the `IndexFieldData`, to the `LeafFieldData`.

Knowing the mapped type allows the API to provide relevant helper methods as well as appropriately use the fields available in the document.

Refs: #79105
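
The plumbing described above can be sketched roughly as follows. Every type here is a hypothetical, heavily simplified stand-in (note the `Sketch` suffix); the real Elasticsearch interfaces carry doc values rather than a `List<String>` and have different signatures.

```java
import java.util.List;

// All names below are hypothetical simplifications, not Elasticsearch signatures.
interface ScriptFieldSketch {
    String describe();
}

// Factory that turns raw values plus a field name into a typed script field.
interface ToScriptFieldSketch {
    ScriptFieldSketch getScriptField(List<String> rawValues, String name);
}

final class KeywordScriptFieldSketch implements ScriptFieldSketch {
    private final List<String> values;
    private final String name;

    KeywordScriptFieldSketch(List<String> values, String name) {
        this.values = values;
        this.name = name;
    }

    @Override
    public String describe() {
        return "keyword:" + name + values;
    }
}

// Per-segment data: holds the factory it was handed and applies it on demand.
final class LeafFieldDataSketch {
    private final List<String> rawValues;
    private final ToScriptFieldSketch toScriptField;

    LeafFieldDataSketch(List<String> rawValues, ToScriptFieldSketch toScriptField) {
        this.rawValues = rawValues;
        this.toScriptField = toScriptField;
    }

    ScriptFieldSketch getScriptField(String name) {
        return toScriptField.getScriptField(rawValues, name);
    }
}

// Index-level data: forwards the factory to each leaf it loads.
final class IndexFieldDataSketch {
    private final ToScriptFieldSketch toScriptField;

    IndexFieldDataSketch(ToScriptFieldSketch toScriptField) {
        this.toScriptField = toScriptField;
    }

    LeafFieldDataSketch load(List<String> rawValues) {
        return new LeafFieldDataSketch(rawValues, toScriptField);
    }
}

// The mapped field knows its own type, so it picks the constructor reference.
final class MappedFieldSketch {
    IndexFieldDataSketch fieldData() {
        return new IndexFieldDataSketch(KeywordScriptFieldSketch::new);
    }
}
```

Only the mapped field decides which script field class to build; the field data layers just carry the reference along.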
stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Dec 2, 2021
Adds support for keyword fields.

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Dec 2, 2021
* Script: keyword fields api support

Adds basic support for keyword fields in the scripting fields api.

Refs: #79105
@jdconrad
Contributor Author

jdconrad commented Dec 6, 2021

Pointer to (#81357) for unmapped source field access.

stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Dec 6, 2021
Adds support in the scripting fields API for the `ip` mapped type,
including the runtime script type.

Adds a new value object, `IPAddress`, to avoid exposing Java's
`InetAddress`. `InetAddress` may cause name resolution if whitelisted
improperly.

`field('ip')`, implemented by `IpDocValuesField`, exposes:
  `IPAddress get(IPAddress)`
  `IPAddress get(int, IPAddress)`
  `Iterator<IPAddress> iterator()`
  `List asStrings()`
  `String asString(String)`
  `String asString(int, String)`

`IPAddress` exposes:
  `boolean isV4()`
  `boolean isV6()`
  `String toString()`

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Dec 7, 2021
* Script: fields API for IP mapped type

Adds support in the scripting fields API for the `ip` mapped type,
including the runtime script type.

Adds a new value object, `IPAddress`, to avoid exposing Java's
`InetAddress`. `InetAddress` may cause name resolution if whitelisted
improperly.

`field('ip')`, implemented by `IpDocValuesField`, exposes:
  `IPAddress get(IPAddress)`
  `IPAddress get(int, IPAddress)`
  `Iterator<IPAddress> iterator()`
  `List asStrings()`
  `String asString(String)`
  `String asString(int, String)`

`IPAddress` exposes:
  `boolean isV4()`
  `boolean isV6()`
  `String toString()`

Refs: #79105
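
As an illustration of why a dedicated value object helps, here is a minimal sketch that exposes only `isV4()`, `isV6()` and `toString()` over raw address bytes and never touches `InetAddress`. The class name `IpAddressSketch`, the byte-array constructor and the uncompressed IPv6 formatting are all assumptions made for this example, not the actual `IPAddress` implementation.

```java
// Hypothetical value object over raw address bytes; not the Elasticsearch IPAddress class.
final class IpAddressSketch {
    private final byte[] bytes; // 4 bytes for IPv4, 16 bytes for IPv6

    IpAddressSketch(byte[] bytes) {
        if (bytes.length != 4 && bytes.length != 16) {
            throw new IllegalArgumentException("expected 4 or 16 bytes");
        }
        this.bytes = bytes.clone();
    }

    boolean isV4() {
        return bytes.length == 4;
    }

    boolean isV6() {
        return bytes.length == 16;
    }

    // Formats the bytes directly, so no name service is ever consulted.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        if (isV4()) {
            for (int i = 0; i < 4; i++) {
                if (i > 0) {
                    sb.append('.');
                }
                sb.append(bytes[i] & 0xff);
            }
        } else {
            // Uncompressed form: eight hex groups, no "::" shortening.
            for (int i = 0; i < 16; i += 2) {
                if (i > 0) {
                    sb.append(':');
                }
                sb.append(Integer.toHexString(((bytes[i] & 0xff) << 8) | (bytes[i + 1] & 0xff)));
            }
        }
        return sb.toString();
    }
}
```

With no method that takes or returns a host name, a script holding this object has no path to trigger name resolution.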
stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Dec 7, 2021
@jdconrad
Contributor Author

jdconrad commented Jan 4, 2022

Added a geo issue to discuss what we expose as part of the scripting fields API. (#82229)

stu-elastic added a commit that referenced this issue Jan 5, 2022
…81476)

Adds scripting fields API support for the rest of the long fields:
* [`_version`](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) - `VersionDocValuesField`
* [`_seq_no`](https://www.elastic.co/guide/en/elasticsearch/reference/master/optimistic-concurrency-control.html) - `SeqNoDocValuesField`
* [`murmur3`](https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-murmur3-usage.html) - `Murmur3DocValueField`
  * Added Painless support to the murmur3 mapper plugin.

All `SortedNumericDocValues` that are interpreted as longs are now subclasses of `AbstractLongDocValuesField`, including murmur, doc version and seq no above as well as `LongDocValuesField` and `UnsignedLongDocValuesField`

Also adds:
* [x-pack's version](https://www.elastic.co/guide/en/elasticsearch/reference/master/version.html) - `VersionStringDocValuesField`
  * Created new `Version` value type as a location for future helpers for comparing versions. 
    * Implements `toString` for the expected representation of the version
  * Implements `asString(String)`, `asString(int, String)` and `asStrings()` converters on the field.


Refs: #79105
stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Jan 5, 2022
Adds the fields API for the constant_keyword field mapper

Moves implementation to `AbstractKeywordDocValuesField`, allowing
code sharing with `KeywordDocValuesField`.

API:
```
field('const').get('default')
field('const').get(0, 'default')
```

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Jan 6, 2022
* Script: fields API for x-pack constant_keyword

Adds the fields API for the constant_keyword field mapper

Moves implementation to `AbstractKeywordDocValuesField`, allowing
code sharing with `KeywordDocValuesField`.

API:
```
field('const').get('default')
field('const').get(0, 'default')
```

Refs: #79105
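
A rough sketch of how a shared base class could let a constant keyword field reuse the keyword getters. All class names carry a `Sketch` suffix because they are hypothetical; the real `AbstractKeywordDocValuesField` reads doc values rather than a `List<String>`.

```java
import java.util.List;

// Hypothetical stand-in for a shared keyword base class.
abstract class AbstractKeywordFieldSketch {
    // Subclasses decide where the values come from.
    protected abstract List<String> values();

    // First value, or the default when there is none.
    String get(String defaultValue) {
        return get(0, defaultValue);
    }

    // Value at index, or the default when the index is out of range.
    String get(int index, String defaultValue) {
        List<String> v = values();
        return (index < 0 || index >= v.size()) ? defaultValue : v.get(index);
    }
}

// Regular keyword field: values vary per document.
final class KeywordFieldSketch extends AbstractKeywordFieldSketch {
    private final List<String> docValues;

    KeywordFieldSketch(List<String> docValues) {
        this.docValues = docValues;
    }

    @Override
    protected List<String> values() {
        return docValues;
    }
}

// Constant keyword field: at most one value, shared by every document.
final class ConstantKeywordFieldSketch extends AbstractKeywordFieldSketch {
    private final String constant; // null is assumed to mean "no value configured"

    ConstantKeywordFieldSketch(String constant) {
        this.constant = constant;
    }

    @Override
    protected List<String> values() {
        return constant == null ? List.of() : List.of(constant);
    }
}
```

Both subclasses answer `get('default')` and `get(0, 'default')` through the same inherited code.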
astefan pushed a commit to astefan/elasticsearch that referenced this issue Jan 7, 2022
…lastic#81476)

Adds scripting fields API support for the rest of the long fields:
* [`_version`](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) - `VersionDocValuesField`
* [`_seq_no`](https://www.elastic.co/guide/en/elasticsearch/reference/master/optimistic-concurrency-control.html) - `SeqNoDocValuesField`
* [`murmur3`](https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-murmur3-usage.html) - `Murmur3DocValueField`
  * Added Painless support to the murmur3 mapper plugin.

All `SortedNumericDocValues` that are interpreted as longs are now subclasses of `AbstractLongDocValuesField`, including murmur, doc version and seq no above as well as `LongDocValuesField` and `UnsignedLongDocValuesField`

Also adds:
* [x-pack's version](https://www.elastic.co/guide/en/elasticsearch/reference/master/version.html) - `VersionStringDocValuesField`
  * Created new `Version` value type as a location for future helpers for comparing versions. 
    * Implements `toString` for the expected representation of the version
  * Implements `asString(String)`, `asString(int, String)` and `asStrings()` converters on the field.


Refs: elastic#79105
astefan pushed a commit to astefan/elasticsearch that referenced this issue Jan 7, 2022
* Script: fields API for x-pack constant_keyword

Adds the fields API for the constant_keyword field mapper

Moves implementation to `AbstractKeywordDocValuesField`, allowing
code sharing with `KeywordDocValuesField`.

API:
```
field('const').get('default')
field('const').get(0, 'default')
```

Refs: elastic#79105
stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Jan 13, 2022
The flattened field type exposes all leaf values
as keyword doc values.  Additionally, specific keys
are available via object dot notation.

For example:

```
{
  "flat": {
    "abc": "bar",
    "def": "foo",
    "hij": {
      "lmn": "pqr",
      "stu": 123
    }
  }
}

field('flat').get('default') // returns 123
field('flat.abc').get('default') // returns bar
```

API:
* `iterator()`
* `get(String default)`
* `get(String default, int index)`

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Jan 18, 2022
* Script: fields API for flattened mapped type

The flattened field type exposes all leaf values
as keyword doc values.  Additionally, specific keys
are available via object dot notation.

For example:

```
{
  "flat": {
    "abc": "bar",
    "def": "foo",
    "hij": {
      "lmn": "pqr",
      "stu": 123
    }
  }
}

field('flat').get('default') // returns 123
field('flat.abc').get('default') // returns bar
```

API:
* `iterator()`
* `get(String default)`
* `get(String default, int index)`

Refs: #79105
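
The example above can be reproduced with a small sketch, assuming the root field's values are the stringified leaves held in sorted order (which would explain why `field('flat').get('default')` yields `123`). `FlattenedSketch` and its methods are hypothetical helpers, and a `TreeSet` of strings only approximates how keyword doc values are ordered.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper that mimics the flattened example; not Elasticsearch code.
final class FlattenedSketch {
    private FlattenedSketch() {}

    // Collect every leaf under the object as a string, sorted like keyword terms.
    static SortedSet<String> leafValues(Map<String, Object> object) {
        SortedSet<String> out = new TreeSet<>();
        collect(object, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void collect(Map<String, Object> object, SortedSet<String> out) {
        for (Object value : object.values()) {
            if (value instanceof Map) {
                collect((Map<String, Object>) value, out);
            } else {
                out.add(String.valueOf(value));
            }
        }
    }

    // Resolve a dotted key such as "hij.lmn" to its leaf, or return the default.
    static String get(Map<String, Object> object, String path, String defaultValue) {
        Object current = object;
        for (String part : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return defaultValue;
            }
            current = ((Map<?, ?>) current).get(part);
        }
        return (current == null || current instanceof Map) ? defaultValue : String.valueOf(current);
    }
}
```

Calling `leafValues` on the `flat` object from the example gives `123`, `bar`, `foo`, `pqr` in that order, so the first value is `123`, while `get(flat, "abc", "default")` resolves the single key to `bar`.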
@stu-elastic
Contributor

Tracking issue for getting access to flattened fields keys: #43805

stu-elastic added a commit to stu-elastic/elasticsearch that referenced this issue Feb 3, 2022
Adds the fields API for `dense_vector` field mapper.

Adds a `DenseVector` interface for the value type.

Implemented by:
 * `KnnDenseVector` which wraps a decoded float array from `VectorValues`
 * `BinaryDenseVector` which lazily decodes a `BytesRef` from `BinaryDocValues`

The `DenseVector` API is similar to `BinaryDenseVectorScriptDocValues`.

```
float getMagnitude();
double dotProduct(float[]);
double l1Norm(float[]);
double l2Norm(float[]);
float[] getVector();
int dims();

boolean isEmpty(); // does the value exist
int size();        // 0 if isEmpty(), 1 otherwise
PrimitiveIterator.OfDouble iterator()
```

The `DenseVectorDocValuesField` abstract class contains two getter APIs.
It is implemented by `KnnDenseVectorDocValuesField` and
`BinaryDenseVectorDocValuesField`.

```
DenseVector get()
DenseVector get(DenseVector)
```

The `get()` method is included because there isn't a good default dense vector,
so that API returns an empty `DenseVector` which throws an
`IllegalArgumentException` for all method calls other than `isEmpty()`,
`size()` and `iterator()`.

The empty dense vector will always be `DenseVector.EMPTY` in case users want
to use equality checks.

Refs: elastic#79105
stu-elastic added a commit that referenced this issue Feb 16, 2022
Adds the fields API for `dense_vector` field mapper.

Adds a `DenseVector` interface for the value type.

Implemented by:
 * `KnnDenseVector` which wraps a decoded float array from `VectorValues`
 * `BinaryDenseVector` which lazily decodes a `BytesRef` from `BinaryDocValues`

The vector operations have moved into those implementations from `BinaryDenseVectorScriptDocValues.java` and `KnnDenseVectorScriptDocValues.java`, respectively.

The `DenseVector` API is:
```
float getMagnitude();
double dotProduct(float[] | List);
double l1Norm(float[] | List);
double l2Norm(float[] | List);
float[] getVector();
int dims();

boolean isEmpty(); // does the value exist
int size();        // 0 if isEmpty(), 1 otherwise
Iterator<Float> iterator()
```

`dotProduct`, `l1Norm` and `l2Norm` take a `float[]` or a `List` via a
delegating `default` method on the `DenseVector` interface.

The `DenseVectorDocValuesField` abstract class contains two getter APIs.
It is implemented by `KnnDenseVectorDocValuesField` and
`BinaryDenseVectorDocValuesField`.

```
DenseVector get()
DenseVector get(DenseVector defaultValue)
```

The `get()` method is included because there isn't a good default dense vector,
so that API returns an empty `DenseVector` which throws an
`IllegalArgumentException` for all method calls other than `isEmpty()`,
`size()` and `iterator()`.

The empty dense vector will always be `DenseVector.EMPTY` in case users want
to use equality checks.

Refs: #79105
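
A minimal sketch of the two ideas above: a `List` overload that delegates to the `float[]` version through a `default` method, and a shared empty instance that throws for anything other than existence checks. Only `dotProduct` is shown; the interface and class names are hypothetical, and the method bodies are assumptions rather than the Elasticsearch code.

```java
import java.util.List;

// Hypothetical sketch; method names follow the commit message, bodies are assumptions.
interface DenseVectorSketch {
    boolean isEmpty();

    double dotProduct(float[] query);

    // List overload converts once, then delegates to the float[] version.
    default double dotProduct(List<? extends Number> query) {
        float[] q = new float[query.size()];
        for (int i = 0; i < q.length; i++) {
            q[i] = query.get(i).floatValue();
        }
        return dotProduct(q);
    }

    // Single shared empty instance, so callers can compare with ==.
    DenseVectorSketch EMPTY = new DenseVectorSketch() {
        @Override
        public boolean isEmpty() {
            return true;
        }

        @Override
        public double dotProduct(float[] query) {
            throw new IllegalArgumentException("dense vector is empty");
        }
    };
}

final class ArrayDenseVectorSketch implements DenseVectorSketch {
    private final float[] vector;

    ArrayDenseVectorSketch(float... vector) {
        this.vector = vector.clone();
    }

    @Override
    public boolean isEmpty() {
        return false;
    }

    @Override
    public double dotProduct(float[] query) {
        if (query.length != vector.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0;
        for (int i = 0; i < vector.length; i++) {
            sum += vector[i] * query[i];
        }
        return sum;
    }
}
```

Because the `List` overload lives on the interface, each implementation only needs to write the `float[]` arithmetic once.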
probakowski pushed a commit to probakowski/elasticsearch that referenced this issue Feb 23, 2022