[Ingest pipelines] Test pipeline enhancements #74964

Conversation

@alisonelizabeth (Contributor) commented Aug 13, 2020

This PR improves the pipeline debugging workflow in the ingest node pipelines UI.

At this point, I consider this feature functionally complete. However, I think there is still room for UX improvements, which I will continue to work on with Michael this week. Feel free to leave any thoughts on the PR as well.

Other notes:

  • There are some features that will be added as a separate PR, including the ability to simulate a pipeline from the pipeline table and/or details panel, and the ability to select an indexed document when testing a pipeline (Debug an ingest node pipeline using an indexed document  #74796).
  • Currently, the tag field is added to the simulate API request as a way to map the results back to each processor in the pipeline. If a user already has a tag field defined, it will be temporarily overridden for this call. I don't foresee any issues with this, but wanted to call it out (see the sketch after this list).
  • [ES bug/limitation]: There is currently some unexpected behavior when testing a pipeline with a pipeline processor. The verbose response does not include the processor output in the response; it only returns the status. Also, if you test with a pipeline processor that results in an error, the simulate API times out with a 502 Bad Gateway error. Jake L. is out this week, but I will discuss this with him when he returns.
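
For illustration, here is a minimal sketch of the tag-mapping idea mentioned above. ProcessorInternal and serializeForSimulate are simplified stand-ins for the example, not the plugin's actual serializer:

```ts
// Illustrative only: before calling the verbose _simulate API, copy each
// processor's internal editor id onto the `tag` field so the per-processor
// results can be matched back to the processors rendered in the UI.
interface ProcessorInternal {
  id: string;                       // internal editor id, not normally sent to ES
  type: string;                     // e.g. "set", "rename"
  options: Record<string, unknown>; // the processor's configuration
  onFailure?: ProcessorInternal[];
}

type SerializedProcessor = Record<string, Record<string, unknown>>;

const serializeForSimulate = (processors: ProcessorInternal[]): SerializedProcessor[] =>
  processors.map(({ id, type, options, onFailure }) => ({
    [type]: {
      ...options,
      // Any user-provided `tag` is overridden for this request only.
      tag: id,
      ...(onFailure?.length ? { on_failure: serializeForSimulate(onFailure) } : {}),
    },
  }));
```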

How to test

  1. Create an ingest node pipeline. For example:
PUT _ingest/pipeline/test
{
   "description": "_description",
    "processors": [
      {
        "set": {
          "field": "field1",
          "value": "value1"
        }
      },
      {
        "rename": {
          "field": "dont_exist",
          "target_field": "field1",
          "ignore_failure": true
        }
      },
      {
        "rename": {
          "field": "foofield",
          "target_field": "new_field",
          "on_failure": [
            {
              "set": {
                "field": "field2",
                "value": "value2"
              }
            }
          ]
        }
      },
      {
        "drop": {
          "if": "false"
        }
      },
      {
        "drop": {
          "if": "true"
        }
      }
    ]
}
  2. Use the Ingest Node Pipelines UI and go to the "Edit" page. Click the "Add documents" link, add some sample documents, and run the pipeline. For example:
// The first document in this example should trigger the on_failure processor, while the second one should succeed.
[
  {
    "_index": "index",
    "_id": "id1",
    "_source": {
      "foo": "bar"
    }
  },
  {
    "_index": "index",
    "_id": "id2",
    "_source": {
      "foo": "baz",
      "foofield": "bar"
    }
  }
]
  3. Verify the results:
    • Verify the per-processor status
    • View the "Output" tab per processor
    • Update/move/delete processors and verify simulation automatically runs and updates the status
    • Change the documents dropdown and verify the results

Screenshots

(Three screenshots of the updated UI, captured 2020-08-17.)

Release note

This feature refines the debugging experience when creating or editing an ingest node pipeline in the existing Ingest Node Pipelines UI. Once one or more sample documents are provided, the pipeline is executed. The UI highlights the status of each processor and shows the user how their sample documents change shape at each step in the pipeline.

@alisonelizabeth alisonelizabeth added release_note:enhancement v8.0.0 Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more v7.10.0 Feature:Ingest Node Pipelines Ingest node pipelines management labels Aug 13, 2020
@alisonelizabeth alisonelizabeth force-pushed the feature/test_pipeline_enhancements branch from b328e84 to 1f2ca4a on August 13, 2020 20:35
const outProcessor = {
  [type]: {
    ...options,
  },
};

if (onFailure?.length) {
  outProcessor[type].on_failure = convertProcessors(onFailure);
} else if (onFailure) {
  outProcessor[type].on_failure = [];
Contributor Author:

@jloleysens can you verify I didn't introduce any regressions by deleting this line? It was causing issues when calling the simulate API, as the on_failure field cannot be an empty array.

Contributor:

Ah ok, I think this should be fine. The thinking here was that we return to the caller the same set of values that was passed in - so if they passed in an empty array, we would return one. That being said, I am happy to collapse that value because the ES API accepts undefined for that value.

Not accepting an empty array is kinda weird though!

@alisonelizabeth alisonelizabeth marked this pull request as ready for review August 18, 2020 01:14
@alisonelizabeth alisonelizabeth requested a review from a team as a code owner August 18, 2020 01:14
@elasticmachine commented:

Pinging @elastic/es-ui (Team:Elasticsearch UI)

@alisonelizabeth alisonelizabeth changed the title from "[Ingest pipelines] [WIP] Test pipeline enhancements" to "[Ingest pipelines] Test pipeline enhancements" Aug 18, 2020
@jloleysens (Contributor) left a comment:

Great work @alisonelizabeth 🍾! I think this is a really exciting change and am eager to see it get to users :)

Bugs

While following the steps in the PR description I got to step 2, pressed the "Run the pipeline" button, and nothing happened. This is what I saw in the console:

(Screenshot of the console error, 2020-08-18.)

Code

I've left some non-blocking comments in the code; the one I'd most like your thoughts on is moving the call to the test pipeline context into pipeline_processors_editor and only using it from within there.

const {
  state: { processors },
} = usePipelineProcessorsContext();
const { testPipelineData, setCurrentTestPipelineData } = useTestPipelineContext();
Contributor:

I see the TestPipelineContext is available to processors_header.tsx, but I think the following would provide slightly better decoupling between processors_header and pipeline_processors_editor:

  1. Update the pipeline_processors_editor to just export a <ManageDocumentsButton /> component (I made this name up, but it would wrap DocumentsDropdown and AddDocumentsButton), which contains the logic for showing/hiding the flyout
  2. Only call useTestPipelineContext from within new 👆🏻 button component
  3. Drive the state updates to test pipeline context from within pipeline_processors_editor (call to setCurrentTestPipelineData).
  4. Remove DocumentsDropdown, AddDocumentsButton, TestPipelineFlyout and TestPipelineFlyoutTab from pipeline processors exports.

Not major, but I believe this would be cleaner (rough sketch below). WDYT?
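
For illustration, a rough sketch of what such a wrapper could look like. The component shape and render props here are invented for the example, not the actual implementation (which, per the discussion below, ended up as TestPipelineActions):

```tsx
import React, { FunctionComponent, useState } from 'react';

interface Props {
  hasDocuments: boolean;
  renderDropdown: () => React.ReactNode;                        // e.g. the existing DocumentsDropdown
  renderAddButton: (openFlyout: () => void) => React.ReactNode; // e.g. the existing AddDocumentsButton
  renderFlyout: (closeFlyout: () => void) => React.ReactNode;   // e.g. the existing TestPipelineFlyout
}

// Hypothetical wrapper: it owns the flyout visibility state, so callers outside
// the pipeline processors editor never need to touch the test pipeline context.
export const ManageDocumentsButton: FunctionComponent<Props> = ({
  hasDocuments,
  renderDropdown,
  renderAddButton,
  renderFlyout,
}) => {
  const [isFlyoutOpen, setIsFlyoutOpen] = useState(false);

  return (
    <>
      {hasDocuments ? renderDropdown() : renderAddButton(() => setIsFlyoutOpen(true))}
      {isFlyoutOpen && renderFlyout(() => setIsFlyoutOpen(false))}
    </>
  );
};
```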

Contributor Author:

Great point. I agree; I think this would be cleaner. I think I may have started in this direction, but can't remember now why I pivoted 🤔 . I'll look into refactoring this.

Contributor Author:

I ended up creating a component called TestPipelineActions that renders the add documents button/documents dropdown and the output button. The pipeline processor editor exports this as well as the DocumentsDropdown, since the dropdown is also reused in the output tab of each processor. Let me know what you think!

<EuiFlexItem grow={false} className="documentsDropdown__selectContainer">
  <EuiSelect
    compressed
    options={getDocumentOptions(documents!)}
Contributor:

You should be able to remove the non-null assertion (!) from documents here.

label: string;
}

const mapProcessorStatusToIcon: Record<string, ProcessorStatusIcon> = {
Contributor:

Could string be typed as ProcessorStatus here?

}

export const PipelineProcessorsItemStatus: FunctionComponent<Props> = ({ processorStatus }) => {
  const { icon, iconColor, label } = mapProcessorStatusToIcon[processorStatus];
Contributor:

nit: I think it would make this code a bit safer for future refactors if we defined a fallback value for when processorStatus does not map to an expected value. I'm also happy to leave it to throw an exception, but if the value comes straight from ES, being slightly more robust might be good.
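
A small sketch of that fallback idea, which also types the map key as ProcessorStatus per the earlier comment. The status values come from this thread, while the icon names and labels are placeholder assumptions:

```ts
type ProcessorStatus = 'success' | 'error' | 'error_ignored' | 'dropped' | 'skipped' | 'inactive';

interface ProcessorStatusIcon {
  icon: string;
  iconColor: string;
  label: string;
}

// Unknown statuses fall back to a neutral entry instead of throwing.
const unknownStatus: ProcessorStatusIcon = { icon: 'dot', iconColor: 'subdued', label: 'Unknown' };

// Typing the key as ProcessorStatus (rather than string) keeps the map exhaustive.
const mapProcessorStatusToIcon: Record<ProcessorStatus, ProcessorStatusIcon> = {
  success: { icon: 'check', iconColor: 'success', label: 'Success' },
  error: { icon: 'cross', iconColor: 'danger', label: 'Error' },
  error_ignored: { icon: 'alert', iconColor: 'warning', label: 'Error ignored' },
  dropped: { icon: 'indexClose', iconColor: 'subdued', label: 'Dropped' },
  skipped: { icon: 'dot', iconColor: 'subdued', label: 'Skipped' },
  inactive: { icon: 'dot', iconColor: 'subdued', label: 'Not run' },
};

export const getStatusIcon = (processorStatus: string): ProcessorStatusIcon =>
  mapProcessorStatusToIcon[processorStatus as ProcessorStatus] ?? unknownStatus;
```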

const { services } = useKibana();

const setCurrentTestPipelineData = useCallback((data: Action): void => {
  dispatch(data);
Contributor:

I don't think it's necessary to wrap this dispatch in a useCallback, as the dispatch function is not new on every render (at least according to this fella: https://kentcdodds.com/blog/how-to-use-react-context-effectively#what-about-dispatch-type-typos). So you can call dispatch directly.
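
For context, a minimal sketch of the pattern from that post: the dispatch returned by useReducer is referentially stable, so it can be exposed through the context value directly. Names and shapes below are simplified, not the plugin's actual provider:

```tsx
import React, { createContext, useContext, useReducer, Dispatch, FunctionComponent } from 'react';

interface TestPipelineData {
  documents?: object[];
}

type Action = { type: 'updateDocuments'; documents: object[] };

const reducer = (state: TestPipelineData, action: Action): TestPipelineData => {
  switch (action.type) {
    case 'updateDocuments':
      return { ...state, documents: action.documents };
    default:
      return state;
  }
};

const TestPipelineContext = createContext<
  { testPipelineData: TestPipelineData; setCurrentTestPipelineData: Dispatch<Action> } | undefined
>(undefined);

export const TestPipelineContextProvider: FunctionComponent<{ children?: React.ReactNode }> = ({
  children,
}) => {
  const [testPipelineData, dispatch] = useReducer(reducer, {});
  // No useCallback wrapper needed: `dispatch` keeps the same identity across renders.
  return (
    <TestPipelineContext.Provider value={{ testPipelineData, setCurrentTestPipelineData: dispatch }}>
      {children}
    </TestPipelineContext.Provider>
  );
};

export const useTestPipelineContext = () => {
  const ctx = useContext(TestPipelineContext);
  if (!ctx) {
    throw new Error('useTestPipelineContext must be used within TestPipelineContextProvider');
  }
  return ctx;
};
```

(Splitting state and dispatch into two separate contexts, as the linked post also suggests, would additionally avoid re-rendering consumers that only need dispatch, but that is orthogonal to the useCallback question.)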

const result = { ...currentResult };
const resultId = result.tag;

if (index !== 0) {
Contributor:

Why do we skip the first index here?

Contributor Author:

The first processor result would not have a previous result 😄. I can update the code comment below to make this clearer.

I also realized earlier today that this logic may need to be more complex. For example, if the previous processor was skipped there would also not be a previous result. Interested in your thoughts here - is this OK, or does it make more sense to display the last successful processor output?
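
For what it's worth, a sketch of the "last successful output" interpretation; the result shape is simplified from the verbose simulate response, and getPreviousOutput is a made-up helper name:

```ts
interface ProcessorResult {
  tag: string;
  status: 'success' | 'error' | 'error_ignored' | 'dropped' | 'skipped';
  doc?: Record<string, unknown>; // the document as it looked after this processor ran
}

// Walk backwards from the current processor and return the most recent result
// that actually produced a document (i.e. was not skipped or dropped).
const getPreviousOutput = (
  results: ProcessorResult[],
  index: number
): ProcessorResult | undefined => {
  for (let i = index - 1; i >= 0; i--) {
    if (results[i].doc !== undefined) {
      return results[i];
    }
  }
  return undefined; // first processor, or nothing before it produced output
};
```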

return convertedProcessors;
};

export const serialize = ({ processors, onFailure }: SerializeArgs): SerializeResult => {
export const serialize = (
Contributor:

nit: a default value and renaming to copyIdToTag could be extra clear :)

Then we document it like

/**
 * @param processor The deserialized processor to convert
 * @param copyIdToTag For simulation, we add the "tag" field equal to the internal processor id so that we can map the simulate results to each processor
 */
const serialize = (processor: ProcessorInternal, copyIdToTag: boolean = false)

I think doing this on the interface could be even better, but I understand that is more of a refactor:

interface SerializeArgs {
  /**
   * The deserialized processor to convert
   */
  processor: ProcessorInternal;

  /**
   * For simulation, we add the "tag" field equal to the internal processor id so that we can map the simulate results to each processor
   */
  copyIdToTag: boolean;
}

export const serialize = ({
  pipeline: { processors, onFailure },
  copyIdToTag = false,
}: SerializeArgs): SerializeResult => {

@alisonelizabeth (Contributor Author) commented:

@jloleysens thanks for the very helpful review! I will address the feedback today/tomorrow.

In the meantime, I'm not able to reproduce the error you are seeing. Are you running with the latest ES snapshot? Maybe we can zoom tomorrow if you're still having issues.

@jloleysens (Contributor) commented:

> Are you running with the latest ES snapshot? Maybe we can zoom tomorrow if you're still having issues.

Ah, that could be it! I thought my snapshot was recent enough, but it might not be. Will re-test 👍

@alisonelizabeth (Contributor Author) commented:

@jloleysens I believe I addressed all of your feedback. Let me know what you think, and reach out if you still run into issues testing. Thanks!

@jloleysens (Contributor) left a comment:

Thanks for addressing the PR feedback @alisonelizabeth ! This is looking really great.

I have a few UX-related questions that I think we can address in (a) follow-up PR(s):

As a user, it wasn't exactly clear to me how I could get out of live testing mode once I had added documents to test with. This is not a major issue, but I thought it could merit being intentional about in our design. The only instances I can think of where exiting would be useful are when a user wants to silence constant feedback or when they are working under high network latency conditions. I tried clearing the docs in the flyout, but the form prevented me from doing so. I'd imagine clearing the docs should also clear all of the current processor statuses.

(Screenshot of the documents flyout, 2020-08-19.)

I understand that "dropped" is a status we get back from ES simulate, but perhaps in this case it would be better if the drop processor had a green tick (i.e., it did its job).

(Screenshot of the dropped processor status, 2020-08-19.)

Bug

I don't mind if we address this in a follow-up PR, but if I submit a simulation with a badly configured processor, the processor statuses currently do not update, and the user receives no indication that their processor is incorrectly configured unless they try to create the pipeline or open the output flyout. With the following 👇🏻 configuration for "Set" and documents loaded up, the UI still rendered a green tick (from my initial good configuration).

(Screenshot of the misconfigured "Set" processor, 2020-08-19.)

Perhaps we should render the entire pipeline's processors in "not run" status when simulate returns a 400.

Component Integration Tests

I think it would be good to add them as the very next thing so that following PRs can benefit from them 👍

@alisonelizabeth (Contributor Author) commented:

Thanks @jloleysens for the second review!

> As a user, it wasn't exactly clear to me how I could get out of live testing mode once I had added documents to test with. This is not a major issue, but I thought it could merit being intentional about in our design. The only instances I can think of where exiting would be useful are when a user wants to silence constant feedback or when they are working under high network latency conditions. I tried clearing the docs in the flyout, but the form prevented me from doing so. I'd imagine clearing the docs should also clear all of the current processor statuses.

This is a great point and not something I had thought about yet. I'll work with Michael on this and address it in a separate PR.

> I understand that "dropped" is a status we get back from ES simulate, but perhaps in this case it would be better if the drop processor had a green tick (i.e., it did its job).

I think I originally only showed success/fail statuses per processor (i.e., error_ignored, skipped, and dropped == success). However, when I held the initial stakeholder review, the feedback I received was that it would be helpful to call out each status. The icons I currently have are probably not the best representation. I've mentioned this to Michael already, and he's going to look into it.

> I don't mind if we address this in a follow-up PR, but if I submit a simulation with a badly configured processor, the processor statuses currently do not update, and the user receives no indication that their processor is incorrectly configured unless they try to create the pipeline or open the output flyout. With the following 👇🏻 configuration for "Set" and documents loaded up, the UI still rendered a green tick (from my initial good configuration).

Good catch! For now, I reset the output when there is an error (back to "inactive" state), but I think there's probably more that could be done here to help the user and possibly surface the error.
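
A minimal sketch of that reset, assuming a plain reducer with illustrative state and action names rather than the plugin's actual ones:

```ts
type ProcessorStatus = 'success' | 'error' | 'error_ignored' | 'dropped' | 'skipped' | 'inactive';

interface TestPipelineState {
  isExecuting: boolean;
  // Keyed by internal processor id.
  processorStatuses: Record<string, ProcessorStatus>;
}

type Action =
  | { type: 'updateResults'; statuses: Record<string, ProcessorStatus> }
  | { type: 'simulateError' };

const reducer = (state: TestPipelineState, action: Action): TestPipelineState => {
  if (action.type === 'simulateError') {
    // Drop stale results so processors fall back to the neutral "inactive" look
    // instead of keeping a green tick from the last successful run.
    return { ...state, isExecuting: false, processorStatuses: {} };
  }
  return { ...state, isExecuting: false, processorStatuses: action.statuses };
};
```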

> I think it would be good to add them as the very next thing so that following PRs can benefit from them 👍

Agreed! I plan to work on tests next.

@kibanamachine commented:

💚 Build Succeeded

Build metrics

@kbn/optimizer bundle module count
  ingestPipelines: 405 (+15; baseline 390)

async chunks size
  ingestPipelines: 672.8KB (+29.3KB; baseline 643.4KB)

page load bundle size
  ingestPipelines: 32.7KB (-6.0B; baseline 32.7KB)

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream
