Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry for invalid_grant errors - Silent Iframe #7218

Merged
merged 18 commits into from
Aug 7, 2024

Conversation

jo-arroyo
Copy link
Collaborator

@jo-arroyo jo-arroyo commented Jul 25, 2024

First of 3 PRs addressing need for retry for backup auth service.

@github-actions github-actions bot added the msal-browser Related to msal-browser package label Jul 25, 2024
@jo-arroyo jo-arroyo changed the title Retry for invalid_gratn errors - Silent Iframe Retry for invalid_grant errors - Silent Iframe Jul 25, 2024
@jo-arroyo jo-arroyo marked this pull request as ready for review July 29, 2024 22:05
@tnorling
Copy link
Collaborator

I wonder if we should log a flag to telemetry to explicitly indicate a retry was performed? Or is it preferable to determine this by building queries with the existing data? I expect this will have big perf implications if this is done at scale and it might be nice to have an easy way to track this. Just a suggestion but I'll leave that up to you and @konstantin-msft

@peterzenz
Copy link
Contributor

I wonder if we should log a flag to telemetry to explicitly indicate a retry was performed? Or is it preferable to determine this by building queries with the existing data? I expect this will have big perf implications if this is done at scale and it might be nice to have an easy way to track this. Just a suggestion but I'll leave that up to you and @konstantin-msft

All of these retries need an explicit indicator in telemetry that they are happening. It would likely be helpful to log the error returned for both the original call (invalid grant today, possibly others in the future?) as well as the retry result.

@konstantin-msft
Copy link
Collaborator

I wonder if we should log a flag to telemetry to explicitly indicate a retry was performed? Or is it preferable to determine this by building queries with the existing data? I expect this will have big perf implications if this is done at scale and it might be nice to have an easy way to track this. Just a suggestion but I'll leave that up to you and @konstantin-msft

All of these retries need an explicit indicator in telemetry that they are happening. It would likely be helpful to log the error returned for both the original call (invalid grant today, possibly others in the future?) as well as the retry result.

It makes sense to add a new telemetry data point that indicates a retry was performed. Error codes, in turn, are captured in the "context" field for each instrumented function. However, there are two scenarios here:

  • Both the original call and retry fail -> the context is always set. The retry error is placed in the errorCode field, while the first error can be found in the "context" field only.
  • Original call fails and retry succeeds -> the context is redacted for 90% of the calls. Hence, we won't be able to capture most of the original errors so counting their numbers by type won't be possible. So, it makes sense to add yet another telemetry data point that captures the original error code.

@github-actions github-actions bot added the msal-common Related to msal-common package label Jul 30, 2024
@jo-arroyo
Copy link
Collaborator Author

I wonder if we should log a flag to telemetry to explicitly indicate a retry was performed? Or is it preferable to determine this by building queries with the existing data? I expect this will have big perf implications if this is done at scale and it might be nice to have an easy way to track this. Just a suggestion but I'll leave that up to you and @konstantin-msft

All of these retries need an explicit indicator in telemetry that they are happening. It would likely be helpful to log the error returned for both the original call (invalid grant today, possibly others in the future?) as well as the retry result.

It makes sense to add a new telemetry data point that indicates a retry was performed. Error codes, in turn, are captured in the "context" field for each instrumented function. However, there are two scenarios here:

  • Both the original call and retry fail -> the context is always set. The retry error is placed in the errorCode field, while the first error can be found in the "context" field only.
  • Original call fails and retry succeeds -> the context is redacted for 90% of the calls. Hence, we won't be able to capture most of the original errors so counting their numbers by type won't be possible. So, it makes sense to add yet another telemetry data point that captures the original error code.

Thanks for the input, all! I added a new data point. Let me know if you prefer a different name.

Co-authored-by: Thomas Norling <thomas.norling@microsoft.com>
e instanceof ServerError &&
e.errorCode === BrowserConstants.INVALID_GRANT_ERROR
) {
if (!authClient) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional. You can simplify nested if conditions by joining them together into if (!authClient || !(e instanceof ServerError) || e.errorCode !== BrowserConstants.INVALID_GRANT_ERROR) { then throw e } so you won't need else.

@github-actions github-actions bot added the documentation Related to documentation. label Aug 6, 2024
@jo-arroyo jo-arroyo merged commit e25f917 into dev Aug 7, 2024
8 of 9 checks passed
@jo-arroyo jo-arroyo deleted the invalid-grant-retry-iframe branch August 7, 2024 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Related to documentation. msal-browser Related to msal-browser package msal-common Related to msal-common package
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants