Investigate Passport IAM Errors and AWS Alarm #2183

Open
2 tasks
larisa17 opened this issue Feb 16, 2024 · 0 comments · May be fixed by #2450


larisa17 commented Feb 16, 2024

User Story:

As a Passport user,
I want to receive clear and meaningful error messages when encountering issues, especially when claiming stamps,
so that I understand the reasons behind any issues and know what steps I may need to take next.

As a developer,
I want to ensure that any unhandled errors in IAM trigger a reliable alarm,
so that we can promptly address potential security or operational issues.

Acceptance Criteria

User-Facing Errors:
GIVEN I am a Passport user,
WHEN I encounter an error while claiming a stamp,
THEN I should receive a meaningful message in the UI explaining why the error occurred,
AND this explanation should clarify why I did not qualify for the stamp,
AND the information should be sufficient for Passport support to guide me on necessary steps to qualify.
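
One way to meet these criteria is for the claim path to return a structured failure instead of a bare string. The sketch below is only an illustration, not the Passport implementation: the `ClaimFailure` type, its reason codes, and `toUserMessage` are hypothetical names introduced here.

```typescript
// Hypothetical reason codes; a real set would come from each stamp provider.
type ClaimFailureReason = "NOT_QUALIFIED" | "PROVIDER_UNAVAILABLE" | "UNKNOWN";

interface ClaimFailure {
  provider: string; // the stamp the user tried to claim
  reason: ClaimFailureReason;
  detail?: string; // provider-specific explanation that is safe to show to users
}

// Build the message shown in the UI. Every branch names the stamp and says
// why the claim failed, so support can tell the user what to do next.
function toUserMessage(failure: ClaimFailure): string {
  switch (failure.reason) {
    case "NOT_QUALIFIED":
      return `You did not qualify for the ${failure.provider} stamp: ${failure.detail ?? "requirements not met"}.`;
    case "PROVIDER_UNAVAILABLE":
      return `We could not reach ${failure.provider} to verify your stamp. Please try again later.`;
    default:
      return `Something went wrong while claiming the ${failure.provider} stamp. Please contact Passport support.`;
  }
}
```

Keeping the reason code separate from the display text would also let support look up the code without parsing the message.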

Developer-Facing Monitoring:
GIVEN I am a developer,
WHEN there are unhandled errors in the IAM,
THEN an alarm should be issued on PagerDuty (PD) to alert the development team.
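
For the alarm to fire only on unhandled errors, the IAM has to tell expected failures apart from unexpected ones. A minimal sketch of that split follows, assuming a marker class for anticipated errors. `HandledClaimError`, `handleError`, and the in-memory counter are hypothetical; a real service would emit a metric to its monitoring system instead of incrementing a variable.

```typescript
// Hypothetical marker class for errors the IAM anticipates and handles.
class HandledClaimError extends Error {
  readonly statusCode: number;
  constructor(message: string, statusCode = 400) {
    super(message);
    this.name = "HandledClaimError";
    this.statusCode = statusCode;
    // Keeps `instanceof` working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, HandledClaimError.prototype);
  }
}

interface ErrorOutcome {
  statusCode: number;
  body: { error: string };
  unhandled: boolean; // true means this error should count toward the alarm
}

// Stand-in for a metric that the alarm would watch.
let unhandledErrorCount = 0;

function handleError(err: unknown): ErrorOutcome {
  if (err instanceof HandledClaimError) {
    // Expected failure: return its message to the caller, do not alert.
    return { statusCode: err.statusCode, body: { error: err.message }, unhandled: false };
  }
  // Anything else is unexpected: count it and hide internals from the client.
  unhandledErrorCount += 1;
  return { statusCode: 500, body: { error: "Unexpected error" }, unhandled: true };
}
```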

Documentation and Monitoring Overview:
Keep a current, complete record of all monitoring configurations and their statuses. With each task or update, revise the Notion page on Passport Monitors & PD Alarms so that it reflects the latest state and gives an overview of the monitoring topic. This keeps monitoring practices transparent and continuous.

Product & Design Links:

Link to the relevant Discord thread for more details

Tech Details:

  • Alarm Management: Review the current AWS alarm settings to determine why the alarm was left active without resolution. Develop guidelines or automation to reset or update alarm states after a certain period or upon resolution of the issue.
  • Error Messaging: Implement enhanced error handling within the Passport application to capture and display errors more effectively to users. This should include contextual information specific to the IAM and stamp claiming processes.
  • Monitoring and Alerts:
    • Review and possibly redefine the parameters that trigger alarms related to IAM errors.
    • Ensure that alarms are actionable and that their triggers are clearly defined to avoid unnecessary alerts.

Open Questions:

  • What are the specific conditions under which the existing AWS alarm triggers, and are these conditions still relevant?
  • How can we automate the reset or deactivation of alarms to prevent them from remaining in an alert state indefinitely?
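
As a starting point for the second question, the sketch below models an alarm that recomputes its state from the most recent periods only, so it returns to OK on its own once errors stop. It is a simplified model and not CloudWatch's evaluation logic. The field names echo concepts CloudWatch exposes (evaluation periods, datapoints to alarm, treating missing data as not breaching), but `AlarmRule` and `evaluateAlarm` are hypothetical.

```typescript
type AlarmState = "OK" | "ALARM";

interface AlarmRule {
  threshold: number; // a period breaches when its count is greater than this
  evaluationPeriods: number; // how many of the most recent periods are inspected
  datapointsToAlarm: number; // how many of those periods must breach
}

// Each entry is the unhandled-error count for one period, oldest first.
// `undefined` means no data was reported for that period.
function evaluateAlarm(rule: AlarmRule, periods: Array<number | undefined>): AlarmState {
  const recent = periods.slice(-rule.evaluationPeriods);
  // Missing data is treated as not breaching, so a quiet service drifts back to OK.
  const breaching = recent.filter((n) => n !== undefined && n > rule.threshold).length;
  return breaching >= rule.datapointsToAlarm ? "ALARM" : "OK";
}
```

Because the state depends only on the recent window, no manual reset is needed in this model. Whether the existing AWS alarm behaves this way depends on how it treats missing data, which is part of what this issue asks to investigate.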

Notes/Assumptions:

  • Assume that the current infrastructure and monitoring tools can be configured to meet the new requirements.
  • The effectiveness of the new error messages and alarm configurations will be evaluated periodically to ensure they meet user and developer needs.

erichfi changed the title from "Investigate Passport IAM errors & AWS alarm" to "Investigate Passport IAM Errors and AWS Alarm" on Apr 19, 2024
tim-schultz self-assigned this on May 10, 2024
tim-schultz linked a pull request on May 10, 2024 that will close this issue
tim-schultz removed their assignment on May 15, 2024