[Bug]: creating S3 backed aws_appconfig_configuration_profile hits IAM trying to assume role error #38206

Open
reikje opened this issue Jul 1, 2024 · 7 comments
Labels
bug Addresses a defect in current functionality. needs-triage Waiting for first response or review from a maintainer. service/appconfig Issues and PRs that pertain to the appconfig service. waiting-response Maintainers are waiting on response from community or contributor.

Comments

@reikje

reikje commented Jul 1, 2024

Terraform Core Version

1.6.1

AWS Provider Version

5.52.0

Affected Resource(s)

  • aws_appconfig_configuration_profile

Expected Behavior

When creating the aws_appconfig_configuration_profile, the IAM role specified via the retrieval_role_arn can be assumed properly for creating the resource.

Actual Behavior

A permission error happens occasionally.

As a workaround, I added a time_sleep dependency of 10 seconds to the aws_appconfig_configuration_profile, which seems to improve things, but the apply still fails for the same reason from time to time.
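
A minimal sketch of what such a time_sleep workaround might look like, assuming the hashicorp/time provider is configured. The resource name wait_for_iam and the choice of aws_iam_role_policy_attachment.config as the resource to wait on are illustrative assumptions, not taken from the reporter's actual configuration:

```hcl
# Hypothetical delay: starts counting once the policy attachment exists.
resource "time_sleep" "wait_for_iam" {
  depends_on      = [aws_iam_role_policy_attachment.config]
  create_duration = "10s"
}

resource "aws_appconfig_configuration_profile" "default" {
  application_id     = aws_appconfig_application.default.id
  name               = "s3"
  location_uri       = "s3://${aws_s3_bucket.config.bucket}/peers.json"
  retrieval_role_arn = aws_iam_role.config.arn
  tags               = local.tags

  # Create the profile only after the delay has elapsed.
  depends_on = [time_sleep.wait_for_iam]
}
```

A fixed delay like this only narrows the window in which the role may not be usable yet; it does not guarantee that IAM changes have propagated.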

Relevant Error/Panic Output Snippet

╷
│ Error: creating AppConfig Configuration Profile (s3) for Application (3atugpj): operation error AppConfig: CreateConfigurationProfile, https response error StatusCode: 400, RequestID: f77a17f6-b682-49f1-b93e-a59d26d3aca8, BadRequestException: Error trying to assume role arn:aws:iam::<account>:role/<resource>
│ 
│   with module.infrastructure.aws_appconfig_configuration_profile.default,
│   on ../common/integration/app_config.tf line 13, in resource "aws_appconfig_configuration_profile" "default":
│   13: resource "aws_appconfig_configuration_profile" "default" {
│ 
╵
Error: Terraform exited with code 1.
Error: Process completed with exit code 1.

Terraform Configuration Files

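locals {
  resource_prefix = "namespace-project"
  # Assumed: the original snippet references local.peers_json without defining it;
  # this value matches the key of aws_s3_object.peers_config below.
  peers_json      = "peers.json"
  tags = {
    CreatedBy = "Terraform"
  }
}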

resource "aws_s3_bucket" "config" {
  bucket          = "${local.resource_prefix}-config"
  force_destroy   = true
  tags            = local.tags
}

resource "aws_s3_bucket_versioning" "config" {
  bucket = aws_s3_bucket.config.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_iam_role" "config" {
  name = "${local.resource_prefix}-config"
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Action = "sts:AssumeRole",
        Effect = "Allow",
        Principal = {
          Service = "appconfig.amazonaws.com"
        }
      }
    ]
  })
  tags = local.tags
}

resource "aws_iam_policy" "config" {
  name   = "${local.resource_prefix}-config"

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Action = [
          "s3:*"
        ],
        Resource = [
          "${aws_s3_bucket.config.arn}",
          "${aws_s3_bucket.config.arn}/*"
        ]
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "config" {
  policy_arn = aws_iam_policy.config.arn
  role       = aws_iam_role.config.name
}

resource "aws_appconfig_application" "default" {
  name        = local.resource_prefix
  depends_on  = [aws_iam_role.config]
  tags        = local.tags
}

resource "aws_s3_object" "peers_config" {
  bucket      = aws_s3_bucket_versioning.config.id
  key         = "peers.json"
  content     = jsonencode([{id = "foo", name = "Bar"}])
  depends_on  = [aws_s3_bucket_versioning.config]
}

resource "aws_appconfig_configuration_profile" "default" {
  application_id      = aws_appconfig_application.default.id
  name                = "s3"
  location_uri        = "s3://${local.resource_prefix}-config/${local.peers_json}"
  retrieval_role_arn  = aws_iam_role.config.arn
  tags                = local.tags
}

resource "aws_appconfig_environment" "default" {
  name            = "dev"
  description     = "AppConfig Environment (dev)"
  application_id  = aws_appconfig_application.default.id
  tags            = local.tags
}

resource "aws_appconfig_deployment" "default" {
  application_id            = aws_appconfig_application.default.id
  environment_id            = aws_appconfig_environment.default.environment_id
  configuration_profile_id  = aws_appconfig_configuration_profile.default.configuration_profile_id
  configuration_version     = aws_s3_object.peers_config.version_id
  deployment_strategy_id    = "AppConfig.AllAtOnce"

  tags = local.tags
}

Steps to Reproduce

Use the given snippet to create an AppConfig having a configuration profile that is backed by an object inside an S3 bucket.

Debug Output

No response

Panic Output

No response

Important Factoids

This error doesn't happen all the time, so there must be some sort of race condition that occasionally makes the apply fail. I suspect the IAM role is not yet fully propagated when AppConfig tries to use it to create the configuration profile.

References

No response

Would you like to implement a fix?

None

@reikje reikje added the bug Addresses a defect in current functionality. label Jul 1, 2024

github-actions bot commented Jul 1, 2024

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added service/appconfig Issues and PRs that pertain to the appconfig service. service/iam Issues and PRs that pertain to the iam service. service/s3 Issues and PRs that pertain to the s3 service. labels Jul 1, 2024
@terraform-aws-provider terraform-aws-provider bot added the needs-triage Waiting for first response or review from a maintainer. label Jul 1, 2024
@justinretzolk
Member

Hey @reikje 👋 Thank you for taking the time to raise this!

I often find that intermittent failures like this are caused by the order in which resources are created, so I traced the dependency graph a bit, starting at the failure point. The aws_appconfig_configuration_profile.default resource implicitly depends on aws_appconfig_application.default and aws_iam_role.config due to interpolations. It's not, however, dependent on aws_iam_policy.config or aws_iam_role_policy_attachment.config, so there's no guarantee that the policy will be attached to the role before aws_appconfig_configuration_profile.default is created. The error message isn't particularly helpful in this case, but could that be the cause here? If so, adding an explicit depends_on could help.

If that doesn't help, are you able to supply debug logs (redacted as needed), in case whoever picks this up needs that information?

@justinretzolk justinretzolk added waiting-response Maintainers are waiting on response from community or contributor. and removed service/iam Issues and PRs that pertain to the iam service. service/s3 Issues and PRs that pertain to the s3 service. labels Jul 9, 2024
@reikje
Author

reikje commented Jul 12, 2024

@justinretzolk that seems to do the trick, thanks a million. I'm curious how you traced the dependency graph; I'd like to learn how to find these missing dependencies. Could you also tell me how to get access to Terraform debug logs?

@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Jul 12, 2024
@reikje
Author

reikje commented Jul 12, 2024

I was a bit quick here.

Error: creating AppConfig Configuration Profile (s3) for Application (13i1pwq): 
operation error AppConfig: CreateConfigurationProfile, https response error StatusCode: 400, 
RequestID: d34d0aa0-927c-485a-bc7c-8a08deffb9b4, BadRequestException: An error occurred when calling 
the S3 GetBucketLocation operation: Access Denied (Service: Amazon S3; Status Code: 403; 
Error Code: AccessDenied; Request ID: 6WJX54XPD4VXG9MS; S3 Extended Request ID: C4poF3jKvfNS/gHeLLAll2A0xzPCCSS/v+ou70Wmbzj9zZLVE6uWookF/KVUsYonChG2UbJXNZ0=; Proxy: null)

   with module.infrastructure.aws_appconfig_configuration_profile.default,
   on ../common/integration/app_config.tf line 8, in resource "aws_appconfig_configuration_profile" "default":
    8: resource "aws_appconfig_configuration_profile" "default" {

after adding:

resource "aws_appconfig_configuration_profile" "default" {
  ...
  depends_on          = [aws_iam_policy.config, aws_iam_role_policy_attachment.config]
}

I'll investigate :)

@reikje
Author

reikje commented Jul 15, 2024

Debug logs: tf_log.txt

@justinretzolk
Member

Thanks for the updated information here @reikje! I took a look over the logging that you sent over and wanted to call out two lines that were of particular interest to me:

# aws_iam_role_policy_attachment.config creation completion
2024-07-15T08:48:58.164+0200 [DEBUG] provider.terraform-provider-aws_v5.58.0_x5: HTTP Response Received: rpc.service=IAM rpc.system=aws-api tf_aws.sdk=aws-sdk-go-v2 tf_provider_addr=registry.terraform.io/hashicorp/aws http.response.header.date="Mon, 15 Jul 2024 06:48:58 GMT" rpc.method=ListAttachedRolePolicies aws.region=us-east-1 http.response.header.content_type=text/xml http.response.header.x_amzn_requestid=bf5444b5-e034-49ed-bf8a-520044640200 http.response_content_length=581 tf_aws.signing_region="" @module=aws tf_resource_type=aws_iam_role_policy_attachment tf_rpc=ApplyResourceChange http.status_code=200 tf_mux_provider="*schema.GRPCProviderServer" tf_req_id=0960eace-301f-4dad-3cd2-4420971fdcaf @caller=github.com/hashicorp/aws-sdk-go-base/v2@v2.0.0-beta.54/logging/tf_logger.go:45 http.duration=153

# aws_appconfig_configuration_profile.default creation attempt
2024-07-15T08:49:00.524+0200 [DEBUG] provider.terraform-provider-aws_v5.58.0_x5: HTTP Request Sent: rpc.system=aws-api http.request.header.amz_sdk_invocation_id=64e01ef7-be26-4a5d-b839-d2ba4636a027

If you attempt a subsequent apply shortly after this one fails, does the apply succeed as you'd expect? Given that the IAM change completes only ~2 seconds before the role is first used, I suspect you may be running into eventual consistency issues.

If that does seem to be the case, I don't suspect a code change (to the provider) will be the right path, so the best course of action is likely going to be to refactor the configuration a bit. You could do this via two separate applies (generally speaking, my recommendation) or by trying to introduce a bit more delay between those particular resources (e.g. separating the IAM resources into a different module, so that the entire module must complete before the aws_appconfig_configuration_profile creation is attempted). In my opinion, the latter would probably be flakier.
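
The module-based variant could be sketched roughly as follows. The module path ./modules/config-iam, the bucket_arn input, and the role_arn output are hypothetical names, and the sketch assumes the module contains the aws_iam_role, aws_iam_policy, and aws_iam_role_policy_attachment resources from the original configuration. It only orders resource creation; it does not remove IAM eventual consistency:

```hcl
# modules/config-iam/outputs.tf (hypothetical module)
output "role_arn" {
  value = aws_iam_role.config.arn

  # Anything that reads this output waits until the policy is attached.
  depends_on = [aws_iam_role_policy_attachment.config]
}

# Root module
module "config_iam" {
  source     = "./modules/config-iam"
  bucket_arn = aws_s3_bucket.config.arn # hypothetical input variable
}

resource "aws_appconfig_configuration_profile" "default" {
  application_id     = aws_appconfig_application.default.id
  name               = "s3"
  location_uri       = "s3://${aws_s3_bucket.config.bucket}/peers.json"
  retrieval_role_arn = module.config_iam.role_arn
  tags               = local.tags
}
```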

As far as tracing dependencies, the following resources offer, I think, a pretty comprehensive overview of understanding dependencies in Terraform.

@justinretzolk justinretzolk added the waiting-response Maintainers are waiting on response from community or contributor. label Jul 23, 2024
@reikje
Author

reikje commented Jul 23, 2024

Thanks @justinretzolk, we worked around the eventual consistency issue by no longer basing our AppConfig on an S3 object. Instead, we switched to a hosted configuration, which doesn't have the same problem.
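
For readers looking for the shape of that change, a hosted configuration might look something like the sketch below. This is an illustration built from the resources in the original report, not the reporter's actual configuration; the resource name peers and the profile name hosted are assumptions:

```hcl
resource "aws_appconfig_configuration_profile" "default" {
  application_id = aws_appconfig_application.default.id
  name           = "hosted"
  # "hosted" stores the content in AppConfig itself, so no retrieval_role_arn is set.
  location_uri   = "hosted"
  tags           = local.tags
}

resource "aws_appconfig_hosted_configuration_version" "peers" {
  application_id           = aws_appconfig_application.default.id
  configuration_profile_id = aws_appconfig_configuration_profile.default.configuration_profile_id
  content_type             = "application/json"
  content                  = jsonencode([{ id = "foo", name = "Bar" }])
}

resource "aws_appconfig_deployment" "default" {
  application_id           = aws_appconfig_application.default.id
  environment_id           = aws_appconfig_environment.default.environment_id
  configuration_profile_id = aws_appconfig_configuration_profile.default.configuration_profile_id
  configuration_version    = aws_appconfig_hosted_configuration_version.peers.version_number
  deployment_strategy_id   = "AppConfig.AllAtOnce"
  tags                     = local.tags
}
```

Because AppConfig no longer has to assume a role or read from S3, the IAM and S3 resources from the original configuration are not involved in creating the profile.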
