ENI is not deleted when sagemaker notebook instance is destroyed and a security group is attached to it #26090

Closed
DavideAG opened this issue Aug 2, 2022 · 2 comments · Fixed by #26553
Labels
bug Addresses a defect in current functionality. service/sagemaker Issues and PRs that pertain to the sagemaker service. service/vpc Issues and PRs that pertain to the vpc service.

DavideAG commented Aug 2, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

  • Terraform v1.1.9
  • provider registry.terraform.io/hashicorp/aws v4.24.0

Affected Resource(s)

  • aws_sagemaker_notebook_instance
  • aws_security_group

Terraform Configuration Files

I'm currently using a simple configuration with two modules.
The idea is to create an AWS SageMaker notebook instance in a subnet and then attach a security group (SG) to it.

module "sagemaker_notebook_instance" {
  source                   = "../../../modules/sagemaker-notebookinstance"
  instance_type            = "ml.t2.medium"
  name                     = "test-sg"
  instance_subnet_id       = "subnet-0751a8c66XXXXXXXX" # hidden
  instance_security_groups = [module.security-group.id]
}

module "security-group" {
  source      = "../../../modules/security-group"
  name        = "sagemaker-test-sg"
  description = "sagemaker-test security group"
  vpc_id      = "vpc-06c413a2eXXXXXXXX" # hidden
}

Some details about the two modules:

resource "aws_sagemaker_notebook_instance" "sagemaker_instance" {
  for_each                = var.instance_ids
  instance_type           = var.instance_type
  name                    = join("-", [module.resource_naming.composed_resource_name, each.value])
  role_arn                = aws_iam_role.default.arn
  lifecycle_config_name   = var.create_lifecycle ? aws_sagemaker_notebook_instance_lifecycle_configuration.sagemaker_lifecycle_conf[0].name : ""
  default_code_repository = var.create_git_config ? aws_sagemaker_code_repository.repo[0].code_repository_name : ""

  subnet_id              = var.instance_subnet_id
  security_groups        = var.instance_security_groups
  kms_key_id             = var.instance_kms_key_id
  direct_internet_access = var.instance_direct_internet_access

  lifecycle {
    create_before_destroy = true
    ignore_changes        = []
  }
}

resource "aws_security_group" "self_sg" {
  name_prefix = module.resource_naming.composed_resource_name
  description = var.description
  vpc_id      = var.vpc_id

  ingress {
    description = "Allow inbound traffic from network interfaces (and their associated instances) that are assigned to the same security group."
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    self        = true
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    create_before_destroy = true
  }
}

Expected Behavior

The ENI should be removed when Terraform destroys the SageMaker notebook instance, so that deleting the security group does not hit a dependency violation. Currently, the leftover ENI creates a deadlock and the security group cannot be deleted.

The basic idea is that the ENI must be deleted before Terraform tries to delete the attached SG. The AWS provider documentation describes a process for avoiding stuck operations during SG configuration changes, but it does not mention how to handle a full deletion.

Actual Behavior

After a while, terraform destroy fails because the ENI still exists and the security group cannot be deleted.

...
module.security-group.aws_security_group.self_sg: Still destroying... [id=sg-08cb8e3c478XXXXXX, 14m30s elapsed]
module.security-group.aws_security_group.self_sg: Still destroying... [id=sg-08cb8e3c478XXXXXX, 14m40s elapsed]
module.security-group.aws_security_group.self_sg: Still destroying... [id=sg-08cb8e3c478XXXXXX, 14m50s elapsed]
╷
│ Error: deleting Security Group (sg-08cb8e3c478XXXXXX): DependencyViolation: resource sg-08cb8e3c478XXXXXX has a dependent object
│       status code: 400, request id: 96f38922-3170-4dec-93c7-92c39XXXXXX
│
│
╵
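
To confirm which object the DependencyViolation refers to, one option is to list the network interfaces that still reference the security group. The following is a minimal sketch, not part of the original report: it assumes a separate, throwaway configuration, and the variable name security_group_id and the output name lingering_eni_ids are hypothetical. It uses the provider's aws_network_interfaces data source with the EC2 group-id filter.

```hcl
# Hypothetical input: the ID of the security group that fails to delete,
# e.g. the sg-... value shown in the error above.
variable "security_group_id" {
  type        = string
  description = "ID of the security group stuck in DependencyViolation"
}

# Look up every ENI that is still associated with that security group.
data "aws_network_interfaces" "attached_to_sg" {
  filter {
    name   = "group-id"
    values = [var.security_group_id]
  }
}

# A non-empty list here means an ENI is still holding the security group.
output "lingering_eni_ids" {
  value = data.aws_network_interfaces.attached_to_sg.ids
}
```

If the output is non-empty after the notebook instance has been destroyed, the listed ENIs are the likely dependent objects blocking the security group deletion.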

Steps to Reproduce

  1. terraform apply
  2. terraform destroy

References

Probably this issue is similar to:

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/sagemaker Issues and PRs that pertain to the sagemaker service. service/vpc Issues and PRs that pertain to the vpc service. labels Aug 2, 2022
@justinretzolk justinretzolk added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Aug 22, 2022
@github-actions github-actions bot added this to the v4.29.0 milestone Aug 31, 2022

github-actions bot commented Sep 2, 2022

This functionality has been released in v4.29.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.
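
For readers picking up the fix, a provider version constraint along these lines should select a release that includes it. This is a sketch assuming the standard hashicorp/aws source address; an existing required_providers block in your configuration should be adjusted rather than duplicated.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # v4.29.0 is the release named in the comment above; the issue was
      # reported against v4.24.0.
      version = ">= 4.29.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a newer provider version than the one recorded in the dependency lock file.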

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!


github-actions bot commented Oct 3, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 3, 2022