Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various issue with the sample code #20

Open
WesleyCharlesBlake opened this issue Mar 1, 2018 · 4 comments
Open

Various issue with the sample code #20

WesleyCharlesBlake opened this issue Mar 1, 2018 · 4 comments

Comments

@WesleyCharlesBlake
Copy link

WesleyCharlesBlake commented Mar 1, 2018

    • The code can post messages to the wrong SNS topic when retrying. It looks for the first SNS topic in the account that has a lambda function subscribed to it and posts the retry message to that topic. See line # 161 in index.py for the implementation. The SNS message that is received by the Lambda already has the ARN of the SNS topic.
    • The code does not do any kind of pagination against the ECS API when reading the list of EC2 instances. So if it couldn't find the instance ID that was about to be terminated on the first page, then the instance was not set to DRAINING and the end users would see 50X messages when the operation timed out and autoscaling killed the instance.
  1. There is a large amount of unused code and variables in the in the code. The index.py file can be knocked down by half. 👕 resolve lint errors in index.py #19 should resolve the unused variables.

    • The retry logic did not put in any kind of delay in place. The Lambda function would be invoked about 5-10 times a second, and each Lambda function invocation would provably make close to a dozen AWS API calls. This could result in accounts being throttled.
    • The .zip archive code differs to what is in VCS (particular there are changes to code/index.py. which are not included in the bundled archive. This could result in unexpected results for users following the blog article's recommendation
@ardelio
Copy link

ardelio commented Mar 4, 2018 via email

@hwatts
Copy link

hwatts commented Mar 15, 2018

I'd add to the issues that it discovers the ECS cluster name by parsing userdata and looking for text strings.

This has potential to go wrong when used in environments that may be bootstrapping other things in userdata and given that this example is used in a CFN script that also creates the ECS Cluster (e.g. the name's already known), this should be passed as an environment variable to the lambda function.

@danspain-plutoflume
Copy link

I'd add to the issues that it discovers the ECS cluster name by parsing userdata and looking for text strings.

Yes, I've run in to this issue, It fails to find the cluster name if the user data is gzip encoded.

@masneyb
Copy link
Contributor

masneyb commented Apr 16, 2018

The original bug report text in this issue is what I submitted to AWS support on 2018-02-26. Here is a pull request that addresses all of these issues: https://github.com/aws-samples/ecs-cid-sample/pull/23/files. The Lambda function in my new version is now small enough that I'm including it inline within the CloudFormation template so that there are no separate ZIP files to upload.

Brian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants