Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support InferenceComponent CRD. #260

Merged
merged 11 commits into from
Mar 11, 2024

Conversation

suryans-commit
Copy link
Member

@suryans-commit suryans-commit commented Feb 27, 2024

Description of changes:

  • Add new InferenceComponent CRD
  • Add unit and e2e tests

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@suryans-commit
Copy link
Member Author

/hold

@ack-prow ack-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 27, 2024
@suryans-commit
Copy link
Member Author

/retest

@suryans-commit
Copy link
Member Author

/unhold

@ack-prow ack-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 6, 2024
@suryans-commit
Copy link
Member Author

/retest

4 similar comments
@suryans-commit
Copy link
Member Author

/retest

@suryans-commit
Copy link
Member Author

/retest

@suryans-commit
Copy link
Member Author

/retest

@suryans-commit
Copy link
Member Author

/retest

assert deleted


@pytest.fixture(scope="module")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a module wide fixture?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to reuse the endpoint within the inference component test and thus it has a module wide scope, similar to all other components.

var lastSpecForUpdateString *string = nil
// get last endpoint config name used for update from annotations
annotations := desired.ko.ObjectMeta.GetAnnotations()
for k, v := range annotations {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a for loop needed for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean, why loop instead of directly retrieving the value? I can make that change in the next release since functionally there is no change, and this release is time sensitive. Let me know if you think otherwise.

// Update is blocked in the following cases:
// 1. while InferenceComponentStatus != InService (handled by requeueUntilCanModify method).
// 2. InferenceComponentStatus == Failed.
// 3. A previous update to the InferenceComponent with same spec failed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit curious about this, if this isnt checked will ack keep performing reconciliation loops as desired != latest?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is the reason for checking this condition. Since InferenceComponent is a Blue/Green deployment if the Update fails it will rollback to last successful state with Status = InService. So ACK will re-trigger the update. After discussion with product it was decided to move the resource to terminal state if an update fails.

// 2. desiredSpec == lastSpecForUpdate only tells us an update was tried with lastSpecForUpdate
// but does not tell us anything if the update was successful or not in the past because
// it is set if updateInferenceComponent returns 200 (async operation).
// 3. Now, sdkUpdate can execute because of change in any field in Spec (like tags/deploymentConfig in future)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Based off line 65/105 tags are ignored in the spec comparison.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since tags are ignored, that is why SDK tags will trigger an update.

}

// EqualInferenceComponentSpec checks if two InferenceComponentSpec instances are equal
func EqualInferenceComponentSpec(desiredSpec *svcapitypes.InferenceComponentSpec,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the metadata like annotations also compared here or is it just the Spec's API shape?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No metadata is not part of the InferenceComponentSpec struct.

template_path: common/sdk_delete_pre_build_request.go.tpl
sdk_delete_post_request:
template_path: common/sdk_delete_post_request.go.tpl
fields:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that theres a CurrentCopyCount(https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker/client/describe_inference_component.html), does that need to be included here as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, since that is not a read only field and can be updated by the customer.

@ananth102
Copy link
Member

looks good at a high level

@suryans-commit
Copy link
Member Author

/retest

func EqualInferenceComponentSpec(desiredSpec *svcapitypes.InferenceComponentSpec,
lastSpec *svcapitypes.InferenceComponentSpec) bool {
if desiredSpec == nil || lastSpec == nil {
return desiredSpec == lastSpec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this because .DeepEqual cannot handle nil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Reflect package does not handle nil pointers gracefully.

The reason for the nil check before using reflect.DeepEqual is to prevent a potential panic caused by attempting to dereference a nil pointer. If either desiredSpec or lastSpec is nil, attempting to pass them to reflect.DeepEqual directly would result in a runtime panic because reflect.DeepEqual expects both of its arguments to be non-nil.

@ryansteakley
Copy link
Member

overall looks good. Have you verified the status of the IC prints in the console when using kubectl?

@suryans-commit
Copy link
Member Author

suryans-commit commented Mar 11, 2024

overall looks good. Have you verified the status of the IC prints in the console when using kubectl?

Yes, have verified creating IC via kubectl and the corresponding prints in the console.

Copy link
Member

@a-hilaly a-hilaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @suryans-commit !
lgtm-ing as implementation is very similar to the existing resources.
/lgtm

@ack-prow ack-prow bot added the lgtm Indicates that a PR is ready to be merged. label Mar 11, 2024
Copy link

ack-prow bot commented Mar 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: a-hilaly, suryans-commit

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ack-prow ack-prow bot added the approved label Mar 11, 2024
@ack-prow ack-prow bot merged commit ae1e108 into aws-controllers-k8s:main Mar 11, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants