Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vanilla: return volume path as part of ListVolumes response for migrated in-tree volumes #2558

Merged

Conversation

adikul30
Copy link
Contributor

@adikul30 adikul30 commented Sep 18, 2023

What this PR does / why we need it:
ListVolumes call returns volumeID to nodeID mapping even in case of migrated in-tree volumes. In case of migrated volumes, external-attacher will look for the initial volumePath in the ListVolumes response. Since vSphere CSI Driver's ListVolumes response doesn't emit the volumePath for migrated volumes, external-attacher continuously makes AttachVolume calls to CSI and the VC UI gets filled with AttachVolume tasks.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged):
fixes #2534

Testing done:

Manual

#2558 (comment)

block vanilla precheckin

Build #2441 (Sep 21, 2023, 7:20:34 PM)
adkulkarni
PR 2558
------------------------------ Ran 1 of 803 Specs in 368.841 seconds SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 802 Skipped PASS Ginkgo ran 1 suite in 18m43.833249849s Test Suite Passed -- ------------------------------ Ran 13 of 803 Specs in 6636.796 seconds SUCCESS! -- 13 Passed | 0 Failed | 0 Pending | 790 Skipped PASS Ginkgo ran 1 suite in 1h51m53.279468293s Test Suite Passed -- Ran 50 of 803 Specs in 4666.036 seconds FAIL! -- 49 Passed | 1 Failed | 0 Pending | 753 Skipped --- FAIL: TestE2E (4666.33s) FAIL Ginkgo ran 1 suite in 1h29m46.981350902s Test Suite Failed 

Build #2442 (Sep 21, 2023, 11:26:18 PM)
adkulkarni
PR 2558
------------------------------ Ran 1 of 803 Specs in 103.482 seconds SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 802 Skipped PASS Ginkgo ran 1 suite in 17m37.383719365s Test Suite Passed 

vcp to csi

Build #627 (Sep 22, 2023, 12:10:08 AM)
adkulkarni
PR 2558
Nimbus testbed password: KWAE0smy5QKw-*wd
------------------------------ Ran 20 of 803 Specs in 4391.390 seconds SUCCESS! -- 20 Passed | 0 Failed | 0 Pending | 783 Skipped PASS Ginkgo ran 1 suite in 1h14m45.85898182s Test Suite Passed 

Special notes for your reviewer:

Release note:

vanilla: return volume path as part of ListVolumes response for migrated in-tree volumes

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 18, 2023
@k8s-ci-robot
Copy link
Contributor

Hi @adikul30. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 18, 2023
@adikul30 adikul30 changed the title Return volume path as part of ListVolumes response for migrated volumes (wip) vanilla: return volume path as part of ListVolumes response for migrated volumes Sep 18, 2023
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 18, 2023
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Sep 19, 2023
@adikul30
Copy link
Contributor Author

adikul30 commented Sep 19, 2023

root@k8s-control-689-1695085409:~# k describe pv
Name:            pvc-69d697b0-d048-4869-a037-e3803ed8ee6c
Labels:          <none>
Annotations:     pv.kubernetes.io/migrated-to: csi.vsphere.vmware.com
                 pv.kubernetes.io/provisioned-by: kubernetes.io/vsphere-volume
                 volume.kubernetes.io/provisioner-deletion-secret-name:
                 volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com]
StorageClass:    vcp-sc
Status:          Bound
Claim:           default/vcp-pvc
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        2Gi
Node Affinity:   <none>
Message:
Source:
    Type:               vSphereVolume (a Persistent Disk resource in vSphere)
    VolumePath:         [vsanDatastore] 95e90965-9a8b-888d-8057-0200ae74606b/f961f10c92e6403bb1054b0f6682e08f.vmdk
    FSType:             ext4
    StoragePolicyName:
Events:                 <none>


Name:            pvc-ef31dafa-b061-4035-95c8-687c490afc26
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: csi.vsphere.vmware.com
                 volume.kubernetes.io/provisioner-deletion-secret-name:
                 volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers:      [kubernetes.io/pv-protection external-attacher/csi-vsphere-vmware-com]
StorageClass:    example-raw-block-sc
Status:          Bound
Claim:           default/example-raw-block-pvc
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Block
Capacity:        5Gi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            csi.vsphere.vmware.com
    FSType:
    VolumeHandle:      cb683610-6fe9-4946-98c7-4d5af6e9c350
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1695329784127-9375-csi.vsphere.vmware.com
                           type=vSphere CNS Block Volume
Events:                <none>
---
2023-09-21T21:45:58.035Z	DEBUG	vanilla/controller.go:2566	List volume response: entries:<volume:<volume_id:"cb683610-6fe9-4946-98c7-4d5af6e9c350" > status:<published_node_ids:"42210368-b370-ef50-0f72-893f7216b46d" > > entries:<volume:<volume_id:"[vsanDatastore] 95e90965-9a8b-888d-8057-0200ae74606b/f961f10c92e6403bb1054b0f6682e08f.vmdk" > status:<published_node_ids:"4221a956-0581-ac5a-51a6-c454ebcf1b36" > > 	{"TraceId": "c750e4db-1c8b-4b86-9adf-6b560cb7afbc"}

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 21, 2023
@adikul30 adikul30 force-pushed the list-vol-vcp-csi-migration branch 2 times, most recently from 5fdc404 to 8fb07ec Compare September 21, 2023 08:01
@divyenpatel
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 21, 2023
@adikul30 adikul30 force-pushed the list-vol-vcp-csi-migration branch 3 times, most recently from eb124da to c8ada38 Compare September 21, 2023 21:49
@svcbot-qecnsdp
Copy link

Started Vanilla block pre-checkin pipeline... Build Number: 2439

@svcbot-qecnsdp
Copy link

Build ID: 2439
Block vanilla build status: FAILURE 
Stage before exit: e2e-tests 
Jenkins E2E Test Results: 
------------------------------

Ran 1 of 803 Specs in 365.815 seconds
SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 802 Skipped
PASS

Ginkgo ran 1 suite in 17m35.344693425s
Test Suite Passed
--
Ran 13 of 803 Specs in 9112.987 seconds
FAIL! -- 7 Passed | 6 Failed | 0 Pending | 790 Skipped
--- FAIL: TestE2E (9113.17s)
FAIL

Ginkgo ran 1 suite in 2h33m20.255389531s

Test Suite Failed

volumeId := blockVolID
migratedVolumePath, err := volumeMigrationService.GetVolumePath(ctx, blockVolID, false)
if err != nil {
log.Errorf("failed to get volumePath for migrated volume. err: %v", err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If GetVolumePath fails for a migrated volume, we just log an error and proceed with line: 2655. Don't we have to return error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the modified GetVolumePath method (passing a flag to only check the in-memory cache to see if it's an in-tree volume), we are simply going through a map and returning true or false. No error is returned.

So, I'm not expecting the control to reach this line unless something weird like volumeMigrationService is not initialized. Are you referring to such an error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on Divyen's above comment, I am changing the method to return ErrNotFound when we don't get a cache hit. But I still don't expect an error other than that one. Can you confirm if we should return an error in ListVolumes response if we get an error other than ErrNotFound?

@adikul30 adikul30 force-pushed the list-vol-vcp-csi-migration branch 2 times, most recently from 415401a to 877941c Compare September 22, 2023 19:37
Copy link
Member

@divyenpatel divyenpatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 22, 2023
Copy link
Collaborator

@chethanv28 chethanv28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few comments

Copy link
Collaborator

@chethanv28 chethanv28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have few minor comments after the latest commit.
Overall, code changes look good to me
/approve

Copy link
Collaborator

@chethanv28 chethanv28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 22, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adikul30, chethanv28, divyenpatel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [chethanv28,divyenpatel]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 255a239 into kubernetes-sigs:master Sep 22, 2023
12 checks passed
chethanv28 pushed a commit to chethanv28/vsphere-csi-driver that referenced this pull request Sep 22, 2023
k8s-ci-robot pushed a commit that referenced this pull request Sep 25, 2023
* skip adding node to cache during attach or detach operation (#2546)

* Return volume path as part of ListVolumes response for migrated volumes (#2558)

---------

Co-authored-by: Divyen Patel <divyenp@vmware.com>
Co-authored-by: Aditya Kulkarni <adkulkarni@vmware.com>
chethanv28 pushed a commit to chethanv28/vsphere-csi-driver that referenced this pull request Sep 25, 2023
k8s-ci-robot pushed a commit that referenced this pull request Sep 26, 2023
…related to AttachVolumes and also the nodes cache fix during attach/detach. (#2561)

* Determine cluster distribution server version & type using kubernetes API server version (#2404)

* Determine cluster distribution server version & type using kubernetes API server version

* Exit the container if we are not able to fetch the k8s client

* break the loop as soon as we find the distribution type

* skip adding node to cache during attach or detach operation (#2546)

* Return volume path as part of ListVolumes response for migrated volumes (#2558)

---------

Co-authored-by: Divyen Patel <divyenp@vmware.com>
Co-authored-by: Aditya Kulkarni <adkulkarni@vmware.com>
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-3.1.1-candidate size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
5 participants