Add ucx support (prototype) #18631
Open: lucyge2022 wants to merge 1 commit into Alluxio:main from lucyge2022:ucx_squash
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Mon Jun 24 15:23:06 2024 -0700

Squashed commit of the following:

commit 36f0ea4
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Thu Dec 14 15:31:43 2023 -0800

    add test to alloc gpu mem and call readRMA; mark test as ignored for now, until a hardware fabric env is available

commit f0ba8ac
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Tue Dec 12 17:07:21 2023 -0800

    1. make LocalCacheManager.cache() available in interface
    2. allocate direct ByteBuffer and then register with UcpMemory to provide UcpMemory wrapped buffer, to avoid ucx failure to allocate user mem ( mm_sysv.c:114 UCX ERROR failed to allocate 4096 bytes with mm for user memory )
    3. make UcxReadTest#testClientServerr do random unaligned read, and add testStandaloneServer as a sanity test for standalone UcpServer
    4. downgrade debugging logs' level from info to debug
    5. remove standalone testing process class UcpClientTest

commit 7154d4f
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Dec 8 22:47:37 2023 -0800

    add getUcpMemory in wrapper cachemgr implementations

commit 5b72806
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Dec 8 22:27:51 2023 -0800

    instantiate listener in start() instead of constructor

commit 1def30d
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Dec 8 21:43:36 2023 -0800

    fixes for ucp server module

commit 99eea03
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Thu Dec 7 17:39:46 2023 -0800

    additional changes to make UcpServer a module

commit a7237fb
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Thu Dec 7 13:05:14 2023 -0800

    WIP - make UcpServer a module

commit c6cd05a
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Dec 6 17:19:31 2023 -0800

    stash changes - worker version using alluxioworker process to start standalone ucp server

commit f293e49
Author: LucyGe <lucy.ge@alluxio.com>
Date: Wed Dec 6 21:13:50 2023 +0000

    fix compile error and add start scripts for UcpServer / UcpClientTest

commit 3a2c618
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Dec 6 11:04:14 2023 -0800

    stash local changes to debug stressucxbench

commit 9bc39c5
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Thu Nov 30 17:01:14 2023 -0800

    1. add cache / getUcpMemory api in CacheManager interface
    2. add error case handling in getUcpMemory
    3. add UcxConnectionPool
    4. ReadRequestRMAHandler should break without error if it can't serve the requested read len
    5. have UcpServer own its own cachemanager instead of relying on worker, add temporary prefill func to warm up cache
    6. fix UcxDataReader to return correct read len
    7. add StressUcxBench

commit 4eb130b
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Nov 17 14:27:09 2023 -0800

    WIP - making multi-iteration read UT work

commit c573235
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Nov 8 15:07:00 2023 -0800

    Initial working version of ReadRequestRMA for both client and server + add UT UcxReadTest

commit 3a13395
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Nov 3 17:18:16 2023 -0700

    mv UT class

commit 2150e4d
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Nov 3 17:00:05 2023 -0700

    WIP - add readRMA in reader client + add related read UT

commit 131441d
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Nov 3 12:24:18 2023 -0700

    1. fixes on the buffer to send back info in accepting conn
    2. fix for UcxConnectionTest.testEstablishConnection; it now works to test the UcxConnection establishment logic

commit 653da91
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Nov 1 16:55:56 2023 -0700

    test file name change

commit 5bbe0fc
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Nov 1 15:51:43 2023 -0700

    sort pom

commit e0d8592
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Nov 1 15:47:07 2023 -0700

    compile errors

commit 49944e2
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Nov 1 15:04:30 2023 -0700

    add missing files

commit 29106a7
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Nov 1 15:02:41 2023 -0700

    WIP - add init new conn / accept incoming conn logic

commit 2104843
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Tue Oct 31 11:08:20 2023 -0700

    WIP -
    1) add RMA read request handler
    2) tag establishment fixes

commit 9e2648e
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Oct 25 21:38:02 2023 -0700

    WIP - basic skeleton

commit c4206be
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Mon Oct 23 10:22:48 2023 -0700

    add missing file in refactoring

commit 2928a78
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Mon Oct 23 10:10:18 2023 -0700

    WIP - refactor

commit 2e7950b
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Oct 18 16:21:13 2023 -0700

    worker end-to-end read workflow version of ucpserver/ucxDataReader

commit f443ec6
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Wed Oct 11 16:45:50 2023 -0700

    WIP:
    1. req sendTagged and recvTagged should have same buffersize
    2. use different tag for different client inetaddr
    3. start recvReq on accepting conn on server side, and keep recvReq for the same client one after another

commit 37ee296
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Tue Oct 10 21:21:37 2023 -0700

    WIP - add UcpServer / UcpClientTest standalone main()

commit 9d58cc4
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Mon Oct 9 14:54:57 2023 -0700

    WIP -
    1. use correct dependency in pom
    2. add test to initially test client/server

commit 3d5c353
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Fri Oct 6 12:37:37 2023 -0700

    use abs path from local for now

commit 95a69e0
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Tue Sep 19 14:17:39 2023 -0700

    jar change

commit 1472081
Author: Lucy Ge <lucy.ge@alluxio.com>
Date: Tue Sep 19 14:01:03 2023 -0700

    ucp server/client WIP
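
Commit f443ec6 mentions using a different tag for each client inetaddr. The PR text does not show how the tag is derived, so the following is only a minimal sketch of one possible scheme, not the PR's implementation. The class `ClientTagSketch` and the method `tagFor` are hypothetical names; the sketch uses only the JDK and makes no UCX calls. It assumes the tag is a 64-bit value, and packs an IPv4 address and port into the low 48 bits.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.Arrays;

// Hypothetical illustration only: one way to derive a distinct 64-bit
// tag per client address. Not taken from the PR.
public class ClientTagSketch {

  // Packs address bytes and port into a long:
  // bits 16..47 hold the address (or a 32-bit hash of it), bits 0..15 the port.
  static long tagFor(InetSocketAddress addr) {
    InetAddress inet = addr.getAddress();
    if (inet == null) {
      throw new IllegalArgumentException("unresolved address: " + addr);
    }
    byte[] ip = inet.getAddress();
    long high = 0;
    if (ip.length == 4) {
      // IPv4: the four bytes fit exactly, so distinct addresses give distinct tags.
      for (byte b : ip) {
        high = (high << 8) | (b & 0xFFL);
      }
    } else {
      // IPv6: folded to 32 bits with a hash, so collisions are possible.
      high = Arrays.hashCode(ip) & 0xFFFFFFFFL;
    }
    return (high << 16) | (addr.getPort() & 0xFFFFL);
  }

  public static void main(String[] args) throws Exception {
    InetAddress ip = InetAddress.getByAddress(new byte[] {10, 0, 0, 1});
    InetSocketAddress client = new InetSocketAddress(ip, 5000);
    System.out.println(Long.toHexString(tagFor(client)));
  }
}
```

With a scheme like this, two connections from the same host but different ports get different tags, which matches the intent of keeping each client's tagged send/receive traffic separate. A real implementation would need a collision policy for IPv6 peers.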
Add UcpServer to accept UCX (UCP) protocol connections over the network.