Synchronous orchestrator architecture #379

Merged
merged 1 commit into from
Mar 31, 2023

@lkurija1 (Contributor) commented Mar 28, 2023

Description

Synchronous orchestrator architecture for Flame
Additions:

  • New mode added: coord_syncfl
    The new mode revolves around a new component, the Coordinator.
    • Top aggregator notifies the Coordinator of the training state (start/finish).
    • Mid aggregators now consult the Coordinator for selected trainers.
    • Trainers now register themselves with the Coordinator.
  • New component added: Coordinator
    The Coordinator's role is to coordinate the training process: it selects the trainers and aggregators and sends the selected trainers to the aggregators.
    The new architecture implements top and middle aggregators, as well as trainers, with additional lifespan steps that accommodate the protocol between them and the Coordinator.
  • New example added: coord_hier_syncfl
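
For readers skimming the description, the interaction listed above can be sketched roughly as follows. This is an illustrative in-memory model only, not the code in this PR: the class name, method names (`register_trainer`, `start_round`, `selected_trainers`, `finish_training`) and the round-robin assignment are all assumptions, and the real components talk over Flame channels rather than direct method calls. Aggregator selection is left out for brevity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical stand-in for the Coordinator role; names are illustrative."""

    rng: random.Random = field(default_factory=random.Random)
    trainers: set = field(default_factory=set)
    assignment: dict = field(default_factory=dict)
    training_done: bool = False

    def register_trainer(self, trainer_id):
        # Trainers register themselves with the Coordinator.
        self.trainers.add(trainer_id)

    def start_round(self, aggregator_ids, trainers_per_round):
        # Called when the top aggregator reports that training (a round) starts.
        # Assumption: pick trainers at random, then spread them round-robin
        # across the given mid aggregators.
        self.assignment = {agg: [] for agg in aggregator_ids}
        if not aggregator_ids:
            return
        pool = sorted(self.trainers)
        chosen = self.rng.sample(pool, min(trainers_per_round, len(pool)))
        for i, trainer in enumerate(chosen):
            agg = aggregator_ids[i % len(aggregator_ids)]
            self.assignment[agg].append(trainer)

    def selected_trainers(self, aggregator_id):
        # Mid aggregators consult the Coordinator for their selected trainers.
        return list(self.assignment.get(aggregator_id, []))

    def finish_training(self):
        # Called when the top aggregator reports that training has finished.
        self.training_done = True


# Example flow: five trainers register, four are selected for two mid aggregators.
coordinator = Coordinator()
for trainer_id in ["t1", "t2", "t3", "t4", "t5"]:
    coordinator.register_trainer(trainer_id)
coordinator.start_round(["mid-a", "mid-b"], trainers_per_round=4)
mid_a = coordinator.selected_trainers("mid-a")
mid_b = coordinator.selected_trainers("mid-b")
coordinator.finish_training()
```

In this sketch each mid aggregator only learns about its own trainers, which mirrors the "consult the Coordinator" step; how selection is actually performed in coord_syncfl is defined by the PR's code, not by this example.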

Type of Change

  • Bug Fix
  • New Feature
  • Breaking Change
  • Refactor
  • Documentation
  • Other (please describe)

Checklist

  • I have read the contributing guidelines
  • Existing issues have been referenced (where applicable)
  • I have verified this change is not present in other open pull requests
  • Functionality is documented
  • All code style checks pass
  • New code contribution is covered by automated tests
  • All new and existing tests pass

@codecov-commenter commented Mar 29, 2023

Codecov Report

Merging #379 (bdf4566) into main (225c86d) will increase coverage by 0.64%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #379      +/-   ##
==========================================
+ Coverage   14.48%   15.13%   +0.64%     
==========================================
  Files          48       48              
  Lines        2768     2775       +7     
==========================================
+ Hits          401      420      +19     
+ Misses       2338     2327      -11     
+ Partials       29       28       -1     

see 1 file with indirect coverage changes

@lkurija1 marked this pull request as ready for review March 31, 2023 16:43
@lkurija1 changed the title from "[WIP] Initial implementation of orchestrator arch" to "Synchronous orchestrator architecture" Mar 31, 2023

@myungjin (Contributor) left a comment

lgtm

@myungjin merged commit f6c291e into cisco-open:main Mar 31, 2023
@lkurija1 deleted the orchestrator-architecture branch April 10, 2023 14:54
3 participants