feat: add SchedulerV3 #1996

Merged
merged 8 commits into from
Jun 4, 2024

Conversation

OlivierDehaene (Member) commented Jun 3, 2024

  • Refactor the code to support multiple versions of generate.proto at the same time
  • Add v3/generate.proto (identical to generate.proto for now, but allows future changes without impacting v2 backends)
  • Add Schedule trait to abstract the queuing and batching mechanisms, which will differ in the future
  • Add SchedulerV2 and SchedulerV3 implementations
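
As an illustration of the trait-based split described above, here is a minimal sketch, not the code from this PR. The `Request` type, the method names (`schedule`, `next_batch`, `protocol_version`), and the FIFO token-budget batching are all assumptions made for the example; the real router is asynchronous and its trait signature may differ. Both implementations share the same batching helper, mirroring the note that v3 is identical to v2 for now.

```rust
use std::collections::VecDeque;

/// Hypothetical request type; the real router carries validated inputs and parameters.
#[derive(Debug, Clone, PartialEq)]
pub struct Request {
    pub id: u64,
    pub input_tokens: usize,
}

/// Hypothetical trait shape: hides queuing and batching behind one interface
/// so that each protocol version can change its strategy independently.
pub trait Scheduler {
    /// Add a request to the queue.
    fn schedule(&mut self, request: Request);
    /// Remove and return the next batch that fits within the token budget.
    fn next_batch(&mut self, max_batch_total_tokens: usize) -> Vec<Request>;
    /// Name of the protocol version this scheduler targets.
    fn protocol_version(&self) -> &'static str;
}

/// Shared FIFO batching: take requests from the front while the budget allows.
/// Note: a single request larger than the budget would never be batched here;
/// a real scheduler has to reject or handle such requests separately.
fn drain_fifo(queue: &mut VecDeque<Request>, budget: usize) -> Vec<Request> {
    let mut batch = Vec::new();
    let mut used = 0usize;
    loop {
        let tokens = match queue.front() {
            Some(request) => request.input_tokens,
            None => break,
        };
        if used + tokens > budget {
            break;
        }
        used += tokens;
        if let Some(request) = queue.pop_front() {
            batch.push(request);
        }
    }
    batch
}

#[derive(Default)]
pub struct SchedulerV2 {
    queue: VecDeque<Request>,
}

impl Scheduler for SchedulerV2 {
    fn schedule(&mut self, request: Request) {
        self.queue.push_back(request);
    }
    fn next_batch(&mut self, max_batch_total_tokens: usize) -> Vec<Request> {
        drain_fifo(&mut self.queue, max_batch_total_tokens)
    }
    fn protocol_version(&self) -> &'static str {
        "v2"
    }
}

/// Same behavior as V2 for now; this is where v3-only batching would later go.
#[derive(Default)]
pub struct SchedulerV3 {
    queue: VecDeque<Request>,
}

impl Scheduler for SchedulerV3 {
    fn schedule(&mut self, request: Request) {
        self.queue.push_back(request);
    }
    fn next_batch(&mut self, max_batch_total_tokens: usize) -> Vec<Request> {
        drain_fifo(&mut self.queue, max_batch_total_tokens)
    }
    fn protocol_version(&self) -> &'static str {
        "v3"
    }
}
```

Callers in this sketch hold a `Box<dyn Scheduler>` and never name the concrete version, which is what would let the v3 batching logic change later without touching the v2 path.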

message Input {
    repeated InputChunk chunks = 1;
}

enum GrammarType {
    GRAMMAR_TYPE_NONE = 0;
    GRAMMAR_TYPE_JSON = 1;

Collaborator

Shouldn't this be moved to v2?

Member Author

No, I added it to v3 directly to avoid any changes to v2.

Collaborator

I meant the file generate.proto.


(scheduler, health_ext, shard_info, max_batch_total_tokens)
}
Err(_) => {

Collaborator

What kind of error does connecting a v3 client to a v2 shard produce?

Member Author

ClientError::Connection("Server does not support v3 interface".to_string())
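
To make the exchange above concrete, here is a hedged sketch of how a router might try the v3 interface first and fall back to v2 when the shard rejects it. The `Err(_)` arm in the diff context suggests such a fallback, but that is an assumption; the `Shard` and `ProtocolVersion` types and the `connect` and `connect_with_fallback` functions are hypothetical names for this example, and only the error variant and its v3 message come from the reply above.

```rust
/// Error type mirroring the variant quoted in the reply above.
#[derive(Debug, PartialEq)]
pub enum ClientError {
    Connection(String),
}

/// Hypothetical enumeration of the supported protocol versions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProtocolVersion {
    V2,
    V3,
}

impl ProtocolVersion {
    fn name(self) -> &'static str {
        match self {
            ProtocolVersion::V2 => "v2",
            ProtocolVersion::V3 => "v3",
        }
    }
}

/// Hypothetical stand-in for a model shard and the interfaces it serves.
pub struct Shard {
    pub supported: Vec<ProtocolVersion>,
}

/// Simulated connection attempt: fails if the shard does not serve `version`.
pub fn connect(shard: &Shard, version: ProtocolVersion) -> Result<ProtocolVersion, ClientError> {
    if shard.supported.contains(&version) {
        Ok(version)
    } else {
        Err(ClientError::Connection(format!(
            "Server does not support {} interface",
            version.name()
        )))
    }
}

/// Try v3 first; on any error, retry with v2 (assumed fallback behavior).
pub fn connect_with_fallback(shard: &Shard) -> Result<ProtocolVersion, ClientError> {
    match connect(shard, ProtocolVersion::V3) {
        Ok(version) => Ok(version),
        Err(_) => connect(shard, ProtocolVersion::V2),
    }
}
```

With a shard that only serves v2, `connect` with `ProtocolVersion::V3` returns the connection error quoted above, while `connect_with_fallback` succeeds on v2.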

Collaborator

@Narsil Narsil left a comment

LGTM

@OlivierDehaene OlivierDehaene merged commit 757223b into main Jun 4, 2024
8 checks passed
@OlivierDehaene OlivierDehaene deleted the feat/generate_v3 branch June 4, 2024 13:56
yuanwu2017 pushed a commit to yuanwu2017/tgi-gaudi that referenced this pull request Sep 26, 2024
2 participants