
chat/base.py: extend checkpoint_dir before accessing it #1575

Merged: 4 commits merged into main from fix_chat_checkpoint_dir_extension on Jul 13, 2024

Conversation

Andrei-Aksionov (Collaborator)

Hi there 👋

Inside `auto_download_checkpoint`, the code also extends the path by prepending `checkpoints/` if that prefix isn't already provided.

Since this currently happens after `check_file_size_on_cpu_and_warn`, the command

litgpt chat google/gemma-2-9b-it

will fail.
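
For context, a minimal sketch of the ordering issue. The helper names follow this PR; the real signatures and logic in litgpt may differ:

```python
from pathlib import Path

# Hypothetical, simplified stand-in for litgpt's auto_download_checkpoint:
# it prepends "checkpoints/" when the user passes a bare repo id.
def auto_download_checkpoint(checkpoint_dir: Path) -> Path:
    if not checkpoint_dir.is_dir():
        checkpoint_dir = Path("checkpoints") / checkpoint_dir
        # ... download the weights here if they are missing ...
    return checkpoint_dir

checkpoint_dir = Path("google/gemma-2-9b-it")

# Buggy order: the raw path is accessed before it is extended, so
# "google/gemma-2-9b-it/lit_model.pth" does not exist yet and the check fails.
# check_file_size_on_cpu_and_warn(checkpoint_dir / "lit_model.pth", device)

# Fixed order: extend the directory first, then access files under it.
checkpoint_dir = auto_download_checkpoint(checkpoint_dir)
# check_file_size_on_cpu_and_warn(checkpoint_dir / "lit_model.pth", device)
```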

Andrei-Aksionov requested review from rasbt and removed the request for lantiga and awaelchli on July 12, 2024 10:07
Andrei-Aksionov marked this pull request as draft on July 12, 2024 12:47
rasbt (Collaborator) commented Jul 12, 2024

The reason why the auto-download happens so far down there (compared to the other places) is that I had to move it below the LoRA merging; otherwise it would download a model from the hub even when you only want to use local LoRA weights.
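
Illustratively, the constraint looks something like this. This is a sketch, not the actual chat/base.py code; `merge_lora` and `auto_download_checkpoint` are stubs standing in for the real helpers:

```python
from pathlib import Path

# Stubs standing in for the real litgpt helpers (illustration only).
def merge_lora(checkpoint_dir: Path) -> None: ...
def auto_download_checkpoint(checkpoint_dir: Path) -> Path: return checkpoint_dir

def prepare_checkpoint(checkpoint_dir: Path) -> Path:
    # If local LoRA weights are present, merge them first; running the
    # auto-download before this point would fetch the base model from the
    # hub even though only the local LoRA checkpoint is wanted.
    if (checkpoint_dir / "lit_model.pth.lora").is_file():
        merge_lora(checkpoint_dir)
    else:
        checkpoint_dir = auto_download_checkpoint(checkpoint_dir)
    return checkpoint_dir
```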

rasbt marked this pull request as ready for review on July 12, 2024 16:02
rasbt (Collaborator) commented Jul 12, 2024

I also didn't notice the missing CPU warning because I was running it on GPU 😅. The following reorg might work ...

rasbt (Collaborator) commented Jul 12, 2024

I think this should be good now. Feel free to merge if you agree.

Andrei-Aksionov (Collaborator, Author)

There is a problem:
`checkpoint_path` is created before `auto_download_checkpoint` is called, which means that `checkpoint_dir` hadn't been extended yet when `checkpoint_path` was built.
So when we do `load_checkpoint(fabric, model, checkpoint_path)`, it will throw an error (if the path doesn't contain `checkpoints/`).
This needs to be fixed, and the test should be updated.
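
In pseudocode, the stale-path problem and its fix look roughly like this (the shape is illustrative, not the exact chat/base.py code):

```python
from pathlib import Path

def auto_download_checkpoint(checkpoint_dir: Path) -> Path:
    # Simplified stand-in: prepend "checkpoints/" for bare repo ids.
    return checkpoint_dir if checkpoint_dir.is_dir() else Path("checkpoints") / checkpoint_dir

checkpoint_dir = Path("google/gemma-2-9b-it")

# Problem: checkpoint_path is derived from the *unextended* directory ...
checkpoint_path = checkpoint_dir / "lit_model.pth"
checkpoint_dir = auto_download_checkpoint(checkpoint_dir)
# ... so load_checkpoint(fabric, model, checkpoint_path) would fail here.

# Fix: extend the directory first, then derive checkpoint_path from it.
checkpoint_dir = auto_download_checkpoint(checkpoint_dir)
checkpoint_path = checkpoint_dir / "lit_model.pth"
```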

rasbt (Collaborator) left a review comment

Oh, I see now what you meant previously.

Andrei-Aksionov (Collaborator, Author)

Are we good to merge? What do you think?

rasbt (Collaborator) commented Jul 13, 2024

Yes, I think so. But one last question: after the update, have you double-checked / tested it on custom paths that don't start with `checkpoints/`, like

litgpt chat my_custom_dir/google/gemma-2-9b-it

Andrei-Aksionov (Collaborator, Author)

> have you double-checked / tested it on custom paths that don't start with `checkpoints/`

No 😊.
I only just now checked it, with a Pythia model:

fix_chat_checkpoint_dir_extension ~/lit-gpt litgpt chat custom_dir/$repo_id
{'access_token': None,
 'checkpoint_dir': PosixPath('custom_dir/EleutherAI/pythia-70m'),
 'compile': False,
 'multiline': False,
 'precision': None,
 'quantize': None,
 'temperature': 0.8,
 'top_k': 200,
 'top_p': 1.0}
Now chatting with pythia-70m.
To exit, press 'Enter' on an empty prompt.

Seed set to 1234
>> Prompt: Hello
>> Reply: *
Time for inference: 0.28 sec total, 3.52 tokens/sec, 1 tokens

@rasbt
Copy link
Collaborator

rasbt commented Jul 13, 2024

Nice, thanks for checking! Looks all good to me now :)

rasbt merged commit d4c87bd into main on Jul 13, 2024
9 checks passed
rasbt deleted the fix_chat_checkpoint_dir_extension branch on July 13, 2024 14:20