
[Feature] Learning Rate Modified by Steps #2

Closed
torridgristle opened this issue Mar 7, 2021 · 4 comments

Comments

@torridgristle

I've experimented with a learning rate that changes as the steps increase, after seeing Aphantasia develop a general image very quickly but then slow down when working on small details. I believe that my proposed alternative puts more focus on larger shapes and less on details.

I expose the learning_rate variable and add a learning_rate_max variable in the Generate cell, remove the optimizer = torch.optim.Adam(params, learning_rate) line, and instead add this inside def train(i):

# linearly interpolate from learning_rate to learning_rate_max as i goes from 0 to steps,
# then rebuild the optimizer with the new rate each step
learning_rate_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
optimizer_new = torch.optim.Adam(params, learning_rate_new)

With this, I find that a learning_rate of 0.0001 and a learning_rate_max of 0.008 work well, at least for 300-400 steps and about 50 samples.
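
For reference, a minimal self-contained sketch of this kind of linear ramp; the parameters and the placeholder loss below are illustrative stand-ins, not the actual Aphantasia notebook code:

import torch

# illustrative stand-ins; the real notebook optimizes its own params against a CLIP loss
params = [torch.randn(16, requires_grad=True)]
steps = 300
learning_rate, learning_rate_max = 0.0001, 0.008

for i in range(steps):
    # linear ramp from learning_rate to learning_rate_max over the run
    learning_rate_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
    # recreated every step, as proposed above (note this also resets Adam's momentum state)
    optimizer = torch.optim.Adam(params, learning_rate_new)
    optimizer.zero_grad()
    loss = (params[0] ** 2).sum()  # placeholder loss
    loss.backward()
    optimizer.step()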

@eps696
Owner

eps696 commented Mar 8, 2021

thanks for the proposal, will check that! i did notice fancy behaviour of the training details, but haven't performed such thorough tests.
the process for a single image looked ok to me as is, but i had a tough time with multi-phrase continuous runs - the imagery tended to get stuck after 7-10 steps. meanwhile i resorted to weird tricks of mixed initialization + further interpolation; hopefully your approach may help there (since the optimizer and params are recreated every cycle anyway).

i'll keep this open till further exploration.

@eps696
Owner

eps696 commented Mar 11, 2021

@torridgristle i gave it a try, here are my findings:

  • your settings do decrease "overpopulation", producing a more sparse and less clogged look;
  • a lower rate (0.003~0.004) seems to produce a similar effect on its own, without the progressive setup;
  • a progressive decrease also gives quite interesting results (dense painting, better colors);
  • all of the above was observed on the ViT model; i didn't notice any specific effect for RN101 yet.

i will add and expose progressive mode as an option (and change the default lrate) anyway, to encourage further experiments.
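
As a rough illustration of what such an option could look like (the flag names and defaults below are hypothetical, not the actual Aphantasia arguments), the ramp can be computed per step and also run in reverse for a progressive decrease:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--prog_lrate', action='store_true', help='ramp the learning rate over steps (hypothetical flag)')
parser.add_argument('--lrate', type=float, default=0.001, help='initial learning rate')
parser.add_argument('--lrate_max', type=float, default=0.008, help='final learning rate; set below --lrate for a progressive decrease')
a = parser.parse_args()

def lrate_at(i, steps):
    # constant rate unless progressive mode is enabled
    if not a.prog_lrate:
        return a.lrate
    # linear ramp from lrate to lrate_max (decreasing if lrate_max < lrate)
    return a.lrate + (i / steps) * (a.lrate_max - a.lrate)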

please note also that:

  1. the learning rate of the optimizer can be changed on the fly, without recreating it, as follows:
for g in optimizer.param_groups: 
    g['lr'] = lr_init + (i / steps) * (lr_max - lr_init)
  2. the generation proceeds by patches/samples of random size (and position), and it's their size that directly affects the size of the painted features. so it may be worth changing that size progressively in the slice_imgs function (@JonathanFly proposed that a while ago, but i didn't dig into it); a rough sketch of that idea is below.
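
A rough standalone sketch of that idea (this is not the actual slice_imgs signature, just an illustration): shrink the upper bound of the random crop size as training progresses, so early steps paint large features and later steps refine small ones.

import random

def progressive_crop_size(i, steps, full_size, min_frac=0.1):
    # upper bound on the crop fraction shrinks linearly from the full frame down to min_frac
    max_frac = 1.0 - (1.0 - min_frac) * (i / steps)
    frac = random.uniform(min_frac, max_frac)
    return max(1, int(full_size * frac))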

@eps696
Owner

eps696 commented Mar 12, 2021

further tests have shown that in most cases progressive lrate does have some impact on the composition. i would not call it "larger shapes enhancement" (sometimes it just drew significantly fewer elements of all kinds), but it's worth having in the options.
exact lrate values depend heavily on the model, the prompts and (especially) the resolution: even 0.03-0.05 was not enough to cover a whole 4k frame in some cases (training up to 1000 steps with 256 samples). i also tested rates as big as 1~10, and they have their own interesting specifics (not explicitly generalizable into a common rule).

@eps696
Owner

eps696 commented Mar 16, 2021

@torridgristle implemented and mentioned in the readme. thanks for the discovery; closing now.

@eps696 eps696 closed this as completed Mar 16, 2021