
Training epochs loss #11

Open
kankratekaran opened this issue Oct 1, 2019 · 0 comments

Comments

@kankratekaran

I fine-tuned Mittens using the Stanford GloVe embeddings on my review dataset. After I prepared my co-occurrence matrix, the vocabulary size was 43,933, so given the capacity of my computer I fine-tuned in two parts:

  1. used the first 22,000 words of the vocabulary in a first pass to fine-tune embeddings, and
  2. used the remaining vocabulary in a second pass (roughly as in the sketch below).
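
For concreteness, the two passes look roughly like the following sketch. It assumes the mittens package's Mittens class and its fit() interface; the matrix X, the word list vocab, the GloVe dict glove, and the split point are stand-in placeholders rather than my exact code.

```python
import numpy as np
from mittens import Mittens

# Placeholder data so the sketch runs end to end; in the real setting X is my
# 43,933 x 43,933 co-occurrence matrix, vocab the matching word list, and
# glove a dict of pretrained Stanford GloVe vectors.
rng = np.random.default_rng(0)
vocab = [f"word{i}" for i in range(40)]
X = rng.integers(0, 10, size=(40, 40)).astype(float)
glove = {w: rng.normal(size=50) for w in vocab[:30]}  # some words lack GloVe vectors

SPLIT = 20   # stands in for the 22,000-word first pass
DIM = 50     # must match the dimensionality of the GloVe vectors being fine-tuned

def fine_tune(X_block, vocab_block):
    """Fine-tune GloVe vectors on one diagonal block of the co-occurrence matrix."""
    model = Mittens(n=DIM, max_iter=1000)
    # Words found in `glove` are warm-started from their pretrained vectors;
    # the rest are initialized randomly by Mittens.
    return model.fit(X_block, vocab=vocab_block, initial_embedding_dict=glove)

# Pass 1: first block of the vocabulary and its internal co-occurrences.
emb_pass1 = fine_tune(X[:SPLIT, :SPLIT], vocab[:SPLIT])
# Pass 2: remaining vocabulary.
emb_pass2 = fine_tune(X[SPLIT:, SPLIT:], vocab[SPLIT:])
```

Note that slicing both axes of X this way discards the co-occurrence counts between first-pass and second-pass words; treat that part of the sketch as an assumption about how the split was done.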

The strange thing I observe is that in the first pass the loss over 1,000 iterations dropped from roughly 91,000 to roughly 30,000, whereas in the second pass over 1,000 iterations the loss ranged from about 95 down to about 0.79.

I am confused by this behaviour because both passes had almost the same amount of data. I would like to know why this is happening.

Is this good or bad? If it is a problem, how can I fix it?
