
Training epochs loss #11

Open
kankratekaran opened this issue Oct 1, 2019 · 0 comments

Comments

@kankratekaran

I fine-tuned Mittens using the Stanford GloVe embeddings on my review dataset. After I prepared my co-occurrence matrix, the vocabulary size was 43,933, so given the capacity of my computer I fine-tuned in two parts:

  1. used the first 22,000 words of the vocabulary in a first pass to fine-tune embeddings, and
  2. used the remaining vocabulary in a second pass (roughly as in the sketch below).
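
For concreteness, the two passes look roughly like the following sketch. It assumes the mittens package's Mittens class and its fit() interface; the matrix X, the word list vocab, the GloVe dict glove, and the split point are stand-in placeholders rather than my exact code.

```python
import numpy as np
from mittens import Mittens

# Placeholder data so the sketch runs end to end; in the real setting X is my
# 43,933 x 43,933 co-occurrence matrix, vocab the matching word list, and
# glove a dict of pretrained Stanford GloVe vectors.
rng = np.random.default_rng(0)
vocab = [f"word{i}" for i in range(40)]
X = rng.integers(0, 10, size=(40, 40)).astype(float)
glove = {w: rng.normal(size=50) for w in vocab[:30]}  # some words lack GloVe vectors

SPLIT = 20   # stands in for the 22,000-word first pass
DIM = 50     # must match the dimensionality of the GloVe vectors being fine-tuned

def fine_tune(X_block, vocab_block):
    """Fine-tune GloVe vectors on one diagonal block of the co-occurrence matrix."""
    model = Mittens(n=DIM, max_iter=1000)
    # Words found in `glove` are warm-started from their pretrained vectors;
    # the rest are initialized randomly by Mittens.
    return model.fit(X_block, vocab=vocab_block, initial_embedding_dict=glove)

# Pass 1: first block of the vocabulary and its internal co-occurrences.
emb_pass1 = fine_tune(X[:SPLIT, :SPLIT], vocab[:SPLIT])
# Pass 2: remaining vocabulary.
emb_pass2 = fine_tune(X[SPLIT:, SPLIT:], vocab[SPLIT:])
```

Note that slicing both axes of X this way discards the co-occurrence counts between first-pass and second-pass words; treat that part of the sketch as an assumption about how the split was done.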

The strange thing I observe is that in the first pass the loss over 1,000 iterations dropped from roughly 91,000 to roughly 30,000, whereas in the second pass over 1,000 iterations the loss ranged from about 95 down to about 0.79.

I am confused by this behaviour because both passes had almost the same amount of data. I would like to know why this is happening.

Is this good or bad? If it is a problem, how can I fix it?
