Embeddings and hidden states of Agro-NT model (New to this field so please excuse my question if it is really naive/stupid.) #70
Comments
Hello @NikeeShrestha,
What is referred to as the last hidden state in the HF model is the output of the final (40th) transformer layer. If you compare the last hidden state out of the HF model with the embedding from the 40th layer of the agro_nt model, you should get the same value! Do not hesitate if you have any other questions :)
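A minimal sketch of how that comparison could look. With the real models, the two arrays would come from the HF `outs.hidden_states[-1]` and the JAX pipeline's saved layer-40 embedding (the exact output key, e.g. `outs["embeddings_40"]`, is an assumption here); the toy arrays below only stand in for them so the check itself is runnable:

```python
import numpy as np

def embeddings_match(a, b, atol=1e-4):
    """Return True when two (batch, seq_len, dim) embedding arrays agree."""
    return a.shape == b.shape and np.allclose(a, b, atol=atol)

# Toy stand-ins with the expected (batch, seq_len, hidden_dim) layout.
# In practice:
#   hf_last     = agro_nt_model(tokens, output_hidden_states=True).hidden_states[-1]
#   jax_layer40 = outs["embeddings_40"]   # key name assumed for illustration
rng = np.random.default_rng(0)
hf_last = rng.normal(size=(1, 26, 1024)).astype(np.float32)
jax_layer40 = hf_last.copy()  # identical by construction in this toy case

print(embeddings_match(hf_last, jax_layer40))  # True
```

A small absolute tolerance is used because the JAX and PyTorch implementations may differ at the level of floating-point rounding even when the layers are equivalent.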
Hi, just wanted to ask a follow-up question: when using get_pretrained_model, which layer should be passed to embeddings_layers_to_save to get the final embedding? If it should be the last layer, I was wondering why the readme file of the github main page used '20' for the 500m human model: # Get pretrained model
parameters, forward_fn, tokenizer, config = get_pretrained_model(
model_name="500M_human_ref",
embeddings_layers_to_save=(20,),
max_positions=32,
)
forward_fn = hk.transform(forward_fn) Also, for the sequence embedding, should we use the representation of a single token or combine the representations across all tokens?
Hello @hongruhu,
The 20th layer is an arbitrary choice, as embeddings from intermediate layers can also be interesting to use. Indeed, if you want the final embedding layer of the "500m human model", since there are 24 layers, you should use embeddings_layers_to_save=(24,). As for the representation, a very common practice is to average the embeddings of the tokens across the sequence length dimension! You can find an example of this in the example notebook here. Best regards,
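The averaging described above can be sketched as a small mean-pooling helper. This is not the repository's own code, just a generic NumPy illustration; masking out padding tokens with the attention mask is an assumption about how one would typically do it:

```python
import numpy as np

def mean_pool(embeddings, attention_mask):
    """Average token embeddings over the sequence length, ignoring padding.

    embeddings:     (batch, seq_len, hidden_dim) array
    attention_mask: (batch, seq_len) array of 1s (real tokens) and 0s (padding)
    """
    mask = attention_mask[..., None].astype(embeddings.dtype)  # (batch, seq, 1)
    summed = (embeddings * mask).sum(axis=1)                   # (batch, dim)
    counts = mask.sum(axis=1)                                  # (batch, 1)
    return summed / counts

# Toy example: batch of 1, sequence of 4 tokens (last one is padding), dim 6.
emb = np.arange(24, dtype=np.float64).reshape(1, 4, 6)
mask = np.array([[1, 1, 1, 0]])
pooled = mean_pool(emb, mask)  # shape (1, 6); first value is (0+6+12)/3 = 6.0
```

The result is one fixed-size vector per sequence, which is what a downstream classifier usually expects.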
How do the embedding layers saved by the inference notebook in the GitHub repo and the hidden states from the Hugging Face inference notebook differ from each other? When I compare these two outputs for a sequence, they are different. If I want to do a downstream classification task, which one would be best to work with? They both have the same dimension.
Github Inference model loading and using:
parameters, forward_fn, tokenizer, config = get_pretrained_model(
model_name=model_name,
embeddings_layers_to_save=(20,),
attention_maps_to_save=((1, 4), (7, 18)),
max_positions=26,
# output_hidden_states=True,
    # If the progress bar gets stuck at the start of the model weights download,
# you can set verbose=False to download without the progress bar.
verbose=True
)
Hugging Face Inference:
outs = agro_nt_model(
torch_batch_tokens,
attention_mask=attention_mask,
encoder_attention_mask=attention_mask,
output_hidden_states=True,
)
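One indexing detail worth checking when comparing the two outputs: for HF models, `outs.hidden_states` is typically a tuple of `num_layers + 1` arrays, where index 0 is the embedding-layer output and index k is the output of transformer layer k (this convention is assumed here, not confirmed by the thread). Under that assumption, the JAX pipeline's saved layer-20 embedding should be compared against `hidden_states[20]`, not `hidden_states[-1]`. A toy sketch of the indexing:

```python
import numpy as np

num_layers = 24
hidden_dim = 8

# Toy stand-in for outs.hidden_states: a tuple of (batch, seq, dim) arrays,
# one per layer plus the initial embedding output, each filled with its index
# so the indexing is easy to verify.
hidden_states = tuple(
    np.full((1, 5, hidden_dim), float(k)) for k in range(num_layers + 1)
)

layer_20 = hidden_states[20]     # output of transformer layer 20
final_layer = hidden_states[-1]  # same as hidden_states[num_layers]
print(layer_20[0, 0, 0], final_layer[0, 0, 0])  # 20.0 24.0
```

If the arrays still differ after aligning the indices, remaining discrepancies may be down to implementation details between the JAX and PyTorch versions, which is worth verifying with a tolerance-based comparison rather than exact equality.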