"array must not contain infs or NaNs" when training/evaluating vqvae #29

Open
Keneyr opened this issue Mar 6, 2024 · 2 comments

Keneyr commented Mar 6, 2024

  1. I tried to train with train_vq.py, but got the error "array must not contain infs or NaNs". The call stack is:
# vq_trainer.py
best_fid, best_div, best_top1, best_top2, best_top3, best_matching, writer = evaluation_vqvae(
            self.opt.model_dir, eval_val_loader, self.vq_model, self.logger, epoch, best_fid=1000,
            best_div=100, best_top1=0,
            best_top2=0, best_top3=0, best_matching=100,
            eval_wrapper=eval_wrapper, save=False)

# eval_t2m.py
diversity_real = calculate_diversity(motion_annotation_np, 300 if nb_sample > 300 else 100)

# metrics.py
dist = linalg.norm(activation[first_indices] - activation[second_indices], axis=1)

It seems that activation[first_indices] contains NaN elements.

I used numpy.nan_to_num() to work around the error, but will that affect my training results?

  2. I tried to run eval_t2m_vq.py following the README, using the evaluation model downloaded from Google (provided by the repo author), and got the same error. The call stack is:
# eval_t2m_vq.py
best_fid, best_div, Rprecision, best_matching, l1_dist = \
                eval_t2m.evaluation_vqvae_plus_mpjpe(eval_val_loader, net, i, eval_wrapper=eval_wrapper, num_joint=args.nb_joints)

# eval_t2m.py
diversity = calculate_diversity(motion_pred_np, 300 if nb_sample > 300 else 100)

# metrics.py
dist = linalg.norm(activation[first_indices] - activation[second_indices], axis=1)

What should I do? Thank you!
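
Both call stacks end on the same line, so one way to narrow this down is to check which rows of the activation array are non-finite before the norm is taken. Below is a minimal sketch using only NumPy. The helper name `nonfinite_rows` and the toy array are hypothetical and not part of the repo.

```python
import numpy as np

def nonfinite_rows(activation):
    """Return indices of rows that contain at least one NaN or inf."""
    activation = np.asarray(activation, dtype=float)
    bad = ~np.isfinite(activation).all(axis=1)
    return np.flatnonzero(bad)

# Toy stand-in for the real activations (hypothetical data).
activation = np.array([[0.0, 1.0],
                       [np.nan, 2.0],
                       [3.0, 4.0]])
bad_rows = nonfinite_rows(activation)
print(bad_rows)  # [1]

# nan_to_num silences the error by replacing NaN with 0.0, so the
# affected rows still enter the metric, just with substituted values.
patched = np.nan_to_num(activation)
print(patched[1])  # [0. 2.]
```

Replacing NaN with zeros keeps the metric computable, but the replaced rows no longer reflect real data, so tracing them back to the source sample is probably the safer route.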

@Keneyr Keneyr changed the title "array must not contain infs or NaNs" when training vqvae "array must not contain infs or NaNs" when training/evaluating vqvae Mar 6, 2024

imzeroan commented Mar 6, 2024

A closed issue already figured this out. Just find the HumanML3D data file that contains NaN and delete it. Only *007975.npy has NaN.
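
A rough sketch of how such a scan could look, assuming the motion features are stored as float `.npy` files in a single directory. The function name `find_nonfinite_files` is hypothetical, and the demo uses a temporary directory instead of the real dataset path.

```python
import tempfile
from pathlib import Path

import numpy as np

def find_nonfinite_files(data_dir):
    """Return sorted names of .npy files in data_dir that contain NaN or inf."""
    bad = []
    for path in sorted(Path(data_dir).glob("*.npy")):
        arr = np.load(path)
        if not np.isfinite(arr).all():
            bad.append(path.name)
    return bad

# Demo on throwaway files; point data_dir at the dataset folder instead.
with tempfile.TemporaryDirectory() as tmp:
    np.save(Path(tmp) / "good.npy", np.zeros((4, 3)))
    np.save(Path(tmp) / "bad.npy", np.array([[1.0, np.nan]]))
    print(find_nonfinite_files(tmp))  # ['bad.npy']
```

Any file the scan reports can then be deleted or excluded before training and evaluation.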


Keneyr commented Mar 7, 2024

Yeah, thank you :). It seems 007975.npy is the cause, but I am still confused: why does this file fail?
