Logger emits exception when there's None in hparams #984

Closed
kyoungrok0517 opened this issue Feb 29, 2020 · 5 comments
Labels
bug (Something isn't working), help wanted (Open to be worked on)

Comments

@kyoungrok0517

To Reproduce

My hparams:

{
	'n': [8000],
	'k': [30],
	'batch_size': 512,
	'data_dir': '/Users/kyoungrok/Resilio Sync/Dataset/2019 TREC/passage_ranking/dataset',
	'max_nb_epochs': 500,
	'learning_rate': 0.0001,
	'nodes': 1,
	'distributed_backend': None,
	'eval_test_set': False,
	'check_val_every_n_epoch': 1,
	'accumulate_grad_batches': 1,
	'max_epochs': 200,
	'min_epochs': 2,
	'train_percent_check': 1.0,
	'val_percent_check': 1.0,
	'test_percent_check': 1.0,
	'val_check_interval': 0.95,
	'log_save_interval': 100,
	'row_log_interval': 100,
	'enable_early_stop': True,
	'early_stop_metric': 'val_acc',
	'early_stop_mode': 'min',
	'early_stop_patience': 3,
	'gradient_clip_val': -1,
	'track_grad_norm': -1,
	'model_save_path': '/Users/kyoungrok/Desktop/trec-2019-deep-learning/trec2019/sparse/sparsenet/model_weights',
	'model_save_monitor_value': 'val_acc',
	'model_save_monitor_mode': 'max',
	'model_load_weights_path': None,
	'tt_name': 'pt_test',
	'tt_description': 'pytorch lightning test',
	'tt_save_path': '/Users/kyoungrok/Desktop/trec-2019-deep-learning/trec2019/sparse/sparsenet/test_tube_logs',
	'single_run': False,
	'nb_hopt_trials': 1,
	'log_stdout': False,
	'gpus': None,
	'single_run_gpu': False,
	'default_tensor_type': 'torch.cuda.FloatTensor',
	'use_amp': False,
	'check_grad_nans': False,
	'amp_level': 'O2',
	'on_cluster': False,
	'fast_dev_run': True,
	'enable_tqdm': False,
	'overfit': -1,
	'interactive': False,
	'debug': False,
	'local': False,
	'lr_scheduler_milestones': None,
	'k_inference_factor': 1.5,
	'weight_sparsity': [0.3],
	'boost_strength': 1.5,
	'boost_strength_factor': 0.85,
	'dropout': 0.0,
	'use_batch_norm': True,
	'normalize_weights': False,
	'hpc_exp_number': None
}

I get the following error because my hparams dict contains None values:

Traceback (most recent call last):
  File "sparsenet_trainer.py", line 73, in <module>
    main(hparam_trial)
  File "sparsenet_trainer.py", line 47, in main
    trainer.fit(model)
  File "/Users/kyoungrok/anaconda3/envs/trec/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 707, in fit
    self.run_pretrain_routine(model)
  File "/Users/kyoungrok/anaconda3/envs/trec/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 757, in run_pretrain_routine
    self.logger.log_hyperparams(ref_model.hparams)
  File "/Users/kyoungrok/anaconda3/envs/trec/lib/python3.7/site-packages/pytorch_lightning/logging/base.py", line 14, in wrapped_fn
    fn(self, *args, **kwargs)
  File "/Users/kyoungrok/anaconda3/envs/trec/lib/python3.7/site-packages/pytorch_lightning/logging/tensorboard.py", line 88, in log_hyperparams
    self.experiment.add_hparams(hparam_dict=params, metric_dict={})
  File "/Users/kyoungrok/anaconda3/envs/trec/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 300, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict)
  File "/Users/kyoungrok/anaconda3/envs/trec/lib/python3.7/site-packages/torch/utils/tensorboard/summary.py", line 156, in hparams
    raise ValueError('value should be one of int, float, str, bool, or torch.Tensor')
ValueError: value should be one of int, float, str, bool, or torch.Tensor
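As a possible workaround until the logger handles this, the hparams could be sanitized before they reach `add_hparams`. The sketch below is only an illustration: `sanitize_hparams` is a hypothetical helper, not part of pytorch-lightning, and it assumes that stringifying unsupported values is acceptable. Going by the error message, list values such as `'n': [8000]` fall outside the accepted types as well, so they are stringified too.

```python
from typing import Any, Dict

# Types named in the ValueError message. torch.Tensor is left out here
# (assumption) so the sketch runs without torch installed.
ALLOWED_TYPES = (int, float, str, bool)


def sanitize_hparams(hparams: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: convert values the logger would reject to str."""
    return {
        key: value if isinstance(value, ALLOWED_TYPES) else str(value)
        for key, value in hparams.items()
    }


# A small subset of the hparams from this report.
hparams = {"batch_size": 512, "learning_rate": 0.0001, "gpus": None, "n": [8000]}
clean = sanitize_hparams(hparams)
print(clean)
# {'batch_size': 512, 'learning_rate': 0.0001, 'gpus': 'None', 'n': '[8000]'}
```

The sanitized dict would then be what gets passed to the logger instead of the raw hparams.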

Environment

PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: None

OS: Mac OSX 10.15.3
GCC version: Could not collect
CMake version: version 3.16.1

Python version: 3.7
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip] numpy==1.18.1
[pip] pytorch-lightning==0.6.0
[pip] torch==1.4.0
[pip] torchvision==0.5.0
[conda] blas 1.0 mkl
[conda] mkl 2019.4 233
[conda] mkl-service 2.3.0 py37hfbe908c_0
[conda] mkl_fft 1.0.15 py37h5e564d8_0
[conda] mkl_random 1.1.0 py37ha771720_0
[conda] pytorch 1.4.0 py3.7_0 pytorch
[conda] pytorch-lightning 0.6.0 pypi_0 pypi
[conda] torchvision 0.5.0 py37_cpu pytorch

kyoungrok0517 added the bug and help wanted labels Feb 29, 2020
@Borda
Member

Borda commented Mar 2, 2020

Hello, thanks for letting us know. Could you also share the model/trainer with us so we can replicate your problem? :]

@kyoungrok0517
Author

kyoungrok0517 commented Mar 3, 2020

Hi, thanks for the response. Here's my code and data. The error can be reproduced under pytorch-lightning >= 0.5.3.3

https://www.dropbox.com/sh/5dyq5dp5l8zfc3r/AAAiK-IoihpgdnJ3L8QzKAxNa?dl=0

  • pip install -e . && pip install -r requirements.txt
  • (at the root) python ./code/trec2019/sparse/sparsenet/sparsenet_trainer.py --data_dir ./data

@Borda
Member

Borda commented Mar 3, 2020

@kyoungrok0517 and what would be your expected behaviour? Just ignore all parameters with a None value?
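For reference, the behaviour asked about here, ignoring every parameter whose value is None, could look roughly like the sketch below. `drop_none_hparams` is a hypothetical name used for illustration, not an existing pytorch-lightning function.

```python
from typing import Any, Dict


def drop_none_hparams(hparams: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of hparams without None-valued keys."""
    return {key: value for key, value in hparams.items() if value is not None}


# A small subset of the hparams from this report.
hparams = {"nodes": 1, "distributed_backend": None, "gpus": None, "use_amp": False}
filtered = drop_none_hparams(hparams)
print(filtered)
# {'nodes': 1, 'use_amp': False}
```

The trade-off with this option is that the dropped keys no longer show up in the logged hyperparameters at all, so an unset option cannot be told apart from one that was never defined.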

@kyoungrok0517
Author

kyoungrok0517 commented Mar 4, 2020 via email

@awaelchli
Member

This is fixed on master. See #1130. @Borda it can be closed.

Borda closed this as completed Apr 3, 2020