
default logger is now tensorboard #609

Merged: 5 commits into master on Jan 14, 2020

Conversation

@williamFalcon (Contributor) commented on Dec 8, 2019

Sets Tensorboard logger as default instead of test-tube.

It also removes the test-tube dependency from the package, and with it the TensorFlow dependency.
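For context, a minimal sketch of what this change means in practice. The `Trainer` kwargs and the `TensorboardLogger` constructor follow the test snippet quoted later in this thread; the exact import path is an assumption taken from the tracebacks below and has moved/been renamed in later releases:

```python
from pytorch_lightning import Trainer
# import path as it appears in the tracebacks below; later releases moved/renamed it
from pytorch_lightning.logging.tensorboard import TensorboardLogger

# With this PR, a Trainer created without an explicit `logger` argument
# falls back to the TensorBoard-based logger instead of test-tube.
trainer = Trainer(max_epochs=1)

# Passing a logger explicitly still works, e.g. to control the output location:
logger = TensorboardLogger(save_dir="logs/", name="my_experiment")
trainer = Trainer(max_epochs=1, logger=logger)
```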

@neggert (Contributor) commented on Dec 8, 2019

I was hoping to have a few people try it out before we make it the default...

@williamFalcon (Contributor, Author) commented on Dec 8, 2019

Sure, let's keep the PR open for a few days while we try it out. This was the last main thing for this release.

@tullie

@Borda (Member) left a review comment

also use the TensorBoard logger as the default logger in the tests

```diff
@@ -4,5 +4,4 @@ numpy>=1.16.4
 torch>=1.1
 torchvision>=0.4.0
 pandas>=0.24 # lower version do not support py3.7
-test-tube>=0.7.5
```
add test-tube to test/requirements.txt

@Borda (Member) commented on Dec 8, 2019

The logger is broken:

tmpdir = local('/tmp/pytest-of-travis/pytest-0/test_tensorboard_logger0')
    def test_tensorboard_logger(tmpdir):
        """Verify that basic functionality of Tensorboard logger works."""
    
        hparams = tutils.get_hparams()
        model = LightningTestModel(hparams)
    
        logger = TensorboardLogger(save_dir=tmpdir, name="tensorboard_logger_test")
    
        trainer_options = dict(max_epochs=1, train_percent_check=0.01, logger=logger)
    
        trainer = Trainer(**trainer_options)
>       result = trainer.fit(model)
tests/test_logging.py:207: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pytorch_lightning/trainer/trainer.py:403: in fit
    self.run_pretrain_routine(model)
pytorch_lightning/trainer/trainer.py:455: in run_pretrain_routine
    self.logger.save()
pytorch_lightning/logging/base.py:14: in wrapped_fn
    fn(self, *args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <pytorch_lightning.logging.tensorboard.TensorboardLogger object at 0x7fd30a81e8d0>
    @rank_zero_only
    def save(self):
>       self.experiment.flush()
E       AttributeError: 'SummaryWriter' object has no attribute 'flush'
pytorch_lightning/logging/tensorboard.py:78: AttributeError

Let's fix the CI - Travis, AppVeyor and CircleCI - so we do not merge code which fails...
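For what it's worth, the AttributeError above points at a `torch.utils.tensorboard` `SummaryWriter` that predates the `flush()` method (older PyTorch releases do not expose it). A hypothetical defensive sketch - not necessarily the fix that was merged here - would simply guard the call:

```python
from torch.utils.tensorboard import SummaryWriter


def safe_flush(writer):
    """Flush pending events only if the installed SummaryWriter supports it."""
    if hasattr(writer, "flush"):
        writer.flush()


writer = SummaryWriter(log_dir="/tmp/tb_flush_example")  # hypothetical log dir
writer.add_scalar("loss", 0.5, global_step=0)
safe_flush(writer)  # no-op on SummaryWriter versions without flush()
writer.close()
```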

@Borda (Member) commented on Dec 8, 2019

@williamFalcon rebase master, pls...

@snie2012

Any update on this?

@CarloLucibello commented on Dec 21, 2019

I get the following error due to a call to SummaryWriter.add_hparams:

{'batch_size': 128, 'epochs': 10, 'lr': 0.1, 'weight_decay': 0.0005, 'save_model': False, 'load_model': '', 'droplr': 5, 'opt': 'nesterov', 'loss': 'nll', 'model': 'mlp_100_100', 'dataset': 'fashion', 'datapath': '~/data/', 'log_interval': 2, 'train_frac': 1.0, 'test_frac': 1.0, 'preprocess': False, 'gpus': None, 'use_16bit': False, 'seed': 872846804}
ERROR:LightDNN:Failed after 0:00:00!
Traceback (most recent calls WITHOUT Sacred internals):
  File "scripts/light_dnn.py", line 220, in main
    trainer.fit(model)
  File "/home/carlo/Git/pytorch-lightning/pytorch_lightning/trainer/trainer.py", line 403, in fit
    self.run_pretrain_routine(model)
  File "/home/carlo/Git/pytorch-lightning/pytorch_lightning/trainer/trainer.py", line 453, in run_pretrain_routine
    self.logger.log_hyperparams(ref_model.hparams)
  File "/home/carlo/Git/pytorch-lightning/pytorch_lightning/logging/base.py", line 14, in wrapped_fn
    fn(self, *args, **kwargs)
  File "/home/carlo/Git/pytorch-lightning/pytorch_lightning/logging/tensorboard.py", line 76, in log_hyperparams
    self.experiment.add_hparams(hparam_dict=dict(params), metric_dict={})
  File "/home/carlo/miniconda3/envs/sacred37/lib/python3.7/site-packages/torch/utils/tensorboard/writer.py", line 292, in add_hparams
    exp, ssi, sei = hparams(hparam_dict, metric_dict)
  File "/home/carlo/miniconda3/envs/sacred37/lib/python3.7/site-packages/torch/utils/tensorboard/summary.py", line 156, in hparams
    raise ValueError('value should be one of int, float, str, bool, or torch.Tensor')
ValueError: value should be one of int, float, str, bool, or torch.Tensor

This is because I have a NoneType value for the gpus params:

hparams = { 'gpus': None}

It can also be a problem for lists of GPU indexes, since lists are not supported either.
I suggest we cast the gpus field (or any other unsupported field type) to a string.

@Borda (Member) commented on Dec 21, 2019

This is because I have a NoneType value for the gpus params:

hparams = { 'gpus': None}

It can also be a problem, for list of gpus indexes, since lists are not supported as well.
I suggest we cast the gpus field (or any other non-supported field type) to string.

Good point. For a list I would assume a string with some common separator, and for None I would drop that option from hparams...
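A rough sketch of that suggestion (a hypothetical helper, not the code that ended up in the library): drop None values and join lists into a separator-delimited string before handing the dict to `add_hparams()`:

```python
def sanitize_hparams(params):
    """Make hparams safe for SummaryWriter.add_hparams: drop None, stringify lists."""
    clean = {}
    for key, value in params.items():
        if value is None:
            continue  # drop unset options such as gpus=None
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # e.g. gpus=[0, 1] -> "0,1"
        clean[key] = value
    return clean


print(sanitize_hparams({"gpus": None, "lr": 0.1, "layers": [100, 100]}))
# -> {'lr': 0.1, 'layers': '100,100'}
```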


@williamFalcon (Contributor, Author) commented

@neggert @Borda is this ready to merge?

@Borda added this to the 0.6.0 milestone on Jan 14, 2020
@neggert (Contributor) commented on Jan 14, 2020

It looks like somehow the test-tube dependency got added back in with the merge commit.

@neggert (Contributor) commented on Jan 14, 2020

Oh, wait. I was just doing weird things with the merge commit summary. LGTM.

@williamFalcon merged commit 88b750a into master on Jan 14, 2020
@Borda mentioned this pull request on Jan 15, 2020
@snie2012

Quick question, why is test-tube still in requirements.txt?

@Borda (Member) commented on Jan 15, 2020

we are fixing it in #609 together with some other minor issues... :]

@snie2012

Do you mean #687 ? @Borda

@shubhamagarwal92 (Contributor) commented on Mar 12, 2020


This is because I have a NoneType value for the gpus params:

```python
hparams = { 'gpus': None}
```

It can also be a problem, for list of gpus indexes, since lists are not supported as well.
I suggest we cast the gpus field (or any other non-supported field type) to string.

@williamFalcon @Borda I am still facing this issue for pytorch-lightning==0.7.1.

Basically, I tried to pass a list of gpus in the NER example in the transformers repo here and hit the same error: raise ValueError('value should be one of int, float, str, bool, or torch.Tensor'). I can see a simple fix for this: TensorBoard does not accept NoneType, list, or even dictionary values here.

Instead of passing all the hparams to TensorBoard via PL here, we could put a small check to pass only int, float, str, bool, or torch.Tensor values for logging:

        import torch
        from torch.utils.tensorboard.summary import hparams

        # keep only the value types that TensorBoard's hparams() accepts
        tensorboard_params = {}
        for k, v in params.items():
            if isinstance(v, (int, float, str, bool, torch.Tensor)):
                tensorboard_params[k] = v
        exp, ssi, sei = hparams(tensorboard_params, {})

IMO, there could be cases when the user wants to parse a list or maybe NoneType objects through argparse but doesn't want them to be logged via TensorBoard.

Please let me know if this makes sense and I should raise a PR!

@Borda (Member) commented on Mar 12, 2020

@shubhamagarwal92 thanks for this elaboration, sure, a PR is welcome!

shubhamagarwal92 added a commit to shubhamagarwal92/pytorch-lightning that referenced this pull request Mar 12, 2020
@shubhamagarwal92 mentioned this pull request on Mar 12, 2020