Another profiling tool is already active #19983

Open
zhaohm14 opened this issue Jun 17, 2024 · 3 comments
Labels
bug, help wanted, profiler, ver: 2.2.x

Comments

@zhaohm14

zhaohm14 commented Jun 17, 2024

Bug description

When I pass profiler='advanced' while creating a Trainer, the following error is raised inside trainer.fit():

ValueError: Another profiling tool is already active

It works fine with profiler='simple'.

What version are you seeing the problem on?

master

How to reproduce the bug

import lightning as L
from lightning.pytorch.callbacks import ModelCheckpoint, ModelSummary
from lightning.pytorch.loggers import WandbLogger

# config, model, train_loader and val_loader are defined elsewhere in the script.
trainer = L.Trainer(
    default_root_dir=config.train.save_dir,
    callbacks=[
        ModelCheckpoint(
            dirpath=config.train.save_dir,
            every_n_train_steps=config.train.save_step,
            save_top_k=config.train.save_ckpt_keep_num,
            mode='max',
            monitor='global_step'
        ),
        ModelSummary(max_depth=9)
    ],
    logger=WandbLogger(log_model="all"),
    # Presumably where profiler='advanced' is passed (not shown in the report).
    **config.train.trainer
)
if config.train.resume_from_ckpt:
    trainer.fit(
        model=model,
        train_dataloaders=train_loader,  # TODO: is the dataloader needed?
        val_dataloaders=val_loader,
        ckpt_path=config.train.resume_from_ckpt
    )
else:
    trainer.fit(
        model=model,
        train_dataloaders=train_loader,
        val_dataloaders=val_loader
    )

Error messages and logs

# Error messages and logs here please

Environment

Current environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0):
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 2.0):
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):
#- Running environment of LightningApp (e.g. local, cloud):

More info

No response

cc @carmocca

@zhaohm14 added the bug and needs triage labels Jun 17, 2024
@awaelchli
Member

The explanation for why this happens is here: python/cpython#110770 (comment)

The AdvancedProfiler in Lightning enables multiple profilers in a nested fashion. Python apparently never supported this, but it did not complain until Python 3.12, which now raises this error. To resolve this, the AdvancedProfiler will have to be reworked, so there is some work needed here.
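
The conflict can be seen without Lightning at all. The following is a minimal sketch using only the standard library, assuming the nesting boils down to enabling a second cProfile.Profile while another one is still enabled; the function name nested_profilers_conflict is made up for illustration and is not part of Lightning or Python.

```python
import cProfile
import sys


def nested_profilers_conflict() -> bool:
    """Return True if enabling a second cProfile.Profile while another
    one is active raises ValueError (hypothetical helper for illustration)."""
    outer = cProfile.Profile()
    inner = cProfile.Profile()
    outer.enable()
    try:
        # On Python 3.12+ this is expected to raise
        # "ValueError: Another profiling tool is already active".
        inner.enable()
    except ValueError:
        return True
    else:
        inner.disable()
        return False
    finally:
        # Always release the outer profiler, whatever happened above.
        outer.disable()


if __name__ == "__main__":
    version = ".".join(map(str, sys.version_info[:3]))
    print(f"Python {version}: nested cProfile conflict = {nested_profilers_conflict()}")
```

Running this on an older interpreter and on 3.12 or later should show the difference in behavior described above.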

@awaelchli added the profiler and help wanted labels and removed the needs triage label Jul 12, 2024
@zhaohm14
Author

Thanks a lot for your help!

@awaelchli
Member

I'd like to keep the issue open, because we need to work on this (help from community appreciated).
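
For anyone picking this up, one possible direction (only a sketch, not the actual Lightning implementation and not an agreed-upon fix) is to make sure at most one cProfile.Profile is enabled at any time: pause the enclosing action's profiler when a nested action starts and resume it when the nested action stops. The class name StackedProfiler and its methods are hypothetical. Note the trade-off: time spent inside a nested action is no longer attributed to the enclosing action.

```python
import cProfile
import io
import pstats
from contextlib import contextmanager


class StackedProfiler:
    """Hypothetical helper: keeps at most one cProfile.Profile enabled at a time."""

    def __init__(self):
        self.profiles = {}  # action name -> cProfile.Profile
        self._stack = []    # names of open actions, innermost last

    def start(self, name):
        if self._stack:
            # Pause the enclosing action so two profilers are never active together.
            self.profiles[self._stack[-1]].disable()
        prof = self.profiles.setdefault(name, cProfile.Profile())
        self._stack.append(name)
        prof.enable()

    def stop(self, name):
        if not self._stack or self._stack[-1] != name:
            raise ValueError(f"{name!r} is not the innermost active action")
        self.profiles[self._stack.pop()].disable()
        if self._stack:
            # Resume the enclosing action's profiler.
            self.profiles[self._stack[-1]].enable()

    @contextmanager
    def profile(self, name):
        self.start(name)
        try:
            yield
        finally:
            self.stop(name)

    def summary(self, name, limit=5):
        # Render the top entries for one action, sorted by cumulative time.
        out = io.StringIO()
        stats = pstats.Stats(self.profiles[name], stream=out)
        stats.sort_stats("cumulative").print_stats(limit)
        return out.getvalue()


if __name__ == "__main__":
    sp = StackedProfiler()
    with sp.profile("outer"):
        sum(range(1000))
        with sp.profile("inner"):
            sorted(range(1000))
    print(sp.summary("inner"))
```

Because the outer profiler is disabled before the inner one is enabled, this pattern should avoid the "Another profiling tool is already active" error on Python 3.12 while still working on older versions.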

@awaelchli reopened this Jul 13, 2024