Getting Error while training on yolov5.transfer_learn_pruned_quantized.md #11

Open
SHIVAM3052 opened this issue Aug 31, 2022 · 0 comments


train: weights=yolov5s.pt, cfg=./yolov5-train/models/yolov5s.yaml, data=./datasets/data.yaml, hyp=data/hyps/hyp.scratch.yaml, epochs=300, batch_size=64, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=yolov5-deepsparse, name=yolov5s-sgd-pruned-quantized, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest, recipe=./recipes/yolov5.transfer_learn_pruned_quantized.md, disable_ema=False, max_train_steps=-1, max_eval_steps=-1, one_shot=False, num_export_samples=0
github: skipping check (not a git repository), for updates see https://github.com/ultralytics/yolov5
requirements: /content/drive/MyDrive/colab/yolov5_cpu/yolov5-train/requirements.txt not found, check failed.
fatal: not a git repository (or any parent up to mount point /content)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
YOLOv5 🚀 2022-6-27 torch 1.9.0+cu111 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0, copy_paste=0.0
Weights & Biases: run 'pip install wandb' to automatically track and visualize YOLOv5 🚀 runs (RECOMMENDED)
TensorBoard: Start with 'tensorboard --logdir yolov5-deepsparse', view at http://localhost:6006/
Overriding model.yaml nc=80 with nc=2

                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  2                -1  1     18816  models.common.C3                        [64, 64, 1]
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4                -1  2    115712  models.common.C3                        [128, 128, 2]
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6                -1  3    625152  models.common.C3                        [256, 256, 3]
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         0  models.common.Concat                    [1]
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         0  models.common.Concat                    [1]
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]
 19          [-1, 14]  1         0  models.common.Concat                    [1]
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]
 22          [-1, 10]  1         0  models.common.Concat                    [1]
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]
 24      [17, 20, 23]  1     18879  models.yolo.Detect                      [2, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
YOLOv5s summary: 270 layers, 7025023 parameters, 7025023 gradients

2022-08-31 08:51:10 sparseml.optim.manager INFO Created recipe manager with metadata: {
"metadata": null
}
Created recipe manager with metadata: {
"metadata": null
}
Transferred 342/349 items from yolov5s.pt
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 57 weight (no decay), 60 weight, 60 bias
albumentations: Blur(always_apply=False, p=0.01, blur_limit=(3, 7)), MedianBlur(always_apply=False, p=0.01, blur_limit=(3, 7)), ToGray(always_apply=False, p=0.01), CLAHE(always_apply=False, p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
train: Scanning '/content/drive/MyDrive/colab/yolov5_cpu/datasets/train/labels.cache' images and labels... 2661 found, 0 missing, 0 empty, 0 corrupt: 100% 2661/2661 [00:00<?, ?it/s]
val: Scanning '/content/drive/MyDrive/colab/yolov5_cpu/datasets/valid/labels.cache' images and labels... 254 found, 0 missing, 0 empty, 0 corrupt: 100% 254/254 [00:00<?, ?it/s]
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
Plotting labels to yolov5-deepsparse/yolov5s-sgd-pruned-quantized2/labels.jpg...

AutoAnchor: 4.94 anchors/target, 0.998 Best Possible Recall (BPR). Current anchors are a good fit to dataset ✅
Image sizes 640 train, 640 val
Using 4 dataloader workers
Logging results to yolov5-deepsparse/yolov5s-sgd-pruned-quantized2
Starting training for 300 epochs...
Disabling LR scheduler, managing LR using SparseML recipe
Overriding number of epochs from SparseML manager to 300

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 0/239     13.5G   0.09351   0.08568   0.02641       403       640: 100% 42/42 [04:13<00:00,  6.03s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:08<00:00,  4.22s/it]
             all        254       1962      0.143      0.549      0.206     0.0441

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 1/239     15.3G    0.0648   0.08876   0.01514       413       640: 100% 42/42 [01:05<00:00,  1.56s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:03<00:00,  1.92s/it]
             all        254       1962       0.45      0.568      0.474      0.132

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 2/239     15.4G   0.05597    0.0865  0.006696       498       640: 100% 42/42 [01:09<00:00,  1.64s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:03<00:00,  1.89s/it]
             all        254       1962      0.673      0.727      0.696      0.227

 Epoch   gpu_mem       box       obj       cls    labels  img_size
 3/239     15.4G   0.04954   0.08305  0.004265       406       640: 100% 42/42 [01:10<00:00,  1.67s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:03<00:00,  1.52s/it]
             all        254       1962      0.725       0.81      0.812      0.303

 ...

 Epoch   gpu_mem       box       obj       cls    labels  img_size
148/239    15.7G   0.02369   0.04871 0.0005125       364       640: 100% 42/42 [01:11<00:00,  1.70s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.34s/it]
             all        254       1962      0.872      0.917      0.918       0.47

 Epoch   gpu_mem       box       obj       cls    labels  img_size
149/239    15.7G   0.02356   0.04775  0.000539       431       640: 100% 42/42 [01:11<00:00,  1.69s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.35s/it]
             all        254       1962      0.857       0.93      0.925      0.482

 Epoch   gpu_mem       box       obj       cls    labels  img_size
150/239    15.7G   0.02364   0.04813 0.0004901       356       640: 100% 42/42 [01:09<00:00,  1.65s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.40s/it]
             all        254       1962       0.89      0.895      0.925      0.492

 Epoch   gpu_mem       box       obj       cls    labels  img_size
151/239    15.7G   0.02364   0.04794  0.000441       352       640: 100% 42/42 [01:11<00:00,  1.70s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.43s/it]
             all        254       1962      0.883      0.915      0.922      0.499

 Epoch   gpu_mem       box       obj       cls    labels  img_size
152/239    15.7G   0.02343    0.0488 0.0004617       474       640: 100% 42/42 [01:09<00:00,  1.65s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.29s/it]
             all        254       1962      0.872      0.918      0.921      0.476

 Epoch   gpu_mem       box       obj       cls    labels  img_size
153/239    15.7G   0.02356   0.04794 0.0004719       414       640: 100% 42/42 [01:11<00:00,  1.71s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.30s/it]
             all        254       1962      0.874      0.901      0.916      0.453

 Epoch   gpu_mem       box       obj       cls    labels  img_size
154/239    15.7G    0.0236   0.04801 0.0004782       365       640: 100% 42/42 [01:09<00:00,  1.65s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.37s/it]
             all        254       1962      0.877       0.92      0.921      0.469

 Epoch   gpu_mem       box       obj       cls    labels  img_size
155/239    15.7G   0.02342    0.0487 0.0004548       422       640: 100% 42/42 [01:11<00:00,  1.71s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.36s/it]
             all        254       1962      0.876      0.908      0.918      0.457

 Epoch   gpu_mem       box       obj       cls    labels  img_size
156/239    15.7G   0.02342   0.04807 0.0004876       431       640: 100% 42/42 [01:08<00:00,  1.64s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.32s/it]
             all        254       1962      0.884      0.899      0.924      0.461

 Epoch   gpu_mem       box       obj       cls    labels  img_size
157/239    15.7G   0.02369   0.04774 0.0004496       445       640: 100% 42/42 [01:11<00:00,  1.70s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.38s/it]
             all        254       1962      0.889      0.904      0.925      0.492

 Epoch   gpu_mem       box       obj       cls    labels  img_size
158/239    15.7G   0.02323   0.04796  0.000464       465       640: 100% 42/42 [01:09<00:00,  1.66s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.36s/it]
             all        254       1962      0.866      0.909      0.913      0.473

 Epoch   gpu_mem       box       obj       cls    labels  img_size
159/239    15.7G   0.02335   0.04735 0.0004544       466       640: 100% 42/42 [01:10<00:00,  1.68s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.37s/it]
             all        254       1962      0.869      0.898      0.914      0.457

 Epoch   gpu_mem       box       obj       cls    labels  img_size
160/239    15.7G   0.02319   0.04785 0.0004265       533       640: 100% 42/42 [01:09<00:00,  1.65s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.42s/it]
             all        254       1962       0.88      0.907       0.92      0.476

 Epoch   gpu_mem       box       obj       cls    labels  img_size
161/239    15.7G   0.02315   0.04724 0.0004371       472       640: 100% 42/42 [01:10<00:00,  1.67s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.29s/it]
             all        254       1962      0.884        0.9      0.918      0.445

 Epoch   gpu_mem       box       obj       cls    labels  img_size
162/239    15.7G   0.02322   0.04771 0.0004138       369       640: 100% 42/42 [01:10<00:00,  1.67s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.31s/it]
             all        254       1962      0.872      0.915      0.919      0.451

 Epoch   gpu_mem       box       obj       cls    labels  img_size
163/239    15.7G   0.02302   0.04781 0.0004746       431       640: 100% 42/42 [01:10<00:00,  1.67s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.38s/it]
             all        254       1962      0.863       0.92      0.918       0.45

 Epoch   gpu_mem       box       obj       cls    labels  img_size
164/239    15.7G   0.02305   0.04701 0.0004676       507       640: 100% 42/42 [01:09<00:00,  1.66s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.36s/it]
             all        254       1962      0.858      0.919      0.914      0.447

 Epoch   gpu_mem       box       obj       cls    labels  img_size
165/239    15.7G   0.02285   0.04715 0.0004571       502       640: 100% 42/42 [01:10<00:00,  1.69s/it]
           Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 2/2 [00:02<00:00,  1.39s/it]
             all        254       1962       0.86      0.906      0.914      0.426
Stopping training early as no improvement observed in last 100 epochs. Best results observed at epoch 65, best model saved as best.pt.
To update EarlyStopping(patience=100) pass a new patience value, i.e. python train.py --patience 300 or use --patience 0 to disable EarlyStopping.

241 epochs completed in 3.501 hours.
Optimizer stripped from yolov5-deepsparse/yolov5s-sgd-pruned-quantized2/weights/last.pt, 42.5MB
Optimizer stripped from yolov5-deepsparse/yolov5s-sgd-pruned-quantized2/weights/best.pt, 42.5MB

Validating yolov5-deepsparse/yolov5s-sgd-pruned-quantized2/weights/best.pt...
Fusing layers...
YOLOv5s summary: 213 layers, 7015519 parameters, 0 gradients
2022-08-31 12:21:43 sparseml.optim.manager INFO Created recipe manager with metadata: {
"metadata": null
}
Created recipe manager with metadata: {
"metadata": null
}
Traceback (most recent call last):
  File "./yolov5-train/train.py", line 745, in <module>
    main(opt)
  File "./yolov5-train/train.py", line 641, in main
    train(opt.hyp, opt, device, callbacks)
  File "./yolov5-train/train.py", line 514, in train
    model=load_checkpoint(type_='ensemble', weights=best, device=device)[0],
  File "/content/drive/MyDrive/colab/yolov5_cpu/yolov5-train/export.py", line 529, in load_checkpoint
    state_dict = load_state_dict(model, state_dict, run_mode=not ensemble_type, exclude_anchors=exclude_anchors)
  File "/content/drive/MyDrive/colab/yolov5_cpu/yolov5-train/export.py", line 553, in load_state_dict
    model.load_state_dict(state_dict, strict=not run_mode)  # load
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1407, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Model:
Missing key(s) in state_dict: "model.0.conv.quant.activation_post_process.scale", "model.0.conv.quant.activation_post_process.zero_point", "model.0.conv.quant.activation_post_process.fake_quant_enabled", "model.0.conv.quant.activation_post_process.observer_enabled", "model.0.conv.quant.activation_post_process.scale", "model.0.conv.quant.activation_post_process.zero_point", "model.0.conv.quant.activation_post_process.activation_post_process.min_val", "model.0.conv.quant.activation_post_process.activation_post_process.max_val", "model.0.conv.module.weight", "model.0.conv.module.bias", "model.0.conv.module.weight_fake_quant.scale", "model.0.conv.module.weight_fake_quant.zero_point", "model.0.conv.module.weight_fake_quant.fake_quant_enabled", "model.0.conv.module.weight_fake_quant.observer_enabled", "model.0.conv.module.weight_fake_quant.scale", "model.0.conv.module.weight_fake_quant.zero_point", "model.0.conv.module.weight_fake_quant.activation_post_process.min_val", "model.0.conv.module.weight_fake_quant.activation_post_process.max_val", "model.0.conv.module.activation_post_process.scale", "model.0.conv.module.activation_post_process.zero_point", "model.0.conv.module.activation_post_process.fake_quant_enabled", "model.0.conv.module.activation_post_process.observer_enabled", "model.0.conv.module.activation_post_process.scale", "model.0.conv.module.activation_post_process.zero_point", "model.0.conv.module.activation_post_process.activation_post_process.min_val", "model.0.conv.module.activation_post_process.activation_post_process.max_val", "model.1.conv.quant.activation_post_process.scale", "model.1.conv.quant.activation_post_process.zero_point", "model.1.conv.quant.activation_post_process.fake_quant_enabled", "model.1.conv.quant.activation_post_process.observer_enabled", "model.1.conv.quant.activation_post_process.scale", "model.1.conv.quant.activation_post_process.zero_point", "model.1.conv.quant.activation_post_process.activation_post_process.min_val", 
"model.1.conv.quant.activation_post_process.activation_post_process.max_val", "model.1.conv.module.weight", .................So on..................

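One way to narrow down a "Missing key(s) in state_dict" error like the one above is to diff the key names the model expects against the key names stored in the checkpoint. Below is a minimal sketch in plain Python that works on key names only. `diff_state_dict_keys` and `count_quantization_keys` are hypothetical helpers, not part of YOLOv5, SparseML, or PyTorch. The model-side keys are copied from the error message; the checkpoint-side keys are an assumption made up for illustration, since the log does not show what best.pt actually contains.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, similar to what a strict load reports."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_set - ckpt_set)      # expected by the model, absent from the checkpoint
    unexpected = sorted(ckpt_set - model_set)   # present in the checkpoint, unknown to the model
    return missing, unexpected


def count_quantization_keys(keys):
    """Count keys whose names look like quantization wrapper or observer state."""
    markers = ("fake_quant", "activation_post_process", ".quant.")
    return sum(1 for key in keys if any(marker in key for marker in markers))


# A few of the keys the model expects, copied from the error message.
model_keys = [
    "model.0.conv.quant.activation_post_process.scale",
    "model.0.conv.quant.activation_post_process.zero_point",
    "model.0.conv.module.weight",
    "model.0.conv.module.bias",
    "model.0.conv.module.weight_fake_quant.scale",
]

# ASSUMPTION: hypothetical checkpoint keys, not taken from the log.
checkpoint_keys = ["model.0.conv.weight", "model.0.conv.bias"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
quant_missing = count_quantization_keys(missing)
print(f"{len(missing)} missing ({quant_missing} quantization-related), "
      f"{len(unexpected)} unexpected")
```

In a real session the two lists would presumably come from `model.state_dict().keys()` and from the state dict stored in the checkpoint. If nearly every missing key is quantization-related, that would suggest the model being built has quantization wrappers applied while the saved weights do not, though this log alone does not confirm that.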