Issue when training and predicting with a custom dataset and the YOLO_NAS_S model #2028

Esidell · 2024-07-02T15:35:14Z

💡 Your Question

Hi everyone,

I'm trying to use the YOLO_NAS_S model on a custom dataset of 2D Gaussians that simulate galaxies, so that it can recognise them in astronomical images. However, during training, many components of the loss function and of the validation metrics are equal to zero, which prevents the model from making any predictions when using model.predict().

I've already checked the labels and they seem to be correct (a .txt file with c x y w h, where the box coordinates and dimensions are normalised).
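A label check of this kind can also be automated. The following is a minimal sketch, assuming each label line holds exactly five whitespace-separated fields (class id, then centre x, centre y, width, height, all normalised to [0, 1]). The helper name `parse_yolo_label_line` is hypothetical and is not part of super-gradients.

```python
from typing import Tuple


def parse_yolo_label_line(line: str, num_classes: int) -> Tuple[int, float, float, float, float]:
    """Parse one 'c x y w h' label line and raise ValueError if it looks wrong.

    Hypothetical helper: assumes normalised centre-format boxes.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])  # raises ValueError for e.g. '0.0'
    cx, cy, w, h = (float(p) for p in parts[1:])

    if not 0 <= class_id < num_classes:
        raise ValueError(f"class id {class_id} outside [0, {num_classes})")
    # Normalised values must lie in [0, 1]; pixel values such as 320 fail here.
    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        raise ValueError(f"values not normalised to [0, 1]: {line!r}")
    # A box with zero width or height cannot be matched to any prediction.
    if w <= 0.0 or h <= 0.0:
        raise ValueError(f"degenerate box: {line!r}")
    return class_id, cx, cy, w, h
```

Running this over every line of every label file would flag pixel-valued coordinates, out-of-range class ids, and empty boxes before training starts.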

The model has only 1 class and uses the PPYoloELoss loss function. Here are the training parameters and other related parts:

from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import PPYoloEPostPredictionCallback


CLASS_NAMES = ['zero_order']
NUM_CLASSES = len(CLASS_NAMES)

train_params = {
    "warmup_initial_lr": 1e-5,
    "initial_lr": 5e-4,
    "lr_mode": "cosine",
    "cosine_final_lr_ratio": 0.5,
    "optimizer": "SGD",
    "zero_weight_decay_on_bias_and_bn": True,
    "lr_warmup_epochs": 1,
    "warmup_mode": "LinearEpochLRWarmup",
    "optimizer_params": {"weight_decay": 0.0001},
    "ema": False,
    "average_best_models": False,
    "ema_params": {"beta": 25, "decay_type": "exp"},
    "max_epochs": 20,
    "mixed_precision": True,
    "loss": PPYoloELoss(use_static_assigner=True, num_classes=NUM_CLASSES, reg_max=None),
    "valid_metrics_list": [
        DetectionMetrics_050(
            score_thres=0.1,
            top_k_predictions=300,
            num_cls=NUM_CLASSES,
            normalize_targets=True,
            include_classwise_ap=True,
            class_names=CLASS_NAMES,
            post_prediction_callback=PPYoloEPostPredictionCallback(score_threshold=0.01, nms_top_k=1000, max_predictions=300, nms_threshold=0.7),
        )
    ],
    "metric_to_watch": "mAP@0.50",
}

from super_gradients.training import Trainer
from super_gradients.common.object_names import Models
from super_gradients.training import models

trainer = Trainer(experiment_name="yolo_nas_s", ckpt_root_dir="CHECKPOINT_DIR")
model = models.get(Models.YOLO_NAS_S, num_classes=NUM_CLASSES, pretrained_weights="coco")
trainer.train(model=model, training_params=train_params, train_loader=train_loader, valid_loader=valid_loader)

IMAGE = "/data/split/test/images/image_0010.jpg"

images_predictions = model.to("cuda").predict(IMAGE, conf = 0.1)

images_predictions.show(box_thickness=2, show_confidence=True)

Here is an epoch summary to demonstrate the problem:

SUMMARY OF EPOCH 1
├── Train
│ ├── Ppyoloeloss/loss_cls = 0.0076
│ │ ├── Epoch N-1 = 0.4161 (↘ -0.4085)
│ │ └── Best until now = 0.4161 (↘ -0.4085)
│ ├── Ppyoloeloss/loss_iou = 0.0
│ │ ├── Epoch N-1 = 0.0 (= 0.0)
│ │ └── Best until now = 0.0 (= 0.0)
│ ├── Ppyoloeloss/loss_dfl = 0.0
│ │ ├── Epoch N-1 = 0.0 (= 0.0)
│ │ └── Best until now = 0.0 (= 0.0)
│ └── Ppyoloeloss/loss = 0.0076
│   ├── Epoch N-1 = 0.4161 (↘ -0.4085)
│   └── Best until now = 0.4161 (↘ -0.4085)
└── Validation
  ├── Ppyoloeloss/loss_cls = 0.0024
  │ ├── Epoch N-1 = 0.0582 (↘ -0.0558)
  │ └── Best until now = 0.0582 (↘ -0.0558)
  ├── Ppyoloeloss/loss_iou = 0.0
  │ ├── Epoch N-1 = 0.0 (= 0.0)
  │ └── Best until now = 0.0 (= 0.0)
  ├── Ppyoloeloss/loss_dfl = 0.0
  │ ├── Epoch N-1 = 0.0 (= 0.0)
  │ └── Best until now = 0.0 (= 0.0)
...
    ├── Epoch N-1 = 0.0 (= 0.0)
    └── Best until now = 0.0 (= 0.0)

Thank you for your help!

Versions

No response


BloodAxe · 2024-07-09T17:05:17Z

The first thing I would try is to double-check that the dataset is loaded properly.
Use this callback to visualize the data during training and check that the boxes are drawn correctly:

super-gradients/src/super_gradients/training/utils/callbacks/callbacks.py

Line 1260 in bcdc0d1

class ExtremeBatchDetectionVisualizationCallback(ExtremeBatchCaseVisualizationCallback):
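Independently of the callback, the boxes can be cross-checked by hand. This is a minimal sketch, assuming the labels are normalised centre-format boxes as described in the question. It converts one box to pixel corner coordinates that can be drawn on the image with any plotting tool. The function name `yolo_to_pixel_xyxy` is hypothetical and does not come from super-gradients.

```python
from typing import Tuple


def yolo_to_pixel_xyxy(
    cx: float, cy: float, w: float, h: float, image_width: int, image_height: int
) -> Tuple[float, float, float, float]:
    """Convert a normalised (cx, cy, w, h) box to pixel (x1, y1, x2, y2) corners."""
    x1 = (cx - w / 2) * image_width
    y1 = (cy - h / 2) * image_height
    x2 = (cx + w / 2) * image_width
    y2 = (cy + h / 2) * image_height
    return x1, y1, x2, y2


# Example: a box centred in a 640x480 image, a quarter of the width and half the height.
print(yolo_to_pixel_xyxy(0.5, 0.5, 0.25, 0.5, 640, 480))
```

If the corners drawn this way do not enclose the simulated galaxies, the labels or the conversion inside the dataset class are the likely suspects.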

Esidell · 2024-07-19T14:04:21Z


I've tried a different approach: instead of loading already normalized images, I added the normalization to my dataset class. This seems to have fixed the issue of the model not predicting. However, I now get the following error instead, and only on certain images:

AttributeError Traceback (most recent call last)
AttributeError: 'int' object has no attribute 'sqrt'

The above exception was the direct cause of the following exception:
in ImageDetectionPrediction.show(self, box_thickness, show_confidence, color_mapping, target_bboxes, target_bboxes_format, target_class_ids, class_names)

line 52 : diag_length = np.sqrt(bbox_width**2 + bbox_height**2)

TypeError: loop of ufunc does not support argument 0 of type int which has no callable sqrt method

Does anyone know what could be causing this in the normalization process?
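One plausible reading of this traceback, which is an assumption and is not confirmed anywhere in this thread, is that a predicted box has enormous coordinates on those images. NumPy then receives a Python int too large for a 64-bit integer, stores it as a generic object, and cannot take its square root. The sketch below flags such boxes before calling show(). The helper name `suspicious_boxes` and the `margin` parameter are hypothetical.

```python
from typing import List, Sequence


def suspicious_boxes(
    boxes: Sequence[Sequence[float]], image_width: int, image_height: int, margin: float = 2.0
) -> List[int]:
    """Return the indices of (x1, y1, x2, y2) pixel boxes that look degenerate.

    A box is flagged when any coordinate is NaN, infinite, or lies further from
    the origin than `margin` times the longer image side.
    """
    limit = margin * max(image_width, image_height)
    flagged = []
    for index, box in enumerate(boxes):
        # NaN and infinity both fail this comparison, so no separate check is needed.
        if not all(abs(coord) <= limit for coord in box):
            flagged.append(index)
    return flagged


boxes = [(10, 20, 110, 220), (0, 0, 10**12, 5), (float("nan"), 0, 5, 5)]
print(suspicious_boxes(boxes, 640, 640))  # indices of the boxes worth inspecting
```

If any box is flagged, the raw prediction values for that image are worth inspecting, since they would point to the input normalization producing values the model was not trained on.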
