Confused about network's input size support, img_size in the code, and "rectangular training"? #332

furkankirac · 2019-06-14T21:58:08Z

First, thanks for the great repo with valuable commented code + support.
I'd like to train a network that matches the aspect ratio of 1080p video. A resolution of 1056x608, for example, is close to that aspect ratio. I want to train a network from scratch at that resolution, and I have a few questions:

Does this repo support training with network sizes that are not square? (Rectangular network size?)
If so, what is img_size in the code? Is a square image always fed to the network at inference time?
Is the so-called "rectangular training" support coming in v7 about training non-square network sizes, or is it only an optimization for rectangular images in the dataset during training?
If I change the number of filters of the convolutional layers in the cfg file, will this repo initialize the weights correctly? It looks like it always initializes from darknet's pretrained weights file.
Best

glenn-jocher · 2019-06-14T23:39:28Z

@furkankirac yes, you can optionally enable rectangular training in train.py. Rectangular inference is already on by default in test.py and detect.py, so no changes are needed there. See #232.

img_size is the length of the longest image dimension when rectangular inference is used. If you would like to structure your own network, you can leave the new model initialized with random weights rather than loading a pretrained backbone (i.e. darknet53). darknet53 is used as the backbone to give training a head start on the more standard yolov3 variants such as yolov3-spp.
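To make the "longest dimension" point concrete, here is a minimal sketch of how a rectangular input shape could be derived from img_size. The function name `rect_shape` is hypothetical and not part of the repo, and the rounding of the shorter side up to a multiple of 32 is an assumption based on YOLOv3's largest stride, not a copy of the repo's code.

```python
def rect_shape(height, width, img_size=416, stride=32):
    """Return (h, w) of a rectangular network input.

    Hypothetical helper: the longest side is scaled to img_size and the
    shorter side is padded up to the next multiple of `stride`.
    """
    ratio = img_size / max(height, width)      # scale factor for the longest side
    new_h = round(height * ratio)
    new_w = round(width * ratio)
    # Round each side up to a multiple of the stride (assumed to be 32).
    pad_h = -(-new_h // stride) * stride
    pad_w = -(-new_w // stride) * stride
    return pad_h, pad_w


# A 1080p frame with img_size=1056 gives the 1056x608 input asked about above.
print(rect_shape(1080, 1920, img_size=1056))
```

Under these assumptions a square image still yields a square input, so img_size behaves as before for square data.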

To detect with rectangular inference, simply run python3 detect.py. To train with rectangular training, set the rectangular_training flag in the code here and run python3 train.py (git pull first to get the latest code).

yolov3/train.py

Lines 142 to 149 in bb36820

    # Dataset
    rectangular_training = False
    dataset = LoadImagesAndLabels(train_path,
                                  img_size,
                                  batch_size,
                                  augment=True,
                                  rect=rectangular_training)
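
As a rough illustration of what a rect=True dataset might do, the sketch below sorts images by aspect ratio, groups them into batches, and picks one padded shape per batch. This is an assumption about the general technique, not the implementation inside LoadImagesAndLabels; the function name `batch_shapes` and the stride of 32 are hypothetical choices for the example.

```python
import math


def batch_shapes(sizes, img_size=416, batch_size=2, stride=32):
    """sizes: list of (height, width). Returns [(image_indices, (h, w)), ...].

    Hypothetical sketch: images with similar aspect ratios share a batch,
    and each batch gets one rectangular shape whose longest side is img_size.
    """
    # Sort image indices by aspect ratio (height / width).
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][0] / sizes[i][1])
    out = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        ratios = [sizes[i][0] / sizes[i][1] for i in idx]
        lo, hi = min(ratios), max(ratios)
        if hi < 1:        # every image in the batch is wider than tall
            rel_h, rel_w = hi, 1.0
        elif lo > 1:      # every image in the batch is taller than wide
            rel_h, rel_w = 1.0, 1.0 / lo
        else:             # mixed batch: fall back to a square shape
            rel_h, rel_w = 1.0, 1.0
        # Round up to a multiple of the stride (assumed to be 32).
        h = math.ceil(rel_h * img_size / stride) * stride
        w = math.ceil(rel_w * img_size / stride) * stride
        out.append((idx, (h, w)))
    return out


sizes = [(1080, 1920), (1920, 1080), (720, 1280), (1280, 720)]
for idx, shape in batch_shapes(sizes):
    print(idx, shape)
```

With a scheme like this, landscape and portrait images end up in separate batches, so less padding is needed than when every image is letterboxed to a square.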

glenn-jocher closed this as completed Jun 25, 2019

JVision mentioned this issue Feb 8, 2021

how to train a detector with input images of a rectangular shape 192x32? #1684

Closed
