
A Simple and Efficient Network for Small Target Detection #4213

Closed
mrhosseini opened this issue Nov 3, 2019 · 28 comments
Labels
Solved (The problem is solved using the correct settings), want enhancement (Want to improve accuracy, speed or functionality)

Comments

@mrhosseini

Hi,

This paper proposes a new network configuration for small target detection and claims performance near YoloV3 at a speed near YoloV3-Tiny. The main idea is to use dilated and 1x1 convolutions.

[images from the paper]
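For intuition (my own sketch, not from the paper): a k x k convolution with dilation d samples the same number of points but spreads them out, covering the extent of a larger kernel at no extra parameter cost:

```python
def effective_kernel(k, d):
    """Spatial extent covered by a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

print(effective_kernel(3, 1))  # 3: an ordinary 3x3 convolution
print(effective_kernel(3, 2))  # 5: same cost, wider receptive field
print(effective_kernel(3, 4))  # 9
```

This is why the cfg below can grow the receptive field with dilation=2 and dilation=4 layers without extra striding or parameters.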

I tried to implement the network using this repo, but during training I always get NaN for loss and avg loss.

Here is the configuration that I used for single class detection:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=8
width=512
height=512
channels=1
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 2.0
hue=0

learning_rate=0.001
burn_in=1000
max_batches = 8000
policy=steps
steps=6400,7200
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
dilation=2

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
dilation=4

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=9, 13

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers=8

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[reorg3d]
stride=2

[route]
layers=25, 28

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=leaky

and this is the proposed network in the paper:
[figure: the network architecture table from the paper]

Any advice for solving the problem?
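Two quick sanity checks on the cfg above (my own Python sketch, not darknet code): the three stride-2 convolutions set the output grid, and the last layer's filter count must match the detection head.

```python
def grid_size(input_size, num_stride2_layers=3):
    # Each stride-2 convolution halves the spatial resolution:
    # 512 -> 256 -> 128 -> 64.
    return input_size // (2 ** num_stride2_layers)

def head_filters(classes, num_anchors):
    # Per anchor: 4 box coordinates + 1 objectness + class scores.
    return (classes + 5) * num_anchors

print(grid_size(512))      # 64: the head predicts on a 64 x 64 grid
print(head_filters(1, 3))  # 18, consistent with filters=18 in the last layer
```

So filters=18 in the last convolutional layer is consistent with single-class detection and 3 anchors.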

@AlexeyAB added the "want enhancement" label Nov 3, 2019
@AlexeyAB
Owner

AlexeyAB commented Nov 3, 2019

I don't see a [yolo] layer in your cfg-file.
Can you rename it to a txt-file and attach your whole cfg-file?

@mrhosseini
Author

The cfg-file: network.cfg.txt

I don't see a [yolo] layer in your cfg-file.

The proposed network in the paper does not have any [yolo] or [cost] layers.

Based on the yolov3-tiny.cfg file, I changed the activation function of the last layer to linear and added a [yolo] layer after it (network_with_yolo.cfg.txt). Now it can be trained, but the performance is weaker than YoloV3-Tiny. There is no NaN for loss and avg loss, but these values oscillate over a much larger range than for YoloV3-Tiny.

@AlexeyAB
Owner

AlexeyAB commented Nov 4, 2019

The proposed network in the paper does not have any [yolo] or [cost] layers.

The proposed network in the paper has [yolo] layer

[screenshot of the architecture table from the paper]

@mrhosseini
Author

mrhosseini commented Nov 4, 2019

I thought that in the table, the left column is the architecture of the authors' proposed network and the right column is the architecture of Tiny YoloV3, with each column presenting a separate, independent architecture. Therefore, the [yolo] layer you mentioned is in Tiny YoloV3, not in the proposed network.

@AlexeyAB
Owner

AlexeyAB commented Nov 4, 2019

Yes, sure, you are right :) But still, no detection network can work without a detection head: [yolo], SSD, Faster RCNN, ...

@mrhosseini
Author

Thanks. So there may be a mistake in the table. As I mentioned before, adding a [yolo] layer after the last convolutional layer did not give any interesting results:

Based on the yolov3-tiny.cfg file, I changed the activation function of the last layer to linear and added a [yolo] layer after it (network_with_yolo.cfg.txt). Now it can be trained, but the performance is weaker than YoloV3-Tiny. There is no NaN for loss and avg loss, but these values oscillate over a much larger range than for YoloV3-Tiny.

Apart from the [yolo] layer, does the configuration in network_with_yolo.cfg.txt conform to the network proposed in the paper? I used a [route] layer for the concatenation layers and a [reorg3d] layer for the passthrough layer.
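For anyone following along, here is roughly what the passthrough does (a simplified sketch; darknet's actual [reorg3d] may order the output channels differently): a stride-2 space-to-depth move turns an H x W x C map into H/2 x W/2 x 4C, so a high-resolution feature map can then be concatenated with a downsampled one via [route].

```python
def reorg(feature, stride=2):
    # feature is a nested list indexed [channel][y][x].
    c, h, w = len(feature), len(feature[0]), len(feature[0][0])
    out = []
    # Each (dy, dx) offset within a stride x stride block becomes
    # its own set of output channels.
    for dy in range(stride):
        for dx in range(stride):
            for ch in range(c):
                out.append([[feature[ch][y * stride + dy][x * stride + dx]
                             for x in range(w // stride)]
                            for y in range(h // stride)])
    return out

f = [[[y * 4 + x for x in range(4)] for y in range(4)]]  # 1 channel, 4x4
r = reorg(f)
print(len(r), len(r[0]), len(r[0][0]))  # 4 2 2: channels x4, spatial /2
```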

@AlexeyAB
Owner

AlexeyAB commented Nov 4, 2019

Now it can be trained, but the performance is weaker than YoloV3-Tiny.

  • What dataset do you use?
  • How many training images?
  • What is the average size of objects after resizing images to the network size 512x512?
  • What mAP did you get in both cases?
  • Can you show chart.png with Loss & mAP for both network_with_yolo.cfg.txt and yolov3-tiny.cfg ?


Yes, it seems network_with_yolo.cfg.txt conforms to the network proposed in the paper.

I used a [route] layer for the concatenation layers and a [reorg3d] layer for the passthrough layer.

That's right.

Try using this in the [yolo] layer:

filters=36
...

mask = 0,1,2,3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319

@mrhosseini
Author

  • What dataset do you use?

A custom dataset. I have not tested the datasets used in the paper.

  • How many training images?

1734 images.

  • What is the average size of objects after resizing images to the network size 512x512?

About 30x30.

  • What mAP did you get in both cases?
  • Can you show chart.png with Loss & mAP for both network_with_yolo.cfg.txt and yolov3-tiny.cfg ?

chart.png for yolov3-tiny.cfg.txt:

[chart image: yolov3-tiny]

chart.png for network_with_yolo.cfg.txt:

[chart image: network_with_yolo]

Note that:

  • The validation set used for mAP calculation is different from the training set.
  • Anchors are calculated for the dataset using darknet detector calc_anchors.
  • Network image size for Tiny YoloV3 is 416x416.

Try using this in the [yolo] layer:

filters=36
...

mask = 0,1,2,3,4,5

Without these changes, the mAP was lower and the avg loss swung over a larger range.


We have a separate test set. Here are the results of darknet detector map:

With best weights using yolov3-tiny.cfg.txt:

 calculation mAP (mean average precision)...
380
 detections_count = 573, unique_truth_count = 108  
class_id = 0, name = cls, ap = 74.57%   	 (TP = 83, FP = 44) 

 for conf_thresh = 0.25, precision = 0.65, recall = 0.77, F1-score = 0.71 
 for conf_thresh = 0.25, TP = 83, FP = 44, FN = 25, average IoU = 45.68 % 

 IoU threshold = 40 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.40) = 0.745731, or 74.57 % 
Total Detection Time: 0.000000 Seconds

With best weights using network_with_yolo.cfg.txt:

 calculation mAP (mean average precision)...
380
 detections_count = 576, unique_truth_count = 108  
class_id = 0, name = cls, ap = 67.20%   	 (TP = 82, FP = 67) 

 for conf_thresh = 0.25, precision = 0.55, recall = 0.76, F1-score = 0.64 
 for conf_thresh = 0.25, TP = 82, FP = 67, FN = 26, average IoU = 40.94 % 

 IoU threshold = 40 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.40) = 0.671979, or 67.20 % 
Total Detection Time: 2.000000 Seconds
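As a cross-check (my own script, independent of darknet), the precision/recall/F1 lines above follow directly from the printed TP/FP/FN counts:

```python
def prf(tp, fp, fn):
    p = tp / (tp + fp)        # precision
    r = tp / (tp + fn)        # recall
    f1 = 2 * p * r / (p + r)  # harmonic mean of the two
    return round(p, 2), round(r, 2), round(f1, 2)

print(prf(83, 44, 25))  # (0.65, 0.77, 0.71) -- yolov3-tiny run
print(prf(82, 67, 26))  # (0.55, 0.76, 0.64) -- network_with_yolo run
```

Both tuples match darknet's reported precision, recall, and F1-score for conf_thresh = 0.25.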

@AlexeyAB
Owner

AlexeyAB commented Nov 5, 2019

Are mAPs on the charts for Training or Validation dataset?

@mrhosseini
Author

Are mAPs on the charts for Training or Validation dataset?

Validation

@AlexeyAB
Owner

AlexeyAB commented Nov 5, 2019

Why do you get 99.9% on the chart, but 67.20% from ./darknet detector map ... for network_with_yolo.cfg.txt?

@mrhosseini
Author

mrhosseini commented Nov 6, 2019

Why do you get 99.9% on the chart, but 67.20% from ./darknet detector map ... for network_with_yolo.cfg.txt?

I used a separate test set for darknet detector map, which is different from the validation set used in training.

@AlexeyAB
Owner

AlexeyAB commented Nov 6, 2019

Did you get the Training/Valid/Test datasets by randomly and uniformly dividing a single dataset 80%/10%/10%?

@mrhosseini
Author

Did you get the Training/Valid/Test datasets by randomly and uniformly dividing a single dataset 80%/10%/10%?

The train and valid sets are selected randomly from a single dataset, with 1734 images for training and 530 images for validation. But the test set is an independent set.

@AlexeyAB
Owner

AlexeyAB commented Nov 7, 2019

So maybe this is the reason. You train on some objects but test on others.

@mrhosseini
Author

So maybe this is the reason. You train on some objects but test on others.

Yes, you are right

@sctrueew

sctrueew commented Nov 9, 2019

@mrhosseini Hi,

When I use network_with_yolo.cfg I'm faced with this error:

cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16()
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

I have 18 classes and I just changed filters=138 and classes=18.

@mrhosseini
Author

When I use network_with_yolo.cfg I'm faced with this error:

cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16()
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

I have 18 classes and I just changed filters=138 and classes=18.

@zpmmehrdad
Hi,
Unfortunately I'm not familiar with cuDNN. Maybe @AlexeyAB can help you.

@sctrueew

sctrueew commented Nov 9, 2019

@mrhosseini Hi,

Thanks. What cuDNN and CUDA versions are you using?

@AlexeyAB
Owner

AlexeyAB commented Nov 9, 2019

@zpmmehrdad

  • What GPU do you use?
  • What command do you use?
  • Can you show output of commands:
nvcc --version
nvidia-smi

@sctrueew

@AlexeyAB Hi,

I'm using
OS: Win10,
command: darknet.exe detector train a.obj network_with_yolo.cfg -map

output:

compute_capability = 610, cudnn_half = 0
layer filters size/strd(dil) input output
0 conv 16 3 x 3/ 1 512 x 512 x 1 -> 512 x 512 x 16 0.075 BF
1 conv 32 3 x 3/ 2 512 x 512 x 16 -> 256 x 256 x 32 0.604 BF
2 conv 16 1 x 1/ 1 256 x 256 x 32 -> 256 x 256 x 16 0.067 BF
3 conv 32 3 x 3/ 1 256 x 256 x 16 -> 256 x 256 x 32 0.604 BF
4 conv 64 3 x 3/ 2 256 x 256 x 32 -> 128 x 128 x 64 0.604 BF
5 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
6 conv 64 3 x 3/ 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BF
7 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
8 conv 64 3 x 3/ 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BF
9 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
10
cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16() : line: 157 : build time: Oct 22 2019 - 09:30:52
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED: No error
Assertion failed: 0, file ....\src\utils.c, line 293

@AlexeyAB added the "Likely bug" label (Maybe a bug, maybe not) Nov 10, 2019
@AlexeyAB
Owner

@zpmmehrdad What GPU do you use?

@sctrueew

sctrueew commented Nov 10, 2019

@zpmmehrdad What GPU do you use?

@AlexeyAB Hi,
GTX 1080 ti

@sctrueew

sctrueew commented Nov 12, 2019

@AlexeyAB Hi,
I found the problem. I updated CUDA from version 9.1 to 10.0 and it works.

@AlexeyAB added the "Solved" label and removed the "Likely bug" label Nov 12, 2019
@leiyaohui

leiyaohui commented Dec 9, 2019

@mrhosseini Hello! I'm also studying this field recently. Are you running on Windows? If so, could you send me a copy of your compiled Darknet? I encountered a lot of errors while compiling. My email is 1373890292@qq.com. I look forward to your reply.

@mrhosseini
Author

Hi @leiyaohui, unfortunately I use Ubuntu. Try one of the methods here. You may open a new issue if you encounter errors.

@leiyaohui

leiyaohui commented Dec 9, 2019 via email

@mrhosseini
Author

Did you write the dilated convolution yourself, or did it come with Darknet?

The dilated convolution is implemented in this repository. You can use the configuration file mentioned above for the network proposed in the paper.
