
Some tricks to improve yolov5. #2

Closed
2 of 3 tasks
SpongeBab opened this issue Jul 16, 2021 · 6 comments
Labels
enhancement New feature or request Stale

Comments

@SpongeBab
Owner

SpongeBab commented Jul 16, 2021

🚀 Feature

original issue: ultralytics#3993

Edit: a list of tricks (TBC) that I want to try.

@SpongeBab SpongeBab added the enhancement New feature or request label Jul 16, 2021
@SpongeBab
Owner Author

SpongeBab commented Jul 16, 2021

First, I tried yolov5-four-FPN. I created a new YAML:

# Parameters
nc: 80  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
#  - [ 10,13, 16,30, 33,23 ]  # P3/8
#  - [ 30,61, 62,45, 59,119 ]  # P4/16
#  - [ 116,90, 156,198, 373,326 ]  # P5/32
  - [ 5,7, 8,12, 15,13 ]  # P2/4  # custom
  - [ 10,13, 16,30, 33,23 ]  # P3/8
  - [ 30,61, 62,45, 59,119 ]  # P4/16
  - [ 116,90, 156,198, 373,326 ]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [ [ -1, 1, Focus, [ 64, 3 ] ],  # 0-P1/2
    [ -1, 1, Conv, [ 128, 3, 2 ] ],  # 1-P2/4
    [ -1, 3, Bottleneck, [ 128 ] ],
    [ -1, 1, Conv, [ 256, 3, 2 ] ],  # 3-P3/8
    [ -1, 9, BottleneckCSP, [ 256 ] ],
    [ -1, 1, Conv, [ 512, 3, 2 ] ],  # 5-P4/16
    [ -1, 9, BottleneckCSP, [ 512 ] ],
    [ -1, 1, Conv, [ 1024, 3, 2 ] ],  # 7-P5/32
    [ -1, 1, SPP, [ 1024, [ 5, 9, 13 ] ] ],
    [ -1, 6, BottleneckCSP, [ 1024 ] ],  # 9
  ]

# YOLOv5 FPN head
head:
  [ [ -1, 3, BottleneckCSP, [ 1024, False ] ],  # 10 (P5/32-large)

    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 6 ], 1, Concat, [ 1 ] ],  # cat backbone P4
    [ -1, 1, Conv, [ 512, 1, 1 ] ],
    [ -1, 3, BottleneckCSP, [ 512, False ] ],  # 14 (P4/16-medium)

    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 4 ], 1, Concat, [ 1 ] ],  # cat backbone P3
    [ -1, 1, Conv, [ 256, 1, 1 ] ],
    [ -1, 3, BottleneckCSP, [ 256, False ] ],  # 18 (P3/8-small)

    [ -1, 1, nn.Upsample, [ None, 2, 'nearest' ] ],
    [ [ -1, 2 ], 1, Concat, [ 1 ] ],  # cat backbone P2
    [ -1, 1, Conv, [ 128, 1, 1 ] ],
    [ -1, 3, BottleneckCSP, [ 128, False ] ],  # 22 (P2/4-small)

    [ [ 18, 14, 10, 6 ], 1, Detect, [ nc, anchors ] ],  # Detect(P2, P3, P4, P5)
  ]
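As a quick sanity check on the cost of this config before looking at results: at a 640x640 input, the extra P2/4 head predicts on a 160x160 grid, which by itself has more cells than P3, P4, and P5 combined, so higher memory use and slower inference are to be expected. A plain-Python back-of-the-envelope calculation:

```python
# Grid sizes of each Detect scale at 640x640 input (stride = downsample factor).
img_size = 640
strides = {"P2/4": 4, "P3/8": 8, "P4/16": 16, "P5/32": 32}

for name, s in strides.items():
    g = img_size // s
    print(f"{name}: {g}x{g} grid = {g * g} cells")

p2_cells = (img_size // 4) ** 2                       # the new P2/4 head alone
rest_cells = sum((img_size // s) ** 2 for s in (8, 16, 32))  # P3 + P4 + P5
print(p2_cells, rest_cells)  # 25600 8400
```

So the P2/4 head roughly quadruples the number of prediction cells, which is consistent with the jump in GPU memory (2.34G vs 1.72G) in the training logs below.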

Then I began retraining on my custom dataset. So far, the results look likely to fall short of expectations.
yolov5-fpn:

    12/299     1.72G    0.1028    0.2015         0    0.3043        85       640     0.396    0.3296    0.3256   **0.08686**   0.09502    0.2732         0
    13/299     1.72G   0.09832    0.1952         0    0.2935        80       640   0.08477   0.04569      0.01  **0.002103**     0.143     1.027         0
    14/299     1.72G   0.09651    0.2055         0     0.302        44       640    0.4327    0.4439    0.4303     **0.128**   0.09133    0.2394         0
    15/299     1.72G   0.09085    0.1977         0    0.2886        24       640    0.4973    0.4347    0.4324    **0.1315**   0.09125    0.2948         0

yolov5-four-fpn:

    12/299     2.34G    0.1029    0.1509         0    0.2538        46       640    0.3257    0.2285    0.1548   **0.04483**    0.1016    0.2063         0
    13/299     2.34G   0.09988    0.1454         0    0.2452        85       640    0.2229    0.1991   0.07238   **0.01867**    0.1062    0.1958         0
    14/299     2.34G   0.09809    0.1433         0    0.2414        99       640     0.329    0.2369    0.1658   **0.04505**    0.1014    0.1966         0
    15/299     2.34G   0.09579    0.1469         0    0.2427       159       640    0.4225    0.2317    0.2051   **0.06759**   0.09982    0.1883         0

Unfortunately, I only have an RTX 2070 with 8 GB of memory, so both runs used a batch size of 2; anything larger overflows GPU memory. A bigger batch size might give better results.
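One way to emulate a larger batch on an 8 GB card is gradient accumulation: run several small forward/backward passes, sum the scaled gradients, and step the optimizer once. For a mean loss this reproduces the big-batch gradient exactly. A minimal sketch in plain Python (the linear model `y = w * x` and the data here are made up purely for illustration):

```python
# Gradient of mean squared error for the toy model y = w * x.
def grad(w, xs, ys):
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 16.3]
w = 0.5

# One "big" batch of 8 ...
big = grad(w, xs, ys)

# ... vs 4 accumulated micro-batches of 2, averaged before the step.
micro = sum(grad(w, xs[i:i + 2], ys[i:i + 2]) for i in range(0, 8, 2)) / 4

print(abs(big - micro) < 1e-9)  # True: same update direction and magnitude
```

In PyTorch the same idea is dividing each micro-batch loss by the number of accumulation steps, calling `backward()` each time, and calling `optimizer.step()` only once per group. It trades extra time for memory, though it does not reproduce the BatchNorm statistics of a true larger batch.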

Some results for the four-output v5-P5:
yolov5-four-fpn:

               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 25/25 [00:05<00:00,  4.64it/s]
                 all         50       1532      0.961      0.916      0.963      0.665
Speed: 0.4ms pre-process, 32.8ms inference, 5.3ms NMS per image at shape (2, 3, 640, 640)

mAP 0.665, latency 38.5ms.

yolov5-fpn:

               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 25/25 [00:04<00:00,  5.25it/s]
                 all         50       1532      0.986      0.944      0.973      0.728
Speed: 0.5ms pre-process, 27.9ms inference, 3.0ms NMS per image at shape (2, 3, 640, 640) 

mAP 0.728, latency 31.4ms.
Unfortunately, the four-output model's mAP is actually lower.
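For reference, the latency figures I quote are just the sum of the three per-image stages printed by the validation run (pre-process + inference + NMS):

```python
# Per-image latency in ms, summed from the Speed lines above.
four_fpn = 0.4 + 32.8 + 5.3
fpn = 0.5 + 27.9 + 3.0
print(round(four_fpn, 1))  # 38.5
print(round(fpn, 1))       # 31.4
```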

@glenn-jocher

@SpongeBab thanks for the ideas! As I mentioned previously YOLOv5-P6 models include outputs at 4 scales already:
https://github.com/ultralytics/yolov5/blob/62409eea0807830669f21a84733e73052ee85c07/models/hub/yolov5s6.yaml#L1-L58

@SpongeBab
Owner Author

SpongeBab commented Jul 17, 2021

Yeah. I just want to test whether adding FPN output layers brings improvement without increasing the number of network layers. If I used the P6 model, improvements would not be surprising, since deeper networks generally reach higher mAP.
Also, for the P6 model, even though training at 1280 gets a higher mAP than 640, switching to 640 may make the mAP lower; the P6 model needs a larger input size to pay off. Did you try training P6 at 640x640?
Finally, I think that if the FPN fuses more feature layers, it needs a larger batch size to reach its potential.
Just for the record:
yolov5-four-fpn (latest result, batch size 3):

               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100%|██████████| 25/25 [00:04<00:00,  5.00it/s]
                 all         50       1532      0.982      0.945      0.976      0.741
Speed: 0.4ms pre-process, 31.3ms inference, 3.2ms NMS per image at shape (2, 3, 640, 640)

mAP 0.741 (+0.013), latency 34.9ms (+3.5ms)
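The deltas compare this batch-size-3 four-FPN run against the earlier yolov5-fpn result (mAP 0.728, latency 31.4ms):

```python
# mAP and latency deltas vs the original yolov5-fpn run above.
map_delta = round(0.741 - 0.728, 3)
lat_delta = round((0.4 + 31.3 + 3.2) - (0.5 + 27.9 + 3.0), 1)
print(map_delta, lat_delta)  # 0.013 3.5
```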

Although the batch size of 3 differs from the earlier runs, it does show a better mAP for the four-head P5 model.
Next, I will retrain the original v5-FPN with batch size 3.

@glenn-jocher

@SpongeBab yes, P6 models benefit from 640 training too: YOLOv5l6 trained at 640 produces 49.0 mAP vs 48.2 for YOLOv5l trained at 640.

@SpongeBab
Owner Author

Hello @glenn-jocher, something I'd like to share: there is a new anchor-free version of YOLO. If you're interested, have a look:
https://github.com/Megvii-BaseDetection/YOLOX
paper: https://arxiv.org/pdf/2107.08430.pdf

@github-actions

github-actions bot commented Nov 3, 2021

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@github-actions github-actions bot added the Stale label Nov 3, 2021
@github-actions github-actions bot closed this as completed Nov 8, 2021