
Is there any way I can use yolov5 with opencv dnn #239

Closed · chaUAV opened this issue Jun 30, 2020 · 61 comments · Fixed by #4833

Labels: enhancement (New feature or request), Stale

Comments

@chaUAV commented Jun 30, 2020

🚀 Feature

Is there any way I can use YOLOv5 with OpenCV DNN?

chaUAV added the enhancement (New feature or request) label on Jun 30, 2020
@edurenye (Contributor)

Yes @chaUAV, it is possible. You need to export it using https://github.com/ultralytics/yolov5/blob/master/models/export.py (there is a usage example inside the file). The model will be exported as an ONNX model, which can then be imported in OpenCV using cv2.dnn.readNetFromONNX(model_path).

Or at least that is the supposed way; I hit this issue doing that: #250
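For reference, a minimal sketch of that workflow (assuming a yolov5s.onnx produced by the export script; the input size must match the export resolution, and the raw output still needs confidence filtering and NMS):

import cv2

# Load the exported ONNX model into OpenCV DNN.
net = cv2.dnn.readNetFromONNX("yolov5s.onnx")

# Preprocess: scale pixels to [0, 1], resize to the export size, BGR -> RGB.
image = cv2.imread("image.jpg")
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (640, 640), swapRB=True, crop=False)

net.setInput(blob)
outputs = net.forward()  # raw predictions; apply thresholding + NMS afterwards
print(outputs.shape)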

@glenn-jocher (Member)

@chaUAV @edurenye I've added a pinned documentation issue at the top of https://github.com/ultralytics/yolov5/issues for this; hopefully it will help everyone understand the basic functionality.

The INT64s remain one mystery among many in the export process though.

@edurenye (Contributor)

Thanks @glenn-jocher. I think it's the labels, but I need to test, and I'm having some problems with Docker while trying to update the NVIDIA drivers to 450, so it might take me a while.

@chaUAV (Author) commented Jul 1, 2020

Thank you guys. I had already exported to ONNX and used it with OpenCV, but I got the same error as #250. Is there anything I could do to fix it? @glenn-jocher

github-actions bot commented Aug 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions bot added the Stale label on Aug 1, 2020
github-actions bot closed this as completed on Aug 6, 2020
@MohamedAliRashad

This issue needs to be reopened, @glenn-jocher.
We need an alternative to the ONNX method.

glenn-jocher reopened this on Sep 6, 2020
@glenn-jocher (Member)

@MohamedAliRashad sure. Are you trying to export an official YOLOv5 model for use with opencv? I can provide versions of these in ONNX format with outputs structured correctly, but they will lack NMS functionality. Is there a way to append an NMS module in ONNX?

@MohamedAliRashad

@glenn-jocher
I was thinking of readNetFromDarknet, as with previous versions of YOLO.

@glenn-jocher (Member)

@MohamedAliRashad sorry, I've just never used OpenCV DNN. Can you provide demo code for how this would ideally work? As I said, I can provide fully functional exports in all supported formats for COCO and VOC trained YOLOv5 models. What format do you need it in exactly, and how is NMS handled?

@MohamedAliRashad

@glenn-jocher it's quite simple, actually.
First, you read the model weights and configuration to construct the network:

net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)

Then we run inference on an input:

blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
ln = net.getUnconnectedOutLayersNames()  # names of the output layers
detections = net.forward(ln)

And finally we filter the detections by threshold with code like this:

boxes = []
confidences = []
classIDs = []
for output in detections:
    for detection in output:
        scores = detection[5:]
        classID = np.argmax(scores)
        confidence = scores[classID]
        if confidence > args["confidence"]:
            # W, H are the dimensions of the input image
            box = detection[0:4] * np.array([W, H, W, H])
            (centerX, centerY, width, height) = box.astype("int")
            x = int(centerX - (width / 2))
            y = int(centerY - (height / 2))
            boxes.append([x, y, int(width), int(height)])
            confidences.append(float(confidence))
            classIDs.append(classID)
# the last two arguments are the score threshold and the NMS threshold
idxs = cv2.dnn.NMSBoxes(boxes, confidences, confidence, threshold)

@sean-wade

Has anyone done this properly, i.e. used OpenCV DNN to run inference with YOLOv5 models? Is there any guide?

@leeyunhome

@MohamedAliRashad sure. Are you trying to export an official YOLOv5 model for use with opencv? I can provide versions of these in ONNX format with outputs structured correctly, but they will lack NMS functionality. Is there a way to append an NMS module in ONNX?

Hello?

Can I take this onnx model and test it out?

Thank you.

@leeyunhome

@chaUAV @edurenye I've added a pinned documentation issue now at the top of https://github.com/ultralytics/yolov5/issues for this, hopefully this will help everyone to understand the basic functionality.

The INT64's remain one mystery among many in the export process though.

Hello?

The pinned issue seems to have disappeared from the top over time. Could you tell me the URL of the document?

Thank you.

@glenn-jocher (Member)

@leeyunhome I've exported a YOLOv5s.onnx model at 640x640 here. It has two outputs, boxes (25200,4), and classes (25200,80).
https://github.com/ultralytics/yolov5/releases/download/v4.0/yolov5s.onnx

[screenshot of the exported model's outputs]

@leeyunhome

@leeyunhome I've exported a YOLOv5s.onnx model at 640x640 here. It has two outputs, boxes (25200,4), and classes (25200,80).
https://github.com/ultralytics/yolov5/releases/download/v4.0/yolov5s.onnx

[screenshot of the exported model's outputs]

Thank you for the answer.
I have some additional questions.

  1. What did you change from the original repo to produce this output?

  2. I would like to know the arguments of the torch.onnx.export function used for this conversion.

  3. How should I interpret the contents of the output tensors?
     I don't understand how the two outputs, boxes (25200,4) and classes (25200,80), encode boxes and classes.
     Can you tell me what I need to study in this regard?
     I guess the 80 in classes (25200, 80) is the number of classes, as in a file like coco.names, but I don't know where 25200 comes from.

Thank you

@glenn-jocher (Member)

@leeyunhome this is an optimized ONNX model that we create using a private repo (ultralytics/yolov5-export). It's part of our paid product offerings. It works well for fixed output shapes, e.g. if you want an ONNX model to process 720p webcam streams.

25200 is the number of output points for a 640x640 image. You pass these through NMS to get your final detections.
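For context, 25200 follows directly from the three detection scales: a 640x640 input produces 80x80, 40x40 and 20x20 grids (strides 8, 16, 32), each with 3 anchors per cell:

# (80*80 + 40*40 + 20*20) grid cells * 3 anchors = 25200 candidate boxes
print(sum((640 // s) ** 2 for s in (8, 16, 32)) * 3)  # 25200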

@vishal-nasre

Does anyone have a prepared notebook for YOLOv5 with OpenCV for live streams?
I am about to drop the plan to use YOLOv5 :) due to this.

@glenn-jocher (Member)

@vishal-nasre YOLOv5 runs inference out of the box on a variety of sources including remote streams (RTSP, HTTP etc.) and local webcams. See https://github.com/ultralytics/yolov5#quick-start-examples for details.

[screenshot: quick-start examples from the README]

@glenn-jocher (Member)

@edurenye @chaUAV @MohamedAliRashad @a954217436 @leeyunhome good news 😃! Your original issue may now be fixed ✅ in PR #4833 by @SamFC10. This PR implements architecture updates to allow for ONNX-exported YOLOv5 models to be used with OpenCV DNN.

To receive this update:

  • Git – git pull from within your yolov5/ directory, or git clone https://github.com/ultralytics/yolov5 again
  • PyTorch Hub – force-reload with model = torch.hub.load('ultralytics/yolov5', 'yolov5s', force_reload=True)
  • Notebooks – view the updated notebooks (Colab, Kaggle)
  • Docker – sudo docker pull ultralytics/yolov5:latest to update your image

Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!

@snehitvaddi

Good to hear this update, @glenn-jocher.
Could you briefly list the steps involved one last time, i.e. how to export the ONNX model, and whether there are any additional changes we need to make for OpenCV compatibility?

@glenn-jocher (Member) commented Oct 14, 2021

@edurenye @chaUAV @MohamedAliRashad @a954217436 @leeyunhome steps for OpenCV DNN inference:

# Export to ONNX
python export.py --weights yolov5s.pt --include onnx --simplify

# Inference
python detect.py --weights yolov5s.onnx  # ONNX Runtime inference
# -- or --
python detect.py --weights yolov5s.onnx --dnn  # OpenCV DNN inference
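For anyone parsing the ONNX output directly instead of using detect.py: with the default COCO export, each of the 25200 rows has 85 values, laid out as 4 box coordinates (cx, cy, w, h), 1 objectness score, and 80 class scores. A minimal sketch of the decoding step (function and variable names are illustrative, not part of the repo):

import numpy as np

def decode_row(row, conf_thres=0.25):
    # row: 85 values = cx, cy, w, h, objectness, then 80 class scores
    cx, cy, w, h, obj = row[:5]
    if obj < conf_thres:
        return None
    class_id = int(np.argmax(row[5:]))
    score = float(obj * row[5 + class_id])  # final confidence = objectness * class score
    box = [cx - w / 2, cy - h / 2, w, h]    # center xywh -> top-left xywh for cv2.dnn.NMSBoxes
    return box, score, class_id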

@snehitvaddi commented Oct 22, 2021

Has anyone implemented inference through a webcam and OpenCV using an exported ONNX model? 🤔

I know python detect.py --weights yolov5s.onnx --dnn is for inference, but I'm trying to implement something in real time from a webcam. It would be really helpful if anyone could share an OpenCV webcam implementation of an exported ONNX model.

@glenn-jocher (Member)

@snehitvaddi read the README

[screenshot from the README]

@PauloMendes33

The code is as follows.

/*
 * File:   main.cpp
 * Author: pauloasmendes
 * Created on 24 November 2021, 16:32
 */

#include <iostream>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <string>
#include <vector>
#include <opencv4/opencv2/opencv.hpp>

// YOLO
#include <opencv4/opencv2/dnn.hpp>
#include <opencv4/opencv2/dnn/all_layers.hpp>

using namespace std;

constexpr float CONFIDENCE_THRESHOLD = 0;
constexpr float NMS_THRESHOLD = 0.4;
// number of classes to detect
//constexpr int NUM_CLASSES = 80;
constexpr int NUM_CLASSES = 5; // detect only the first classes in the coco_names.txt list
// colors for bounding boxes
const cv::Scalar colors[] = {
    {0, 255, 255},
    {255, 255, 0},
    {0, 255, 0},
    {255, 0, 0}
};
const auto NUM_COLORS = sizeof(colors) / sizeof(colors[0]);

int main(int argc, char* argv[]) {
    cout << CV_VERSION << endl;
    cv::Mat im_1;

    im_1 = cv::imread("im_14_RGB.jpg", cv::IMREAD_COLOR);
    if (!im_1.data) {
        cout << "\n\t Could not open or find the image 1" << endl;
    }
    // let's downscale the image using a new width and height
    int down_width = 640;
    int down_height = 640;

    // resize down
    cv::resize(im_1, im_1, cv::Size(down_width, down_height), cv::INTER_LINEAR);

    // YOLOv5
    // read the COCO class names from the .txt file
    std::vector<std::string> class_names;
    {
        std::ifstream class_file("coco_names.txt");
        if (!class_file)
        {
            std::cerr << "failed to open classes.txt\n";
            return 0;
        }

        std::string line;
        while (std::getline(class_file, line))
            class_names.push_back(line);
    }
    // load the network configuration files
    //auto net = cv::dnn::readNetFromDarknet("custom-yolov4-detector.cfg", "custom-yolov4-detector_best.weights");
    //auto net = cv::dnn::readNetFromDarknet("yolov4.cfg", "custom-yolov4-tiny-detector_best.weights");
    //cv::dnn::Net net = cv::dnn::readNetFromONNX("best.onnx");
    auto net = cv::dnn::readNetFromONNX("yolov5.onnx");

    cout << "here" << endl;
    // using GPU for image processing
    //net.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
    //net.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);
    // using CPU for image processing
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
    net.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
    auto output_names = net.getUnconnectedOutLayersNames();
    cv::Mat blob;
    std::vector<cv::Mat> detections;
    std::vector<int> indices[NUM_CLASSES];
    std::vector<cv::Rect> boxes[NUM_CLASSES];
    std::vector<float> scores[NUM_CLASSES];

    // Creates a 4-dimensional blob from the image. Note cv::Size takes (width, height).
    cv::dnn::blobFromImage(im_1, blob, 0.00392, cv::Size(im_1.cols, im_1.rows), cv::Scalar(), true, false, CV_32F);
    net.setInput(blob);
    net.forward(detections, output_names);
    // object detection
    for (auto& output : detections) {
        const auto num_boxes = output.rows;
        for (int i = 0; i < num_boxes; i++) {
            // the 5 leading predictions for each bounding box: x, y, w, h, confidence
            auto x = output.at<float>(i, 0) * im_1.cols;
            auto y = output.at<float>(i, 1) * im_1.rows;
            auto width = output.at<float>(i, 2) * im_1.cols;
            auto height = output.at<float>(i, 3) * im_1.rows;
            cv::Rect rect(x - width / 2, y - height / 2, width, height);

            for (int c = 0; c < NUM_CLASSES; c++) {
                auto confidence = *output.ptr<float>(i, 5 + c);
                if (confidence >= CONFIDENCE_THRESHOLD) {
                    boxes[c].push_back(rect);
                    scores[c].push_back(confidence);
                }
            }
        }
    }
    // Non-maximum suppression of the bounding boxes and their confidence scores:
    // removes duplicate bounding boxes that identify the same object.
    for (int c = 0; c < NUM_CLASSES; c++)
        cv::dnn::NMSBoxes(boxes[c], scores[c], 0.0, NMS_THRESHOLD, indices[c]);

    // draw the detected objects and their confidence scores as bounding boxes
    for (int c = 0; c < NUM_CLASSES; c++) {
        for (size_t i = 0; i < indices[c].size(); ++i) {
            const auto color = colors[c % NUM_COLORS];

            auto idx = indices[c][i];
            const auto& rect = boxes[c][idx];
            cv::rectangle(im_1, cv::Point(rect.x, rect.y), cv::Point(rect.x + rect.width, rect.y + rect.height), color, 3);

            // label with the class of the object inside the bounding box, e.g. pedestrian or bottle
            std::ostringstream label_ss;
            label_ss << class_names[c] << ": " << std::fixed << std::setprecision(2) << scores[c][idx];
            auto label = label_ss.str();

            int baseline;
            auto label_bg_sz = cv::getTextSize(label.c_str(), cv::FONT_HERSHEY_COMPLEX_SMALL, 1, 1, &baseline);
            // background rectangle for the label
            cv::rectangle(im_1, cv::Point(rect.x, rect.y - label_bg_sz.height - baseline - 10), cv::Point(rect.x + label_bg_sz.width, rect.y), color, cv::FILLED);
            // write the class of the detected object
            cv::putText(im_1, label.c_str(), cv::Point(rect.x, rect.y - baseline - 5), cv::FONT_HERSHEY_COMPLEX_SMALL, 1, cv::Scalar(0, 0, 0));
        }
    }
    cv::namedWindow("YOLOV5 detection", cv::WINDOW_NORMAL);
    cv::imshow("YOLOV5 detection", im_1);
    cv::waitKey(0);
    cv::imwrite("YOLOV5_res.jpg", im_1);

    return 0;
}

@doleron commented Jan 18, 2022

@MohamedAliRashad sorry I've just never used opencv dnn. Can you provide demo code for how this would work ideally? As I said I can provide fully functional exports in all supported formats for COCO and VOC trained YOLOv5 models. What format do you need it in exactly, and how is NMS handled?

@glenn-jocher @PauloMendes33 I use this code to run YOLOv5 with OpenCV DNN:

import cv2
import time
import sys
import numpy as np

def build_model(is_cuda):
    net = cv2.dnn.readNet("config_files/yolov5s.onnx")
    if is_cuda:
        print("Attempty to use CUDA")
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA_FP16)
    else:
        print("Running on CPU")
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    return net

INPUT_WIDTH = 640
INPUT_HEIGHT = 640
SCORE_THRESHOLD = 0.2
NMS_THRESHOLD = 0.4
CONFIDENCE_THRESHOLD = 0.4

def detect(image, net):
    blob = cv2.dnn.blobFromImage(image, 1/255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=False)
    net.setInput(blob)
    preds = net.forward()
    return preds

def load_capture():
    capture = cv2.VideoCapture("sample.mp4")
    return capture

def load_classes():
    class_list = []
    with open("config_files/classes.txt", "r") as f:
        class_list = [cname.strip() for cname in f.readlines()]
    return class_list

class_list = load_classes()

def wrap_detection(input_image, output_data):
    class_ids = []
    confidences = []
    boxes = []

    rows = output_data.shape[0]

    image_width, image_height, _ = input_image.shape

    x_factor = image_width / INPUT_WIDTH
    y_factor =  image_height / INPUT_HEIGHT

    for r in range(rows):
        row = output_data[r]
        confidence = row[4]
        if confidence >= 0.4:

            classes_scores = row[5:]
            _, _, _, max_indx = cv2.minMaxLoc(classes_scores)
            class_id = max_indx[1]
            if (classes_scores[class_id] > .25):

                confidences.append(confidence)

                class_ids.append(class_id)

                x, y, w, h = row[0].item(), row[1].item(), row[2].item(), row[3].item() 
                left = int((x - 0.5 * w) * x_factor)
                top = int((y - 0.5 * h) * y_factor)
                width = int(w * x_factor)
                height = int(h * y_factor)
                box = np.array([left, top, width, height])
                boxes.append(box)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45) 

    result_class_ids = []
    result_confidences = []
    result_boxes = []

    for i in indexes:
        result_confidences.append(confidences[i])
        result_class_ids.append(class_ids[i])
        result_boxes.append(boxes[i])

    return result_class_ids, result_confidences, result_boxes

def format_yolov5(frame):

    row, col, _ = frame.shape
    _max = max(col, row)
    result = np.zeros((_max, _max, 3), np.uint8)
    result[0:row, 0:col] = frame
    return result


colors = [(255, 255, 0), (0, 255, 0), (0, 255, 255), (255, 0, 0)]

is_cuda = len(sys.argv) > 1 and sys.argv[1] == "cuda"

net = build_model(is_cuda)
capture = load_capture()

start = time.time_ns()
frame_count = 0
total_frames = 0
fps = -1

while True:

    _, frame = capture.read()
    if frame is None:
        print("End of stream")
        break

    inputImage = format_yolov5(frame)
    outs = detect(inputImage, net)

    class_ids, confidences, boxes = wrap_detection(inputImage, outs[0])

    frame_count += 1
    total_frames += 1

    for (classid, confidence, box) in zip(class_ids, confidences, boxes):
         color = colors[int(classid) % len(colors)]
         cv2.rectangle(frame, box, color, 2)
         cv2.rectangle(frame, (box[0], box[1] - 20), (box[0] + box[2], box[1]), color, -1)
         cv2.putText(frame, class_list[classid], (box[0], box[1] - 10), cv2.FONT_HERSHEY_SIMPLEX, .5, (0,0,0))

    if frame_count >= 30:
        end = time.time_ns()
        fps = 1000000000 * frame_count / (end - start)
        frame_count = 0
        start = time.time_ns()
    
    if fps > 0:
        fps_label = "FPS: %.2f" % fps
        cv2.putText(frame, fps_label, (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

    cv2.imshow("output", frame)

    if cv2.waitKey(1) > -1:
        print("finished by user")
        break

print("Total frames: " + str(total_frames))

C++ version:

#include <fstream>
#include <iostream>
#include <sstream>
#include <iomanip>
#include <cstring>
#include <chrono>

#include <opencv2/opencv.hpp>

std::vector<std::string> load_class_list()
{
    std::vector<std::string> class_list;
    std::ifstream ifs("config_files/classes.txt");
    std::string line;
    while (getline(ifs, line))
    {
        class_list.push_back(line);
    }
    return class_list;
}

void load_net(cv::dnn::Net &net, bool is_cuda)
{
    auto result = cv::dnn::readNet("config_files/yolov5s.onnx");
    if (is_cuda)
    {
        std::cout << "Attempty to use CUDA\n";
        result.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
        result.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA_FP16);
    }
    else
    {
        std::cout << "Running on CPU\n";
        result.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
        result.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
    }
    net = result;
}

const std::vector<cv::Scalar> colors = {cv::Scalar(255, 255, 0), cv::Scalar(0, 255, 0), cv::Scalar(0, 255, 255), cv::Scalar(255, 0, 0)};

const float INPUT_WIDTH = 640.0;
const float INPUT_HEIGHT = 640.0;
const float SCORE_THRESHOLD = 0.2;
const float NMS_THRESHOLD = 0.4;
const float CONFIDENCE_THRESHOLD = 0.4;

struct Detection
{
    int class_id;
    float confidence;
    cv::Rect box;
};

cv::Mat format_yolov5(const cv::Mat &source) {
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

void detect(cv::Mat &image, cv::dnn::Net &net, std::vector<Detection> &output, const std::vector<std::string> &className) {
    cv::Mat blob;

    auto input_image = format_yolov5(image);
    
    cv::dnn::blobFromImage(input_image, blob, 1./255., cv::Size(INPUT_WIDTH, INPUT_HEIGHT), cv::Scalar(), true, false);
    net.setInput(blob);
    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());

    float x_factor = input_image.cols / INPUT_WIDTH;
    float y_factor = input_image.rows / INPUT_HEIGHT;
    
    float *data = (float *)outputs[0].data;

    const int dimensions = 85;
    const int rows = 25200;
    
    std::vector<int> class_ids;
    std::vector<float> confidences;
    std::vector<cv::Rect> boxes;

    for (int i = 0; i < rows; ++i) {

        float confidence = data[4];
        if (confidence >= CONFIDENCE_THRESHOLD) {

            float * classes_scores = data + 5;
            cv::Mat scores(1, className.size(), CV_32FC1, classes_scores);
            cv::Point class_id;
            double max_class_score;
            minMaxLoc(scores, 0, &max_class_score, 0, &class_id);
            if (max_class_score > SCORE_THRESHOLD) {

                confidences.push_back(confidence);

                class_ids.push_back(class_id.x);

                float x = data[0];
                float y = data[1];
                float w = data[2];
                float h = data[3];
                int left = int((x - 0.5 * w) * x_factor);
                int top = int((y - 0.5 * h) * y_factor);
                int width = int(w * x_factor);
                int height = int(h * y_factor);
                boxes.push_back(cv::Rect(left, top, width, height));
            }

        }

        data += 85;

    }

    std::vector<int> nms_result;
    cv::dnn::NMSBoxes(boxes, confidences, SCORE_THRESHOLD, NMS_THRESHOLD, nms_result);
    for (int i = 0; i < nms_result.size(); i++) {
        int idx = nms_result[i];
        Detection result;
        result.class_id = class_ids[idx];
        result.confidence = confidences[idx];
        result.box = boxes[idx];
        output.push_back(result);
    }
}

int main(int argc, char **argv)
{

    std::vector<std::string> class_list = load_class_list();

    cv::Mat frame;
    cv::VideoCapture capture("sample.mp4");
    if (!capture.isOpened())
    {
        std::cerr << "Error opening video file\n";
        return -1;
    }

    bool is_cuda = argc > 1 && strcmp(argv[1], "cuda") == 0;

    cv::dnn::Net net;
    load_net(net, is_cuda);

    auto start = std::chrono::high_resolution_clock::now();
    int frame_count = 0;
    float fps = -1;
    int total_frames = 0;

    while (true)
    {
        capture.read(frame);
        if (frame.empty())
        {
            std::cout << "End of stream\n";
            break;
        }

        std::vector<Detection> output;
        detect(frame, net, output, class_list);

        frame_count++;
        total_frames++;

        int detections = output.size();

        for (int i = 0; i < detections; ++i)
        {

            auto detection = output[i];
            auto box = detection.box;
            auto classId = detection.class_id;
            const auto color = colors[classId % colors.size()];
            cv::rectangle(frame, box, color, 3);

            cv::rectangle(frame, cv::Point(box.x, box.y - 20), cv::Point(box.x + box.width, box.y), color, cv::FILLED);
            cv::putText(frame, class_list[classId].c_str(), cv::Point(box.x, box.y - 5), cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 0));
        }

        if (frame_count >= 30)
        {

            auto end = std::chrono::high_resolution_clock::now();
            fps = frame_count * 1000.0 / std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();

            frame_count = 0;
            start = std::chrono::high_resolution_clock::now();
        }

        if (fps > 0)
        {

            std::ostringstream fps_label;
            fps_label << std::fixed << std::setprecision(2);
            fps_label << "FPS: " << fps;
            std::string fps_label_str = fps_label.str();

            cv::putText(frame, fps_label_str.c_str(), cv::Point(10, 25), cv::FONT_HERSHEY_SIMPLEX, 1, cv::Scalar(0, 0, 255), 2);
        }

        cv::imshow("output", frame);

        if (cv::waitKey(1) != -1)
        {
            capture.release();
            std::cout << "finished by user\n";
            break;
        }
    }

    std::cout << "Total frames: " << total_frames << "\n";

    return 0;
}

More details can be found in this repository: https://github.com/doleron/yolov5-opencv-cpp-python

@glenn-jocher (Member) commented Jan 18, 2022

@doleron thanks for the examples! I've added a link to your repo on the export tutorial in https://docs.ultralytics.com/yolov5/tutorials/model_export

@jebastin-nadar (Contributor)

@doleron I think YOLOv5 expects inputs in [0, 1] without any mean subtraction; just dividing by 255 should be enough.

blob = cv2.dnn.blobFromImage(image, 1/255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=False)

@doleron commented Jan 18, 2022

@doleron I think YOLOv5 expects inputs in [0, 1] without any mean subtraction, just dividing by 255 should be enough.

blob = cv2.dnn.blobFromImage(image, 1/255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=False)

@SamFC10 You're right. I just edited the code. Thanks!

@alimousavi1377

I have to run YOLOv5 for my project, but I don't know how to run it. Previously we used OpenCV to load models, labels and weights,
but now YOLOv5 does not support this structure. Can anybody help me with it?

@doleron commented Jan 28, 2022

I have to run YOLOv5 for my project, but I don't know how to run it. Previously we used OpenCV to load models, labels and weights, but now YOLOv5 does not support this structure. Can anybody help me with it?

@alimousavi1377 YOLOv5 does support this structure. Check #239 (comment) and #6309 (comment) for runnable examples of using YOLOv5 with built-in/custom models. In addition, if you really want to use OpenCV, check the C++/Python example a few replies above to learn how to use .onnx files, OpenCV and YOLOv5.

@haimat (Contributor) commented Feb 16, 2022

Thanks guys for this thread, it helped me a lot. One question though: any ideas how to use the YOLOv5 augment feature when running ONNX via CV2? Or would I need to implement it manually in my own code?

@doleron commented Feb 16, 2022

Hi @haimat! As far as I understand, data augmentation only makes sense at training time. Thus, once model training is finished, the final model structure/topology does not reflect any of the augmentation hyperparameterization set for training. The only influence of augmentation is on the dataset preparation, in order to achieve better weight generalization power.
In summary, IMO no action is needed for the ONNX conversion or for future model usage.
PS: I'm only a YOLO user. Please wait for a more accurate/reliable position from the Ultralytics team though.
PS2: are you facing some specific ONNX conversion error?

@glenn-jocher (Member) commented Feb 16, 2022

@haimat Test Time Augmentation (TTA) flag --augment is only applied to PyTorch and TorchScript inference:

yolov5/models/common.py

Lines 395 to 400 in 1ff4370

def forward(self, im, augment=False, visualize=False, val=False):
    # YOLOv5 MultiBackend inference
    b, ch, h, w = im.shape  # batch, channel, height, width
    if self.pt or self.jit:  # PyTorch
        y = self.model(im) if self.jit else self.model(im, augment=augment, visualize=visualize)
        return y if val else y[0]

@doleron see TTA tutorial for more info:

YOLOv5 Tutorials

Good luck 🍀 and let us know if you have any other questions!

@haimat (Contributor) commented Feb 16, 2022

@glenn-jocher Thanks for your reply, I was expecting something like that. So in other words, if I want to use CV2 + ONNX + TTA, I need to implement the TTA part in my own code, right?

@glenn-jocher (Member)

@haimat well, that's an option. The TTA code could also go in the DetectMultiBackend() forward method. It just depends on what level the code lives at; right now it's at a low level inside the torch and torchvision models.

@kXborg commented Apr 21, 2022

Hi all,
If you are looking for a thorough analysis and implementation of YOLOv5 with OpenCV DNN, check out our LearnOpenCV blog post here.

@akbarali2019 commented Apr 29, 2022

@glenn-jocher

python detect.py --weights best.onnx --dnn --source 0

When I use the above command, it works and detects well on my custom dataset. The problem is that it shows the class label as "person", but my custom dataset has only one class, labeled "ball". How can I change it to "ball"?

[screenshot of a detection labeled "person"]

@glenn-jocher (Member)

@akbarali2019 for ONNX inference class names are handled automatically. For DNN inference you must pass your --data yaml to detect.py to retrieve class names:

python detect.py --data DATA.yaml
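For example, a minimal single-class data YAML might look like this (the paths are hypothetical placeholders; nc and names are what matter for label retrieval):

# DATA.yaml (hypothetical)
train: ../datasets/ball/images/train
val: ../datasets/ball/images/val
nc: 1  # number of classes
names: ['ball']  # class names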

@AbinJilson commented Jun 17, 2022

[quotes @doleron's Python and C++ OpenCV DNN example from earlier in this thread]

When I run this code with my own custom ONNX file, I get this error:

  File "C:\Users\acer\.spyder-py3\metallic surface defect detection\untitled3.py", line 57, in wrap_detection
    if confidence >= 0.4:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Can anybody help me fix this?

@alkhalisy

def wrap_detection(input_image, output_data):
    class_ids = []
    confidences = []
    boxes = []

    rows = output_data.shape[0]

    image_width, image_height, _ = input_image.shape

    x_factor = image_width / INPUT_WIDTH
    y_factor = image_height / INPUT_HEIGHT

    for r in range(rows):
        row = output_data[r]
        confidence = row[4]
        if confidence >= 0.4:

            classes_scores = row[5:]
            _, _, _, max_indx = cv2.minMaxLoc(classes_scores)
            class_id = max_indx[1]
            if (classes_scores[class_id] > .25):
                confidences.append(confidence)

                class_ids.append(class_id)

                x, y, w, h = row[0].item(), row[1].item(), row[2].item(), row[3].item()
                left = int((x - 0.5 * w) * x_factor)
                top = int((y - 0.5 * h) * y_factor)
                width = int(w * x_factor)
                height = int(h * y_factor)
                box = np.array([left, top, width, height])
                boxes.append(box)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45)

    result_class_ids = []
    result_confidences = []
    result_boxes = []

    for i in indexes:
        result_confidences.append(confidences[i])
        result_class_ids.append(class_ids[i])
        result_boxes.append(boxes[i])

    return result_class_ids, result_confidences, result_boxes

It returns this error:

Traceback (most recent call last):
  File "H:\workspace\my_phd_project\yolov5live_opencv_DNN_onnx\yolov5_opencv DNN onnx_5.py", line 126, in <module>
    class_ids, confidences, boxes = wrap_detection(inputImage, outs[0])
  File "H:\workspace\my_phd_project\yolov5live_opencv_DNN_onnx\yolov5_opencv DNN onnx_5.py", line 64, in wrap_detection
    if confidence >= 0.4:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Please, what is the problem?

@kXborg commented Jul 20, 2022

@alkhalisy,

Just check the shape of the outs once; see the shape-check sketch after the code below.

In my case, I had to structure the code the following way. Check out the source.

def post_process(input_image, outputs):
      # Lists to hold respective values while unwrapping.
      class_ids = []
      confidences = []
      boxes = []
      # Rows.
      rows = outputs[0].shape[1]
      image_height, image_width = input_image.shape[:2]
      # Resizing factor.
      x_factor = image_width / INPUT_WIDTH
      y_factor =  image_height / INPUT_HEIGHT
      # Iterate through detections.
      for r in range(rows):
            row = outputs[0][0][r]
            confidence = row[4]
            # Discard bad detections and continue.
            if confidence >= CONFIDENCE_THRESHOLD:
                  classes_scores = row[5:]
                  # Get the index of max class score.
                  class_id = np.argmax(classes_scores)
                  #  Continue if the class score is above threshold.
                  if (classes_scores[class_id] > SCORE_THRESHOLD):
                        confidences.append(confidence)
                        class_ids.append(class_id)
                        cx, cy, w, h = row[0], row[1], row[2], row[3]
                        left = int((cx - w/2) * x_factor)
                        top = int((cy - h/2) * y_factor)
                        width = int(w * x_factor)
                        height = int(h * y_factor)
                        box = np.array([left, top, width, height])
                        boxes.append(box)
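A quick way to do that shape check (a sketch, assuming net and the INPUT_WIDTH/INPUT_HEIGHT constants are already defined as above):

blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=False)
net.setInput(blob)
outs = net.forward(net.getUnconnectedOutLayersNames())
print([o.shape for o in outs])  # a default 640 export should give [(1, 25200, 85)]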

@alkhalisy commented Jul 20, 2022


Dear Kukil, thanks for your response.
The code above, from your repository, is the format for the function, but the same error still happens:
"if confidence >= CONFIDENCE_THRESHOLD:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
Please, can you help me with this?


@kXborg commented Jul 22, 2022

Hi @alkhalisy,

I checked, and yes, I am able to reproduce the error. The yolov5s.onnx model is not the right one; it looks like something went wrong while converting it to ONNX. I found the other two models, yolov5n.onnx and yolov5m.onnx, working fine.

Checking the shape of the output, I observed [1, 3, 80, 80, 85]. It should be [25200 × 85] for default 640 exports.

Please try with the rest of the available models and verify.

You can use the converter notebook to get the correct yolov5s.onnx model. Also, make sure to use torch==1.11 while doing so.

I will be updating the code in some time.
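If you want to verify an exported file without running inference at all, the onnx package can print the declared output shapes (a sketch; assumes pip install onnx):

import onnx

model = onnx.load("yolov5s.onnx")
for out in model.graph.output:
    dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)  # a good default export should report something like [1, 25200, 85]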

@alkhalisy

Dear Kukil, thanks a lot for your help. You are right: it now works with yolov5n.onnx. Waiting for your update to the code.

@alkhalisy commented Jul 22, 2022

import cv2
import numpy as np

# Constants.
INPUT_WIDTH = 640
INPUT_HEIGHT = 640
SCORE_THRESHOLD = 0.5
NMS_THRESHOLD = 0.45
CONFIDENCE_THRESHOLD = 0.45

# Text parameters.
FONT_FACE = cv2.FONT_HERSHEY_SIMPLEX
FONT_SCALE = 0.7
THICKNESS = 1

# Colors.
BLACK = (0, 0, 0)
BLUE = (255, 178, 50)
YELLOW = (0, 255, 255)
RED = (0, 0, 255)

# Load class names.
classesFile = "models/coco.names"
classes = None
with open(classesFile, 'rt') as f:
    classes = f.read().rstrip('\n').split('\n')

# Give the weight file to the model and load the network.
modelWeights = "models/yolov5n.onnx"
net = cv2.dnn.readNet(modelWeights)

def image_show(frames):
    cv2.imshow('proctoring', frames)

# video capture function
def video_capture(source):
    # source = 0 for web camera 1
    video_frames = cv2.VideoCapture(source)
    return video_frames

def draw_label(input_image, label, left, top):
    """Draw text onto image at location."""
    # Get text size.
    text_size = cv2.getTextSize(label, FONT_FACE, FONT_SCALE, THICKNESS)
    dim, baseline = text_size[0], text_size[1]
    # Use text size to create a BLACK rectangle.
    cv2.rectangle(input_image, (left, top), (left + dim[0], top + dim[1] + baseline), BLACK, cv2.FILLED)
    # Display text inside the rectangle.
    cv2.putText(input_image, label, (left, top + dim[1]), FONT_FACE, FONT_SCALE, YELLOW, THICKNESS, cv2.LINE_AA)

def pre_process(input_image, net):
    # Create a 4D blob from a frame.
    blob = cv2.dnn.blobFromImage(input_image, 1 / 255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=False)

    # Set the input to the network.
    net.setInput(blob)

    # Run the forward pass to get output of the output layers.
    output_layers = net.getUnconnectedOutLayersNames()
    outputs = net.forward(output_layers)
    # print(outputs[0].shape)

    return outputs

def post_process(input_image, outputs):
    # Lists to hold respective values while unwrapping.
    class_ids = []
    confidences = []
    boxes = []

    # Rows.
    rows = outputs[0].shape[1]

    image_height, image_width = input_image.shape[:2]

    # Resizing factor.
    x_factor = image_width / INPUT_WIDTH
    y_factor = image_height / INPUT_HEIGHT

    # Iterate through the 25200 detections.
    for r in range(rows):
        row = outputs[0][0][r]
        confidence = row[4]

        # Discard bad detections and continue.
        if confidence >= CONFIDENCE_THRESHOLD:
            classes_scores = row[5:]

            # Get the index of the max class score.
            class_id = np.argmax(classes_scores)

            # Continue if the class score is above threshold.
            if classes_scores[class_id] > SCORE_THRESHOLD:
                confidences.append(confidence)
                class_ids.append(class_id)

                cx, cy, w, h = row[0], row[1], row[2], row[3]

                left = int((cx - w / 2) * x_factor)
                top = int((cy - h / 2) * y_factor)
                width = int(w * x_factor)
                height = int(h * y_factor)

                box = np.array([left, top, width, height])
                boxes.append(box)

    # Perform non-maximum suppression to eliminate redundant overlapping boxes
    # with lower confidences.
    indices = cv2.dnn.NMSBoxes(boxes, confidences, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
    for i in indices:
        box = boxes[i]
        left = box[0]
        top = box[1]
        width = box[2]
        height = box[3]
        cv2.rectangle(input_image, (left, top), (left + width, top + height), BLUE, 3 * THICKNESS)
        label = "{}:{:.2f}".format(classes[class_ids[i]], confidences[i])
        draw_label(input_image, label, left, top)

    return input_image

def yolo5_detect(ca_images):
    frame = ca_images
    # Load image.
    # frame = cv2.imread('sample.jpg')

    # Process image.
    detections = pre_process(frame, net)
    img = post_process(frame.copy(), detections)

    # Put efficiency information. getPerfProfile returns the overall time for
    # inference (t) and the timings for each of the layers (in layersTimes).
    t, _ = net.getPerfProfile()
    label = 'Inference time: %.2f ms' % (t * 1000.0 / cv2.getTickFrequency())
    print(label)
    cv2.putText(img, label, (20, 40), FONT_FACE, FONT_SCALE, RED, THICKNESS, cv2.LINE_AA)

    return img

if __name__ == '__main__':
    frame_cp = video_capture(0)
    while frame_cp.isOpened():
        _, frame = frame_cp.read()
        if frame is None:
            print("End of stream")
            break
        # images1 = head_pos(frame)
        # images2 = mouth_open(images1)
        images3 = yolo5_detect(frame)
        image_show(images3)

        if cv2.waitKey(5) & 0xFF == 27:
            break
    frame_cp.release()
    cv2.destroyAllWindows()

Dear Kukil, I try to capture from the webcam as in the code above. The program works with no errors, but it is slow. Is there any idea to solve this problem? Especially since I need to use other models in the same program, taking the output image from one model and feeding it as input to other models; with this the execution becomes very slow. For now I am trying just the YOLO model, but it already looks slow.

@kXborg commented Jul 25, 2022

Dear Kukil, thanks a lot for your help. You are right: it now works with yolov5n.onnx. Waiting for your update to the code.

The repo has been updated.

@KishoreElvicto

cv2.error: OpenCV(4.6.0) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\onnx\onnx_importer.cpp:1040: error: (-2:Unspecified error) in function 'cv::dnn::dnn4_v20220524::ONNXImporter::handleNode'

Node [Identity@ai.onnx]:(onnx_node!Identity_0) parse error: OpenCV(4.6.0) D:\a\opencv-python\opencv-python\opencv\modules\dnn\src\layer.cpp:246: error: (-215:Assertion failed) inputs.size() in function 'cv::dnn::dnn4_v20220524::Layer::getMemoryShapes'

How do I fix this error?


@Calviansyah

@leeyunhome I've exported a YOLOv5s.onnx model at 640x640 here. It has two outputs, boxes (25200,4), and classes (25200,80). https://github.com/ultralytics/yolov5/releases/download/v4.0/yolov5s.onnx

[screenshot of the exported model's outputs]

The page is not found. Has it expired?

@glenn-jocher (Member) commented Dec 17, 2022

@Calviansyah 👋 Hello! Thanks for asking about Export Formats. YOLOv5 🚀 offers export to almost all of the common export formats. See our TFLite, ONNX, CoreML, TensorRT Export Tutorial for full details.

Formats

YOLOv5 inference is officially supported in 11 formats:

💡 ProTip: Export to ONNX or OpenVINO for up to 3x CPU speedup. See CPU Benchmarks.
💡 ProTip: Export to TensorRT for up to 5x GPU speedup. See GPU Benchmarks.

Format                  export.py --include    Model
PyTorch                 -                      yolov5s.pt
TorchScript             torchscript            yolov5s.torchscript
ONNX                    onnx                   yolov5s.onnx
OpenVINO                openvino               yolov5s_openvino_model/
TensorRT                engine                 yolov5s.engine
CoreML                  coreml                 yolov5s.mlmodel
TensorFlow SavedModel   saved_model            yolov5s_saved_model/
TensorFlow GraphDef     pb                     yolov5s.pb
TensorFlow Lite         tflite                 yolov5s.tflite
TensorFlow Edge TPU     edgetpu                yolov5s_edgetpu.tflite
TensorFlow.js           tfjs                   yolov5s_web_model/
PaddlePaddle            paddle                 yolov5s_paddle_model/

Benchmarks

Benchmarks below run on a Colab Pro with the YOLOv5 tutorial notebook. To reproduce:

python benchmarks.py --weights yolov5s.pt --imgsz 640 --device 0

Colab Pro V100 GPU

benchmarks: weights=/content/yolov5/yolov5s.pt, imgsz=640, batch_size=1, data=/content/yolov5/data/coco128.yaml, device=0, half=False, test=False
Checking setup...
YOLOv5 🚀 v6.1-135-g7926afc torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)
Setup complete ✅ (8 CPUs, 51.0 GB RAM, 46.7/166.8 GB disk)

Benchmarks complete (458.07s)
                   Format  mAP@0.5:0.95  Inference time (ms)
0                 PyTorch        0.4623                10.19
1             TorchScript        0.4623                 6.85
2                    ONNX        0.4623                14.63
3                OpenVINO           NaN                  NaN
4                TensorRT        0.4617                 1.89
5                  CoreML           NaN                  NaN
6   TensorFlow SavedModel        0.4623                21.28
7     TensorFlow GraphDef        0.4623                21.22
8         TensorFlow Lite           NaN                  NaN
9     TensorFlow Edge TPU           NaN                  NaN
10          TensorFlow.js           NaN                  NaN

Colab Pro CPU

benchmarks: weights=/content/yolov5/yolov5s.pt, imgsz=640, batch_size=1, data=/content/yolov5/data/coco128.yaml, device=cpu, half=False, test=False
Checking setup...
YOLOv5 🚀 v6.1-135-g7926afc torch 1.10.0+cu111 CPU
Setup complete ✅ (8 CPUs, 51.0 GB RAM, 41.5/166.8 GB disk)

Benchmarks complete (241.20s)
                   Format  mAP@0.5:0.95  Inference time (ms)
0                 PyTorch        0.4623               127.61
1             TorchScript        0.4623               131.23
2                    ONNX        0.4623                69.34
3                OpenVINO        0.4623                66.52
4                TensorRT           NaN                  NaN
5                  CoreML           NaN                  NaN
6   TensorFlow SavedModel        0.4623               123.79
7     TensorFlow GraphDef        0.4623               121.57
8         TensorFlow Lite        0.4623               316.61
9     TensorFlow Edge TPU           NaN                  NaN
10          TensorFlow.js           NaN                  NaN

Export a Trained YOLOv5 Model

This command exports a pretrained YOLOv5s model to TorchScript and ONNX formats. yolov5s.pt is the 'small' model, the second-smallest model available. Other options are yolov5n.pt, yolov5m.pt, yolov5l.pt and yolov5x.pt, along with their P6 counterparts, i.e. yolov5s6.pt, or your own custom training checkpoint, i.e. runs/exp/weights/best.pt. For details on all available models please see our README table.

python export.py --weights yolov5s.pt --include torchscript onnx

💡 ProTip: Add --half to export models at FP16 half precision for smaller file sizes

Output:

export: data=data/coco128.yaml, weights=['yolov5s.pt'], imgsz=[640, 640], batch_size=1, device=cpu, half=False, inplace=False, train=False, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=12, verbose=False, workspace=4, nms=False, agnostic_nms=False, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['torchscript', 'onnx']
YOLOv5 🚀 v6.2-104-ge3e5122 Python-3.7.13 torch-1.12.1+cu113 CPU

Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5s.pt to yolov5s.pt...
100% 14.1M/14.1M [00:00<00:00, 274MB/s]

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients

PyTorch: starting from yolov5s.pt with output shape (1, 25200, 85) (14.1 MB)

TorchScript: starting export with torch 1.12.1+cu113...
TorchScript: export success ✅ 1.7s, saved as yolov5s.torchscript (28.1 MB)

ONNX: starting export with onnx 1.12.0...
ONNX: export success ✅ 2.3s, saved as yolov5s.onnx (28.0 MB)

Export complete (5.5s)
Results saved to /content/yolov5
Detect:          python detect.py --weights yolov5s.onnx 
Validate:        python val.py --weights yolov5s.onnx 
PyTorch Hub:     model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.onnx')
Visualize:       https://netron.app/

The 3 exported models will be saved alongside the original PyTorch model. Netron Viewer is recommended for visualizing exported models.

Exported Model Usage Examples

detect.py runs inference on exported models:

python detect.py --weights yolov5s.pt                 # PyTorch
                           yolov5s.torchscript        # TorchScript
                           yolov5s.onnx               # ONNX Runtime or OpenCV DNN with --dnn
                           yolov5s_openvino_model     # OpenVINO
                           yolov5s.engine             # TensorRT
                           yolov5s.mlmodel            # CoreML (macOS only)
                           yolov5s_saved_model        # TensorFlow SavedModel
                           yolov5s.pb                 # TensorFlow GraphDef
                           yolov5s.tflite             # TensorFlow Lite
                           yolov5s_edgetpu.tflite     # TensorFlow Edge TPU
                           yolov5s_paddle_model       # PaddlePaddle

val.py runs validation on exported models:

python val.py --weights yolov5s.pt                 # PyTorch
                        yolov5s.torchscript        # TorchScript
                        yolov5s.onnx               # ONNX Runtime or OpenCV DNN with --dnn
                        yolov5s_openvino_model     # OpenVINO
                        yolov5s.engine             # TensorRT
                        yolov5s.mlmodel            # CoreML (macOS Only)
                        yolov5s_saved_model        # TensorFlow SavedModel
                        yolov5s.pb                 # TensorFlow GraphDef
                        yolov5s.tflite             # TensorFlow Lite
                        yolov5s_edgetpu.tflite     # TensorFlow Edge TPU
                        yolov5s_paddle_model       # PaddlePaddle

Use PyTorch Hub with exported YOLOv5 models:

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'custom', 'yolov5s.pt')
                                                       'yolov5s.torchscript ')       # TorchScript
                                                       'yolov5s.onnx')               # ONNX Runtime
                                                       'yolov5s_openvino_model')     # OpenVINO
                                                       'yolov5s.engine')             # TensorRT
                                                       'yolov5s.mlmodel')            # CoreML (macOS Only)
                                                       'yolov5s_saved_model')        # TensorFlow SavedModel
                                                       'yolov5s.pb')                 # TensorFlow GraphDef
                                                       'yolov5s.tflite')             # TensorFlow Lite
                                                       'yolov5s_edgetpu.tflite')     # TensorFlow Edge TPU
                                                       'yolov5s_paddle_model')       # PaddlePaddle

# Images
img = 'https://ultralytics.com/images/zidane.jpg'  # or file, Path, PIL, OpenCV, numpy, list

# Inference
results = model(img)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.

OpenCV DNN inference

OpenCV inference with ONNX models:

python export.py --weights yolov5s.pt --include onnx

python detect.py --weights yolov5s.onnx --dnn  # detect
python val.py --weights yolov5s.onnx --dnn  # validate

C++ Inference

YOLOv5 OpenCV DNN C++ inference on exported ONNX model examples:

YOLOv5 OpenVINO C++ inference examples:

Good luck 🍀 and let us know if you have any other questions!

@alkhalisy

Dear, I tried to remove the P3 and P5 detections and keep only P4, made the required changes in the neck, and everything worked well. But when I try to delete C5 from the feature-extraction backbone and make the corresponding modifications in the neck, this error happens:
"File "/content/yolov5/models/yolo.py", line 334, in
args.append([ch[x] for x in f])
IndexError: list index out of range"
Why can't I make any changes to the backbone?

nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [30,61, 62,45, 59,119]  # P4/16

backbone:
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],    # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],    # 3-P3/8
   [-1, 3, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],    # 5-P4/16
   [-1, 3, C3, [512]],
   [-1, 1, SPPF, [512, 5]],       # 9
  ]

head:
  [[[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],   # 20 (P4/16-medium)

   [[20], 1, Detect, [nc, anchors]],  # Detect(P4)
  ]

@glenn-jocher (Member)

@alkhalisy the issue you are encountering in modifying the backbone of the YOLOv5 model might be due to incorrect indexing or layer shapes. Since YOLOv5 expects a specific structure for its backbone and neck, modifying them without adjusting the subsequent layers, concatenations, or connections can lead to errors such as "IndexError: list index out of range".

When modifying the YOLOv5 backbone and neck, ensure that the changes maintain the overall structure and input/output shapes required by the subsequent layers and the head. Additionally, verify that the connections between the backbone, neck, and head are updated accordingly.

If you still encounter errors, you might consider sharing the complete modified configuration of the backbone, neck, and head, or provide more details about the specific changes you made. This will help in diagnosing the issue more effectively.
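One concrete thing to check in the config above: models/yolo.py numbers layers sequentially across backbone + head, and every `from` index must point at an already-built layer. Deleting backbone layers shifts all later indices, so references like [[-1, 6], 1, Concat, [1]] or [[-1, 14], ...] may now point past the end of the list, which produces exactly this IndexError. A small sketch to spot such stale references (hypothetical file name; a simplified check that ignores other negative offsets):

import yaml

cfg = yaml.safe_load(open("yolov5s-p4.yaml"))  # hypothetical modified config
layers = cfg["backbone"] + cfg["head"]
for i, (f, n, m, args) in enumerate(layers):
    froms = f if isinstance(f, list) else [f]
    for x in froms:
        if x >= 0 and x >= i:  # a non-negative 'from' must reference an earlier layer
            print(f"layer {i} ({m}): 'from' index {x} is out of range")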
