run the trt_yolov3.py error #419

Closed
kaiven11 opened this issue May 31, 2021 · 11 comments

Comments

@kaiven11

kaiven11 commented May 31, 2021

helmet.txt

  1. Environment
    OS: Ubuntu 18.04
    Python: 3.6.9

  2. Changed the class number in yolo.py
    category_num=5

  3. Changed the class name dict in yolov3_class.py

  4. Converted the weights to an ONNX file (success)

  5. Converted the ONNX file to a TRT file (success)

  6. Executed the command
    python3 trt_yolov3.py --model yolov3-416 --rtsp --uri rtsp://admin:abcd1234@192.168.3.11:554/h264/ch1/main/av_stream

    The error message:
    trt_outputs = [output.reshape(shape) for output, shape
    ValueError: cannot reshape array of size 5070 into shape (1,255,13,13)

cfg file:

helmet.txt

weights file:
Link: https://pan.baidu.com/s/1LQVdKnSkQQtLsUbJgDPi3w
Extraction code: u39i
I have tried many ways but still can't resolve the error. Please give me some advice. Thank you.

@jkjung-avt
Owner

"trt_yolov3.py" is a rather old version of the code. It has been replaced by "trt_yolo.py" now.

Please git pull the latest code from the repo and follow the steps in README.md to convert your model.
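
For context, the numbers in the error message are consistent with a class-count mismatch rather than a broken engine. The sketch below assumes the standard YOLOv3 head layout (3 anchors per scale, each with 4 box coordinates, 1 objectness score and one score per class) and assumes that the old script expects the 80-class COCO output shape; the helper names are made up for illustration and are not part of the repo.

```python
# Assumption: standard YOLOv3 head, 3 anchors per scale and
# (4 box coords + 1 objectness + num_classes) values per anchor.
def yolo_output_channels(num_classes, anchors_per_scale=3):
    """Number of channels in one YOLO output feature map."""
    return anchors_per_scale * (num_classes + 5)


def yolo_output_size(num_classes, grid_w, grid_h):
    """Total number of values in one YOLO output feature map."""
    return yolo_output_channels(num_classes) * grid_w * grid_h


grid = 416 // 32  # 13x13 grid for the stride-32 output of a 416x416 input

print(yolo_output_channels(80))          # 255, the channel count in the expected shape
print(yolo_output_size(5, grid, grid))   # 5070, the array size in the error
print(yolo_output_size(80, grid, grid))  # 43095, what shape (1,255,13,13) needs
```

With category_num=5 the engine emits 30 channels per grid cell, so its 5070 values cannot fill a (1,255,13,13) array.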

@kaiven11
Author

kaiven11 commented Jun 1, 2021

I pulled the latest code and followed the steps in README.md. When I try to execute "python onnx_to_tensorrt.py -m helmet",

the following error message is output:

image

How can I solve this? Thank you.

@kaiven11
Author

kaiven11 commented Jun 1, 2021

The error message:
message.txt

@jkjung-avt
Owner

jkjung-avt commented Jun 1, 2021

My code assumes the conv feature maps before the yolo layers are:

yolo_whs = [[w // 32, h // 32], [w // 16, h // 16], [w // 8, h // 8]]

But your custom model is using "// 4" instead of "// 8", i.e. you're using "stride=4" for the last yolo layer. As a quick fix, you could patch the above line of code with the following:

    yolo_whs = [[w // 32, h // 32], [w // 16, h // 16], [w // 4, h // 4]] 
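
To make the effect of the patch concrete, here is a minimal sketch that computes the grid sizes from a list of strides. The helper name `yolo_grid_sizes` and the 416x416 input size are assumptions for illustration; the repo itself hard-codes the list as shown above.

```python
def yolo_grid_sizes(w, h, strides=(32, 16, 8)):
    """Return the [width, height] of each YOLO feature map for the given strides."""
    return [[w // s, h // s] for s in strides]


# Default YOLOv3 strides for a 416x416 input (assumed size).
print(yolo_grid_sizes(416, 416))
# [[13, 13], [26, 26], [52, 52]]

# Custom model whose last yolo layer uses stride 4.
print(yolo_grid_sizes(416, 416, strides=(32, 16, 4)))
# [[13, 13], [26, 26], [104, 104]]
```

A model whose last yolo layer uses stride 4 produces a 104x104 map at this input size, which is why the "// 8" assumption no longer matches the ONNX output shapes.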

@kaiven11
Author

kaiven11 commented Jun 1, 2021

Thank you! It works.
image

@kaiven11
Author

kaiven11 commented Jun 1, 2021

The FPS is amazing!

image

@kaiven11
Author

kaiven11 commented Jun 1, 2021

I tried to use the TRT file for camera detection, but I ran into a new problem: the detector drew far too many boxes on the picture.
1622549417(1)

I tried modifying conf_th and nms_threshold in yolo_with_plugins.py, but it did not help. Why does this happen?

@jkjung-avt
Owner

This happens when your TensorRT inference runs too fast (it runs inference and draws detection boxes on the same image multiple times).

You could use the "--copy_frame" command-line option to get around that. (FPS would be slightly lower since the image needs to be copied before each inference.)
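
The effect can be illustrated with plain NumPy. This is only a sketch of the idea, not the repo's implementation: `draw_box` is a hypothetical stand-in for the drawing step, and the boxes are made-up detections.

```python
import numpy as np


def draw_box(img, box, value=255):
    """Draw a 1-pixel rectangle outline in place.

    Hypothetical stand-in for the repo's drawing step; not its actual API.
    """
    x1, y1, x2, y2 = box
    img[y1, x1:x2 + 1] = value   # top edge
    img[y2, x1:x2 + 1] = value   # bottom edge
    img[y1:y2 + 1, x1] = value   # left edge
    img[y1:y2 + 1, x2] = value   # right edge
    return img


def count_marked(img):
    """Count the pixels that have been drawn on."""
    return int(np.count_nonzero(img))


boxes = [(2, 2, 8, 8), (10, 10, 16, 16)]  # two made-up detections

# Case 1: the same buffer is reused, so boxes from successive passes pile up.
shared = np.zeros((20, 20), dtype=np.uint8)
draw_box(shared, boxes[0])
once = count_marked(shared)
draw_box(shared, boxes[1])
twice = count_marked(shared)

# Case 2: each pass draws on its own copy, so the source frame stays clean.
clean = np.zeros((20, 20), dtype=np.uint8)
per_pass = [count_marked(draw_box(clean.copy(), b)) for b in boxes]

print(once, twice)                    # 24 48
print(per_pass, count_marked(clean))  # [24, 24] 0
```

Copying costs one extra buffer copy per pass, which matches the slightly lower FPS mentioned above.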

@kaiven11
Author

kaiven11 commented Jun 1, 2021

I have used the "--copy_frame" option, and now I'm trying to run detection on 18 RTSP URLs, but I've hit a problem.
image

I think the TensorRT inference runs fast enough, so it should not show this message.

By the way, I use the following code to open multiple IP or RTSP cameras.

import os
from threading import Thread

import cv2
import numpy as np


def letterbox(img, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True):
    # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
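    if img is None:  # fail loudly instead of hitting a NameError on `shape` below
        raise ValueError('letterbox() received an empty frame')
    shape = img.shape[:2]  # current shape [height, width]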
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better test mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - \
        new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, 64), np.mod(dh, 64)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / \
            shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right,
                             cv2.BORDER_CONSTANT, value=color)  # add border
    return img, ratio, (dw, dh)

class LoadStreams:  # multiple IP or RTSP cameras
    def __init__(self, sources='streams.txt', img_size=640):
        self.mode = 'images'
        self.img_size = img_size
        self.cam_id = None

        if os.path.isfile(sources):
            with open(sources, 'r') as f:
                sources = [x.strip()
                           for x in f.read().splitlines() if len(x.strip())]
        else:
            sources = [sources]

        n = len(sources)
        self.imgs = [None] * n
        self.sources = sources
        for i, s in enumerate(sources):
            # Start the thread to read frames from the video stream
            print('%g/%g: %s... ' % (i + 1, n, s), end='')
            cap = cv2.VideoCapture(int(s) if s.isnumeric() else s)
            assert cap.isOpened(), 'Failed to open %s' % s
            w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
            h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
            fps = cap.get(cv2.CAP_PROP_FPS) % 100
            _, self.imgs[i] = cap.read()  # guarantee first frame
            thread = Thread(target=self.update, args=([i, cap]), daemon=True)
            print(' success (%gx%g at %.2f FPS).' % (w, h, fps))
            thread.start()
        print('')  # newline

        # check for common shapes

        s = np.stack([letterbox(x, new_shape=self.img_size)[
                     0].shape for x in self.imgs], 0)  # inference shapes
        # rect inference if all shapes equal
        self.rect = np.unique(s, axis=0).shape[0] == 1
        if not self.rect:
            print('WARNING: Different stream shapes detected. For optimal performance supply similarly-shaped streams.')

    def update(self, index, cap):
        self.cam_id = index
        # Read next stream frame in a daemon thread
        n = 0
        while cap.isOpened():
            n += 1
            # _, self.imgs[index] = cap.read()
            cap.grab()
            if n == 1:  # retrieve every frame (use n == 4 to keep only every 4th frame)
                _, self.imgs[index] = cap.retrieve()
                n = 0
            # time.sleep(0.01)  # wait time

    def __iter__(self):
        self.count = -1
        return self

    def __next__(self):
        self.count += 1
        img0 = self.imgs.copy()
        if cv2.waitKey(1) == ord('q'):  # q to quit
            cv2.destroyAllWindows()
            raise StopIteration

        # Letterbox
        img = [letterbox(x, new_shape=self.img_size, auto=self.rect)[0]
               for x in img0]

        # Stack
        img = np.stack(img, 0)

        # Convert
        # BGR to RGB, to bsx3x416x416
        img = img[:, :, :, ::-1].transpose(0, 3, 1, 2)
        img = np.ascontiguousarray(img)

        return self.sources, self.imgs, img0, None, self.cam_id, img

    def __len__(self):
        return 0  # 1E12 frames = 32 streams at 30 FPS for 30 years

When detecting on the images, I use it like this:

def loop_and_detect(trt_yolo, conf_th, vis, dataset):
    """Continuously capture images from camera and do object detection.

    # Arguments
      trt_yolo: the TRT YOLO object detector instance.
      conf_th: confidence/score threshold for object detection.
      vis: for visualization.
      dataset: the LoadStreams iterator that yields the latest frame of each stream.
    """

    for _, raw_image, _, _, cam_id, _ in dataset:
        for i, frame in enumerate(raw_image):

            boxes, confs, clss = trt_yolo.detect(frame, conf_th)
            frame = vis.draw_bboxes(frame, boxes, confs, clss)
            img_show = cv2.resize(frame, (348, 348))
            cv2.imshow(str(i), img_show)
            # print('.', end='', flush=True)

Is there a problem with my code?

The code file:

trt_yolo_cv.txt

@jkjung-avt
Owner

@kaiven11 Please debug the code by yourself. I don't have time to review it.

@kaiven11
Author

kaiven11 commented Jun 2, 2021

Thank you! I tried running the code and did not find the same problem.
