
Is it possible to get output like AutoFlip? #10479

Closed · 1 task done
akashAD98 opened this issue Dec 12, 2022 · 8 comments
Labels: question (Further information is requested), Stale

Comments

@akashAD98
Contributor

Search before asking

Question

https://google.github.io/mediapipe/solutions/autoflip.html
I want to use object detection code that will detect the person, crop the video around them, and save the result.

So my question is: by detecting only the person class, can we achieve this kind of solution?

If we draw a bounding box for a person and then increase its width and height, will that give the perfect aspect ratio?

@glenn-jocher @AyushExel

Additional

No response
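The bounding-box expansion idea from the question can be tested directly: grow the shorter side of the box around its center until it matches a target aspect ratio, then clamp to the frame. A minimal sketch, assuming pixel coordinates; the helper name and the 9:16 target are illustrative, not part of YOLOv5:

```python
def expand_to_aspect(x1, y1, x2, y2, target_ar, frame_w, frame_h):
    """Grow a box around its center until width/height == target_ar,
    then clamp to the frame (clamping may re-break the ratio at edges)."""
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w / 2, y1 + h / 2
    if w / h < target_ar:   # box too narrow -> widen it
        w = h * target_ar
    else:                   # box too wide -> make it taller
        h = w / target_ar
    nx1 = max(0, cx - w / 2)
    ny1 = max(0, cy - h / 2)
    nx2 = min(frame_w, cx + w / 2)
    ny2 = min(frame_h, cy + h / 2)
    return nx1, ny1, nx2, ny2

# e.g. a 100x200 person box widened to a 9:16 portrait window
box = expand_to_aspect(100, 100, 200, 300, 9 / 16, 1920, 1080)
```

Away from the frame edges every returned box has exactly the target ratio; near the edges the clamp wins, which is usually the desired behavior for video reframing.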

@akashAD98 akashAD98 added the question Further information is requested label Dec 12, 2022
@glenn-jocher
Member

@akashAD98 👋 Hello! Thanks for asking about cropping results with YOLOv5 🚀. Cropping bounding box detections can be useful for training classification models on box contents for example. This feature was added in PR #2827. You can crop detections using either detect.py or YOLOv5 PyTorch Hub:

detect.py

Crops will be saved under runs/detect/exp/crops, with a directory for each class detected.

python detect.py --save-crop

[Example images: original frame and the saved crop]

YOLOv5 PyTorch Hub

Crops will be saved under runs/detect/exp/crops if save=True, and also returned as a dictionary with crops as numpy arrays.

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5m, yolov5l, yolov5x, custom

# Images
img = 'https://ultralytics.com/images/zidane.jpg'  # or file, Path, PIL, OpenCV, numpy, list

# Inference
results = model(img)

# Results
crops = results.crop(save=True) 
# -- or --
crops = results.crop(save=True, save_dir='runs/detect/exp')  # specify save dir

Good luck 🍀 and let us know if you have any other questions!

@akashAD98
Contributor Author

@glenn-jocher I don't want to crop images; I want to reframe the whole video so it stays focused on the person, like AutoFlip.

@akashAD98
Contributor Author

akashAD98 commented Dec 13, 2022

@glenn-jocher My question is: using only object detection plus some code logic, is it possible to achieve results like AutoFlip?

The image below shows that they reframe the video based on the person in the frame. Can we achieve this using only bounding boxes?

[image: AutoFlip reframing example]

I tried adding extra width and height to the detected person's bounding box, but each bounding box has a different size, so it's hard to keep the width and height consistent and save the video. I'm looking for your suggestions.

@glenn-jocher
Member

glenn-jocher commented Dec 18, 2022

@akashAD98 I'd just create a custom python workflow after loading a YOLOv5 model with PyTorch Hub. Some details are below, but basically you can handle the video cropping however you like based on the contents, i.e. detected people and their boxes.

You might want some filtering to smooth transitions etc over time.
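For example, an exponential moving average over the crop box keeps the virtual camera from jittering frame to frame. A small sketch; the class name and alpha value are illustrative, not part of YOLOv5:

```python
class BoxSmoother:
    """Exponential moving average over (cx, cy, w, h) crop boxes.

    Lower alpha = smoother, slower camera; higher alpha = more responsive.
    """

    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.state = None

    def update(self, box):
        """box: (cx, cy, w, h) for the current frame; returns smoothed box."""
        if self.state is None:
            self.state = list(box)  # first frame: no history yet
        else:
            self.state = [self.alpha * b + (1 - self.alpha) * s
                          for b, s in zip(box, self.state)]
        return tuple(self.state)
```

Feed it the detected person box each frame and crop with the returned box instead of the raw detection; a sudden one-frame detection jump then moves the crop only partway toward the new position.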

YOLOv5 🚀 PyTorch Hub models allow for simple model loading and inference in a pure python environment without using detect.py.

Simple Inference Example

This example loads a pretrained YOLOv5s model from PyTorch Hub as model and passes an image for inference. 'yolov5s' is the YOLOv5 'small' model. For details on all available models please see the README. Custom models can also be loaded, including custom trained PyTorch models and their exported variants, i.e. ONNX, TensorRT, TensorFlow, OpenVINO YOLOv5 models.

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # yolov5n - yolov5x6 official model
#                                            'custom', 'path/to/best.pt')  # custom model

# Images
im = 'https://ultralytics.com/images/zidane.jpg'  # or file, Path, URL, PIL, OpenCV, numpy, list

# Inference
results = model(im)

# Results
results.print()  # or .show(), .save(), .crop(), .pandas(), etc.
results.xyxy[0]  # im predictions (tensor)

results.pandas().xyxy[0]  # im predictions (pandas)
#      xmin    ymin    xmax   ymax  confidence  class    name
# 0  749.50   43.50  1148.0  704.5    0.874023      0  person
# 2  114.75  195.75  1095.0  708.0    0.624512      0  person
# 3  986.00  304.00  1028.0  420.0    0.286865     27     tie

results.pandas().xyxy[0].value_counts('name')  # class counts (pandas)
# person    2
# tie       1
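If you need center-based x, y, w, h instead of corner coordinates, the same Detections object also exposes `results.xywh[0]`; equivalently they can be derived from the pandas columns above. A sketch using the first person row from the example output:

```python
import pandas as pd

# First person row, mirroring the pandas output above
df = pd.DataFrame({'xmin': [749.50], 'ymin': [43.50],
                   'xmax': [1148.0], 'ymax': [704.5]})

# Convert corner (xmin, ymin, xmax, ymax) to center-based (cx, cy, w, h)
df['w'] = df['xmax'] - df['xmin']
df['h'] = df['ymax'] - df['ymin']
df['cx'] = df['xmin'] + df['w'] / 2
df['cy'] = df['ymin'] + df['h'] / 2
```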

See YOLOv5 PyTorch Hub Tutorial for details.

Good luck 🍀 and let us know if you have any other questions!

@akashAD98
Contributor Author

@glenn-jocher I can crop images and save them easily, but cropping a real-time video while keeping the focus on the person is, I think, a different problem. How can I achieve this using PyTorch Hub?

@akashAD98
Contributor Author

@glenn-jocher I'm using torch.hub for model loading and other operations. I was able to detect the person class and I'm getting results, but I'm looking for the x, y, w, h values from the results. How should I get these values, and can we modify them? I also want to make some custom modifications; help is really appreciated.

This is my logic:

import cv2
import torch

cap = cv2.VideoCapture('/content/withbackgroung_op.mp4')
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
model.classes = [0]

I'm loading video using torch.hub & want to add this modifications

        # crops: list of (x, y, w, h) person boxes from the detections
        person = max(crops, key=lambda box: box[2] * box[3])

        # Get the person's bounding box
        x, y, w, h = person

        # Determine the desired size for the person
        person_size = 200

        # Calculate the aspect ratio of the person box
        aspect_ratio = w / h

        # Calculate the width and height of the resized frame
        new_width = int(person_size * aspect_ratio)
        new_height = int(person_size / aspect_ratio)

        # Resize the frame
        frame = cv2.resize(frame, (new_width, new_height))


@github-actions
Contributor

github-actions bot commented Feb 12, 2023

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

@github-actions github-actions bot added the Stale label Feb 12, 2023
@github-actions github-actions bot closed this as not planned Feb 23, 2023
@glenn-jocher
Member

@akashAD98 You can certainly achieve real-time object detection and video cropping using a YOLOv5 model loaded with PyTorch Hub. You can process the detection results to perform custom cropping and resizing on the video frames as per your requirements. Below is a modified version of your logic that shows how to accomplish this with some simple custom modifications:

import cv2
import torch

# Load the YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
model.classes = [0]  # detect only the person class

cap = cv2.VideoCapture('/content/withbackgroung_op.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    results = model(frame)  # perform object detection on the frame
    dets = results.xyxy[0]  # one (x1, y1, x2, y2, conf, cls) row per detection

    if len(dets):
        # Use the first (highest-confidence) person detection
        x1, y1, x2, y2 = map(int, dets[0, :4])
        w, h = x2 - x1, y2 - y1
        if w > 0 and h > 0:
            person_size = 200
            aspect_ratio = w / h
            new_width = int(person_size * aspect_ratio)
            new_height = person_size
            # Crop to the person box and resize to a fixed height
            frame = cv2.resize(frame[y1:y2, x1:x2], (new_width, new_height))

    # Show the modified frame or save it as required
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

This snippet processes each video frame, performs YOLOv5 object detection on it, and then crops and resizes the frame based on the detected person. Feel free to further modify and customize the logic as per your specific use case.
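One practical note on saving the result: cv2.VideoWriter requires every written frame to have one fixed size, so rather than resizing variable-sized crops it often works better to slide a fixed-size window over the person. A sketch under that assumption; the output size, file name, and helper name are illustrative:

```python
# Fixed portrait output size (illustrative); every written frame must match it.
OUT_W, OUT_H = 360, 640

def crop_rect(cx, cy, frame_w, frame_h, out_w=OUT_W, out_h=OUT_H):
    """Fixed-size window centered on (cx, cy), shifted to stay inside the frame."""
    x1 = min(max(0, int(cx - out_w / 2)), max(0, frame_w - out_w))
    y1 = min(max(0, int(cy - out_h / 2)), max(0, frame_h - out_h))
    return x1, y1, x1 + out_w, y1 + out_h

# With OpenCV, the writer side would then look like (sketch):
#   writer = cv2.VideoWriter('reframed.mp4', cv2.VideoWriter_fourcc(*'mp4v'),
#                            30.0, (OUT_W, OUT_H))
#   ... per frame: x1, y1, x2, y2 = crop_rect(person_cx, person_cy, W, H)
#                  writer.write(frame[y1:y2, x1:x2])
#   writer.release()
```

Because the window size never changes, no per-frame resize is needed and the output aspect ratio is constant regardless of how large the detected person box is.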

Let me know if you need any further assistance or modifications!
