Save Left Right camera videos and then construct depth from it later #13346

Open
Hasnain1997-ai opened this issue Sep 14, 2024 · 37 comments

@Hasnain1997-ai

Hi,

I’m new to using RealSense cameras and don’t know much about them, so I need some help.

I have a RealSense D415 camera connected to my Xavier, and I’m working with it using Python.

I want to save the left and right camera videos along with all the necessary configurations. Later, when I need to (since I’ll have the left and right videos and the configuration saved), I want to reconstruct the depth frames using the left and right videos and the saved configurations with the camera connected to the Xavier.

How can I achieve this using Python?
If you can provide any code, that would be great.

Thanks.

Details:

  • Camera: Intel RealSense D415
  • Firmware Version: 5.16.0.1
  • pyrealsense2 Version: 2.55.1.6486
@MartyG-RealSense
Collaborator

MartyG-RealSense commented Sep 15, 2024

Hi @Hasnain1997-ai. Intel's Depth from Stereo beginner guide at the link below provides Python code for using rectified left and right images with block matching to create a depth map.

https://github.com/IntelRealSense/librealsense/blob/master/doc/depth-from-stereo.md

Alternatively, you could use OpenCV's StereoBM depth engine to create the depth image from left and right images, as described at #5950 (comment)

https://docs.opencv.org/4.x/dd/d53/tutorial_py_depthmap.html
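For illustration, a minimal sketch of the OpenCV block-matching approach might look like the following. It assumes you already have a pair of rectified left and right infrared images saved as grayscale files; the file names, StereoBM parameters, baseline and focal length values are placeholders to replace with your own calibration data.

import cv2
import numpy as np

# Load previously saved rectified left/right infrared images (hypothetical file names)
left = cv2.imread("left_0001.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_0001.png", cv2.IMREAD_GRAYSCALE)

# Block matching: numDisparities must be a multiple of 16, blockSize must be odd
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed point -> pixels

# depth = baseline * focal_length / disparity
baseline_m = 0.055   # assumed D415 stereo baseline in meters
fx = 640.0           # assumed focal length in pixels - replace with your intrinsics
with np.errstate(divide="ignore", invalid="ignore"):
    depth_m = np.where(disparity > 0, baseline_m * fx / disparity, 0.0)

# Save a normalized 8-bit visualization of the result
vis = cv2.normalize(depth_m, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite("depth_estimate.png", vis)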

@Hasnain1997-ai
Author

I used both of the methods mentioned, but I am not getting the same results as when I save the depth frame directly from the RealSense D415.

@MartyG-RealSense
Collaborator

When a RealSense 400 Series camera constructs a depth frame from raw left and right images, it does so in a piece of hardware inside the camera called the Vision Processor D4 Board that applies a distortion model and rectification to the depth image.

If you construct the depth frame manually yourself then this may account for the differences between a RealSense depth frame and your self-constructed depth frame (which will not have been processed on the Vision Processor D4 board).

@Hasnain1997-ai
Author

Hasnain1997-ai commented Sep 20, 2024

Suppose I have deployed the camera in an operational environment and saved the left and right camera videos (as they are lighter than the depth video). After returning from the deployment, I want to run some tests on the depth frames. I currently have:

  • A D415 camera connected to my Xavier.

  • The left and right camera videos.

Is there any code I can use to reconstruct the depth frames using the camera's Vision Processor D4 Board with my saved left and right videos?

@MartyG-RealSense
Collaborator

To have that level of control over the camera's raw streams at the point of capture in the camera hardware, you would likely have to make use of RealSense's Low-Level Device API

https://dev.intelrealsense.com/docs/api-architecture#low-level-device-api

There are very few programs that make use of the Low-Level Device API, so there are not many programming references. An SDK program that does make use of the API is rs-data-collect, as described at IntelRealSense/realsense-ros#1409

@Hasnain1997-ai
Author

My goal is to create a depth video from the saved left and right camera videos. The depth video should be the same as one recorded directly from the RealSense D415.
I am unable to use the Low-Level Device API because I know very little about it.

How can I achieve this?

@MartyG-RealSense
Collaborator

Whilst it may be possible to combine single left and right image frames into a depth image in OpenCV, I am not aware of a way to do this with a live stream of images, unfortunately. It is not usually something that is attempted with RealSense cameras because the camera creates the depth frame for you automatically.

The RealSense SDK does have an interface called software-device that enables you to generate frames in OpenCV and feed them into the RealSense SDK via software-device. It works better with C++ language than it does with Python though.

https://github.com/IntelRealSense/librealsense/tree/master/examples/software-device

@Hasnain1997-ai
Author

Can I replace the camera's left and right frames with the left and right frames from a video?

Explanation:
I have saved left and right video streams.
When I connect the camera and start it, it generates the depth map using the real-time frames captured by the left and right cameras.
What I want to do is replace the camera's left and right frames with the saved video's left and right frames, so that the camera sees the saved video frames as its current input and then generates the depth map from the saved videos.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Sep 23, 2024

You could do that if you recorded the frames into a .bag format file, which is like a video recording of camera data.

If you insert an enable_device_from_file() config instruction before the pipeline start line then the script can use the data in the bag file as its data source instead of a live camera.

https://github.com/IntelRealSense/librealsense/blob/master/wrappers/python/examples/read_bag_example.py#L42
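As a rough sketch of that workflow (the file path, stream settings and frame count below are placeholder assumptions), recording to a .bag file and then playing it back could look like this:

import pyrealsense2 as rs

# --- Recording: save the live depth stream into a .bag file ---
record_cfg = rs.config()
record_cfg.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
record_cfg.enable_record_to_file("capture.bag")      # hypothetical output path
record_pipe = rs.pipeline()
record_pipe.start(record_cfg)
for _ in range(300):                                 # roughly 10 seconds at 30 FPS
    record_pipe.wait_for_frames()
record_pipe.stop()

# --- Playback: use the .bag file as the data source instead of a live camera ---
play_cfg = rs.config()
rs.config.enable_device_from_file(play_cfg, "capture.bag")
play_pipe = rs.pipeline()
play_pipe.start(play_cfg)
frames = play_pipe.wait_for_frames()
depth = frames.get_depth_frame()
print("Distance at image center:", depth.get_distance(320, 240), "meters")
play_pipe.stop()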

@Hasnain1997-ai
Author

Hi MartyG,
I used the .bag method, but the saved file is too large.
The data for a single minute is around 1 GB (i.e. if I record for hours, the file will grow to terabytes).
I want to reduce the file size.

@MartyG-RealSense
Collaborator

Bag files are the best way of storing RealSense data but yes, they are multi-GB files. There is not much that can be done to reduce the size except for using a lower stream resolution or FPS speed.

@Hasnain1997-ai
Author

So how can I work around this?

My use case is that I have to deploy the camera in an operational environment and save the data there,
and then process that data and construct depth maps from it in the lab.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Oct 1, 2024

I would strongly recommend not creating your own custom depth frame and just using the RealSense depth stream. All the work of combining the left and right frames is done for you, and you only need to have the depth stream enabled when recording and not the left and right infrared streams too.

If you need the stored data to be compact and you only need single frames and not to capture continuously like a video then exporting the depth map to a .png image file with Python code might work well for you. This file will have a very small size. A lot of the depth value information of the coordinates is lost when saving to .png, but you should have a visually accurate record of the depth map.
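A minimal sketch of exporting a single colorized depth frame to .png could look like the following; the stream settings and output file name are assumptions:

import numpy as np
import pyrealsense2 as rs
import cv2

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

frames = pipeline.wait_for_frames()
depth = frames.get_depth_frame()

# Colorize the depth frame and write it out as a small, viewable .png snapshot
colorizer = rs.colorizer()
depth_rgb = np.asanyarray(colorizer.colorize(depth).get_data())
cv2.imwrite("depth_snapshot.png", cv2.cvtColor(depth_rgb, cv2.COLOR_RGB2BGR))

pipeline.stop()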

@Hasnain1997-ai
Author

Hasnain1997-ai commented Oct 1, 2024

My use case is that I have to deploy the camera in an operational environment and save data there (video data).
I am not interested in creating my own custom depth frame, but how can I save the RealSense depth stream apart from the .bag format?
The .bag format takes a huge amount of space.

@MartyG-RealSense
Collaborator

#2731 (comment) has a Python script for saving depth and color to .avi video file. You could edit that script to remove the color references so that it only saves depth to .avi video.
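For reference, a hedged sketch of that idea, writing the colorized depth stream to an .avi file with OpenCV's VideoWriter, might look like this (the codec, output path, resolution and duration are assumptions):

import numpy as np
import pyrealsense2 as rs
import cv2

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

colorizer = rs.colorizer()
fourcc = cv2.VideoWriter_fourcc(*"XVID")
writer = cv2.VideoWriter("depth.avi", fourcc, 30, (640, 480))

try:
    for _ in range(300):                       # roughly 10 seconds at 30 FPS
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        if not depth:
            continue
        # Colorize the 16-bit depth so it can be stored as ordinary video frames
        depth_rgb = np.asanyarray(colorizer.colorize(depth).get_data())
        writer.write(cv2.cvtColor(depth_rgb, cv2.COLOR_RGB2BGR))
finally:
    writer.release()
    pipeline.stop()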

@Hasnain1997-ai
Author

Suppose I save the depth to an .avi video file.
Now how can I use it in the RealSense D415 depth code?
I.e. how can I replace the live depth with this saved depth, or how can I even use it in depth calculations?

@MartyG-RealSense
Collaborator

An .avi video file will play back in a standard video player application and provide a visual record of the depth map. It is not importable back into the RealSense SDK though to perform further calculation on it.

You can use Python code to save a single frame of RealSense depth data to a bag file with an instruction called 'save_single_frameset()' and so greatly reduce the file size. A Python example script for this can be found at the link below.

https://github.com/soarwing52/RealsensePython/blob/master/separate%20functions/single_frameset.py

A bag file can be imported into a RealSense script and used as the data source by the script as though it were a live camera.

If you want to save a continuous video-like stream of camera data that can be imported back into the RealSense SDK and have calculations done on it, a multi-gigabyte .bag file is the only option, unfortunately.
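A small sketch of the save_single_frameset() approach might look like the following; save_single_frameset is a processing block, and the SDK chooses the output .bag file name, so everything else here is assumed setup:

import pyrealsense2 as rs

pipeline = rs.pipeline()
pipeline.start()

# Capture one coherent frameset and write it to a small single-frame .bag file
frames = pipeline.wait_for_frames()
saver = rs.save_single_frameset()
saver.process(frames)

pipeline.stop()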

@Hasnain1997-ai
Author

So apart from .bag there is no other method to save the data and then use it later?
Nor will saving the stereo left and right camera videos (and using them to reconstruct depth) work?

@MartyG-RealSense
Collaborator

Aside from the bag format, you could generate a depth point cloud and export it to a .ply format file. You can then import the .ply into a 3D tool such as MeshLab to do further work on the data.

https://www.andreasjakl.com/capturing-3d-point-cloud-intel-realsense-converting-mesh-meshlab/
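As an illustration, a minimal sketch of exporting a textured point cloud to a .ply file (stream settings and output path are assumptions) could be:

import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.rgb8, 30)
pipeline.start(config)

frames = pipeline.wait_for_frames()
depth = frames.get_depth_frame()
color = frames.get_color_frame()

# Build a point cloud from the depth frame, texture it with color and export to .ply
pc = rs.pointcloud()
pc.map_to(color)
points = pc.calculate(depth)
points.export_to_ply("cloud.ply", color)   # the .ply can then be opened in MeshLab

pipeline.stop()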

You can also save the depth data to a .csv file, which is a text-based representation of the coordinates and their values that can be imported into a database or spreadsheet application.

.ply and .csv cannot be imported back into the RealSense SDK though.


The SDK does have an interface called software-device for importing data in formats not usually supported for import, such as .png images. The example below demonstrates using .png images to generate a depth point cloud in the SDK.

https://github.com/IntelRealSense/librealsense/tree/master/examples/software-device

It should also be possible to feed video frames into the software-device interface from OpenCV.

@Hasnain1997-ai
Author

Is there any Python code for software-device?

@MartyG-RealSense
Collaborator

There are not many examples of Python code for software-device. A couple of references that do have scripts are #7057 and #12677

@Hasnain1997-ai
Author

I used this code to store the images (depth and RGB):

import cv2
import pyrealsense2 as rs
import numpy as np
import os
import time

fps = 30                  # frame rate
tv = 1000.0 / fps         # time interval between frames in milliseconds

max_num_frames  = 100      # max number of framesets to be captured into npy files and processed with software device
depth_folder = "npys_Saving/depth"
color_folder = "npys_Saving/color"

# make sure the output folders exist before saving (np.save does not create them)
os.makedirs(depth_folder, exist_ok=True)
os.makedirs(color_folder, exist_ok=True)

depth_file_name = "depth"  # depth_file_name + str(i) + ".npy"
color_file_name = "color"  # color_file_name + str(i) + ".npy"

# intrinsic and extrinsic from the camera
camera_depth_intrinsics          = rs.intrinsics()  # camera depth intrinsics
camera_color_intrinsics          = rs.intrinsics()  # camera color intrinsics
camera_depth_to_color_extrinsics = rs.extrinsics()  # camera depth to color extrinsics


######################## Start of first part - capture images from live device #######################################
# stream depth and color on attached realsense camera and save depth and color frames into files with npy format
try:
    # create a context object, this object owns the handles to all connected realsense devices
    ctx = rs.context()
    devs = list(ctx.query_devices())
    
    if len(devs) > 0:
        print("Devices: {}".format(devs))
    else:
        print("No camera detected. Please connect a realsense camera and try again.")
        exit(0)
    
    pipeline = rs.pipeline()

    # configure streams
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, fps)
    config.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, fps)

    # start streaming with pipeline and get the configuration
    cfg = pipeline.start(config)
    
    # get intrinsics
    camera_depth_profile = cfg.get_stream(rs.stream.depth)                                      # fetch depth stream profile
    camera_depth_intrinsics = camera_depth_profile.as_video_stream_profile().get_intrinsics()   # downcast to video_stream_profile and fetch intrinsics
    
    camera_color_profile = cfg.get_stream(rs.stream.color)                                      # fetch color stream profile
    camera_color_intrinsics = camera_color_profile.as_video_stream_profile().get_intrinsics()   # downcast to video_stream_profile and fetch intrinsics
    
    camera_depth_to_color_extrinsics = camera_depth_profile.get_extrinsics_to(camera_color_profile)
 
    print("camera depth intrinsic:", camera_depth_intrinsics)
    print("camera color intrinsic:", camera_color_intrinsics)
    print("camera depth to color extrinsic:", camera_depth_to_color_extrinsics)

    print("streaming attached camera and save depth and color frames into files with npy format ...")

    i = 0
    while i < max_num_frames:
        # wait until a new coherent set of frames is available on the device
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        color = frames.get_color_frame()

        if not depth or not color: continue
        
        # convert images to numpy arrays
        depth_image = np.asanyarray(depth.get_data())
        color_image = np.asanyarray(color.get_data())
  
        # save images in npy format
        depth_file = os.path.join(depth_folder, depth_file_name + str(i) + ".npy")
        color_file = os.path.join(color_folder, color_file_name + str(i) + ".npy")
        print("saving frame set ", i, depth_file, color_file)
        
        with open(depth_file, 'wb') as f1:
            np.save(f1,depth_image)
        
        with open(color_file, 'wb') as f2:
            np.save(f2,color_image)

        # next frameset
        i = i +1

except Exception as e:
    print(e)
    pass

######################## End of first part - capture images from live device #######################################

I am using this code to load the .npy files and feed the images (depth and RGB) into the software device:

import cv2
import pyrealsense2 as rs
import numpy as np
import os
import time

fps = 30  # frame rate
tv = 1000.0 / fps  # time interval between frames in milliseconds

max_num_frames = 100  # max number of framesets to be processed
depth_folder = "npys_Saving/depth"
color_folder = "npys_Saving/color"

depth_file_name = "depth"  # depth_file_name + str(i) + ".npy"
color_file_name = "color"  # color_file_name + str(i) + ".npy"

# intrinsic and extrinsic from the camera
camera_depth_intrinsics = rs.intrinsics()  # camera depth intrinsics
camera_color_intrinsics = rs.intrinsics()  # camera color intrinsics
camera_depth_to_color_extrinsics = rs.extrinsics()  # camera depth to color extrinsics


######################## Start of second part - align depth to color in software device #############################
# align depth to color with the above precaptured images in the software device

# software device
sdev = rs.software_device()

# software depth sensor
depth_sensor: rs.software_sensor = sdev.add_sensor("Depth")

# depth intrinsics
depth_intrinsics = rs.intrinsics()

# Assuming you already have intrinsic/extrinsic from previous data (replace later if needed)
depth_intrinsics.width = camera_depth_intrinsics.width
depth_intrinsics.height = camera_depth_intrinsics.height

depth_intrinsics.ppx = camera_depth_intrinsics.ppx
depth_intrinsics.ppy = camera_depth_intrinsics.ppy

depth_intrinsics.fx = camera_depth_intrinsics.fx
depth_intrinsics.fy = camera_depth_intrinsics.fy

depth_intrinsics.coeffs = camera_depth_intrinsics.coeffs  ## [0.0, 0.0, 0.0, 0.0, 0.0]
depth_intrinsics.model = camera_depth_intrinsics.model  ## rs.pyrealsense2.distortion.brown_conrady

# Depth stream
depth_stream = rs.video_stream()
depth_stream.type = rs.stream.depth
depth_stream.width = depth_intrinsics.width
depth_stream.height = depth_intrinsics.height
depth_stream.fps = fps
depth_stream.bpp = 2  # depth z16 2 bytes per pixel
depth_stream.fmt = rs.format.z16
depth_stream.intrinsics = depth_intrinsics
depth_stream.index = 0
depth_stream.uid = 1

depth_profile = depth_sensor.add_video_stream(depth_stream)

# software color sensor
color_sensor: rs.software_sensor = sdev.add_sensor("Color")

# color intrinsic:
color_intrinsics = rs.intrinsics()
color_intrinsics.width = camera_color_intrinsics.width
color_intrinsics.height = camera_color_intrinsics.height

color_intrinsics.ppx = camera_color_intrinsics.ppx
color_intrinsics.ppy = camera_color_intrinsics.ppy

color_intrinsics.fx = camera_color_intrinsics.fx
color_intrinsics.fy = camera_color_intrinsics.fy

color_intrinsics.coeffs = camera_color_intrinsics.coeffs
color_intrinsics.model = camera_color_intrinsics.model

color_stream = rs.video_stream()
color_stream.type = rs.stream.color
color_stream.width = color_intrinsics.width
color_stream.height = color_intrinsics.height
color_stream.fps = fps
color_stream.bpp = 3  # color stream rgb8, 3 bytes per pixel in this example
color_stream.fmt = rs.format.rgb8
color_stream.intrinsics = color_intrinsics
color_stream.index = 0
color_stream.uid = 2

color_profile = color_sensor.add_video_stream(color_stream)

# Depth to color extrinsics
depth_to_color_extrinsics = rs.extrinsics()
depth_to_color_extrinsics.rotation = camera_depth_to_color_extrinsics.rotation
depth_to_color_extrinsics.translation = camera_depth_to_color_extrinsics.translation
depth_profile.register_extrinsics_to(depth_profile, depth_to_color_extrinsics)

# Start software sensors
depth_sensor.open(depth_profile)
color_sensor.open(color_profile)

# Synchronize frames from depth and color streams
camera_syncer = rs.syncer()
depth_sensor.start(camera_syncer)
color_sensor.start(camera_syncer)

# Create a depth alignment object
# Align depth frame to color frame
align_to = rs.stream.color
align = rs.align(align_to)

# Colorizer for depth rendering
colorizer = rs.colorizer()

# Loop through pre-captured frames
for i in range(0, max_num_frames):
    print("\nProcessing frame set:", i)

    # Pre-captured depth and color image file paths in npy format
    df = os.path.join(depth_folder, depth_file_name + str(i) + ".npy")
    cf = os.path.join(color_folder, color_file_name + str(i) + ".npy")

    if not os.path.exists(cf) or not os.path.exists(df):
        continue

    # Load depth frame from pre-captured npy file
    print('Loading depth frame:', df)
    depth_npy = np.load(df, mmap_mode='r')

    # Create software depth frame
    depth_swframe = rs.software_video_frame()
    depth_swframe.stride = depth_stream.width * depth_stream.bpp
    depth_swframe.bpp = depth_stream.bpp
    depth_swframe.timestamp = i * tv
    depth_swframe.pixels = depth_npy.copy()
    depth_swframe.domain = rs.timestamp_domain.hardware_clock
    depth_swframe.frame_number = i
    depth_swframe.profile = depth_profile.as_video_stream_profile()

    depth_sensor.on_video_frame(depth_swframe)

    # Load the color frame from pre-captured npy file
    print('Loading color frame:', cf)
    color_npy = np.load(cf, mmap_mode='r')

    # Create software color frame
    color_swframe = rs.software_video_frame()
    color_swframe.stride = color_stream.width * color_stream.bpp
    color_swframe.bpp = color_stream.bpp
    color_swframe.timestamp = i * tv
    color_swframe.pixels = color_npy.copy()
    color_swframe.domain = rs.timestamp_domain.hardware_clock
    color_swframe.frame_number = i
    color_swframe.profile = color_profile.as_video_stream_profile()

    color_sensor.on_video_frame(color_swframe)

    # Synchronize depth and color, receive as frameset
    frames = camera_syncer.wait_for_frames()

    # Get unaligned depth frame
    unaligned_depth_frame = frames.get_depth_frame()
    if not unaligned_depth_frame:
        continue

    # Align depth frame to color frame
    aligned_frames = align.process(frames)
    aligned_depth_frame = aligned_frames.get_depth_frame()
    color_frame = aligned_frames.get_color_frame()

    if not aligned_depth_frame or not color_frame:
        continue

    aligned_depth_frame = colorizer.colorize(aligned_depth_frame)

    npy_aligned_depth_image = np.asanyarray(aligned_depth_frame.get_data())
    npy_color_image = np.asanyarray(color_frame.get_data())

    # Display images side by side:
    images = np.hstack((npy_aligned_depth_image, npy_color_image))
    cv2.namedWindow('Aligned Depth & Color', cv2.WINDOW_NORMAL)
    cv2.imshow('Aligned Depth & Color', images)

    # Render the original unaligned depth as reference
    colorized_unaligned_depth_frame = colorizer.colorize(unaligned_depth_frame)
    npy_unaligned_depth_image = np.asanyarray(colorized_unaligned_depth_frame.get_data())

    cv2.imshow("Unaligned Depth", npy_unaligned_depth_image)

    key = cv2.waitKey(1)  # Wait for user input
    if key == 27:  # Press ESC to exit
        break

# Close all OpenCV windows
cv2.destroyAllWindows()

But during reconstruction I am getting empty depth and RGB frames, i.e.
np.asanyarray(frames.get_color_frame().get_data())
array([], shape=(0, 0, 3), dtype=uint8)

@Hasnain1997-ai
Author

So is this the correct method?
How can I solve this?

@MartyG-RealSense
Collaborator

The subject of loading npy files with np.load is outside of my programming knowledge unfortunately, so I do not have commentary to offer about the code's correctness. I do apologize.

Another RealSense user's approach to using np.load to load data from an npy file can be found at #10431 (comment)

Another approach I found to using np.load for the depth image is shown in the snapshot below.

[screenshot of a code snippet using np.load]

@Hasnain1997-ai
Author

I save the depth using:

        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        color = frames.get_color_frame()


        depth_image = np.asanyarray(depth.get_data())
        color_image = np.asanyarray(color.get_data())
        
  
        # save images in npy format
        depth_file = os.path.join(depth_folder, depth_file_name + str(i) + ".npy")
        color_file = os.path.join(color_folder, color_file_name + str(i) + ".npy")
        
        with open(depth_file, 'wb') as f1:
            np.save(f1,depth_image)
        
        with open(color_file, 'wb') as f2:
            np.save(f2,color_image)

Then I load it using this code:

sdev = rs.software_device()

# software depth sensor
depth_sensor: rs.software_sensor = sdev.add_sensor("Depth")

# depth intrinsics
depth_intrinsics = rs.intrinsics()

depth_intrinsics.width  = camera_depth_intrinsics.width
depth_intrinsics.height = camera_depth_intrinsics.height

depth_intrinsics.ppx = camera_depth_intrinsics.ppx
depth_intrinsics.ppy = camera_depth_intrinsics.ppy

depth_intrinsics.fx = camera_depth_intrinsics.fx
depth_intrinsics.fy = camera_depth_intrinsics.fy

depth_intrinsics.coeffs = camera_depth_intrinsics.coeffs       ## [0.0, 0.0, 0.0, 0.0, 0.0]
depth_intrinsics.model = camera_depth_intrinsics.model         ## rs.pyrealsense2.distortion.brown_conrady

#depth stream
depth_stream = rs.video_stream()
depth_stream.type = rs.stream.depth
depth_stream.width = depth_intrinsics.width
depth_stream.height = depth_intrinsics.height
depth_stream.fps = fps
depth_stream.bpp = 2                              # depth z16 2 bytes per pixel
depth_stream.fmt = rs.format.z16
depth_stream.intrinsics = depth_intrinsics
depth_stream.index = 0
depth_stream.uid = 1

depth_profile = depth_sensor.add_video_stream(depth_stream)

# software color sensor
color_sensor: rs.software_sensor = sdev.add_sensor("Color")

# color intrinsic:
color_intrinsics = rs.intrinsics()
color_intrinsics.width = camera_color_intrinsics.width
color_intrinsics.height = camera_color_intrinsics.height

color_intrinsics.ppx = camera_color_intrinsics.ppx
color_intrinsics.ppy = camera_color_intrinsics.ppy

color_intrinsics.fx = camera_color_intrinsics.fx
color_intrinsics.fy = camera_color_intrinsics.fy

color_intrinsics.coeffs = camera_color_intrinsics.coeffs
color_intrinsics.model = camera_color_intrinsics.model

color_stream = rs.video_stream()
color_stream.type = rs.stream.color
color_stream.width = color_intrinsics.width
color_stream.height = color_intrinsics.height
color_stream.fps = fps
color_stream.bpp = 3                                # color stream rgb8 3 bytes per pixel in this example
color_stream.fmt = rs.format.rgb8
color_stream.intrinsics = color_intrinsics
color_stream.index = 0
color_stream.uid = 2

color_profile = color_sensor.add_video_stream(color_stream)

# depth to color extrinsics
depth_to_color_extrinsics = rs.extrinsics()
depth_to_color_extrinsics.rotation = camera_depth_to_color_extrinsics.rotation
depth_to_color_extrinsics.translation = camera_depth_to_color_extrinsics.translation
depth_profile.register_extrinsics_to(depth_profile, depth_to_color_extrinsics)

# start software sensors
depth_sensor.open(depth_profile)
color_sensor.open(color_profile)

# synchronize frames from depth and color streams
camera_syncer = rs.syncer()
depth_sensor.start(camera_syncer)
color_sensor.start(camera_syncer)

# create a depth alignment object
# rs.align allows us to perform alignment of depth frames to other frames
# the "align_to" is the stream type to which we plan to align depth frames
# align depth frame to color frame
align_to = rs.stream.color
align = rs.align(align_to)

# colorizer for depth rendering
colorizer = rs.colorizer()

# use "Enter", "Spacebar", "p", keys to pause for 5 seconds
paused = False

# loop through pre-captured frames
for i in range(0, max_num_frames):
    print("\nframe set:", i)
    
    # pause for 5 seconds at frameset 15 to allow user to better observe the images rendered on screen
    if i == 15: paused = True

    # precaptured depth and color image files in npy format
    df = os.path.join(depth_folder, depth_file_name + str(i) + ".npy")
    cf = os.path.join(color_folder, color_file_name + str(i) + ".npy")

    if (not os.path.exists(cf)) or (not os.path.exists(df)): continue

    # load depth frame from precaptured npy file
    print('loading depth frame ', df)
    depth_npy = np.load(df, mmap_mode='r')

    # create software depth frame
    depth_swframe = rs.software_video_frame()
    depth_swframe.stride = depth_stream.width * depth_stream.bpp
    depth_swframe.bpp = depth_stream.bpp
    depth_swframe.timestamp = i * tv
    depth_swframe.pixels = depth_npy
    depth_swframe.domain = rs.timestamp_domain.hardware_clock
    depth_swframe.frame_number = i
    depth_swframe.profile = depth_profile.as_video_stream_profile()
    depth_swframe.pixels = depth_npy

    depth_sensor.on_video_frame(depth_swframe)

    # load color frame from precaptured npy file
    print('loading color frame ', cf)
    color_npy = np.load(cf, mmap_mode='r')
 
    # create software color frame
    color_swframe = rs.software_video_frame()
    color_swframe.stride = color_stream.width * color_stream.bpp
    color_swframe.bpp = color_stream.bpp
    color_swframe.timestamp = i * tv
    color_swframe.pixels = color_npy
    color_swframe.domain = rs.timestamp_domain.hardware_clock
    color_swframe.frame_number = i
    color_swframe.profile = color_profile.as_video_stream_profile()
    color_swframe.pixels = color_npy

    color_sensor.on_video_frame(color_swframe)
    
    # synchronize depth and color, receive as frameset
    frames = camera_syncer.wait_for_frames()
    print("frame set:", frames.size(), " ", frames)

    # get unaligned depth frame
    unaligned_depth_frame = frames.get_depth_frame()
    if not unaligned_depth_frame: continue

    # align depth frame to color frame
    aligned_frames = align.process(frames)

    aligned_depth_frame = aligned_frames.get_depth_frame()
    color_frame = aligned_frames.get_color_frame()

    if (not aligned_depth_frame) or (not color_frame): continue

    aligned_depth_frame = colorizer.colorize(aligned_depth_frame)
    
    print("converting frames into npy array")
    npy_aligned_depth_image = np.asanyarray(aligned_depth_frame.get_data())
    npy_color_image = np.asanyarray(color_frame.get_data())

Now I want to find the distance of a pixel, i.e.

depth_frame = unaligned_depth_frame.as_depth_frame()  # or aligned (I tried both)
x_tl, y_tl = int(top_left[0]), int(top_left[1])

depth_tl = depth_frame.get_distance(x_tl, y_tl)
if depth_tl == 0:
    print('Top-left depth value is zero')
    continue
point_tl = rs.rs2_deproject_pixel_to_point(depth_intrinsics, [x_tl, y_tl], depth_tl)
            

I am always getting depth_tl = 0.

Although while saving the data I checked that the depth was not zero (I only saved those frames for which the depth was not zero), when I use the code to load the saved depth and color .npy files and look up the depth again, it is zero.

How can I correct this? Do you have any suggestions about what I am doing wrong?

@MartyG-RealSense
Collaborator

#13346 (comment) is a good example of a Python script for retrieving a coordinate's real-world distance with the get_distance() instruction.
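For reference, a minimal live-camera sketch of that pattern, using get_distance() together with rs2_deproject_pixel_to_point() (the pixel coordinates are arbitrary placeholders):

import pyrealsense2 as rs

pipeline = rs.pipeline()
pipeline.start()

frames = pipeline.wait_for_frames()
depth = frames.get_depth_frame()

# get_distance() returns the real-world distance in meters for a pixel,
# with the camera's depth scale already applied
x, y = 320, 240
dist_m = depth.get_distance(x, y)

# Deproject that pixel into a 3D point using the depth stream's own intrinsics
intrin = depth.profile.as_video_stream_profile().get_intrinsics()
point = rs.rs2_deproject_pixel_to_point(intrin, [x, y], dist_m)
print("Distance:", dist_m, "m  3D point:", point)

pipeline.stop()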

@Hasnain1997-ai
Author

My question is that I am always getting depth_tl = 0.

Although while saving the data I checked that the depth was not zero (I only saved those frames for which the depth was not zero), when I load the saved depth and color .npy files with the code above and read the depth, it is zero.

How can I correct this? Do you have any suggestions about what I am doing wrong?

@MartyG-RealSense
Collaborator

I do not have any advice to offer about what the problem may be, unfortunately, as I do not have knowledge of software-device or .npy programming. I do apologize.

@Hasnain1997-ai
Author

So what method should I use to save the depth data?
For a pixel (x, y) I get the depth value with depth_frame.get_distance() and also
rs.rs2_deproject_pixel_to_point(depth_intrinsics, [x_tl, y_tl], depth_tl)

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Oct 5, 2024

An alternative way to get the real-world depth value of a coordinate without using rs2_deproject_pixel_to_point or depth-color alignment is to convert a color pixel into a depth pixel using rs2_project_color_pixel_to_depth_pixel. A Python script at #5603 (comment) demonstrates this principle.

@MartyG-RealSense
Collaborator

Hi @Hasnain1997-ai Do you require further assistance with this case, please? Thanks!

@Hasnain1997-ai
Author

So I saved the data using the code below:

        depth_image = np.asanyarray(depth.get_data())
        color_image = np.asanyarray(color.get_data())

        depth_file = os.path.join(depth_folder, depth_file_name + str(i) + ".npy")
        color_file = os.path.join(color_folder, color_file_name + str(i) + ".npy")
        print("saving frame set ", i, depth_file, color_file)
        
        with open(depth_file, 'wb') as f1:
            np.save(f1,depth_image)
        
        with open(color_file, 'wb') as f2:
            np.save(f2,color_image)

Then I estimated the dimensions using this saved data, for which I used the code:

depth_image, color_image = load_npy_files(depth_file, color_file)

# Create depth intrinsics (replace with your camera's actual values)
depth_intrinsics = rs.intrinsics()
depth_intrinsics.width = 640
depth_intrinsics.height = 480
depth_intrinsics.ppx = 320.788
depth_intrinsics.ppy = 238.423
depth_intrinsics.fx = 384.744
depth_intrinsics.fy = 384.744
depth_intrinsics.model = rs.distortion.brown_conrady
depth_intrinsics.coeffs = [0, 0, 0, 0, 0]

camera_depth_intrinsics          = rs.intrinsics()  # camera depth intrinsics
camera_color_intrinsics          = rs.intrinsics()  # camera color intrinsics
camera_depth_to_color_extrinsics = rs.extrinsics() 
ctx = rs.context()
devs = list(ctx.query_devices())

if len(devs) > 0:
    print("Devices: {}".format(devs))
else:
    print("No camera detected. Please connect a realsense camera and try again.")
    exit(0)

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color)
config.enable_stream(rs.stream.depth)
profile = pipeline.start(config)

depth_sensor = profile.get_device().first_depth_sensor()
depth_scale = depth_sensor.get_depth_scale() # Typical value for RealSense depth cameras, adjust if needed
scaled_depth_image = depth_image * depth_scale

def get_distance(scaled_depth_image, x, y):
    return scaled_depth_image[y, x]  # Already in meters

def rs2_deproject_pixel_to_point(intrinsics, pixel, depth):
    x = (pixel[0] - intrinsics.ppx) / intrinsics.fx
    y = (pixel[1] - intrinsics.ppy) / intrinsics.fy
    z = depth
    return [x * z, y * z, z]

depth_pixel = get_distance(scaled_depth_image, PIXEL_X, PIXEL_Y)
point_pixel = rs2_deproject_pixel_to_point(depth_intrinsics, [PIXEL_X, PIXEL_Y], depth_pixel)

So using this code I estimated the dimensions of the object, but there is a discrepancy between the original and the estimated dimensions.
The original dimension was 8.3 cm.
I was getting 13.1 cm.
Why is there this discrepancy, and why is the ratio (8.3/13.1) constant? I.e. if I use another object of a different size, I still get a difference of almost the same ratio (8.3/13.1).

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Oct 16, 2024

A possible cause of error could be that you are multiplying depth_image by the depth scale to calculate the scaled_depth_image value.

scaled_depth_image = depth_image * depth_scale
def get_distance(scaled_depth_image, x, y):

There are two main ways of calculating the real-world distance in meters: (1) multiply the raw pixel depth value by the camera's depth scale value; or (2) use the get_distance() instruction. One method or the other should be used, but not both methods in the same script, as get_distance() already automatically takes account of the camera's depth scale without you having to provide it.
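To make the distinction concrete, here is a small sketch of the two equivalent ways of obtaining meters for one pixel; only one of them should be applied to any given value (the pixel coordinates are placeholders):

import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
profile = pipeline.start()
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()  # 0.001 on a D415

frames = pipeline.wait_for_frames()
depth_frame = frames.get_depth_frame()
x, y = 320, 240

# Method 1: get_distance() already applies the depth scale and returns meters
meters_a = depth_frame.get_distance(x, y)

# Method 2: take the raw z16 value and apply the depth scale yourself, exactly once
raw = np.asanyarray(depth_frame.get_data())[y, x]
meters_b = raw * depth_scale

print(meters_a, meters_b)   # the two values should agree; never apply the scale twice

pipeline.stop()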

@Hasnain1997-ai
Author

Hasnain1997-ai commented Oct 16, 2024

So you are suggesting that, since I am using the def get_distance(scaled_depth_image, x, y) function to calculate the distance, I should set scaled_depth_image = depth_image * 1?
In this way I will get the correct results, right?

@MartyG-RealSense
Collaborator

If depth_image represents the pixel depth value then multiplying it by the D415's depth scale (which is 0.001) will give the coordinate's real-world distance in meters.

For example, if the pixel depth value was '4000' then 4000 x 0.001 = 4 (real-world meters).

@Hasnain1997-ai
Author

frames = realsense_pipeline.wait_for_frames()
aligned_frames = align_to_color.process(frames)
depth_image = aligned_frames.get_depth_frame()
np.save('Path.npy',depth_image)

I save the depth image as above.

So what should I do now to get the correct distance?

@MartyG-RealSense
Collaborator

Are you aiming to get the distance from a saved .npy file? If you are then gdlg/panoramic-depth-estimation#3 - whilst not a librealsense case - advises loading an .npy file containing depth data with the numpy instruction np.load (an instruction you are already using) to retrieve the depth value for each pixel from an array.
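In case it helps, a tiny sketch of reading a per-pixel distance from one of the saved .npy files; this assumes the file holds the raw z16 array saved earlier with np.save(), and the path, pixel coordinates and depth scale are assumptions:

import numpy as np

# Load the raw z16 depth array that was previously written with np.save()
depth_image = np.load("npys_Saving/depth/depth0.npy")

# Index the array as [row, column] = [y, x] and apply the depth scale once
depth_scale = 0.001        # D415 default depth scale
x, y = 320, 240
distance_m = depth_image[y, x] * depth_scale
print("Distance at ({}, {}): {:.3f} m".format(x, y, distance_m))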
