
Part 1 - MagiScan 3D and Blender to Create the Pikachu Model

  • Download the free app MagiScan 3D and follow its instructions to create the 3D model.
  • Once the model is ready, export it in .glb format. At this stage the 3D scan is raw and needs a cleanup.
  • Download Blender 3.6.3 and open it.
  • File > Import > glTF 2.0 > Load the model from MagiScan3D.
  • First, on the top right, adjust the view of the object. Then switch from Object Mode to Edit Mode.

  • Select all the vertices to be deleted > Right click > Delete Vertices.

  • The final model should be clean and should look as follows.

  • File > Export > FBX (.fbx) > In the right column, set Path Mode to Copy, enable the Embed Textures toggle next to it, and save.

  • UV Editing > Image > Save As... > Save the image texture of the object (as RGBA).
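
The Blender import, cleanup, and export steps above can also be scripted. The snippet below is a minimal sketch using Blender's Python API (bpy); the file paths are placeholders, and the vertex cleanup itself is easiest to do by hand in Edit Mode as described.

```python
# Minimal scripted alternative to the GUI steps above (Blender 3.6, Scripting workspace).
# File paths are placeholders; adjust them to your own model.
import bpy

# Import the raw scan exported from MagiScan 3D.
bpy.ops.import_scene.gltf(filepath="/path/to/pikachu.glb")

# Clean up the mesh manually in Edit Mode (delete stray vertices), then export.

# Export to FBX with the texture copied and embedded, matching the GUI settings.
bpy.ops.export_scene.fbx(
    filepath="/path/to/pikachu.fbx",
    path_mode="COPY",        # Path Mode > Copy
    embed_textures=True,     # the toggle next to Copy
)
```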

Part 2 - Unity Perception for Synthetic Data Generation

  • Download Unity Hub and Unity 2022.3.21f1 (Apple Silicon).

  • Start a new High Definition 3D project.

  • Window > Package Manager > Add package from git URL > Enter com.unity.perception.
  • Window > Package Manager > Perception > Samples > Tutorial Files > Import.

  • Project tab > Assets > Create a new folder called Scene.
  • Inside the Scene folder > Create > Scene, and call it TutorialScene, then double click on it.

  • In the Hierarchy panel, double click the Main Camera.
  • In the Inspector panel of the Main Camera, modify the values according to the image.

  • Still in the Inspector panel of the Main Camera, click Add Component and add Perception Camera.
  • Edit > Project Settings > Editor > disable Asynchronous Shader Compilation.

  • Project tab > Look for "HDRP High Fidelity" in the search tab > Lit Shader Mode > Both.

  • Main Camera > Inspector > Perception Camera (Script) > Camera Labelers > +, and add BoundingBox2DLabeler first, then SemanticSegmentationLabeler.

  • Project > Assets folder > Create > Perception > ID Label Config, and rename it TutorialIdLabelConfig.

  • Project > Assets folder > Create > Perception > Semantic Segmentation Label Config, and rename it TutorialSemanticSegmentationLabelConfig.

  • Main Camera > Perception Camera (Script) > Drag and drop the newly created files to the corresponding Camera Labelers Label Config (see image).

  • Project > Scene > Drag and drop the Pikachu model (.fbx), the model texture (.png), and the background image (.png).
  • Project > Scene > Create > Material > Drag and drop the model texture (.png) to the new material's Surface Inputs > Base Map.

  • Project > Scene > Drag and drop the Pikachu model into the Hierarchy. For the moment, the Pikachu will appear without color or texture.
  • Drag and drop the material ball onto the white Pikachu in the Scene view. The Pikachu should now appear colored.
  • Hierarchy > Pikachu object > Inspector > Add Component > Labeling > Use Automatic Labeling > Labeling Scheme > Use asset name > Add to Label Config... > Select both TutorialIdLabelConfig and TutorialSemanticSegmentationLabelConfig (Add Label for both).

  • Hierarchy > Right click > 3D Object > Cube > Drag the background image and drop it on the Cube object (which should now have the texture of the background image).

  • Hierarchy > Cube > Inspector > Adjust the values of Transform according to the image.

  • Before proceeding, it might be necessary to adjust the Directional Light to better lighting values.

  • Hierarchy > Pikachu object > Inspector > Add Component > Fixed Length Scenario > Add Randomizer > RotationRandomizer > Set the values shown in the image.

  • Lastly, still in the Pikachu object's Inspector > Add Component > Rotation Randomizer Tag (already visible in the image above).
  • Now, pressing the Play button starts the data generation.

  • To find where the images are saved: Edit > Project Settings > Perception > Solo Endpoint > Base Path is the folder where the outputs are collected (click Show Folder to open it).

Part 3 - Train YOLO Model and use it in Real Time

  • It is convenient to repeat the synthetic data generation process with the object in multiple positions in the frame. In this case, repeat the generation with 4 different position-size combinations.

  • Each of the 4 data generations now produces a folder of sequences. Sequences are generated rather than single frames because the first shot is blurry: the Perception package captures screenshots faster than the object can settle into its new pose. The structure of the Unity outputs is as follows.
data
 |
 └── pika1
 |    |
 |    └── annotation_definitions.json
 |    └── metadata.json
 |    └── metric_definition.json
 |    └── sensor_definitions.json
 |    └── sequence.0
 |    └── sequence.1
 |    └── sequence.2
 |    |    |
 |    |    └── step0.camera.png
 |    |    └── step0.camera.semantic.segmentation.png
 |    |    └── step0.frame_data.json
 |    |    └── ..
 |    |    └── step4.camera.png
 |    |    └── step4.camera.semantic.segmentation.png
 |    |    └── step4.frame_data.json
 |    |
 |    └── ..
 |    └── sequence.110
 |
 └── pika2
 └── pika3
 └── pika4
  • From each sequence, extract the last frame (the 5th). Together with the frame, we collect the bounding box from the corresponding .json annotation and save it in the YOLOv8 format, namely <class_id> <x_center> <y_center> <width> <height>, with coordinates normalized to the image size. The script that does this is code/extract_frame_and_data.py; a simplified sketch is shown below.
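
A simplified sketch of that step follows. It assumes the SOLO folder layout shown above and the usual key names of the Perception 2D bounding-box annotation (origin and dimension in pixels); code/extract_frame_and_data.py in the repository is the authoritative version and may differ in details.

```python
# Sketch of the frame/label extraction, assuming the SOLO layout shown above.
# The JSON key names follow the Perception bounding-box annotation format and
# may need adjusting; code/extract_frame_and_data.py is the reference.
import json
import shutil
from pathlib import Path

DATA_DIR = Path("data/pika1")       # one generation folder (repeat for pika2..4)
OUT_IMAGES = Path("dataset/images")
OUT_LABELS = Path("dataset/labels")
PREFIX = "v1"                       # v1..v4, one prefix per generation
LAST_STEP = 4                       # the 5th (last) frame of each sequence
CLASS_ID = 0                        # single class: Pikachu

OUT_IMAGES.mkdir(parents=True, exist_ok=True)
OUT_LABELS.mkdir(parents=True, exist_ok=True)

sequences = sorted(DATA_DIR.glob("sequence.*"), key=lambda p: int(p.suffix[1:]))
for i, seq in enumerate(sequences, start=1):
    frame = seq / f"step{LAST_STEP}.camera.png"
    meta = json.loads((seq / f"step{LAST_STEP}.frame_data.json").read_text())

    capture = meta["captures"][0]
    img_w, img_h = capture["dimension"]           # image size in pixels
    boxes = capture["annotations"][0]["values"]   # 2D bounding boxes

    lines = []
    for box in boxes:
        x, y = box["origin"]                      # top-left corner (pixels)
        w, h = box["dimension"]                   # box width/height (pixels)
        # YOLOv8 format: class id, then normalized center and size.
        xc, yc = (x + w / 2) / img_w, (y + h / 2) / img_h
        lines.append(f"{CLASS_ID} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}")

    shutil.copy(frame, OUT_IMAGES / f"{PREFIX}__{i}.png")
    (OUT_LABELS / f"{PREFIX}__{i}.txt").write_text("\n".join(lines) + "\n")
```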
  • Navigate to the output of the script and create a new file, data.yaml, with the following content. This is needed during the training of the YOLOv8 model.
train: ../images
val: ../images

nc: 1
names: ['Pikachu']
  • The folder must have the following structure.
dataset
 |
 └── data.yaml
 |
 └── images
 |    |
 |    └── v1__1.png
 |    └── ..
 |    └── v1__100.png
 |    └── v2__1.png
 |    └── ..
 |    └── v2__100.png
 |    └── v3__1.png
 |    └── ..
 |    └── v3__100.png
 |    └── v4__1.png
 |    └── ..
 |    └── v4__100.png
 |
 └── labels
      |
      └── v1__1.txt
      └── ..
      └── v1__100.txt
      └── v2__1.txt
      └── ..
      └── v2__100.txt
      └── v3__1.txt
      └── ..
      └── v3__100.txt
      └── v4__1.txt
      └── ..
      └── v4__100.txt
  • Zip the folder and upload it to Colab.
  • Move to Colab > Use the notebook code/train_yolov8_model.ipynb to train a YOLOv8 model.
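
The core of that notebook reduces to a few lines with the ultralytics package. A minimal sketch, with placeholder hyperparameters (the actual notebook may use different settings):

```python
# Minimal YOLOv8 training sketch (run in Colab after unzipping the dataset).
# Epochs and image size are placeholders; the repository notebook is the reference.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")              # start from a pretrained nano model
model.train(data="dataset/data.yaml",   # the data.yaml created above
            epochs=50, imgsz=640)
# The best weights are written to runs/detect/train/weights/best.pt
```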
  • Save /content/runs/detect/train/weights/best.pt locally.
  • To run the model, connect a webcam and run code/run_realtime_pikachu_detection.py.
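
A minimal equivalent of that script, using OpenCV for the webcam feed and ultralytics for inference, could look like the sketch below; the actual code/run_realtime_pikachu_detection.py may differ in details such as camera index, window name, or confidence threshold.

```python
# Minimal real-time detection sketch; the repository script is the reference.
import cv2
from ultralytics import YOLO

model = YOLO("best.pt")          # the weights saved from the Colab training
cap = cv2.VideoCapture(0)        # default webcam (the index may differ)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)   # run YOLOv8 inference on the frame
    annotated = results[0].plot()           # draw the detected bounding boxes
    cv2.imshow("Pikachu detection", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()
```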