Merge branch 'master' of github.com:jsk-ros-pkg/jsk_recognition

jsk-ros-pkg · Feb 2, 2024 · 55b488d
2 parents 580132f + 6f2b856
commit 55b488d

Showing 81 changed files with 2,366 additions and 430 deletions.
diff --git a/audio_to_spectrogram/CHANGELOG.rst b/audio_to_spectrogram/CHANGELOG.rst
@@ -2,6 +2,19 @@
 Changelog for package audio_to_spectrogram
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+1.2.17 (2023-11-14)
+-------------------
+
+1.2.16 (2023-11-10)
+-------------------
+* [audio_to_spectrogram, sound_classification] Add data_to_spectrogram (`#2767 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2767>`_)
+* [audio_to_spectrogram] Enable to change spectrum plot from rosparam (`#2760 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2760>`_)
+* Fix audio to spectrogram plot and add test (`#2764 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2764>`_)
+* [audio_to_spectrogram] Add AudioAmplitudePlot node to visualize audio amplitude `#2657 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2657>`_ (`#2755 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2755>`_)
+* [audio_to_spectrogram] Enable publishing frequency vs amplitude plot (`#2654 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2654>`_)
+* use catkin_install_python to install python scripts under node_scripts/ scripts/ (`#2743 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2743>`_)
+* Contributors: Kei Okada, Naoto Tsukamoto, Shingo Kitagawa, Shun Hasegawa, Yoshiki Obinata, Iory Yanokura
+
 1.2.15 (2020-10-10)
 -------------------
 

diff --git a/audio_to_spectrogram/CMakeLists.txt b/audio_to_spectrogram/CMakeLists.txt
@@ -8,7 +8,7 @@ find_package(catkin REQUIRED COMPONENTS
 catkin_python_setup()
 
 generate_dynamic_reconfigure_options(
-  cfg/AudioAmplitudePlot.cfg
+  cfg/DataAmplitudePlot.cfg
 )
 
 catkin_package()
@@ -41,4 +41,10 @@ if(CATKIN_ENABLE_TESTING)
   find_package(catkin REQUIRED COMPONENTS rostest roslaunch)
   add_rostest(test/audio_to_spectrogram.test)
   roslaunch_add_file_check(launch/audio_to_spectrogram.launch)
+  if(NOT $ENV{ROS_DISTRO} STRLESS "kinetic")
+    # On distros older than kinetic, eval cannot be used in launch files
+    # http://wiki.ros.org/roslaunch/XML#substitution_args
+    add_rostest(test/wrench_to_spectrogram.test)
+    roslaunch_add_file_check(launch/wrench_to_spectrogram.launch)
+  endif()
 endif()
diff --git a/audio_to_spectrogram/README.md b/audio_to_spectrogram/README.md
@@ -1,6 +1,6 @@
 # audio_to_spectrogram
 
-This package converts audio data to spectrum and spectrogram data.
+This package converts audio data (or other time-series data) to spectrum and spectrogram data.
 
 # Usage
 By following command, you can publish audio, spectrum and spectrogram topics. Please set correct args for your microphone configuration, such as mic\_sampling\_rate or bitdepth.
@@ -9,6 +9,13 @@ By following command, you can publish audio, spectrum and spectrogram topics. Pl
 roslaunch audio_to_spectrogram audio_to_spectrogram.launch
 ```
 
+Its data conversion pipeline is as follows:
+```
+audio_to_spectrum.py -> spectrum
+                     -> normalized_half_spectrum
+                     -> log_spectrum             -> preprocess node(s) -> preprocessed spectrum -> spectrum_to_spectrogram.py -> spectrogram
+```
+
 Here is an example using rosbag with 300Hz audio.
 ```bash
 roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
@@ -18,19 +25,48 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
 |---|---|---|
 |<img src="docs/images/audio_amplitude.jpg" width="429">|![](https://user-images.githubusercontent.com/19769486/82075694-9a7ac300-9717-11ea-899c-db6119a76d52.png)|![](https://user-images.githubusercontent.com/19769486/82075685-96e73c00-9717-11ea-9abc-e6e74104d666.png)|
 
+You can also convert data other than audio to spectrum and spectrogram data using this package.  
+Here is an example using a rosbag recorded from a force-torque sensor sensing drill vibration.
+```bash
+roslaunch audio_to_spectrogram sample_wrench_to_spectrogram.launch
+```
+
+|Z-axis Force Amplitude|Normalized Half Spectrum|Spectrogram Source Spectrum|Spectrogram|
+|---|---|---|---|
+|<img src="docs/images/wrench_amplitude.jpg">|<img src="docs/images/wrench_normalized_half_spectrum.jpg">|<img src="docs/images/wrench_spectrogram_source.jpg">|<img src="docs/images/wrench_spectrogram.jpg">|
+
 # Scripts
 
 ## audio_to_spectrum.py
+
   A script to convert audio to spectrum.
 
   - ### Publishing topics
-
     - `~spectrum` (`jsk_recognition_msgs/Spectrum`)
 
-      Spectrum data calculated from audio by FFT.
+      Spectrum data calculated from audio by FFT.  
+      This is the usual "amplitude spectrum".  
+      See https://ryo-iijima.com/fftresult/ for details.
+
+    - `~normalized_half_spectrum` (`jsk_recognition_msgs/Spectrum`)
+
+      Spectrum data that is "half" (containing only non-negative frequencies, from 0 Hz to the Nyquist frequency) and "normalized" (scaled to be consistent with the amplitude of the original signal).  
+      See the following for details.
+      - https://ryo-iijima.com/fftresult/
+      - https://stackoverflow.com/questions/63211851/why-divide-the-output-of-numpy-fft-by-n
+      - https://github.com/jsk-ros-pkg/jsk_recognition/issues/2761#issue-1550715400
+
+    - `~log_spectrum` (`jsk_recognition_msgs/Spectrum`)
+
+      Log-scaled spectrum data.  
+      It is calculated by applying log to the absolute value of the FFT result.  
+      Log is usually applied to the "power spectrum", but for simplicity it is applied to the amplitude here.  
+      See the following for details.
+      - https://github.com/jsk-ros-pkg/jsk_recognition/issues/2761#issuecomment-1445810380
+      - http://makotomurakami.com/blog/2020/05/23/5266/
 
   - ### Subscribing topics
-    - `audio` (`audio_common_msgs/AudioData`)
+    - `~audio` (`audio_common_msgs/AudioData`)
 
       Audio stream data from microphone. The audio format must be `wave`.
 
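The three published spectra described in this hunk can be illustrated with a small sketch. It is not the node's implementation: it assumes one common normalization (divide by the sample count, then double every bin that has a mirrored negative-frequency twin), uses a naive DFT in place of an FFT so that it needs only the standard library, and the function names and the `floor` guard against `log(0)` are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive O(N^2) DFT, a stand-in for an FFT (standard library only)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def normalized_half_spectrum(samples):
    """Amplitudes for 0 Hz..Nyquist, scaled to match the time-domain amplitude."""
    n = len(samples)
    # Keep only the non-negative frequencies and divide by the sample count.
    half = [abs(c) / n for c in dft(samples)[: n // 2 + 1]]
    # The DC bin (and the Nyquist bin when n is even) has no mirrored twin;
    # every other bin shares its energy with a negative frequency, so double it.
    nyquist_bin = n // 2 if n % 2 == 0 else None
    for k in range(1, len(half)):
        if k != nyquist_bin:
            half[k] *= 2.0
    return half


def log_spectrum(samples, floor=1e-12):
    """Log of the absolute value of each DFT bin (floor is a hypothetical guard)."""
    return [math.log(max(abs(c), floor)) for c in dft(samples)]


# 5 Hz sine of amplitude 3 with a DC offset of 1, sampled at 40 Hz for 1 s.
signal = [1.0 + 3.0 * math.sin(2 * math.pi * 5 * t / 40) for t in range(40)]
spectrum = normalized_half_spectrum(signal)
```

With 40 samples at 40 Hz each bin is 1 Hz wide, so `spectrum[0]` recovers the offset (about 1) and `spectrum[5]` recovers the sine amplitude (about 3), which is what "consistent with the amplitude of the original signal" means in practice.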
@@ -55,15 +91,94 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
 
       Number of bits per audio data.
 
-    - `~high_cut_freq` (`Int`, default: `800`)
+    - `~fft_exec_rate` (`Double`, default: `50`)
+
+      Rate [Hz] to execute FFT and publish its results.
+
+## data_to_spectrum.py
+
+  Generalized version of `audio_to_spectrum.py`.  
+  This script can convert multiple message types to spectrum.
+
+  - ### Publishing topics
+
+    Same as `audio_to_spectrum.py`.
+
+  - ### Subscribing topics
+    - `~input` (`AnyMsg`)
+
+      Topic on which messages containing the data you want to convert to a spectrum are published.
+
+  - ### Parameters
+    - `~expression_to_get_data` (`String`, default: `m.data`)
+
+      Python expression to get data from the input message `m`. For example, if your input is `std_msgs/Float64`, it is `m.data`.  
+      Just accessing a field of `m` is recommended.  
+      If you want to do a complex calculation (e.g., using `numpy`), use `transform` of `topic_tools` before this node.
+
+    - `~data_sampling_rate` (`Int`, default: `500`)
+
+      Sampling rate [Hz] of input data.
+
+    - `~fft_sampling_period` (`Double`, default: `0.3`)
+
+      Period [s] to sample input data for one FFT.
+
+    - `~fft_exec_rate` (`Double`, default: `50`)
+
+      Rate [Hz] to execute FFT and publish its results.
+
+    - `~is_integer` (`Bool`, default: `false`)
+
+      Whether input data is integer or not. For example, if your input is `std_msgs/Float64`, it is `false`.
+
+    - `~is_signed` (`Bool`, default: `true`)
+
+      Whether input data is signed or not. For example, if your input is `std_msgs/Float64`, it is `true`.
+
+    - `~bitdepth` (`Int`, default: `64`)
+
+      Number of bits per input data. For example, if your input is `std_msgs/Float64`, it is `64`.
+
+    - `~n_channel` (`Int`, default: `1`)
+
+      If your input is a scalar, it is `1`.  
+      If your input is a flattened 2D matrix, it is the number of channels of the original matrix.
+
+    - `~target_channel` (`Int`, default: `0`)
+
+      If your input is a scalar, it is `0`.  
+      If your input is a flattened 2D matrix, it is the target channel.
+
+## spectrum_filter.py
+
+  A script to filter spectrum.
+
+  - ### Publishing topics
+    - `~output` (`jsk_recognition_msgs/Spectrum`)
+
+      Filtered spectrum data (`low_cut_freq`-`high_cut_freq`).
+
+  - ### Subscribing topics
+    - `~input` (`jsk_recognition_msgs/Spectrum`)
+
+      Original spectrum data.
+
+  - ### Parameters
+    - `~data_sampling_rate` (`Int`, default: `500`)
+
+      Sampling rate [Hz] of the data used to generate the original spectrum data.
+
+    - `~high_cut_freq` (`Int`, default: `250`)
 
       Threshold to limit the maximum frequency of the output spectrum.
 
-    - `~low_cut_freq` (`Int`, default: `1`)
+    - `~low_cut_freq` (`Int`, default: `0`)
 
       Threshold to limit the minimum frequency of the output spectrum.
 
 ## spectrum_to_spectrogram.py
+
   A script to convert spectrum to spectrogram.
 
   - ### Publishing topics
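How `~data_sampling_rate`, `~fft_sampling_period`, `~low_cut_freq` and `~high_cut_freq` interact can be worked through with the documented defaults. The sketch below is an illustration under assumptions, not the nodes' code: the helper names are hypothetical, the bin ordering follows the usual FFT convention (non-negative frequencies first, then negative ones), and the filter is assumed to keep bins whose frequency lies in the closed interval `[low_cut_freq, high_cut_freq]`.

```python
def fft_buffer_length(data_sampling_rate, fft_sampling_period):
    """Number of samples collected for one FFT."""
    return int(round(data_sampling_rate * fft_sampling_period))


def bin_frequencies(n_samples, data_sampling_rate):
    """Frequency [Hz] of each FFT bin in standard FFT output order."""
    freqs = []
    for k in range(n_samples):
        # The upper part of the output holds the negative frequencies.
        signed_k = k if k < (n_samples + 1) // 2 else k - n_samples
        freqs.append(signed_k * data_sampling_rate / n_samples)
    return freqs


def filter_spectrum(amplitudes, freqs, low_cut_freq, high_cut_freq):
    """Keep (frequency, amplitude) pairs inside [low_cut_freq, high_cut_freq]."""
    return [
        (f, a)
        for f, a in zip(freqs, amplitudes)
        if low_cut_freq <= f <= high_cut_freq
    ]


# Documented defaults: 500 Hz sampling, 0.3 s per FFT, pass band 0-250 Hz.
n = fft_buffer_length(500, 0.3)
freqs = bin_frequencies(n, 500)
kept = filter_spectrum([1.0] * n, freqs, 0, 250)
resolution = 500 / n
```

With these defaults one FFT covers 150 samples, so neighbouring bins are about 3.33 Hz apart, and a 0-250 Hz pass band keeps the 75 non-negative bins (0 Hz up to roughly 246.7 Hz). A longer `~fft_sampling_period` gives finer frequency resolution at the cost of a longer time window per spectrum.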
@@ -128,7 +243,7 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
 
       Number of bits per audio data.
 
-    - `~maximum_amplitude` (`Int`, default: `10000`)
+    - `~maximum_amplitude` (`Double`, default: `10000.0`)
 
       Maximum range of amplitude to plot.
 
@@ -140,6 +255,66 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
 
       Publish rate [Hz] of audio amplitude image topic.
 
+## data_amplitude_plot.py
+
+  Generalized version of `audio_amplitude_plot.py`.
+
+  - ### Publishing topics
+
+    - `~output/viz` (`sensor_msgs/Image`)
+
+      Data amplitude plot image.
+
+  - ### Subscribing topics
+    - `~input` (`AnyMsg`)
+
+      Topic on which messages containing the data whose amplitude you want to plot are published.
+
+  - ### Parameters
+    - `~expression_to_get_data` (`String`, default: `m.data`)
+
+      Python expression to get data from the input message `m`. For example, if your input is `std_msgs/Float64`, it is `m.data`.  
+      Just accessing a field of `m` is recommended.  
+      If you want to do a complex calculation (e.g., using `numpy`), use `transform` of `topic_tools` before this node.
+
+    - `~data_sampling_rate` (`Int`, default: `500`)
+
+      Sampling rate [Hz] of input data.
+
+    - `~is_integer` (`Bool`, default: `false`)
+
+      Whether input data is integer or not. For example, if your input is `std_msgs/Float64`, it is `false`.
+
+    - `~is_signed` (`Bool`, default: `true`)
+
+      Whether input data is signed or not. For example, if your input is `std_msgs/Float64`, it is `true`.
+
+    - `~bitdepth` (`Int`, default: `64`)
+
+      Number of bits per input data. For example, if your input is `std_msgs/Float64`, it is `64`.
+
+    - `~n_channel` (`Int`, default: `1`)
+
+      If your input is a scalar, it is `1`.  
+      If your input is a flattened 2D matrix, it is the number of channels of the original matrix.
+
+    - `~target_channel` (`Int`, default: `0`)
+
+      If your input is a scalar, it is `0`.  
+      If your input is a flattened 2D matrix, it is the target channel.
+
+    - `~maximum_amplitude` (`Double`, default: `10.0`)
+
+      Maximum range of amplitude to plot.
+
+    - `~window_size` (`Double`, default: `10.0`)
+
+      Window size of input data to plot.
+
+    - `~rate` (`Double`, default: `10.0`)
+
+      Publish rate [Hz] of data amplitude image topic.
+
 ## spectrum_plot.py
 
   A script to publish frequency vs amplitude plot image.
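The roles of `~expression_to_get_data`, `~n_channel` and `~target_channel` can be sketched as follows. This is an assumption-laden illustration rather than the node's code: `extract_samples` is a hypothetical helper, the expression is evaluated with `eval` in a namespace that exposes only `m`, and a flattened 2D matrix is assumed to be interleaved (all channels of sample 0, then all channels of sample 1, and so on). `SimpleNamespace` objects stand in for ROS messages.

```python
from types import SimpleNamespace


def extract_samples(msg, expression="m.data", n_channel=1, target_channel=0):
    """Evaluate the expression on the message and pick one channel."""
    # Only the message is visible to the expression (assumption: no builtins).
    data = eval(expression, {"__builtins__": {}}, {"m": msg})
    # A scalar becomes a one-element sequence so both cases share one path.
    if not isinstance(data, (list, tuple)):
        data = [data]
    # Assumed interleaved layout: take every n_channel-th value,
    # starting at the target channel.
    return list(data[target_channel::n_channel])


# Stand-in for a wrench-like message: read the z-axis force.
wrench_msg = SimpleNamespace(
    wrench=SimpleNamespace(force=SimpleNamespace(x=0.1, y=0.2, z=1.5))
)
force_z = extract_samples(wrench_msg, "m.wrench.force.z")

# Stand-in for a two-channel array message flattened sample by sample.
stereo_msg = SimpleNamespace(data=[0, 10, 1, 11, 2, 12])
second_channel = extract_samples(stereo_msg, "m.data", n_channel=2, target_channel=1)
```

Keeping the expression to plain field access, as the README recommends, keeps this step cheap and predictable; heavier preprocessing fits better in a separate node upstream.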
@@ -159,14 +334,18 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
       Spectrum data calculated from audio by FFT.
 
   - ### Parameters
-    - `~plot_amp_min` (`Double`, default: `0.0`)
+    - `~min_amp` (`Double`, default: `0.0`)
 
-      Minimum value of amplitude in plot
+      Minimum value of amplitude in plot.
 
-    - `~plot_amp_max` (`Double`, default: `20.0`)
+    - `~max_amp` (`Double`, default: `20.0`)
 
-      Maximum value of amplitude in plot
+      Maximum value of amplitude in plot.
 
     - `~queue_size` (`Int`, default: `1`)
 
-      Queue size of spectrum subscriber
+      Queue size of spectrum subscriber.
+
+    - `~max_rate` (`Double`, default: `-1`)
+
+      Maximum publish rate [Hz] of frequency vs amplitude plot image. Setting this value low reduces CPU load. `-1` means no maximum limit.
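The `~max_rate` semantics (an upper bound on publish rate, with `-1` meaning no limit) can be sketched with a small throttle. The class below is hypothetical and only shows one way such a limit could behave: callers pass in timestamps in seconds, and a non-positive `max_rate` disables throttling.

```python
class PublishThrottle:
    """Allow at most max_rate publications per second; non-positive means unlimited."""

    def __init__(self, max_rate):
        # Minimum spacing between publications; zero disables the limit.
        self.min_interval = 1.0 / max_rate if max_rate > 0 else 0.0
        self.last_time = None

    def should_publish(self, now):
        """Return True if enough time has passed since the last publication."""
        if self.last_time is not None and now - self.last_time < self.min_interval:
            return False
        self.last_time = now
        return True


# Spectra arriving at irregular times, limited to 2 Hz.
throttle = PublishThrottle(2.0)
decisions = [throttle.should_publish(t) for t in (0.0, 0.1, 0.5, 0.9, 1.0)]

# max_rate = -1: every incoming spectrum is plotted.
unlimited = PublishThrottle(-1)
unlimited_decisions = [unlimited.should_publish(t) for t in (0.0, 0.001, 0.002)]
```

Skipping plots this way is what reduces CPU load: rendering the image is the expensive step, so dropping intermediate spectra costs little beyond a lower refresh rate.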
diff --git a/audio_to_spectrogram/cfg/AudioAmplitudePlot.cfg b/audio_to_spectrogram/cfg/AudioAmplitudePlot.cfg
diff --git a/audio_to_spectrogram/cfg/DataAmplitudePlot.cfg b/audio_to_spectrogram/cfg/DataAmplitudePlot.cfg
@@ -0,0 +1,13 @@
+#! /usr/bin/env python
+
+PACKAGE = 'audio_to_spectrogram'
+
+from dynamic_reconfigure.parameter_generator_catkin import *
+
+
+gen = ParameterGenerator()
+
+gen.add("maximum_amplitude", double_t, 0, "Maximum range of amplitude to plot", 10.0)
+gen.add("window_size", double_t, 0,  "Window size of data input to plot", 10.0)
+
+exit(gen.generate(PACKAGE, PACKAGE, "DataAmplitudePlot"))
diff --git a/audio_to_spectrogram/docs/images/wrench_amplitude.jpg b/audio_to_spectrogram/docs/images/wrench_amplitude.jpg
diff --git a/audio_to_spectrogram/docs/images/wrench_normalized_half_spectrum.jpg b/audio_to_spectrogram/docs/images/wrench_normalized_half_spectrum.jpg
diff --git a/audio_to_spectrogram/docs/images/wrench_spectrogram.jpg b/audio_to_spectrogram/docs/images/wrench_spectrogram.jpg
diff --git a/audio_to_spectrogram/docs/images/wrench_spectrogram_source.jpg b/audio_to_spectrogram/docs/images/wrench_spectrogram_source.jpg