Merge branch 'master' of github.com:jsk-ros-pkg/jsk_recognition
k-okada committed Feb 2, 2024
2 parents 580132f + 6f2b856 commit 55b488d
Showing 81 changed files with 2,366 additions and 430 deletions.
13 changes: 13 additions & 0 deletions audio_to_spectrogram/CHANGELOG.rst
@@ -2,6 +2,19 @@
Changelog for package audio_to_spectrogram
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

1.2.17 (2023-11-14)
-------------------

1.2.16 (2023-11-10)
-------------------
* [audio_to_spectrogram, sound_classification] Add data_to_spectrogram (`#2767 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2767>`_)
* [audio_to_spectrogram] Enable to change spectrum plot from rosparam (`#2760 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2760>`_)
* Fix audio to spectrogram plot and add test (`#2764 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2764>`_)
* [audio_to_spectrogram] Add AudioAmplitudePlot node to visualize audio amplitude `#2657 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2657>`_ (`#2755 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2755>`_)
* [audio_to_spectrogram] Enable publishing frequency vs amplitude plot (`#2654 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2654>`_)
* use catkin_install_python to install python scripts under node_scripts/ scripts/ (`#2743 <https://github.com/jsk-ros-pkg/jsk_recognition/issues/2743>`_)
* Contributors: Kei Okada, Naoto Tsukamoto, Shingo Kitagawa, Shun Hasegawa, Yoshiki Obinata, Iory Yanokura

1.2.15 (2020-10-10)
-------------------

8 changes: 7 additions & 1 deletion audio_to_spectrogram/CMakeLists.txt
@@ -8,7 +8,7 @@ find_package(catkin REQUIRED COMPONENTS
catkin_python_setup()

generate_dynamic_reconfigure_options(
cfg/AudioAmplitudePlot.cfg
cfg/DataAmplitudePlot.cfg
)

catkin_package()
@@ -41,4 +41,10 @@ if(CATKIN_ENABLE_TESTING)
find_package(catkin REQUIRED COMPONENTS rostest roslaunch)
add_rostest(test/audio_to_spectrogram.test)
roslaunch_add_file_check(launch/audio_to_spectrogram.launch)
if(NOT $ENV{ROS_DISTRO} STRLESS "kinetic")
# On distros older than kinetic, eval cannot be used in launch files
# http://wiki.ros.org/roslaunch/XML#substitution_args
add_rostest(test/wrench_to_spectrogram.test)
roslaunch_add_file_check(launch/wrench_to_spectrogram.launch)
endif()
endif()
203 changes: 191 additions & 12 deletions audio_to_spectrogram/README.md
@@ -1,6 +1,6 @@
# audio_to_spectrogram

This package converts audio data to spectrum and spectrogram data.
This package converts audio data (or other time-series data) to spectrum and spectrogram data.

# Usage
The following command publishes audio, spectrum, and spectrogram topics. Set the args to match your microphone configuration, such as mic\_sampling\_rate or bitdepth.
@@ -9,6 +9,13 @@ By following command, you can publish audio, spectrum and spectrogram topics. Pl
roslaunch audio_to_spectrogram audio_to_spectrogram.launch
```

Its data conversion pipeline is as follows:
```
audio_to_spectrum.py -> spectrum
-> normalized_half_spectrum
-> log_spectrum -> preprocess node(s) -> preprocessed spectrum -> spectrum_to_spectrogram.py -> spectrogram
```

Here is an example using rosbag with 300Hz audio.
```bash
roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
@@ -18,19 +25,48 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
|---|---|---|
|<img src="docs/images/audio_amplitude.jpg" width="429">|![](https://user-images.githubusercontent.com/19769486/82075694-9a7ac300-9717-11ea-899c-db6119a76d52.png)|![](https://user-images.githubusercontent.com/19769486/82075685-96e73c00-9717-11ea-9abc-e6e74104d666.png)|

You can also use this package to convert data other than audio to spectrum and spectrogram data.
Here is an example using a rosbag of a force-torque sensor measuring drill vibration.
```bash
roslaunch audio_to_spectrogram sample_wrench_to_spectrogram.launch
```

|Z-axis Force Amplitude|Normalized Half Spectrum|Spectrogram Source Spectrum|Spectrogram|
|---|---|---|---|
|<img src="docs/images/wrench_amplitude.jpg">|<img src="docs/images/wrench_normalized_half_spectrum.jpg">|<img src="docs/images/wrench_spectrogram_source.jpg">|<img src="docs/images/wrench_spectrogram.jpg">|

# Scripts

## audio_to_spectrum.py

A script to convert audio to spectrum.

- ### Publishing topics

- `~spectrum` (`jsk_recognition_msgs/Spectrum`)

Spectrum data calculated from audio by FFT.
Spectrum data calculated from audio by FFT.
This is the usual "amplitude spectrum".
See https://ryo-iijima.com/fftresult/ for details.

- `~normalized_half_spectrum` (`jsk_recognition_msgs/Spectrum`)

Spectrum data that is "half" (it contains only the non-negative frequencies, from 0 Hz to the Nyquist frequency) and "normalized" (its values are consistent with the amplitude of the original signal).
See the following for details.
- https://ryo-iijima.com/fftresult/
- https://stackoverflow.com/questions/63211851/why-divide-the-output-of-numpy-fft-by-n
- https://github.com/jsk-ros-pkg/jsk_recognition/issues/2761#issue-1550715400

- `~log_spectrum` (`jsk_recognition_msgs/Spectrum`)

Log-scaled spectrum data.
It is calculated by applying log to the absolute value of the FFT result.
Log is usually applied to the "power spectrum", but for simplicity this script applies it to the absolute value instead.
See the following for details.
- https://github.com/jsk-ros-pkg/jsk_recognition/issues/2761#issuecomment-1445810380
- http://makotomurakami.com/blog/2020/05/23/5266/

- ### Subscribing topics
- `audio` (`audio_common_msgs/AudioData`)
- `~audio` (`audio_common_msgs/AudioData`)

Audio stream data from microphone. The audio format must be `wave`.

@@ -55,15 +91,94 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch

Number of bits per audio data.

- `~high_cut_freq` (`Int`, default: `800`)
- `~fft_exec_rate` (`Double`, default: `50`)

Rate [Hz] to execute FFT and publish its results.
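
The relationship between the three published spectra can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code of `audio_to_spectrum.py`: the test signal, the sampling rate, the `n / 2` normalization factor, and the small epsilon added before the log are all choices made for this example.

```python
import numpy as np

# Hypothetical test signal: a 300 Hz sine wave with amplitude 2.0.
sampling_rate = 16000  # [Hz], assumed for this sketch
n = 1600               # samples in one FFT window (0.1 s)
t = np.arange(n) / float(sampling_rate)
signal = 2.0 * np.sin(2 * np.pi * 300 * t)

fft_result = np.fft.fft(signal)
freqs = np.fft.fftfreq(n, d=1.0 / sampling_rate)

# "Amplitude spectrum": absolute value of the raw FFT result.
amplitude = np.abs(fft_result)

# "Half": keep only the non-negative frequency bins.
# "Normalized": divide by n / 2 so that a sine of amplitude A peaks near A
# (assumed scaling; note that it doubles the DC bin).
half = freqs >= 0
half_freqs = freqs[half]
normalized_half = amplitude[half] / (n / 2.0)

# Log-scaled spectrum taken from the absolute value, not from the power.
# The epsilon only guards against log(0) and is an addition of this sketch.
log_spectrum = np.log(amplitude + 1e-12)

peak = int(np.argmax(normalized_half))
print(half_freqs[peak], normalized_half[peak])
```

The peak of `normalized_half` sits at the 300 Hz bin and its height matches the amplitude of the input sine, which is what "consistent with the amplitude of the original signal" means here.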

## data_to_spectrum.py

Generalized version of `audio_to_spectrum.py`.
This script can convert multiple message types to spectrum.

- ### Publishing topics

Same as `audio_to_spectrum.py`.

- ### Subscribing topics
- `~input` (`AnyMsg`)

Topic that carries messages containing the data you want to convert to a spectrum.

- ### Parameters
- `~expression_to_get_data` (`String`, default: `m.data`)

Python expression to get data from the input message `m`. For example, if your input is `std_msgs/Float64`, it is `m.data`.
Just accessing a field of `m` is recommended.
If you want to do a complex calculation (e.g., using `numpy`), use `transform` of `topic_tools` before this node.

- `~data_sampling_rate` (`Int`, default: `500`)

Sampling rate [Hz] of input data.

- `~fft_sampling_period` (`Double`, default: `0.3`)

Period [s] to sample input data for one FFT.

- `~fft_exec_rate` (`Double`, default: `50`)

Rate [Hz] to execute FFT and publish its results.

- `~is_integer` (`Bool`, default: `false`)

Whether input data is integer or not. For example, if your input is `std_msgs/Float64`, it is `false`.

- `~is_signed` (`Bool`, default: `true`)

Whether input data is signed or not. For example, if your input is `std_msgs/Float64`, it is `true`.

- `~bitdepth` (`Int`, default: `64`)

Number of bits per input data. For example, if your input is `std_msgs/Float64`, it is `64`.

- `~n_channel` (`Int`, default: `1`)

If your input is a scalar, it is `1`.
If your input is a flattened 2D matrix, it is the number of channels of the original matrix.

- `~target_channel` (`Int`, default: `0`)

If your input is a scalar, it is `0`.
If your input is a flattened 2D matrix, it is the index of the target channel.
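
How `~expression_to_get_data`, `~n_channel`, and `~target_channel` could interact is sketched below. This is a hedged illustration, not the implementation of `data_to_spectrum.py`: the helper names `get_data` and `pick_channel` are hypothetical, the use of `eval` is an assumption, the messages are stand-ins built with `SimpleNamespace`, and the flattened matrix is assumed to be interleaved sample by sample.

```python
from types import SimpleNamespace

import numpy as np


def get_data(msg, expression="m.data"):
    """Evaluate the expression with the input message bound to `m`.

    Hypothetical helper: the real node may extract the data differently.
    """
    return eval(expression, {"__builtins__": {}}, {"m": msg})


def pick_channel(flat, n_channel=1, target_channel=0):
    """Select one channel from scalar or flattened 2D data.

    Assumes interleaved layout: [s0c0, s0c1, ..., s1c0, s1c1, ...].
    """
    arr = np.atleast_1d(np.asarray(flat, dtype=np.float64))
    return arr.reshape(-1, n_channel)[:, target_channel]


# Stand-ins for ROS messages (no ROS installation needed for this sketch).
scalar_msg = SimpleNamespace(data=1.5)
wrench_msg = SimpleNamespace(
    wrench=SimpleNamespace(force=SimpleNamespace(z=-9.8)))
matrix_msg = SimpleNamespace(data=[0, 10, 1, 11, 2, 12])

z = get_data(wrench_msg, "m.wrench.force.z")
ch1 = pick_channel(get_data(matrix_msg), n_channel=2, target_channel=1)

# With the defaults above, one FFT covers 500 Hz * 0.3 s of samples.
samples_per_fft = int(round(500 * 0.3))
print(z, ch1, samples_per_fft)
```

Keeping the expression to a plain field access, as the parameter description recommends, also keeps an `eval`-style lookup like this one easy to reason about.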

## spectrum_filter.py

A script to filter spectrum.

- ### Publishing topics
- `~output` (`jsk_recognition_msgs/Spectrum`)

Filtered spectrum data (`low_cut_freq`-`high_cut_freq`).

- ### Subscribing topics
- `~input` (`jsk_recognition_msgs/Spectrum`)

Original spectrum data.

- ### Parameters
- `~data_sampling_rate` (`Int`, default: `500`)

Sampling rate [Hz] of the data used to generate the original spectrum data.

- `~high_cut_freq` (`Int`, default: `250`)

Threshold to limit the maximum frequency of the output spectrum.

- `~low_cut_freq` (`Int`, default: `1`)
- `~low_cut_freq` (`Int`, default: `0`)

Threshold to limit the minimum frequency of the output spectrum.
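
The band limiting described above can be sketched as a boolean mask over the frequency bins. This is an assumption-laden illustration rather than the code of `spectrum_filter.py`: the function name `filter_spectrum` is hypothetical, and treating both thresholds as inclusive is a choice made for this example.

```python
import numpy as np


def filter_spectrum(amplitude, frequency, low_cut_freq=0, high_cut_freq=250):
    """Keep only the bins with low_cut_freq <= f <= high_cut_freq.

    Hypothetical helper; inclusive bounds are an assumption of this sketch.
    """
    amplitude = np.asarray(amplitude)
    frequency = np.asarray(frequency)
    keep = (frequency >= low_cut_freq) & (frequency <= high_cut_freq)
    return amplitude[keep], frequency[keep]


data_sampling_rate = 500  # [Hz]
n = 500                   # one second of samples, giving 1 Hz wide bins

# Non-negative frequency bins from 0 Hz up to the Nyquist frequency (250 Hz).
frequency = np.fft.rfftfreq(n, d=1.0 / data_sampling_rate)
amplitude = np.ones_like(frequency)

amp, freq = filter_spectrum(amplitude, frequency,
                            low_cut_freq=20, high_cut_freq=100)
print(len(freq), freq[0], freq[-1])
```

The default `high_cut_freq` of `250` equals the Nyquist frequency for the default `data_sampling_rate` of `500`, so the defaults pass the whole half spectrum through.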

## spectrum_to_spectrogram.py

A script to convert spectrum to spectrogram.

- ### Publishing topics
@@ -128,7 +243,7 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch

Number of bits per audio data.

- `~maximum_amplitude` (`Int`, default: `10000`)
- `~maximum_amplitude` (`Double`, default: `10000.0`)

Maximum range of amplitude to plot.

@@ -140,6 +255,66 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch

Publish rate [Hz] of audio amplitude image topic.

## data_amplitude_plot.py

Generalized version of `audio_amplitude_plot.py`.

- ### Publishing topics

- `~output/viz` (`sensor_msgs/Image`)

Data amplitude plot image.

- ### Subscribing topics
- `~input` (`AnyMsg`)

Topic that carries messages containing the data whose amplitude you want to plot.

- ### Parameters
- `~expression_to_get_data` (`String`, default: `m.data`)

Python expression to get data from the input message `m`. For example, if your input is `std_msgs/Float64`, it is `m.data`.
Just accessing a field of `m` is recommended.
If you want to do a complex calculation (e.g., using `numpy`), use `transform` of `topic_tools` before this node.

- `~data_sampling_rate` (`Int`, default: `500`)

Sampling rate [Hz] of input data.

- `~is_integer` (`Bool`, default: `false`)

Whether input data is integer or not. For example, if your input is `std_msgs/Float64`, it is `false`.

- `~is_signed` (`Bool`, default: `true`)

Whether input data is signed or not. For example, if your input is `std_msgs/Float64`, it is `true`.

- `~bitdepth` (`Int`, default: `64`)

Number of bits per input data. For example, if your input is `std_msgs/Float64`, it is `64`.

- `~n_channel` (`Int`, default: `1`)

If your input is a scalar, it is `1`.
If your input is a flattened 2D matrix, it is the number of channels of the original matrix.

- `~target_channel` (`Int`, default: `0`)

If your input is a scalar, it is `0`.
If your input is a flattened 2D matrix, it is the index of the target channel.

- `~maximum_amplitude` (`Double`, default: `10.0`)

Maximum range of amplitude to plot.

- `~window_size` (`Double`, default: `10.0`)

Window size of input data to plot.

- `~rate` (`Double`, default: `10.0`)

Publish rate [Hz] of data amplitude image topic.

## spectrum_plot.py

A script to publish frequency vs amplitude plot image.
@@ -159,14 +334,18 @@ roslaunch audio_to_spectrogram sample_audio_to_spectrogram.launch
Spectrum data calculated from audio by FFT.

- ### Parameters
- `~plot_amp_min` (`Double`, default: `0.0`)
- `~min_amp` (`Double`, default: `0.0`)

Minimum value of amplitude in plot
Minimum value of amplitude in plot.

- `~plot_amp_max` (`Double`, default: `20.0`)
- `~max_amp` (`Double`, default: `20.0`)

Maximum value of amplitude in plot
Maximum value of amplitude in plot.

- `~queue_size` (`Int`, default: `1`)

Queue size of spectrum subscriber
Queue size of spectrum subscriber.

- `~max_rate` (`Double`, default: `-1`)

Maximum publish rate [Hz] of frequency vs amplitude plot image. Setting this value low reduces CPU load. `-1` means no maximum limit.
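
The effect of `~max_rate` can be pictured as a simple throttle in front of the plot publisher. The class below is a hypothetical sketch, not the logic of `spectrum_plot.py`: the name `RateLimiter` and the explicit `now` argument (used instead of a ROS clock so the example runs anywhere) are assumptions.

```python
class RateLimiter(object):
    """Allow an action at most `max_rate` times per second.

    Hypothetical helper. A non-positive max_rate (e.g. -1) means no limit.
    """

    def __init__(self, max_rate):
        self.min_interval = 1.0 / max_rate if max_rate > 0 else 0.0
        self.last_time = None

    def ready(self, now):
        """Return True if enough time has passed since the last action."""
        if self.last_time is None or now - self.last_time >= self.min_interval:
            self.last_time = now
            return True
        return False


limited = RateLimiter(max_rate=2.0)   # at most one plot every 0.5 s
unlimited = RateLimiter(max_rate=-1)  # every incoming spectrum is plotted

stamps = [0.0, 0.1, 0.5, 0.9, 1.0]
limited_result = [limited.ready(t) for t in stamps]
unlimited_result = [unlimited.ready(t) for t in stamps]
print(limited_result, unlimited_result)
```

Skipping the plot for spectra that arrive too soon is what would reduce CPU load, since rendering the image is typically the expensive step.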
13 changes: 0 additions & 13 deletions audio_to_spectrogram/cfg/AudioAmplitudePlot.cfg

This file was deleted.

13 changes: 13 additions & 0 deletions audio_to_spectrogram/cfg/DataAmplitudePlot.cfg
@@ -0,0 +1,13 @@
#! /usr/bin/env python

PACKAGE = 'audio_to_spectrogram'

from dynamic_reconfigure.parameter_generator_catkin import *


gen = ParameterGenerator()

gen.add("maximum_amplitude", double_t, 0, "Maximum range of amplitude to plot", 10.0)
gen.add("window_size", double_t, 0, "Window size of data input to plot", 10.0)

exit(gen.generate(PACKAGE, PACKAGE, "DataAmplitudePlot"))