Implement speex resampler backend #314

Closed
gavv opened this issue Jan 8, 2020 · 13 comments

@gavv
Member

gavv commented Jan 8, 2020

Implement resampler interface using speex_resampler from libspeexdsp. See #235 for details and rationale.

Steps:

  • Add target_speexdsp to SConstruct and 3rdparty.py. See libsndfile source and sink #246 for details on adding a new target directory.

  • Add SpeexResampler class, implementing IResampler interface. Place it into roc_audio/target_speexdsp.

  • Add ResamplerBackend_Speex to ResamplerBackend enum. Add SpeexResampler to ResamplerMap.

  • Add ResamplerBackend_Speex to Test_n_resampler_backends and ensure that tests are passing for the new backend.

  • Add speex backend to command-line tools and C API.
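
The enum-plus-map registration described in the steps above can be modelled as follows. This is only an illustrative Python sketch of the idea, not roc's C++ code: the class bodies, `RESAMPLER_MAP`, and `new_resampler` are hypothetical stand-ins for `ResamplerBackend`, `ResamplerMap`, and the real resampler classes.

```python
from enum import Enum


class ResamplerBackend(Enum):
    BUILTIN = 0
    SPEEX = 1  # the new entry this issue asks for


class BuiltinResampler:
    name = "builtin"


class SpeexResampler:
    name = "speex"


# Hypothetical stand-in for ResamplerMap: backend id -> constructor.
RESAMPLER_MAP = {
    ResamplerBackend.BUILTIN: BuiltinResampler,
    ResamplerBackend.SPEEX: SpeexResampler,
}


def new_resampler(backend):
    """Construct the resampler registered for the given backend."""
    return RESAMPLER_MAP[backend]()


# A test that loops over every backend picks up the new one automatically.
backend_names = [new_resampler(b).name for b in ResamplerBackend]
```

The point of the map is that tests and command-line tools can iterate over all registered backends instead of naming each one.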

@gavv gavv added enhancement help wanted An important and awaited task but we have no human resources for it yet much-needed This issue is needed most among other help-wanted issues labels Jan 8, 2020
@CristobalM
Contributor

Hi @gavv,

I've tried to advance on this, but I haven't been able to make it work. From what I've read, I concluded that most of the work for this should be done in the resample_buff method inside SpeexResampler class, like in this attempt:

https://github.com/CristobalM/roc/blob/speex_resampler/src/modules/roc_audio/resampler_speex.cpp

But I'm not sure what the input and output of resample_buff are, or whether anything else must be done, so that I can pass the correct input to speex_resampler_process_interleaved_float and convert its output into what resample_buff is expected to produce.

I've studied some of the built in resampler code and the related paper mentioned in the docs, but still I got stuck with this.

Have you used any tools to visualize the input and output of resample_buff? That would help me see what speex_resampler_process_interleaved_float is doing. I've tried printing the frames and visualizing them with matplotlib, but found it suboptimal. Do you have any other advice for moving forward on this?

@gavv
Member Author

gavv commented Jan 16, 2020

Hi, good to know that you're still interested! I'll prepare a reply soon.

@gavv
Member Author

gavv commented Jan 16, 2020

Let's talk about ResamplerReader. (ResamplerWriter is similar).

ResamplerReader implements read() operation by:

  1. reading frames from nested reader to a circular buffer (frames_)
  2. passing current circular buffer state to IResampler::renew_buffers()
  3. asking IResampler::resample_buff() to read samples from circular buffer until it fully fills the output frame OR reports (by returning false) that there is no more data in the circular buffer and it should be renewed; in this case ResamplerReader goes to step 1

The circular buffer consists of 3 frames: prev, curr, next. This way we implement a 3-frame sliding window inside the input stream produced by nested reader. This 3-frame window is the input of resample_buff().

resample_buff()'s work is to read samples from the input sliding window and write samples to the output frame (passed to it as an argument).

resample_buff() maintains a pointer inside the input sliding window. This pointer starts from the beginning of curr and goes forward until it reaches the end of curr.

When the pointer reaches the end, resample_buff() returns false. In this case ResamplerReader reads one more frame from nested reader and adds it to the right of circular buffer, so this:

| frame1 | frame2 | frame3 |

becomes this:

| frame2 | frame3 | frame4 |

(frame1 is dropped; frame4 is a new frame returned by nested reader)

Then ResamplerReader passes pointers to new 3 frames to IResampler and calls resample_buff() again to continue filling output frame.
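
The read loop described so far can be sketched like this. It is a toy Python model rather than the actual C++ classes: `CopyResampler` just copies curr to the output, and starting the window with a silent prev frame is an assumption made for the example.

```python
import itertools
from collections import deque

FRAME_SIZE = 4


class CopyResampler:
    """Toy IResampler stand-in: copies curr to the output sample by sample."""

    def __init__(self):
        self.curr = []
        self.in_pos = 0   # pointer inside curr
        self.out_pos = 0  # pointer inside the output frame

    def renew_buffers(self, prev, curr, next_):
        # A real resampler also reads prev and next; the toy ignores them.
        self.curr = curr
        self.in_pos = 0

    def resample_buff(self, out):
        while self.out_pos < len(out):
            if self.in_pos == len(self.curr):
                return False  # curr is exhausted, the window must slide
            out[self.out_pos] = self.curr[self.in_pos]
            self.in_pos += 1
            self.out_pos += 1
        self.out_pos = 0  # output frame is full, start over next time
        return True


class ResamplerReader:
    def __init__(self, read_frame, resampler):
        self.read_frame = read_frame
        self.resampler = resampler
        silence = [0] * FRAME_SIZE  # assumed initial prev frame
        self.frames = deque([silence, read_frame(), read_frame()], maxlen=3)
        self.resampler.renew_buffers(*self.frames)

    def read(self, out):
        while not self.resampler.resample_buff(out):
            self.frames.append(self.read_frame())  # drops the oldest frame
            self.resampler.renew_buffers(*self.frames)


counter = itertools.count()


def read_frame():
    """Nested reader stand-in: frames [0..3], [4..7], and so on."""
    return [next(counter) for _ in range(FRAME_SIZE)]


reader = ResamplerReader(read_frame, CopyResampler())
first, second = [0] * 6, [0] * 6
reader.read(first)
reader.read(second)
```

The output frame size (6) deliberately differs from the input frame size (4), so each `read()` has to slide the window at least once and resume filling the output where it stopped.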

The pointer maintained by resample_buff() is always inside curr. This pointer is the center of another, smaller sliding window, also maintained by resample_buff(). This window is always smaller than the frame size (size of prev/curr/next). The window size is defined by ResamplerConfig::window_size parameter.

Finally, resample_buff() also maintains the pointer in the output frame, moving from the beginning of the output frame to its end.

Each sample in the output frame is calculated from the samples inside the small window in the input circular buffer. One output sample is the sum of the samples in the small window, each multiplied by the corresponding value of the sinc function (for which we have a pre-calculated table).

resample_buff() moves the output frame pointer one sample forward each time. Each time it advances the output frame pointer, it also advances the small window. However, the small window moves a bit slower or faster than 1 sample per step, depending on the scaling factor (set by set_scaling()). If the scaling factor is exactly 1.0, the output frame pointer and the small window move synchronously.
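
As a rough illustration of the window and the scaling factor, here is a minimal Python sketch. It makes several simplifying assumptions: it evaluates a truncated sinc directly instead of using a pre-calculated table, works on one flat list instead of prev/curr/next frames, and treats `scaling` as the number of input samples the window advances per output sample. The names `resample` and `half_window` are made up for the example.

```python
import math


def sinc(x):
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def resample(samples, scaling, half_window=8):
    """Produce output samples; the window center advances by `scaling` each step."""
    out = []
    pos = float(half_window)  # window center, measured in input samples
    while pos < len(samples) - half_window:
        center = math.floor(pos)
        acc = 0.0
        # Sum of the input samples in the window, weighted by the sinc function.
        for i in range(center - half_window + 1, center + half_window + 1):
            acc += samples[i] * sinc(pos - i)
        out.append(acc)
        pos += scaling  # 1.0 keeps input and output pointers in lockstep
    return out


signal = [math.sin(2 * math.pi * i / 16) for i in range(64)]
same_rate = resample(signal, 1.0)   # window moves 1 input sample per output
upsampled = resample(signal, 0.5)   # window moves slower, so more output
```

With `scaling == 1.0` the window center always lands on an input sample, so the output reproduces the input; with `0.5` the window moves half as fast and twice as many output samples are produced.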

Here is how it looks when the small window center is at the beginning of curr and the output frame pointer is at the beginning of the output frame:

|   prev   |   curr   |   next   |        <- 3-frame sliding window in input stream
         |  /\  |
         |window|                         <- small window to compute output sample
         \      /
          \    /
           \  /
            \/                            <- pointer to current output sample
           |  output  |                   <- output frame

In the middle:

|   prev   |   curr   |   next   |
             |  /\  |
             |window|
             \      /
              \    /
               \  /
                \/
           |  output  |

And at the end:

|   prev   |   curr   |   next   |
                 |  /\  |
                 |window|
                 \      /
                  \    /
                   \  /
                    \/
           |  output  |

If resample_buff() reaches the end of curr before the end of the output frame, it returns false. In this case, ResamplerReader renews the circular buffer and calls resample_buff() again; resample_buff() then points the small window center to the beginning of the new curr and continues filling the output frame from the position where it stopped.

@gavv
Member Author

gavv commented Jan 16, 2020

Regarding tools and debugging.

You can use roc-conv tool to read wav file, run resampler, and write output to wav file.

You can generate the input wav file and visualize output wav in python or matlab, for example.

Look at #153. It provides some examples and scripts.

Here are some random scripts that I have locally: https://gist.github.com/gavv/41db26aa76737e72f70112293d7ddae4

gensine.py generates a wav file with a sine. plot_diff.py visualizes the difference between two wav files. plot_spectrum.py computes and plots the spectrum of the signal in a wav file.
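
For readers without access to the gist, the same kind of workflow can be approximated with the Python standard library. The sketch below is not the code from the gist: `write_sine`, `read_samples`, and `max_abs_diff` are hypothetical helpers, and they assume mono 16-bit PCM files of equal sample rate.

```python
import math
import os
import struct
import tempfile
import wave


def write_sine(path, freq=440.0, rate=44100, seconds=0.1, amplitude=0.5):
    """Write a mono 16-bit PCM wav file containing a sine wave."""
    n = int(rate * seconds)
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq * i / rate))
        for i in range(n)
    ]
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(rate)
        f.writeframes(struct.pack("<%dh" % n, *samples))


def read_samples(path):
    """Read all samples of a mono 16-bit PCM wav file as integers."""
    with wave.open(path, "rb") as f:
        n = f.getnframes()
        return list(struct.unpack("<%dh" % n, f.readframes(n)))


def max_abs_diff(path_a, path_b):
    """Largest per-sample difference over the common length of two files."""
    a, b = read_samples(path_a), read_samples(path_b)
    return max(abs(x - y) for x, y in zip(a, b))


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.wav")
    detuned = os.path.join(tmp, "detuned.wav")
    write_sine(original, freq=440.0)
    write_sine(detuned, freq=441.0)
    diff_same = max_abs_diff(original, original)
    diff_detuned = max_abs_diff(original, detuned)
```

In practice the second file would be the output of roc-conv run on the first one, and the per-sample difference would be plotted rather than reduced to a single number.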

> I've tried printing the frames and visualizing them with matplotlib, but found it suboptimal.

Usually works good for me :)

@gavv
Member Author

gavv commented Jan 16, 2020

Regarding speex resampler.

I guess it will maintain the circular buffer, and the pointer inside it, by itself.

So I guess SpeexResampler::resample_buff() should pass curr to speex and ask it to fill the output frame until speex reports there is no more data in curr. Then SpeexResampler::resample_buff() should return false to renew buffers. And then repeat (but continue from the position inside the output frame where it stopped). And it seems that it will completely ignore prev and next.
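
The control flow suggested here might look roughly like the following Python model. `FakeSpeex` is a hypothetical pass-through engine standing in for libspeexdsp; it only mimics the way speex_resampler_process_interleaved_float reports how many input samples it consumed and how many output samples it wrote. `SpeexLikeResampler` is likewise an illustration, not the class that ended up in roc.

```python
class FakeSpeex:
    """Stand-in engine that reports consumed input and produced output."""

    def process(self, inp, out_capacity):
        # Toy behaviour: pass samples through, limited by both buffers.
        n = min(len(inp), out_capacity)
        return n, list(inp[:n])  # (input samples consumed, output samples)


class SpeexLikeResampler:
    def __init__(self, engine):
        self.engine = engine
        self.curr = []
        self.in_pos = 0   # how much of curr has been fed to the engine
        self.out_pos = 0  # how much of the output frame is filled

    def renew_buffers(self, prev, curr, next_):
        # prev and next are ignored: the engine keeps its own history.
        self.curr = curr
        self.in_pos = 0

    def resample_buff(self, out):
        while self.out_pos < len(out):
            if self.in_pos == len(self.curr):
                return False  # no more data in curr, ask for new buffers
            consumed, produced = self.engine.process(
                self.curr[self.in_pos:], len(out) - self.out_pos
            )
            out[self.out_pos:self.out_pos + len(produced)] = produced
            self.in_pos += consumed
            self.out_pos += len(produced)
        self.out_pos = 0
        return True


resampler = SpeexLikeResampler(FakeSpeex())
out = [0.0] * 6
resampler.renew_buffers([], [1.0, 2.0, 3.0, 4.0], [])
first_call = resampler.resample_buff(out)   # curr runs out after 4 samples
resampler.renew_buffers([], [5.0, 6.0, 7.0, 8.0], [])
second_call = resampler.resample_buff(out)  # continues at out[4]
```

The second call resumes at the saved output position and leaves two unread samples in curr for the next output frame.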

So in the case of speex, we likely don't need to maintain the circular buffer ourselves. Let's ignore this fact and keep it as is for now and get it working first. Then we'll see whether we need it for the zita resampler. If we won't need the circular buffer for zita either, we can do some refactoring and move the circular buffer feature from ResamplerReader to BuiltinResampler.

@gavv
Member Author

gavv commented Jan 16, 2020

Feel free to ask if you'll have any other questions.

@CristobalM
Contributor

Thank you. Now I get recognizable sound from speex! But the tests are failing

Screenshot from 2020-01-19 15-39-08

As soon as I solve those I'll create a PR

@gavv
Member Author

gavv commented Jan 19, 2020

Great! Regarding the test, see also #105. Maybe you'll need to adjust the tests.

@gavv
Member Author

gavv commented May 23, 2020

@CristobalM BTW, even if you didn't fix the tests, creating a draft PR with your implementation is appreciated, maybe me or someone else will finish it.

@gavv gavv added the dsp Digital sound processing label May 25, 2020
@CristobalM
Contributor

> @CristobalM BTW, even if you didn't fix the tests, creating a draft PR with your implementation is appreciated, maybe me or someone else will finish it.

Sorry, I've been busy. I created the draft PR; anyone, feel free to continue from it.

@gavv
Member Author

gavv commented May 25, 2020

No worries, thanks!

@gavv
Member Author

gavv commented May 28, 2020

Great, this PR was a good start for me!

I've cherry-picked most of the changes, fixed both resampler and resampler tests, and pushed everything to develop.

What's new (see recent commits in develop):

  • Instead of adding quality parameter, derive speex quality from resampler profile parameter, which we already have. Also, ResamplerConfig is hidden since it was specific to builtin resampler.

  • Download speexdsp instead of speex for recent versions.

  • Don't recreate resampler when changing scaling, just change its rate.

  • Use speex_resampler_set_rate_frac() to increase scaling precision.

  • Add scons option to disable speex. Register speex resampler in resampler map only if it's enabled at build time.

  • Rework resampler tests. I just dropped the old tests because they were too complicated and I never fully understood them. They're replaced with much simpler tests that don't try to estimate resampler quality and just check that the resampler actually resamples something. BTW the new tests helped to find the problem caused by not using speex_resampler_set_rate_frac().

  • Rework IResampler interface. The old interface assumed that the resampler needs a ring buffer for input data, but the speex resampler doesn't need one. The new interface is more generic and allows resamplers with or without ring buffers.

  • Add roc_resampler_backend to public API.

  • Make speex resampler the default one.

  • Update documentation.

@gavv
Member Author

gavv commented May 28, 2020

A word about the new speex resampler:

  • I've tested roc with speex on a laptop and on a few arm-based boxes, and it works well.

  • Speex resampler is about 5 times faster than roc builtin resampler on my laptop and about 1.5 times faster on raspberry pi zero (both on medium profile, which corresponds to speex quality 5).

  • Speex resampler has limited scaling precision. When upscaling and then downscaling a sine wave back, the lag between the original and the result accumulates quickly and is clearly visible in the data after 0.5 seconds or so. Our builtin resampler doesn't have this problem. Fortunately, our FreqEstimator handles this well, and even with inaccurate scaling everything works smoothly (at least in my tests).

  • Interestingly enough, the scaling precision problem is solved by using a higher precision factor for speex_resampler_set_rate_frac(), but this causes overflow errors in speex, so we can't do it. Currently we use the maximum factor that doesn't cause overflows. Lower factor values make scaling so inaccurate that even FreqEstimator cannot handle it.
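
To see why the precision factor matters, consider that speex_resampler_set_rate_frac() takes the resampling ratio as an integer numerator and denominator, so a floating-point scaling has to be rounded to a fraction. The Python sketch below shows the arithmetic only. The way the fraction is built here, the helper names, and the example values are assumptions for illustration, not the formula or the factor roc actually uses.

```python
def quantize_scaling(scaling, factor):
    """Approximate `scaling` by the integer ratio num/den with den == factor."""
    num = round(scaling * factor)
    return num, factor


def drift_samples(scaling, factor, rate=44100, seconds=0.5):
    """Lag (in samples) accumulated because num/den differs from `scaling`."""
    num, den = quantize_scaling(scaling, factor)
    return abs(num / den - scaling) * rate * seconds


scaling = 1.00037  # made-up value close to 1.0
drift_coarse = drift_samples(scaling, 1_000)      # ratio rounds to 1000/1000
drift_fine = drift_samples(scaling, 1_000_000)    # ratio is 1000370/1000000
```

With the coarse factor the requested scaling is rounded away entirely and the lag reaches about 8 samples in half a second; with the fine factor the lag is negligible, but larger numerators and denominators are what trigger the overflow mentioned above.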

@gavv gavv closed this as completed May 28, 2020
@gavv gavv added this to the 0.2.0 milestone Jul 10, 2020