Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor OggStreamReader and provide precise seek #94

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

TonalidadeHidrica
Copy link

@TonalidadeHidrica TonalidadeHidrica commented Feb 16, 2021

Current OggStreamReader has several issues:

So I implemented a sample-level seek.

Changes in API

  • OggStreamReader does no longer require io::Seek. Therefore it does not provide seek_absgp_pg.
  • Instead, new struct SeekableOggStreamReader is provided. This struct comes with seek_absgp, where users can seek the reader in sample-level.
    • I had to keep those two structs separate because we need to do the different initialization steps and fields for seekable and unseekable reader (in specific, record the stream bounds).
  • OggStreamReader and SeekableOggStreamReader is dedicated to only a single stream. If you want to parse chained stream, you have to create a new one.
    • Later, I'm going to implement some function like next_stream(self) -> Option<OggStreamReader> that consumes the reader, checks if the next packet exists, and returns if exists.
  • ident_hdr, comment_hdr, and setup_hdr is no longer public. Now they can be accessed only immutably through the corresponding getter functions.

Issues & Future Task

One major issue is that the page absgp in real-world ogg/vorbis file often contains errors. Specifically, each of the ogg files I have has 3-20 errors, classified to the following two types:

  • When a page ends with a "long" block and the next page starts with "short" block, then the absgp for the former page is wrong. In specific, all of the ogg files I have use "long" blocks of length 2048 and "short" blocks of length 512, and the absgp for the former page sometimes differs by 448. This type of error was found in almost all the ogg files I have.
  • The absgp for the last page is less than the number of total samples obtained from the entire stream, subtracted by the number of samples obtained from the last packet. The spec requires to drop the last few frames from the final packet according to the absgp provided in the page, as quoted below, but it seems that real-word ogg files requires to drop more samples.

A granule position on the final page in a stream that indicates less audio data than the final packet would normally return is used to end the stream on other than even frame boundaries

If the absgp provided in the pages has an error, then the seek result may be wrong, especially for the first type of error mentioned above. I'm now wondering that I've mistook some spec of the Vorbis stream, or this is the bug of many Vorbis encoders (all I have of which is based on libvorbis by the way).
Yeah, it's only 448 frames in ~44100 fps data, so the diff is 0.01 seconds, so we may not have to care about that though.

Speaking about the incompatibility with the spec, I also realized that most of the ogg files does not follow the spec that " the second finished audio packet must flush the page on which it appears and the third packet begin a fresh page". I implemented the dropping-leading-samples features based on this fact, but it does not actually run for most of the files. Anyway I followed the another requirement in the spec that "Failure to do so should, at worst, cause a decoder implementation to return incorrect positioning information for seeking operations at the very beginning of the stream". At any rate, most of the ogg files starts with absgp 0, which means that we don't have to drop the initial frames at all.

Here are the other tasks and concerns:

  • Only informal test is done yet for the new seeking function. Maybe should I include it in this repository as a test?
    • Also, it is not yet tested against chained / multiplexed streams.
  • I have not checked the compatibility with async feature yet.
  • I didn't encountered an audio packet spanning more than one page, so there may be a bug for that case.
  • The code may not compile with MSRV.

And here are the issues I'll handle at last:

  • The code may not be well-formatted.
  • The commits are not rebased yet.

@TonalidadeHidrica
Copy link
Author

I'm opening a draft PR since it's quite a large change. As soon as I'm done with other tasks I'll switch. But earlier comments and critics are welcome!

@TonalidadeHidrica
Copy link
Author

TonalidadeHidrica commented Feb 16, 2021

Some things I forgot to write:

  • I'm wondering if I should implement Deref<Target=OggStreamReader> for SeekableOggStreamReader>. It's definitely convenient as it requires less changes on old code, but it is considered as anti-pattern.
  • This PR is dependent on Improve seek ogg#22.

@est31
Copy link
Member

est31 commented Feb 16, 2021

@TonalidadeHidrica thanks I'll take a look tomorrow.

@TonalidadeHidrica
Copy link
Author

@est31 Thanks a lot, but there's no rush!

@ghost
Copy link

ghost commented May 8, 2021

You can make a trait containing the common methods of OggStreamReader and SeekableOggStreamReader, impl it for OggStreamReader, and then forward the calls into impl YourTrait for SeekableOggStreamReader. That will also make it more convenient for people extending on Lewton.

@ghost
Copy link

ghost commented May 10, 2021

Is this still being worked on?

@TonalidadeHidrica
Copy link
Author

TonalidadeHidrica commented May 10, 2021

The implementations are working correctly in my product so I'm currently waiting for a review. Also, I'm busy recently so if I receive a review, I will not be able to respond instantly.

@@ -29,12 +29,12 @@ required-features = ["ogg"]
[dependencies]
byteorder = "1.0"
tinyvec = { version = "1.0", features = ["alloc"] }
ogg = { version = "0.8", optional = true }
ogg = { version = "0.8", optional = true, path = "../ogg/" }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow what happened here

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like you have a local edition of the ogg crate, which I can't find anywhere and thus can't build. :( Mind at least pushing the fork somewhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm patching ogg crate at RustAudio/ogg#18 and RustAudio/ogg#22 as well, and since this PR depends on those PRs, this temporary measure is taken to handle this. We can remove it after they are merged.

@ghost
Copy link

ghost commented May 10, 2021

Alright then it's just up the RustAudio to review this then... which seems to be taking a while, this was last commented on by a org member 3 months ago. Do you already have a branch of ogg with those two PRs merged?

@TonalidadeHidrica
Copy link
Author

@JackRedstonia better-seek branch in my fork is my state-of-the-art branch which includes every improvements I made :)

@ghost
Copy link

ghost commented May 10, 2021

based and forked

@ghost
Copy link

ghost commented May 10, 2021

Just to clarify, the SeekableOggStreamReader::stream_end_pos function returns the byte position of the last logical packet, right? I'm not exactly sure how to go from here to getting the stream duration in granules, as I don't know much about the ogg/vorbis format.

@TonalidadeHidrica
Copy link
Author

@JackRedstonia Oops, sorry. You're right. stream_end_pos seems to return the final byte position, so I DIDN'T implement it. I'll remove the "fixes" tag. (I guess it is quite easy to implement it though)

@est31
Copy link
Member

est31 commented May 10, 2021

Sorry I'm a bit busy atm. I think first the ogg PRs need to be reviewed, then this PR.

@ghost
Copy link

ghost commented May 13, 2021

On getting the stream length, from my understanding, the steps needed are:

  • Seek to the final packet's byte position, which we can obtain using stream_end_pos
  • Read that packet
  • Get the absolute granule position
  • The stream length is the absolute granule position plus the packet's length in granules

Is this correct?

@est31
Copy link
Member

est31 commented May 13, 2021

@JackRedstonia one also has to take the first packet's absolute granule pos into account, but other than that, yes it's accurate.

@est31
Copy link
Member

est31 commented May 13, 2021

wait no, one doesn't have to take the length of the last packet into account, but the length of the first packet. Absolute granule positions always refer to the last sample offset of the last packet finished on the page.

@TonalidadeHidrica
Copy link
Author

I don't actually remember the code I wrote few months ago, but now I'm wondering that stream_end_pos really seeks to the beginning of the last packet (or page?) of the current stream? Doesn't it seek to the real "end" of the stream? I've gotta check.

@hf29h8sh321
Copy link

What is the correct way to use SeekableOggStreamReader? I am currently testing the code in this branch. If I call seek_absgp with certain values, inner_mut().read_dec_packet() returns BadAudio(AudioBadFormat). I can seek to small values (0-5000 works), but if I seek to a larger value, I get the error.

@TonalidadeHidrica
Copy link
Author

@hf29h8sh321 I'm very sorry to leave it undocumented. Since I'm quite busy and sleepy I cannot currently give you a precise example now, but perhaps seeing this file might help you. Sorry for the inconvenienve.

@hf29h8sh321
Copy link

I copied that file into my project, and I cannot get any audio data after seeking. Calls to next only return None.

@TonalidadeHidrica
Copy link
Author

Hmm, at least it works for almost all of my ogg files, so there should be some difference...

@TonalidadeHidrica
Copy link
Author

TonalidadeHidrica commented Jul 5, 2021

@hf29h8sh321 Did you check out this messy gist of mine? (I'm not confident it would work right now...)

@hf29h8sh321
Copy link

I have narrowed down the issue. I ran the code in the gist, which works with my own files. However, I then modified the code and removed this line: let samples = take_n_samples(&mut reader, 1_500_000, false)?;, and all other lines that depend on that line. Then, check_seek fails with Error: Vorbis bitstream audio decode problem.

It seems like seeking into the middle of the stream without reading it without seeking first causes the error.

@hsi3322
Copy link

hsi3322 commented Jul 8, 2021

I think if you remove first 2000 bytes or so of the audio file. it should work properly. Thats where all the "garbage" data is stored.

@hf29h8sh321
Copy link

@hsi3322 Can you elaborate on what you mean by "garbage data"?

@est31 est31 force-pushed the master branch 10 times, most recently from 1a9ca88 to f655fa2 Compare January 23, 2022 04:11
@est31
Copy link
Member

est31 commented Apr 10, 2022

Thanks for keeping this PR alive! This is the single biggest issue of the lewton crate at the moment, I think.

@TonalidadeHidrica
Copy link
Author

You're welcome! The seeking is in fact crucial part of my project, and the feature has actually been thoroughly tested through my actually using. I'll resolve the conflict when I have time, and of course review is always welcome, in that case I'll respond to it as soon as I can.

@bbstilson
Copy link

@est31 @TonalidadeHidrica Any updates on merging this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants