
RFC: What should we do about overlapping subtitles? #60

Open
emk opened this issue May 4, 2024 · 1 comment
Comments

emk (Owner) commented May 4, 2024

The core substudy algorithms are all designed around non-overlapping subtitles. There's a built-in "cleaning" layer that fixes small overlaps as best it can. But a few SRT files use partially overlapping subs to convey semantic and timing information, and other SRT files contain lots of garbage data.
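
As a rough illustration of what such a "cleaning" layer might do (this is a hypothetical sketch, not substudy's actual code or API), small overlaps between consecutive subtitles can be fixed by trimming the end of the earlier one:

```rust
/// A subtitle's time span, in seconds. Mirrors the `Period` debug output
/// quoted below, but the field layout here is an assumption.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Period {
    begin: f32,
    end: f32,
}

/// Trim small overlaps (up to `max_fix` seconds) between consecutive
/// periods by clamping the earlier period's end. Larger overlaps are
/// left alone for a later stage to complain about.
fn clean_small_overlaps(periods: &mut [Period], max_fix: f32) {
    for i in 1..periods.len() {
        let overlap = periods[i - 1].end - periods[i].begin;
        if overlap > 0.0 && overlap <= max_fix {
            periods[i - 1].end = periods[i].begin;
        }
    }
}

fn main() {
    let mut subs = vec![
        Period { begin: 1.0, end: 3.1 }, // overlaps the next sub by ~0.1s
        Period { begin: 3.0, end: 5.0 },
    ];
    clean_small_overlaps(&mut subs, 0.5);
    assert_eq!(subs[0].end, 3.0);
    println!("{:?}", subs);
}
```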

What should we do here? Major options include:

  1. Try a few simple things to produce non-overlapping subs, and if none of those work, try to issue a good error. This is the approach we took in #37 (Error: Cannot truncate time period Period { begin: 453.57, end: 457.84 } at 453.57). We could try to improve the "cleaning" algorithm to handle more cases, if we know what people are regularly encountering.
  2. Automatically combine subs with non-trivial overlap into one giant combined subtitle. This is tricky, especially with certain Whisper output, which will often produce a 30-second segment overlapping many shorter segments.
  3. Redesign all our algorithms and UI ideas to handle overlapping subtitles.
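
For concreteness, option (2) might look something like the sketch below (hypothetical names, not substudy's implementation): merge every run of transitively overlapping subtitles into one combined subtitle. The Whisper pathology mentioned above is visible here too, since a single 30-second segment swallows everything it overlaps:

```rust
/// A subtitle's time span, in seconds (illustrative type, not substudy's).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Period {
    begin: f32,
    end: f32,
}

/// Merge any run of overlapping periods into one combined period
/// spanning the whole run.
fn combine_overlapping(mut periods: Vec<Period>) -> Vec<Period> {
    periods.sort_by(|a, b| a.begin.partial_cmp(&b.begin).unwrap());
    let mut merged: Vec<Period> = Vec::new();
    for p in periods {
        match merged.last_mut() {
            // Overlap with the previous combined sub: extend it.
            Some(last) if p.begin < last.end => last.end = last.end.max(p.end),
            _ => merged.push(p),
        }
    }
    merged
}

fn main() {
    // A Whisper-style 30-second segment overlapping two shorter ones
    // collapses into one giant combined subtitle.
    let merged = combine_overlapping(vec![
        Period { begin: 0.0, end: 30.0 },
        Period { begin: 1.0, end: 3.0 },
        Period { begin: 4.0, end: 6.0 },
    ]);
    assert_eq!(merged, vec![Period { begin: 0.0, end: 30.0 }]);
}
```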

I am honestly not too interested in pursuing (3) if I can possibly get good results (for most use cases) without it. But (1) vs (2) is a harder tradeoff, and I'd love feedback on what people are encountering in their SRT files.

CC @aaron-meyers

@aaron-meyers

The main concern I would have with either 1 or 2 is that a lot of videos legitimately have overlapping subtitles, because there are multiple speakers simultaneously (e.g. a TV broadcaster in the background while another character is speaking). In some cases, the 'secondary' subtitle has some unique formatting that could be used to identify it and then treat it essentially as a separate track, but this would need to be detected per file (or by implementing a bunch of common patterns). For example, in Japanese, Netflix will generally display one subtitle on the bottom (like normal) and a secondary subtitle on the right (vertically). In English I've seen italic used for the secondary subtitle or even different colors (in .ass subtitles).
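
One of those "common patterns" could be checked with something as simple as this (a hypothetical sketch of the idea, covering only the fully-italicized case; real files would need more patterns, like `{\an8}` positioning or ASS color overrides):

```rust
/// Treat a subtitle as belonging to a "secondary" track if its whole
/// text is wrapped in italics — one of the common conventions mentioned
/// above. Purely illustrative; not substudy's actual detection logic.
fn is_secondary(text: &str) -> bool {
    let t = text.trim();
    t.starts_with("<i>") && t.ends_with("</i>")
}

fn main() {
    assert!(is_secondary("<i>TV: ...and in other news tonight.</i>"));
    assert!(!is_secondary("Did you hear that?"));
}
```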

I haven't looked at your alignment algorithm and I haven't actually tried to implement one myself yet. I was going to start with something pretty simple - iterating over the native (base) subtitle items and aligning the reference subtitles when they have > some % overlap with the native subtitle item (maybe 90%+) by default, with a more relaxed match if there aren't any overlapping subtitles in each track. This is probably naive though 😅

Projects: Triage
Development: no branches or pull requests
2 participants