Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ruby: Add source frame scaling via screen functions #1508

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jcm93
Copy link
Contributor

@jcm93 jcm93 commented May 29, 2024

(Marked as draft pending some further testing and any feedback)

This PR exposes several methods in screen to the video backend in ruby, in order to resolve certain libretro shader scaling issues, and downscale source frame buffers as appropriate to improve shader performance.

Background

ares's current rendering strategy, broadly, is to render frames on the CPU one at a time before sending them to the video backend. The frames sent to the video backend can have many duplicated pixels present, as a shortcut to account for cases where frames contain scanlines with different pixel clocks. Unless we scale these buffers down to the appropriate size before sending them to librashader, we can introduce artifacts, cause incorrect shader behavior, or even crash if we cause a shader to create an overly large texture.

By allowing screen's setScale, setInterlace and setProgressive methods to call into the video backend, we can signal that ruby should scale down the framebuffers appropriately before sending them to shaders. This is (in theory of course) more efficient than performing this scaling on the CPU, and more streamlined logically in terms of making the video backend responsible for the final frame buffer creation.

Proof of concept

To test this approach, I added rudimentary implementations of these functions for the Metal backend. These methods basically do the following in a separate render pass, prior to shader passes:

  • In terms of width, simply always scale the framebuffer to the largest pixel width.
  • In terms of height, if we are in the progressive mode and the framebuffer is doubling identical lines, scale down.
  • If we're in the interlaced mode and lines are doubled, don't scale down, because a frame larger in height than a deinterlaced frame often acts as a signal for shaders that we are outputting interlaced video.

This approach is basic and may need refinement, but is a solid improvement on existing behavior in my testing.

Future work

(Disclaimer; not trying to speak for maintainers here; just thinking out loud):

Per conversations on the ares Discord, it seems as though ares may in the future move toward a renderer that leans more heavily on the video backend, perhaps similar to the approach taken by something like CLK, where the core generates a video signal, and the display backend is entirely responsible for synthesizing that signal into frames. This could have numerous advantages in terms of accuracy, performance and latency.

While not terribly significant, this PR could be considered a small step in this direction, as it moves some of the work generating the final source frame from the CPU to the GPU/display backend.

This does, however, also mean that for other video backends to achieve parity with Metal in terms of framebuffer scaling, they will need to implement setScale, setInterlace and setProgressive themselves in some fashion.

Examples

crt/crt-maximus-royale.slangp used here, not because it is my favorite shader but because it seems to detect interlacing changes on the fly, and also creates very large textures.

SNES 240p, first with PR then without PR:

Screenshot 2024-05-29 at 12 51 36 AM Screenshot 2024-05-29 at 12 54 21 AM

Sega 32x at combined 320/256 pixel width, first with PR then without PR:

Screenshot 2024-05-29 at 12 51 36 AM Screenshot 2024-05-29 at 12 54 21 AM

Sonic 2 switching from interlaced to progressive during runtime (works both ways) (crashes without PR):

Screenshot 2024-05-29 at 12 54 21 AM

@LukeUsher
Copy link
Member

I like it!

My Mac is currently out of action so I can't test it myself right now, though
Any chance you could port this to the OpenGL backend so I can have a play with it on Windows?

@jcm93
Copy link
Contributor Author

jcm93 commented Jun 20, 2024

Yeah, the reason this is still in draft is that I started working on a larger PR to allow all the software rendering functions in screen to be overridden optionally by implementations in ruby (color bleed, sprites, etc.). I was planning on finishing that up and submitting it, then basing this off of that work instead.

I've also been planning on backporting some of my other work to OpenGL too, so maybe I'll look at that first.

LukeUsher pushed a commit that referenced this pull request Jun 21, 2024
### OpenGL rendering (all platforms)

- librashader specifies that the output viewport should be the same size
as the output framebuffer texture. This was not the case for the OpenGL
render code.
* Previously, librashader under OpenGL would render onto an intermediate
framebuffer sized to the "output" size (the entire window view size),
rather than the "target" size (the final composited size of the game
area within the output area). The `libra_viewport_t` would be sized to
the target size, but the underlying buffer would be larger.
* Now, we size the intermediate framebuffer to the "target" size, let
librashader render onto it with a `libra_viewport_t` that matches that
size. In the final pass we sample this buffer within an area of the
"output"-sized buffer as appropriate.

This prior behavior would lead to scaling issues with shaders in the
Metal backend. The same issues did not seem to be obviously present in
OpenGL in my testing, but we should nevertheless probably fix this in
case it is causing any of the subtle issues with shaders that have been
reported, and also in case something breaks in the future as a result of
not following this recommendation.

(The above is also unrelated to the scaling issues addressed by
#1508)

### CGL fix-ups (macOS)

- Backports the native fullscreen and monitor selection options to
OpenGL, and removes unnecessary custom window code from the OpenGL
driver.
- Removes a seemingly unnecessary output call from `reshape` that would
cause flickering during resizes.

This has been tested on macOS but should probably be tested on other
platforms as well to make sure nothing breaks.

Co-authored-by: jcm <butt@butts.com>
@aduffey
Copy link

aduffey commented Jul 18, 2024

In terms of width, simply always scale the framebuffer to the largest pixel width.

Does this mean that if 320 and 256 pixel wide content is mixed, that the framebuffer will be scaled to 320 pixels? Or will it still be 1280, but only when the content is mixed?

I ask because resampling 256 pixel wide content to 320 pixels would introduce either significant blur or aliasing. Some shaders can handle wide framebuffers and will display correctly even with horizontally duplicated pixels. Those shaders would look worse with a 320 pixel wide framebuffer in this case. It also makes shaders that aim for sharp pixels impossible.

I'm not familiar enough with the codebase to understand what scaleX and scaleY in screen.cpp are.

@jcm93
Copy link
Contributor Author

jcm93 commented Jul 18, 2024

With the Metal implementation in this version, for mixed content with pixel widths of 256 and 320, the buffers would be scaled horizontally before shader passes at all times, not just when pixel clocks are mixed in the same frame. This is up to the video backend to decide whether to do or not. In theory, if ruby could be informed when content is mixed, ruby could decide to only perform this horizontal scaling for content that is not or is not likely to be mixed. An API for that does not currently exist though.

As for shaders that try to allocate textures larger than the D3D or Metal max texture size in response to extra-wide buffers, I suppose that a bug should be filed against those shaders (such as crt-maximus-royale in the OP) regardless of what ends up being considered proper on the ares end.

Lastly I'd say the main benefit here visually speaking is the vertical scaling for progressive frames. One way or another, the source framebuffer should probably not end up being 480 pixels high for a 240p frame (this creates the artifacts in the first example, often leads to improper numbers of scanlines being drawn, etc.). It may be worth looking at screen more closely to see if a convenient solution exists for that issue, independent of the stuff being explored here and in 1551.

@aduffey
Copy link

aduffey commented Jul 20, 2024

Lastly I'd say the main benefit here visually speaking is the vertical scaling for progressive frames. One way or another, the source framebuffer should probably not end up being 480 pixels high for a 240p frame (this creates the artifacts in the first example, often leads to improper numbers of scanlines being drawn, etc.).

I agree and that's why I was excited to see this PR!

Horizontal scaling is a difficult problem. Outputting the least common multiple of the possible pixel widths breaks many shaders, often subtly, as they make assumptions about the horizontal resolution for the sake of simplicity or performance. Resampling to one of the resolutions on mixed content would also cause issues.

I'm any case, this issue may be out of scope here. If I understand correctly, you are preserving the existing horizontal scaling behavior on the Metal backend (e.g. always output 1280 pixels wide in Mega Drive/Genesis)?

@jcm93
Copy link
Contributor Author

jcm93 commented Jul 20, 2024

Without this PR the Metal backend takes in the 1280-wide buffer and passes that to librashader, which renders it onto an offscreen texture the same size as the final viewport, and then a final Metal pass blits that texture onto the viewport. With this PR the buffers are always resampled down to 320 width before being passed to librashader. If this PR ever becomes not-a-draft, that would probably not be the final behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants