2DGraphics vsync is slow #489
Comments
With which RPi model do you work and with which display? Have you tested this
with sample/41-screenanimations?
|
I have tested sample 41 with the same result (I am using an RPi 4B and a standard HDMI display). It works better, but it is still stuck at 30 Hz. |
Then clearing the screen and drawing take too long to reach 60 Hz. You may have to reduce the screen size (width, height) if it's important for you to have 60 Hz. |
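
To put rough numbers on why a smaller screen helps, here is a minimal sketch that computes how many bytes per second must be written just to touch every pixel once per frame. The 4 bytes per pixel are an assumption (the actual depth depends on the configuration), and the function names are made up for illustration. Clearing and then drawing writes many pixels more than once, so the real cost is higher than this lower bound.

```cpp
#include <cstdint>

// Bytes needed to write every pixel of one frame exactly once.
constexpr std::uint64_t FrameBytes (std::uint32_t nWidth, std::uint32_t nHeight,
				    std::uint32_t nBytesPerPixel)
{
	return static_cast<std::uint64_t> (nWidth) * nHeight * nBytesPerPixel;
}

// Bytes per second needed to rewrite the whole frame at the given frame rate.
constexpr std::uint64_t FillRateBytesPerSecond (std::uint32_t nWidth, std::uint32_t nHeight,
						std::uint32_t nBytesPerPixel,
						std::uint32_t nFramesPerSecond)
{
	return FrameBytes (nWidth, nHeight, nBytesPerPixel) * nFramesPerSecond;
}

// 1920x1080 at an assumed 4 bytes per pixel and 60 Hz: 497,664,000 bytes/s
static_assert (FillRateBytesPerSecond (1920, 1080, 4, 60) == 497664000ULL, "1080p");

// 960x540 under the same assumptions: 124,416,000 bytes/s, a quarter of the above
static_assert (FillRateBytesPerSecond (960, 540, 4, 60) == 124416000ULL, "540p");
```

Halving both width and height cuts the amount of memory to write per frame by a factor of four.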
Previously, each new graphics frame was prepared in the uncached display buffer. Now a separate cached buffer is used, which speeds up operation. The separate buffer is copied to the display buffer in the vertical sync phase to suppress flickering. This class can now be used on the Raspberry Pi 5 too, but without vertical sync support. Issue #489
There is an update on the develop branch, which speeds up operation very much. The class |
It's great that it's sped up. I'm wondering if you can explain to me why drawing to a third buffer and using memcpy to update the second buffer before swapping is faster than just writing directly to the second in the first place. Clearly it is; I applied your fix to my copy and it works, but I don't understand why. |
The reason is the data cache. The buffer where the drawing operations are
done is in cached memory, while the frame buffer memory is not cached. So
drawing is much faster in cached memory, and memcpy() to uncached memory is
relatively quick, because it uses strictly increasing addresses and word
access.
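
The pattern described here can be sketched roughly as follows. This is not the Circle C2DGraphics code: `CBufferedCanvas` and its methods are hypothetical names, the "frame buffer" is just a pointer handed in by the caller, and 32-bit pixels are assumed. All drawing goes to an ordinary (cached) buffer, and only `Present()` writes to the frame buffer, in one sequential memcpy().

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical two-stage canvas, for illustration only (not the Circle API).
class CBufferedCanvas
{
public:
	CBufferedCanvas (std::uint32_t *pFrameBuffer, unsigned nWidth, unsigned nHeight)
	:	m_pFrameBuffer (pFrameBuffer),
		m_nWidth (nWidth),
		m_nHeight (nHeight),
		m_DrawBuffer (static_cast<std::size_t> (nWidth) * nHeight, 0)
	{
	}

	// All drawing touches only the draw buffer in normal cached memory.
	void Clear (std::uint32_t nColor)
	{
		std::fill (m_DrawBuffer.begin (), m_DrawBuffer.end (), nColor);
	}

	void DrawRect (unsigned nX, unsigned nY, unsigned nW, unsigned nH, std::uint32_t nColor)
	{
		for (unsigned y = nY; y < nY + nH && y < m_nHeight; y++)
		{
			for (unsigned x = nX; x < nX + nW && x < m_nWidth; x++)
			{
				m_DrawBuffer[static_cast<std::size_t> (y) * m_nWidth + x] = nColor;
			}
		}
	}

	// One sequential copy into the (uncached) frame buffer. On real hardware
	// this would happen in the vertical sync phase to avoid flicker.
	void Present ()
	{
		std::memcpy (m_pFrameBuffer, m_DrawBuffer.data (),
			     m_DrawBuffer.size () * sizeof (std::uint32_t));
	}

private:
	std::uint32_t *m_pFrameBuffer;
	unsigned m_nWidth;
	unsigned m_nHeight;
	std::vector<std::uint32_t> m_DrawBuffer;
};
```

With this layout, the many small, scattered writes of the drawing code hit the cache, and the uncached memory only sees one linear pass per frame.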
|
That makes sense, thank you. One more question if you don’t mind. Would DMA from the cached buffer to the framebuffer be faster than memcpy(), or is it about even? |
You are welcome. I haven't benchmarked this, but maybe. I'm working on
a new general display interface for Circle. With it, the class C2DGraphics
will use DMA to copy the internal display buffer to the frame buffer.
C2DGraphics will then work with logical colors (RGB888), so that it can be
used on any display that supports the new CDisplay interface. This will
require some small modifications in applications. The current status of this
new display support is on the branch general-display-interface in the Circle
repository. See #380 for more info.
|
Very cool. On my end, I've taken the current main branch C2DGraphics class and extensively modified it into a project-specific class (two classes, actually: one that owns the framebuffer and one that deals with drawing to an arbitrary memory buffer) that supports clipping rectangles and alpha blending. I will try making my screen class use DMA to update the back buffer and see if that is better, worse, or the same. |
Okay, I have implemented DMA write for updating the back buffer from the cached draw buffer, and it's at least a little faster with a burst argument of zero; however, if I goose the burst argument up to 10, I get ~60 fps at 1080p. I'm not sure how big a burst argument it can handle without causing bus problems, though; 16 crashes it at boot. I would imagine that with more activity on all the CPU cores, bus contention with DMA gets worse? |
Great. Yes, the burst parameter has a big influence. How high it can be set
depends on what else is running on the bus. I generally wouldn't use values
greater than 2, but of course you can tune this for your application. There
is an "assert (nBurstLength <= 15)" in the DMA driver, so it cannot be
greater than 15.
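
For application code that wants to experiment with this value, a small guard like the following sketch could keep it inside the limits mentioned here. The helper and constant names are hypothetical and not part of Circle; only the limit of 15 and the suggested ceiling of 2 come from this comment.

```cpp
#include <cassert>

// Upper limit taken from the assert in the DMA driver quoted above.
const unsigned MaxDMABurstLength = 15;

// Conservative general-purpose ceiling suggested above.
const unsigned ConservativeBurstLength = 2;

// Hypothetical helper: limits a requested burst length to a chosen ceiling,
// which itself must not exceed what the driver accepts.
inline unsigned ClampBurstLength (unsigned nRequested,
				  unsigned nCeiling = ConservativeBurstLength)
{
	assert (nCeiling <= MaxDMABurstLength);

	return nRequested < nCeiling ? nRequested : nCeiling;
}
```

Calling `ClampBurstLength (10)` yields 2 with the default ceiling, while `ClampBurstLength (10, MaxDMABurstLength)` passes the value through for tuning experiments.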
|
I presently have it at 5, but I will back it off to 2; I just wanted to see the limits. I've implemented a limiter when running at 1080p, so it stays at a locked 30fps; 960x540, the default resolution I chose, runs 60fps even without burst. |
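
A limiter of this kind could be sketched as below. This is not the code from the comment above, which was not posted; it is a hypothetical version that assumes a fixed display refresh rate (60 Hz in the usage note) and simply presents a new frame on every Nth vertical sync.

```cpp
// Hypothetical frame limiter: presents only on every Nth vertical sync.
class CVSyncFrameLimiter
{
public:
	// nRefreshHz is the assumed display refresh rate, nTargetFPS the wanted
	// frame rate. Targets of zero or above the refresh rate mean "every vsync".
	CVSyncFrameLimiter (unsigned nRefreshHz, unsigned nTargetFPS)
	:	m_nInterval (nTargetFPS == 0 || nTargetFPS >= nRefreshHz
				? 1 : nRefreshHz / nTargetFPS),
		m_nCounter (0)
	{
	}

	// Call once per vertical sync. Returns true when a frame should be presented.
	bool OnVSync ()
	{
		bool bPresent = m_nCounter == 0;
		m_nCounter = (m_nCounter + 1) % m_nInterval;

		return bPresent;
	}

	unsigned GetInterval () const
	{
		return m_nInterval;
	}

private:
	unsigned m_nInterval;	// number of vsyncs per presented frame
	unsigned m_nCounter;	// vsyncs seen since the last presented frame
};
```

With `CVSyncFrameLimiter Limiter (60, 30);`, every second call to `OnVSync ()` returns true, which gives an even 30 fps cadence instead of a frame rate that jitters between 30 and 60.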
Good to know. |
Hi,
I have a little issue, and I don't know what the cause is.
In my kernel I use 2DGraphics.
When I use the vsync option, it is very slow, like 1 FPS, even if I draw a single rectangle; it seems that the slow part is the call to "m_pFrameBuffer->WaitForVerticalSync();".
When I do not use the vsync option, it does a memcpy, and it's faster.
What could be the cause of that?
Best regards