Zero-copy scene detection #627

redzic · 2022-05-06T08:26:07Z

Scene detection, especially at higher resolutions, is currently bottlenecked by many allocations and copies of uncompressed frames. This PR uses the ffmpeg API directly to decode frames and removes copies of frame data during scene detection, which allows for large speedups compared to the previous code (the gains generally seem to grow with resolution and bit depth).

Considerations before merging

Implement --sc-downscale-height and --sc-pix-format; this now needs to be done manually with the ffmpeg API as well
Upstream the changes to rav1e, av-scenechange, and possibly av-metrics-decoders (the code changes from av-metrics-decoders could be included in av1an or av-scenechange instead).

Other possible improvements

Decode the next few frames in parallel with the scene detection analysis itself, instead of decoding frames as needed and then performing analysis serially. This seems difficult, so this could possibly be done in a future PR
If possible, remove uses of Arc<T> in analyze_next_frame and remove temporary vectors holding Arc<Frame<T>>s. This would need to be done in addition to some work on rav1e to allow creating Plane<T> that are not owned
Run callback function (incrementing the progress bar) on another thread

master-of-zen · 2022-05-06T08:28:39Z

@redzic Amazing work 🥇 💯

shssoichiro · 2022-05-06T17:14:39Z

I haven't seen the av-scenechange code yet but here are some of my thoughts:

Fix standard mode of scene detection. Currently, only the fast mode of scene detection actually works because the function estimate_inter_costs seems to require the raw frame data to be padded to a power of 2.

I also think the ideal solution is to not require the padding. I looked into this recently because I realized there's actually a lot of padding on each frame in rav1e and I wanted to reduce memory usage. I got as far as seeing that the motion estimation requires padding because it looks outside of the visible portion of the video when searching. Removing this requirement would probably speed up motion estimation in rav1e (and by extension estimate_inter_costs) as well. But I don't think it's a small change.

Failing that, I think number 2 is the solution we are stuck with. I don't like 1 because as you mentioned it changes the results to be less accurate. And 3, unfortunately, I don't think is possible, or at the very least would be pretty messy to implement (a lot of raw pointers and indirection).

Upstream the changes to rav1e, av-scenechange, and possibly av-metrics-decoders (the code changes from av-metrics-decoders could be included in av1an or av-scenechange instead).

One of the major challenges here I think will be that rav1e does not currently depend on ffmpeg, and I'm not sure that they want to. It might be necessary to implement something similar in rav1e's y4m implementation but without relying on the ffmpeg API (which is cursed).

Decode the next few frames in parallel with the scene detection analysis itself, instead of decoding frames as needed and then performing analysis serially. This seems difficult, so this could possibly be done in a future PR

I don't think this is as difficult as it seems. But I agree putting it in a future PR makes sense, since it is a separate optimization from the zero-copy changes.
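As a rough illustration of the decode-ahead idea, here is a minimal sketch using only the Rust standard library. It is not code from this PR: `Frame`, `detect_pipelined`, and the `decode`/`analyze` closures are hypothetical stand-ins for the real decoder and for the scene detection analysis. A bounded channel lets a decoder thread run at most `lookahead` frames ahead of the analysis.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for a decoded frame (the real type would hold planes).
pub struct Frame {
    pub index: usize,
    pub luma: Vec<u8>,
}

/// Runs `decode` on a worker thread and `analyze` on the calling thread.
/// `lookahead` bounds how many decoded frames may be queued at once.
/// Returns the indices of frames for which `analyze` reported a scene cut.
pub fn detect_pipelined<D, A>(mut decode: D, mut analyze: A, lookahead: usize) -> Vec<usize>
where
    D: FnMut() -> Option<Frame> + Send + 'static,
    A: FnMut(&Frame) -> bool,
{
    let (tx, rx) = sync_channel::<Frame>(lookahead);

    let decoder = thread::spawn(move || {
        while let Some(frame) = decode() {
            // `send` blocks while the queue is full. An error means the
            // receiver was dropped, so there is no point decoding further.
            if tx.send(frame).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which ends the receiving loop below.
    });

    let mut cuts = Vec::new();
    for frame in rx {
        if analyze(&frame) {
            cuts.push(frame.index);
        }
    }

    decoder.join().expect("decoder thread panicked");
    cuts
}
```

Each frame is moved through the channel rather than copied, so this kind of pipelining does not by itself reintroduce frame copies. Reusing frame allocations would need an additional return path for buffers, which the sketch leaves out.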

If possible, remove uses of Arc<T> in analyze_next_frame and remove temporary vectors holding Arc<Frame<T>>s. This would need to be done in addition to some work on rav1e to allow creating Plane<T> that are not owned

Not sure how feasible this is... It might be worth investigating in a follow-up PR. But I also think the gains from it would be small, if any.

redzic · 2022-05-18T02:54:56Z

It seems like estimate_inter_costs actually needed more vertical padding (not horizontal padding, which is what I thought originally), which is pretty easy to add without having to do a bunch of extra copying. I've added it, and now the standard scene detect works (and is also much faster) on many videos I've tried. However, it seems to segfault (apparently in a call to copy_nonoverlapping) on HBD videos with standard scene detect, and apparently an assertion sometimes fails too, according to CI. I'll look into this further.
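The following sketch shows one way vertical padding can be added without copying the picture: allocate the padded buffer up front and hand the decoder only the visible region to write into. This is an illustration, not the code in this branch; `PaddedPlane` and its methods are made-up names, it handles 8-bit samples only, and replicating the edge rows into the padding is an assumption about what the padding should contain.

```rust
/// One 8-bit plane whose allocation includes `ypad` extra rows above and
/// below the visible picture. All names here are illustrative.
pub struct PaddedPlane {
    data: Vec<u8>,
    stride: usize, // bytes per row (equal to the width in this sketch)
    height: usize, // number of visible rows
    ypad: usize,   // padding rows on each of top and bottom
}

impl PaddedPlane {
    pub fn new(width: usize, height: usize, ypad: usize) -> Self {
        let stride = width;
        Self { data: vec![0; stride * (height + 2 * ypad)], stride, height, ypad }
    }

    /// Mutable view of the visible rows only. A decoder can write its output
    /// here directly, so no second copy into a padded buffer is needed.
    pub fn visible_mut(&mut self) -> &mut [u8] {
        let start = self.ypad * self.stride;
        let end = start + self.height * self.stride;
        &mut self.data[start..end]
    }

    /// Fills the padding by replicating the first and last visible rows.
    /// This touches only the padding rows, not the picture itself.
    pub fn extend_edges(&mut self) {
        if self.height == 0 {
            return;
        }
        let (stride, ypad, height) = (self.stride, self.ypad, self.height);
        let first = ypad * stride;
        let last = (ypad + height - 1) * stride;
        for row in 0..ypad {
            self.data.copy_within(first..first + stride, row * stride);
            self.data.copy_within(last..last + stride, (ypad + height + row) * stride);
        }
    }

    /// Row `y` in padded coordinates (row 0 is the first padding row).
    pub fn row(&self, y: usize) -> &[u8] {
        &self.data[y * self.stride..(y + 1) * self.stride]
    }
}
```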

redzic · 2022-05-18T20:09:51Z

I've fixed the standard scene detection mode, now I just need to address the other issues (mainly vapoursynth input support and downscaling for scene detection).

BlueSwordM · 2022-05-18T20:10:33Z

Incredible.

redzic · 2022-06-03T18:11:52Z

Surprisingly ffmpeg isn't actually the problem here, it's mostly vapoursynth which has a limited (at least, for our specific use case) decoding API. Vapoursynth seems to be based on the idea of returning "frame refs", which is counter to the idea of decoding into a specific buffer that av1an allocates with specific dimensions to avoid copying altogether.

I think maybe it's still possible to support zerocopy fast scene detect with vapoursynth input, but in the case of standard scene detect and vapoursynth input I think we are forced to copy into a new buffer. There might be some additional optimizations though, like avoiding copying when combined with --sc-downscale-height, but I'll have to see later what can be done about that.

redzic · 2022-07-13T09:29:18Z

Update: I've added support in this PR for vapoursynth input. It seems to work on the scripts I've tried so far (even with HBD and --sc-method standard), but it might not be 100% correct yet. I'll have to double check why it's even working though to be honest, as I'm legitimately not even sure how it's producing the same results as before when I didn't even add any extra special handling for padding.

redzic · 2022-07-13T09:38:56Z

It looks like the standard scene detection mode with VS only happens to work with certain dimensions (but not with others, for example 930x734), so some kind of check will be needed to add the padding when required (and unfortunately we have no choice but to copy the buffer in this case, unless there's a way to configure the padding on the returned frames in VS).

silverbacknet · 2022-07-13T11:06:30Z

unless there's a way to configure the padding on the returned frames in VS

Definitely possible, that's how we did it in AvsP. I have to go back and refresh my memory on exactly how that was done, but it was vital for colorspace conversion for the UI and such. On the other hand, I'm not sure how useful it would be here, since it would be a wash whether VS or Av1an does the bitblt onto a padded region. Maybe there's a fix for the scene detection? I'll take a look at it.

redzic · 2022-08-30T19:33:22Z

Definitely possible, that's how we did it in AvsP. I have to go back and refresh my memory on exactly how that was done, but vital for colorspace conversion for the UI and such.

Definitely let me know if you figure it out/remember, it would simplify a lot of code if it's possible

0xBA5E64 · 2024-05-22T13:01:57Z

With @shssoichiro just now committing f259c44, is it possible we'll see a pulse on this getting merged as well?

Edit: I'm admittedly way out of my league with this but, I'm guessing the real question is if redzic's zerocopy branch of av-scenechange can be merged now that shssoichiro's figured out vapoursynth scene-detection?

shssoichiro · 2024-05-22T13:13:46Z

I do hope to integrate the ffmpeg portions of this as well, at least the portions that we know will work correctly. There are some limitations on making it truly zero-copy, which I believe is why this got stalled in the first place, but pipeless should be possible for ffmpeg as well.

redzic · 2024-05-22T15:28:16Z

Vapoursynth scene detection was working in this branch as well, with it being pipeless all the time but only being zerocopy if the strides between what vapoursynth returns and what rav1e expects matched. FFmpeg decoding with true zerocopy worked even with standard scenechange mode as it configured padding to what rav1e expects before calling the decoder. I can't dedicate time to this now but if someone else wants to pick this up and sort out the edge cases and do a little more testing I'm all good with that.
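The "zerocopy only if the strides match" behaviour described above can be sketched as follows. This is a simplified illustration rather than the branch's actual code: `with_stride` is a hypothetical helper, it covers a single 8-bit plane, and it ignores vertical padding. When the source stride already equals the expected stride, the data is borrowed as-is; otherwise each row is copied into a new buffer with the expected stride.

```rust
use std::borrow::Cow;

/// Returns plane data laid out with `dst_stride` bytes per row.
/// Borrows `src` when its stride already matches (no copy); otherwise
/// copies the `width` visible bytes of each row into a new buffer.
pub fn with_stride<'a>(
    src: &'a [u8],
    src_stride: usize,
    width: usize,
    height: usize,
    dst_stride: usize,
) -> Cow<'a, [u8]> {
    assert!(src_stride > 0 && dst_stride > 0);
    assert!(width <= src_stride && width <= dst_stride);
    assert!(src.len() >= src_stride * height);

    if src_stride == dst_stride {
        // Layouts agree: hand back the source buffer untouched.
        return Cow::Borrowed(&src[..src_stride * height]);
    }

    // Layouts differ: fall back to a row-by-row copy.
    let mut dst = vec![0u8; dst_stride * height];
    for (d, s) in dst.chunks_exact_mut(dst_stride).zip(src.chunks_exact(src_stride)) {
        d[..width].copy_from_slice(&s[..width]);
    }
    Cow::Owned(dst)
}
```

Returning `Cow` keeps the caller's code identical on both paths while making it visible, via `Cow::Borrowed` or `Cow::Owned`, whether a copy actually happened.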

redzic requested a review from shssoichiro May 6, 2022 08:35

redzic force-pushed the zerocopy branch from e3ddf09 to 8fbe842 Compare May 17, 2022 17:52

redzic force-pushed the zerocopy branch from 45e5a93 to 1e4b667 Compare May 21, 2022 15:32

redzic force-pushed the zerocopy branch 2 times, most recently from 90ee178 to 52c1af8 Compare June 26, 2022 09:46

redzic force-pushed the zerocopy branch 2 times, most recently from d6fbf97 to 5c8b172 Compare July 9, 2022 05:48

redzic force-pushed the zerocopy branch 3 times, most recently from 67f44f4 to 437b2f5 Compare July 13, 2022 08:18

redzic force-pushed the zerocopy branch from 6881f25 to 9e37aa8 Compare August 29, 2022 17:52

redzic added 9 commits August 30, 2022 20:07

Zero-copy scene detection

6445197

Update dependencies

36abac8

Fix clippy warning

78a975c

Use local paths

c4368b9

Fix

31bf81a

Partially fix memory leak

a077c52

some stuff

58b58b4

use git av-scenechange

4ea2d11

temporary stuff

fdb9ad8

redzic added 12 commits August 30, 2022 20:07

fixes

8588fae

By some absolute miracle of God himself, this is SOMEHOW working (??)

37cd9d7

Support zerocopy VS scene detection

9d897bd

Don't use local paths for dependencies

27732a2

try fix 1

789ac70

try fix 2

2a532db

stuff

e4d3902

some stuff

d688f71

Use av-scenechange with its own decoders

b53ab97

Use git rav1e

6037cb5

Reuse frame allocations for VS input

1379d5a

temporary

760988d

redzic force-pushed the zerocopy branch from fd49c01 to 760988d Compare August 31, 2022 01:07

Bruh... how many times am I gonna make this mistake

b105b83

CartoonFan mentioned this pull request Sep 12, 2022

Cpu usage on scene detection #665

Closed
