Zero-copy scene detection #627
@redzic Amazing work 🥇 💯 |
I haven't seen the av-scenechange code yet but here are some of my thoughts:
I also think the ideal solution is to not require the padding. I looked into this recently because I realized there's actually a lot of padding on each frame in rav1e and I wanted to reduce memory usage. I got as far as seeing that the motion estimation requires padding because it looks outside of the visible portion of the video when searching. Removing this requirement would probably speed up motion estimation in rav1e (and by extension Failing that, I think number 2 is the solution we are stuck with. I don't like 1 because as you mentioned it changes the results to be less accurate. And 3, unfortunately, I don't think is possible, or at the very least would be pretty messy to implement (a lot of raw pointers and indirection).
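To illustrate why a search that reads outside the visible area needs padding, here is a minimal sketch of a plane whose border is filled by replicating the nearest edge pixel. `PaddedPlane`, `from_visible`, and `get` are hypothetical names for this example, and the single uniform `pad` is an assumption. This is not rav1e's actual `Plane<T>` layout.

```rust
/// Hypothetical padded plane: the visible pixels sit inside a border of
/// `pad` pixels on every side, so a search window can read slightly
/// outside the picture without bounds checks on every access.
struct PaddedPlane {
    data: Vec<u8>,
    stride: usize, // allocated row length: width + 2 * pad
    pad: usize,
}

impl PaddedPlane {
    /// Build a padded plane from tightly packed visible pixels, filling
    /// the border by clamping to the nearest visible pixel.
    fn from_visible(pixels: &[u8], width: usize, height: usize, pad: usize) -> Self {
        assert!(width > 0 && height > 0);
        assert_eq!(pixels.len(), width * height);
        let stride = width + 2 * pad;
        let rows = height + 2 * pad;
        let mut data = vec![0u8; stride * rows];
        for y in 0..rows {
            // Source row, clamped into the visible range.
            let sy = y.saturating_sub(pad).min(height - 1);
            for x in 0..stride {
                // Source column, clamped into the visible range.
                let sx = x.saturating_sub(pad).min(width - 1);
                data[y * stride + x] = pixels[sy * width + sx];
            }
        }
        Self { data, stride, pad }
    }

    /// Read a pixel using visible-area coordinates. Coordinates from
    /// `-pad` up to `dimension + pad - 1` are valid.
    fn get(&self, x: isize, y: isize) -> u8 {
        let px = (x + self.pad as isize) as usize;
        let py = (y + self.pad as isize) as usize;
        self.data[py * self.stride + px]
    }
}
```

The cost is visible in the allocation size: a 2x2 picture with `pad = 1` allocates 16 bytes for 4 visible pixels, which is the memory overhead mentioned above, just at a toy scale.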
One of the major challenges here I think will be that rav1e does not currently depend on ffmpeg, and I'm not sure that they want to. It might be necessary to implement something similar in rav1e's y4m implementation but without relying on the ffmpeg API (which is cursed).
I don't think this is as difficult as it seems. But I agree putting it in a future PR makes sense, since it is a separate optimization from the zero-copy changes.
Not sure how feasible this is... It might be worth investigating in a follow-up PR. But I also think the gains from it would be small, if any. |
It seems like |
I've fixed the standard scene detection mode, now I just need to address the other issues (mainly vapoursynth input support and downscaling for scene detection). |
Incredible. |
Surprisingly ffmpeg isn't actually the problem here, it's mostly vapoursynth which has a limited (at least, for our specific use case) decoding API. Vapoursynth seems to be based on the idea of returning "frame refs", which is counter to the idea of decoding into a specific buffer that av1an allocates with specific dimensions to avoid copying altogether. I think maybe it's still possible to support zerocopy fast scene detect with vapoursynth input, but in the case of standard scene detect and vapoursynth input I think we are forced to copy into a new buffer. There might be some additional optimizations though, like avoiding copying when combined with |
90ee178 to 52c1af8
d6fbf97 to 5c8b172
67f44f4 to 437b2f5
Update: I've added support in this PR for vapoursynth input. It seems to work on the scripts I've tried so far (even with HBD and |
It looks like the standard scene detection mode with VS only happens to work with certain dimensions (but not with others, for example 930x734), so some kind of check will be needed to add the padding when required (and unfortunately we have no choice but to copy the buffer in this case, unless there's a way to configure the padding on the returned frames in VS). |
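One possible shape for such a check, as a rough sketch: round each dimension up to an alignment and see whether it changes. The function names are hypothetical, and the alignment of 8 used in the usage note is an assumption for illustration only; the real requirement would have to come from whatever layout rav1e expects.

```rust
/// Round `value` up to the next multiple of `align`.
/// `align` must be a power of two.
fn align_up(value: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// Hypothetical check: returns true when either dimension is not already
/// a multiple of `align`, i.e. when the frame would need extra rows or
/// columns (and therefore a copy into a padded buffer).
fn needs_padded_copy(width: usize, height: usize, align: usize) -> bool {
    align_up(width, align) != width || align_up(height, align) != height
}
```

Under the assumed alignment of 8, `needs_padded_copy(930, 734, 8)` is true (930 rounds up to 936 and 734 to 736), while `needs_padded_copy(1920, 1080, 8)` is false.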
Definitely possible, that's how we did it in AvsP. I have to go back and refresh my memory on exactly how that was done, but vital for colorspace conversion for the UI and such. On the other hand, not sure how useful it would be here, since it would be a wash whether VS or Av1an does the bitblt onto a padded region. Maybe there's a fix for the scene detection? I'll take a look at it. |
Definitely let me know if you figure it out/remember, it would simplify a lot of code if it's possible |
With @shssoichiro just now committing f259c44, is it possible we'll see a pulse on this getting merged as well? Edit: I'm admittedly way out of my league with this but, I'm guessing the real question is if redzic's |
I do hope to integrate the ffmpeg portions of this as well, at least the portions that we know will work correctly. There are some limitations on making it truly zero-copy, which I believe is why this got stalled in the first place, but pipeless should be possible for ffmpeg as well. |
Vapoursynth scene detection was working in this branch as well, with it being pipeless all the time but only being zerocopy if the strides between what vapoursynth returns and what rav1e expects matched. FFmpeg decoding with true zerocopy worked even with standard scenechange mode as it configured padding to what rav1e expects before calling the decoder. I can't dedicate time to this now but if someone else wants to pick this up and sort out the edge cases and do a little more testing I'm all good with that. |
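The "zerocopy only if the strides match" behaviour described above can be sketched roughly as follows. All names here (`SourcePlane`, `ExpectedLayout`, `PlaneData`, `adapt_plane`) are hypothetical and the planes are modelled as plain byte slices; this is not the code from this branch, nor the VapourSynth or rav1e API.

```rust
/// Hypothetical view of a plane handed back by the decoder.
struct SourcePlane<'a> {
    data: &'a [u8],
    stride: usize, // bytes between the starts of consecutive rows
    width: usize,  // visible bytes per row
    height: usize, // number of rows
}

/// Hypothetical layout the consumer (e.g. the scene detector) expects.
struct ExpectedLayout {
    stride: usize,
    height: usize,
}

/// Either the decoder's memory reused as-is, or a re-laid-out copy.
enum PlaneData<'a> {
    Borrowed(&'a [u8]),
    Copied(Vec<u8>),
}

fn adapt_plane<'a>(src: &SourcePlane<'a>, expected: &ExpectedLayout) -> PlaneData<'a> {
    assert!(src.stride > 0 && expected.stride > 0);
    assert!(src.width <= src.stride && src.width <= expected.stride);
    assert!(src.height <= expected.height);
    let data: &'a [u8] = src.data;
    let needed = expected.stride * expected.height;

    // Fast path: the layouts agree, so the decoder's buffer is used directly.
    if src.stride == expected.stride && src.height == expected.height && data.len() >= needed {
        return PlaneData::Borrowed(&data[..needed]);
    }

    // Slow path: copy row by row into a buffer with the expected stride.
    // Bytes beyond the visible width (and any extra rows) stay zeroed.
    let mut out = vec![0u8; needed];
    let rows = out
        .chunks_exact_mut(expected.stride)
        .zip(data.chunks(src.stride))
        .take(src.height);
    for (dst_row, src_row) in rows {
        dst_row[..src.width].copy_from_slice(&src_row[..src.width]);
    }
    PlaneData::Copied(out)
}
```

Returning an enum keeps the caller's code path identical in both cases, while making it easy to log or count how often the copy fallback is actually hit.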
Scene detection, especially at higher resolutions, is currently bottlenecked by many allocations and copies of uncompressed frames. This PR uses the ffmpeg API directly to decode frames and removes copies of frame data during scene detection, which allows for large speedups compared to the previous code (the speedups generally seem to increase with resolution and bit depth).
Considerations before merging
--sc-downscale-height and --sc-pix-format, this now needs to be manually done with the ffmpeg API as well

Other possible improvements
Arc<T> in analyze_next_frame and remove temporary vectors holding Arc<Frame<T>>s. This would need to be done in addition to some work on rav1e to allow creating Plane<T> that are not owned