Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for lossy animated WebP with alpha channel #988

Open
wants to merge 3 commits into
base: next
Choose a base branch
from

Conversation

torusrxxx
Copy link
Contributor

No description provided.

@@ -63,8 +63,9 @@ protected void readFilterChunk(byte[] ID, int size, Object context, DataInputStr
writeLittleEndianInt(output, size);
output.write(buf);
passthroughBytes(input, output, size - buf.length);
if((size & 1) != 0) // Add padding if necessary
if((size & 1) != 0) { // Add padding if necessary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you’re at it… a space between the if and the ( would be great. 🙂

@torusrxxx
Copy link
Contributor Author

It seems there are needs to use animated WebP in the short term as video playback is not working.
So I added support for animated WebP. But lossless animation as well as any use of lossless encoding is still not supported.
Hope it helps.
There is open source Matroska and WebM files validator mkvalidator. If I move on to WebM, can this be used?

@torusrxxx torusrxxx changed the title fix coding style in WebP filter Add support for lossy animated WebP with alpha channel Nov 8, 2024
@ArneBab
Copy link
Contributor

ArneBab commented Nov 11, 2024

Thank you for your work!

mkvalidator looks like it’s written in C, so we’d have to ship compiled versions of it to different platforms (arm, x86_64, M1, ...), so we really need pure Java tools.

jebml seems to be pure Java, though, and it is lgplv2.1 or later (compatible): https://github.com/Matroska-Org/jebml

Since that has not been changed for 6 years, I’d suggest to just add the sources to the fred git repo. We may need to change it to strip out anything that could cause web requests. It may be possible to directly merge the repository with the fred repository to keep the history of jebml.

@ArneBab
Copy link
Contributor

ArneBab commented Nov 11, 2024

@Bombe do you have time and energy to review commit 739a6dc ? If not, it needs to go into its own pull-request, so I can merge the style changes.

@ArneBab
Copy link
Contributor

ArneBab commented Nov 15, 2024

@torusrxxx tak from FMS detailed how webm could be parsed, and it looks like parsing it "by hand" could actually be easy enough that the risk of a library leaking elements might be bigger than the work to write a low-level validator:

The Webm container specification is a good starting point, since it
prohibits some of the potentially compromising elements that MKV has,
like "NextFilename" and "PrevFilename", which would allow linking to
other MKV files to play, and "Attachment", which is commonly used in
MKV to attach subtitle fonts, but could attach malicious data.
Although Webm still allows some elements that we might want to blank or
inspect.

MKV and Webm files consist of a 0x1A45DFA3 "EBML" element and the
0x18538067 "Segment" element. The element IDs I post match the
documentation, so they contain the VINT marker (counting from the msb to
the topmost bit that's set to 1 indicates the ID/content length)

I mainly used the Matroska element reference
(https://www.matroska.org/technical/elements.html) for this, because
the EBML specification I linked before doesn't address the
Matroska-specific stuff, and the Webm specification is missing the
element IDs. However the element reference indicates some elements as
supported by Webm, that the Webm specification lists as "unsupported".
To check actual Webm files for the used elements, you can use the info
tool in MKVToolNix GUI.

Since MKV has a lot of elements, I think that a whilelist approcach
might be best. Here is a list of the minimum used by a common Webm file,
but will include some common sense optional elements, that are commonly
used to e.g. allow for fast seeking in the timeline. I will use dashes
to indicate at which level elements should be.

0x1A45DFA3 "EBML" is the header and must contain:

  • 0x4286 EBMLVersion
  • 0x42F7 EBMLReadVersion
  • 0x42F2 EBMLMaxIDLength
  • 0x42F3 EBMLMaxSizeLength
  • 0x4282 DocType
  • 0x4287 DocTypeVersion
  • 0x4285 DocTypeReadVersion

0x18538067 "Segment" contains all the audio, video and subtitle tracks
and related meta data.

  • 0x114D9B74 "SeekHead" (Optional, speeds up seeking within file)
    -- 0x4DBB "Seek" (Mandatory if "SeekHead" is included)
    --- 0x53AB "SeekID" (Mandatory if "SeekHead" is included)
    --- 0x53AC "SeekPosition" (Mandatory if "SeekHead" is included)

  • 0xEC "Void" (Common placeholder to allow updating "SeekHead")

  • 0x1549A966 "Info" (Mandatory)
    -- 0x2AD7B1 "TimestampScale" (Mandatory)
    -- 0x4489 "Duration" (Optional)

  • 0x1654AE6B "Tracks" (Optional, but would be pointless without this)
    -- 0xAE "TrackEntry" (Mandatory if "Tracks" is included.)
    --- 0xD7 "TrackNumber" (Mandatory per "TrackEntry")
    --- 0x73C5 "TrackUID" (Mandatory per "TrackEntry")
    --- 0x83 "TrackType" (Mandatory per "TrackEntry")
    --- 0x86 "CodecID" (Mandatory per "TrackEntry")
    --- 0xB9 "FlagEnabled" (Supposed to be mandatory, but I can't find it in
    actual files)
    --- 0x88 "FlagDefault" (Mandatory per "TrackEntry")
    --- 0x55AA "FlagForced" (Mandatory for subtitles according to Matroska
    documentation, but Webm spec says it can be used for any track)
    --- 0x9C "FlagLacing" (Mandatory per "TrackEntry")
    --- 0x23E383 "DefaultDuration" (Optional, but contains framerate, so we
    should keep it)
    --- 0x22B59C "Language" (Mandatory per "TrackEntry")
    --- 0x56AA "CodecDelay" (Mandatory, but only seems to be used for audio)
    --- 0x56BB "SeekPreRoll" (Mandatory, but only seems to be used for
    audio)
    --- 0xE0 "Video" (Included for video tracks.)
    ---- I'm ignoring a bunch of supposedly "mandatory" elements here that
    not even YouTube is using in their Webm files.
    ---- 0xB0 "PixelWidth" (Mandatory)
    ---- 0xBA "PixelHeight" (Mandatory)
    ---- 0x55B0 "Colour" (Optional)
    ----- Color management elements go here. Not including them, since there
    are a lot
    --- 0xE1 "Audio" (Included for audio tracks.)
    ---- 0xB5 "SamplingFrequency" (Mandatory)
    ---- 0x9F "Channels" (Mandatory)
    ---- 0x6264 "BitDepth" (Optional)
    ---- I'm ignoring another bunch of "mandatory" elements that nobody
    is using.
    --- 0x63A2 "CodecPrivate" (Optional. Commonly contains codec headers.
    Should probably be filtered as part of the audio/video streams.)

  • 0xEC "Void" (Placeholder after track definitions)

  • 0x1F43B675 "Cluster" (Optional, but these contain the actual
    audio/video/subtitle data. There usually are a lot of clusters, but
    really short clips might only have one.)
    -- 0xE7 "Timestamp" (Mandatory)
    -- 0xA3 "SimpleBlock" (Only 0xA1 "Block", the more bloated predecessor
    is listed as mandatory, and supposedly can be nested inside 0xA0
    "BlockGroup", but I've only seen files using 0xA3, never 0xA1 or
    0xA0. There is also 0x75A1 "BlockAdditions", which is only allowed on
    level 3, so presumably would only be used inside 0xA0 "BlockGroup")
    --- Raw frames go here. We finally made it to the actual stream data.
    One "SimpleBlock" can contain multiple frames.

  • 0x1C53BB6B "Cues" (Optional, but should be used for files that are
    not live streams, to speed up seeking in the timeline. Note that
    normally this is located at the end of the file, but the Webm
    specification recommends putting it before the clusters, to allow
    seeking in incomplete files.)
    -- 0xBB "CuePoint" (Mandatory)
    --- 0xB3 "CueTime" (Mandatory)
    --- 0xB7 "CueTrackPositions" (Mandatory)
    ---- 0xF7 "CueTrack" (Mandatory)
    ---- 0xF1 "CueClusterPosition" (Mandatory)
    ---- 0xF0 "CueRelativePosition" (Optional)
    ---- 0xB2 "CueDuration" (Optional)
    ---- 0x5378 "CueBlockNumber" (Optional)

I think a good first step would be to have a parser for EBML elements,
that iterates over the hierarchy, and:

  • Checks if the element is on the whitelist.
  • Checks if the element is on an allowed level within the hierarchy.
  • Maybe looks ahead if the indicated element size is correct. There's
    nothing to mark the end of an element, but it should be followed by
    another element ID that could be checked against a list of known
    elements. (Or end of file.) I don't know if there are any security
    risks if libwebm gets incorrect element sizes and tries to read data
    outside the element. I assume it should be mature enough to not have
    any buffer overflows, but I'm not sure if we should trust it.
  • Checks if the elements are set to valid values.
  • Passes the content of simpleblock elements to the appropriate stream
    filters.

Regarding things that should be zeroed out, if we replace anything
not whitelisted with 0xEC "Void" and zero out its content, there
shouldn't be much left.

Some of the most commonly used elements I excluded from the list, that
might or might not be harmless:

  • 0x536E "Name" is located under the track element and stores track
    names, but only makes sense if there are multiple audio/subtitle
    tracks to select.
  • 0x7BA9 "Title" is located under the segment element and stores the
    video title that's displayed by most players.
  • 0x1254C367 "Tags" and its children are commonly used to add all
    kinds of meta data.
  • 0x1043A770 "Chapters" and its children are commonly used to add
    chapters to videos.

Something that caught my eye are the elements related to content
encryption: "ContentEncryption", "ContentEncAlgo", "ContentEncKeyID",
"ContentEncAESSettings", "AESSettingsCipherMode", "EncryptedBlock",
which if found should get the whole file blocked by the filter, since
that would be a DRM restricted file. I've never seen this used.

I wonder if the filter should restrict the layout of the file in other
ways. E.g. limiting the total number of tracks, since it's unlikely to
have multiple video tracks, or more than a handful of audio tracks.

@torusrxxx
Copy link
Contributor Author

I still think a library is much safer and easier to use for complex formats like MKV, PDF and so on. Otherwise we will need to have write lots of code just to avoid IndexOutOfBoundsException and it will be hard to maintain. But I have not used jebml before, so more time is needed before I come to a conclusion.

Also, starting with WebM, I think partial download (i.e. accept-ranges & range) will become important. This is used by the browser to seek to another location, important for video streaming. So a new version of content filter interface will be needed. This is possible for WebM because it uses void element to remove unsafe content. After downloading and checking certain parts of the file, the filter can tell freenet that no more ContentFilterException is expected to be thrown afterwards because it can deal with all of them using the void element rather than ContentFilterException. Afterwards, freenet can handle allow partial download requests from the browser and pass it to WebM filter. The WebM filter then decides how and where to get the requested range of data from the unfiltered file, or it doesn't need to download because the data downloaded will be filled with void anyway. For existing filters, freenet can also support partial downloads, but it just filters the entire file for each request and then send the requested part to the browser.

@ArneBab
Copy link
Contributor

ArneBab commented Nov 16, 2024

For jebml the difference is between reviewing jebml and writing new code. We have to make sure that jebml does not violate user privacy via some convenience feature. That’s much easier than a full quality review, but it isn’t just "drop in library and use it", because our privacy requirements are higher than what we can expect from other projects. For example PDF can send web-requests.¹

Partial content is something we can’t really do: that does not work with how encryption and retrieving content works on the network. But http range requests always allow sending more data than requested. So it may be possible to send http-range headers when a range request is sent, but simply provide all data. https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests#partial_request_responses

That’s why our video on demand solution (generate media site) pre-chunks the video before insertion and assembles it on the client during playback.

¹ "In addition to visible links in a PDF document, form fields can contain hidden JavaScript calls that open a page in a browser or silently requests data from the Internet." — https://www.adobe.com/devnet-docs/acrobatetk/tools/AppSec/external.html

@ArneBab
Copy link
Contributor

ArneBab commented Nov 19, 2024

PS: as an aside: using a prepared stream instead of range requests is similar to what DASH and HLS do: pre-chunk the data and reference it from a playlist file. So we aren’t that far from the state of the art there. I’m not sure how much data we can get from downloads in progress. An entry point to check that would be ClientGet.java.

@torusrxxx
Copy link
Contributor Author

It's possible to random access an encrypted file if a block cipher mode designed for random access (for example GCM) instead of CBC is used. It's also possible to random access FEC protected data. Compressed data is usually not random accessible, but I think videos aren't compressed by Freenet. So, there's nothing fundamental that prohibits range requests to encrypted videos. And this is the ultimate solution for gapless playback (playlist isn't, even music playlist can have a gap while the next music is loading, and even while the current music is restarted).

And the author no longer needs to upload the same video twice, one chunked and another one full.

@ArneBab
Copy link
Contributor

ArneBab commented Nov 20, 2024

Videos can be compressed — it depends on whether compression helps. And for random access of data during download, you’ll need to go deep into the persistent-temp layer (note also multi-segment files built from several splitfiles), and it may be hard to do without adding huge complexity by entangling filters much with storage and the request layer.

If you want to get there, that can have a huge benefit (partial file access has been a long standing wish), but I need to warn you, that that’s a huge change. You’ll need a persistence I’ve seen rarely — or a plan in which every smaller step on the way has its own benefits, so you can work on it with regular successes.

@torusrxxx
Copy link
Contributor Author

Splitfiles need to be updated as well, why can't a freesite have unlimited number of sub pages like https://github.com/hyphanet? This is forcing freesites like FMS archive to use a longer URL than necessary, and makes it almost impossible to have an anonymous world map as a freesite.

I have helped an open source project to become #1 in the field in terms of user friendliness. It takes many years. This is a project that should have become #1 in the field as well, it just has a lot of issues currently, so let's see.

@ArneBab
Copy link
Contributor

ArneBab commented Nov 20, 2024

There are already multilevel metadata and subcontainers, so a site can have unlimited sub-pages, that just becomes inefficient, because files are assigned to different sub-containers, so access delays can happen at unexpected points (clicking a link to a html file can cause the retrieval of a new 4 MiB container). My personal site is currently at 1569 files.

Improvements to DefaultManifestPutter.java could help there, but those are hard to get right. The first thing would be to only use an intermediate CHK for the metadata if the metadata does not fit into an SSK. That would increase the lifetime of Sharesites and Spider indexes a lot, but the two times I tried I had to stop before getting it done. It looks like it needs refactoring of the logic.

@ArneBab
Copy link
Contributor

ArneBab commented Nov 30, 2024

@bertm can you review this again? 8840a22 — is it good to go?

(the style changes are good to go, the functional change seems to be something you mostly already checked)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants