Support Unicode CJK CMap #11

Open
neko-para opened this issue Sep 27, 2022 · 0 comments


neko-para commented Sep 27, 2022

There are a lot of CJK CMaps, but some of them are simply UTF-16BE. We can check the name of an otherwise unknown encoding and treat any encoding whose name begins with Uni as UTF-16BE.
Here is some relevant information from the PDF 1.5 spec, p. 404:
[2 images: excerpts from the spec]

// fontentry.rs
let source_encoding = match base_encoding {
    Some(BaseEncoding::StandardEncoding) => Some(Encoding::AdobeStandard),
    Some(BaseEncoding::SymbolEncoding) => Some(Encoding::AdobeSymbol),
    Some(BaseEncoding::WinAnsiEncoding) => Some(Encoding::WinAnsiEncoding),
    Some(BaseEncoding::MacRomanEncoding) => Some(Encoding::MacRomanEncoding),
    Some(BaseEncoding::MacExpertEncoding) => Some(Encoding::AdobeExpert),
    ref e => {
        // we can do the check here, return AdobeStandard if matches.
        warn!("unsupported pdf encoding {:?}", e);
        None
    }
};
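
The prefix check proposed above could look roughly like the sketch below. It is not taken from the project: `looks_like_unicode_cmap` and `decode_utf16_be` are hypothetical helper names, and only the standard library is used. Note that the Uni*-UCS2-* CMaps are UCS-2, which matches UTF-16BE only for code points in the Basic Multilingual Plane, so treating every Uni-prefixed name as UTF-16BE is an approximation.

```rust
/// Hypothetical helper (not part of the project): returns true if a CMap
/// name looks like one of the predefined Unicode CMaps, such as
/// "UniGB-UCS2-H" or "UniJIS-UTF16-H".
fn looks_like_unicode_cmap(name: &str) -> bool {
    name.starts_with("Uni")
}

/// Hypothetical helper: decodes a byte string as big-endian UTF-16.
/// Returns None if the length is odd or the data contains an
/// unpaired surrogate.
fn decode_utf16_be(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    // Combine each pair of bytes into one 16-bit code unit, high byte first.
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}
```

In the fallback arm of the match, the encoding name would first be tested with something like `looks_like_unicode_cmap`, and only names that fail the test would reach the warning. How the result maps onto the project's `Encoding` type is left open here.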