Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding-specific variants? #27

Open
zarybnicky opened this issue Aug 8, 2018 · 5 comments
Open

Encoding-specific variants? #27

zarybnicky opened this issue Aug 8, 2018 · 5 comments

Comments

@zarybnicky
Copy link

Using file-embed with non-ASCII files is rather fragile, as it depends on the build system's locale, and it doesn't work at all when building e.g. in a plain docker container (simonmichael/hledger#420 (comment)). In hledger, we've solved this by setting the handle encoding to UTF-8 prior to reading from it, which solved our problem.

I'm actually surprised that I haven't any issues here regarding this - maybe building in a pure Nix shell or a plain Docker container is not such a common scenario? Nevertheless, I think that extending the API with variants of the existing functions like embedFile' :: TextEncoding -> FilePath -> Q Exp and others would be a useful change not only for the hledger project.

What do you think?

@zarybnicky zarybnicky changed the title Locale-specific variants? Encoding-specific variants? Aug 8, 2018
@snoyberg
Copy link
Owner

snoyberg commented Aug 9, 2018

embedFile is working with ByteStrings, and therefore shouldn't care about the character encoding. Do you have some sample reproducing case them demonstrates the problem? If so, we should just fix embedFile.

@steshaw
Copy link

steshaw commented Aug 9, 2018

The hledger code is using embedStringFile which in turn uses Prelude.readFile (beware!). An alternative would be to try something like:

decodeUtf8 $(makeRelativeToProject "embeddedfiles/hledger.1" >>= embedFile)

To cut down on the duplication, hledger could defined it's own helper in a separate module:

module TH where

...

embedFileText :: FilePath -> Q Exp
embedFileText filePath =
  [| decodeUtf8 $(makeRelativeToProject filePath >>= embedFile) |]

so you can use it like:

$(embedFileText "embeddedfiles/hledger.1")

@snoyberg
Copy link
Owner

snoyberg commented Aug 9, 2018

I'd be in favor of deprecating embedStringFile and providing an embedFileUtf8 instead, which is a naming convention I've been moving towards with libraries like rio anyway. But yes, that approach should work.

@zarybnicky
Copy link
Author

Thanks for the response. Yes, I didn't think using embedFile would work better (I thought it'd do sth like BC.pack . readFile, don't know why), but it seems that would work just as well. Deprecating embedStringFile sounds sensible, to prevent people from shooting themselves in the foot.

@runeksvendsen
Copy link

runeksvendsen commented Nov 20, 2018

I just ran into this issue. Perhaps it would be sufficient to just add a note to the embedStringFile docs saying it doesn't work with binary data?

It's not entirely obvious why embedStringFile can fail with binary data since, in the end, I can specify e.g. Data.ByteString.Lazy.ByteString as the output type of embedStringFile (which seems like it shouldn't care about character encoding). Except it does, because the IsString interface requires decoding from a String (which isn't obvious at all when glancing at embedStringFile).

Full error:

    • Exception when trying to run compile-time code:
        /Users/runesvendsen/code/crypto-depth-db/test/data/test.json.zip: hGetContents: invalid argument (invalid byte sequence)
      Code: makeRelativeToProject "test/data/test.json.zip"
              >>= embedStringFile
    • In the untyped splice:
        $(makeRelativeToProject "test/data/test.json.zip"
            >>= embedStringFile)
   |
49 |     [ $(makeRelativeToProject "test/data/test.json.zip" >>= embedStringFile)
   |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants