Support punycode #9

Open
porterjamesj opened this issue Apr 15, 2014 · 8 comments

Comments

@porterjamesj
Contributor

I'm not sure about the details, but it might be nice to allow URIs of non-ASCII text (i.e. use String rather than ASCIIString in the type definition). The RFC is a bit vague on this. Perhaps it isn't technically allowed but it seems possible to encounter in the wild (e.g. ☃.net will resolve in a browser). On a practical level it's somewhat annoying to not be able to pass UTF8Strings or SubStrings thereof to methods in Requests.jl.

@Keno
Contributor

Keno commented Apr 15, 2014

The browser does the appropriate mangling of the URI which we probably would have to implement.

@porterjamesj
Contributor Author

For now, could we just add a URI constructor that converts String inputs to ASCIIString? It won't actually handle Unicode, but it will at least let callers pass strings that are typed as UTF8String yet contain only ASCII characters to Requests.get, etc.
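
A minimal sketch of that stop-gap, written for current Julia, where `ASCIIString` and `UTF8String` no longer exist and `String` covers both. The helper name `ascii_uri_string` is hypothetical and not part of URIParser.jl or Requests.jl; it only checks that the input is pure ASCII and normalizes it to a `String`.

```julia
# Hypothetical helper, not an API of URIParser.jl or Requests.jl.
# Accepts any string type (String, SubString, ...) and returns a plain
# String, rejecting input that contains non-ASCII characters.
function ascii_uri_string(s::AbstractString)
    isascii(s) || throw(ArgumentError("URI contains non-ASCII characters: $(repr(s))"))
    return String(s)
end

# A SubString that happens to be pure ASCII is accepted.
whole = "http://example.com/path"
ascii_uri_string(SubString(whole, 1, 18))   # "http://example.com"
```

A non-ASCII input such as "http://☃.net" raises an `ArgumentError` here, which matches the "won't actually handle Unicode" caveat.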

@Keno
Contributor

Keno commented Apr 15, 2014

That sounds reasonable.

@porterjamesj
Contributor Author

I'll make a PR.

@tanmaykm
Member

I think the hostname part needs to be encoded in punycode (http://www.faqs.org/rfcs/rfc3492.html) and the path percent-escaped UTF8.
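
The split described above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from URIParser.jl or HTTP.jl: all function names are hypothetical, the encoder follows the RFC 3492 encoding procedure without overflow checks, and the IDNA preparation steps (case folding, Unicode normalization, label length limits) are skipped entirely.

```julia
# Punycode parameters from RFC 3492, section 5.
const BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700
const INITIAL_BIAS, INITIAL_N = 72, 128

# Map a digit value 0..35 to 'a'..'z', '0'..'9'.
encode_digit(d) = d < 26 ? Char(Int('a') + d) : Char(Int('0') + d - 26)

# Bias adaptation (RFC 3492, section 6.1).
function adapt(delta, numpoints, firsttime)
    delta = firsttime ? delta ÷ DAMP : delta ÷ 2
    delta += delta ÷ numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) ÷ 2
        delta ÷= BASE - TMIN
        k += BASE
    end
    return k + ((BASE - TMIN + 1) * delta) ÷ (delta + SKEW)
end

# Encode one label (no "xn--" prefix), following RFC 3492, section 6.3.
function punycode_label(label::AbstractString)
    cps = [Int(c) for c in label]
    basic = filter(<(128), cps)
    out = IOBuffer()
    foreach(c -> print(out, Char(c)), basic)   # copy ASCII code points first
    h = b = length(basic)
    b > 0 && print(out, '-')                   # delimiter after the basic part
    n, delta, bias = INITIAL_N, 0, INITIAL_BIAS
    while h < length(cps)
        m = minimum(filter(>=(n), cps))        # next code point to insert
        delta += (m - n) * (h + 1)
        n = m
        for c in cps
            if c < n
                delta += 1
            elseif c == n
                q, k = delta, BASE
                while true                     # emit delta as a variable-length integer
                    t = clamp(k - bias, TMIN, TMAX)
                    q < t && break
                    print(out, encode_digit(t + (q - t) % (BASE - t)))
                    q = (q - t) ÷ (BASE - t)
                    k += BASE
                end
                print(out, encode_digit(q))
                bias = adapt(delta, h + 1, h == b)
                delta = 0
                h += 1
            end
        end
        delta += 1
        n += 1
    end
    return String(take!(out))
end

# Hostname: punycode each non-ASCII label and add the "xn--" prefix.
host_to_ascii(host::AbstractString) =
    join((isascii(l) ? l : "xn--" * punycode_label(l) for l in split(host, '.')), '.')

# Path: percent-escape every UTF-8 byte outside the unreserved set and '/'.
function escape_path(path::AbstractString)
    io = IOBuffer()
    for byte in codeunits(path)
        c = Char(byte)
        if isascii(c) && (isletter(c) || isdigit(c) || c in "-._~/")
            print(io, c)
        else
            print(io, '%', uppercase(string(byte, base = 16, pad = 2)))
        end
    end
    return String(take!(io))
end

host_to_ascii("☃.net")   # "xn--n3h.net"
escape_path("/☃")        # "/%E2%98%83"
```

The two parts use different encodings because DNS labels are restricted to a small ASCII alphabet, while the path only needs to be byte-safe, so percent-escaping the UTF-8 bytes is enough there.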

@IainNZ changed the title from "Be more permissive about string type?" to "Support punycode" on Sep 10, 2015

@malmaud
Contributor

malmaud commented Oct 30, 2015

punycode is fairly complex - see an example implementation. Is anyone up for having a go at it?

@randyzwitch

Roughly two years old at this point...is this still a desirable feature/anyone going to claim this?

@samoconnor
Contributor

another ~2 years later...

The URI part of this seems to currently not-fail in HTTP.jl:

julia> x = HTTP.URI("http://☃.net")
HTTP.URI("http://☃.net")

julia> HTTP.URIs.showparts(x)
HTTP.URI("http://☃.net"
    scheme = "http",
    userinfo = "" (absent),
    host = "☃.net",
    port = "" (absent),
    path = "",
    query = "" (absent),
    fragment = "" (absent))

getaddrinfo still fails

julia> HTTP.get("http://☃.net")
ERROR: non-ASCII hostname: ☃.net
Stacktrace:
 [1] getaddrinfo(::Function, ::String) at ./socket.jl:619

And there is this: https://github.com/apricis/Punycoder.jl/blob/master/punycoder.jl
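
Until punycode conversion is built in, one workaround is to hand the resolver the ASCII form of the host directly. The sketch below assumes only the `Sockets` standard library; the helper name `try_resolve` is hypothetical, and `xn--n3h.net` is the punycode form of `☃.net`. Whether the lookup succeeds depends on the network and on the domain still being registered, so failures are turned into `nothing` rather than an error.

```julia
using Sockets

# Hypothetical helper: return the resolved address, or `nothing` if the
# lookup fails for any reason (non-ASCII host, DNS failure, no network).
function try_resolve(host::AbstractString)
    try
        return getaddrinfo(host)
    catch
        return nothing
    end
end

try_resolve("xn--n3h.net")   # ASCII (punycode) form of ☃.net
try_resolve("☃.net")         # rejected by getaddrinfo as a non-ASCII hostname
```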
