Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encode space to '%20' as per url standard #928

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

kaffarell
Copy link

Previously the space character was exclusively encoded to '+'. This is wrong, as the URL Standard 0 specifies that the default is '%20'. Another function has been introduced as well, which replicates the old behavior and converts spaces to '+'.
Notice that this breaks the default behavior and could lead to bugs.

Fixes: #927
Fixes: #888

@kaffarell
Copy link
Author

@valenting let me know what you think!

Previously the space character was exclusively encoded to '+'. This is
wrong, as the URL Standard [0] specifies that the default is '%20'.
Another function has been introduced as well, which replicates the old
behavior and converts spaces to '+'.
Notice that this breaks the default behavior and could lead to bugs.

[0]: https://url.spec.whatwg.org/#string-percent-encode-after-encoding

Fixes: servo#927
Fixes: servo#888

Signed-off-by: Gabriel Goller <[email protected]>
@Manishearth
Copy link
Member

I am wary about breaking upstream users, especially the browser users that use this crate (who will likely care about nitty gritty url spec behavior)

@valenting @mrobinson can you check whether the Gecko/Servo callsites are all ones that need spaceIsPlus by default?

We can invert which behavior is the default here to solve this if needed.

Copy link

codecov bot commented Apr 30, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 1 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (main@de947ab). Click here to learn what that means.

❗ Current head c23893a differs from pull request most recent head 4e3f1b5. Consider uploading reports for the commit 4e3f1b5 to get more accurate results

Files Patch % Lines
form_urlencoded/src/lib.rs 50.00% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #928   +/-   ##
=======================================
  Coverage        ?   81.88%           
=======================================
  Files           ?       20           
  Lines           ?     3550           
  Branches        ?        0           
=======================================
  Hits            ?     2907           
  Misses          ?      643           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@valenting
Copy link
Collaborator

I am wary about breaking upstream users, especially the browser users that use this crate (who will likely care about nitty gritty url spec behavior)

@valenting @mrobinson can you check whether the Gecko/Servo callsites are all ones that need spaceIsPlus by default?

We can invert which behavior is the default here to solve this if needed.

As far as I can tell servo doesn't use this method https://github.com/search?q=repo%3Aservo%2Fservo+form_urlencoded&type=code

Nor does Gecko https://searchfox.org/mozilla-central/search?q=byte_serialize&path=&case=false&regexp=false

This will be a breaking change anyway.

@Manishearth
Copy link
Member

Since this crate is used a lot should we implement this in a way that is non breaking (with reversed defaults, strongly documented), and perhaps file an issue for a potential 3.0 where we start listing breaking changes?

I suspect 99% of users will not care about this anyway. I feel like doing a 3.0 just for this is a bit much.

Fixed link to url spec.

Signed-off-by: Gabriel Goller <[email protected]>
@kaffarell
Copy link
Author

I am wary about breaking upstream users, especially the browser users that use this crate (who will likely care about nitty gritty url spec behavior)

This is a breaking change, although a very small one. As you said it will probably only affect those who care about the url spec, which in turn, should know this case already!

Since this crate is used a lot should we implement this in a way that is non breaking (with reversed defaults, strongly documented), and perhaps file an issue for a potential 3.0 where we start listing breaking changes?

I have to disagree with the reversed defaults. It's better if we do a 3.0 release and keep it this way, otherwise we will not be 'url-spec compliant'. The url-spec clearly states that the default-behavior should be '%20' and not '+'.

IMO a 3.0 release (or even a minor release if we're feeling adventurous :) ) is the way to go!

@gibfahn
Copy link

gibfahn commented Jun 21, 2024

It would be nice to have an opt-in way to do this at least (which wouldn't have to be a breaking change hopefully), that way users who need this encoding can still use this crate while plans for a breaking change are worked out.

If nothing else this will probably break a bunch of tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants