Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StrExt: to_lowercase_smolstr & friends #69

Merged
merged 1 commit into from
Feb 8, 2024

Conversation

alexheretic
Copy link
Contributor

Add smol_str::StrExt trait providing methods for strs producing SmolStrs.

  • to_lowercase_smolstr
  • to_uppercase_smolstr
  • to_ascii_lowercase_smolstr
  • to_ascii_uppercase_smolstr

These allow case conversion of small strings sans allocation.

The trait is sealed to allow adding additional methods in future without breaking semver.

Resolves #68

Copy link
Member

@Veykril Veykril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through #68, I would've expected these to be just inherent methods on SmolStr that would shadow the str versions. I am not a big fan of extension traits, and the implementation exposed here could be implemented by downstream users a like (unlike the inherent methods making use of the hidden repr to clone accordingly)

@alexheretic
Copy link
Contributor Author

alexheretic commented Feb 7, 2024

I would've expected these to be just inherent methods on SmolStr that would shadow the str versions

That would be a breaking change I think, as code that currently compiles to return String would start returning SmolStr. Though they could be SmolStr methods with distinct names.

The other advantage of the extension trait is these methods can be called on &str / anything that can produce a &str making it more flexible.

the implementation exposed here could be implemented by downstream users a like

This isn't true as the trait is sealed.

@alexheretic
Copy link
Contributor Author

An example of why having &str methods is better is: Strip prefix then to lowercase. If the SmolStr lowercase method is only for SmolStr a user would have to:

  • convert the stipped str to SmolStr (potentially allocating)
  • call SmolStr::to_lowercase (potentially allocating again)

So this could actually allocate more than just using str::to_lowercase.

Whereas StrExt::to_lowercase_smolstr would only allocate 0-1 times here.

There are tons of methods manipulating strings -> &str that this applies to.

the implementation exposed here could be implemented by downstream users a like

This isn't true as the trait is sealed.

🤔 or perhaps you meant that this logic doesn't need to be in the crate because it can be implemented outside? That is true, but this is generally useful code that would fit here for a similar reason to_lowercase fits in std.

@Veykril
Copy link
Member

Veykril commented Feb 8, 2024

That would be a breaking change I think,

Well we can always bump the minor version, that's not an issue.

So this could actually allocate more than just using str::to_lowercase.

Whereas StrExt::to_lowercase_smolstr would only allocate 0-1 times here.

There are tons of methods manipulating strings -> &str that this applies to.

That's fair, I just got a bit confused as the issue you opened seem to imply the things that what I wrote here, instead of what the PR tackles 😅 (I do think there is value in the inherent methods nevertheless, as that would be better when going from SmolStr to SmolStr via StrExt::to_lowercase_smolstr)

or perhaps you meant that this logic doesn't need to be in the crate because it can be implemented outside

That was what I meant yes.

I'll merge it for now as it seems reasonable given the perf implications. Thanks!

@Veykril Veykril merged commit fe7064e into rust-analyzer:master Feb 8, 2024
1 check passed
@alexheretic alexheretic deleted the str-ext branch February 8, 2024 17:35
@alexheretic
Copy link
Contributor Author

Yep i changed my outlook after raising the issue and actually implementing. Sorry for the confusion.

I think it would be possible to add specialised SmolStr methods later if we had optimisations in mind using the internals. But the "clone an arc/static if already lowercase" case i don't think works in practice as you have to iterate the whole string to determine that.

For small strings i think StrExt is already optimal as is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Optimised str methods: `to_*case
2 participants