Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add keep-error-msg feature #740

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

stevefan1999-personal
Copy link

@stevefan1999-personal stevefan1999-personal commented Oct 28, 2024

This PR allows the user to remove the message embedded into each Error of the read/write/build crates, since Rust is terrible at removing dead code. The string will stay due to Error construction, which is effectively discarded, even if the user doesn't care about the error message (aka just want to have a boolean indicating error), which means precise ROM space wasted.

@philipc
Copy link
Contributor

philipc commented Oct 28, 2024

This is the sort of thing for which you should have created an issue first.

Are you using object directly, or as a dependency of something else? Are you using all of read/write/build? Are you aware of any other crates that have a feature like this?

@stevefan1999-personal
Copy link
Author

This is the sort of thing for which you should have created an issue first.

Are you using object directly, or as a dependency of something else? Are you using all of read/write/build? Are you aware of any other crates that have a feature like this?

I'm using object directly for a PE resolver, and I noticed there are useless strings in the binary that is unused which cannot be eliminated by DCE

@philipc
Copy link
Contributor

philipc commented Oct 29, 2024

So can I infer that the answers to my other questions are you are only using the read module, and you're not aware of any other crates with a feature like this.

How many bytes does this save you, and how large is your image?

A feature like this doesn't come without cost in terms of usability. For example, all existing users that use default-features = false must now add this feature if they want the error message. It'd be better if we could solve the issue without adding a new feature.

@stevefan1999-personal
Copy link
Author

stevefan1999-personal commented Oct 29, 2024

So can I infer that the answers to my other questions are you are only using the read module, and you're not aware of any other crates with a feature like this.

How many bytes does this save you, and how large is your image?

A feature like this doesn't come without cost in terms of usability. For example, all existing users that use default-features = false must now add this feature if they want the error message. It'd be better if we could solve the issue without adding a new feature.

Yes, at the moment I'm only using the read module, and although this only saved me merely 1k in binary size in total just by trimming out the error. I need to test if fat LTO will also help out though. It is not just the error message, but also other things such as extra instructions, space allocation, memcpy due to object construction. There is a lot of them and those error construction also prevented code inlining.

My guts feeling tells me that we should make those error message categorized as enum so we can better inline it (in other word, like string intern).

@philipc
Copy link
Contributor

philipc commented Oct 29, 2024

It is not just the error message, but also other things such as extra instructions, space allocation, memcpy due to object construction. There is a lot of them and those error construction also prevented code inlining.

I haven't looked at the code generation, but my instinct is there should be little overhead for &'static str. What need is there for space allocation and memcpy?

My guts feeling tells me that we should make those error message categorized as enum so we can better inline it (in other word, like string intern).

Yes, I think it's a better option than a feature, but I'm not sure it's not going to be a huge difference in code size (potentially changes from a 128-bit slice pointer, to a u8 or u16).

@stevefan1999-personal
Copy link
Author

It is not just the error message, but also other things such as extra instructions, space allocation, memcpy due to object construction. There is a lot of them and those error construction also prevented code inlining.

I haven't looked at the code generation, but my instinct is there should be little overhead for &'static str. What need is there for space allocation and memcpy?

Because Rust requires you to construct the whole object (that includes everything even if you don't need it) in order to maintain compliant drop semantics, even if you don't use the error yourself. Now you also need to take in account for the extra code that constructs those objects as well, because LLVM is not going to smartly deduce your struct to become a ZST even if you don't use the &'static str or String at all, it has to be very conservative, and that's why it takes a lot more space and code than expected and makes a lot more savings.

My guts feeling tells me that we should make those error message categorized as enum so we can better inline it (in other word, like string intern).

Yes, I think it's a better option than a feature, but I'm not sure it's not going to be a huge difference in code size (potentially changes from a 128-bit slice pointer, to a u8 or u16).

We could use derive_more, and it also supports no_std which nicely fit our use case. You can think of it as a better version of thiserror.

@philipc
Copy link
Contributor

philipc commented Nov 10, 2024

I still don't understand what you mean. read::Error is simply a newtype for &'static str. There is no whole object to construct. It's just a pointer+length that can be copied around. It should be able to exist in registers only. If I change it to a u8, it only saves 4.5k of code for the readobj example (which includes a lot of error handling and should cover the majority of the possible errors).

derive_more is not an option. We avoid dependencies where possible, both because this crate is used in libstd via backtrace-rs, and because it reduces the amount of maintenance.

A downside of using an enum is that users who do want to print the errors will be forced to pull in the strings for every possible error, not just the ones they might encounter, so it's still a tradeoff. It's also more effort to write the code, since the error definition is separate from the code where the error occurs.

There will need to be significant benefits to do any change in this area, and size reduction of 1k is not significant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants