-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes to generate a custom GOT layout which can be safely reproduced #6
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good! nicely done
pub fn read_got_layout(location: &str, format: ConfigFormat) -> HashMap<String, u64> { | ||
let mut s = String::new(); | ||
let mut fp = File::open(location).expect("GOT layout file could not be opened"); | ||
fp.read_to_string(&mut s).expect("GOT layout could not be read from file"); | ||
|
||
match format { | ||
ConfigFormat::JSON => serde_json::from_str(&s).expect("Could not parse GOT from JSON"), | ||
ConfigFormat::TOML => toml::de::from_str(&s).expect("Could not parse GOT from TOML"), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could maybe return a Result<HashMap<String, u64>, Diagnostic>
here? we need to look at how existing RuSTy code uses ConfigFormat
and if they .unwrap()/.expect()
or propagate the error, and then copy it. I'm fine with doing .expect()
here if the rest of the codebase does it when opening config files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah looks like we do normally propagate.
let mut fp = File::create(location).expect("GOT layout file could not be opened"); | ||
fp.write_all(s.as_bytes()).expect("GOT layout could not be written to file"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here :)
write_got_layout(new_got_entries, location.as_str(), *format); | ||
|
||
// Construct our GOT as a new global array. We initialise this array in the loader code. | ||
let got_size = new_got.keys().max().map_or(0, |m| *m + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we create an empty array here if the new GOT will be empty, but otherwise we reserve 1 + max slots? so the customGOT will be of size 0, or 2, 3, 4... etc? any reason for this? not criticizing, just want to make sure I understand everything :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the map contains indices, it's 1 + max index. So if the maximum index is 0, we have 1 in the array so we should create an array of size 1. I suppose we could avoid creating the array at all if it's empty, but also it might be useful in other parts of the system to be able to assume that __custom_got
exists.
let got = self.llvm.create_global_variable( | ||
self.module, "__custom_got", BasicTypeEnum::ArrayType( | ||
Llvm::get_array_type( | ||
BasicTypeEnum::PointerType(self.llvm.context.i8_type().ptr_type(0.into())), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we fill the custom GOT with NULL
uint8_t
pointers, correct? any reason for not using void*
? (that might not even be a thing for LLVM IR?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah void *
isn't a thing for LLVM IR it's always i8*
(until recently, where we have the option of 'opaque pointers' where the type is never specified, but I don't think this frontend has made the switch yet)
self.module, "__custom_got", BasicTypeEnum::ArrayType( | ||
Llvm::get_array_type( | ||
BasicTypeEnum::PointerType(self.llvm.context.i8_type().ptr_type(0.into())), | ||
got_size as u32))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one issue we have here is that this is a truncating conversion. so if got_size
is larger than u32::MAX
then we will just truncate the value to that. I think it would be better here to do
got_size as u32))); | |
got_size.try_into().expect("the computed customGOT size is too large to fit in a global. this is a compiler error"))); |
or something of the sort. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh gosh yeah! I mean I'd hope if someone has 2^32 globals, they run into other issues before this but I'm happy to put that in.
let mut s = String::new(); | ||
let mut fp = File::open(location).expect("GOT layout file could not be opened"); | ||
fp.read_to_string(&mut s).expect("GOT layout could not be read from file"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing that confused me a little bit was that the compiler expects the GOT layout file to already exist and to contain a valid "GOT layout". I think it would be nice to create an empty hashmap if the file does not exist or if it is empty, but we can do that in another PR and ask Bachmann how they feel about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I am aware of this limitation, just wanted to get something pushed before I gave that a go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, that's completely fine!
if let Some((location, format)) = &self.got_layout_file { | ||
|
||
let got_entries = read_got_layout(location.as_str(), *format); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rustfmt gives me a couple changes here and later in the file. you can format it automatically with rustfmt --emit files src/codegen/generators/variable_generator.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh helpful, thanks!
let mut new_globals = Vec::<String>::new(); | ||
let mut new_got_entries = HashMap::<String, u64>::new(); | ||
let mut new_got = HashMap::<u64, String>::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should be able to remove the type annotations here. Tried it out locally with the other changes I suggested applied, but it should work even without them.
let mut new_globals = Vec::<String>::new(); | |
let mut new_got_entries = HashMap::<String, u64>::new(); | |
let mut new_got = HashMap::<u64, String>::new(); | |
let mut new_globals = Vec::new(); | |
let mut new_got_entries = HashMap::new(); | |
let mut new_got = HashMap::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah fair point! I just find it so bizarre that Rust does this.
|
||
// Construct our GOT as a new global array. We initialise this array in the loader code. | ||
let got_size = new_got.keys().max().map_or(0, |m| *m + 1); | ||
let got = self.llvm.create_global_variable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, maybe I used it wrong but I don't think this ends up being committed to the LLVM IR file/resulting binary? I can't seem to find the __custom_got
global variable but maybe I'm just blind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, I thought I gave that a try already. I'll see what happens
let mut s = String::new(); | ||
if !Path::new(location).is_file() { | ||
// Assume if the file doesn't exist that there is no existing GOT layout yet. write_got_layout will handle | ||
// creating our file when we want to. | ||
return Ok(HashMap::new()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great - I would move the String
creation right after the check to make sure we don't allocate one for nothing, but otherwise it's solid and the check is good :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So something like
if !Path::new(location).is_file() {
// Assume if the file doesn't exist that there is no existing GOT layout yet. write_got_layout will handle
// creating our file when we want to.
return Ok(HashMap::new());
}
let mut fp = File::open(location).map_err(|_| Diagnostic::new("GOT layout file could not be opened"))?;
let mut s = String::new();
fp.read_to_string(&mut s).map_err(|_| Diagnostic::new("GOT layout could not be read from file"))?;
also, last nitpick I promise: fp.read_to_string(&mut s)
can be replaced by fs::read_to_string
which is a helper method to do the same thing. so this would become
if !Path::new(location).is_file() {
// Assume if the file doesn't exist that there is no existing GOT layout yet. write_got_layout will handle
// creating our file when we want to.
return Ok(HashMap::new());
}
let s = fs::read_to_string(location).map_err(|_| Diagnostic::new("GOT layout could not be read from file"))?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks! Do you also thing fs::write
is a good helper to use when we're doing the writing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I think it's good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :D great stuff!
82026c7
to
63ab538
Compare
This commit contains changes that are required to generate a global variable which will be used to store an array of addresses of other global variables in the program. The location of the globals within this array should be stable after a recompilation to support online change, so we support loading a pre-existing layout to ensure that for all globals that we see in both the current program and the pre-existing layout, their positions remain the same. Otherwise, any new globals will be placed in any empty spaces left by old globals, or appended on to the end of the array. The layout will then be saved back the the file used for saving and loading. Currently, to use this feature the flag `--got-layout-file=<file>` must be provided, which should specify the name of either a TOML or JSON file to use to save, and optionally load if the file already exists, the GOT layout. In future we will integrate this with a generic online change flag, whereby it will not be necessary to ask for a GOT layout file when we already know that we need it for online change.
63ab538
to
58b262e
Compare
This PR covers the addition of a method of placing all global variables into a GOT and recording their indices such that if a recompilation is required, the indices of global variables that are retained can remain the same, but new globals can be inserted in any empty spaces. More changes are needed, for example to create a file to write to if it doesn't already exist.