Misc #354

Nahor · 2024-03-04T23:28:40Z

The first two (refactor) are simplification of the code.

On the last one (fix), regarding the default SourceCode implementations:

it fixes a panic if a 0-length span is at the end of the source code
line_count was totally wrong (based on doc + field name). Since it was consistently broken, one can imagine some code out there relying on that bug. So squinting a certain way, one could see this fix as a breaking change.
\r was handled as a newline, but is not supported in other places (Graphical/Narrated). So this too, depending on how one squints, it could be a breaking change or not.
Otherwise, I rewrote the code so that it is more easily understandable (opinions may vary 😝)

Benjamin-L

Aside from the unsafe blocks, this looks reasonable to me

Benjamin-L · 2024-03-19T20:21:38Z

src/handlers/graphical.rs

+                Line {
+                    line_number: context_data.line() + line_number + 1,
+                    offset: context_data.span().offset()
+                        + unsafe { line.as_ptr().offset_from(context.as_ptr()) as usize },


I believe this unsafe block is sound, but it would be good to have a // SAFETY: comment explaining the reasoning. I had to do a bit of digging to convince myself that it's not possible to have context.len() > isize::MAX.

Alternatively, you could avoid the offset_from call entirely by keeping a mutable offset variable and doing offset += line.len() or similar inside the map closure. That's a little ugly, but I'd personally prefer it to unsafe.

That's the exact opposite of what I wanted to do. The whole reason I went with offset_from is to avoid using an offset variable, because that variable is a possible source of errors if not computed right (and indeed, such problem in the previous code is why I started to look at this piece of code in the first place).

Also, from a semantic point of view, the "offset from" meaning is what we are looking for ("the offset of line from the beginning of the data buffer"). A + line.len() is only indirectly related to that info. It's only correct if, and only if, the various lines follow each other exactly, without gaps nor overlaps. In particular, it requires the use of split_inclusive instead of the more common split¹, creating, IMO, an unnecessary dependency between the split and the offset computation.

And yes it's safe per design:

The theoretical upper bound on object and array size is the maximum isize value. This ensures that isize can be used to calculate differences between pointers into an object or array and can address every byte within an object along with one byte past the end.

Footnotes

I would have used split except that it causes some other issues in some corner cases (I can't remember if it was when the data terminates with a newline or when the data was empty) ↩

- Fix a panic when a 0 length span covers the end of a document - Fix incorrect `line_count` - Add new unit tests and complete existing ones - Improve readability

zkat added the breaking A semver-major breaking change label Mar 7, 2024

Nahor mentioned this pull request Mar 19, 2024

Zero length span #356

Open

Benjamin-L reviewed Mar 20, 2024

View reviewed changes

Nahor added 3 commits March 19, 2024 18:29

refactor(get_lines): Simplify get_lines

6f8fbf8

refactor(span): Simplify checks how a span applies on a line

a37328a

fix(SourceCode): Fix and improve default SourceCode implementation

f7fa50d

- Fix a panic when a 0 length span covers the end of a document - Fix incorrect `line_count` - Add new unit tests and complete existing ones - Improve readability

Nahor force-pushed the misc branch from fd5d5e6 to f7fa50d Compare March 20, 2024 01:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Misc #354

Misc #354

Nahor commented Mar 4, 2024

Benjamin-L left a comment

Benjamin-L Mar 19, 2024

Nahor Mar 20, 2024

Misc #354

Are you sure you want to change the base?

Misc #354

Conversation

Nahor commented Mar 4, 2024

Benjamin-L left a comment

Choose a reason for hiding this comment

Benjamin-L Mar 19, 2024

Choose a reason for hiding this comment

Nahor Mar 20, 2024

Choose a reason for hiding this comment

Footnotes