Introduce til::rle - a run length encoded vector #10099
## Summary of the Pull Request

Introduces `til::rle`, a vector-like container which stores elements of type T in a run length encoded format. This allows efficient compaction of repeated elements within the vector.

## References

* #8000 - Supports buffer rewrite work. A re-use of `til::rle` will be useful as a column counter as we pursue NxM storage and presentation.
* #3075 - The new iterators allow skipping forward by multiple units, which wasn't possible under `TextBuffer-/OutputCellIterator`. Additionally, they allow bulk insertions.
* #8787 and #410 - High probability this should be `pmr`-ified like `bitmap` for things like `chafa` and `cacafire` which are changing the run length frequently.

## PR Checklist

* [x] Closes #8741
* [x] I work here.
* [x] Tests added.
* [x] Tests passed.

## Validation Steps Performed

* [x] Ran `cacafire` in `OpenConsole.exe` and it looked beautiful
* [x] Ran new suite of `RunLengthEncodingTests.cpp`

Co-authored-by: Michael Niksa <[email protected]>
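To make the summary concrete, here is a minimal sketch of the general run length encoding idea. It is not the PR's implementation: `tiny_rle`, `append`, `at`, and `run_count` are hypothetical names chosen for illustration, and the real `til::rle` offers a much richer interface (iterators, slicing, replacement).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a run length encoded vector.
// Each stored pair is (value, run length); equal neighbours share one pair.
template<typename T>
class tiny_rle
{
public:
    // Appends `count` copies of `value`, merging with the last run if it holds the same value.
    void append(const T& value, std::size_t count = 1)
    {
        if (count == 0)
        {
            return;
        }
        if (!_runs.empty() && _runs.back().first == value)
        {
            _runs.back().second += count;
        }
        else
        {
            _runs.emplace_back(value, count);
        }
        _size += count;
    }

    // Returns the element at the logical (decompressed) index by walking the runs.
    const T& at(std::size_t index) const
    {
        assert(index < _size);
        for (const auto& [value, length] : _runs)
        {
            if (index < length)
            {
                return value;
            }
            index -= length;
        }
        return _runs.back().first; // not reached while the assert above holds
    }

    std::size_t size() const noexcept { return _size; }           // logical element count
    std::size_t run_count() const noexcept { return _runs.size(); } // stored pairs

private:
    std::vector<std::pair<T, std::size_t>> _runs;
    std::size_t _size = 0;
};
```

Appending the values 1, 3, 3, 3, 2 to this sketch yields five logical elements stored as three runs, which is the compaction the summary refers to.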
@msftbot make sure @miniksa signs off on this
Hello @DHowett! Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:
If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".
It's perfect, and I love it. I can only find fault in a comment, and I did attempt to understand the algorithm. I've kicked the tires with my buffer implementation, as well, and it looks great. Excellent work!
template<typename T, typename S = std::size_t, std::size_t N = 1>
using small_rle = basic_rle<T, S, boost::container::small_vector<rle_pair<T, S>, N>>;
#endif
};
If you wanted to add one more class template (for fun!) I would suggest `til::pmr::rle` that uses a `std::pmr::vector`. However! `basic_rle` can't take an Allocator type, and this seems like an unnecessary cost for something we do not currently need. 😁
Ah, I meant "type alias" instead of "class template"
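For readers wondering what such a type alias might look like, here is a rough sketch. Everything in it is an assumption: the `til_sketch` namespace, the reduced `rle_pair` and `basic_rle` stand-ins, and in particular the constructor that adopts an existing container, which is only there so a memory resource can be supplied at all. It mirrors the shape of the `small_rle` alias quoted above, nothing more.

```cpp
#include <cstddef>
#include <memory_resource>
#include <type_traits>
#include <utility>
#include <vector>

namespace til_sketch
{
    // Hypothetical stand-in for the PR's rle_pair, reduced to two members.
    template<typename T, typename S>
    struct rle_pair
    {
        T value{};
        S length{};
    };

    // Hypothetical stand-in for basic_rle, reduced to what the alias needs.
    template<typename T, typename S, typename Container>
    class basic_rle
    {
    public:
        using container_type = Container;

        basic_rle() = default;

        // Assumption: adopting a pre-built container is one way to hand over
        // an allocator, since the class itself takes no Allocator parameter.
        explicit basic_rle(Container runs) :
            _runs(std::move(runs))
        {
        }

        const Container& runs() const noexcept { return _runs; }

    private:
        Container _runs;
    };

    namespace pmr
    {
        // The suggested alias: same class, std::pmr::vector as the backing container.
        template<typename T, typename S = std::size_t>
        using rle = basic_rle<T, S, std::pmr::vector<rle_pair<T, S>>>;
    }
}

static_assert(std::is_same_v<
              til_sketch::pmr::rle<int>::container_type,
              std::pmr::vector<til_sketch::rle_pair<int, std::size_t>>>);
```

A default-constructed `til_sketch::pmr::rle<int>` would fall back to the default memory resource, which is presumably part of why the comment above calls this an unnecessary cost for now.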
I am very very pleased. You took my start and you made it shine. I love it. Excellent work.
Just a few little comments to clean up that I'd like to see the answers for before I sign for merge.
// rle_pair is a simple clone of std::pair, with one difference:
// copy and move constructors and operators are explicitly defaulted.
// This allows rle_pair to be std::is_trivially_copyable, if both T and S are.
// --> rle_pair can be used with memcpy(), unlike std::pair.
ooooh nifty.
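The trick described in that code comment can be sketched as follows. `pair_sketch` and `copy_runs` are hypothetical names, and the member layout is an assumption; only the idea of explicitly defaulting the copy and move operations comes from the quoted comment.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical reduction of the rle_pair idea: a pair whose copy and move
// operations are explicitly defaulted, so they stay trivial whenever the
// operations of T and S are trivial.
template<typename T, typename S>
struct pair_sketch
{
    T value{};
    S length{};

    constexpr pair_sketch() = default;
    constexpr pair_sketch(const T& v, const S& l) :
        value(v), length(l)
    {
    }

    pair_sketch(const pair_sketch&) = default;
    pair_sketch& operator=(const pair_sketch&) = default;
    pair_sketch(pair_sketch&&) = default;
    pair_sketch& operator=(pair_sketch&&) = default;
};

// Holds because char and std::size_t are themselves trivially copyable.
static_assert(std::is_trivially_copyable_v<pair_sketch<char, std::size_t>>);

// Copies `count` runs with memcpy, which is only valid for trivially copyable types.
template<typename T, typename S>
void copy_runs(pair_sketch<T, S>* dst, const pair_sketch<T, S>* src, std::size_t count)
{
    static_assert(std::is_trivially_copyable_v<pair_sketch<T, S>>);
    std::memcpy(dst, src, count * sizeof(pair_sketch<T, S>));
}
```

The `static_assert` inside `copy_runs` turns a misuse, such as a `T` with a user-provided copy constructor, into a compile error instead of undefined behavior.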
//
//
//
// MUST READ: How this function (mostly) works
I love this comment so much. It is exactly what I was hoping for.
VERIFY_ARE_EQUAL("1|3 3|2|1 1 1|5 5"sv, rle);
// empty
VERIFY_ARE_EQUAL(""sv, rle.slice(0, 0)); // begin
FYI, `VERIFY_ARE_EQUAL` takes a 3rd parameter which can be a string you want printed out in the log output so you can more easily tell which sub-test this is. (If you wanted to do that over the comments.)
using value_type = typename rle_vector::value_type;

public:
static bool AreEqual(const ::std::string_view& expected, const rle_vector& actual) noexcept
This is super elegant and I love it.
@miniksa @DHowett The most recent commit changes the way the iterator works: instead of 1-based indices, it now uses 0-based ones. I felt that this fits better with the container class itself, which is also written with 0-based indices in mind. It also fixes an off-by-one error in
Ooh, you made the static analyzer mad!
operator-=(1);
if (_pos == 0)
{
--_it;
Will the vector [debug] iterator catch the out of bounds move if the user is at [0]?
The `std::vector` one will. I'm not sure about the one in boost. But this code is functionally equivalent to the `operator+=` implementation (just much simpler for inlining).
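To illustrate the exchange above, here is a rough sketch of single-step movement over runs with a 0-based position inside the current run. `run_cursor` and its members are hypothetical and not the PR's iterator. The sketch assumes every run has a non-zero length, and it deliberately relies on the underlying `std::vector` iterator (and whatever debug checks the standard library provides) when stepping before the first run.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical cursor over runs of (value, length), assuming no run has length 0.
class run_cursor
{
public:
    using run = std::pair<char, std::size_t>;
    using run_iterator = std::vector<run>::const_iterator;

    run_cursor(run_iterator it, std::size_t pos) :
        _it(it), _pos(pos)
    {
    }

    char operator*() const { return _it->first; }

    // Step forward: advance inside the run, or move to the start of the next run.
    run_cursor& operator++()
    {
        ++_pos;
        if (_pos == _it->second)
        {
            ++_it;
            _pos = 0;
        }
        return *this;
    }

    // Step backward: at position 0 we first move to the previous run.
    // Doing this on the very first run decrements begin(), which is the
    // out-of-bounds move a checked vector iterator is expected to flag.
    run_cursor& operator--()
    {
        if (_pos == 0)
        {
            --_it;
            _pos = _it->second;
        }
        --_pos;
        return *this;
    }

    bool operator==(const run_cursor& other) const { return _it == other._it && _pos == other._pos; }
    bool operator!=(const run_cursor& other) const { return !(*this == other); }

private:
    run_iterator _it;
    std::size_t _pos; // 0-based offset inside the current run
};
```

With 0-based positions, the end cursor is simply `(runs.cend(), 0)`, which keeps the comparison operators free of special cases.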
Yeah that's probably wise. I don't know what drugs I was on when I wrote it 1-based. Thanks for fixing it.
I'm good. Thanks for taking this over the finish line. It looks great. Minorly sad that some of the comments got removed from the iterators, but you're probably right to do so as they really weren't describing the "why".