Allow reading order detectors to support any class that has a bounding box/PdfRectangle #855

Open
davebrokit opened this issue Jun 25, 2024 · 1 comment
Labels
document-reading Related to reading documents enhancement

Comments

@davebrokit
Contributor

davebrokit commented Jun 25, 2024

Currently the interface IReadingOrderDetector takes TextBlock as its parameter. This limits its use to the TextBlock class.

I propose adding an IBoundingBox interface

public interface IBoundingBox
{
    PdfRectangle BoundingBox { get; }
}

Then changing the IReadingOrderDetector interface and its implementing classes to use IBoundingBox as the parameter type.

Adding an overload that takes a Func<T, PdfRectangle> would let the caller supply the bounding box for any type, even one that does not implement IBoundingBox, making the interface more useful.
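A minimal sketch of how the two entry points could fit together. The generic `Get<T>` signatures, the `TopToBottomDetector` class and the `Rect` stand-in for PdfRectangle are assumptions made for illustration only; they are not the current PdfPig API, and the ordering rule (top to bottom, then left to right) is just a placeholder for a real detector.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for PdfPig's PdfRectangle, reduced to what this sketch needs.
// PDF coordinates have their origin at the bottom-left, so a larger Top is higher on the page.
public readonly record struct Rect(double Left, double Bottom, double Right, double Top);

public interface IBoundingBox
{
    Rect BoundingBox { get; }
}

// Hypothetical generic shape of the detector interface.
public interface IReadingOrderDetector
{
    // For types that implement IBoundingBox.
    IReadOnlyList<T> Get<T>(IReadOnlyList<T> elements) where T : IBoundingBox;

    // For any type: the caller supplies the bounding box selector.
    IReadOnlyList<T> Get<T>(IReadOnlyList<T> elements, Func<T, Rect> boundingBox);
}

// Placeholder detector: orders top to bottom, then left to right.
public sealed class TopToBottomDetector : IReadingOrderDetector
{
    public IReadOnlyList<T> Get<T>(IReadOnlyList<T> elements) where T : IBoundingBox
        => Get(elements, e => e.BoundingBox);

    public IReadOnlyList<T> Get<T>(IReadOnlyList<T> elements, Func<T, Rect> boundingBox)
        => elements
            .OrderByDescending(e => boundingBox(e).Top)
            .ThenBy(e => boundingBox(e).Left)
            .ToList();
}
```

With this shape, the IBoundingBox overload is only a convenience wrapper: all ordering logic lives in the selector-based overload, so a caller could order tuples, images or their own types by passing something like `item => item.Box`.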

Breaking changes: IReadingOrderDetector will instead return an IReadOnlyList<T> containing the ordered results. This means TextBlock.ReadingOrder would no longer be set, which is a breaking change. However, code can be added so that ReadingOrder is still set when T is TextBlock.
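One way the TextBlock special case could look, as a rough sketch. The `TextBlock` class below is a stand-in with only the reading-order members, and `ReadingOrderCompat.WithReadingOrder` is a hypothetical helper name; the real change might take a different form.

```csharp
using System.Collections.Generic;

// Stand-in for PdfPig's TextBlock, reduced to the reading-order members used here.
public sealed class TextBlock
{
    public int ReadingOrder { get; private set; } = -1;

    public void SetReadingOrder(int readingOrder) => ReadingOrder = readingOrder;
}

public static class ReadingOrderCompat
{
    // Hypothetical helper: takes an already ordered list and, for elements that
    // are TextBlock instances, writes their index back as the reading order.
    // Elements of any other type are left untouched.
    public static IReadOnlyList<T> WithReadingOrder<T>(IReadOnlyList<T> ordered)
    {
        for (int i = 0; i < ordered.Count; i++)
        {
            if (ordered[i] is TextBlock block)
            {
                block.SetReadingOrder(i);
            }
        }

        return ordered;
    }
}
```

A generic detector could pass its result through such a helper before returning, so existing callers that read TextBlock.ReadingOrder keep working while other element types are unaffected.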

Happy to make the changes

@BobLd
Collaborator

BobLd commented Jun 25, 2024

@davebrokit I was thinking of doing similar, please go ahead and implement your idea.

I did a similar interface for my project (https://github.com/BobLd/Caly/blob/master/Caly.Pdf/Models/IPdfTextElement.cs); feel free to reuse that or not.

I think the Letter class has a method instead of a property to get the bounding box. This might be a good opportunity to change that too (in my mind, the letters, text lines and text blocks should all implement your interface, but please let me know what you think).
