-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: improve handling of leading punctuation removal #10761
refactor: improve handling of leading punctuation removal #10761
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please fix the lint errors |
Thank you for pointing that out. I have addressed all the lint errors flagged during the review and committed the fixes. |
As the logic is not as clear in readability as before, for the maintainability, I would suggest extracting to a common util method for this, and the proper unit tests for is very import for this, too. To be honest, I accept the purpose of this PR, but I don't understand the regex in the change and how it satisfies the goal. |
Thank you for your feedback! Here's an explanation of the regex and its purpose:
The purpose of this logic is to clean up the string so it doesn't begin with any unnecessary symbols or punctuation. If you feel this approach works, I can extract the logic into a reusable utility method (e.g., |
Thanks for the explanation, SGTM. And please make sure unit tests provided, at least covering the minimum cases is alright to me. |
This commit finalizes the logic for removing leading punctuation and symbols by adopting a Unicode range regex pattern. The new approach improves compatibility and avoids reliance on advanced regex features such as Changes:
This update simplifies the implementation while maintaining the intended behavior. All related unit tests pass, ensuring correctness. |
why we need to remove the leading chars like |
why we need to remove |
Summary
This refactor improves the logic for trimming leading punctuation from text content. It replaces the use of
startswith
checks with a regular expression, enabling broader support for diverse punctuation and symbols. Additionally,.strip()
has been incorporated to ensure removal of trailing whitespace for cleaner output.Motivation and Context:
Dependencies:
No additional dependencies were introduced with this change.
Checklist
Important
Please review the checklist below before submitting your pull request.
dev/reformat
(backend) andcd web && npx lint-staged
(frontend) to appease the lint gods