Support more HTML elements and CSS attributes and fix CSS filter crashes #990

Merged (20 commits) on Nov 11, 2024

Contributor

@bertm bertm left a comment

This looks reasonable, but please add a test for it. The bare minimum would be something like, or an extension to, what was introduced in
8d12ea0

@@ -206,8 +215,8 @@ public class ElementInfo {
"nth-last-child",
"nth-of-type",
"nth-last-of-type",
"link",
"visited",
"link", // inverse of visited, probably should be blocked?
Contributor

@ArneBab ArneBab Oct 21, 2024

EDIT: you’re right. This needs to be added to BANNED_PSEUDOCLASS. It’s less effective than visited, because it can only check for pages that are not in the history, but that is still too much access to the history. Good catch!

Contributor

A future improvement to keep pages working could be to transparently rewrite "link" to "any-link". But that may be beyond the scope of this PR (no need to fix everything in one step).
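
A minimal sketch of what such a rewrite could look like, assuming the filter handles each pseudo-class name as a plain string. The class `LinkPseudoClassRewriter` and its method are hypothetical names for illustration and are not part of the Hyphanet code base.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not taken from the Hyphanet source.
final class LinkPseudoClassRewriter {
    private LinkPseudoClassRewriter() {}

    /**
     * Replaces the history-dependent "link" pseudo-class with "any-link",
     * which matches every link regardless of visited state.
     * All other names are returned unchanged.
     */
    static String rewrite(String pseudoClass) {
        String name = pseudoClass.trim().toLowerCase(Locale.ROOT);
        if (name.equals("link")) {
            return "any-link";
        }
        return pseudoClass;
    }
}
```

With this approach a page rule such as `a:link` would keep styling links, but it would style visited and unvisited links alike, so nothing about the browsing history is revealed.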

"empty",
"enabled",
"focus-visible",
"dir",
Contributor

Is dir necessary here? We already have a dedicated parsing rule for it. If it is, please add a test that fails without your change.

Contributor Author

dir(ltr) and dir(rtl) as pseudoclasses, added here.

"target",
"any-link",
"default",
"defined",
Contributor

We do not have custom element support (it requires JavaScript), so defined does not make sense. Please remove it.

Contributor Author

Remove it or add it into the BANNED_PSEUDOCLASS?

Contributor

BANNED_PSEUDOCLASS is better: CSSTokenizerFilter::HTMLElementVerifier defines that as a banned (but otherwise valid) selector.
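
The distinction between "banned but valid" and "not valid at all" can be sketched as follows. This is only an illustration: the class `PseudoClassPolicy`, the `Verdict` enum, and the contents of both sets are assumptions chosen from names mentioned in this review, and the real `HTMLElementVerifier` may be structured differently.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; names and set contents are illustrative assumptions.
final class PseudoClassPolicy {
    enum Verdict { ALLOWED, BANNED, INVALID }

    // Illustrative subset of pseudo-classes that are passed through.
    private static final Set<String> ALLOWED_PSEUDOCLASS =
            Set.of("empty", "enabled", "focus-visible", "target", "any-link", "default");

    // Syntactically valid selectors that are rejected on purpose.
    private static final Set<String> BANNED_PSEUDOCLASS =
            Set.of("visited", "link", "defined");

    private PseudoClassPolicy() {}

    /** Classifies a pseudo-class name; the banned set is checked first. */
    static Verdict classify(String name) {
        String key = name.trim().toLowerCase(Locale.ROOT);
        if (BANNED_PSEUDOCLASS.contains(key)) {
            return Verdict.BANNED;
        }
        if (ALLOWED_PSEUDOCLASS.contains(key)) {
            return Verdict.ALLOWED;
        }
        return Verdict.INVALID;
    }
}
```

Keeping a separate banned set documents that the omission is deliberate, which is harder to see when a name is simply missing from the allow list.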

"enabled",
"focus-visible",
"dir",
"indeterminate",
Contributor

@ArneBab ArneBab Oct 21, 2024

This is also only used for forms. Maybe split up the rules to have a dedicated set of rules for forms (for easier understanding of the meaning of these rules)?

public static boolean isColor(String value)
{
value=value.trim();
value=value.trim().toLowerCase();
Contributor

@ArneBab ArneBab Oct 21, 2024

good catch! (feel free to mark this resolved once you read it ☺)

public void testInvalidColors() {
assertFalse(FilterUtils.isColor("rgb(0.1 0.2 0.3)"));
assertFalse(FilterUtils.isColor("rgb(/)"));
assertFalse(FilterUtils.isColor("#ABCDEFGH"));
Contributor

Can you add more color tests? Switching to regexes usually needs tests for edge cases.
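
To illustrate the kind of edge cases meant here, the following sketch uses a hypothetical stand-in `isHexColor` rather than `FilterUtils.isColor`, so that it stays self-contained. The regular expression is an assumption that hex colors with 3, 4, 6, or 8 digits are accepted; the real filter may accept a different set.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; this is not FilterUtils.isColor.
final class HexColorSketch {
    // Assumption: #rgb, #rgba, #rrggbb and #rrggbbaa are the accepted forms.
    private static final Pattern HEX =
            Pattern.compile("#(?:[0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})");

    private HexColorSketch() {}

    static boolean isHexColor(String value) {
        // Normalize like the patched isColor does: trim, then lower-case.
        String v = value.trim().toLowerCase(Locale.ROOT);
        // matches() requires the whole string to match, not just a prefix.
        return HEX.matcher(v).matches();
    }
}
```

Edge cases worth covering for a regex like this include wrong lengths (`#abcde`), non-hex digits (`#ABCDEFGH`), upper-case input, surrounding whitespace, the empty string, and a lone `#`.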

@ArneBab
Contributor

ArneBab commented Oct 21, 2024

This is a neat addition, but the added elements still need tests. I know that these can be annoying to write, since they feel obvious. But the last two times I omitted them, Bombe reminded me with an ENOTEST, and in one of those cases the test Bombe had requested actually exposed a bug (which luckily was found before merging, when I added the test).

⇒ Please do add tests. These filters are critical for safe operation, because an error here may let browsers send requests to the clearnet.

@torusrxxx
Contributor Author

Some more tests have been added. However, tests can only be written for expected inputs; problems occur with unexpected inputs, which cannot be tested until you have found the problem. I already found two bugs that can crash the CSS validator within just a few days, which is very scary indeed!

@torusrxxx torusrxxx changed the title from "Allow bdi and main elements and auto value of dir attribute" to "Support more HTML elements and CSS attributes and fix CSS filter crashes" on Oct 28, 2024
@ArneBab
Contributor

ArneBab commented Nov 8, 2024

Please let me know when you have finished adding elements so I can review it again.
(I need a stable state for reviewing)

New elements can then go into a separate additional PR/branch (rebasing that after this PR is merged should then make the other PR good).

@torusrxxx
Contributor Author

OK, I will freeze this one from now on. I do have further plans, but it seems that getting code reviewed in this project is not as quick as my first contribution, which was released in a new version within just a day. I hope you will have more free time in the future!

@torusrxxx
Contributor Author

Checking MDN for updated elements and adding them here usually takes only a small amount of time that I could otherwise spend watching a video. But I am going to watch videos for now, since so many things are waiting to be merged.

@ArneBab
Contributor

ArneBab commented Nov 8, 2024

Thank you! I’d love to be faster, but this is all volunteer-work by a handful of people that we can only do when we have the resources besides family and paying work. That’s why activity wanes and waxes depending on current energy levels.

I’d love to merge more quickly, but I have to review diligently. We have to stay clear of the fine line between powerful CSS and accidental Turing Completeness (which would open up pages for attacks): https://beza1e1.tuxen.de/articles/accidentally_turing_complete.html

@@ -137,6 +137,10 @@ public class CSSParserTest {
CSS2_BAD_SELECTOR.add("h1[foo,=bar] {}");

CSS2_BAD_SELECTOR.add("h1:langblahblah(fr) {}");
// java.lang.StringIndexOutOfBoundsException
CSS2_BAD_SELECTOR.add("h1:golang {}");
Contributor

good catch!
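
The thread does not show where the StringIndexOutOfBoundsException came from, so the following is only a general sketch of how a functional pseudo-class such as `lang(...)` can be parsed without unchecked index arithmetic. The class `LangPseudoClassSketch` and its method are hypothetical and do not mirror the actual parser.

```java
// Hypothetical sketch; the real parser in the filter may work differently.
final class LangPseudoClassSketch {
    private static final String PREFIX = "lang(";

    private LangPseudoClassSketch() {}

    /**
     * Returns the argument of "lang(...)" or null if the input does not have
     * exactly that form. Both ends are checked before substring() is called,
     * so inputs like "golang" or "langblahblah(fr)" are rejected instead of
     * producing out-of-range indexes.
     */
    static String langArgument(String pseudoClass) {
        if (!pseudoClass.startsWith(PREFIX) || !pseudoClass.endsWith(")")) {
            return null;
        }
        // Safe: the shortest string passing both checks is "lang()".
        String arg = pseudoClass
                .substring(PREFIX.length(), pseudoClass.length() - 1)
                .trim();
        return arg.isEmpty() ? null : arg;
    }
}
```

Matching on the full prefix `lang(` rather than on the substring `lang` is what keeps `golang` from being treated as a `lang` selector in this sketch.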

@ArneBab
Contributor

ArneBab commented Nov 11, 2024

I have now reviewed all the commits since my last review ( #990 (comment) ). Once you address the comments, I’ll gladly merge to next.

(no need to address "good catch" ☺)

Thank you!

@ArneBab ArneBab merged commit a6ba4c4 into hyphanet:next Nov 11, 2024
1 check passed
@ArneBab
Contributor

ArneBab commented Nov 11, 2024

Merged — thank you!

@torusrxxx torusrxxx deleted the patch-5 branch November 13, 2024 02:39
3 participants