pfBlockerNG: extend support for AdBlock-style lists #1303

Closed

@andrebrait andrebrait commented Oct 6, 2023

This supersedes #1302 and closes #14838.

Extend support for AdBlock-style lists:

  • Unbound mode:
    • Support whitelist entries, including wildcards
  • Python mode:
    • Support whitelist entries, including wildcards
    • Expand blacklist entry support to also handle wildcards

In Python mode, most processing was moved to Python, with the surrounding code merely assembling lists that get consumed by it.

User-defined, TOP1M and Whitelist entries from AdBlock-style lists are all joined into a single file by the PHP code; the Python code loads and parses that file later.

Whenever possible, simple domain matches are kept as-is. The presence of any wildcards triggers the conversion of that specific entry into a Regular Expression in the Python format.
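As a rough illustration of the conversion described above, here is a minimal sketch, not the code from this PR. The helper names `compile_entry` and `entry_matches` are hypothetical, and the choice to let `*` match any run of hostname characters (letters, digits, `.`, `_`, `-`) is an assumption; the actual wildcard semantics in the PR may differ.

```python
import re

# Assumption: a '*' may stand for any run (possibly empty) of hostname characters.
WILDCARD = "[a-z0-9_.-]*"


def compile_entry(entry):
    """Hypothetical helper: keep plain domains as-is, compile wildcard entries."""
    entry = entry.strip().lower()
    if "*" not in entry:
        # Simple domain: no conversion needed, compare as a plain string.
        return ("domain", entry)
    # Escape the literal pieces so '.' is not treated as "any character",
    # then join them with the wildcard character class.
    literal_parts = [re.escape(part) for part in entry.split("*")]
    return ("regex", re.compile(WILDCARD.join(literal_parts)))


def entry_matches(compiled, domain):
    """Hypothetical helper: test one domain against one compiled entry."""
    kind, value = compiled
    domain = domain.strip().lower()
    if kind == "domain":
        return domain == value
    # fullmatch anchors the pattern at both ends of the domain.
    return value.fullmatch(domain) is not None
```

For example, `entry_matches(compile_entry("ads*.example.com"), "ads1.example.com")` is true under these assumptions, while the plain entry `ads.example.com` is only ever compared by string equality.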

In Unbound mode, everything else is unchanged, but whitelisting now runs as a postprocessing pass, separate from deduplication and the other existing steps. It is done in 3 distinct steps: User-defined entries and TOP1M entries are each matched as fixed strings with ggrep, and Whitelist entries from AdBlock-style lists are matched as extended regular expressions with ggrep.
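The three-step flow can be sketched in Python for illustration. This is not how the PR implements it (the PR uses ggrep on files); the function name `apply_whitelists` is hypothetical, whole-domain comparison stands in for fixed-string matching, and Python's `re` stands in for extended regular expressions, whose syntax is not identical.

```python
import re


def apply_whitelists(blocked, user_defined, top1m, adblock_patterns):
    """Hypothetical sketch: filter a blocklist in three whitelist passes.

    Returns the remaining blocked domains and, per pass, the domains removed.
    """
    removed = {}

    # Passes 1 and 2: fixed-string style matching (assumed to be whole-domain).
    for name, entries in (("user", user_defined), ("top1m", top1m)):
        allowed = set(entries)
        removed[name] = [d for d in blocked if d in allowed]
        blocked = [d for d in blocked if d not in allowed]

    # Pass 3: regular-expression matching for AdBlock-style whitelist entries.
    compiled = [re.compile(p) for p in adblock_patterns]

    def is_whitelisted(domain):
        return any(rx.fullmatch(domain) for rx in compiled)

    removed["adblock"] = [d for d in blocked if is_whitelisted(d)]
    blocked = [d for d in blocked if not is_whitelisted(d)]
    return blocked, removed
```

Keeping the passes separate makes it possible to report which whitelist source removed each domain, which fits the idea of a dedicated whitelisting log.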

In both cases, the # White count in the logs now refers to how many whitelist entries were found, rather than how many domains were removed from the Blacklist.

This still needs testing, as well as input on a more suitable place to log things.

* Use less memory, write whitelist to disk right away
* Fix regular expression conversion to disallow matching limiter chars
* Do not recreate whitelist file for each alias
* Use a dedicated log file for whitelisting results
* Restrict detailed view on whitelisted domains to log file
* Make the new file visible via the Logs page
* Missing:
  * TOP1M support
  * Whitelist support
  * Testing it
@andrebrait andrebrait force-pushed the pffblockerng_devel_whitelist_regex branch from 60ecbea to b26ee61 Compare October 6, 2023 16:37
* This is the intended behavior for them
* Enabled resolving multiple wildcards for blacklists
@andrebrait andrebrait force-pushed the pffblockerng_devel_whitelist_regex branch from b26ee61 to b9a6874 Compare October 7, 2023 19:57
@andrebrait
Author

Idea: move the .whitelist files under the Permit files category in the Logs tab.

@andrebrait andrebrait force-pushed the pffblockerng_devel_whitelist_regex branch 2 times, most recently from 77537b7 to f926b71 Compare October 12, 2023 22:45
@andrebrait andrebrait force-pushed the pffblockerng_devel_whitelist_regex branch from f926b71 to 40d5d9f Compare October 12, 2023 22:48
@andrebrait
Author

andrebrait commented Feb 8, 2024

Superseded by #1343

@andrebrait andrebrait closed this Feb 8, 2024