
acezxn/Anyscrape

Anyscrape

Scrape any public data with ease

Purpose

The first step in writing a typical web scraper is identifying which elements to scrape. As a result, a great deal of effort goes into researching the structure of each target page. Even worse, some websites change the information on their HTML elements dynamically, or use structures that are unfriendly to web scrapers.

The library is intended to work around anti-scraping structures and to simplify web scraping while leaving ample room for customization.

To simplify configuring the scraper, anyscrape-reader is a GUI application that provides intuitive element selection and filter generation.

Features

Anyscrape

  • Tag name filtering
  • Attribute filtering
  • Location-based filtering with integrated linear expressions
  • Delay customization
  • Cookie support
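
To make the filter types above concrete, here is a minimal sketch of how tag name, attribute, and location-based filtering can combine, using only Python's standard library. It is not Anyscrape's API: `ElementCollector`, `select`, and the `slope`/`offset` parameters are hypothetical names chosen for illustration, and reading the "linear expression" as selecting positions `slope * n + offset` is an assumption.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they are never pushed on the open-element stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementCollector(HTMLParser):
    """Hypothetical helper: records every element in document order."""

    def __init__(self):
        super().__init__()
        self.elements = []  # one dict per element: tag, attrs, direct text
        self._stack = []    # indices into self.elements for currently open tags

    def handle_starttag(self, tag, attrs):
        self.elements.append({"tag": tag, "attrs": dict(attrs), "text": ""})
        if tag not in VOID_TAGS:
            self._stack.append(len(self.elements) - 1)

    def handle_endtag(self, tag):
        # Close the nearest matching open tag; unmatched end tags are ignored.
        for i in range(len(self._stack) - 1, -1, -1):
            if self.elements[self._stack[i]]["tag"] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Text is credited to the innermost open element only.
        if self._stack:
            self.elements[self._stack[-1]]["text"] += data


def select(elements, tag=None, attrs=None, slope=1, offset=0):
    """Hypothetical filter: tag name, then attributes, then positions slope*n + offset."""
    attrs = attrs or {}
    matches = [
        e for e in elements
        if (tag is None or e["tag"] == tag)
        and all(e["attrs"].get(k) == v for k, v in attrs.items())
    ]
    if slope <= 0:
        # A non-positive slope degenerates to the single position `offset`.
        return matches[offset:offset + 1]
    return matches[offset::slope]


HTML = """
<table>
  <tr class="row"><td>Name</td><td>Price</td></tr>
  <tr class="row"><td>Apple</td><td>1.20</td></tr>
  <tr class="row"><td>Pear</td><td>0.80</td></tr>
</table>
"""

collector = ElementCollector()
collector.feed(HTML)

# Attribute filter: all <tr> elements whose class is "row".
rows = select(collector.elements, tag="tr", attrs={"class": "row"})

# Location filter: among the <td> cells, positions 2*n + 3 (3, 5) hold the prices.
prices = [e["text"] for e in select(collector.elements, tag="td", slope=2, offset=3)]
print(prices)  # → ['1.20', '0.80']
```

Selecting by position rather than by class name or id is what keeps a filter like this usable on pages whose attribute values change between loads, since only the element order has to stay stable.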

Reader

  • Interactive HTML element selection
  • Automatic attribute detection and filtering
  • Element location detection
  • Configuration export

For example usage of the library, see the examples.

About

⚡️An overengineered web scraper⚡️
