Node.js-based website downloader

Download a website locally, without any configuration, right from your terminal.

Note: The script is based entirely on node-website-scraper, an awesome website scraper library :)

Requirements

Node.js version >= 8

Installation

npm install -g node-site-downloader

Usage

node-site-downloader download -d DOMAIN -s START_POINT -o OUTPUT_FOLDER [-v] [--output-folder-suffix OUTPUT_FOLDER_SUFFIX] [--include-images]

Example

# Download all of the English Jest documentation
node-site-downloader download -s https://jestjs.io/docs/en/getting-started -d https://jestjs.io/docs/en/ -o jest-docs -v --include-images

For more information, run:

node-site-downloader --help
node-site-downloader download --help

Docker support

You can run the downloader straight from a Docker container, so there is no need to install Node.js or node-site-downloader.

Instead, pull the image from Docker Hub:

docker pull gnird/node-site-downloader

Then run the container, passing the script all of the relevant options (see the Options section) except --output-folder.

--output-folder is not passed because the script saves the site inside the container.

Instead, mount a folder from your computer as a volume at /data in the container:

docker run -v /some/path:/data ...

Docker example

docker run -v /tmp/mysite:/data gnird/node-site-downloader download -d https://jestjs.io/docs/en/ -s https://jestjs.io/docs/en/getting-started -v

NOTICE: The first -v configures the volume for the container and the second -v (at the end of the command) is passed to the script in order to make it verbose.
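The two roles of -v can be kept apart with a small wrapper. This is only a sketch: the function name download_in_docker and the DOCKER override variable are assumptions made for illustration and are not part of node-site-downloader. The image name and the /data mount point are taken from the commands above.

```shell
# Hypothetical helper: the function name and the DOCKER override are
# assumptions made for this sketch, not part of node-site-downloader.
download_in_docker() {
  host_dir=$1
  shift
  # Resolve relative paths first: docker reads a bare name such as
  # "mysite" as a named volume, not as a folder on the host.
  case "$host_dir" in
    /*) ;;
    *) host_dir="$PWD/$host_dir" ;;
  esac
  # The -v on this line belongs to docker (volume mapping); any -v in
  # "$@" is passed through to the script (verbose).
  "${DOCKER:-docker}" run -v "$host_dir:/data" gnird/node-site-downloader download "$@"
}
```

With this in place, `download_in_docker /tmp/mysite -d https://jestjs.io/docs/en/ -s https://jestjs.io/docs/en/getting-started -v` would run the same command as the Docker example, and setting DOCKER=echo prints the command instead of running it.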

Options

domain (-d) - The script will download all of the URLs under the specified URL.
start point (-s) - The page from which the script should start scraping.
include-images (--include-images) - If the flag is present, the script also downloads relevant images.
output folder (-o, --output-folder) - The folder in which the script should save the downloaded assets. Note: the folder must not already exist!
verbose (-v) - If the flag is present, the script prints every URL that was downloaded.
output folder suffix (--output-folder-suffix) - The suffix that will be added to OUTPUT_FOLDER. Defaults to .site
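
To make the domain and output folder options more concrete, here is a rough shell sketch of how they might behave. Both helper names (final_folder, under_domain) are hypothetical. The sketch assumes that the suffix is simply appended to OUTPUT_FOLDER and that "under the specified URL" means a plain prefix match; the script itself may apply different rules.

```shell
# Hypothetical helpers for illustration; not part of node-site-downloader.

# Print the folder name the options describe: OUTPUT_FOLDER followed by
# the suffix, which falls back to ".site" when no suffix is given.
final_folder() {
  printf '%s%s\n' "$1" "${2-.site}"
}

# Succeed when the URL in $2 starts with the domain prefix in $1.
# This is an assumed reading of "under the specified URL".
under_domain() {
  case "$2" in
    "$1"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `[ ! -e "$(final_folder jest-docs)" ]` could serve as a quick check, before starting a download, that the target folder does not exist yet.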
