GitHub - gnir-work/node-site-downloader: An easy to use CLI for downloading websites for offline usage

NodeJS based website downloader

Download a website locally without any configuration, right from your terminal.

Note: The script is based entirely on node-website-scraper, an awesome website scraper library :)

Requirements

Node.js version >= 8

Installation

npm install -g node-site-downloader

Usage

node-site-downloader download DOMAIN START_POINT OUTPUT_FOLDER [VERBOSE] [OUTPUT_FOLDER_SUFFIX] [INCLUDE_IMAGES]

Example

# Download all of the English Jest documentation
node-site-downloader download -s https://jestjs.io/docs/en/getting-started -d https://jestjs.io/docs/en/ -o jest-docs -v --include-images

For more information please run

node-site-downloader --help
node-site-downloader download --help

Docker support

You can also run the downloader straight from a Docker container, so there is no need to install Node.js or node-site-downloader.

Instead, pull the image from Docker Hub:

docker pull gnird/node-site-downloader

Then run the container with all of the relevant options passed to the script (see the Options section), except for --output-folder.

--output-folder isn't passed to the container because the script saves the site inside the container.

Instead, mount a folder from your computer as a volume at /data in the container:

docker run -v /some/path:/data ...

Docker example

docker run -v /tmp/mysite:/data gnird/node-site-downloader download -d https://jestjs.io/docs/en/ -s https://jestjs.io/docs/en/getting-started -v

NOTICE: The first -v configures the volume for the container and the second -v (at the end of the command) is passed to the script in order to make it verbose.
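Because the two -v flags are easy to mix up, the following minimal JavaScript sketch assembles the same command from its parts. `buildDockerArgs` is a hypothetical helper written for illustration and is not part of node-site-downloader. It only uses the flags shown in the Docker example above.

```javascript
// Hypothetical helper, not part of node-site-downloader.
// Builds the argument list for `docker run` using only flags shown in this README.
function buildDockerArgs(hostFolder, domain, startPoint, verbose = false) {
    const args = [
        "run",
        "-v", `${hostFolder}:/data`, // Docker's -v: mount the host folder at /data
        "gnird/node-site-downloader",
        "download",
        "-d", domain,
        "-s", startPoint,
    ];
    if (verbose) {
        args.push("-v"); // the script's -v: print every downloaded url
    }
    return args;
}

const exampleArgs = buildDockerArgs(
    "/tmp/mysite",
    "https://jestjs.io/docs/en/",
    "https://jestjs.io/docs/en/getting-started",
    true
);
// Prints the same command as the Docker example.
console.log(["docker", ...exampleArgs].join(" "));
```

Everything before the image name belongs to Docker, and everything after it is passed to the script. The resulting array could be handed to Node's `child_process.spawn("docker", exampleArgs)` if you want to launch the container from a script.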

Options

domain (-d) - The script will download all of the URLs under the specified URL.
start point (-s) - The page from which the script should start scraping.
include-images (--include-images) - If the flag is present, the script also downloads relevant images.
output folder (-o, --output-folder) - The folder in which the script should save the downloaded assets. Note: The folder should not exist!
verbose (-v) - If the flag is present, the script prints every URL that was downloaded.
output folder suffix (--output-folder-suffix) - The suffix that will be added to OUTPUT_FOLDER. Defaults to: .site
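To make the domain and output folder suffix options more concrete, here is a minimal JavaScript sketch of one way they could behave. It is not taken from the project's source. `isUnderDomain` and `resolveOutputFolder` are hypothetical helpers. The plain prefix comparison is an assumption about what "under the specified url" means, and appending the suffix directly to the folder name is an assumption about how the suffix is applied.

```javascript
// Hypothetical helpers, not part of node-site-downloader's API.

// Assumption: a url is "under" the domain when it starts with the domain url.
function isUnderDomain(url, domain) {
    return url.startsWith(domain);
}

// Assumption: the suffix is appended directly to OUTPUT_FOLDER.
function resolveOutputFolder(outputFolder, suffix = ".site") {
    return `${outputFolder}${suffix}`;
}

const domain = "https://jestjs.io/docs/en/";
console.log(isUnderDomain("https://jestjs.io/docs/en/getting-started", domain)); // true
console.log(isUnderDomain("https://jestjs.io/blog/", domain)); // false
console.log(resolveOutputFolder("jest-docs")); // jest-docs.site
```

Under these assumptions, the command from the Example section would keep only pages below https://jestjs.io/docs/en/ and write them to a folder named jest-docs.site.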
