Guide to Running Web Interface on AWS EC2 #87

mattyjacks · 2024-10-21T16:15:54Z

I managed to get this thing working via AWS EC2! YAY! I decided to write a little guide on it.

First, launch an Ubuntu Server 24.04 LTS instance. I use a t2.xlarge ($0.18 per hour; you can stop it when you're not using it to save money) with 25 GB of storage.

Then connect to the instance. EC2 Instance Connect with the default username is fine.

Here are the commands you have to run:

git clone https://github.com/gosom/google-maps-scraper.git

sudo apt-get update

sudo apt install golang-go

sudo apt-get install libatk1.0-0 libatk-bridge2.0-0 libcups2 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libpango-1.0-0 libcairo2 liboss4-salsa-asound2

sudo apt-get update

sudo apt-get upgrade

sudo apt install nodejs npm

sudo npm install -g playwright

sudo apt-get install libasound2 libasound2-plugins

rm -rf ~/.cache/ms-playwright

playwright install

sudo npx playwright install-deps

uname -m

npx playwright install firefox

npx playwright install webkit

cd google-maps-scraper

go mod download

go build

(Set the number after -c to one less than the number of cores on your EC2 instance. The instance I chose has 4 cores, so I use 3.)

./google-maps-scraper -web -c 3
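The "one less than the number of cores" rule can be scripted so the value doesn't have to be worked out by hand. This is only a sketch: it assumes `nproc` from GNU coreutils is available (it is on stock Ubuntu), and `worker_count` is a made-up helper name, not part of the scraper.

```shell
# worker_count is a hypothetical helper, not part of google-maps-scraper.
# It prints one less than the given core count, but never less than 1.
worker_count() {
  if [ "$1" -gt 1 ]; then
    echo $(( $1 - 1 ))
  else
    echo 1
  fi
}

# Show the suggested -c value for this machine, if nproc is installed.
if command -v nproc >/dev/null 2>&1; then
  echo "suggested -c value: $(worker_count "$(nproc)")"
fi
```

With the function defined in your shell, the launch command could become `./google-maps-scraper -web -c "$(worker_count "$(nproc)")"`.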

Edit the inbound security group rules of the EC2 instance to allow port 8080 from anywhere.

Visit port 8080 on the public IP address of the EC2 instance, for example http://54.147.206.100:8080. Be sure to use HTTP instead of HTTPS or it won't connect.
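If the page doesn't load in the browser, a quick check from your own machine can tell you whether the port is reachable over plain HTTP. A rough sketch, assuming `curl` is installed; `scraper_url` and `check_scraper` are illustrative names I made up, and the IP is just the example one from above.

```shell
# scraper_url is a hypothetical helper: builds the plain-HTTP URL for the web UI.
# $1 = public IP or hostname, $2 = port (defaults to 8080)
scraper_url() {
  echo "http://$1:${2:-8080}"
}

# check_scraper prints the HTTP status code the server answers with.
# curl prints 000 when no connection could be made at all, which usually
# points at the security group rule or at the scraper not running.
check_scraper() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$(scraper_url "$@")"
}
```

For example, `check_scraper 54.147.206.100` should print a status code if the web interface is up and the port is open to you.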

(Screenshot: the scraper's web interface in action.)

THANK YOU @gosom FOR YOUR WONDERFUL TOOL!


gosom · 2024-10-21T16:21:16Z

@mattyjacks it is nice that it works for you but I have a few points here:

(1) For security reasons, the webapp is NOT DESIGNED (at the moment) to be publicly available. I HIGHLY recommend you IMMEDIATELY allow ONLY your IP to access the tool until an authentication system is in place.

(2) I think it's easier to run it via a docker container.

Thank you very much for trying this on AWS.

mattyjacks · 2024-10-21T16:29:14Z

Thank you for the quick response.

In response to 1: I'll be shutting down the tool as soon as this scrape job is finished (to save money), and when I revive it Amazon will assign a new IP address anyway. I wasn't planning on sharing the IP address that would let others access it.

In response to 2: Yeah, probably. I've never used docker before, tho.

I'm overall very satisfied with the result. One huge advantage of the AWS EC2 approach is it's not tying my IP address to the scraping activity in Google's eyes. Pretty paranoid about getting banned from Google.

gosom · 2024-10-21T16:32:21Z

Even if you are not sharing the IP, this is still not safe. People might break into your server.

I recommend allowing connections only from your IP address in the firewall.
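On EC2 the "firewall" here is the instance's security group, so one way to do this is with the AWS CLI. The following is a sketch under assumptions, not something from the tool itself: it assumes the AWS CLI is installed with credentials configured, that your public IP can be read from https://checkip.amazonaws.com, and `to_cidr` / `allow_my_ip` are hypothetical helper names. The security group ID is a placeholder you have to supply.

```shell
# to_cidr is a hypothetical helper: turns a single IPv4 address into a /32
# CIDR block, which matches exactly that one address.
to_cidr() {
  echo "$1/32"
}

# allow_my_ip opens TCP port 8080 on a security group to the caller's
# current public IP only.
# $1 = security group ID (placeholder, e.g. sg-0123456789abcdef0)
allow_my_ip() {
  my_ip="$(curl -s https://checkip.amazonaws.com)"
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 8080 \
    --cidr "$(to_cidr "$my_ip")"
}
```

If a rule allowing 8080 from anywhere was already added, it has to be removed as well, for example with `aws ec2 revoke-security-group-ingress --group-id <your-group-id> --protocol tcp --port 8080 --cidr 0.0.0.0/0`. The same change can be made in the console by editing the inbound rule's source to "My IP".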

Additionally, you might consider using proxies if you want to mask your IP address.

In any case the tool is for educational purposes only.
