Geocoding with DeGAUSS
The input file must be a CSV file with a column containing an address string. Other columns may be present and will be returned in the output file, but should be kept to a minimum to reduce file size.
An example input CSV file (called my_address_file.csv) might look like:
id,address
001,3333 Burnet Ave Cincinnati OH 45229
002,660 Lincoln Avenue Cincinnati OH 45229
003,2800 Winslow Avenue Cincinnati OH 45206
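To follow along without an address file of your own, the example above can be recreated from the shell. This is only a convenience sketch: it assumes a POSIX shell and writes my_address_file.csv into whatever directory you are currently in, overwriting any file of that name.

```shell
# Write the three example rows to my_address_file.csv in the current directory.
cat > my_address_file.csv <<'EOF'
id,address
001,3333 Burnet Ave Cincinnati OH 45229
002,660 Lincoln Avenue Cincinnati OH 45229
003,2800 Winslow Avenue Cincinnati OH 45206
EOF

# Show the header row to confirm the name of the address column.
head -n 1 my_address_file.csv
```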
Please see our geocoding documentation for more information on the geocoding process, how to interpret the output, and tips for getting the best results.
After opening the Docker Quickstart Terminal (or a shell on Linux), navigate to the directory that contains the CSV file to be geocoded. See here for help on navigating a filesystem using the command line.
For those unfamiliar with the command line, the simplest approach might be to put the file to be geocoded on the desktop, start the Docker Quickstart Terminal, and then navigate to the desktop folder with cd Desktop.
Run:
docker run --rm=TRUE -v "$PWD":/tmp degauss/geocoder <name-of-file> <address-column-name>
replacing <name-of-file> with the name of the CSV file to be geocoded and <address-column-name> with the name of the column in the CSV file that contains the address strings.
Continuing with the example address file above, we can use:
docker run --rm=TRUE -v "$PWD":/tmp degauss/geocoder my_address_file.csv address
To avoid headaches, don't use a file with spaces in the filename or in the address column name. When issuing the geocoding docker command, make sure to include the .csv filename extension, even if it doesn't show up in your system file browser.
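The advice above can be checked before the container is started. The following is a minimal sketch, not part of DeGAUSS: check_geocoder_input is a hypothetical helper name, and the header check assumes a simple header row whose column names contain no commas.

```shell
# check_geocoder_input FILE COLUMN
# Hypothetical helper (not part of DeGAUSS): returns 0 when FILE and COLUMN
# follow the advice above, otherwise prints a reason to stderr and returns 1.
check_geocoder_input() {
  file=$1
  column=$2

  case "$file" in
    *" "*) echo "file name contains a space: $file" >&2; return 1 ;;
  esac
  case "$file" in
    *.csv) ;;
    *) echo "file name does not end in .csv: $file" >&2; return 1 ;;
  esac
  case "$column" in
    *" "*) echo "column name contains a space: $column" >&2; return 1 ;;
  esac
  if [ ! -f "$file" ]; then
    echo "file not found in $(pwd): $file" >&2
    return 1
  fi
  # Split the header row on commas and look for COLUMN as an exact field.
  if ! head -n 1 "$file" | tr -d '\r"' | tr ',' '\n' | grep -qxF -- "$column"; then
    echo "column '$column' not found in header of $file" >&2
    return 1
  fi
  return 0
}

# Example: only start the container when the check passes.
# check_geocoder_input my_address_file.csv address &&
#   docker run --rm=TRUE -v "$PWD":/tmp degauss/geocoder my_address_file.csv address
```

The function only reads the first line of the file, so it is quick even for large inputs.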
If run successfully, the shell should show a progress bar while geocoding, and the geocoded file will be written to the current working directory under the same name as the input file but with _geocoded appended to the file name.
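As a small sketch of that naming rule, the expected output name can be derived from the input name in the shell. geocoded_name is a hypothetical helper, and it assumes _geocoded is inserted just before the .csv extension.

```shell
# geocoded_name FILE
# Hypothetical helper: prints the output name expected for FILE, assuming
# "_geocoded" is inserted just before the ".csv" extension.
geocoded_name() {
  printf '%s_geocoded.csv\n' "${1%.csv}"
}

# Example: report whether the expected output file is present yet.
out=$(geocoded_name my_address_file.csv)
if [ -f "$out" ]; then
  echo "found $out"
else
  echo "not found yet: $out"
fi
```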
Don't forget that the first time this image is called, Docker has to download it before geocoding can start. Although the download is quite large (~6 GB), it only has to happen once.
The output file is written to the same directory and, in our example, will be called my_address_file_geocoded.csv:
"address","id","street","zip","city","state","lat","lon","score","prenum","number","precision"
"2800 Winslow Avenue Cincinnati OH 45206","003","Winslow Ave","45206","Cincinnati","OH",39.130586,-84.49631,0.941,"","2800","range"
"3333 Burnet Ave Cincinnati OH 45229","001","Burnet Ave","45229","Cincinnati","OH",39.14089,-84.500402,0.949,"","3333","range"
"660 Lincoln Avenue Cincinnati OH 45229","002","Lincoln Ave","45206",NA,NA,39.13282,-84.494724,0.805,"","660","range"
This output file also contains diagnostic information on the precision and method used for geocoding each address. See here for more details on interpreting the output. These geocodes can be used to create maps of subject locations or can be passed on to other DeGAUSS containers for geomarker assessment.