postal

Using

If my_address_file.csv is a file in the current working directory with an address column named address, then the DeGAUSS command:

docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/postal:0.1.4 my_address_file.csv

will produce my_address_file_postal_0.1.4.csv with added columns:

cleaned_address: address with non-alphanumeric characters and excess whitespace removed (with dht::clean_address())
parsed.{address_component}: multiple columns, one for each parsed address component (e.g., parsed.road, parsed.state, parsed.house_number)
parsed_address: a "parsed" address created by pasting together available parsed.house_number, parsed.road, parsed.city, parsed.state, and the first five digits of the parsed.postcode address components
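
The way parsed_address is assembled can be illustrated with a short Python sketch. This is not the container's own code (the repository's entrypoint is written in R); the helper name build_parsed_address, the dictionary layout, and the example input are hypothetical. It assumes that missing components are skipped and that the remaining parts are joined with single spaces.

```python
# Component order as described for parsed_address.
COMPONENT_ORDER = ["house_number", "road", "city", "state", "postcode"]


def build_parsed_address(components):
    """Join the available parsed components into one string (hypothetical helper)."""
    parts = []
    for name in COMPONENT_ORDER:
        value = components.get(name)
        if not value:
            # Skip components the parser did not find (assumption).
            continue
        if name == "postcode":
            # Keep only the first five characters, e.g. drop a ZIP+4 suffix.
            value = value[:5]
        parts.append(value)
    return " ".join(parts)


# Made-up parsed components, for illustration only.
example = {
    "house_number": "3333",
    "road": "burnet ave",
    "city": "cincinnati",
    "state": "oh",
    "postcode": "45229-3026",
}
print(build_parsed_address(example))  # 3333 burnet ave cincinnati oh 45229
```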

Optional Argument

After parsing, the parsed addresses can be expanded into several possible normalized addresses using libpostal. This can be useful for matching these addresses against other messy, real-world addresses.

If any value is provided as an argument (e.g., "expand"), then the DeGAUSS command:

docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/postal:0.1.4 my_address_file.csv expand

will produce my_address_file_postal_0.1.4_expand.csv with the above columns plus:

expanded_addresses: the expanded addresses for parsed_address

Because each parsed_address will likely yield more than one expanded address, each input row is duplicated, once per expanded address, so that every output row holds a single value of expanded_addresses. This means that when expanding addresses, the output CSV file will usually have more rows than the input CSV file.
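
The row duplication can be pictured with a minimal Python sketch. The helper expand_rows, the id column, and the expansions shown are invented for illustration and are not actual libpostal output. The sketch assumes every parsed_address has at least one expansion.

```python
def expand_rows(rows, expansions):
    """Repeat each input row once per expanded address (hypothetical helper).

    rows: list of dicts, one per input row, each with a "parsed_address" key.
    expansions: dict mapping a parsed_address to its list of expanded addresses.
    """
    out = []
    for row in rows:
        for expanded in expansions[row["parsed_address"]]:
            new_row = dict(row)  # copy so the input row is left untouched
            new_row["expanded_addresses"] = expanded
            out.append(new_row)
    return out


# Two input rows; the expansions are made up for this example.
rows = [
    {"id": 1, "parsed_address": "123 main st"},
    {"id": 2, "parsed_address": "9 oak ave"},
]
expansions = {
    "123 main st": ["123 main street", "123 main saint"],
    "9 oak ave": ["9 oak avenue"],
}

result = expand_rows(rows, expansions)
print(len(result))  # 3
```

Here the row with id 1 appears twice in the result, once per expansion, while the row with id 2 appears once.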

Geomarker Methods

Input addresses are parsed/normalized using libpostal by:

removing non-alphanumeric characters (except -) and excess whitespace (with dht::clean_address())
parsing addresses into components using libpostal/src/address_parser (a machine learning model trained on OpenStreetMap and OpenAddresses)
(with an optional argument) expanding the parsed address into several possible normalized addresses
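
The cleaning step above can be approximated in a few lines of Python. This is only a sketch of the described behavior, not the dht::clean_address() implementation: the function name clean_address_sketch is hypothetical, and replacing removed characters with a space and leaving letter case unchanged are both assumptions.

```python
import re


def clean_address_sketch(address):
    """Approximate the described cleaning step (hypothetical helper)."""
    # Replace every character that is not a letter, digit, or hyphen with a space.
    kept = re.sub(r"[^A-Za-z0-9-]", " ", address)
    # Collapse runs of whitespace and trim both ends.
    return " ".join(kept.split())


print(clean_address_sketch("  3333 Burnet Ave.,  Apt #2-B "))  # 3333 Burnet Ave Apt 2-B
```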

DeGAUSS Details

For detailed documentation on DeGAUSS, including general usage and installation, please see the DeGAUSS homepage.
