This is a repository to store the functions used within the Neotoma Database for data access and updates. The Neotoma is a Postgres-based system, migrated from a SQL Server database. As such, at the current time, this repository largely reflects the state of transition between the two systems.
All individuals are welcome to contribute to this repository. Contributions are subject to the Code of Conduct for this repository.
- Steve Crawford - Penn State
- Simon Goring - University of Wisconsin
- Mike Stryker - Penn State
- Anna George - University of Wisconsin
- Jack Williams - University of Wisconsin
The repository is divided into two main folders, legacy
and function
:
legacy
folder is intended to act as the store for the now deprecated SQL Server Stored Procedures.function
folder represents the newer Postgres functions. This folder is then divided into folders for each individual database schema.
This structure is not necessary, but has been implemented to help manage the workflow of rewriting the large number of functions associated with the original database.
Neotoma maintains five separate databases within the PSU Postgres server. Of these five, three are significant, each has a specific function:
neotomadev
: Main development database. Currently this is the parent database and should be the most accurate/up to date version of the database. This is the database that the API points to.neotoma
: A database for experimentation. This version of the database may be out of sync with the SQL Server andneotomadev
as it is used to build and prototype new tools.neotomatilia
: A database for testing and managing Tilia services.
There are several files that have been used in the transition and are included here for posterity.
rewrite_funs.R
is a short R script that was intended to rewrite any function in the legacyTI
schema that used the formatSELECT * FROM table
.tilia_check.py
was intended to check against existing functions in theti
namespace to help ensure alignment with the Tilia software package designed to work with the Neotoma Paleoecology Database.
This repository is currently intended to be read-only with respect to the databases (neotoma
, neotomadev
and neotomatilia
). The stored postgres functions can be updated using the Python3 script connect_remote.py
by calling:
python3 connect_remote.py
This program checks the pg_catalog
for the stated database and pulls each function within a defined set of namespaces, and returns each function as its own sql
file to a folder in the function
directory.
connect_remote.py
has built in help that can be accessed using:
connect_remote.py -h
usage: connect_remote.py [-h] [-dev] [-push] [-g [PULLGIT]] [-tilia]
Check Neotoma SQL functions against functions in the online database servers
(`neotoma` and `neotomadev`).
optional arguments:
-h, --help show this help message and exit
-dev Use the `dev` database? (`False` without the flag)
-push Assume that SQL functions in the repository are newer, push to
the db server.
-g [PULLGIT] Pull from the remote git server before running?.
-tilia Use the `dev` database? (`False` without the flag)
A user should make sure that local changes are committed and pushed before using the script, but in the case that changes were made directly to the GitHub repository online, they may use the -g
flag.
Expected use should -push
to each of the three databases.
Functions are expected to follow (as much as possible) Simon Holywell's SQL Style Guide and the Postgres Coding Convention.
- Functions should be generated in standalone files
- Functions should be named using verbs
- Functions should refrain from using vendor specific functions
- Use C style commenting: /* . . . */
For more insight into the structure of the Neotoma Database (specifically the ndb
schema, which contains the core data of the database), visit the Neotoma Database Documentation. Documentation was generated using the SchemaSpy software.
Please feel free to raise issues using the issue tracker on this repository.