Normalize GeoPandas support #200
Comments
here are some thoughts on how we could do this. I'd propose the signature for class CompassApp:
def run(self, queries, gdf: bool = False, index_col: Optional[str] = None)
Top-level structure
In my mind a route is mapped to a single row of a dataframe. I think for a tree, a link is mapped to a row. Because their schemas are different, I think we want the output of a
This is similar to
Returning Routes
Summary row data
Any row in a route GeoDataFrame should have all traversal and cost columns that may be relevant to post-processing:
All of these should be set only if they exist (optional semantics).
Geo row identifiers
The route geometries are parsed as geometry objects. A
As a result, a row is uniquely indexed by the _index column when there is one route per row, and by the combination of _index and _path when route.path has more than one entry. Put another way: we end up with k * q rows for q queries and k paths.
Geometry data
We need to cover the possible geometry output types. We can know in advance what the geometry type is by inspecting the configuration of the CompassApp (via #201). We can then switch our geometry parsing method accordingly:
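To make the row indexing described above concrete (k * q rows keyed by _index and _path), here is a minimal sketch in plain Python. It assumes, purely for illustration, that each result is a dict whose route.path entry is a list with one element per path; the function name route_rows and the "geometry" key are hypothetical, and geometry parsing is left out entirely.

```python
from typing import Any, Dict, List


def route_rows(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten results into one row per (query, path) pair."""
    rows: List[Dict[str, Any]] = []
    for index, result in enumerate(results):
        # Assumed layout: result["route"]["path"] is a list, one entry per path.
        paths = result.get("route", {}).get("path", [])
        for path_number, geometry in enumerate(paths):
            # The raw geometry is carried through unparsed in this sketch.
            rows.append(
                {"_index": index, "_path": path_number, "geometry": geometry}
            )
    return rows


# Two queries (q = 2) with two paths each (k = 2) give k * q = 4 rows.
example_results = [
    {"route": {"path": ["path-a0", "path-a1"]}},
    {"route": {"path": ["path-b0", "path-b1"]}},
]
example_rows = route_rows(example_results)
```

Row dicts like these could then be handed to a DataFrame constructor and indexed on the two columns, for example with `pandas.DataFrame(rows).set_index(["_index", "_path"])`.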
Returning trees
Trees are similar but different. Each Compass output row may have an entire tree, but we do not want a GDF of trees (plural); we want instead to plot a single tree. Following the logic above, we can add one more index column so that each row is a link in a tree in a result:
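The one-row-per-link idea for trees can be sketched the same way. This is only an illustration under assumptions: it supposes each result carries a "tree" entry that is a list of link records (dicts), and the extra index column is given the hypothetical name _link, since the thread does not name it here.

```python
from typing import Any, Dict, List


def tree_rows(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten results into one row per link of each result's tree."""
    rows: List[Dict[str, Any]] = []
    for index, result in enumerate(results):
        # Assumed layout: result["tree"] is a list of link records (dicts).
        for link_number, link in enumerate(result.get("tree", [])):
            # "_link" is a hypothetical name for the extra index column.
            rows.append({"_index": index, "_link": link_number, **link})
    return rows


def single_tree(rows: List[Dict[str, Any]], index: int) -> List[Dict[str, Any]]:
    """Select the links of one tree, for example to plot it on its own."""
    return [row for row in rows if row["_index"] == index]
```

Selecting by _index keeps the "plot a single tree" use case simple while still allowing all links to live in one table.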
btw, if we do what's proposed above, then i feel like all of our plotting functions can disappear and get replaced by GeoDataFrame plotting methods, which for example support the feature described in #199, but also can swap-in replace the existing
Just thinking about it. Maybe instead of gdf: bool we provide
working with Compass right now in a python environment, and i want my run result as a dataframe. my result has thousands of rows, so, avoiding a copy would be ideal. that's where we should be using polars. two thoughts:
edit: then, making this result a geopandas.GeoDataFrame could be simple, with the user providing a JSONPath to the geometry object
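As a rough illustration of the "user provides a path to the geometry" idea, the sketch below resolves a simplified dotted path against a nested result. Note that this is a stand-in and not real JSONPath: the helper name get_by_path, the path syntax, and the result layout in the example are all assumptions made for illustration.

```python
from typing import Any


def get_by_path(obj: Any, path: str) -> Any:
    """Resolve a dotted path such as 'route.path.0.geometry'.

    Dict levels are looked up by key; list levels by integer position.
    """
    current = obj
    for key in path.split("."):
        if isinstance(current, list):
            current = current[int(key)]
        else:
            current = current[key]
    return current


# Hypothetical result layout, used only to exercise the helper.
example_result = {"route": {"path": [{"geometry": "LINESTRING (0 0, 1 1)"}]}}
example_geometry = get_by_path(example_result, "route.path.0.geometry")
```

The value found at that path would still need to be parsed into a geometry object before building a geopandas.GeoDataFrame.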
Yeah, I really like the idea of using polars here for zero-copy operations, and I like that this will still allow support of geopandas if we use the pyarrow backend.
The result of a CompassApp::run is a Python object which has some standard keys and structures that are known/specified internally in the Rust codebase. Handling this monolithic object in Python is counter-intuitive since we are not leveraging Pandas/GeoPandas, which are the standard formats for analysis of columnar (geo) data.
The CompassApp::run method could include a geopandas: bool = False (or gdf if that's more Pythonic) argument which will assume the user wants GeoDataFrame outputs for route and tree data when it is True, but output the monolithic object when False (the default).