Fix issue 3 #4
Loading data from pyphenology as below:
observations, predictors = utils.load_test_data(
    name="vaccinium", phenophase="budburst"
)
print(observations.head(5))
                species  site_id  year  doy  phenophase
0  vaccinium corymbosum        1  1991  100         371
1  vaccinium corymbosum        1  1991  100         371
2  vaccinium corymbosum        1  1991  104         371
3  vaccinium corymbosum        1  1998  106         371
4  vaccinium corymbosum        1  1998  106         371
print(observations.shape)
(48, 5)
print(predictors.head(5))
   site_id  temperature  year  doy  latitude  longitude  daylength
0        1        13.10  1990  -65   42.5429   -72.2011      10.24
1        1        13.26  1990  -64   42.5429   -72.2011      10.20
2        1        12.30  1990  -63   42.5429   -72.2011      10.16
3        1        12.15  1990  -62   42.5429   -72.2011      10.11
4        1        13.00  1990  -61   42.5429   -72.2011      10.07
print(predictors.shape)
(4356, 7)
The data is preprocessed internally as:
observed_doy, temperature_array, doy_series = mu.misc.temperature_only_data_prep(
    observations, predictors, for_prediction=False
)
print(observed_doy.shape)
(48,)
print(temperature_array.shape)
(363, 48)
print(doy_series.shape)
(363,)
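To make the shapes above concrete, here is a minimal sketch of how long-format predictors could be reshaped into a `(n_doy, n_obs)` temperature array. It is not the pyPhenology implementation: the helper name `to_arrays` and the toy data are assumptions made for illustration, and only the column names and output shapes follow the printout above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the vaccinium dataset): three observations,
# two of which share the same site and year, as in the printout above.
observations = pd.DataFrame({
    "site_id": [1, 1, 2],
    "year": [1991, 1991, 1992],
    "doy": [100, 100, 98],
})

# Long-format daily temperatures: one row per site, year and day of year.
# Negative DOY values stand for days in the preceding autumn.
doys = np.arange(-2, 4)
predictors = pd.DataFrame(
    [(s, y, d, 10.0 + s + 0.1 * d)
     for s in (1, 2) for y in (1991, 1992) for d in doys],
    columns=["site_id", "year", "doy", "temperature"],
)


def to_arrays(observations, predictors):
    """Illustrative helper: return (observed_doy, temperature_array, doy_series)."""
    # One row per (site, year), one column per DOY.
    wide = predictors.pivot_table(
        index=["site_id", "year"], columns="doy", values="temperature"
    )
    doy_series = wide.columns.to_numpy()
    # Attach the matching temperature row to every observation, keeping order.
    merged = observations.merge(
        wide.reset_index(), on=["site_id", "year"], how="left"
    )
    # Transpose so that rows are days and columns are observations.
    temperature_array = merged[doy_series.tolist()].to_numpy().T
    observed_doy = observations["doy"].to_numpy()
    return observed_doy, temperature_array, doy_series


observed_doy, temperature_array, doy_series = to_arrays(observations, predictors)
print(observed_doy.shape, temperature_array.shape, doy_series.shape)
```

Each column of `temperature_array` is the temperature series for one observation, which is why the second dimension matches the number of observations.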
@Peter9192 I found that applying
Now there are still two issues:
That's okay, I think. If we document it properly.
Perhaps instead of checking the types, we can check the shapes? For pyphenology,
Okay, I suppose that should be fine. Well spotted. Maybe copy this note to an inline comment?
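For reference, a shape-based check could look roughly like the sketch below. The function name `check_shapes` is hypothetical. The rule that one axis of the predictors must match the number of observations is an assumption based on the shapes shown in this thread, not something taken from the pyPhenology source.

```python
import numpy as np


def check_shapes(observations, predictors):
    """Hypothetical validator: accept any array-like input with compatible shapes."""
    obs = np.asarray(observations)
    pred = np.asarray(predictors)
    if obs.ndim != 1:
        raise ValueError(f"observations must be 1-D, got shape {obs.shape}")
    if pred.ndim != 2:
        raise ValueError(f"predictors must be 2-D, got shape {pred.shape}")
    # Assumption: one axis of predictors indexes the observations.
    if obs.shape[0] not in pred.shape:
        raise ValueError(
            f"no axis of predictors {pred.shape} matches "
            f"{obs.shape[0]} observations"
        )
    return obs, pred


# Shapes taken from the preprocessing output earlier in this thread.
obs, pred = check_shapes(np.zeros(48), np.zeros((363, 48)))
print(obs.shape, pred.shape)
```

Because the inputs go through `np.asarray`, lists and pandas objects are accepted as long as their shapes line up.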
Nice progress! I'm wondering if it also works for some of the other pyphenology models now?
Looking at the existing checks in
In
"latitude" is not in
not for
Not sure about that. I'm already very happy that it works for all the others! Let's wrap it up without M1 for now
Tried using this with the following:
import pandas as pd
import numpy as np
import geopandas as gpd
from pyPhenology import models
# Generate test data
# random observations for 10 points:
obs = gpd.GeoDataFrame(
    data={
        'year': np.arange(2000, 2010),
        'DOY_firstbloom': np.random.randint(120, 180, size=10),
        'geometry': gpd.GeoSeries.from_xy(*np.random.randn(2, 10)),
    },
)
# dummy temperature data for each of these years/locations, for each DOY
get_temperature = lambda year, geometry: pd.Series(np.random.randn(365), index=np.arange(1, 366), name='temperature')
weather = obs.apply(lambda row: get_temperature(row.year, row.geometry), axis=1)
# This works
model = models.ThermalTime()
model.fit(observations=obs.DOY_firstbloom.values, predictors=weather.values)
model.get_params()
# This doesn't
model = models.ThermalTime()
model.fit(observations=obs.DOY_firstbloom, predictors=weather)
model.get_params()
That's why I still think checking based on something other than types is useful.
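The difference between the two calls can be shown without pyPhenology. The sketch below contrasts a type-based check with a duck-typed shape check on pandas inputs. Both helper names are hypothetical, and the type-based version is only an assumption about what a check on types might look like.

```python
import numpy as np
import pandas as pd


def is_numpy_input(observations, predictors):
    """Type-based check (assumed form): only plain numpy arrays pass."""
    return isinstance(observations, np.ndarray) and isinstance(predictors, np.ndarray)


def has_array_shapes(observations, predictors):
    """Shape-based check: anything exposing a 1-D and a 2-D `.shape` passes."""
    obs_shape = getattr(observations, "shape", None)
    pred_shape = getattr(predictors, "shape", None)
    if obs_shape is None or pred_shape is None:
        return False
    return len(obs_shape) == 1 and len(pred_shape) == 2


# Dummy inputs with the same layout as the example above.
doy = pd.Series(np.random.randint(120, 180, size=10))
weather = pd.DataFrame(np.random.randn(10, 365))

type_ok = is_numpy_input(doy, weather)      # pandas objects are not ndarrays
shape_ok = has_array_shapes(doy, weather)   # but they do have usable shapes
values_ok = is_numpy_input(doy.values, weather.values)
print(type_ok, shape_ok, values_ok)
```

Once the shapes are accepted, `np.asarray` can convert the pandas objects, so callers would not need to pass `.values` themselves.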
Testing for other pyphenology models:
model_list = [
    models.ThermalTime(),
    models.Alternating(),
    models.Uniforc(),
    models.Unichill(),
    models.Linear(),
    models.MSB(),
    models.Sequential(),
    # models.M1(),  # Fails
    models.FallCooling(),
    models.Naive(),
]
for model in model_list:
    model.fit(observations=obs.DOY_firstbloom.values, predictors=weather.values)
    print(model.__class__.__name__, model.get_params())
Interestingly, it also works for
I have implemented it for
@Peter9192 I had another look at the implementations and your examples here. Now
closes #3