-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistent 'date of entry' reported by survexp
and questions on survexp
#134
Comments
survexp
and questionssurvexp
and questions on survexp
Dear @qmarcou , This is interesting questions for me also. I tried to reproduce your first code and add some suggestions for using slopop rate table. library(survival) str(devtools::session_info()["packages"][[1]][c("survival","relsurv"), c("package", "loadedversion")]) Classes ‘packages_info’ and 'data.frame': 2 obs. of 2 variables: data("slopop") # import slovenian population mortality table from the relsurv package Build a mock data frame with a single 63 years old male patient with start of follow up on April 24th 2003 df <- data.frame(list( Computed expected survival at 100 days for this single individual cohort exp_surv_us <- survexp( exp_surv_slo <- survexp( exp_surv_slo2 <- survexp( print(exp_surv_us$summ) # output the correct entry date
exp_surv_slo2 Call: age ranges from 63 to 63 years time n.risk survival I also tried another way using the option for argument method "individual.h" using slopop rate table:
expected survival
Thank you again for opening this issue, which allow us to go deeper in survexp function use. |
Maja, could I get your opinion here? I can change the code to call summary earlier, which will fix the issue with slopop, but may break other rate tables that "figured out" what I was doing. Any guess as to which version is likely to break fewer things?
|
Hi @jgoungounga , # Compute expected survival for 100 days for 63 year old male in 2003 in a naive way
print(exp(-100*slopop["63","2003","male"]))
Now if you look at the obtained values with the snippets you proposed: print(exp_surv_slo$surv)
print(exp_surv_slo2$surv)
print(exp(-exp_surv_slo3))
Since there seem to be already a slight difference between the naive computation and all others (maybe an extra conversion with 365.25 somewhere?), let's make a brute force check that the correct year is used: test <- slopop
test["63","2003","male"]<-test["63","2003","male"]*100 #set the daily hazard for this year at an absurdly large value
exp_surv_test <- survexp(formula = ~1, data = df,
rmap = list(sex = sex, year = diag, age = age),
times = (100),
ratetable = test)
exp_surv_test2 <- survexp(
formula = ~ 1,
data = df,
rmap = list(sex = sex, year = diag-as.Date("1960-01-01"), age = age), # year correspond for this table to the number of years since 1960-01-01
times = (100),
ratetable = test
)
print(exp_surv_test$surv)
print(exp_surv_test2$surv)
Package versions:
|
Dear @therneau ,
Thank you for this note on the natural scale/unit of Dates, things make more sense now.
Yes I think I found the document here. Thank you for the pointer. Best regards, |
Dear @therneau ,
First of all, thank you very much for this great package !
Inconsistent
date of entry
I have been trying to use the
survexp
function to compute individual expected survival and have been confused by one of the helper output of the function (more precisely by the$summ
attribute). I'm using version 3.2.7 of the package.I would expect the "date of entry" to correspond to the date passed for start of follow up passed through the
year=
attribute in thermap
list. However I have found out that the reported date of entry depends on the provided ratetable, and thus possibly returns wrong date of entries. Here is a minimal example reproducing this behaviorHere is the obtained result
Though after a thorough check I'm pretty sure that the correct hazard (the one provided for 2003 and not 1993) is used to compute survival even for the slovenian ratetable, which is reassuring, I still find this output very misleading. In particular I wouldn't expect the reported date of entry to depend on the provided ratetable object, but I might be missing something.
Other questions on survexp
I would also have several other questions on the
survexp
function, for which I failed to find clear answers in the documentation:times=
arguments) be in "days" unit? I found elusive hints in the documentation that it is not necessarily the case e.g:or
However if provided times can be in any unit, I fail to understand how the implementation knows which unit to use to perform the integral for the cumulated hazard when times are simply passed as numeric values. A similar question stands for the unit of provided hazards in ratetable objects.
'conditional'
method, see the correponding pieces of documentation of thesurvexp
function:Details section:
There's a bit more information in the
survexp.fit
documentation about this'conditional'
, but I'm still not sure what is the computed quantity at the end.Thank you very much for your help and again thank you very much for this great work!
Best regards
The text was updated successfully, but these errors were encountered: