R/lt_rule.R
lt_rule_4m0_m0.Rd
The log of the crude death rate under age 5, \(\log(M(0,5))\), has a strong 2-slope linear relationship with the log of the infant death rate. With this method the user supplies the under-5 death rate and the infant death rate is returned. This method can be used if either population or deaths (but not both) are tabulated for infants, or if only the under-5 mortality rate is available.
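As a rough illustration of what a 2-slope relationship on the log scale looks like, the sketch below evaluates a broken-line predictor. The parameter names (`int1`, `s1`, `ds1`, `brk`) follow the comments in the examples; the function name `two_slope_log_m0` and every numeric value are hypothetical placeholders, not the coefficients used by `lt_rule_4m0_m0()`.

```r
# Hypothetical sketch of a two-slope (broken-line) predictor on the log scale.
# Not the package implementation; all coefficient values are placeholders.
two_slope_log_m0 <- function(M04, int1, s1, ds1, brk) {
  lM5 <- log(M04)
  # hinge term is zero at or below the breakpoint, so the slope there is s1;
  # above the breakpoint the slope becomes s1 + ds1
  hinge <- pmax(lM5 - brk, 0)
  lm0 <- int1 + s1 * lM5 + ds1 * hinge
  exp(lm0)
}

# placeholder coefficients, chosen only to make the sketch runnable
M0_sketch <- two_slope_log_m0(
  c(0.005, 0.01, 0.02),
  int1 = 1.0, s1 = 0.9, ds1 = -0.2, brk = log(0.01)
)
```

Because the hinge term vanishes at the breakpoint, the two line segments join continuously there.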
lt_rule_4m0_m0(M04, D04, P04, Sex = c("m", "f"))
Argument | Description |
---|---|
M04 | numeric. Death rate under age 5. |
D04 | numeric. Deaths under age 5. |
P04 | numeric. Exposure under age 5. A population estimate will do. |
Sex | character, either "m" or "f". |
Estimated infant death rate (age 0)
This is an elsewhere-undocumented relationship derived from the whole of the HMD. We used the segmented package to fit a 2-slope linear model. This can (and should) be reproduced using data from a more diverse collection, and even as-is the data should be subset to only those observations where deaths and populations were not split using HMD methods. You can reproduce the analysis given a data set in the format shown (but not executed) in the examples, following the code steps shown there.
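For readers who want to see the idea of a 2-slope fit without installing anything, here is a minimal base-R sketch. It uses a plain grid search over candidate breakpoints with a hinge term, which is a simpler stand-in and not the algorithm used by the segmented package. The data are simulated (not HMD), and the helper name `fit_two_slope` and the "true" coefficients are hypothetical.

```r
set.seed(1)

# simulated data with a known breakpoint (hypothetical values, not HMD)
true <- list(int1 = 1, s1 = 0.9, ds1 = -0.2, brk = -4.5)
lM5 <- seq(-7, -2, length.out = 200)
lm0 <- true$int1 + true$s1 * lM5 +
  true$ds1 * pmax(lM5 - true$brk, 0) + rnorm(200, sd = 0.01)
Dat <- data.frame(lm0 = lm0, lM5 = lM5)

# grid search: for each candidate breakpoint, fit a linear model with a
# hinge term and keep the breakpoint with the smallest residual sum of squares
fit_two_slope <- function(Dat, candidates) {
  rss <- vapply(candidates, function(b) {
    fit <- lm(lm0 ~ lM5 + pmax(lM5 - b, 0), data = Dat)
    sum(residuals(fit)^2)
  }, numeric(1))
  brk <- candidates[which.min(rss)]
  fit <- lm(lm0 ~ lM5 + pmax(lM5 - brk, 0), data = Dat)
  cf <- unname(coef(fit))
  list(brk = brk, int1 = cf[1], s1 = cf[2], ds1 = cf[3])
}

est <- fit_two_slope(Dat, candidates = seq(-6, -3, by = 0.05))
```

The returned elements mirror the quantities extracted from the segmented fit in the examples: breakpoint, first intercept, first slope, and the change in slope.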
Regarding argument specification, either M04 alone, or both D04 and P04, can be given.
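The argument rule can be pictured with a small helper. This is a hypothetical sketch, not the package's internal code: the name `resolve_M04` and the choice to prefer `M04` when everything is supplied are assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): obtain the under-5 death
# rate either directly from M04 or as deaths over exposure, D04 / P04.
resolve_M04 <- function(M04 = NULL, D04 = NULL, P04 = NULL) {
  # assumption: a directly supplied rate takes precedence
  if (!is.null(M04)) {
    return(M04)
  }
  if (is.null(D04) || is.null(P04)) {
    stop("Supply either M04, or both D04 and P04.")
  }
  D04 / P04
}

resolve_M04(D04 = 2e4, P04 = 4e6)  # 0.005
```

The counts used here are the same ones that appear in the examples, where the under-5 rate is 5/1000.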
Muggeo VM (2008). “segmented: an R Package to Fit Regression Models with Broken-Line Relationships.” R News, 8(1), 20-25. https://cran.r-project.org/doc/Rnews/.
Human Mortality Database
# to reproduce the coefficient estimation
# that the method is based on:
if (FALSE) {
  # get data in this format:
  # dput(head(Dat))
  Dat <- structure(
    list(
      lm0 = c(-2.92192434640927, -3.06165842367016, -3.10778177736261,
              -3.14075804560425, -3.20005407062761, -3.22489389719376),
      lM5 = c(-4.38578221044468, -4.56777854985643, -4.58851248767896,
              -4.57684656734645, -4.62854681692805, -4.61294106314254)),
    .Names = c("lm0", "lM5"),
    class = c("data.frame"),
    row.names = c(NA, -6L))
  # where lm0 is log(M0)
  # i.e. log of infant death rate
  # and lM5 is log(M0_4)
  # i.e. log of death rate in first 5 years of life

  # then first fit a linear model:
  obj <- lm(lm0 ~ lM5, data = Dat)
  # use segmented package:
  seg <- segmented::segmented(obj)
  # breakpoint:
  seg$psi[2]   # brk
  # first intercept:
  seg$coef[1]  # int1
  # first slope:
  seg$coef[2]  # s1
  # difference in slope from 1st to second:
  seg$coef[3]  # ds1
  # make Dat come from some other dataset and you'll get different coefs,
  # it'd be possible to have these in families maybe, and in any case
  # different for males and females. This is just a rough start, to be
  # replaced if someone offers a superior method.
}

M0_4 <- 5/1000
M0 <- lt_rule_4m0_m0(M0_4)
# M0 from this relationship is more reliable than other methods
# of independently splitting D0 or P0. So, if you're going to be
# splitting counts, then make use of M0 to force numerators and
# denominators to conform to this estimate.
D0_4 <- 2e4
P0_4 <- 4e6
# function usage straightforward, also vectorized.
D0 <- lt_rule_4m0_D0(D0_4, M0_4, Sex = "m")
# deaths in ages 1-4 are a separate step.
P0 <- D0 / M0
P1_4 <- P0_4 - P0
D1_4 <- D0_4 - D0
M1_4 <- D1_4 / P1_4
# and now we have all the pieces such that rate
# estimates conform.
if (FALSE) {
  plot(NULL, type = 'n', xlim = c(0, 5), ylim = c(1e-3, .025), log = "y",
       xlab = "Age", ylab = "log(rate)")
  segments(0, M0_4, 5, M0_4)
  segments(0, M0, 1, M0)
  segments(1, M1_4, 5, M1_4)
  text(1, c(M0, M1_4, M0_4), c("M0", "M1_4", "M0_4"), pos = 3)
}