rule of thumb for estimating infant mortality rate from under 5 mortality

Given the log crude death rate for ages 0-5, \(\log(M(0,5))\), there is a strong 2-slope linear relationship with the log of the infant death rate. With this method the user supplies the under-5 death rate and the infant death rate is returned. This method could be invoked if either population or deaths (but not both) are tabulated for infants, or if only the under-5 mortality rate is available.

lt_rule_4m0_m0(M04, D04, P04, Sex = c("m", "f"))

Arguments

M04	numeric. Death rate under age 5.
D04	numeric. Deaths under age 5.
P04	numeric. Exposure under age 5. A population estimate will do.
Sex	character, either `"m"` or `"f"`.

Value

Estimated infant death rate (age 0).

Details

This is an elsewhere-undocumented relationship derived from the whole of the HMD. We used the segmented package to fit a 2-slope linear model. This can (and should) be reproduced using data from a more diverse collection, and even as-is the data should be subset to only those observations where deaths and populations were not split using HMD methods. You can reproduce the analysis given a data set in the format shown (but not executed) in the examples and following the code steps shown there.
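
As a rough illustration of what a 2-slope linear model implies, the sketch below evaluates a broken-line prediction from an intercept, a first slope, a change in slope, and a breakpoint, mirroring the four quantities extracted in the examples (`int1`, `s1`, `ds1`, `brk`). The function name `predict_lm0` and every coefficient value are hypothetical placeholders; they are not the values fitted for this package.

```r
# Two-slope (broken-line) prediction on the log scale.
# Slope is s1 below the breakpoint and s1 + ds1 above it.
# predict_lm0 is a hypothetical name, not a package function.
predict_lm0 <- function(lM5, int1, s1, ds1, brk) {
  int1 + s1 * lM5 + ds1 * pmax(lM5 - brk, 0)
}

# Made-up coefficients, for illustration only.
coefs <- list(int1 = 1, s1 = 1, ds1 = -0.2, brk = -4)

# Log under-5 death rates, one on each side of the breakpoint.
lM5 <- log(c(0.005, 0.05))

# Back-transform the predicted log infant death rate.
M0 <- exp(predict_lm0(lM5, coefs$int1, coefs$s1, coefs$ds1, coefs$brk))
M0
```

Because `pmax()` is vectorized, the same function handles a single rate or a whole vector of rates.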

Regarding argument specification, supply either M04 alone, or both D04 and P04.
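
One way to picture this argument rule is a small helper that settles on a single under-5 death rate from whichever inputs are present. This is a minimal sketch under assumptions: `resolve_M04` is a hypothetical name and is not how the package necessarily handles its arguments internally.

```r
# Hypothetical helper (not part of the package): return the under-5
# death rate from either M04 directly, or from D04 and P04 together.
resolve_M04 <- function(M04 = NULL, D04 = NULL, P04 = NULL) {
  if (!is.null(M04)) {
    return(M04)
  }
  if (is.null(D04) || is.null(P04)) {
    stop("supply either M04, or both D04 and P04")
  }
  # crude death rate = deaths / exposure
  D04 / P04
}

# Both calls describe the same under-5 death rate.
resolve_M04(M04 = 5 / 1000)
resolve_M04(D04 = 2e4, P04 = 4e6)
```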

References

Muggeo VM (2008). “segmented: an R Package to Fit Regression Models with Broken-Line Relationships.” R News, 8(1), 20-25. https://cran.r-project.org/doc/Rnews/.

Human Mortality Database

Examples


# to reproduce the coefficient estimation
# that the method is based on:
if (FALSE) {
# get data in this format:
  # dput(head(Dat))
Dat <- structure(
  list(
    lm0 = c(-2.92192434640927, -3.06165842367016,
        -3.10778177736261, -3.14075804560425,
        -3.20005407062761, -3.22489389719376
      ),
    lM5 = c(-4.38578221044468, -4.56777854985643,
        -4.58851248767896, -4.57684656734645,
        -4.62854681692805, -4.61294106314254)),
       .Names = c("lm0", "lM5"),
  class = c("data.frame"),
  row.names = c(NA, -6L))
 # where lm0 is log(M0)
 # i.e. log of infant death rate
 # and lM5 is log(M0_4)
 # i.e. log of death rate in first 5 years of life

 # then first fit a linear model:
  obj  <- lm(lm0~lM5,data=Dat)
 # use segmented package:
  seg  <- segmented::segmented(obj)
 # breakpoint:
  seg$psi[2]     # brk
 # first intercept:
  seg$coef[1]    # int1
 # first slope:
  seg$coef[2]    # s1
 # difference in slope from 1st to second:
  seg$coef[3]    # ds1
 # make Dat come from some other dataset and you'll get different coefs,
 # it'd be possible to have these in families maybe, and in any case
 # different for males and females. This is just a rough start, to be
 # replaced if someone offers a superior method. These

}

M0_4 <- 5/1000
M0   <- lt_rule_4m0_m0(M0_4)

# M0 from this relationship is more reliable than other methods
# of independently splitting D0 or P0. So, if you're going to be
# splitting counts, then make use of M0 to force numerators and
# denominators to conform to this estimate.
D0_4 <- 2e4
P0_4 <- 4e6
# function usage straightforward, also vectorized.
D0   <- lt_rule_4m0_D0(D0_4, M0_4, Sex = "m")
# deaths in ages 1-4 are a separate step.
P0   <- D0 / M0
P1_4 <- P0_4 - P0
D1_4 <- D0_4 - D0
M1_4 <- D1_4 / P1_4
# and now we have all the pieces such that rate
# estimates conform.

if (FALSE) {
plot(NULL, type = 'n', xlim = c(0, 5), ylim = c(1e-3, .025), log = "y",
    xlab = "Age", ylab = "log(rate)")
segments(0, M0_4, 5, M0_4)
segments(0, M0, 1, M0)
segments(1, M1_4, 5, M1_4)
text(1, c(M0, M1_4, M0_4), c("M0", "M1_4", "M0_4"), pos = 3)
}
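
The count-splitting arithmetic in the example can be checked without the package. The sketch below repeats those steps with made-up stand-ins for `M0` and `D0` (in practice these would come from `lt_rule_4m0_m0()` and `lt_rule_4m0_D0()`), then confirms that the split counts add back up and that the exposure-weighted mean of the two rates recovers the under-5 rate.

```r
# Under-5 totals, as in the example above.
D0_4 <- 2e4
P0_4 <- 4e6
M0_4 <- D0_4 / P0_4

# Hypothetical stand-ins for the package estimates (assumed values).
M0 <- 0.02
D0 <- 1.6e4

# Same splitting steps as in the example.
P0   <- D0 / M0
P1_4 <- P0_4 - P0
D1_4 <- D0_4 - D0
M1_4 <- D1_4 / P1_4

# Split deaths and exposures sum back to the under-5 totals.
stopifnot(isTRUE(all.equal(D0 + D1_4, D0_4)))
stopifnot(isTRUE(all.equal(P0 + P1_4, P0_4)))

# The exposure-weighted mean of M0 and M1_4 equals M0_4.
stopifnot(isTRUE(all.equal(weighted.mean(c(M0, M1_4), c(P0, P1_4)), M0_4)))
```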