Adjust population counts for ages under 10

basepop_five(
  location = NULL,
  refDate,
  Age = NULL,
  Females_five,
  Males_five = NULL,
  nLxFemale = NULL,
  nLxMale = NULL,
  nLxDatesIn = NULL,
  AsfrMat = NULL,
  AsfrDatesIn = NULL,
  ...,
  SRB = NULL,
  SRBDatesIn = NULL,
  radix = NULL,
  verbose = TRUE
)

Arguments

location

UN Pop Division LocName or LocID

refDate

The reference date to which the reported population counts pertain (these are the population counts in Females_five and Males_five). Can be either a decimal date or a Date object. If nLxDatesIn or AsfrDatesIn are not supplied and the corresponding nLxFemale/Male/AsfrMat is not supplied, refDate must be at least 1962.5. This is because WPP data can only be fetched from 1955 onwards, and the earliest date needed is assumed to be 7.5 years before refDate, that is, 1955.

Age

integer vector of lower bounds of abridged age groups given in Females_five and Males_five.

Females_five

A named numeric vector with the population counts for five-year abridged age groups for females in refDate. The names of the vector should reflect the age groups. See the example section for some examples.

Males_five

A named numeric vector with the population counts for five-year abridged age groups for males in refDate. The names of the vector should reflect the age groups. See the example section for some examples.

nLxFemale

A numeric matrix. The female nLx function of two abridged life tables with ages in the rows and time in columns. The earlier date should be at least 7.5 years before the reference date of the "reported" population. The later date should be no earlier than one-half year before the reference date of the "reported" population. If not provided, it's automatically downloaded if location, refDate and the equivalent population counts *_five are provided.

nLxMale

A numeric matrix. The male nLx function of two abridged life tables with ages in the rows and time in columns. The dates which are represented in the columns are assumed to be the same as nLxDatesIn. This argument is only used when female is set to FALSE and Males_five is provided. If Males_five is provided and female set to FALSE, the nLx for males is automatically downloaded for the dates in nLxDatesIn.

nLxDatesIn

A vector of numeric years (for example, 1986). The dates which pertain to the columns in nLxFemale and nLxMale. If not provided, the function automatically determines two dates which are 8 years before refDate and 0.5 years after refDate.

AsfrMat

A numeric matrix. An age-period matrix of age specific fertility rates with age in rows, time in columns. If not provided, the function automatically downloads the ASFR matrix based on the dates in AsfrDatesIn.

AsfrDatesIn

A vector of numeric years (for example, 1986). These are the dates which pertain to the columns in AsfrMat. If not provided, the function automatically determines two dates which are 8 years before refDate and 0.5 years before refDate.

...

Arguments passed to interp. In particular, users might be interested in changing the interpolation method for the nLx* matrices and the Asfr matrix. By default, these are linearly interpolated.

SRB

A numeric. Sex ratio at birth (males / females). Default is set to 1.046. Only a maximum of three values permitted.

SRBDatesIn

A vector of numeric years (for example, 1986). Only a maximum number of three dates allowed. These are the dates which pertain to the values in SRB. If not provided, the function automatically determines three dates which are 7.5 years, 2.5 and 0.5 years before refDate.

radix

starting point to use in the adjustment of the first three age groups. Default is NULL. If not provided, it is inferred based on the scale of age 1L0.

verbose

when downloading new data, should the function print details about the download at each step? Defaults to TRUE. We recommend keeping this set to TRUE at all times because the function needs to make decisions (such as picking the dates for the Asfr and nLx) that the user should be aware of.
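The default dates described above for nLxDatesIn, AsfrDatesIn and SRBDatesIn are simple offsets from refDate. The sketch below restates them in base R. Both default_dates() and date_to_decimal() are hypothetical helpers written for illustration; they are not functions of the package, and the package's own date handling may differ in detail.

```r
# Hypothetical helper (not part of the package): the default dates as
# described in the argument documentation, expressed as offsets from refDate.
default_dates <- function(refDate) {
  list(
    nLxDatesIn  = c(refDate - 8, refDate + 0.5),
    AsfrDatesIn = c(refDate - 8, refDate - 0.5),
    SRBDatesIn  = refDate - c(7.5, 2.5, 0.5)
  )
}

# Hypothetical helper: convert a calendar date to a decimal year using
# the fraction of the year elapsed at the start of that day.
date_to_decimal <- function(d) {
  d     <- as.Date(d)
  yr    <- as.integer(format(d, "%Y"))
  start <- as.Date(paste0(yr, "-01-01"))
  end   <- as.Date(paste0(yr + 1, "-01-01"))
  yr + as.numeric(d - start) / as.numeric(end - start)
}

# Earliest refDate for which data 7.5 years back is still within WPP
# coverage (1955 onwards), as stated for refDate.
min_refDate <- 1955 + 7.5

dd <- default_dates(1986)
dd$SRBDatesIn

ref <- date_to_decimal("1986-07-01")
ref >= min_refDate
```

Supplying the *DatesIn arguments explicitly, rather than relying on these defaults, makes the choice of dates visible in the calling code.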

Value

basepop_five returns a list with the following elements:

  • Females_adjusted numeric vector of adjusted population counts for females. Age groups 0, 1-4, and 5-9 are adjusted, while ages 10 and higher are unchanged.

  • Males_adjusted numeric vector of adjusted population counts for males. Age groups 0, 1-4, and 5-9 are adjusted, while ages 10 and higher are unchanged.

  • Females_five numeric vector of female population counts given as input.

  • Males_five numeric vector of male population counts given as input.

  • nLxf numeric matrix of female nLx, abridged ages in rows and (potentially interpolated) time in columns. Potentially downloaded.

  • nLxm numeric matrix of male nLx, abridged ages in rows and (potentially interpolated) time in columns. Potentially downloaded.

  • Asfr numeric matrix of age specific fertility in 5-year age groups ages 15-19 until 45-49 in rows, and (potentially interpolated) time in columns. Potentially downloaded.

  • Exposure_female numeric matrix of approximated age-specific exposure in 5-year age groups ages 15-19 until 45-49 in rows, and (potentially interpolated) time in columns.

  • Bt births at three time points prior to census corresponding to the midpoints of the cohorts entering ages 0, 1-4, and 5-9.

  • SRB sex ratio at birth at three time points prior to census corresponding to the midpoints of the cohorts entering ages 0, 1-4, and 5-9. Potentially downloaded.

  • Age age groups of the input population counts.

Details

basepop_five and basepop_single can estimate both the BPA and BPE methods. If the user specifies SmoothedFemales, both basepop_* functions will return the BPA method. If SmoothedFemales is left empty, both basepop_* functions will adjust using the BPE method.

For basepop_five, adjusting the female population counts is the default. For this, only the location, refDate and Females_five are needed. All other arguments are downloaded or set to sensible defaults. For adjusting the male population counts, the user needs to specify the Males_five population counts and set female = FALSE.

Currently, basepop_five works only with five-year abridged age groups.

The BPE method is used by default. To adjust the counts using the BPA method, the user needs to provide the SmoothedFemales argument. This is the female population counts passed through a smoothing function such as smooth_age_5. See the examples section for some examples.

BPA

Description:

The method estimates a smoothed population ages 10 and over and adjusts the population under age 10 using the smoothed population and estimates of fertility and mortality.

Based on the smoothed female population counts, it rejuvenates the female "reported" population 20 to 59 years of age for the two 5 year periods prior to the census date to represent the female population in reproductive ages 5 and 10 years earlier. Based on the rejuvenated population and fertility and mortality levels, the method then estimates the male and female births during the two 5 year periods prior to the census date. Next, it projects the two 5-year birth cohorts to the census date. The projected figures represent the adjusted population ages 0 to 4 years and 5 to 9 years at the census date.

Advantages:

(1) The method adjusts under-10 population to be consistent with fertility and mortality levels and adjusted adult female population.

Limitations:

(1) BPA assumes a linear change in fertility and mortality during the decade prior to the reference year.

(2) The procedure ignores migration, which can lead to misleading results. There are two issues. First, age groups 0-4 and 5-9 are subject to migration, which would affect the comparability of estimated and reported populations in the base year. Second, the estimated sizes of age groups 0-4 and 5-9 are calculated from numbers of women of reproductive age in the base year rejuvenated to points in the past. With migration, the rejuvenated number of women may be larger or smaller than the number who were actually present, and giving birth to children, in the decade prior to the base year.

(3) BPA’s smoothing calculations may mask unusual, but real, variations in population age group size. Smoothing irregularities in age structure not attributable to age misreporting will distort estimated births and survived children in the base year.
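Limitation (1) amounts to linear interpolation between the two input dates. As a minimal sketch of what that assumption means, the code below interpolates an age-by-time ASFR matrix to two intermediate dates using base R's approx(). It does not call the package's interp function, and the output dates are chosen only for illustration.

```r
# ASFR at two dates: ages in rows, dates in columns (values from the
# PAS example in the Examples section).
asfrmat <- matrix(
  c(0.2, 0.3, 0.3, 0.25, 0.2, 0.15, 0.05,
    0.15, 0.2, 0.275, 0.225, 0.175, 0.125, 0.05),
  nrow = 7, ncol = 2,
  dimnames = list(
    c("15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49"),
    c("1977.81", "1985.71")
  )
)

dates_in  <- as.numeric(colnames(asfrmat))
dates_out <- c(1978.71, 1983.71)  # illustrative dates inside the input range

# Interpolate each age group's rate linearly over time.
# apply() returns dates in rows, so transpose back to ages in rows.
asfr_interp <- t(apply(asfrmat, 1, function(rates) {
  approx(x = dates_in, y = rates, xout = dates_out)$y
}))
colnames(asfr_interp) <- dates_out

asfr_interp
```

If fertility or mortality changed non-linearly over the decade, for example with a sharp decline in the last few years, the interpolated rates will misstate the true levels at the intermediate dates.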

Assumptions:

(1) No significant international migration took place within the reference periods for the population, mortality, and fertility input.

(2) The data input as the "reported" population is not affected by underenumeration of persons in certain ages, nor by age misreporting.

BPE

Description:

The method adjusts the population under age 10 using the reported population ages 10 and above and estimates of fertility and mortality.

The method rejuvenates the reported female population 20 to 59 years of age for the two 5 year periods prior to the census date to represent the female population in reproductive ages 5 and 10 years earlier. Based on the rejuvenated population and fertility and mortality levels, the method then estimates the male and female births during the two 5 year periods prior to the census date. Next, it projects the two 5-year birth cohorts to the census date. The projected figures represent the adjusted population ages 0 to 4 years and 5 to 9 years at the census date.
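The steps in this description can be sketched for the most recent 5-year period as follows. This is a deliberately simplified illustration of the reverse-survival logic, not the package's implementation: rejuvenate() and estimate_under5() are hypothetical names, all input numbers are made up, and the real method additionally handles the earlier period, the separate age 0 and 1-4 groups, and interpolation of the inputs.

```r
# Reverse-survive women by one 5-year step: the women in age group x five
# years ago are those now in group x+5, scaled up by L_x / L_{x+5}.
rejuvenate <- function(women_now, Lx) {
  n <- length(women_now)
  women_now[-1] * Lx[-n] / Lx[-1]
}

# women_now, Lx_women: ages 15-19 ... 50-54 (8 groups) at the census date
# asfr: ages 15-19 ... 45-49 (7 groups); srb: males per female at birth
# L0_4: named 5L0 values by sex, on the same radix as `radix`
estimate_under5 <- function(women_now, Lx_women, asfr, srb, L0_4, radix) {
  n <- length(women_now)
  women_past <- rejuvenate(women_now, Lx_women)     # ages 15-49, 5 years ago
  # Person-years lived at ages 15-49 over the period (average of endpoints)
  exposure <- 5 * (women_past + women_now[-n]) / 2
  births   <- sum(asfr * exposure)
  births_f <- births / (1 + srb)
  births_m <- births - births_f
  # Project each birth cohort to the census date
  c(female = births_f * L0_4[["female"]] / (5 * radix),
    male   = births_m * L0_4[["male"]]   / (5 * radix))
}

# Illustrative inputs only
women_now <- c(1000, 950, 900, 850, 800, 750, 700, 650)
Lx_women  <- c(450, 445, 440, 434, 427, 419, 409, 397) * 1000
asfr      <- c(0.08, 0.20, 0.20, 0.15, 0.10, 0.05, 0.01)

estimate_under5(women_now, Lx_women, asfr, srb = 1.05,
                L0_4 = c(female = 470000, male = 460000), radix = 1e5)
```

The result is an estimate of the population aged 0-4 by sex that is consistent with the supplied fertility and mortality, which is what replaces the reported counts in the adjusted output.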

Advantages:

(1) The method adjusts the under-10 population to be consistent with fertility and mortality levels and adult female population.

Limitations:

(1) BPE assumes a linear change in fertility and mortality during the decade prior to the reference year.

(2) The procedure ignores migration, which can lead to misleading results. There are two issues. First, age groups 0-4 and 5-9 are subject to migration, which would affect the comparability of estimated and reported populations in the base year. Second, the estimated sizes of age groups 0-4 and 5-9 are calculated from numbers of women of reproductive age in the base year rejuvenated to points in the past. With migration, the rejuvenated number of women may be larger or smaller than the number who were actually present, and giving birth to children, in the decade prior to the base year.

(3) The method does not adjust for possible underenumeration and age misreporting errors in the over-10 “reported” population. If the reported population is subject to age-misreporting or age-sex-specific underenumeration, the over-10 population should be smoothed or otherwise corrected prior to use.

Assumptions:

(1) No significant international migration took place within the reference periods for the population, mortality, and fertility input.

(2) The data input as the “reported” population is not affected by underenumeration of persons in certain ages, nor by age misreporting.

References

Arriaga EE, Johnson PD, Jamison E (1994). Population analysis with microcomputers, volume 1. Bureau of the Census.

United States Census Bureau (2017). “Population Analysis System (PAS) Software.” https://www.census.gov/data/software/pas.html.

Examples

if (FALSE) {
################ BPE (five year age groups) #####################

# Grab population counts for females and males
refDate <- 1986
location <- "Brazil"
pop_female_single <- fertestr::FetchPopWpp2019(location,
                                               refDate,
                                               ages = 0:100,
                                               sex = "female")
pop_female_counts <- single2abridged(setNames(pop_female_single$pop,
                                              pop_female_single$ages))
pop_male_single <- fertestr::FetchPopWpp2019(location,
                                             refDate,
                                             ages = 0:100,
                                             sex = "male")
pop_male_counts <- single2abridged(setNames(pop_male_single$pop,
                                            pop_male_single$ages))
Age <- names2age(pop_male_counts)

# Automatically downloads the nLx, ASFR, and SRB data
bpe <- basepop_five(
  location = location,
  refDate = refDate,
  Females_five = pop_female_counts,
  Males_five = pop_male_counts,
  Age = Age
)

# The counts for the first three age groups have been adjusted:
bpe$Females_adjusted[1:3]
pop_female_counts[1:3]

bpe$Males_adjusted[1:3]
pop_male_counts[1:3]

################ BPE (for single ages) ############################
# blocked out for now, until single age function refactored as
# TR: actually, it just needs to be rethought for single ages..
# pop_female_single <- setNames(pop_female_single$pop, pop_female_single$ages)
#
# # Automatically downloads the nLx and ASFR data
# bpe_female <- basepop_single(
#   location = location,
#   refDate = refDate,
#   Females_single = pop_female_single
# )
#
# # The counts for the first 10 age groups have been adjusted:
# bpe_female[1:10]
# pop_female_single[1:10]

################ BPA (five year age groups) #####################
# for BPA, smooth counts in advance
smoothed_females <- smooth_age_5(Value = pop_female_counts,
                                 Age = Age,
                                 method = "Arriaga",
                                 OAG = TRUE,
                                 young.tail = "Original")
# Note, smooth_age_5() will group infants into the 0-4 age group. So,
# we manually stick them back in place.
smoothed_females <- c(pop_female_counts[1:2], smoothed_females[-1])

smoothed_males <- smooth_age_5(Value = pop_male_counts,
                               Age = Age,
                               method = "Arriaga",
                               OAG = TRUE,
                               young.tail = "Original")
# As for females, restore the reported counts for ages 0 and 1-4
smoothed_males <- c(pop_male_counts[1:2], smoothed_males[-1])

# Automatically downloads the nLx, ASFR, and SRB data
bpa <- basepop_five(
  location = location,
  refDate = refDate,
  Females_five = smoothed_females,
  Males_five = smoothed_males,
  Age = Age
)

# The counts for the first three age groups have been adjusted:
bpa$Females_adjusted[1:3]
smoothed_females[1:3]
pop_female_counts[1:3]

bpa$Males_adjusted[1:3]
smoothed_males[1:3]
pop_male_counts[1:3]

################ PAS example ###############################
# (1) refDate
refDate <- 1986.21

# (2) Reported population by 5-year age groups and sex in the base year
# (Include unknowns).
pop_male_counts <- c(11684, 46738, 55639, 37514, 29398, 27187, 27770, 20920,
                     16973, 14999, 11330, 10415, 6164, 7330, 3882, 3882,
                     1840, 4200)
pop_female_counts <- c(11673, 46693, 55812, 35268, 33672, 31352, 33038, 24029,
                       16120, 14679, 8831, 9289, 4172, 6174, 2715, 3344,
                       1455, 4143)
Age <- c(0, 1, seq(5, 80, by = 5))

# (4) Sex ratio at birth (m/f)
sex_ratio <- 1.0300

# (6) The male and female nLx functions for ages under 1 year, 1 to 4 years, and 5 to 9
# years, pertaining to an earlier and later date
nLxDatesIn <- c(1977.31, 1986.50)

nLxMale <- matrix(c(87732, 304435, 361064,
                    88451, 310605, 370362),
                  nrow = 3, ncol = 2)

nLxFemale <- matrix(c(89842, 314521, 372681, 353053, 340650, 326588,
                      311481, 295396, 278646, 261260, 241395, 217419,
                      90478, 320755, 382531, 364776, 353538, 340687,
                      326701, 311573, 295501, 278494, 258748, 234587),
                    nrow = 12, ncol = 2)

# (7) A set of age-specific fertility rates pertaining to an earlier and later
# date
asfrmat <- structure(
  c(0.2, 0.3, 0.3, 0.25, 0.2, 0.15, 0.05,
    0.15, 0.2, 0.275, 0.225, 0.175, 0.125, 0.05),
  .Dim = c(7L, 2L),
  .Dimnames = list(
    c("15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49"),
    c("1977.81", "1985.71")))

# Dates of the ASFR columns, taken from the column names of asfrmat
AsfrDatesIn <- as.numeric(colnames(asfrmat))

# for BPA, smooth counts in advance
smoothed_females <- smooth_age_5(Value = pop_female_counts,
                                 Age = Age,
                                 method = "Arriaga",
                                 OAG = TRUE,
                                 young.tail = "Original")
smoothed_females <- c(pop_female_counts[1:2], smoothed_females[-1])

smoothed_males <- smooth_age_5(Value = pop_male_counts,
                               Age = Age,
                               method = "Arriaga",
                               OAG = TRUE,
                               young.tail = "Original")
smoothed_males <- c(pop_male_counts[1:2], smoothed_males[-1])

## This is the only number that messes up the whole calculation.
## smooth_age_5 returns the same result as the PAS Excel sheet
## except for the age groups 10-14 and 15-19. Here we only use
## age group 15-19. If we plug in manually the correct value,
## we get all results match exactly, otherwise there are
## some differences.
smoothed_females[4] <- 34721

# Adjust using BPA: pass the smoothed counts for both sexes together
# with the nLx, ASFR and SRB inputs defined above.
bpa <- basepop_five(
  refDate = refDate,
  Males_five = smoothed_males,
  Females_five = smoothed_females,
  Age = Age,
  SRB = sex_ratio,
  nLxFemale = nLxFemale,
  nLxMale = nLxMale,
  nLxDatesIn = nLxDatesIn,
  AsfrMat = asfrmat,
  AsfrDatesIn = AsfrDatesIn,
  radix = 1e5
)

# See adjustments?
pop_male_counts[1:3]
bpa$Males_adjusted[1:3]

pop_female_counts[1:3]
bpa$Females_adjusted[1:3]

# For adjustment using BPE, we use exactly the same definitions as above
# but use the original inputs
bpe <- basepop_five(
  refDate = refDate,
  Females_five = pop_female_counts,
  Males_five = pop_male_counts,
  SRB = sex_ratio,
  nLxFemale = nLxFemale,
  nLxDatesIn = nLxDatesIn,
  AsfrMat = asfrmat,
  AsfrDatesIn = AsfrDatesIn
)

pop_female_counts[1:3]
bpe$Females_adjusted[1:3]

# basepop_single for single ages
# Single ages for males and females

# pop_male_counts <-
#   c(11684, 11473, 11647, 11939, 11680, 10600, 11100, 11157, 11238,
#     11544, 7216, 7407, 7461, 7656, 7774, 5709, 5629, 5745, 6056,
#     6259, 5303, 5423, 5497, 5547, 5417, 5441, 5466, 5500, 5668, 5694,
#     4365, 4252, 4122, 4142, 4039, 3210, 3222, 3258, 3413, 3871, 2684,
#     2844, 3052, 3182, 3237, 2263, 2298, 2318, 2257, 2194, 2231, 2172,
#     2072, 2008, 1932, 1301, 1262, 1213, 1197, 1191, 1601, 1593, 1490,
#     1348, 1299, 568, 745, 843, 801, 925, 806, 883, 796, 725, 672,
#     470, 441, 340, 300, 289, 4200)
#
# pop_female_counts <-
#   c(11673, 11474, 11670, 11934, 11614, 10603, 11144, 11179, 11269,
#     11617, 6772, 6948, 7030, 7211, 7306, 6531, 6443, 6535, 6951,
#     7213, 6096, 6234, 6327, 6410, 6285, 6464, 6492, 6549, 6739, 6795,
#     5013, 4888, 4735, 4747, 4646, 3040, 3068, 3107, 3246, 3658, 2650,
#     2788, 2977, 3108, 3156, 1756, 1784, 1802, 1764, 1724, 1982, 1935,
#     1846, 1795, 1731, 863, 850, 825, 819, 816, 1348, 1342, 1246,
#     1138, 1101, 391, 520, 585, 560, 659, 670, 750, 686, 634, 604,
#     353, 340, 270, 246, 247, 4143)
# Age <- 0:80
#
# smoothed_females <- smooth_age_5(Value = pop_female_counts,
#                                  Age = Age,
#                                  method = "Arriaga",
#                                  OAG = TRUE,
#                                  young.tail = "Original")
# smoothed_males <- smooth_age_5(Value = pop_male_counts,
#                                Age = Age,
#                                method = "Arriaga",
#                                OAG = TRUE,
#                                young.tail = "Original")

# For adjusting using BPA for males, we need to specify
# female = FALSE with Males and nLxMale.

# This needs work still
# bpa_male <-
#   basepop_single(
#     refDate = refDate,
#     Males_single = pop_male_counts,
#     Females_single = pop_female_counts,
#     SRB = sex_ratio,
#     nLxFemale = nLxFemale,
#     nLxMale = nLxMale,
#     nLxDatesIn = nLxDatesIn,
#     AsfrMat = asfrmat,
#     AsfrDatesIn = AsfrDatesIn
#   )

# See adjustments?
# pop_male_counts[1:10]
# bpa_male[1:10]

# Adjusting the BPA for females requires fewer arguments
# bpa_female <-
#   basepop_single(
#     refDate = refDate,
#     Females_single = pop_female_counts,
#     SmoothedFemales = smoothed_females,
#     SRB = sex_ratio,
#     nLxFemale = nLxFemale,
#     nLxDatesIn = nLxDatesIn,
#     AsfrMat = asfrmat,
#     AsfrDatesIn = AsfrDatesIn
#   )

# pop_female_counts[1:10]
# bpa_female[1:10]
#
# # For adjustment using BPE, we use exactly the same definitions as above
# # but remove SmoothedFemales.
# bpe_male <-
#   basepop_single(
#     refDate = refDate,
#     Males_single = pop_male_counts,
#     Females_single = pop_female_counts,
#     SRB = sex_ratio,
#     nLxFemale = nLxFemale,
#     nLxMale = nLxMale,
#     nLxDatesIn = nLxDatesIn,
#     AsfrMat = asfrmat,
#     AsfrDatesIn = AsfrDatesIn,
#     female = FALSE
#   )

# See adjustments?
# pop_male_counts[1:10]
# bpa_male[1:10]
# bpe_male[1:10]

# Adjusting the BPE for females requires fewer arguments
# bpe_female <-
#   basepop_single(
#     refDate = refDate,
#     Females_single = pop_female_counts,
#     SRB = sex_ratio,
#     nLxFemale = nLxFemale,
#     nLxDatesIn = nLxDatesIn,
#     AsfrMat = asfrmat,
#     AsfrDatesIn = AsfrDatesIn
#   )
#
# pop_female_counts[1:10]
# bpa_female[1:10]
# bpe_female[1:10]
}