For a given age-structured vector of counts, how rough are the data after grouping to 5-year age bins? Data may require smoothing even if there is no detectable sawtooth pattern. The value returned by this method is best used together with visual evidence to gauge whether use of smooth_age_5() is recommended.

check_heaping_roughness(
  Value,
  Age,
  ageMin = 20,
  ageMax = max(Age[Age%%5 == 0])
)

Arguments

Value

numeric. A vector of demographic counts by single age.

Age

numeric. A vector of ages corresponding to the lower integer bound of the counts.

ageMin

integer evenly divisible by 5. Lower bound of evaluated age range, default 20.

ageMax

integer evenly divisible by 5. Upper bound of evaluated age range, defaults to the highest age evenly divisible by 5.

Details

First we group data to 5-year age bins. Then we take first differences (d1) of these within the evaluated age range. Then we smooth the first differences with a generic smoother (ogive()) to get d1s. Roughness is defined as the mean absolute difference between d1 and d1s, relative to d1s: mean(abs(d1 - d1s) / abs(d1s)). Higher values indicate rougher data and may suggest more aggressive smoothing. As a rule of thumb, consider smoothing if the returned value is greater than ca. 0.2; values greater than 0.5 strongly suggest it (pending visual verification).
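The steps above can be sketched in base R. This is a rough illustration only, not the package's implementation: the function name roughness_sketch() is hypothetical, and a 3-term centered moving average stands in for the generic smoother, so the values it returns will not match those of check_heaping_roughness().

```r
# Hypothetical helper illustrating the roughness steps; NOT the package code.
roughness_sketch <- function(Value, Age,
                             ageMin = 20,
                             ageMax = max(Age[Age %% 5 == 0])) {
  # Step 1: group single-age counts into 5-year bins (lower bounds 0, 5, 10, ...)
  lower5 <- Age - Age %% 5
  V5     <- tapply(Value, lower5, sum)
  A5     <- as.integer(names(V5))

  # Restrict to the evaluated age range
  keep <- A5 >= ageMin & A5 <= ageMax
  V5   <- as.numeric(V5[keep])

  # Step 2: first differences of the grouped counts
  d1 <- diff(V5)

  # Step 3: smooth the first differences.
  # Assumption: a 3-term centered moving average replaces the generic smoother.
  d1s <- as.numeric(stats::filter(d1, rep(1 / 3, 3), sides = 2))

  # The moving average is undefined at both ends; drop those positions
  ok <- !is.na(d1s)

  # Step 4: mean absolute deviation of d1 from d1s, relative to d1s
  mean(abs(d1[ok] - d1s[ok]) / abs(d1s[ok]))
}

# Toy data (made up for illustration): a smooth quadratic age profile,
# and the same profile with counts tripled at ages ending in 0.
Age_toy    <- 0:99
smooth_toy <- Age_toy^2 + 1000
heaped_toy <- smooth_toy * ifelse(Age_toy %% 10 == 0, 3, 1)

r_smooth <- roughness_sketch(smooth_toy, Age_toy)
r_heaped <- roughness_sketch(heaped_toy, Age_toy)
```

For the quadratic toy profile the first differences are linear, so the moving average reproduces them and the sketch returns a value of essentially zero; the heaped version produces alternating differences and a clearly larger value.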

Examples

Age <- 0:99
A5 <- seq(0,95,by=5)
smoothed <- graduate_sprague(
    smooth_age_5(pop1m_pasex,
        Age,
        method = "Strong",
        OAG = FALSE,
        young.tail = "Arriaga"),
    Age = A5,
    OAG = FALSE)
# not very rough, no need to smooth more
check_heaping_roughness(smoothed, Age)
#> [1] 0.05958194
# quite rough, even after grouping to 5-year ages
check_heaping_roughness(pop1m_pasex, Age)
#> [1] 0.5530865
# heaped, but with no preference for 0s over 5s
h1 <- heapify(smoothed, Age, p0 = 1, p5 = 1)
# heaping progressively worse
h2 <- heapify(smoothed, Age, p0 = 1.2, p5 = 1.2)
h3 <- heapify(smoothed, Age, p0 = 1.5, p5 = 1.5)
h4 <- heapify(smoothed, Age, p0 = 2, p5 = 2)
h5 <- heapify(smoothed, Age, p0 = 2.5, p5 = 2)
if (FALSE) {
  # cols <- RColorBrewer::brewer.pal(7, "Reds")[3:7]
  cols <- c("#FC9272", "#FB6A4A", "#EF3B2C", "#CB181D", "#99000D")

  plot(A5, groupAges(smoothed), type = 'l', xlim = c(20, 80), ylim = c(0, 3e5))
  lines(A5, groupAges(h1), col = cols[1])
  lines(A5, groupAges(h2), col = cols[2])
  lines(A5, groupAges(h3), col = cols[3])
  lines(A5, groupAges(h4), col = cols[4])
  lines(A5, groupAges(h5), col = cols[5])
}
check_heaping_roughness(smoothed, Age)
#> [1] 0.05958194
check_heaping_roughness(h1, Age)
#> [1] 0.07530694
check_heaping_roughness(h2, Age)
#> [1] 0.1036847
check_heaping_roughness(h3, Age)
#> [1] 0.1737339
check_heaping_roughness(h4, Age)
#> [1] 0.7451597
check_heaping_roughness(h5, Age)
#> [1] 1.868327