R/check_heaping.R
check_heaping_roughness.Rd
For a given age-structured vector of counts, how rough are the data after grouping to 5-year age bins? Data may require smoothing even if there is no detectable sawtooth pattern. It is best to use the value returned by this method together with visual evidence to gauge whether use of smooth_age_5() is recommended.
check_heaping_roughness(Value, Age, ageMin = 20, ageMax = max(Age[Age %% 5 == 0]))
Argument | Description |
---|---|
Value | numeric. A vector of demographic counts by single age. |
Age | numeric. A vector of ages corresponding to the lower integer bound of the counts. |
ageMin | integer evenly divisible by 5. Lower bound of evaluated age range, default 20. |
ageMax | integer evenly divisible by 5. Upper bound of evaluated age range, defaults to the highest age in Age that is evenly divisible by 5. |
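To make the default for ageMax concrete, the expression from the usage line can be evaluated directly. This is a small illustration in base R; the variable names `Age_99`, `Age_100`, `ageMax_99`, and `ageMax_100` are made up for the example.

```r
# Default ageMax as written in the usage: the highest age in Age
# that is evenly divisible by 5.
Age_99  <- 0:99
Age_100 <- 0:100

ageMax_99  <- max(Age_99[Age_99 %% 5 == 0])    # 95
ageMax_100 <- max(Age_100[Age_100 %% 5 == 0])  # 100
```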
First we group data to 5-year age bins. Then we take first differences (d1) of these within the evaluated age range. Then we smooth the first differences (d1s) using a generic smoother (ogive()). Roughness is defined as the mean absolute difference between the raw and smoothed first differences, relative to the smoothed ones: mean(abs(d1 - d1s) / abs(d1s)). Higher values indicate rougher data and may suggest more aggressive smoothing. As a rough rule of thumb based on eyeballing, one could consider smoothing if the returned value is greater than about 0.2, and values greater than 0.5 strongly suggest it (pending visual verification).
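The steps described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `roughness_sketch` is a hypothetical helper, `stats::lowess()` is swapped in as a stand-in smoother, and the handling of the age range limits is a guess, so the values it returns will not match those of check_heaping_roughness().

```r
# Hypothetical helper for illustration only; not the DemoTools implementation.
roughness_sketch <- function(Value, Age, ageMin = 20,
                             ageMax = max(Age[Age %% 5 == 0])) {
  # 1. Group single-age counts into 5-year bins keyed by their lower bound.
  bin_start <- Age - Age %% 5
  V5 <- tapply(Value, bin_start, sum)
  A5 <- as.integer(names(V5))
  V5 <- as.numeric(V5)

  # 2. First differences of the grouped counts for bins whose lower bound
  #    lies in [ageMin, ageMax] (ASSUMPTION about how the range is applied).
  keep <- A5 >= ageMin & A5 <= ageMax
  d1 <- diff(V5[keep])

  # 3. Smooth the first differences.
  #    ASSUMPTION: lowess() stands in for the package's smoother.
  d1s <- stats::lowess(seq_along(d1), d1)$y

  # 4. Mean absolute deviation of raw from smoothed differences,
  #    relative to the smoothed differences.
  mean(abs(d1 - d1s) / abs(d1s))
}

# Synthetic data: a linearly declining population, and a copy with
# counts at ages ending in 0 doubled (strong preference for digit 0).
Age    <- 0:99
linear <- 1000 - 5 * Age
heaped <- linear
heaped[Age %% 10 == 0] <- 2 * heaped[Age %% 10 == 0]

r_linear <- roughness_sketch(linear, Age)
r_heaped <- roughness_sketch(heaped, Age)
```

In this sketch the linear series yields a roughness close to zero, because its first differences are constant, while the series heaped on digit 0 yields a much larger value, because its 5-year bins alternate between inflated and uninflated totals.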
Age <- 0:99
A5 <- seq(0, 95, by = 5)
smoothed <- graduate_sprague(
  smooth_age_5(pop1m_pasex, Age, method = "Strong", OAG = FALSE, young.tail = "Arriaga"),
  Age = A5, OAG = FALSE)
# not very rough, no need to smooth more
check_heaping_roughness(smoothed, Age)
#> [1] 0.05958194
# quite rough, even after grouping to 5-year ages
check_heaping_roughness(pop1m_pasex, Age)
#> [1] 0.5530865
# heaped, but no 0>5 preference
h1 <- heapify(smoothed, Age, p0 = 1, p5 = 1)
# heaping progressively worse
h2 <- heapify(smoothed, Age, p0 = 1.2, p5 = 1.2)
h3 <- heapify(smoothed, Age, p0 = 1.5, p5 = 1.5)
h4 <- heapify(smoothed, Age, p0 = 2, p5 = 2)
h5 <- heapify(smoothed, Age, p0 = 2.5, p5 = 2)
if (FALSE) {
  # cols <- RColorBrewer::brewer.pal(7, "Reds")[3:7]
  cols <- c("#FC9272", "#FB6A4A", "#EF3B2C", "#CB181D", "#99000D")
  plot(A5, groupAges(smoothed), type = 'l', xlim = c(20, 80), ylim = c(0, 3e5))
  lines(A5, groupAges(h1), col = cols[1])
  lines(A5, groupAges(h2), col = cols[2])
  lines(A5, groupAges(h3), col = cols[3])
  lines(A5, groupAges(h4), col = cols[4])
  lines(A5, groupAges(h5), col = cols[5])
}
check_heaping_roughness(smoothed, Age)
#> [1] 0.05958194
check_heaping_roughness(h1, Age)
#> [1] 0.07530694
check_heaping_roughness(h2, Age)
#> [1] 0.1036847
check_heaping_roughness(h3, Age)
#> [1] 0.1737339
check_heaping_roughness(h4, Age)
#> [1] 0.7451597
check_heaping_roughness(h5, Age)
#> [1] 1.868327