R/check_heaping.R
check_heaping_sawtooth.Rd
Ages ending in 0 often have higher apparent heaping than ages ending in 5. In this case, data in 5-year age bins might show a sawtooth pattern. If heaping occurs in roughly the same amount on 0s and 5s, then it may be sufficient to group data into 5-year age groups and then graduate back to single ages. However, if heaping is worse on 0s, then this procedure tends to produce a wavy pattern in count data, with 10-year periodicity. In this case it is recommended to use one of the methods of smooth_age_5() as an intermediate step before graduation.
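The sawtooth described above can be illustrated with a small base R sketch. The counts and the inflation factors (1.5 on 0s, 1.1 on 5s) are invented for illustration, and the toy heaping simply inflates counts rather than moving them from neighboring ages, so it only approximates real heaping.

```r
# Hypothetical single-age counts that decline linearly with age (invented data)
Age <- 0:99
smooth_counts <- 1000 - 5 * Age

# Toy heaping (an assumption for illustration): inflate ages ending in 0
# more strongly than ages ending in 5
heaped <- smooth_counts
heaped[Age %% 10 == 0] <- heaped[Age %% 10 == 0] * 1.5
heaped[Age %% 10 == 5] <- heaped[Age %% 10 == 5] * 1.1

# Group both series into 5-year bins labelled by their lower bound
Age5    <- Age - Age %% 5
smooth5 <- tapply(smooth_counts, Age5, sum)
heaped5 <- tapply(heaped, Age5, sum)

# Relative excess of each heaped bin over the smooth bin
excess      <- as.numeric(heaped5 / smooth5)
starts_on_0 <- as.numeric(names(heaped5)) %% 10 == 0

# Bins starting on a 0 sit higher than the adjacent bins starting on a 5,
# which is the alternating (sawtooth) pattern in 5-year data
round(rbind(on_0 = excess[starts_on_0], on_5 = excess[!starts_on_0]), 3)
```

Because the excess alternates from one 5-year bin to the next, graduating these bins back to single ages would carry a 10-year wave into the result.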
check_heaping_sawtooth(Value, Age, ageMin = 40, ageMax = max(Age[Age %% 5 == 0]) - 10)
Argument | Description |
---|---|
Value | numeric. A vector of demographic counts by single age. |
Age | numeric. A vector of ages corresponding to the lower integer bound of the counts. |
ageMin | integer evenly divisible by 10. Lower bound of evaluated age range, default 40. |
ageMax | integer evenly divisible by 5. Upper bound of evaluated age range, defaults to 10 years below the highest age evenly divisible by 5. |
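To see how the default ageMax in the usage signature resolves, the expression can be evaluated directly. This is a minimal sketch assuming single ages 0 to 99, as in the examples on this page; the name ageMax_default is introduced here for illustration only.

```r
# Single ages 0-99 (assumed input, matching the examples on this page)
Age <- 0:99

# Default expression from the usage signature:
# highest age evenly divisible by 5, minus 10
ageMax_default <- max(Age[Age %% 5 == 0]) - 10
ageMax_default
#> [1] 85
```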
Returns a numeric value: the ratio of the heaping measure on 0s to that on 5s. A value > 1 indicates that the sawtooth pattern is present.
Data are grouped into 5-year age bins. The ratio of each value to the average of its two neighboring values is calculated. If 0s have stronger attraction than 5s, then we expect these ratios to be >1 for 0s and <1 for 5s. Ratios are compared within each 10-year age group in the evaluated age range. If in the evaluated range there are at most two exceptions to this rule (0s>5s), then the ratio of the mean ratio for 0s to the mean ratio for 5s is returned, and it is recommended to use a smoothing method. Higher values suggest use of a more aggressive method. This approach is only slightly different from that of Feeney, as implemented in the smooth_age_5_zigzag_inner() function. This is not a general measure of roughness, but rather an indicator of this particular pattern of age attraction.
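The steps described above can be sketched in base R as follows. This is an illustrative reading of the description, not the package's implementation: sawtooth_sketch() is a hypothetical helper, the pairing of each 0 bin with the 5 bin that follows it is an assumption, and because the text does not say what is returned when more than two exceptions occur, the sketch returns NA_real_ as a placeholder.

```r
# Hypothetical helper following the verbal description; not the package's code
sawtooth_sketch <- function(Value, Age, ageMin = 40,
                            ageMax = max(Age[Age %% 5 == 0]) - 10) {
  # 1. Group single-age counts into 5-year bins labelled by lower bound
  grouped <- tapply(Value, Age - Age %% 5, sum)
  A5 <- as.numeric(names(grouped))
  V5 <- as.numeric(grouped)
  n  <- length(V5)

  # 2. Ratio of each bin to the mean of its two neighbours (undefined at the ends)
  ratio <- rep(NA_real_, n)
  ratio[2:(n - 1)] <- V5[2:(n - 1)] / ((V5[1:(n - 2)] + V5[3:n]) / 2)

  # 3. Pair the 0 bin and the 5 bin of each 10-year group in the evaluated range
  zeros <- seq(ageMin, ageMax, by = 10)
  zeros <- zeros[zeros + 5 <= ageMax]
  r0 <- ratio[match(zeros, A5)]
  r5 <- ratio[match(zeros + 5, A5)]

  # 4. Count groups that break the expected rule (ratio on 0 > ratio on 5)
  exceptions <- sum(!(r0 > r5))

  # 5. At most two exceptions: return mean ratio on 0s over mean ratio on 5s.
  #    Otherwise return NA (placeholder; this case is not specified in the text).
  if (exceptions <= 2) mean(r0) / mean(r5) else NA_real_
}

# Invented data: linear counts, then toy heaping that is stronger on 0s than on 5s
Age <- 0:99
smooth_counts <- 1000 - 5 * Age
heaped <- smooth_counts
heaped[Age %% 10 == 0] <- heaped[Age %% 10 == 0] * 1.5
heaped[Age %% 10 == 5] <- heaped[Age %% 10 == 5] * 1.1

res_heaped <- sawtooth_sketch(heaped, Age)         # pattern present, value above 1
res_smooth <- sawtooth_sketch(smooth_counts, Age)  # no pattern, placeholder NA
```

Values returned by this sketch are not expected to match those of check_heaping_sawtooth(); it is meant only to make the order of operations concrete.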
Feeney G (1979). “A technique for correcting age distributions for heaping on multiples of five.” Asian and Pacific Census Forum, 5(3), 12--15.
Feeney G (2013). “Removing ‘Zigzag’ from Age Data.” http://demographer.com/white-papers/2013-removing-zigzag-from-age-data/
Age <- 0:99
A5 <- seq(0, 95, by = 5)
smoothed <- graduate_sprague(
  smooth_age_5(pop1m_pasex, Age,
               method = "Strong",
               OAG = FALSE,
               young.tail = "Arriaga"),
  Age = A5,
  OAG = FALSE)

# not saw-tooth jagged
check_heaping_sawtooth(smoothed, Age)
#> [1] 0.9825734

# saw-tooth pattern detected in older ages
check_heaping_sawtooth(pop1m_pasex, Age)
#> [1] 1.315748

# heaped, but no 0>5 preference
h1 <- heapify(smoothed, Age, p0 = 1, p5 = 1)
# heaping progressively worse on 0s than on 5s.
h2 <- heapify(smoothed, Age, p0 = 1.2, p5 = 1)
h3 <- heapify(smoothed, Age, p0 = 1.5, p5 = .8)
h4 <- heapify(smoothed, Age, p0 = 2, p5 = .5)

if (FALSE) {
  plot(Age, smoothed, type = 'l')
  lines(Age, h1, col = "blue")
  lines(Age, h2, col = "green")
  lines(Age, h3, col = "red")
}

check_heaping_sawtooth(h1, Age)
#> [1] 1.013527

# 0-preference > 5 pref
check_heaping_sawtooth(h2, Age)
#> [1] 1.148617

# increasing values
check_heaping_sawtooth(h3, Age)
#> [1] 1.584231
check_heaping_sawtooth(h4, Age, ageMin = 35)
#> [1] 2.895501