Bayesian modelling and mapping of health outcomes in space and time using complex national surveys

Atilola, Glory Olayiwola (2021) Bayesian modelling and mapping of health outcomes in space and time using complex national surveys. Doctoral thesis, Northumbria University.

Text (Doctoral Thesis)
atilola.glory_phd_16029707.pdf - Submitted Version


Significance of the thesis:
Nationally representative surveys have played a central role in generating data for monitoring and evaluating health outcomes over the past four decades. In such surveys, individuals within households are nested within geographically referenced units and sampled with unequal probabilities over time. This allows subnational and small area analyses to be carried out and trends to be estimated locally. More importantly, population determinants of health outcomes can be evaluated in a manner that accounts for the complexity of the sampling design, geographic variation, and subnational trends within a unified generalized linear additive mixed regression modelling framework. Acknowledging and accounting for such complex sampling design features improves the estimation of risk factors and the small area estimation of trends. We propose two novel statistical frameworks to address these two challenges: 1) the Proximate Determinants Framework to evaluate female genital mutilation risk factors, embedded within a hierarchical Bayesian SpatioTemporal STructured Additive mixed Regression (PDF/ST-STAR) using nationally representative household surveys; 2) the repeated measurement hierarchical Bayesian linear mixed effect model (RM-BLMM) for small area estimation of local trends in health outcomes using nationally representative longitudinal survey data.

How the Research was undertaken:
The study assessed five research questions: 1) What factors, operating at individual and community levels, explain observed prevalence trends in female genital mutilation/cutting (FGM/C) among girls in Kenya, Nigeria, and Senegal over a specific period? 2) Does accounting for excess variability in FGM/C prevalence risk due to unmeasured risk factors interacting in space and time lead to observable changes in risk factor estimates? 3) Does accounting for the complex sampling design features inherent in Demographic and Health Surveys result in observable changes in the effect size of risk factor estimates? 4) Was there any significant overall change in the average systolic blood pressure (SBP) among South African adults aged 18 years and older between 2008 and 2017? 5) Was there any observable geographic variation in average SBP among South African adults aged 18 years and older at small area (district municipality) level in 2008, 2010, 2012, 2014/15 and 2017?

Statistical Methodology:
The study used two sources of nationally representative survey data to address the research questions. For the FGM/C study, we used Demographic and Health Survey (DHS) data on girls and their mothers, collected cross-sectionally at three time points in each of three African countries: Kenya DHS (2003, 2008, 2014), Nigeria DHS (2008, 2013, 2018), and Senegal DHS (2010, 2015, 2017). For the small area estimation study on systolic blood pressure, we considered the South African National Income Dynamics Survey (NIDS) data collected, on average, every two years across five time points between 2008 and 2017.

For the FGM/C risk factor study, we investigated changes in the observable influence of risk factors on FGM/C likelihood among girls in Kenya, Nigeria, and Senegal over each country's study period. We evaluated changes to risk factor estimates after accounting separately for the complex sampling design features of the DHS data, namely stratification and clustering, within a non-separable hierarchical Bayesian spatiotemporal generalized additive mixed regression model framework. To account for the non-separable influence of geography and time, we decomposed the total space-time variability in FGM/C prevalence risk into a main spatial effect, a main linear temporal effect, and a space-time interaction term. We considered the four types of prior specification proposed by Knorr-Held for the space-time interaction term to evaluate excess variability in space and time. All analyses were implemented using the Integrated Nested Laplace Approximation (INLA) in the R programming environment with the R-INLA package. Model comparison and performance evaluation were carried out using the deviance information criterion (DIC) and the Watanabe-Akaike information criterion (WAIC). Predictive performance of the best fitting model was assessed using the logarithm of the conditional predictive ordinate (logCPO) at the individual level and the root mean squared error (RMSE) at state/regional level.
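The decomposition described above can be written out as a worked equation. This is a hedged sketch only: the notation, the logit link, and the split of the main spatial effect into structured and unstructured parts are assumptions made for illustration, and any smooth (additive) covariate terms are omitted.

```latex
% Assumed notation: p_{ijt} is the probability that girl j in region i at survey time t
% has undergone FGM/C; x_{ijt} holds individual- and community-level risk factors.
\[
\operatorname{logit}(p_{ijt})
  = \beta_0 + \mathbf{x}_{ijt}^{\top}\boldsymbol{\beta}
  + \underbrace{u_i + v_i}_{\text{main spatial effect}}
  + \underbrace{\gamma\, t}_{\text{main linear temporal effect}}
  + \underbrace{\delta_{it}}_{\text{space-time interaction}}
\]
% Knorr-Held's four interaction types differ in the prior placed on \delta_{it}:
%   Type I:   unstructured in space and unstructured in time (i.i.d. Gaussian)
%   Type II:  structured in time, unstructured in space
%   Type III: unstructured in time, structured in space
%   Type IV:  structured in both space and time
```

Under Type I, each region-time cell receives its own independent deviation, so the interaction term absorbs excess variability that neither the main spatial nor the main temporal effect explains.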

For the small area estimation of SBP study, we implemented the direct (design-based) estimation method, the spatially smoothed design-based estimation method within a hierarchical Bayesian framework, and the newly proposed repeated measurement hierarchical Bayesian linear mixed effect model (RM-BLMM) formulation. The first two treated the NIDS data cross-sectionally, as a representative sample of the South African adult population at each time point, with samples obtained across the 52 small areas (district municipalities) for all five waves (2008, 2010, 2012, 2014/15 and 2017). Spatial smoothing was achieved by decomposing the overall effect into spatially structured and spatially unstructured random effects with a penalized complexity prior. The proposed RM-BLMM formulation, however, addressed three important limitations of the design-based estimators, namely, the repeated measurement of SBP at each wave for every survey participant, the longitudinal design features of the NIDS survey, and the problem of extremely small sample sizes at spatially disaggregated level. Statistical analysis was implemented using Markov chain Monte Carlo (MCMC) simulation with the Gibbs sampling algorithm over 80,000 iterations, thinning every 10th iteration to improve the efficiency of the MCMC samples. Convergence was attained after the first 5,000 iterations, which were discarded as burn-in, while the remaining 75,000 samples were used for posterior inference. Convergence was assessed by checking the trace plots of the samples, the autocorrelation functions, the estimated kernel density plots, the Brooks-Gelman-Rubin (BGR) summary statistic, and whether Monte Carlo errors were below 5% of the posterior standard deviation. Model evaluation was carried out by comparing the model-based predicted mean trajectory to the observed data values. Model-based small area prediction was carried out using the posterior means and variances of the estimated model parameters. Standard errors and 95% confidence intervals for the small area mean estimates were computed using a parametric bootstrap procedure over 50 simulations.
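The burn-in, thinning, and convergence checks described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the three chains are independent normal draws standing in for Gibbs sampler output, the values 120 and 15 are arbitrary, applying the thinning after burn-in is an assumption, and the function name `gelman_rubin` is hypothetical.

```python
import numpy as np

def gelman_rubin(chains):
    """Basic Gelman-Rubin potential scale reduction factor for (m chains, n draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    between = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance W
    pooled_var = (n - 1) / n * within + between / n
    return float(np.sqrt(pooled_var / within))

# Stand-in for sampler output: 3 chains of 80,000 iterations (assumed values).
rng = np.random.default_rng(0)
raw = rng.normal(loc=120.0, scale=15.0, size=(3, 80_000))

burn_in, thin = 5_000, 10
kept = raw[:, burn_in::thin]        # drop burn-in, then keep every 10th draw

rhat = gelman_rubin(kept)

# Naive Monte Carlo error that ignores autocorrelation; real MCMC output
# would need an effective sample size in place of the raw draw count.
pooled = kept.ravel()
posterior_sd = pooled.std(ddof=1)
mc_error = posterior_sd / np.sqrt(pooled.size)
mc_error_ok = mc_error < 0.05 * posterior_sd

print(kept.shape, round(rhat, 3), mc_error_ok)
```

With these settings, thinning the 75,000 post-burn-in iterations by 10 leaves 7,500 retained draws per chain. A BGR value close to 1 is the usual indication that the chains have mixed.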

Main Research Findings:
We found a significant decline in the probability of FGM/C among girls aged 0-14 years in Kenya between 2003 and 2014 (Mean: -1.47, 95% CI: -1.85, -1.10). In contrast, the prevalence trend remained unchanged in Nigeria for the period 2008 to 2018 (Mean: 0.38, 95% CI: -0.11, 0.88) and in Senegal for the period 2010 to 2017 (Mean: 0.16, 95% CI: -0.15, 0.46).

At the individual level, a girl's likelihood of undergoing FGM/C over each study period was positively associated with her mother's FGM/C history (Kenya, Nigeria, Senegal), the mother's support for continuation of the practice (Nigeria and Senegal), household decision making by the father (Kenya) or jointly by the father and mother (Nigeria), a wife's justification of beating for refusing sex (Senegal), religious affiliation (Kenya, Nigeria), marital status (Nigeria), and ethnicity (Kenya and Nigeria). At the contextual (community) level, we found substantial evidence in Nigeria and Senegal of a positive relationship between the probability of FGM/C in a girl and the prevalence of FGM/C among mothers, the proportion that supported FGM/C continuation and, to a lesser extent, the proportion that believed FGM/C was a religious requirement. All risk factor models showed substantial improvement in predictive performance after accounting for the Type I space-time interaction effect and the DHS cluster sampling design across the three countries. Notable changes were observed in risk factor estimates, especially in Nigeria. In addition, the study demonstrated for the first time that accounting for the effects of contextual factors operating at community level and for excess variability due to cluster sampling substantially reduced spatially structured unobserved risk factor effects at state/regional level.

For the small area estimation study, we found evidence of a significant decline in the overall average SBP among South African adults between 2008 and 2017. Findings from the spatially smoothed design-based estimator showed significant geographic variation in average SBP at district level for each consecutive cross-sectional time point of the NIDS samples. Results showed a preponderance of areas with elevated SBP in the southern districts (such as Overberg, Central Karoo, West Cape) in 2008, 2010, 2012, 2014/15 and 2017, and a cluster of areas with low mean SBP in the northern part of the country. Findings showed an overall reduction across all districts from 2008 to 2017, with the greatest reductions observed in Frances Baard and Pixley ka Seme in Northern Cape Province, Uthukela and Umzinyathi in KwaZulu-Natal Province, Nkangala in Mpumalanga Province, and Sedibeng in Gauteng Province. However, the study obtained only preliminary results for the RM-BLMM, under the assumption of a random intercept but a common trajectory in SBP progression for all survey participants with at least one SBP measurement time point. The model predicted the observed data well, given the small observed change in mean for the majority of participants over the study period. Future work needs to extend the proposed framework to more complex model formulations.

Why Research findings matter:
Study findings demonstrate that social normative influences operating within specific normative ecological contexts at individual, household and community levels are the most important drivers of FGM/C prevalence at a single time point and of its persistence over time. This conclusion was supported by the reduced impact of social norms in Kenya, where a significant decline in the practice was observed between 2003 and 2014. The PDF/ST-STAR approach to modelling FGM/C likelihood showed observable changes in risk factor effect size estimates after accounting for the cluster sampling design across the three countries and for excess variability due to unobserved risk factors operating concurrently in a specific region at a specific time point. Evidence from the small area estimation study provides adequate insight to evaluate the district level geographic pattern in the distribution of average SBP among South African adults in 2008, 2010, 2012, 2014/15 and 2017, along with spatially smoothed interval estimates. More importantly, the RM-BLMM model-based framework provides an important and innovative contribution to existing small area estimation methodologies for longitudinal survey data.

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: spatiotemporal statistics, small area estimation, risk factor epidemiology, female genital mutilation, blood pressure
Subjects: B900 Others in Subjects allied to Medicine
G100 Mathematics
Department: Faculties > Engineering and Environment > Mathematics, Physics and Electrical Engineering
University Services > Graduate School > Doctor of Philosophy
Depositing User: John Coen
Date Deposited: 08 Apr 2022 11:42
Last Modified: 08 Apr 2022 12:00
