2013 Italian Stata Users Group meeting: Abstracts
The role of sensitivity analysis in estimating causal pathways from observational data
Causal inference in observational data is a nearly alchemic task because parameter estimates depend on the model being correctly specified. Researchers strive to include all potential confounders in their models, but the assumption that none has been omitted cannot be directly tested. Further complications arise in causal mediation analyses, where the decomposition into direct and indirect effects is of interest. We argue that sensitivity analysis is an effective method for probing the plausibility of this nonrefutable assumption. The goal of sensitivity analysis in the context of causal mediation is to quantify the degree to which the key assumption of no unmeasured confounders must be violated for a researcher's original conclusion regarding the decomposition into direct and indirect effects to be reversed. Three general scenarios where the assumption of no unmeasured confounders is violated will be discussed, and results derived from sensitivity analyses appropriate for each scenario will be presented.
rfmm: a Stata command for the minimum density power divergence estimation of finite mixtures of regression models
The minimum density power divergence (MDPD) framework (Basu et al. 1998) provides a family of estimators indexed by a parameter, α, which controls the tradeoff between efficiency and robustness. In this paper, we extend this estimation framework to finite mixtures of regression models. To make this extension readily accessible to researchers, we provide the new Stata command rfmm, which allows for the MDPD estimation of finite mixtures of Gaussian, Poisson, and negative binomial regression models. Of special note is that the proposed command provides a graphical tool for preliminary diagnostics on the appropriate number of mixture components based on the L2 criterion function (Scott 2009).
We compare the performance of the MDPD family of estimators provided by rfmm with the ML estimator via Monte Carlo simulations for correctly specified and gross-error contaminated mixture of Poisson regression models. Finally, the proposed package is illustrated using applications from the biometrical and health economics literatures.
- Basu, A., I. R. Harris, N. L. Hjort, and M. C. Jones. 1998. Robust and efficient estimation by minimizing a density power divergence. Biometrika 85: 549–560.
- Scott, D. W. 2009. The L2E method. Wiley Interdisciplinary Reviews: Computational Statistics 1: 45–51.
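The role of α described in the abstract above can be illustrated with a deliberately simple case. The sketch below is not the rfmm algorithm and does not involve mixtures or regressors: it assumes a single normal model with known scale, for which the MDPD estimating equation for the mean reduces to a weighted average with weights exp(-α(x - μ)²/(2σ²)). The function name `mdpd_location` and the toy data are hypothetical.

```python
import math
from statistics import mean, median


def mdpd_location(xs, alpha, sigma=1.0, tol=1e-10, max_iter=500):
    """MDPD estimate of a normal mean with known scale sigma (illustrative only)."""
    mu = median(xs)  # robust starting value
    for _ in range(max_iter):
        # When alpha > 0, observations far from mu receive weights close to zero.
        w = [math.exp(-alpha * (x - mu) ** 2 / (2 * sigma ** 2)) for x in xs]
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


data = [-1.0, -0.5, 0.0, 0.5, 1.0, 50.0]  # the last value is a gross error

ml_estimate = mdpd_location(data, alpha=0.0)      # alpha = 0: all weights are 1 (sample mean)
robust_estimate = mdpd_location(data, alpha=0.5)  # alpha > 0: the outlier is downweighted
```

With α = 0 every weight equals 1 and the estimate is the sample mean, which the contaminated observation pulls upward. With α = 0.5 the contaminated observation receives a negligible weight, at the cost of some efficiency when the model is correct.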
Estimating net survival using a life-table approach
Cancer registries are often interested in estimating net survival, the probability of survival if the cancer under study is the only possible cause of death. In 2011, Pohar Perme et al. proposed a new estimator of net survival based on inverse-probability weighting. They demonstrated that existing estimators of net survival based on relative survival were biased, whereas the new estimator was unbiased. However, the Pohar Perme estimator was developed for continuous survival times, yet cancer registries often have only discrete survival times (for example, survival time in completed months or completed years).
We propose an approach to estimation when survival times are discrete and life-table estimation is applied.
A new Stata command, stnet, allows a life-table estimation of net survival, adapting the Pohar Perme approach to life-table estimation. Age-standardized survival estimates are available. In addition to the traditional cohort and complete approach, estimates can also be made using a period or hybrid approach.
With examples, we describe the main features of the new command and compare the results produced by our life-table approach with those of the continuous-time approach implemented in R.
The new command is available for download from the SSC Archive.
- Pohar Perme, M., J. Stare, and J. Estève. 2012. On estimation in relative survival. Biometrics 68: 113–120.
- Pohar Perme, M. 2013. relsurv: Relative survival. R package version 2.0-4. Available at http://CRAN.R-project.org/package=relsurv.
xsmle: a Stata command for spatial panel-data models estimation
This paper presents xsmle, a new Stata command for the estimation of spatial panel-data models. xsmle fits a spatial autoregressive model, a spatial error model, and a spatial Durbin model with fixed or random effects and with or without a dynamic component. Moreover, xsmle estimates the fixed-effects spatial autoregressive model with autoregressive disturbances and the generalized spatial random-effects model. Different weighting matrices may be specified when appropriate, and both Stata matrices and spmat objects are allowed. Of special note is that xsmle computes direct, indirect, and total effects according to LeSage (2008), implements Lee and Yu’s (2010) data transformation for fixed-effects models, performs a robust Hausman test, and may be used with the mi prefix when the panel is unbalanced.
- Lee, L.-f., and J. Yu. 2010. Estimation of spatial autoregressive panel data models with fixed effects. Journal of Econometrics 154: 165–185.
- LeSage, J. 2008. An Introduction to Spatial Econometrics. Revue d’économie industrielle 123: 19–44.
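The direct, indirect, and total effects mentioned in the abstract above can be sketched for the simplest case. The code below is not xsmle's implementation. It assumes a cross-sectional spatial autoregressive model y = ρWy + xβ + ε with one regressor, for which the matrix of partial derivatives of y with respect to x is (I - ρW)⁻¹β. Under the usual summary measures, the direct effect is the average diagonal element, the total effect is the average row sum, and the indirect effect is their difference. The function names and the 3×3 weight matrix are hypothetical.

```python
def invert(a):
    """Invert a small nonsingular matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def sar_effects(W, rho, beta):
    """Average direct, indirect, and total effects of one regressor in a SAR model."""
    n = len(W)
    a = [[float(i == j) - rho * W[i][j] for j in range(n)] for i in range(n)]
    s = [[beta * v for v in row] for row in invert(a)]  # (I - rho*W)^(-1) * beta
    direct = sum(s[i][i] for i in range(n)) / n
    total = sum(sum(row) for row in s) / n
    return direct, total - direct, total


# Hypothetical row-standardized weight matrix: each unit has two equally weighted neighbors.
W = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

direct, indirect, total = sar_effects(W, rho=0.4, beta=2.0)
```

Because W is row-standardized, the total effect equals β/(1 - ρ). The direct effect exceeds β because a change in a unit's own regressor feeds back to it through its neighbors.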
Social networks and smoking initiation: Results of a pilot study among students of an upper secondary school
Introduction: Smoking initiation occurs mainly during adolescence and may be influenced by the relationship networks that form at school. This study aims to reconstruct the relationship networks among first- and second-year students of an upper secondary school and to assess both the degree of centrality of smokers in the social network and the network's level of homophily.
Methods: As part of a European research project on smoking, a cross-sectional pilot study was conducted at the Liceo Psico-Pedagogico in the city of Cassino (FR). Data on students enrolled in the first and second years were collected through a self-administered questionnaire. Along with the questionnaire, students received a list of the names of all first- and second-year students at the school, in which each name was paired with a numeric code. Each student could name up to 5 schoolmates with whom he or she most often studied or spent free time. The social network was visualized using the netplot command (Corten 2011), and the betweenness centrality index was computed using the netsis command (Miura 2011).
Results: Of the 231 students enrolled in the first two years, 175 (age range 13–16 years, 90.2% female) completed the questionnaire (response proportion = 75.8%). Of these, nearly half (43.3%) had tried smoking at least once. Regular smokers accounted for 6.9%, while 36.4% were in the experimentation phase. The number of relationships (indegree) was similar among current smokers, experimenters, and never smokers, with mean values of 3.3, 4.3, and 3.6, respectively. The probability of having a smoker or experimenter among the named friends was 27.1%, 27.4%, and 23.5% for current smokers, experimenters, and never smokers, respectively. In the three groups, betweenness centrality values were 0.037, 0.025, and 0.017, respectively.
Conclusions: Because smokers and experimenters have higher centrality values than nonsmokers, smoking prevention interventions in this age group must take into account the influence that smokers exert on their friends.
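The two network measures used in this study, indegree and betweenness centrality, can be sketched on a toy nomination network. The code below is not the netsis or netplot implementation. It assumes a directed, unweighted network stored as a dictionary that maps each student code to the list of codes that student named, with every named student also present as a key. Betweenness is computed with Brandes' algorithm and left unnormalized, so the values are not on the same scale as those reported in the abstract. The four-student network is hypothetical.

```python
from collections import deque


def indegree(nominations):
    """Number of times each student is named by others."""
    counts = {v: 0 for v in nominations}
    for named in nominations.values():
        for w in named:
            counts[w] += 1
    return counts


def betweenness(nominations):
    """Unnormalized betweenness for a directed, unweighted graph (Brandes' algorithm)."""
    score = {v: 0.0 for v in nominations}
    for s in nominations:
        order = []                                # nodes in order of distance from s
        preds = {v: [] for v in nominations}      # predecessors on shortest paths
        n_paths = {v: 0 for v in nominations}     # number of shortest paths from s
        dist = {v: -1 for v in nominations}
        n_paths[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                              # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in nominations[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        dep = {v: 0.0 for v in nominations}
        for w in reversed(order):                 # accumulate dependencies back toward s
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1 + dep[w])
            if w != s:
                score[w] += dep[w]
    return score


# Hypothetical chain of nominations: 1 names 2, 2 names 3, 3 names 4.
nominations = {1: [2], 2: [3], 3: [4], 4: []}
indeg = indegree(nominations)
btw = betweenness(nominations)
```

In this chain, students 2 and 3 each lie on two shortest paths between other students, while students 1 and 4 lie on none.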
Generalized structural equation modeling in Stata
Stata’s structural equation modeling (SEM) capabilities have been greatly expanded in version 13. Support for categorical and count outcomes as well as multilevel data structures allows us to fit a dizzying array of models. This talk will demonstrate how to use these new features with several common applications.
Maxima Bridge System: A software interface between Stata and the Maxima computer algebra system
Maxima is a free and open-source computer algebra system (CAS), namely, software that can perform symbolic computations such as solving equations, determining derivatives of functions, obtaining Taylor series, and manipulating algebraic expressions. In this presentation, I discuss the Maxima Bridge System (MBS), a collection of software that allows Stata to interface with Maxima to use it as an engine for symbolic computation, transfer data from Stata to Maxima, and retrieve results from Maxima. The cooperation between Stata and Maxima provides the user with an environment for statistical analysis in which the power of symbolic computation can be easily used together with all the facilities supplied by Stata. In this environment, the statistician can employ symbolic computation algorithms, when convenient, to manage the complexity of algebra and calculus while using numerical computation when speed matters.
Average partial effects in multivariate probit models with latent heterogeneity: Monte Carlo experiments and an application to immigrants’ ethnic identity and economic performance (Cancelled)
We extend the univariate results in Wooldridge (2005) to multivariate probit models, proving that multivariate probit models produce estimates of average partial effects (APEs) based on joint probabilities that are robust to general forms of conditionally independent unobserved heterogeneity. We also show that consistency may break down for APEs based on probabilities conditional on response subvectors (whether or not the response variables are used as regressors) as well as for APEs based on joint probabilities in some constrained models. For each of the cases considered, we provide restrictions that reestablish consistency. The finite-sample performance of such solutions is examined through a battery of Monte Carlo experiments on the bivariate probit model. A useful implication of our results is the possibility of implementing simple Stata procedures for estimating APEs that are both faster and more accurate than simulation-based codes. For example, we prove that, in the context of the trivariate probit model, APEs can be consistently estimated in Stata by combining, through the suest command, three biprobit equations (combined biprobit). A second battery of Monte Carlo experiments shows that our combined biprobit procedure is faster and has better finite-sample performance (bias and RMSE) than the simulation-based commands mvprobit and cmp. As a further empirical illustration, we apply the combined biprobit procedure and mvprobit to estimate APEs in the context of a trivariate probit model of ethnic identity formation and economic performance on Italian survey data of immigrants.
- Wooldridge, J. M. 2005. Unobserved heterogeneity and estimation of average partial effects. In Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, ed. D. W. K. Andrews and J. H. Stock. New York: Cambridge University Press.
A review of mediation analysis in Stata: Principles, methods and application
Abstract not available.
Estimating average treatment effects from observational data using teffects
After reviewing the potential-outcome framework for estimating treatment effects from observational data, this talk discusses how to estimate the average treatment effect and the average treatment effect on the treated by the regression-adjustment estimator, the inverse-probability weighted estimator, two doubly robust estimators, and two matching estimators implemented in teffects.
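Two of the estimators named above, regression adjustment and inverse-probability weighting, can be sketched in a few lines. The code below is not how teffects is implemented. It assumes a binary treatment, a single binary confounder, and saturated outcome and treatment models, so that predicted outcomes are cell means and propensity scores are stratum-specific treatment shares. The IPW version uses normalized weights. The function name `ate_estimates` and the data are hypothetical.

```python
def ate_estimates(rows):
    """rows: (y, t, x) tuples with binary treatment t and one binary confounder x."""
    n = len(rows)
    strata = sorted({x for _, _, x in rows})

    def cell_mean(t, x):
        ys = [y for y, ti, xi in rows if ti == t and xi == x]
        return sum(ys) / len(ys)

    # Regression adjustment: average, over the whole sample, the difference
    # between predicted outcomes under treatment and under control.
    ra = sum(cell_mean(1, x) - cell_mean(0, x) for _, _, x in rows) / n

    # Propensity score: share of treated units within each stratum of x.
    pscore = {}
    for x in strata:
        ts = [t for _, t, xi in rows if xi == x]
        pscore[x] = sum(ts) / len(ts)

    def weighted_mean(pairs):
        return sum(y * w for y, w in pairs) / sum(w for _, w in pairs)

    # IPW: weight treated units by 1/p(x) and control units by 1/(1 - p(x)).
    treated = [(y, 1 / pscore[x]) for y, t, x in rows if t == 1]
    control = [(y, 1 / (1 - pscore[x])) for y, t, x in rows if t == 0]
    ipw = weighted_mean(treated) - weighted_mean(control)
    return ra, ipw


# Hypothetical data: treatment is more likely when x = 1, and x also raises y.
rows = [(3, 1, 0), (1, 0, 0), (1, 0, 0), (1, 0, 0),
        (6, 1, 1), (6, 1, 1), (6, 1, 1), (2, 0, 1)]

treated_y = [y for y, t, _ in rows if t == 1]
control_y = [y for y, t, _ in rows if t == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
ra, ipw = ate_estimates(rows)
```

In this toy data the stratum-specific effects are 2 and 4 and the strata are equally sized, so both estimators return an average treatment effect of 3, whereas the unadjusted difference in means is 4 because of confounding by x.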