|Interactive graphs with Stata
Graphs have been used not only to solve topographic problems and
to represent social structures but also to study relationships
To improve their analytical potential, these graphs are endowed with an interactive potential that includes the selection of various attributes for the recognition of the elements analyzed and the modification of parameters to focus on stronger relationships.
The proposed representations are based on solving several equations and selecting only those coefficients with a significant positive relationship. By doing so, we obtain graphs by selecting the categories with predicted proportions or means significantly greater than those of the population. Furthermore, to increase their analytic power, the graphs have interactive characteristics, which include the selection of elements according to their size or attributes and the filtering of the most central and strongest links. In this presentation, we present a Stata program to produce these interactive graphs and give a variety of examples.
Cristina Calvo López
Universidad de Salamanca
Universidad de Salamanca
|Implementation of different propensity-score matching (PSM) methods using Stata
Propensity-score matching (PSM) has become a popular approach to
estimate causal treatment effects, mainly because it allows
estimation of the ATT (Imbens 2004).
Several matching methods exist, however, such as exact matching, where each treated unit is matched with one control unit for which the values of Xi are identical, or K-to-K matching. Based on an actual analysis, in this seminar I compare the results of one-to-one and 1-to-K matching without replacement with those of exact matching in a simulated sample. Once the matching is done, I estimate the treatment effect through three approaches: using the obtained weights, treating the matched sets as clusters, and ignoring both weights and matching. I then compare the results with the exact-matching estimates. Changes in effect estimates are evaluated as a function of improvements in balance and of the effect estimand.
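As a rough illustration of one-to-one matching without replacement, the following Python sketch pairs each treated unit with the closest still-available control on the propensity score and averages the outcome differences. It is not the presenter's Stata code: the function names, the greedy matching order, and the toy data are assumptions made for this example, and the propensity scores are taken as given rather than estimated.

```python
def match_one_to_one(treated, controls, caliper=None):
    """Greedy 1-to-1 nearest-neighbor matching without replacement.

    treated, controls: lists of (unit_id, propensity_score, outcome).
    Returns a list of (treated_unit, control_unit) pairs.
    """
    available = list(controls)  # controls not yet used
    pairs = []
    # Assumption: treated units are processed from highest to lowest score.
    for t in sorted(treated, key=lambda u: u[1], reverse=True):
        if not available:
            break
        best = min(available, key=lambda c: abs(c[1] - t[1]))
        if caliper is not None and abs(best[1] - t[1]) > caliper:
            continue  # no acceptable control; this treated unit stays unmatched
        pairs.append((t, best))
        available.remove(best)  # without replacement
    return pairs


def att_from_pairs(pairs):
    """Mean treated-minus-control outcome difference over matched pairs."""
    diffs = [t[2] - c[2] for t, c in pairs]
    return sum(diffs) / len(diffs)


# Hypothetical toy data: (id, propensity score, outcome)
treated = [("t1", 0.80, 10.0), ("t2", 0.55, 8.0)]
controls = [("c1", 0.78, 7.0), ("c2", 0.50, 6.5), ("c3", 0.20, 5.0)]

pairs = match_one_to_one(treated, controls)
matched_ids = [(t[0], c[0]) for t, c in pairs]
att = att_from_pairs(pairs)
print(matched_ids, att)
```

Greedy matching depends on the order in which treated units are processed; optimal matching or a caliper would be natural variations to compare.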
Laura del Campo Albendea
Universidad Autónoma de Madrid, Hospital Universitario Ramón y Cajal, and IRYCIS
|A Stata 17 implementation of the local ratio autonomy: Calling Python
In many countries around the world, the public sector is
decentralized to improve efficiency in the provision of public
Until the publication of the paper by Martínez-Vazquez, Vulovic, and Liu (2011), the level of decentralization was approximated through the local income ratio. It has been shown that this covariate is endogenous and that, because of unobservable heterogeneity, it can generate correlation. The local autonomy ratio proposed by these authors is an indicator weighted by the inverse of the distance between municipalities, which in turn is weighted by the sum of the inverse of the distance between all municipalities in the country. We instead propose a local autonomy ratio conditioned on distance and population thresholds between the country's municipalities. Multiple distance and population restrictions must be tested until the effect of this ratio is found to be significant as a covariate in an econometric model. To reduce the computation time of the estimation, we automated the calculation of the indicator by programming the local autonomy ratio in Stata 17 and calling Python (version 3.10.5).
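The abstract does not give the exact formula, so the following Python sketch is only one possible reading: for a municipality, it averages the other municipalities' ratios with inverse-distance weights normalized by the sum of inverse distances, keeping only municipalities within a distance threshold and above a population threshold. The function name, the direction of the population threshold, and the toy data are all assumptions.

```python
import math


def weighted_ratio(i, coords, ratios, pops, max_dist, min_pop):
    """Inverse-distance weighted average of other municipalities' ratios.

    Only municipalities j with distance(i, j) <= max_dist and
    population >= min_pop contribute (assumed thresholds).
    Returns None when no municipality qualifies.
    """
    num = 0.0
    den = 0.0
    for j in coords:
        if j == i or pops[j] < min_pop:
            continue
        d = math.dist(coords[i], coords[j])
        if d == 0 or d > max_dist:
            continue  # skip coincident points and distant municipalities
        w = 1.0 / d
        num += w * ratios[j]
        den += w  # normalizing constant: sum of inverse distances
    return num / den if den > 0 else None


# Hypothetical municipalities: planar coordinates, local ratios, populations
coords = {"A": (0, 0), "B": (3, 4), "C": (0, 10), "D": (0, 100), "E": (1, 0)}
ratios = {"A": 0.5, "B": 0.4, "C": 0.1, "D": 0.9, "E": 0.7}
pops = {"A": 2000, "B": 1500, "C": 800, "D": 5000, "E": 100}

wide = weighted_ratio("A", coords, ratios, pops, max_dist=50, min_pop=500)
narrow = weighted_ratio("A", coords, ratios, pops, max_dist=7, min_pop=500)
print(wide, narrow)
```

Looping this function over a grid of `max_dist` and `min_pop` values is the kind of repeated calculation that the authors report automating.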
Juan S. Morales-Castillo
Universidad de Granada
Jose L. Sáez-Lozano
Universidad de Granada
|Introduction to Bayesian VAR estimation in Stata
The use of the Bayesian approach for regression analysis is
spreading more across different disciplines.
The possibility of incorporating a priori information in the form of probability distributions for the parameters of the model makes this approach highly appealing when the researcher has that knowledge. Bayesian vector autoregressive (BVAR) models are particularly attractive because the overparameterization present in many VAR models can be handled by using prior probability distributions that shrink the parameter space. In this presentation, I will briefly highlight the general elements associated with Bayesian VAR models, and I will use a couple of examples to illustrate how Stata estimates the parameters of a VAR model using the Bayesian approach and how we can obtain probabilities for events that combine levels of the different endogenous variables of the model.
|Ensemble learning targeted maximum-likelihood estimation for Stata users
Modern epidemiology has identified significant limitations of
classical epidemiological methods, such as outcome regression
analysis when estimating causal quantities for the average
treatment effect (ATE) using observational data.
A limitation of estimating the ATE with regression models is the assumption that the effect measure is constant across levels of confounders included in the model (for example, that there is no effect modification). Another limitation of parametric modeling rests on the need for correct model specification to obtain unbiased estimates of the true ATE.
To overcome these limitations, targeted maximum-likelihood estimation (TMLE) has been developed, which is a semiparametric, double-robust, efficient substitution estimator allowing for data-adaptive estimation while obtaining valid statistical inference based on the targeted minimum loss-based estimation. Moreover, TMLE allows inclusion of machine-learning algorithms to minimize the risk of model misspecification, a problem that persists for competing estimators.
eltmle is the only Stata program implementing TMLE for the ATE for a binary or continuous outcome and binary treatment. eltmle includes the use of an R-based super-learner called from the SuperLearner package v.2.0-2.1 (Polley et al. 2011) to calculate predictions of the treatment and outcome models. We are developing the program to be native to Stata using lasso and also calling the Super Learner from Python.
Evidence shows that TMLE typically provides the least biased estimates of the ATE compared with other double-robust estimators. Nonetheless, recent developments support the use of cross-fit double-robust estimators for data-adaptive estimation, and we are planning to update eltmle with these functionalities.
London School of Hygiene and Tropical Medicine
Miguel Angel Luque Fernandez
London School of Hygiene and Tropical Medicine and University of Granada
|mpitb: A toolbox for multidimensional poverty indices
I present mpitb, a toolbox for multidimensional poverty
The Stata package mpitb comprises several subcommands to facilitate specification, estimation, and analysis of MPIs and supports the popular Alkire–Foster framework for multidimensional poverty measurement. mpitb offers several benefits to researchers, analysts, and practitioners working on MPIs, including substantial time savings (for example, due to lower data-management and programming requirements) while allowing for a more comprehensive analysis at the same time. Moreover, the toolbox encourages reporting of standard errors or confidence intervals.
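For readers unfamiliar with the Alkire–Foster framework, the following Python sketch computes its core quantities: each person's weighted deprivation score, the headcount ratio H among those at or above the poverty cutoff k, the average intensity A among the poor, and the adjusted headcount ratio (MPI) as H times A. This is not mpitb syntax; the function name, indicators, weights, and cutoff are hypothetical.

```python
def alkire_foster(deprivations, weights, k):
    """Return (H, A, MPI) for 0/1 deprivation rows.

    deprivations: one list of 0/1 indicators per person.
    weights: indicator weights summing to 1.
    k: poverty cutoff on the weighted deprivation score.
    """
    scores = [sum(w * d for w, d in zip(weights, row)) for row in deprivations]
    poor_scores = [c for c in scores if c >= k]  # identification step
    headcount = len(poor_scores) / len(scores)   # H: share of poor people
    intensity = sum(poor_scores) / len(poor_scores) if poor_scores else 0.0  # A
    return headcount, intensity, headcount * intensity


# Hypothetical data: three indicators, four people
weights = [0.5, 0.25, 0.25]
people = [
    [1, 0, 0],  # score 0.5, poor at k = 0.5
    [0, 1, 0],  # score 0.25, not poor
    [1, 1, 1],  # score 1.0, poor
    [0, 0, 0],  # score 0.0, not poor
]

H, A, mpi = alkire_foster(people, weights, k=0.5)
print(H, A, mpi)  # 0.5 0.75 0.375
```

The sketch gives point estimates only; the standard errors and confidence intervals that the toolbox encourages would require the survey design.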
Keywords: st0001, mpitb, multidimensional poverty, Alkire–Foster method, MPI
Autonomous University of Barcelona and University of Oxford
|Agent-based model calling Python from Stata 17: An application to spatial voting theory
The agent-based model (ABM) allows us to explain and simulate
the behavior of interacting, adaptive, and diverse agents
interacting in space and time.
In this presentation, we assume that agents' decisions are rational and profit maximizing. ABM offers great potential in the field of political behavior because it helps us to better understand complex systems and mechanisms. The spatial theory of voting predicts that voters' decisions are based on the ideological distance between voter and candidates: voters locate themselves on the ideological spectrum and also locate the candidates according to their proposals. Ideological distance is the only argument that guides the vote. Our goal is to build an ABM using the Python language. From Stata 17, we will call Python to complete the modeling, calibration, and simulation phases of the model.
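The voting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' model: it assumes a one-dimensional ideological spectrum on [0, 1], fixed candidate positions, and hypothetical names throughout, and it omits the calibration and adaptive behavior a full ABM would include.

```python
import random
from collections import Counter


def vote(voter_position, candidates):
    """Return the candidate closest to the voter on the ideological line.

    candidates: dict mapping candidate name to position.
    Ties go to the candidate listed first (an arbitrary assumption).
    """
    return min(candidates, key=lambda name: abs(candidates[name] - voter_position))


def run_election(voter_positions, candidates):
    """Tally votes when ideological distance is the only decision criterion."""
    return Counter(vote(v, candidates) for v in voter_positions)


# Hypothetical candidates and a small deterministic electorate
candidates = {"L": 0.3, "R": 0.7}
voters = [0.1, 0.2, 0.45, 0.6, 0.9]
tally = run_election(voters, candidates)
print(tally)

# A larger simulated electorate with uniformly distributed ideal points
rng = random.Random(17)
simulated = [rng.random() for _ in range(1000)]
sim_tally = run_election(simulated, candidates)
print(sim_tally)
```

Extending the sketch so that candidates reposition between rounds would turn it into a simple adaptive simulation.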
Universidad de Granada
Jose L. Sáez-Lozano
Universidad de Granada
|Computing decomposable multigroup indices of segregation
There are eight multigroup segregation indices that are
decomposable into a between and a within term.
They are two versions of (a) the mutual information, (b) the symmetric Atkinson, (c) the relative diversity, and (d) Theil's H index. In this presentation, we present the Stata command dseg for obtaining all of them. It contributes to the stock of segregation commands in Stata by (1) implementing the decomposition in a single call; (2) providing the weights and local indices employed in the computation of the within term; (3) facilitating the deployment of the decomposability properties of the eight indices in complex scenarios that demand tailor-made solutions; and (4) leveraging sample data with bootstrapping and approximate randomization tests. We analyze 2017 census data of public schools in the United States to illustrate the use of dseg. The topic is school racial segregation.
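To make the between and within terms concrete, the following Python sketch computes one of the indices, the mutual information index, for schools nested in districts and checks that the between-district term plus the population-weighted within-district terms reproduces the overall index. It is not dseg syntax; the function names and the enrollment counts are hypothetical, and only one of the eight indices is shown.

```python
import math
from collections import defaultdict


def mutual_information(counts):
    """Mutual information index M for a dict unit -> list of group counts."""
    total = sum(sum(row) for row in counts.values())
    group_totals = [sum(col) for col in zip(*counts.values())]
    m = 0.0
    for row in counts.values():
        unit_total = sum(row)
        for g, n in enumerate(row):
            if n > 0:  # empty cells contribute zero
                m += (n / total) * math.log(
                    (n / unit_total) / (group_totals[g] / total)
                )
    return m


def decompose(counts, district_of):
    """Return (between, within) terms of M for schools nested in districts."""
    by_district = defaultdict(dict)
    for school, row in counts.items():
        by_district[district_of[school]][school] = row
    total = sum(sum(row) for row in counts.values())
    # Aggregate school counts to district level for the between term.
    district_counts = {
        d: [sum(col) for col in zip(*schools.values())]
        for d, schools in by_district.items()
    }
    between = mutual_information(district_counts)
    # Within term: district-level M weighted by each district's population share.
    within = sum(
        (sum(district_counts[d]) / total) * mutual_information(schools)
        for d, schools in by_district.items()
    )
    return between, within


# Hypothetical enrollment counts for two racial groups
counts = {"s1": [90, 10], "s2": [10, 90], "s3": [80, 20], "s4": [60, 40]}
district_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}

overall = mutual_information(counts)
between, within = decompose(counts, district_of)
print(overall, between, within)
```

The population shares used in the within term play the role of the weights, and the district-level values the role of the local indices, mentioned in the abstract.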
Keywords: atkinson, decomposability, multigroup, mutual information, race, relative diversity, Theil's H, schools, segregation
Universidad Carlos III de Madrid
Universidad Nacional de Educación a Distancia
|Board composition and airports' efficiency
Adequate management, supervision, and control are essential for
effective airport operations decisions.
The board structure (an internal mechanism of corporate governance) embeds a monitoring system (one- or two-tier) for decision-making processes according to the airports' needs and shareholders' best interests. However, other factors could implicitly enhance endogamy. Previous studies have demonstrated a positive relationship between board size, gender, and reporting quality, but the implications of board composition for aviation efficiency have not been established. We apply data envelopment analysis (DEA) to estimate the efficiency of 41 airports in 2019. We then use a second-stage truncated regression to explain efficiency in terms of board features (independence, size, and gender equality) and accounting, financing, and company characteristics. The results show that gender equality at the board level and board size significantly improve airports' efficiency. However, a two-tier system, that is, one with executive (internal) and nonexecutive (external) members, does not ensure appropriate managerial decisions and thus reduces airports' efficiency.
Keywords: corporate governance, board composition, gender equality, airports, efficiency, DEA, truncated regression
Poznan University of Economics and Business
Ane Elixabete Ripoll-Zarraga
Universitat Autònoma de Barcelona
|Open panel discussion with Stata developers
Contribute to the Stata community by sharing your feedback with StataCorp's developers. From feature improvements to bug fixes and new ways to analyze data, we want to hear how Stata can be made better for our users.