The 2017 Canadian Stata Users Group meeting was held on June 9 at the Bank of Canada in Ottawa. Although the meeting has passed, you can still view the program below.
8:00–8:30  Registration and breakfast
A simple, graphical approach to comparing multiple treatments
Abstract: I propose a graphical approach to comparing multiple treatments that allows users to easily infer differences between any treatment effect and zero and between any pair of treatment effects. The approach uses a flexible, resampling-based procedure that asymptotically controls the familywise error rate (the probability of making one or more spurious inferences). I demonstrate its usefulness with three empirical examples.
Causal inference with sample selection
Abstract: I discuss how to use the new extended regression model (ERM) commands to estimate average causal effects when the outcome is censored or when the sample is endogenously selected. I also discuss how to use these commands to estimate causal effects in the presence of endogenous explanatory variables, which they also accommodate.
The relationship between healthcare expenditure and income in Latin-American countries: A panel time-series analysis
Abstract: This presentation examines the link between healthcare expenditure and GDP in Latin American and Caribbean countries using the Stata command xtwest to estimate the error-correction-based cointegration tests for panel data featured in the Stata Journal (Persyn and Westerlund, 2008). Extending unit-root and cointegration tests to a panel-data approach allows investigation of the dynamics of developing countries, which tend to have shorter available time series. I employ several unit-root tests, implemented in xtunitroot, to determine whether the panel is stationary, and the Westerlund (2007) tests, implemented in xtwest, to analyze the relationship and dynamics between health expenditure (total, private, and public) and income. Cointegration tests for panel data have greater power than traditional tests because of increased degrees of freedom and the inclusion of heterogeneous cross-country information. Results suggest that all categories of health expenditure move to maintain a stable long-run equilibrium. This presentation is a case study of the use of Stata commands in economics. All empirical analysis was conducted in Stata, and I will highlight several recent developments in time-series techniques for panel data that have yet to be implemented in Stata; coding them would make these tools accessible to researchers.
Elisabet Rodriguez Llorian
University of Manitoba
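A minimal sketch of the workflow the abstract describes, combining the official xtunitroot command with the user-written xtwest (Persyn and Westerlund, 2008; available from the SSC archive). The variable names and option values below are placeholders, not those from the presentation:

```stata
* Hypothetical variables: lhexp = log health expenditure, lgdp = log GDP
ssc install xtwest
xtset country year

* Panel unit-root tests (official Stata)
xtunitroot ips lhexp, lags(aic 3)
xtunitroot llc lgdp

* Westerlund (2007) error-correction-based panel cointegration tests;
* bootstrap() requests bootstrapped p-values robust to cross-sectional dependence
xtwest lhexp lgdp, constant trend lags(1) leads(1) lrwindow(3) bootstrap(400)
```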
Assessing credit risk in the automated clearing settlement system
Abstract: In this presentation, I study a credit risk management scheme for the Canadian retail payment system designed to cover the exposure of a defaulting member. Using a generalized extreme value model, I estimate ex ante the size of a collateral pool large enough to cover exposure under a simulated historical worst-case default scenario and examine its performance over time in terms of stability and efficacy. I find that a statistically derived collateral pool outperforms a rule-based approach on several measures, such as variability and degree of coverage. However, despite relying on extreme-value theory, my model forecasts may still underestimate potential exposures, given the absence of observed data on defaults in Canada. To this end, I investigate the statistical properties of gross payment volumes as potential early-warning measures of balance-sheet capital flight. My results are also informative for risk management in faster payment schemes in Canada.
Estimating Dynamic Stochastic General Equilibrium models in Stata
Abstract: Dynamic stochastic general equilibrium (DSGE) models are used in macroeconomics for policy analysis and forecasting. A DSGE model consists of a system of equations derived from economic theory. Some of these equations may be forward looking, in that expectations of future values of variables matter for the values of variables today. Expectations are handled in an internally consistent way, known as rational expectation. I describe the new dsge command, which estimates the parameters of linear DSGE models. I outline a typical DSGE model, estimate its parameters, discuss how to interpret dsge output, and describe the command's postestimation features.
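To give a flavor of the syntax, here is the small New Keynesian model used in the introduction to Stata's [DSGE] manual (Stata's documentation example, not necessarily the model from the talk): each equation is written as economic theory suggests, with E(F.x) denoting the rational expectation of next period's value:

```stata
webuse usmacro2

* A small New Keynesian model: Phillips curve, Euler equation, Taylor rule,
* and AR(1) state equations for the two shocks
dsge (p = {beta}*E(F.p) + {kappa}*x)                ///
     (x = E(F.x) - (r - E(F.p) - g), unobserved)    ///
     (r = 1/{beta}*p + u)                           ///
     (F.u = {rhou}*u, state)                        ///
     (F.g = {rhog}*g, state)

* Postestimation: policy matrix and impulse responses
estat policy
irf set dsgeirfs
irf create nkmodel
irf graph irf, impulse(u) response(x p r)
```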
Lunch and Poster Session
The multiway cluster wild bootstrap
Abstract: Many datasets involve observations that are grouped, or clustered, often in several dimensions. While robust inference with one-way or multiway clustering is possible when the number of clusters is large, reliable inference with few clusters and multiway clustering has proved challenging. We propose a bootstrap method that considerably improves inference.
The Vuong test for nonnested models
Abstract: Comparing nonnested models is challenging, especially with duration data, where few methods are available. This presentation discusses a transformed Vuong test that is applicable to models for duration data, in particular to hazard models that are not directly comparable. The proposed method adapts Vuong's likelihood-ratio closeness test to compare parametric, semiparametric, and discrete hazard models. The test is implemented in Stata and performs well at ranking different nonnested hazard models. An empirical example compares the performance of different hazard models used to model the duration of shipbuilding firms. The results suggest that models that capture unobserved heterogeneity through finite mixtures outperform hazard models that use other parameterizations of unobserved heterogeneity.
Bitcoin awareness and usage in Canada
Abstract: There has been tremendous discussion of Bitcoin, digital currencies, and fintech, but there is limited empirical evidence on Bitcoin's adoption and usage. We propose a methodology to collect a nationally representative sample via the Bitcoin Omnibus Survey (BTCOS) to track the ubiquity and usage of Bitcoin in Canada. We find that about 64 percent of Canadians have heard of Bitcoin, but only 4 percent own it. Awareness of Bitcoin was strongly associated with men and with those holding a college or university education; it was also more concentrated among unemployed individuals. Bitcoin ownership, on the other hand, was associated with younger age groups and a high school education. Furthermore, we construct a test of Bitcoin characteristics to gauge respondents' perceived versus actual knowledge. We find that knowledge is positively correlated with Bitcoin adoption. Based on the survey responses and data, we offer suggestions for improving future digital currency surveys, in particular for achieving precise estimates from the hard-to-reach population of digital currency users through social network analysis.
Bank of Canada
Is cash the fastest way to pay?
Bank of Canada
Money pump and the gas pump: Revealed preference violations in nonenergy consumption and gasoline prices
Abstract: Using 11 years of monthly scanner consumption data for a panel of U.S. households in two municipalities, we document that households shift their consumption basket in response to gasoline prices and that these shifts are associated with exploitable revealed preference violations. Following Echenique, Lee, and Shum (2011) and Cherchye, De Rock, Smeulders, and Spieksma (2012), we construct a money pump cost of the household's violation of the axiom of revealed preference. The money pump cost is the dollar value that a seller could extract from the household through its revealed preference violations. We show that money pump costs are affected by gasoline prices, household search effort (as proxied by the number of shopping trips), and environmental health factors. We find that a $1.00 increase in the price per gallon of gasoline increases the average amount that could be exploited from a household by $0.85–$3.27 from a basket of goods worth $50.00. Quantile regressions reveal that some households may be affected nearly twice as much. The results indicate that gasoline prices may have heterogeneous wealth and welfare effects.
Bank of Canada
The effects of interaction between location of birth and location of study on immigrant workers' wages in Canada
Abstract: Previous studies have suggested that the wage gap between immigrants and the native-born can be accounted for by human capital factors, including education and work experience and, more importantly, where they are acquired. However, current Canadian economic immigration policies consider neither a potential immigrant's location of birth nor location of study. In this paper, we study the effects of the interaction between a worker's location of birth and location of study on his or her wage with data from the 2011 National Household Survey. Using both OLS and median (LAD) regression, performed in Stata, we show that (1) the location of birth is not generally indicative of a worker's earning potential; (2) without the interactions, all foreign degrees lead to a lower wage compared with Canadian peers, with a U.S. degree being the least punitive; (3) a U.S. degree would lead to a wage premium for workers from some countries; and (4) when a worker from a nontraditional foreign-student source country receives a degree in a culturally and geographically distant location, there is a significant wage premium.
qfactor: A new Stata program for Q-methodology analysis
Abstract: Q-methodology is a research method in which qualitative data are analyzed using quantitative techniques. It has the strengths of both qualitative and quantitative methods and can be regarded as a bridge between these two approaches. Q-methodology can be used in any field of research where the outcome variable involves assessment of subjectivity, including attitudes, perceptions, feelings, and values; life experiences such as stress and quality of life; and individual concerns such as self-esteem, body image, and satisfaction. Although it was introduced by William Stephenson in 1935, it has recently emerged as a more widely used method, mainly because of advances in its statistical analysis component. In Q-methodology, an inverted factor analysis is used to identify salient viewpoints, as well as commonly shared views on subjective issues, thereby providing unique insights into the richness of human subjectivity. Only a limited number of programs offer Q-methodology analysis. Although there are many Stata users in the Q-methodology community, there is no program for Q-methodology in Stata. In this presentation, I will introduce a new program, qfactor, written in Stata, that not only includes features common to other programs but also adds many new technical features.
Variance estimation for survey-weighted data using bootstrap resampling methods: The 2013 Methods-of-Payment survey
Abstract: Sampling units for the 2013 Methods-of-Payment survey were selected through an approximate stratified random sampling design. To compensate for nonresponse and noncoverage, the observations are weighted through a raking procedure. Variance estimation for the weighted estimates must take into account both the sampling design and the raking procedure. I propose using bootstrap resampling methods to estimate the variance. I find that the variance is smaller when estimated through the bootstrap resampling method with Stata's ipfraking than through Stata's linearization method; the latter does not take into account the correlation between the variables used for weighting and the outcome variable of interest.
Bank of Canada
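Once bootstrap replicate weights have been generated (for example, by re-raking each bootstrap resample), Stata's official svyset command can be pointed at them so that all svy estimates use bootstrap variance estimation. A minimal sketch; the weight and outcome variable names are hypothetical, not from the survey:

```stata
* finalwt: raked survey weight; bw1-bw500: bootstrap replicate weights,
* each obtained by rerunning the raking within a bootstrap resample
svyset [pweight=finalwt], bsrweight(bw1-bw500) vce(bootstrap)

* Variance estimates now reflect both the sampling design and the raking step
svy: mean usescash
svy: proportion paymethod
```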
Introduction to Bayesian Analysis Using Stata
Abstract: Bayesian analysis has become a popular tool for many statistical applications. Yet many statisticians have little training in the theory of Bayesian analysis and the software used to fit Bayesian models. This talk will provide an intuitive introduction to the concepts of Bayesian analysis and demonstrate how to fit Bayesian models using Stata. No prior knowledge of Bayesian analysis is necessary. Specific topics will include the relationship between likelihood functions and prior and posterior distributions, Markov chain Monte Carlo (MCMC) sampling using the Metropolis–Hastings algorithm, and how to use Stata's graphical user interface and command syntax to fit Bayesian models.
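For a taste of the command syntax, a minimal normal linear regression fit with bayesmh might look like the following; the priors here are illustrative, not ones recommended in the talk:

```stata
sysuse auto, clear

* Normal linear model with weakly informative priors; posterior draws
* come from an adaptive Metropolis-Hastings sampler
bayesmh mpg weight, likelihood(normal({var}))      ///
    prior({mpg:}, normal(0, 100))                  ///
    prior({var}, igamma(0.01, 0.01))               ///
    rseed(17)

* Inspect MCMC convergence diagnostics (trace, autocorrelation, density)
bayesgraph diagnostics _all
```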
Simulation-based robust IV inference for lifetime data
Abstract: Endogeneity or unmeasured confounding is a nontrivial complication in duration data models, for which there are relatively few existing methods. I develop two related but methodologically distinct identification-robust instrumental variable estimators to address the complications of endogeneity in an accelerated life regression model. The two methods generalize the Anderson–Rubin statistic to (1) lifetime data distributions in the case of the least-squares estimator and (2) distribution-free censored models in the case of the rank estimator. Valid confidence sets, based on inverting the pivotal least-squares statistic and the linear rank statistic, form the basis for identification-robust inference using the Mata programming language via exact simulation-based methods. The finite-sample performance of the proposed statistics is evaluated using the built-in features of Stata combined with original Mata code. I provide an empirical analysis, using an original prospectively collected clinical patient dataset, in which the trauma status of a pediatric critical care patient instruments for a possibly confounded illness severity index in a length-of-stay regression for a specific pediatric intensive care population. Results suggest a clinically relevant bias correction for routinely collected patient risk indices that is meaningful for informing policy in the healthcare setting.
Perils, challenges, and triumphs: Integrating Stata and syntax into an undergraduate social statistics course
Abstract: Undergraduate social statistics classes are some of the most challenging to teach. One of the challenges that an instructor faces is how (or even if) to incorporate statistical software into the course. In a recent class of 75 sociology undergraduates, the course design included the use of syntax writing in Stata as a basic learning objective and wove the use and learning of syntax throughout the lectures and in all labs and a final research project. All 75 students were required to bring a laptop to class or to sign out a laptop, and labs were integrated directly with the lecture and not held in separate computer labs. This approach had many perils and challenges but also had some major triumphs. This presentation will outline the basic approach, discuss lessons learned, and suggest how this approach may be successfully used in other classroom contexts.
Wishes and grumbles
Workshop: Implementing an estimation command in Stata/Mata
David Drukker, StataCorp LLC
Writing a Stata command for methods that you use or develop disseminates your research to a huge audience. This workshop shows how to write a Stata estimation command. No Stata or Mata programming experience is required, but it does help. After providing an introduction to basic Stata do-file programming, the workshop covers basic and advanced ado-file programming. Next, it provides an introduction to Mata, the byte-compiled matrix language that is part of Stata. Then, the workshop shows how to implement linear and nonlinear statistical methods in Stata/Mata programs. The workshop discusses using Monte Carlo simulations to test the implementation.
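As a taste of what the workshop builds toward, here is a minimal eclass command that posts its results so standard postestimation tools can see them. This is a simplified sketch of the general pattern, not the workshop's actual code:

```stata
*! mymean: minimal estimation-style command that posts a mean and its variance
program define mymean, eclass
    version 14
    syntax varname [if] [in]
    marksample touse
    quietly summarize `varlist' if `touse'
    tempname b V
    matrix `b' = r(mean)            // 1 x 1 coefficient vector
    matrix `V' = r(Var)/r(N)        // variance of the sample mean
    matrix colnames `b' = `varlist'
    matrix colnames `V' = `varlist'
    matrix rownames `V' = `varlist'
    ereturn post `b' `V', esample(`touse')
    ereturn local cmd "mymean"
    ereturn display                 // standard coefficient table
end
```

After running this once, e(b), e(V), and e(sample) are available, so commands such as `test` work as they would after any built-in estimator.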
Registration and accommodations
Meeting fees (prices in USD)
Dinner, June 9 at 6:30 p.m. (optional)
239 Nepean Street
Alt Hotel Ottawa
185 Slater Street
Ottawa, Ontario K1P 0C8
Bank of Canada
Museum and Conference Centre
30 Bank Street
Ottawa, Ontario, Canada
Kim Huynh (Chair)
Senior Research Advisor
Bank of Canada
Calgary Statistical Support