
Last updated: 13 November 2009

2009 Canada Stata Users Group meeting

Thursday, 22 October 2009


Pantages Hotel
200 Victoria Street
Toronto, ON M5B 1V8

Proceedings


Reflections on how Stata enhances creativity and problem solving in the real world

Lee Sieswerda
Thunder Bay District Health Unit
Many people think of science and statistics as dry and even lifeless endeavors. In fact, the wellspring of good science and statistics is creativity, and creativity is enhanced by memory, imagination, beauty, and collaboration. I will discuss some of the Stata features that I believe work together to stimulate the creative impulse, including a unified interface design and syntax structure, the “type a little, get a little” paradigm, a very large suite of statistical procedures, expandability, excellent documentation, automation, beautiful and flexible graphics, and inter-operability with other statistical packages.

Additional information
ca09_sieswerda.ppt

Automating the production of descriptive tables at Statistics Canada: mog.ado, a user-written program

Matt Hurst
Statistics Canada
Research at Canadian Social Trends within Statistics Canada, Canada’s premier statistical agency, often involves the creation and analysis of numerous descriptive tables. These tables provide convenient and easy-to-understand information for the general public, one of our many clients. Analysis generally requires an understanding of which estimates are statistically different from each other. Statistics Canada’s quality control measures require that any released estimates pass reliability and confidentiality standards. Both of these needs are often operationalized by numerous lines of Stata code after the use of a command, such as mean. This presentation is about a user-designed program, mog, that is essentially a front-end for the mean and test commands. It produces a fixed-width table of means over the groups specified. This table can then be easily copied into other productivity tools (Word, Excel, Open Office Apps, etc.) for any additional formatting and publication. The key points are that the results are tabular and copy properly as a table, that significance tests of estimates versus a reference group are already performed and indicated, and that quality control symbols indicating minimum sample size and individual significance are shown. I plan to show how much code the old approach requires, and thus the time saved by using the command, as well as the many options the command offers.
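To make the idea of such a table concrete, here is a minimal Python sketch of a fixed-width table of group means with a comparison against a reference group and a small-sample flag. It is not the mog program and does not reproduce its output or options: the function name `group_table`, the flag symbols `*` and `E`, the `min_n` threshold, and the simple unweighted two-sample z comparison (swapped in for Stata's mean and test commands) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def group_table(groups, reference, min_n=30, z_crit=1.96):
    """Build fixed-width table lines: group name, n, mean, and flags.

    Hypothetical flags: '*' marks a mean that differs from the reference
    group (unweighted z comparison), 'E' marks a group smaller than min_n.
    """
    # Per-group sample size, mean, and standard error of the mean.
    stats = {
        name: (len(vals), mean(vals), stdev(vals) / sqrt(len(vals)))
        for name, vals in groups.items()
    }
    _, ref_mean, ref_se = stats[reference]

    lines = [f"{'group':<10}{'n':>5}{'mean':>10}  flags"]
    for name, (n, m, se) in stats.items():
        flags = ""
        if name != reference:
            z = (m - ref_mean) / sqrt(se ** 2 + ref_se ** 2)
            if abs(z) > z_crit:
                flags += "*"
        if n < min_n:
            flags += "E"
        lines.append(f"{name:<10}{n:>5}{m:>10.2f}  {flags}")
    return lines


# Made-up data, for illustration only.
example = {
    "ref": [1, 2, 3, 4, 5],
    "high": [11, 12, 13, 14, 15],
    "small": [2, 3, 4],
}
table = group_table(example, reference="ref", min_n=5)
print("\n".join(table))
```

Because every row has the same column widths, the printed text can be pasted into a word processor or spreadsheet and split into columns. A real survey application would also need to use the survey weights and design, which this sketch ignores.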

Additional information
ca09_hurst.ppt

Using Stata graphs to visually monitor the progress of multicenter randomized clinical trials

Glenn Jones
McMaster University
Alexandra Whate
University of Guelph
Randomized medical trials testing treatments are a complex technology, and they require regular attention to assure data quality prior to definitive analysis. Typically, only simple non-graphical methods (tables, proportions) help monitor trial progress, apart from a single graph of cumulative accrual over calendar time; surprisingly, graphical methods are largely ignored for this purpose. For six multicenter trials of the International Atomic Energy Agency, we have developed a graphical approach to data management and trial monitoring, using histograms, scatterplots, dot plots, and cumulative distributions as indicators of overall study and investigator-specific quality. Monthly reports are automated (do-files) and are sent as slideshows by email to investigators and the International Atomic Energy Agency staff. Visual patterns and shapes of curves facilitate early and rapid identification of issues. Clear pictures help investigators to better adhere to a protocol and improve accuracy and completeness of trial data. Visual methods assist in tracking patients, submitting forms, and clarifying data. Clinical investigators find graphs to be far more intuitive, engaging, efficient, meaningful, and compelling than conventional tables and text (especially in developing countries where statistical training and language barriers may interfere). This presentation will demonstrate our visual approach to trial management and explore how it may be optimized.

Additional information
ca09_whate.ppt

Using and teaching Stata in emergency medicine research rotation

Muhammad Waseem
Lincoln Medical & Mental Health Center
Participation in scholarly activities is a requirement in the Emergency Medicine (EM) residency curriculum. A research project is a necessity for graduation for EM residents. To fulfill this requirement, EM residents have a mandatory research rotation. During this rotation, residents learn basic research designs, write protocols for the IRB, and collect data. In addition, they are required to understand basic statistical concepts before the data are analyzed. I believe that their understanding will be enhanced if they are provided with the basic knowledge of a statistical program. During the EM research rotation, residents are introduced to Stata and research methods. I developed a manual explaining the basic operation of Stata, which includes, but is not restricted to, the following: pull-down menus (rather than commands), 4 windows, 9 tabs, basic commands with pull-down menus, description and summarization of data, tables of frequencies, tables of means, data input, data output, data import, saving files, graph commands with dialog boxes, box plots, histograms, and scatterplots. In my experience, the introduction to Stata facilitated accurate data recording. It also provided residents with the experience necessary to navigate Stata following the completion of the research rotation.

Additional information
ca09_waseem.ppt

Teaching Stata—Some reflections after 8 years of training experiences

Karen Robson
York University
This presentation focuses on the author’s 8 years of experience teaching Stata to international audiences, primarily at the Essex Summer School in Social Science Data Analysis and Collection in the United Kingdom, but also in the World Bank-funded statistical capacity-building initiatives in Bosnia-Herzegovina and Albania. The author has recently co-authored (with David Pevalin) The Stata Survival Manual, published by Open University/McGraw Hill. The author will focus on common student questions and some approaches she has used to assist students in learning the software.

Additional information
ca09_robson.ppt

Teaching Stata and statistics in contexts of evidence-based medicine and clinical trials

Glenn Jones
McMaster University
Alexandra Whate
University of Guelph
International experiences with students (high school, medical) and clinical investigators (courses, trial meetings) demonstrate that Stata is highly visual, intuitive, and relatively straightforward. Stata helps the teacher communicate efficiently and effectively about methods and concepts relating to data management, statistics, reporting, the nature of evidence and causality, and the technology of trials. For example, core aspects of medical research (randomized trials, survival plots) do not require sophisticated modeling methods and are essential (i.e., repeatedly used to answer different questions). A subset of Stata components aligns with non-Stata course content to constitute a "basic curriculum" for individuals without much statistical training or research experience. Hands-on use of Stata (e.g., individual laptops) using a small set of concocted databases with highly relevant questions may be matched in real time to a presentation of course content. Stata quickly becomes an easy "add-on" to an organized presentation of course content. Consistent with educational psychology, the combination of didactic presentation and dynamic (i.e., Stata) interactions more effectively engages learners and improves learning and retention. Learners simultaneously pick up Stata as a skill. Theoretical and practical features of this teaching approach, relevant to learners ranging from elementary school students to medical professionals and clinical investigators, will be described and demonstrated.

Additional information
ca09_jones.pptx

Survey data analysis in Stata

Jeff Pitblado
StataCorp
In this presentation, I cover how to use Stata for survey data analysis assuming a fixed population. We will begin by reviewing the sampling methods used to collect survey data, and how they affect the estimation of totals, ratios, and regression coefficients. We will then cover the three variance estimators implemented in Stata’s survey estimation commands. Strata with a single sampling unit, certainty sampling units, subpopulation estimation, and poststratification will also be covered in some detail.

Additional information
ca09_pitblado_presentation.pdf
ca09_pitblado_handout.pdf
ca09_pitblado_stata.zip

Data cleaning in Stata using Internet search engines

Sergiy Radyakin
The World Bank
Open-ended questions can be a nightmare for statistical processing. Any mistake in spelling can result in a mismatch during merging, or multiple counting of the same object. For example, the answers to the "place-of-birth" question might be "Chicago" and "San Francisco", but in practice they are often "Chicaga" and "SanFrancisko". Manual correction of hundreds of answers is tedious, and becomes infeasible with a larger dataset. For a long time, algorithms like SOUNDEX remained the only alternative for researchers. A new Stata command allows taking advantage of Internet search engines, like Google or Yahoo, to find proper substitutes for an unclear word or multiple words. The distinctive feature of the search engines is that they rely not only on spelling similarity but are also context driven: other words may affect the suggestion, such as including "city" in the query. This will hint to the search engine to give more priority to the names of cities. This presentation will demonstrate this new command and explain the main steps necessary to programmatically acquire information available on the Internet and convert it into Stata-usable format. Keywords: data cleaning, search engine, spelling correction.
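For readers unfamiliar with the SOUNDEX approach mentioned above, the following is a minimal Python sketch of the standard American Soundex code. It is included only to illustrate the spelling-based baseline, not the new command from the presentation; the function name `soundex` and its handling of non-letters (they are simply dropped) are choices made for this sketch.

```python
# Letter-to-digit table of the standard American Soundex code.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of word ('' if it has no letters)."""
    # Keep only the letters A-Z; spaces and punctuation are dropped.
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


# The misspellings from the abstract receive the same code as the
# correct names, so they can be matched on the code.
print(soundex("Chicago"), soundex("Chicaga"))
print(soundex("San Francisco"), soundex("SanFrancisko"))
```

The code depends only on the letters of the word itself. An error in the first letter (for example, the hypothetical "Shicago") changes the code, and no surrounding context such as "city" can influence the match, which is the limitation that a context-driven search engine suggestion is meant to address.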

Additional information
ca09_radyakin.pdf
ca09_radyakin.wmv

Using Stata with Statistics Canada data: Incorporating complex survey design into analysis

Leslie-Anne Keown
Statistics Canada
Most Statistics Canada data are based on surveys with complex survey designs. To allow users to account for the survey design in their analyses, Statistics Canada generally provides both a probability weight and a set of survey bootstrap weights in the survey data files. This presentation will give an overview of how to use the survey commands in Stata to account for the complex survey design using the weights and bootstrap weights provided. It will also give some practical advice on using these elements with various surveys and some of the pitfalls to avoid.
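The role of the bootstrap weights can be sketched in a few lines of Python. This is a conceptual illustration only, not Stata syntax and not Statistics Canada's documented procedure: the function names, the tiny made-up dataset, and the choice of the mean-squared-error form (squared deviations of the replicate estimates around the full-sample estimate, averaged over the replicates) are assumptions for this sketch.

```python
def weighted_mean(values, weights):
    """Weighted mean of values using the given weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_se(values, main_weights, replicate_weights):
    """Point estimate from the main weight and its bootstrap standard error.

    The estimate is recomputed once per set of bootstrap weights, and the
    variance is the average squared deviation of those replicate estimates
    from the full-sample estimate.
    """
    theta = weighted_mean(values, main_weights)
    replicates = [weighted_mean(values, w) for w in replicate_weights]
    variance = sum((r - theta) ** 2 for r in replicates) / len(replicates)
    return theta, variance ** 0.5


# Made-up data: four respondents, one main weight, two bootstrap weights.
values = [10, 20, 30, 40]
main_weights = [1, 1, 1, 1]
replicate_weights = [
    [2, 0, 1, 1],
    [0, 2, 1, 1],
]
estimate, se = bootstrap_se(values, main_weights, replicate_weights)
print(estimate, se)
```

Real survey files typically supply hundreds of bootstrap weight variables, and some surveys require an additional adjustment factor in the variance formula, so the survey's own documentation should be checked before relying on this form.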

Additional information
ca09_keown.pptx

Report to users

Jeff Pitblado
StataCorp
Presentation
ca09_rtu_pitblado.pdf