



Re: st: FW: help on variable selection problem


From   Ronan Conroy <rconroy@rcsi.ie>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: FW: help on variable selection problem
Date   Mon, 13 Jun 2011 14:06:12 +0100

On 10 June 2011, at 20:40, Lachenbruch, Peter wrote:

> A student is trying to analyze data from a national survey (no weights needed).  She has 26 variables plus 10 years of data.  There are about 1,000,000 observations.  With this many observations, everything is significantly different from 0.  She's using mlogit (predicting medical care expenses), so she'd like to cut down the number of 'important' predictors.  I have thought of several options: backward stepwise  (not available with mlogit); look at effect size and insist it be larger than 0.05 - again not available since there are four categories of the response variable; use a Bonferroni inequality on the coefficients and insist on a low p-value to begin with - e.g. try for a size of 0.01 adjusting for 25 tests, so p must be less than 0.0004.  The issue seems to be the huge sample size pushing everything to significance.
> Does anybody have any ideas?

What is the underlying question? Is it to locate the most important modifiable predictors? To look for determinants of unusual levels of expenditure (in individuals or in demographic sectors of the population)? To partition the population based on a parsimonious set of splits? Is it to inform health service planning? 

Without a practical research question driving the model, you will indeed end up with a lot of predictors that are statistically significant but which have yet to prove their importance.
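
To make the gap between significance and importance concrete, here is a minimal sketch in Python. It is not the student's mlogit model: it assumes a simple two-group comparison with 500,000 observations per group and a standardized mean difference of 0.01, both hypothetical figures chosen only for illustration. The Bonferroni threshold is the one given in the question (0.01 across 25 tests).

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def z_for_standardized_difference(d, n_per_group):
    """z statistic for a difference of d standard deviations between
    two equal-sized groups (large-sample two-sample z-test)."""
    return d / math.sqrt(2.0 / n_per_group)

# Threshold from the question: family-wise size 0.01 over 25 tests.
alpha_family = 0.01
n_tests = 25
bonferroni_threshold = alpha_family / n_tests  # 0.0004

# Hypothetical example: a difference of one hundredth of a standard
# deviation, with the 1,000,000 observations split into two equal groups.
d = 0.01
n_per_group = 500_000

z = z_for_standardized_difference(d, n_per_group)
p = two_sided_p_from_z(z)

print(f"Bonferroni threshold: {bonferroni_threshold:.4f}")
print(f"z = {z:.2f}, two-sided p = {p:.2e}")
print("passes Bonferroni:", p < bonferroni_threshold)
```

Under these assumptions, a difference that few people would call practically important still clears the adjusted threshold, which is why a stricter p-value alone will not pick out the predictors that matter.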




Ronán Conroy
rconroy@rcsi.ie
Associate Professor
Division of Population Health Sciences
Royal College of Surgeons in Ireland
Beaux Lane House
Dublin 2



