
Re: st: RE: probit model sample size


From   Steve Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: probit model sample size
Date   Fri, 2 Jul 2010 23:18:04 -0400

Now I don't know what you did:
1. Have you run a model with all 20 variables in at one time?
2. Did you do some kind of selection process to narrow the variables
down to seven "significant" ones? What was it?
3. Most important, what is the purpose of your analysis: exploration?
prediction? to study the association of your outcome with a small
number of primary variables, with the others of secondary interest?
to investigate specific hypotheses?

For certain:
1) If your analysis had all 20 variables in the model at the same time,
even at the start, then you must start again with a reduced set.
2) If you employed a selection process like "keep only significant
variables" or "stepwise", the p-values of the significant variables
are not believable.
3) Your exact strategy will depend on the purpose of your analysis.
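
To make points 1) and 2) concrete, here is a rough sketch in Python (not Stata, and not part of the original exchange). It assumes a rule of thumb of about 10 of the rarer outcome per predictor, which is one common reading of the rules of thumb quoted further down. The split of 90 events and 210 non-events is made up for illustration, since the thread gives only n = 300 and 20 predictors. The last function assumes 20 independent tests at the 5% level with no real effects, which is a simplification.

```python
def events_per_variable(n_events, n_nonevents, n_predictors):
    """Smaller outcome group divided by the number of candidate predictors."""
    return min(n_events, n_nonevents) / n_predictors


def max_predictors(n_events, n_nonevents, per_predictor=10):
    """Largest predictor count allowed under an assumed per-predictor rule."""
    return min(n_events, n_nonevents) // per_predictor


def chance_of_false_positive(n_tests, alpha=0.05):
    """P(at least one p < alpha) if no predictor matters and tests are independent."""
    return 1 - (1 - alpha) ** n_tests


# Hypothetical split of the 300 observations: 90 events, 210 non-events.
epv = events_per_variable(90, 210, 20)
print(f"events per variable: {epv:.1f}")
print(f"predictors supported at 10 per variable: {max_predictors(90, 210)}")
print(f"chance of a spurious 'significant' variable: {chance_of_false_positive(20):.2f}")
```

With the actual counts of events and non-events in place of the made-up ones, the first two numbers give a quick sense of how far a 20-variable model is from the rule of thumb. The third shows why "keep only the significant variables" can produce significant-looking results from noise alone.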

References:
For overfitting (with some reservations):
http://www.psychosomaticmedicine.org/cgi/content/short/66/3/411?rss=1&ssource=mfc

For general modeling & validation strategies: Frank Harrell's book
Regression Modeling Strategies: With Applications  (Springer)

 Steve


On Fri, Jul 2, 2010 at 10:30 PM, dk <statad27@googlemail.com> wrote:
> Thanks Steve,
> I mean that I found only 7 significant explanatory variables, but when
> running the probit model I used 20 explanatory variables with a sample
> size of 300. Can I go forward, or should I use fewer explanatory
> variables?
>
> Waiting for your reply,
>
>
> On Sat, Jul 3, 2010 at 2:36 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
>> For logit models, the rules of thumb apply not to the number of
>> observations, but to the smaller of the number of events and
>> non-events. If that applies to probit models (and I think it does),
>> then you have  already overfit your data  and must  reduce the number
>> of predictors.
>>
>> Steve
>>
>>
>> --
>> Steven Samuels
>> sjsamuels@gmail.com
>> 18 Cantine's Island
>> Saugerties NY 12477
>> USA
>> Voice: 845-246-0774
>> Fax:    206-202-4783
>>
>>
>>
>> On Fri, Jul 2, 2010 at 6:25 PM, dk <statad27@googlemail.com> wrote:
>>> Thanks Tony,
>>>
>>> You mean I can use the probit model with a sample size of 300 and 20
>>> explanatory variables, but I have to interpret the results carefully?
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jul 2, 2010 at 11:30 PM, Lachenbruch, Peter
>>> <Peter.Lachenbruch@oregonstate.edu> wrote:
>>>> You generally need about 8 to 10 times as many observations as you have variables.  You may be overfitting your data.  Proceed with caution.
>>>>
>>>> Tony
>>>>
>>>> Peter A. Lachenbruch
>>>> Department of Public Health
>>>> Oregon State University
>>>> Corvallis, OR 97330
>>>> Phone: 541-737-3832
>>>> FAX: 541-737-4001
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of dk
>>>> Sent: Friday, July 02, 2010 1:53 PM
>>>> To: statalist@hsphsun2.harvard.edu
>>>> Subject: st: probit model sample size
>>>>
>>>> I want to know about the sample size for the probit model. I have a
>>>> sample size of 300 and am using 20 explanatory variables. After running
>>>> the model, I carried out various goodness-of-fit tests, and the results
>>>> show that the model fits the data.
>>>>
>>>> Should I use the probit model or not?
>>>>
>>>> Thanks in advance.
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/statalist/faq
>>>> *   http://www.ats.ucla.edu/stat/stata/



