Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Maarten buis <maartenbuis@yahoo.co.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: gologit2 model
Date: Wed, 13 Oct 2010 21:26:24 +0100 (BST)

--- On Wed, 13/10/10, Nilam Prasai wrote:
> Furthermore, what would be the good sample size to run
> maximum likelihood estimates

That depends on a lot of things, to name two: the number of parameters and the variance in your dependent variable, i.e. how much information you are trying to extract from the data, and how much information is present in the data.

-gologit2- in its default formulation tries to estimate a lot of parameters. Say your dependent variable contains 5 categories; then you are trying to estimate 4 equations, that is, 4 parameters for every explanatory variable in your model plus 4 constants. Say you have 4 explanatory variables; then you are estimating 4*4 + 4 = 20 parameters. An absolute minimum would be 10 observations per parameter, and often you need a lot more, so in that case we would require a minimum of 200 observations. I don't think that would be enough to trust the standard errors, but in very well behaved data this may be enough to get reasonable point estimates. For the standard errors to be correct, the asymptotics need to start kicking in. I would not be surprised if that required 100 observations per parameter or more, leading to a minimum sample size of 2000.

A common problem with models like -gologit2- and -mlogit- is categories of the dependent variable that contain few observations. This is a variation on low variance in the dependent variable. In that case you probably do not have enough information to estimate the parameters of the corresponding equation.

Anyhow, the real minimum number of observations depends on all the details of your model and data, and the best way of getting an idea about that is to run a simulation. Below is one such simulation. You just need to replace the example data with your data, and adjust the two -gologit2- models and the references to the parameter of interest, in my case the parameter of male in the first equation. Then play around with the sample size to see when trouble starts occurring (this is the number after -sim- in the -simulate- command).

The example below also requires Ian White's -simsum-, which you can download by typing -ssc install simsum- and which is described in:

Ian White (2010) "simsum: Analyses of simulation studies including Monte Carlo error". The Stata Journal, 10(3): 369--385.
<http://www.stata-journal.com/article.html?article=st0200>

*----------------------- begin example ------------------
* store the example data locally so each replication can reload it
use http://www.indiana.edu/~jslsoc/stata/spex_data/ordwarm2.dta, clear
save c:/temp/dat, replace

* fit the model on the full data; treat this estimate as the "true" value
gologit2 warm yr89 male white age ed prst,
tempname true
scalar `true' = [#1]_b[male]

* one replication: bootstrap sample of size N, refit, return b and se
program drop _all
program define sim, rclass
    args N
    use c:/temp/dat, clear
    bsample `N'
    gologit2 warm yr89 male white age ed prst,
    return scalar b = [#1]_b[male]
    return scalar se = [#1]_se[male]
end

* 1000 replications with sample size 500; change 500 to try other sizes
simulate b=r(b) se=r(se), reps(1000): sim 500
simsum b, se(se) true(`true') mcse
*----------------------- end example -----------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**: **Re: st: gologit2 model** *From:* Nilam Prasai <nilamprasai@gmail.com>
