Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Problem with looping when executing command


From   Maarten Buis <[email protected]>
To   [email protected]
Subject   Re: st: Problem with looping when executing command
Date   Fri, 5 Oct 2012 10:44:15 +0200

On Fri, Oct 5, 2012 at 3:00 AM, Kim, Isok wrote:
> I'm having trouble with running exactly same syntax with different subgroups. Analysis on one group runs just fine, but when I execute same syntax with another group, it just keeps looping without doing anything. I never encountered this in Stata.  I don't know if it'll be helpful but I'm pasting one set of command lines that works and another that doesn't:
>
> /*nbreg: cardio overall*/
> set more off
> svy, subpop(subsdh): nbreg cardio AGE i.female i.married i.ed3cat ib3.as4cat i.generation ///
>                                                                   i.poverty i.no_ins discr i.ELP srph bias, irr
>
> /*nbreg: cardio Vietnamese*/
> set more off
> svy, subpop(subsdh if as4cat==1): nbreg cardio AGE i.female i.married i.ed3cat i.generation ///
>                                                                    i.poverty i.no_ins discr i.ELP srph bias, irr

Finding the coefficients in a model like -nbreg- is complicated. It
requires an iterative process: Stata starts at a set of parameter
values, then looks around in the immediate vicinity to see in which
direction it might find better values. Once there it looks around
again, and so on, until it cannot find a better set of parameter
values. -nbreg- is a model that is often inappropriate for the data at
hand, in which case it becomes very hard or impossible to find the
best set of parameter values, and Stata will just continue iterating.
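
To see whether this is what is happening, one option is to cap the
number of iterations so that the command stops instead of running
indefinitely. The following is only a sketch: the limit of 50 is an
arbitrary choice, and it assumes that -e(converged)- is stored after
-svy: nbreg-, as it is for Stata's other maximum likelihood
estimators.

```stata
* Same model as the one that does not finish, but stop after at most
* 50 iterations (50 is an arbitrary, illustrative limit)
svy, subpop(subsdh if as4cat==1): nbreg cardio AGE i.female i.married ///
    i.ed3cat i.generation i.poverty i.no_ins discr i.ELP srph bias,   ///
    irr iterate(50)

* Expected to show 1 if the model converged and 0 if it stopped
* because the iteration limit was reached
display e(converged)
```

If the second command shows 0, the reported estimates come from a
model that has not converged and should not be interpreted.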

Before you give up on -nbreg- I would suggest you look at the variable
AGE first. A key feature of the -nbreg- model is that it contains a
random model for the constant. The constant is the expected value of
cardio when all explanatory/right-hand-side/independent/x-variables
are 0. So you can imagine that this model becomes easier to estimate
when the constant refers to a person within the range of the data. If
you measured age as years since birth, then the constant refers to a
newborn baby. With a dependent variable called cardio my guess would
be that such a person is not in your data. Instead I would create and
use a new variable, something like -gen age2 = AGE-50-, so that the
constant refers to a 50-year-old person. Which age you use obviously
depends on the problem and the data; just choose a meaningful value
within the range of the data. I also find that it often makes more
sense to look at age in decades rather than years, i.e. compare
persons who are 10 years apart rather than 1 year. In that case your
new variable would be -gen age2 = (AGE-50)/10-. I would take a similar
look at the other continuous variables (srph, discr and bias).
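
A minimal sketch of that recoding, using the variable names from the
original question. The centering value of 50 is only an illustration;
the -summarize- line is there to help pick a value that actually lies
within the range of AGE in the subgroup.

```stata
* Check the range of AGE in the subgroup before choosing a centering value
summarize AGE if subsdh == 1 & as4cat == 1

* Age centered at 50 (illustrative value) and measured in decades
generate age2 = (AGE - 50)/10

* Re-estimate the model with the rescaled age variable
svy, subpop(subsdh if as4cat==1): nbreg cardio age2 i.female i.married ///
    i.ed3cat i.generation i.poverty i.no_ins discr i.ELP srph bias, irr
```

With this coding the incidence-rate ratio for age2 compares persons
who are 10 years apart.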

If the models now converge you can start looking at whether the effect
of age is linear. If you cover a large range of ages, then the effect
of age is unlikely to be linear. My favorite way of adding
non-linearity is adding splines, see -help mkspline-.
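
As one possible sketch, here is a restricted cubic spline in the
rescaled age variable. The choice of a cubic spline with 4 knots and
the stub name agesp are assumptions made for this example; a linear
spline with knots at substantively meaningful ages is an equally
reasonable alternative.

```stata
* Create age2 as described above if it does not exist yet
capture confirm variable age2
if _rc generate age2 = (AGE - 50)/10

* Restricted cubic spline with 4 knots; creates agesp1, agesp2 and agesp3
mkspline agesp = age2, cubic nknots(4)

* Replace the single age term with the spline terms
svy, subpop(subsdh if as4cat==1): nbreg cardio agesp1 agesp2 agesp3 ///
    i.female i.married i.ed3cat i.generation i.poverty i.no_ins     ///
    discr i.ELP srph bias, irr

* agesp1 is the linear term, so a joint test of the remaining terms is
* a test of non-linearity in age
test agesp2 agesp3
```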

Hope this helps,
Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------
