



Re: st: nbreg - problem with constant?


From   Joerg Luedicke <joerg.luedicke@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: nbreg - problem with constant?
Date   Fri, 2 Mar 2012 11:35:03 -0800

In addition to what Richard said:

It may or may not be related to any potential non-intuitive results,
but I am wondering if you should include some kind of offset. Just
judging from your post I would think you need the number of firms in a
country from the previous year as an offset to basically model the
rate of new firms, instead of the absolute number of new firms.
Whether there are 10 firms in a country in year x and 15 firms in year
x+1 or 1000 firms in another country in year x and 1005 firms in year
x+1, is most likely a different story.
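
A rough sketch of why the offset matters, using the numbers from the example above. Python is used here only for the arithmetic. The function names are made up for illustration, and the form log(mu) = log(exposure) + xb is the usual count-model setup with an exposure term, assumed here rather than taken from the original model. In Stata, -nbreg- accepts exposure() or offset() for this purpose.

```python
import math


def growth_rate(firms_prev, firms_now):
    """New firms relative to the stock of firms in the previous year."""
    return (firms_now - firms_prev) / firms_prev


def expected_count(exposure, xb):
    """Expected count under a log link with an exposure term:
    log(mu) = log(exposure) + xb, so mu = exposure * exp(xb)."""
    return exposure * math.exp(xb)


# Same absolute increase (5 new firms), very different rates.
small_country = growth_rate(10, 15)
large_country = growth_rate(1000, 1005)

# With identical covariates (same xb), the expected count scales
# one-for-one with the exposure. The value of xb is hypothetical.
xb = -2.0
ratio = expected_count(1000, xb) / expected_count(10, xb)

print(small_country, large_country, ratio)
```

With the exposure in the model, the coefficients describe the rate of new firms per existing firm rather than the raw count.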

Joerg


On Fri, Mar 2, 2012 at 11:11 AM, Richard Williams
<richardwilliams.ndu@gmail.com> wrote:
> At 01:34 PM 3/2/2012, Simon Falck wrote:
>>
>> Hi,
>>
>> I have some problems fitting a negative binomial regression model. One
>> problem seems to be related to the constant, as it inflates the
>> coefficients. If the constant is removed, some coefficients are still
>> unexpectedly high. Since removing the constant biases the coefficients and
>> imposes restrictions, I hope someone can contribute some insights on this
>> matter.
>>
>> I use the nbreg command to estimate the number of new firms per country,
>> explained by country characteristics. The dataset consists of
>> information on 72 countries over 8 years, N=576. The data are
>> annual, and all regressors are lagged 1 year (t-1). The dependent variable
>> (Y) is the number of new firms per country and varies between 0 and 204.
>> The independent variables (X1-X5) are country-specific attributes. Each
>> independent variable is continuous and varies across countries (id). No
>> interaction terms are used. Some correlation exists, in general <0.3, but
>> up to 0.6. The dataset is structured as follows:
>>
>> id  year  Y   X1         X2         X3         X4          X5
>> 1   2000  10  0.5258504  1.148275   1.623761   0.00905698  0.2926497
>> 2   2000  1   1.105136   0.9730458  0.7427208  0.03010507  0.1732135
>> 3   2000  2   1.342283   0.7757816  0.6444564  0.01280751  0.2596922
>> ...
>> The model is estimated with command: nbreg Y X1 X2 X3 X4 X5
>> Generates results:
>> -----------------
>> Negative binomial regression          Number of obs   =        576
>>                                       LR chi2(8)      =     387.39
>> Dispersion     = mean                 Prob > chi2     =     0.0000
>> Log likelihood = -562.09431           Pseudo R2       =     0.2563
>>
>> Y          Coef.       Std. Err.   z       P>z     [95% Conf. Interval]
>> X1         .3927241    .3024751    1.30    0.194   -.2001162   .9855644
>> X2         .6401666    .4818861    1.33    0.184   -.3043129   1.584646
>> X3         1.27199     .4352673    2.92    0.003   .4188815    2.125098
>> X4         -5.603575   1.724484    -3.25   0.001   -8.983502   -2.223648
>> X5         -1.370085   .1557769    -8.80   0.000   -1.675402   -1.064768
>> Constant   10.5169     2.30579     4.56    0.000   5.997634    15.03617
>> /lnalpha   -.2836582   .1966372                    -.66906     .1017437
>> alpha      .753024     .1480725                    .5121898    1.1071
>> Likelihood-ratio test of alpha=0:  chibar2(01) = 214.48  Prob>=chibar2 = 0.000
>> -----------------
>> The LR test indicates that the negative binomial model is preferred over
>> Poisson. X1-X2 are insignificant, while X3-X5 are significant at P<0.05.
>> We can see that the constant is very large, exp(10.5169) is about 36934,
>> and the standard error for X4 is quite high (1.72...).
>
>
> Without knowing more about the variables, I would be hesitant to say the
> constant is "large" or the standard error is "quite high." If you, say,
> rescaled the Xs, or centered each X about its mean, the constant would
> change. Likewise if you rescaled X4 (e.g. changed it from income in dollars
> to income in thousands of dollars) the coefficient and standard error for X4
> would change. You can think of the constant as being the score a case would
> have if every X equaled 0, but there may be no cases where that would
> ever happen, e.g. in a sample of adults nobody will have an age of 0 years.
>
> In short, it isn't clear to me that you have a problem. If you find the
> coefficients non-intuitive, then rescaling the Xs in some way or centering
> them may help.
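
A small sketch of the rescaling and centering points, with Python used purely for the arithmetic. The coefficients, the standard error for X4, and the constant are taken from the output quoted above. The scale factor and the means of the Xs are hypothetical, since the post does not report them, and the function names are made up for illustration.

```python
# Estimates copied from the quoted nbreg output.
coefs = {
    "X1": 0.3927241,
    "X2": 0.6401666,
    "X3": 1.27199,
    "X4": -5.603575,
    "X5": -1.370085,
}
constant = 10.5169
se_x4 = 1.724484


def rescale(coef, se, factor):
    """Coefficient and standard error after the regressor is multiplied
    by `factor`: both shrink by the same factor, so z is unchanged."""
    return coef / factor, se / factor


def centered_constant(constant, coefs, means):
    """Constant after each X is replaced by X - mean(X). The slopes stay
    the same; the constant absorbs sum(b_k * mean_k)."""
    return constant + sum(coefs[name] * means[name] for name in coefs)


# Hypothetical rescaling: measure X4 in units 100 times smaller.
new_coef, new_se = rescale(coefs["X4"], se_x4, 100.0)
z_before = coefs["X4"] / se_x4
z_after = new_coef / new_se

# HYPOTHETICAL means, for illustration only.
hypothetical_means = {"X1": 1.0, "X2": 1.0, "X3": 1.0, "X4": 0.02, "X5": 0.25}
new_constant = centered_constant(constant, coefs, hypothetical_means)

print(new_coef, new_se, z_before, z_after, new_constant)
```

The size of a coefficient or standard error therefore says little by itself. The z statistic and the fit are unaffected by either transformation.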
>
> As a sidelight, your analysis seems to be ignoring your panel structure. You
> may wish to take a look at the XT manual and/or Paul Allison's book on
> "Fixed effects regression models."
>
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME:   (574)289-5227
> EMAIL:  Richard.A.Williams.5@ND.Edu
> WWW:    http://www.nd.edu/~rwilliam
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


