Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: nbreg - problem with constant?

From: Simon Falck
To: "statalist@hsphsun2.harvard.edu"
Subject: st: nbreg - problem with constant?
Date: Fri, 2 Mar 2012 18:34:13 +0000

```
Hi,

I am having trouble fitting a negative binomial regression model. One problem seems to be related to the constant, which appears to inflate the coefficients. If the constant is removed, some coefficients are still unexpectedly high. Since removing the constant imposes a restriction and may bias the coefficients, I hope someone can offer some insight on this matter.

I use the nbreg command to model the number of new firms per country as a function of country characteristics. The dataset covers 72 countries over 8 years, N = 576 country-year observations. The data are annual, and all regressors are lagged one year (t-1). The dependent variable (Y) is the number of new firms per country and varies between 0 and 204. The independent variables (X1-X5) are country-specific attributes. Each is continuous and varies across countries (id). No interaction terms are used. Some correlation exists among the regressors, in general <0.3 but up to 0.6. The dataset is structured as follows:

id  year  Y   X1         X2         X3         X4          X5
1   2000  10  0.5258504  1.148275   1.623761   0.00905698  0.2926497
2   2000  1   1.105136   0.9730458  0.7427208  0.03010507  0.1732135
3   2000  2   1.342283   0.7757816  0.6444564  0.01280751  0.2596922
...
The model is estimated with the command: nbreg Y X1 X2 X3 X4 X5
This generates the following results:
-----------------
Negative binomial regression                    Number of obs   =        576
                                                LR chi2(8)      =     387.39
Dispersion     = mean                           Prob > chi2     =     0.0000
Log likelihood = -562.09431                     Pseudo R2       =     0.2563

Y           Coef.       Std. Err.   z       P>z     [95% Conf.   Interval]
X1          .3927241    .3024751    1.30    0.194   -.2001162    .9855644
X2          .6401666    .4818861    1.33    0.184   -.3043129    1.584646
X3          1.27199     .4352673    2.92    0.003   .4188815     2.125098
X4          -5.603575   1.724484    -3.25   0.001   -8.983502    -2.223648
X5          -1.370085   .1557769    -8.80   0.000   -1.675402    -1.064768
Constant    10.5169     2.30579     4.56    0.000   5.997634     15.03617
/lnalpha    -.2836582   .1966372                    -.66906      .1017437
alpha       .753024     .1480725                    .5121898     1.1071
Likelihood-ratio test of alpha=0:  chibar2(01) = 214.48   Prob>=chibar2 = 0.000
-----------------
The LR test indicates that the negative binomial model is preferred over Poisson. X1-X2 are insignificant, while X3-X5 are significant at P<0.05.
The constant is very large, exp(10.5169) is about 36934, and the std. err. for X4 is quite high (1.72..).

If the constant is removed: nbreg Y X1 X2 X3 X4 X5, noconstant
-----------------
Negative binomial regression                    Number of obs   =        576
Dispersion     = mean                           Wald chi2(8)    =     457.53
Log likelihood = -600.50218                     Prob > chi2     =     0.0000
                        Robust
Y           Coef.       Std. Err.   z       P>z     [95% Conf.   Interval]
X1          .3932589    .326686     1.20    0.229   -.247034     1.033552
X2          2.416205    .4947226    4.88    0.000   1.446567     3.385844
X3          1.410318    .4546365    3.10    0.002   .5192471     2.301389
X4          3.029976    1.402828    2.16    0.031   .2804839     5.779468
X5          -.6341323   .1417935    -4.47   0.000   -.9120425    -.3562222
/lnalpha    .251133     .1629578                    -.0682585    .5705246
alpha       1.285481    .2094792                    .934019      1.769195
Likelihood-ratio test of alpha=0:  chibar2(01) = 276.18   Prob>=chibar2 = 0.000
-----------------
The coefficients change substantially. For example, X3 has exp(1.27199) = 3.5679457 in the first model, and X2 has exp(2.416205) = 11.203262 in the second model, up from exp(.6401666) = 1.8968 in the first. X4 changes sign from (-) to (+). In general, the coefficients are unexpectedly and unreasonably large.

Are these "large" coefficients due to collinearity among the variables (although VIF is ~2 and tolerance is >0.5) or to misspecification of the model?

Can you think of anything that I am doing wrong?

/Simon

```
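
The exponentiated coefficients quoted in the message can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original post: the dictionary and function names are made up for illustration, and the coefficient values are copied from the two nbreg tables in the message. It relies on the fact that in a log-link count model such as the negative binomial, exp(b) is the multiplicative change in the expected count for a one-unit increase in a regressor, and exp(constant) is the expected count when every regressor equals zero.

```python
import math

# Coefficients copied from the two nbreg tables in the post.
# The variable names below are illustrative, not from the original message.
with_constant = {
    "X1": 0.3927241, "X2": 0.6401666, "X3": 1.27199,
    "X4": -5.603575, "X5": -1.370085, "Constant": 10.5169,
}
no_constant = {
    "X1": 0.3932589, "X2": 2.416205, "X3": 1.410318,
    "X4": 3.029976, "X5": -0.6341323,
}


def exponentiate(coefs):
    """Return exp(b) for each coefficient b (ratio scale under a log link)."""
    return {name: math.exp(b) for name, b in coefs.items()}


ratio_with = exponentiate(with_constant)
ratio_without = exponentiate(no_constant)

# Print both models side by side; the constant has no counterpart
# in the noconstant model, so that cell is shown as n/a.
print(f"{'term':<10}{'with constant':>16}{'noconstant':>16}")
for name, value in ratio_with.items():
    other = ratio_without.get(name)
    other_txt = f"{other:16.4f}" if other is not None else f"{'n/a':>16}"
    print(f"{name:<10}{value:16.4f}{other_txt}")
```

Exponentiating only restates the estimates on the count scale. It does not by itself indicate whether the model is misspecified.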