Hi folks,
I am trying to estimate a panel growth model as follows:
gY = aY_1 + bX + cE + d(i.period) + error
where Y is income, X is a set of exogenous regressors (excluding time
dummies), E is a set of endogenous or predetermined variables, i.period is
the set of time dummies, and d is their parameter vector.
The problem is that when I put all the E variables in gmm( ) and add
iv(i.period, eq(level)), I get the warning: "Number of instruments may be
large relative to number of observations." The computation also takes a long
time (as expected), and the Hansen chi-squared p-value is 1.000.
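For concreteness, the command I am describing looks roughly like the sketch
below. The variable names (gY, Y, X1, X2, E1, E2, id, period) are
placeholders rather than my actual variables, the lag limits are the (3 3)
mentioned under Question 1, and the twostep and robust options are an
assumption added only for illustration.

```stata
* Sketch only: all variable names are placeholders, not the real dataset.
* xtabond2 is user-written: ssc install xtabond2
* Older xtabond2 versions may need the -xi:- prefix to expand i.period.
xtset id period

* E variables instrumented GMM-style with lag limits (3 3);
* period dummies entered IV-style in the levels equation only.
xtabond2 gY L.Y X1 X2 E1 E2 i.period, ///
    gmm(E1 E2, lag(3 3))              ///
    iv(i.period, eq(level))           ///
    twostep robust
```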
Six "very brief" questions related to this. Please answer whatever you
comfortably can. Many many thanks in advance.
QUESTION 1
I have 642 observations and 101 instruments. [This is "after" I have
reduced the lags to (3 3).] Is this a serious warning? (I read on statalist
that if the number of instruments is large relative to the number of
observations, there can be serious small-sample biases.) At what ratio of
observations to instruments does this begin to be a problem?
QUESTION 2
I have also noticed that if I remove the iv(i.period, eq(level)) and most of
the E variables in gmm(.), the Hansen chi-squared p-value falls below 1 (as
it healthily should) and the WARNING disappears. However, if I put one
additional E variable back in gmm(.), the WARNING comes back and the Hansen
p-value climbs (although not up to 1). My question, therefore, is: what is
the relationship between the Hansen chi-squared p-value and the WARNING, and
is it permissible to ignore the warning as long as the Hansen p-value is
below 1?
QUESTION 3
I was a bit concerned that by removing iv(i.period, eq(level)) from the
command line, I was not actually removing the i.period dummies as
instruments, but only removing the information that they were time dummies.
If this fear is correct, then this is serious. Please let me know whether
this fear is well-founded, or whether I have actually removed the time
dummies from the instrument list.
QUESTION 4
A more general question is: is it acceptable to remove time dummies from the
instrument set? What are the conditions when it is, and when it is not?
QUESTION 5
I was also not clear whether my exogenous X regressors are included as
instruments. I was assuming xtabond2 would include them automatically, since
they are in the regression equation but not in the gmm(.) list. If my
assumption is incorrect, how can I include, say, variable X with a lag
structure of (2 2)?
QUESTION 6
Finally, my E set contains predetermined variables, say Ep, and other
endogenous variables, say Ed. In addition, I have the lagged dependent
variable Y_1. I was not sure how I should write my command so that it
distinguishes between these three categories of endogenous variables. I
understand that the lag limits used (a b) should differ across the three
categories, but I am not sure what the syntax would be, or whether I should
put Y_1 in gmm(.) or just Y starting with a deeper lag. I also wasn't sure
what the treatment would be for endogenous variables that I have included
with lags in the regression. For example, you can think of EDU affecting
growth with a lag, so my E set contains EDU_1. Should I write this as EDU in
gmm(.) with a deeper lag (say 3 3), or just EDU_1 with (2 2)? Any guidance on
this would be highly appreciated.
Ali Abbas
Oxford University
UK