Statalist



Re: st: Jackknife and standard error in NEGBIN model


From   [email protected] (Jeff Pitblado, StataCorp LP)
To   [email protected]
Subject   Re: st: Jackknife and standard error in NEGBIN model
Date   Wed, 06 May 2009 16:42:57 -0500

Marc Philipp <[email protected]> is trying to reproduce the standard error
calculations produced by the -jackknife:- prefix command:

> I am still trying to understand how the -jackknife:- command computes the
> standard errors of the parameters.  I have made some progress, but one
> problem is still puzzling me.  I tried to replicate these standard errors
> using the method outlined in Miller (1974), which is based on Tukey (1958).
> According to the Stata User's Guide, this is the method implemented in
> Stata.
> 
> However, I cannot obtain the same standard errors.  My output is below,
> where you can see how I tried to replicate the results.  The jackknifed
> parameters are exactly the same, but the standard errors produced by the
> -jackknife:- command are smaller than those I computed.  They should be
> the same.  Am I making a mistake, or is Stata using another method to
> compute these standard errors?

> . jackknife _b[x] e(delta), cluster(tt) saving(jack, replace): nbreg y x d*, disp(c) nocons
> (running nbreg on estimation sample)
> 
> Jackknife replications (3)
> ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
> ...
> 
> Jackknife results                               Number of obs      =       300
>                                                 Number of clusters =         3
>                                                 Replications       =         3
> 
>       command:  nbreg y x d*, disp(c) nocons
>         _jk_1:  _b[x]
>         _jk_2:  e(delta)
>           n():  e(N)
> 
> ------------------------------------------------------------------------------
>              |              Jackknife
>              |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>        _jk_1 |   1.013864   .1062226     9.54   0.011     .5568252    1.470903
>        _jk_2 |   1.362775   .0625554    21.79   0.002     1.093621    1.631929
> ------------------------------------------------------------------------------
>  
> . matrix bet = e(b)
> . matrix list e(b_jk)
> 
> e(b_jk)[1,2]
>         _jk_1      _jk_2
> y1  1.0350594  2.6554469
> 
> . use jack.dta, clear
> (jackknife: nbreg)
> 
> . gen beta_i = 3*bet[1,1]-2*_jk_1
> . gen delta_i = 3*bet[1,2]-2*_jk_2
> . su beta_i delta_i
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>       beta_i |         3    1.035059    .1839829   .8623033   1.228518
>      delta_i |         3    2.655447     .108349   2.582983   2.780004

In the code above, Marc uses -jackknife- to save a dataset containing the
jackknife replicates of his statistics of interest; there are only 3
replicates because of the clustering on -tt-.  He then uses these replicates
to generate his own 'pseudo' values and finally runs -summarize- on the
newly generated variables.
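
For reference, these are Tukey's pseudo values, where b is the full-sample
estimate, b_(i) is the estimate with cluster i omitted, and n is the number
of replicates:

	pseudo_i = n*b - (n-1)*b_(i)

With n=3 clusters this is 3*b - 2*b_(i), which is exactly what the two
-generate- statements above compute.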

While -summarize- reports the standard deviation of Marc's new variables,
the standard error produced by -jackknife- is the standard error of the
mean of the pseudo values.  In Marc's example, the difference is a factor
of sqrt(1/n), where n is the number of replicates, n=3 here:

	.1062226 = .1839829 * sqrt(1/3)
	.0625554 = .108349  * sqrt(1/3)
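
Marc can check this directly in Stata; here is a minimal sketch continuing
his session above, relying only on the r(sd) and r(N) results that
-summarize- leaves behind:

	. quietly summarize beta_i
	. display r(sd) * sqrt(1/r(N))

This reproduces the .1062226 that -jackknife- reports for _jk_1; the same
two lines with delta_i reproduce .0625554.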

Marc could use the -ci- command instead of -summarize- to reproduce the
standard error calculations of -jackknife-.
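
For example, using the pseudo-value variables generated above:

	. ci beta_i delta_i

-ci- reports the mean of each variable together with the standard error of
that mean, so its "Std. Err." column matches the one in the -jackknife-
output.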

--Jeff
[email protected]


