Statalist



RE: st: Confidence Interval for Proportion


From   Maarten buis <maartenbuis@yahoo.co.uk>
To   statalist@hsphsun2.harvard.edu
Subject   RE: st: Confidence Interval for Proportion
Date   Tue, 11 Mar 2008 22:31:07 +0000 (GMT)

To answer my own question with a simulation: the choice between the
t-distribution and the Gaussian distribution matters only in very small
samples (20 or so), and the coverage rate isn't that good anyhow for
extreme proportions.

*----------------- begin example ------------------
set more off
capture program drop sim
program define sim, rclass
	syntax, n(integer) p(real)
	drop _all
	set obs `n'
	local df = `n'-1
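	* draw n Bernoulli(p) observations, using the p() option
	gen x = uniform() < `p'
	sum x, meanonly
	tempname m se
	scalar `m' = r(mean)
	scalar `se' = sqrt(`m'*(1-`m')/`df')
	* does the 95% interval based on the Gaussian cover the true p?
	return scalar true_z = ///
             `m' - invnormal(0.975)*`se' < `p' & ///
             `m' + invnormal(0.975)*`se' > `p'
	* same check, with the critical value from the t-distribution
	return scalar true_t = ///
             `m' - invttail(`df',0.025)*`se' < `p' & ///
             `m' + invttail(`df',0.025)*`se' > `p'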
end
simulate true_z=r(true_z) true_t=r(true_t),  ///
         reps(10000) nodots: sim, n(100) p(.95)
sum true*
simulate true_z=r(true_z) true_t=r(true_t),  ///
         reps(10000) nodots: sim, n(50) p(.95)
sum true*
simulate true_z=r(true_z) true_t=r(true_t),  ///
         reps(10000) nodots: sim, n(20) p(.95)
sum true*
*--------------------- end example ---------------------------
(For more on how to use examples I sent to the Statalist, see
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )
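
As a rough illustration of why the choice only matters for small n, the
lines below print the two critical values that the simulation compares,
for the same three sample sizes. This is only a sketch added for
illustration, not part of the simulation itself.

```stata
* compare the 97.5% critical values of the Gaussian and the
* t-distribution (n-1 degrees of freedom) for n = 20, 50, 100
foreach n of numlist 20 50 100 {
	local df = `n' - 1
	di "n = `n':  z = " %6.4f invnormal(0.975) ///
	   "   t = " %6.4f invttail(`df', 0.025)
}
```

The t critical value is always the larger of the two and approaches the
Gaussian value as n grows, so the two intervals differ less and less.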

-- Maarten

--- Maarten buis <maartenbuis@yahoo.co.uk> wrote:

> Martin may have a point, though I am not sure: I have always taught
> that the reason we compare the test-statistic to the t-distribution and
> not the Gaussian distribution is that we have additional uncertainty
> due to the fact that we not only estimate the mean but also the
> standard deviation (to get to the standard error). In the case of a
> proportion we know that the standard deviation is a deterministic
> function of the mean, so why should we compare the test-statistic to
> the t-distribution instead of the Gaussian distribution?
> 
> -- Maarten
> 
> --- Martin Weiss <martin.weiss@uni-tuebingen.de> wrote:
> 
> > Jeff,
> > 
> > thanks for the reply, but am I still missing something here? I did
> > experiment with the "r(N)-1", but discarded the possibility as it
> > did not provide the correct lower and upper bound... Indeed,
> > 
> > ************************
> > sysuse auto, clear
> > proportion rep78
> > matrix define A=e(b)
> > count if rep78!=.
> > *Std error
> > local stderr= sqrt(A[1,1]*(1-A[1,1])/`=`r(N)'-1')
> > *Upper/Lower Bound for proportion of "1"
> > di A[1,1]+invnormal(1-0.05/2)*`stderr'
> > di A[1,1]-invnormal(1-0.05/2)*`stderr'
> > ************************
> > 
> > still gives the wrong numbers. Have you told us the whole story?
> > 
> > Martin Weiss
> > _________________________________________________________________
> > 
> > Diplom-Kaufmann Martin Weiss
> > Mohlstrasse 36
> > Room 415
> > 72074 Tuebingen
> > Germany
> > 
> > Fon: 0049-7071-2978184
> > 
> > Home: http://www.wiwi.uni-tuebingen.de/cms/index.php?id=1130
> > 
> > Publications: http://www.wiwi.uni-tuebingen.de/cms/index.php?id=1131
> > 
> > SSRN: http://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=669945
> > 
> > 
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> > Jeff Pitblado, StataCorp LP
> > Sent: Tuesday, March 11, 2008 7:08 PM
> > To: statalist@hsphsun2.harvard.edu
> > Subject: Re: st: Confidence Interval for Proportion
> > 
> > Martin Weiss <martin.weiss@uni-tuebingen.de> is using the
> > -proportion- command and has a question about how standard errors
> > are computed:
> > 
> > > Dear Statalisters,
> > > 
> > > try this in Stata:
> > > 
> > > ************************
> > > sysuse auto, clear
> > > proportion rep78
> > > matrix define A=e(b)
> > > matrix define B=e(V)
> > > count if rep78!=.
> > > *Upper/Lower Bound for proportion of "1"
> > > di A[1,1]+invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> > > di A[1,1]-invnormal(1-0.05/2)*sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> > > *Standard Error for "1"
> > > *Mistake obviously there...
> > > di sqrt(A[1,1]*(1-A[1,1])/`r(N)')
> > > ************************
> > > 
> > > Then let me know: why do I not hit the correct CI for the
> > > proportion of "1" in the repair record? Something's wrong with the
> > > standard error, I do not know what, though...
> > 
> > Using Martin's example Stata code, -proportion- effectively
> > computes the standard error via
> > 
> > 	sqrt(A[1,1]*(1-A[1,1])/(r(N)-1))
> > 
> > This is explained (rather tersely, I'll admit) in the 'Methods and
> > Formulas' section of -[R] proportion-.
> > 
> > 	"Proportions are means of indicator variables; see -[R] mean-."
> > 
> > From the 'Methods and Formulas' section of -[R] mean-, the variance
> > is calculated as
> > 
> > 	V(ybar) = (1/(N*(N-1))) Sum_{j=1}^N (y_j - ybar)^2
> > 
> > If the y_j are observations of an indicator variable, this is
> > algebraically equivalent to
> > 
> > 	V(ybar) = ybar(1-ybar)/(N-1)
> > 
> > --Jeff
> > jpitblado@stata.com
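
The algebraic step in Jeff's message can be filled in as follows. This
is a sketch in LaTeX notation, using Jeff's symbols (y_j, ybar, N); the
only fact it relies on is that y_j^2 = y_j when y_j is 0 or 1.

```latex
\begin{align*}
\sum_{j=1}^{N} (y_j - \bar{y})^2
  &= \sum_{j=1}^{N} y_j^2 - N\bar{y}^2
   = \sum_{j=1}^{N} y_j - N\bar{y}^2
   = N\bar{y} - N\bar{y}^2
   = N\bar{y}(1-\bar{y}), \\
V(\bar{y})
  &= \frac{1}{N(N-1)} \sum_{j=1}^{N} (y_j - \bar{y})^2
   = \frac{N\bar{y}(1-\bar{y})}{N(N-1)}
   = \frac{\bar{y}(1-\bar{y})}{N-1}.
\end{align*}
```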


-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

