
From: Roger Newson <roger.newson@kcl.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: statalist-digest V4 #965
Date: Sun, 04 Aug 2002 20:24:52 +0100

At 11:23 04/08/02 -0400, Stephen Soldz wrote:

> Thanks to Nick Cox and Roger Newson for their responses to my question about robust tests of dependent proportions. Nick gave several references I'll look up. Roger thinks I wouldn't do too bad with paired t-tests, as they are "a special case of the Huber variance for clustered data (where the clusters are the pairs of responses and the observations are the individual responses)". I wonder if you have a reference for this I could cite?

I don't think there is any need for a reference, as the point is so trivial. If you are estimating the difference between 2 population proportions from 2 sample proportions measured on the same sample, then you are estimating the mean of Z=X-Y, where X and Y are Bernoulli variables. You are therefore simply estimating the population mean of Z from the sample mean of Z. The large-sample theory applies, courtesy of the central limit theorem for ordinary sample means, whether Z is normal (as assumed by the usual paired t-test) or has a discrete distribution with possible values -1, 0 and 1 (as here).
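As a rough illustration of the Z=X-Y argument, here is a minimal Python sketch (not Stata code, and not taken from the original exchange). The function name and the 0/1 data are hypothetical, made up purely for the example. It forms the paired differences and computes their sample mean, the conventional standard error, and the resulting large-sample test statistic.

```python
import math

def paired_proportion_difference(x, y):
    """Estimate the mean of Z = X - Y and its conventional standard error.

    x and y are paired 0/1 responses measured on the same sampling units.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 pairs")
    z = [xi - yi for xi, yi in zip(x, y)]          # each Z_i is -1, 0 or 1
    n = len(z)
    zbar = sum(z) / n                               # difference of the two sample proportions
    s2 = sum((zi - zbar) ** 2 for zi in z) / (n - 1)  # sample variance of Z
    se = math.sqrt(s2 / n)                          # conventional SE of the mean
    return zbar, se

# Hypothetical paired binary responses (illustration only).
x = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
y = [0, 1, 0, 0, 0, 1, 0, 0, 1, 0]

zbar, se = paired_proportion_difference(x, y)
t_stat = zbar / se  # statistic for H0: population mean of Z is 0
print(zbar, se, t_stat)
```

The sample mean of Z is the same number as the difference between the two sample proportions, which is why a test on the paired differences is a test of the difference in proportions.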

The bit about clustered Huber variances is probably not strictly necessary, but is justified as follows. The conventional SE of the sample mean happens also to be the Huber SE for estimating the population mean, if you are using any likelihood function for which the sample mean is the maximum-likelihood estimator of the population mean. This includes the normal likelihood function, and also the discrete-distribution likelihood function with possible values -1, 0 and 1. This is because the Huber variance is, by definition, the sample mean square of the sample influence function divided by the number of sampling units. The sample influence function of the mean, for the i'th sampling unit, is Z_i-Zbar, where Z_i is the i'th Z-value and Zbar is the sample mean Z-value. (If the mean square is taken with divisor n rather than n-1, the two variances differ only by the factor n/(n-1), which is negligible in large samples.) A good reference on influence functions in general is Hampel (1974).
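The relationship between the two variances can be checked numerically. The following is a minimal Python sketch under stated assumptions: the function name and the Z-values are hypothetical, and the influence-function variance uses divisor n with no finite-sample adjustment, which may differ from what a particular package reports.

```python
def mean_variance_estimates(z):
    """Return (influence-function variance, conventional variance) of the sample mean."""
    n = len(z)
    if n < 2:
        raise ValueError("need at least 2 sampling units")
    zbar = sum(z) / n
    influence = [zi - zbar for zi in z]          # sample influence function Z_i - Zbar
    sum_sq = sum(u * u for u in influence)
    huber_var = (sum_sq / n) / n                 # mean square of influence, divided by n
    conventional_var = (sum_sq / (n - 1)) / n    # usual s^2 / n
    return huber_var, conventional_var

# Hypothetical paired differences taking values -1, 0 and 1 (illustration only).
z = [1, 0, 0, 1, 0, -1, 1, 0, 0, 1]
huber_var, conventional_var = mean_variance_estimates(z)
n = len(z)
print(huber_var, conventional_var, huber_var * n / (n - 1))
```

Multiplying the influence-function variance by n/(n-1) reproduces the conventional variance, so the two standard errors coincide once that adjustment is made.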

I hope this helps.

Best wishes

Roger

References

Hampel FR. The influence curve and its role in robust estimation. Journal of the American Statistical Association 1974; 69: 383-397.

--

Roger Newson

Lecturer in Medical Statistics

Department of Public Health Sciences

King's College London

5th Floor, Capital House

42 Weston Street

London SE1 3QD

United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648

Fax: 020 7848 6620 International +44 20 7848 6620

or 020 7848 6605 International +44 20 7848 6605

Email: roger.newson@kcl.ac.uk

Opinions expressed are those of the author, not the institution.

