# AW: st: parametric vs. nonparametric estimators

 From Thomas Mählmann <[email protected]> To <[email protected]> Subject AW: st: parametric vs. nonparametric estimators Date Wed, 16 Jun 2004 19:52:13 +0200

```Dear Nick and Rich,

maybe I am missing something, but my problem is as follows:

Suppose I have a data set with 100 persons and two binary variables: X
(sex: male coded 1, female coded 0) and Y (disease: absent coded 0,
present coded 1). My goal is to estimate the probability P(Y=1|X=1).
Suppose 50 of the 100 persons are male and 10 of these have the disease;
then my so-called "nonparametric" estimate of P(Y=1|X=1) is 10/50 = 0.200.
By nonparametric I mean that no assumption is made about the distribution
of Y. By using logistic regression, I assume that Y is related to a latent
variable Y* which has a logistic distribution. Now suppose, for example,
that the logistic regression estimate of P(Y=1|X=1) is 0.201. What does
the difference between 0.200 and 0.201 tell me?

I hope things are clearer now!

Thanks for your time and effort!

Thomas

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Wednesday, 16 June 2004 17:25
To: [email protected]
Subject: RE: st: parametric vs. nonparametric estimators

This sounds like a thread letting Theseus (or the
thesis) escape from a semantic maze,
but it hinges on one notion of a parameter.

Thus even with Wilcoxon-Mann-Whitney
and only minimal assumptions (continuity?) about what
kind of distributions are being postulated, the
common U statistic can be scaled to give an
estimate of pr(X > Y). Indeed Rich was one of
the people instrumental in getting StataCorp
to add the -porder- option to -ranksum-. I'd
want to regard this probability as a parameter
(property of the system or chance set-up which
can be estimated) and an estimate of it is sometimes
more interesting or useful than the U statistic or
its P-value. It's perhaps then just that
it is not a parameter which specifies a probability
distribution (i.e. distribution, mass or density
function).

(Roger Newson would want me to point out that this
pr(X > Y) is just Somers' d in one of its many
guises. Shall I compare thee to a Somers' d?
(Shakespeare))

Nick
[email protected]

Richard Goldstein

> I'm a little confused about what you mean by a parameter estimated
> via non-parametric methods; to me, non-parametric means that no
> parameter is estimated (yes, I distinguish between non-parametric
> and "distribution free")

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
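
To make Thomas's question concrete, here is a minimal sketch (not from the thread) that fits a logistic regression with sex as the only covariate and compares the fitted P(Y=1|X=1) with the raw proportion 10/50. The female counts (5 cases out of 50) are a made-up assumption, since the post gives only the male counts, and the helper name `fit_logit_grouped` is hypothetical. With a single binary covariate the model is saturated, so the maximum-likelihood fit should reproduce the cell proportions up to numerical precision; a gap such as 0.200 versus 0.201 would then presumably come from extra covariates, a different estimation sample, or rounding rather than from the logistic assumption itself.

```python
import math

def fit_logit_grouped(groups, iters=25):
    """Newton-Raphson for logit P(Y=1|X=x) = b0 + b1*x on grouped data.

    groups: iterable of (x, n, y) with x the covariate value,
    n the number of persons and y the number with Y=1.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # score (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = y - n * p        # observed minus expected cases
            w = n * p * (1.0 - p)
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 system (information) * step = score by hand.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Males: 50 persons, 10 cases (from the post).
# Females: 50 persons, 5 cases (ASSUMED for illustration only).
groups = [(0, 50, 5), (1, 50, 10)]

b0, b1 = fit_logit_grouped(groups)
p_male_raw = 10 / 50
p_male_logit = 1.0 / (1.0 + math.exp(-(b0 + b1)))
p_female_logit = 1.0 / (1.0 + math.exp(-b0))

print("raw proportion, males:   ", round(p_male_raw, 6))
print("logit fitted prob, males:", round(p_male_logit, 6))
```

Adding a second covariate, or fitting on a slightly different set of observations, is the kind of change that would let the two numbers drift apart.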