The Stata listserver

AW: st: parametric vs. nonparametric estimators

From   Thomas Mählmann
Subject   AW: st: parametric vs. nonparametric estimators
Date   Wed, 16 Jun 2004 19:52:13 +0200

Dear Nick and Rich,

Maybe I am missing something, but my problem is as follows:

Suppose I have a data set with 100 persons and two binary variables: X
(sex: male coded 1, female coded 0) and Y (disease: absent coded 0,
present coded 1). My goal is to estimate the probability
P(Y=1|X=1). Suppose 50 of the 100 persons are male and 10 of these have the
disease; then my so-called "nonparametric" estimate of P(Y=1|X=1) is
10/50 = 0.200. By nonparametric I mean that no assumption about the
distribution of Y is made. By using logistic regression, I assume that Y can
be related to a latent variable Y* which has a logistic distribution. Now
suppose, for example, that the logistic regression estimate of P(Y=1|X=1)
is 0.201. What does the difference between 0.200 and 0.201 tell me?
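
The two estimates in this example can be put side by side in a short sketch. It assumes the logistic model has only an intercept and X as regressor. The post gives no count for the women, so the 5 diseased out of 50 used here is a made-up value, and the Newton-Raphson routine `fit_logit` is a hypothetical helper written for illustration, not any package's estimator. In this special case the model has as many parameters as there are groups, so the fitted probability should agree with the raw proportion up to numerical tolerance.

```python
import math

# Grouped data: (x, number of persons, number with the disease).
# Males (x=1): 50 persons, 10 diseased, as in the post.
# Females (x=0): 50 persons; the 5 diseased is a made-up value.
groups = [(0, 50, 5), (1, 50, 10)]


def fit_logit(groups, iterations=25):
    """Newton-Raphson for logit P(Y=1|x) = b0 + b1*x on grouped binary data."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            r = y - n * p            # observed minus expected count
            w = n * p * (1.0 - p)    # binomial variance weight
            g0 += r
            g1 += x * r
            h00 += w
            h01 += x * w
            h11 += x * x * w
        # Solve the 2x2 system (information matrix) * step = score.
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
    return b0, b1


b0, b1 = fit_logit(groups)
p_logit = 1.0 / (1.0 + math.exp(-(b0 + b1)))  # fitted P(Y=1|X=1)
p_raw = 10 / 50                               # sample proportion among males

print("logit estimate:", round(p_logit, 6))
print("raw proportion:", p_raw)
```

With further covariates, or with constraints on the coefficients, the model is no longer saturated and the two numbers can differ.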

I hope things are clearer now!

Thanks for your time and effort!


-----Original Message-----
On behalf of Nick Cox
Sent: Wednesday, 16 June 2004 17:25
Subject: RE: st: parametric vs. nonparametric estimators

This sounds like a thread letting Theseus (or the
thesis) escape from a semantic maze,
but it hinges on one notion of a parameter.

Thus even with Wilcoxon-Mann-Whitney
and only minimal assumptions (continuity?) about what
kind of distributions are being postulated, the
common U statistic can be scaled to give an
estimate of pr(X > Y). Indeed Rich was one of
the people instrumental in getting StataCorp
to add the -porder- option to -ranksum-. I'd
want to regard this probability as a parameter
(property of the system or chance set-up which
can be estimated) and an estimate of it is sometimes
more interesting or useful than the U statistic or
its P-value. It's perhaps then just that
it is not a parameter which specifies a probability
distribution (i.e. distribution, mass or density function).

(Roger Newson would want me to point out that this
pr(X > Y) is just Somers' d in one of its many
guises. Shall I compare thee to a Somers' d?)
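
The scaling of the U statistic mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the two samples are made up, ties are counted as one half (a common convention that is not needed when the distributions are continuous), and `prob_x_greater` and `u_from_ranks` are hypothetical helper names, not the code behind -ranksum-.

```python
def prob_x_greater(xs, ys):
    """Estimate pr(X > Y) as U / (m * n), counting ties as one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u / (len(xs) * len(ys))


def u_from_ranks(xs, ys):
    """The same U from the rank sum of xs in the pooled sample (midranks for ties)."""
    pooled = sorted(xs + ys)

    def midrank(v):
        first = pooled.index(v) + 1           # first 1-based position of v
        last = first + pooled.count(v) - 1    # last 1-based position of v
        return (first + last) / 2.0

    rank_sum = sum(midrank(x) for x in xs)
    m = len(xs)
    return rank_sum - m * (m + 1) / 2.0


# Made-up samples, for illustration only.
xs = [3.1, 4.7, 5.2, 6.0]
ys = [2.0, 3.5, 4.7, 5.0, 5.5]

u = u_from_ranks(xs, ys)
print("U:", u)
print("U / (m*n):", u / (len(xs) * len(ys)))
print("pair-counting estimate of pr(X > Y):", prob_x_greater(xs, ys))
```

Both routes give the same number, which is the sense in which the U statistic "can be scaled" to a probability.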


Richard Goldstein wrote:

> I'm a little confused about what you mean by a parameter estimated
> via non-parametric methods; to me, non-parametric means that no
> parameter is estimated (yes, I distinguish between non-parametric
> and "distribution free")

*   For searches and help try: