# st: RE: Non-paramtetric test for survey data

 From: "Nick Cox"; Subject: st: RE: Non-paramtetric test for survey data; Date: Wed, 13 Apr 2005 17:50:34 +0100

```
Offlist someone commented to me

>> Poisson is the ``right" command to use with a nonnegative dep var.

to which I replied

>> I really would like to hear the case
>> for regarding costs as a discrete
>> variable. Let's suppose the data
>> are like USD 123,456.78. How should
>> that be treated?

to which in turn the answer was

>> Cents are certainly discrete, but the formulation of Poisson does not
>> require discrete data, just E[y]=exp(Xb).  See Wooldridge's Econometric
>> Analysis of Cross Section and Panel Data Ch. 19 for details.

I don't understand why the person concerned didn't
want this contribution to be public. I certainly expect
my arguments to be shot down if they are incorrect.

Anyway, there the suggestion is. It seems to me
that the spike at zeros still needs care. My
guess is that the rest of the distribution
doesn't start immediately at USD 0.01.

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: Nick Cox [mailto:n.j.cox@durham.ac.uk]
Sent: Wednesday, April 13, 2005 11:33 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Non-paramtetric test for survey data

This question was asked in slightly
different form on Saturday. It got one
reply. The question
seems to be based on various
misconceptions, which may be why no
one tried to answer the main question.

The first is the idea that marginal
normality is required for -svy- methods
and that the alternative must be some
non-parametric test. I don't know where
you got that idea. Nor is it clear
that the ideas behind -svy- can be
combined usefully with non-parametric
tests. In any case, an interest
in costs as the key response would
not seem to march at all with
degrading the data to ranks, and so
forth.

The second is that log transformation
could ever be a satisfactory solution for data
with a spike of zeros. Even with some
fudge like log(response + 1) a spike will
map to another spike. How problematic
that is will depend upon circumstances,
but transformation is of dubious relevance
here.

I don't know what you really need.
It might be that you need to model
those with non-zero costs and zero
costs separately. I suspect that what
you need involves a lot of programming
from somebody. I doubt that it is canned
anywhere.

Nick
n.j.cox@durham.ac.uk

anju parthan

> I am trying to compare if the total healthcare costs
> are different in those who missed work and those who
> did not miss work using lincom because I am using a
> survey data.
>
> The total healthcare costs variable is not normally
> distributed. A large proportion of individuals had
> zero costs.  I tried log transformation but it did not
> change the distribution. So I guess I have to use
> non-parametric tests.
>
> How can I use non-parametric tests with survey data?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
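Two points in the thread lend themselves to a small numerical illustration: that log(response + 1) maps a spike of zeros to another spike, and that zero and non-zero costs can be looked at separately. The sketch below is only that, a sketch. The cost values are made up for illustration, the code is Python rather than Stata, and it ignores survey weights and design entirely, so it is not a substitute for -svy- and not anything proposed in the thread itself.

```python
import math

# Hypothetical cost data in USD (invented for illustration):
# a spike of zeros plus a skewed positive part.
costs = [0.0] * 6 + [12.50, 80.00, 310.25, 4999.99]

# log(response + 1): every zero maps to log(1) = 0, so the spike
# survives the transformation with exactly the same share of the data.
logged = [math.log1p(c) for c in costs]
share_zero_raw = sum(c == 0 for c in costs) / len(costs)
share_zero_log = sum(v == 0 for v in logged) / len(logged)

# Two-part view: the chance of any cost at all, and the mean cost
# among those with non-zero costs. Their product is the overall mean.
positive = [c for c in costs if c > 0]
p_positive = len(positive) / len(costs)
mean_positive = sum(positive) / len(positive)
overall_mean = sum(costs) / len(costs)

print("share of zeros, raw vs logged:", share_zero_raw, share_zero_log)
print("P(cost > 0) * mean of positive costs:", p_positive * mean_positive)
print("overall mean:", overall_mean)
```

The share of zeros is unchanged by the transformation, which is the sense in which the spike remains. The second part shows why modelling the two pieces separately loses nothing about the mean: the overall mean is the proportion with non-zero costs times the mean among them. With real survey data both pieces would need to be estimated with the sampling weights and design taken into account.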