The Stata listserver

st: RE: Non-parametric test for survey data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Non-parametric test for survey data
Date   Wed, 13 Apr 2005 17:50:34 +0100

Offlist someone commented to me 

>> Poisson is the "right" command to use with a nonnegative dep var.
>> Start with -whelp svypoisson-

to which I replied 

>> I really would like to hear the case
>> for regarding costs as a discrete 
>> variable. Let's suppose the data 
>> are like USD 123,456.78. How should 
>> that be treated? 

to which in turn the answer was 

>> Cents are certainly discrete, but the formulation of Poisson does not
>> require discrete data, just E[y]=exp(Xb).  See Wooldridge's Econometric
>> Analysis of Cross Section and Panel Data Ch. 19 for details.

I don't understand why the person concerned didn't 
want this contribution to be public. I certainly expect
my arguments to be shot down if they are incorrect
or misleading. 

Anyway, there is the suggestion. It seems to me 
that the spike at zero still needs care. My 
guess is that the rest of the distribution 
doesn't start immediately at USD 0.01. 
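
To make the offlist point concrete, here is a minimal sketch, in Python rather than Stata and purely as arithmetic, of solving the Poisson score equations for E[y]=exp(Xb) when y is a cost in dollars and cents. The data, the variable names and the function fit_poisson_dummy are made up for illustration. The sketch ignores the survey design entirely and says nothing about standard errors, so it shows only that the estimating equations do not need integer data.

```python
import math

# Made-up costs in USD (with cents, and several zeros); not real data.
costs = [0.0, 0.0, 0.0, 125.50, 980.25, 0.0, 0.0, 2310.75, 15400.10, 123456.78]
# Hypothetical 0/1 indicator: 1 = missed work, 0 = did not miss work.
missed_work = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

def fit_poisson_dummy(y, d, iterations=50):
    """Solve the Poisson score equations for E[y] = exp(b0 + b1*d), d in {0, 1},
    by Newton's method. Nothing here requires y to be an integer."""
    b0 = math.log(sum(y) / len(y))  # start at the log of the overall mean
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * di) for di in d]
        # Score vector: sum of x * (y - mu) for x = (1, d).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(di * (yi - mi) for yi, mi, di in zip(y, mu, d))
        # Information matrix: sum of mu * x x'. Since d is 0/1, d*d = d.
        h00 = sum(mu)
        h01 = sum(di * mi for di, mi in zip(d, mu))
        h11 = h01
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_poisson_dummy(costs, missed_work)
print("fitted mean, no missed work:", math.exp(b0))
print("fitted mean, missed work:   ", math.exp(b0 + b1))
```

With an intercept and a single 0/1 indicator the score equations force the fitted means to equal the two group means, here 221.15 and 28233.526, zeros included. That is the sense in which only E[y]=exp(Xb) matters. It does not by itself settle whether the spike at zero is being handled well.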

Nick 
n.j.cox@durham.ac.uk 

-----Original Message-----
From: Nick Cox [mailto:n.j.cox@durham.ac.uk]
Sent: Wednesday, April 13, 2005 11:33 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Non-parametric test for survey data


This question was asked in slightly 
different form on Saturday. It got one 
reply, not answered here. It also 
seems to be based on various 
misconceptions, which may be why no 
one tried to answer the main question. 

The first is the idea that marginal 
normality is required for -svy- methods
and that the alternative must be some 
non-parametric test. I don't know where 
you got that idea. Nor is it clear
that the ideas behind -svy- can be 
combined usefully with non-parametric 
ideas. In your case, your interest 
in costs as the key response would 
not seem to march at all with 
degrading the data to ranks, and so 
forth. 

The second is the idea that log transformation 
could ever be a satisfactory solution for data 
with a spike of zeros. Even with some 
fudge like log(response + 1) a spike will 
map to another spike. How problematic
that is will depend upon circumstances, 
but transformation is of dubious relevance
here. 
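
A small check of the spike-to-spike point, as a sketch in Python with made-up costs (the helper share_at_mode is hypothetical, written only for this illustration): any one-to-one transformation sends all the tied zeros to a single value, so the share of observations in the spike is unchanged.

```python
import math

# Made-up costs in USD with a spike of zeros; not real data.
costs = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 125.50, 980.25, 2310.75, 123456.78]

def share_at_mode(values):
    """Return (most common value, share of observations taking that value)."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    mode = max(counts, key=counts.get)
    return mode, counts[mode] / len(values)

# The log(response + 1) fudge: every zero maps to log(1) = 0.
logged = [math.log(c + 1) for c in costs]

print("raw:   ", share_at_mode(costs))
print("logged:", share_at_mode(logged))
```

Six of the ten values sit at zero before the transformation and six of the ten sit at zero after it. The positive costs are pulled in, but the spike itself is untouched.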

I don't know what you really need. 
It might be that you need to model 
those with non-zero costs and zero 
costs separately. I suspect that what
you need involves a lot of programming 
from somebody. I doubt that it is canned 
anywhere. 
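
As a rough sketch of what treating zero and non-zero costs separately could mean, here is the bare arithmetic in Python with made-up data. The group names and the function two_part_summary are hypothetical. It ignores weights, strata and clusters, so it is a description of the idea only, not a survey-valid comparison or test.

```python
# Made-up costs in USD for two hypothetical groups; not real data.
no_missed_work = [0.0, 0.0, 0.0, 125.50, 980.25]
missed_work = [0.0, 0.0, 2310.75, 15400.10, 123456.78]

def two_part_summary(costs):
    """Split a cost variable into the share with cost > 0
    and the mean cost among those with cost > 0."""
    positive = [c for c in costs if c > 0]
    share_positive = len(positive) / len(costs)
    mean_positive = sum(positive) / len(positive) if positive else 0.0
    return share_positive, mean_positive

for label, group in [("no missed work", no_missed_work), ("missed work", missed_work)]:
    share, mean_pos = two_part_summary(group)
    # The overall mean is the product of the two parts.
    print(label, round(share, 3), round(mean_pos, 2), round(share * mean_pos, 2))
```

The two groups can then differ in the chance of having any cost, in the size of costs when there are any, or in both. Carrying the survey design through both parts is where the programming would come in.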

Nick 
n.j.cox@durham.ac.uk 

anju parthan
 
> I am trying to compare if the total healthcare costs
> are different in those who missed work and those who
> did not miss work using lincom because I am using
> survey data.  
> 
> The total healthcare costs variable is not normally
> distributed. A large proportion of individuals had
> zero costs.  I tried log transformation but it did not
> change the distribution. So I guess I have to use
> non-parametric tests.  
> 
> How can I use non-parametric tests with survey data?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
