# Re: st: median regressin and survey data!

 From: "Stas Kolenikov"
 To: statalist@hsphsun2.harvard.edu
 Subject: Re: st: median regressin and survey data!
 Date: Sat, 29 Mar 2008 11:15:56 -0500

```Short answer: neither one fits well enough into the paradigm of survey
sampling, so coming up with a fully justifiable implementation is not
straightforward.

For the Mann-Whitney test: all rank tests implicitly assume the data are
i.i.d., and I don't think very clear analogies are possible with
survey data. There are no estimating equations to work with; you
probably would be able to get the distribution of the test statistic
over repeated sampling, but it won't be nearly as nice as the textbook
distribution.

For the second one, -qreg- is a heavily model-based concept: that for
any combination of explanatory variables, there's a well defined
distribution of responses over which the median can be computed. The
straight design perspective, on the other hand, says that there are
only so many individuals in the finite population, so there is no talk
about conditional distributions. So one needs to invent some sort of a
hybrid framework to incorporate both model and design ideas, and they
don't always go hand in hand. A basic introduction to the subject is the
chapter by Binder and Roberts in the 2003 book Analysis of Survey Data
(http://doi.wiley.com/10.1002/0470867205.ch3) -- I say introductory
because they consider the simplest possible situations, but they still
operate with big-O small-O in probability. Conceptually, it should
still be possible to formulate median regression for sample surveys,
as it is linked to a minimization problem, and thus can be cast in
terms of estimating equations. Then you need to say, "If I had the
full population, I would run this same median regression on it, and
get some numbers from this census estimation procedure. Now, what I
can hope for with the sample is that my estimates are going to be
consistent for those numbers that came out of the census problem". I
don't really know if that was done for quantile regression; for linear
regression, the comparable result goes back to the mid-1970s and is due
to Wayne Fuller, and for generalized linear models, to David Binder's
1983 paper. Median regression is somewhat trickier though, as the
function being minimized, the sum of absolute deviations, is not
differentiable, so standard tools like the delta method are not
applicable.
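
A rough sketch of that census-versus-sample setup, in LaTeX notation.
The symbols U, s, w_i, B and \hat{b} are assumed here for illustration
only and are not taken from the references above:

  B       = \arg\min_b \sum_{i \in U} |y_i - x_i' b|       (census problem)
  \hat{b} = \arg\min_b \sum_{i \in s} w_i |y_i - x_i' b|   (weighted sample)

Here U is the finite population, s is the sample, and w_i is the
probability weight of unit i. The matching estimating equations would
be, roughly,

  \sum_{i \in s} w_i x_i \mathrm{sign}(y_i - x_i' \hat{b}) \approx 0,

which can only hold approximately because the objective has kinks
wherever a residual is zero. The hoped-for property is that \hat{b} is
consistent for B under repeated sampling from the design.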

On 3/29/08, Mohammed El Faramawi <melfaram@yahoo.com> wrote:
> Hi,
>  I am trying to run non-parametric tests using survey
>  data ( probability weighted). unfortunately I can not
>  find commands which takes into the consideration the
>  pweight. I am interested in qreg (median regression)
>  and Mann-Whitney test. Is there any way to do this by
>  Stata? Thank you
>  Mohammed Faramawi, MD,Phd,MPH,Msc

--
Stas Kolenikov, also found at http://stas.kolenikov.name