# Re: st: Median test & ANOVA with sampling weights

From: Steven Samuels
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Median test & ANOVA with sampling weights
Date: Fri, 19 Sep 2008 14:11:24 -0400

hafida--

You've given us very little information about your survey sample and its design. More would have been helpful.

You appear to be misusing the terms "sample" and "population". A "population" is the larger group of people represented by the sample; statistics for a population are known from outside sources such as a census. For example, in the U.S. a sample of 1500 people might represent the population of millions. What you are calling "sample" and "population" appear to be, respectively, one subgroup of a sample (those with dmstat=1) and the entire sample.

The proper way to compare one subgroup to the whole group is to compare the subgroup to the others. So, form two groups: group = 1 if dmstat is 1, and group = 2 if dmstat is not 1 (the rest of the sample).
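A minimal sketch of that grouping step, assuming dmstat is numeric. The variable name `group` is my own choice, and the treatment of missing values is an assumption you may want to change:

```stata
* group = 1 for the subgroup of interest (dmstat == 1), 2 for the rest of the sample.
* Observations with missing dmstat are left missing rather than placed in group 2.
generate byte group = cond(dmstat == 1, 1, 2) if !missing(dmstat)

* Cross-check the new variable against the original coding
tabulate dmstat group, missing
```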

-pctile- will estimate weighted medians, but the CI's will not be correct, for they assume independent observations. To proceed, you must know the sampling design, including cluster and stratum information. The program -cendif- by Roger Newson (-findit cendif-) will estimate differences in the medians and accommodates sampling weights and clustering. The sign test, in contrast, is for a set of paired independent observations, not for any list of paired numbers.
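As a rough illustration only, a -cendif- call for the median difference between the two groups might look like the lines below. `group` is the two-level variable described above and `psuid` is a placeholder for whatever variable identifies your clusters; neither name comes from your data. The option names are the ones I recall from the package's help file, so check -help cendif- after installing before relying on them.

```stata
* Locate and install the user-written package first
* findit cendif

* Median (50th centile) difference in o4gh between the two groups,
* using the sampling weights and allowing for clustering.
* psuid is a placeholder for the cluster (PSU) identifier.
cendif o4gh [pweight = o1wtarea], by(group) centile(50) cluster(psuid)
```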

To do ANOVA, you must first -svyset- your data and use -svy: reg-. There is nothing special about -svy: reg-; just set up the ANOVA as you would with ordinary -reg-. To compare individual groups to one another, run -test- after the regression, with the option -mtest(holm)- or -mtest(sidak)-.
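A sketch of those steps, under several assumptions: `psuid` and `stratid` are placeholder names for your cluster and stratum variables, dmstat is assumed to be the grouping factor with levels 1, 2, and 3, and the `i.` factor-variable notation needs Stata 11 or later (earlier versions would need the -xi:- prefix or hand-made indicator variables).

```stata
* Declare the survey design; psuid and stratid are placeholder names
svyset psuid [pweight = o1wtarea], strata(stratid)

* One-way ANOVA set up as a regression on group indicators
svy: regress o4gh i.dmstat

* Overall test that the group means are equal
testparm i.dmstat

* Each non-base group against the base group, with Holm-adjusted p-values
* (levels 2 and 3 are assumed; use the levels your variable actually has)
test (2.dmstat = 0) (3.dmstat = 0), mtest(holm)
```

Replacing `holm` with `sidak` gives the Sidak adjustment instead.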

Your post shows that you are fairly new to sampling concepts. Before proceeding, I suggest that you look at a good text; I recommend "Sampling Design and Analysis", by Sharon Lohr. Your faculty may be able to suggest local resources.

-Steve

On Sep 19, 2008, at 7:53 AM, Nur.Hikmayani@studentmail.newcastle.edu.au wrote:

I'm using survey data and wonder how I can compare the median in the sample with the median in the population. The medians were obtained separately using -pctile- or -_pctile-.

. pctile pctGH = o4gh [pw=o1wtarea], nq(4) genp(percent)
. list percent pct in 1/4
+-----------------+
| percent pctGH |
|-----------------|
1. | 25 50 |
2. | 50 67 |
3. | 75 77 |
4. | . . |
+-----------------+

. pctile pctileGH1 = o4gh if dmstat==1 [pw=o1wtarea], nq(4) genp(pctGH1)
. list pctGH1 pctileGH1 in 1/4
+------------------+
| pctGH1 pctileGH1 |
|------------------|
1. | 25 40 |
2. | 50 60 |
3. | 75 72 |
4. | . . |
+------------------+

Should I first calculate the difference between each value in the sample and the population median, and then carry out a sign test? If so, how is the sampling weight taken into account? (I mean, can I subtract the weighted population median from each 'unweighted' value?)

Secondly, is it possible to perform one-way ANOVA with sampling weights, particularly for post-hoc comparisons? Using -svy: regress- did not give enough information.