Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: RE: Survey and -catplot-


From   "Scholes, Shaun" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: RE: Survey and -catplot-
Date   Sun, 22 Jan 2012 13:06:24 +0000

Hi. Just to further clarify what I meant below...this Stata FAQ on the UCLA website:

http://www.ats.ucla.edu/stat/stata/faq/sample_survey_setups.htm#NHANES_III

uses pweights to analyse NHANES data (as does the Stata manual for survey data). 

Best wishes
Shaun
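
For readers following along, here is a minimal sketch of the kind of pweight-based setup that FAQ describes. The PSU and stratum variable names sdmvpsu and sdmvstra are assumptions (they are not given anywhere in this thread); wtmeccombined is the combined sampling weight mentioned further down, and the sketch does not check whether that weight was constructed appropriately for pooled survey cycles.

```stata
* Sketch only: sdmvpsu and sdmvstra are assumed names for the PSU and
* stratum variables; wtmeccombined is the weight named in this thread.
svyset sdmvpsu [pweight = wtmeccombined], strata(sdmvstra)

* Describe the declared design (strata, PSUs, observations)
svydescribe

* Design-based proportions for one subpopulation, by survey cycle
svy, subpop(sub_1840): proportion abdobes, over(sddsrvyr)
```

Once the design is declared with svyset, the svy prefix picks up the pweights automatically, so they do not need to be repeated on each estimation command.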









-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Harini Sarathy
Sent: 21 January 2012 19:15
To: [email protected]
Subject: Re: st: RE: Survey and -catplot-

Dear Shaun, and Nick

Thanks a lot. Shaun's suggestion worked perfectly. If you don't mind explaining, though: what exactly does it mean / do?

Thanks a lot, again. Much appreciated.
Harini
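
As a hedged reading of the two commands quoted below: collapse replaces the data in memory with one row per combination of sddsrvyr and sub_all, holding the weighted mean of abdobes. Because abdobes is coded 0/1, that mean is the weighted proportion with abdominal obesity in each cell, and its point estimate should match the svy: prop output (without standard errors). The catplot call then supplies 100 times that proportion as the weight for each row, following the pattern in the catplot help, so that the bars show percents rather than counts of rows.

The same idea can be sketched with official commands only. This is a swapped-in alternative, not the command used in the thread; pct is a hypothetical variable name, and the !missing(sub_all) condition is an assumption to drop respondents outside the three age groups.

```stata
* Sketch only: pct is a hypothetical name; collapse destroys the data
* in memory, hence preserve/restore around it.
preserve

* One row per survey cycle x age group, holding the weighted mean of the
* 0/1 indicator, i.e. the weighted proportion with abdominal obesity
collapse (mean) abdobes if !missing(sub_all) [aweight = wtmeccombined], ///
    by(sddsrvyr sub_all)

* Express the proportion as a percent and inspect it against svy: prop
generate pct = 100 * abdobes
list sub_all sddsrvyr pct, sepby(sub_all)

* Plot the percents as they are: one cluster per age group,
* one bar per survey cycle
graph bar (asis) pct, over(sddsrvyr) over(sub_all) asyvars ytitle("%")

restore
```

Listing the collapsed data before graphing makes it easy to confirm that the plotted values agree with the survey estimates.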



On Sat, Jan 21, 2012 at 12:53 PM, Scholes, Shaun <[email protected]> wrote:
> Ok, looking at the help for catplot I think it may help to collapse the data before using catplot? Looking at the titanic example, you may need something like:
>
> collapse abdobes [aweight=wtmeccombined], by(sddsrvyr sub_all)
> catplot sddsrvyr sub_all [aweight=100*abdobes], asyvars
>
> But this doesn't mean I endorse aweights!
>
> Hope this helps
> Shaun
>
>
>
>
>
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Harini Sarathy
> Sent: 21 January 2012 16:57
> To: [email protected]
> Subject: Re: st: RE: Survey and -catplot-
>
> Shaun,
>
> Thanks for responding. I am using the same weight I used for the svy analysis (I've sort of combined data from NHANES-III and the continuous NHANES, so I created a new sampling weight).
>
> While looking online for catplot help, I found someone suggesting that I should use 'aweight' instead of 'pweight'. Stata also gives an error if 'pweight' is used, so I guess 'aweight' is the correct usage for weight.
>
> I used the same sampling weight for aweight - frankly I don't know the difference and what it means, and I just sort of aped the example.
>
>
>
> Harini
>
>
>
> On Sat, Jan 21, 2012 at 11:48 AM, Scholes, Shaun <[email protected]> wrote:
>> Are you sure the weight specified in the catplot command was the same weight that was used in your svy commands? I have not analysed NHANES myself but I would expect to use a pweight rather than aweight?
>> Can you check this using svydes (I'm assuming this is individual level NHANES data)?
>> Best wishes
>> Shaun
>>
>>
>>
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
>> Sent: 21 January 2012 15:58
>> To: [email protected]
>> Subject: st: Survey and -catplot-
>>
>> Harini Sarathy <[email protected]> had difficulties sending this to the list. I will look at it shortly myself but anyone is naturally free to answer first.
>>
>> Nick
>>
>> I'm doing survey analysis with NHANES data from 1988-2008 and have been trying to use -catplot- (SSC) to show trends in abdominal obesity over the survey years (sddsrvyr) across age groups (agegrp):
>>
>> Abdominal obesity (abdobes) is a binary variable (0 "Normal", 1 "Abdominal Obesity").
>>
>> sddsrvyr
>> 1988-1996 : 1
>> 1999-2000 : 2
>> 2001-02   : 3
>> 2003-04   : 4
>> 2005-06   : 5
>> 2006-07   : 6
>>
>> For my analysis I created subpopulations for the age groups: sub_0812, sub_1317, sub_1840 (These subpopulations had complete data on variables of interest).
>>
>> The big picture: I have one binary variable (abdobes), two 
>> categorical variables (sub_0812/sub_1317/sub_1840 & sddsrvyr). I want 
>> to show the increasing trend in abdominal obesity over the survey 
>> years within each group - but I only want to show it for abdobes==1
>>
>> Proportions of obesity
>>
>> . svy: prop abdobes, sub(sub_0812) over(sddsrvyr)
>> . svy: prop abdobes, sub(sub_1317) over(sddsrvyr)
>> . svy: prop abdobes, sub(sub_1840) over(sddsrvyr)
>>
>> sub_0812: Abd Obese==1
>>
>> _sddsrvyr_1    .1099138
>> _sddsrvyr_2    .1972264
>> _sddsrvyr_3    .205952
>> _sddsrvyr_4    .2562671
>> _sddsrvyr_5    .2243748
>> _sddsrvyr_6    .2589271
>>
>>
>> sub_1317: Abd Obese==1
>>
>> _sddsrvyr_1    .1288447
>> _sddsrvyr_2    .1717575
>> _sddsrvyr_3    .1773453
>> _sddsrvyr_4    .2003957
>> _sddsrvyr_5    .1790184
>> _sddsrvyr_6    .2129547
>>
>>
>> sub_1840: Abdo Obese==1
>>
>> _sddsrvyr_1    .2576976
>> _sddsrvyr_2    .3403194
>> _sddsrvyr_3    .3599359
>> _sddsrvyr_4    .3894223
>> _sddsrvyr_5    .3934921
>> _sddsrvyr_6     .394528
>>
>> For the purposes of a graph, I created a variable sub_all to 
>> represent all age-groups
>>
>> gen sub_all=0 if sub_0812==1
>> replace sub_all=1 if sub_1317==1
>> replace sub_all=2 if sub_1840==1
>>
>> The catplot command I used does not give me the graph I expected. Can you point out where I went wrong?
>>
>>
>> catplot sddsrvyr sub_all [aweight=wtmeccombined] if abdobes==1, ///
>>     percent(sub_all) asyvars ///
>>     bar(1, bcolor(red)) bar(2, bcolor(midgreen)) bar(3, bcolor(sandb)) ///
>>     bar(4, bcolor(pink)) bar(5, bcolor(ebblue)) bar(6, bcolor(orange)) ///
>>     vertical ///
>>     title("Trends in Abdominal Obesity in NHANES population from 1988 to 2008 across age-groups", size(medsmall)) ///
>>     ytitle(%)
>>
>> According to the graph, I'm putting down approximations here
>>
>> sub_0812: Abd Obese==1
>>
>> _sddsrvyr_1    .18
>> _sddsrvyr_2    .13
>> _sddsrvyr_3    .165
>> _sddsrvyr_4    .18
>> _sddsrvyr_5    .17
>> _sddsrvyr_6    .175
>>
>>
>> sub_1317: Abd Obese==1
>>
>> _sddsrvyr_1    .22
>> _sddsrvyr_2    .135
>> _sddsrvyr_3    .1475
>> _sddsrvyr_4    .17
>> _sddsrvyr_5    .165
>> _sddsrvyr_6    .175
>>
>>
>> sub_1840: Abdo Obese==1
>>
>> _sddsrvyr_1    .25
>> _sddsrvyr_2    .1475
>> _sddsrvyr_3    .145
>> _sddsrvyr_4    .15
>> _sddsrvyr_5    .16
>> _sddsrvyr_6    .15
>>
>> Given the values from the analysis, I'd expect an increasing trend in each age group. For example, in age group 18-40 I expected it to go from 25.7% to 39.4%, whereas the graph shows something different.
>>
>> I know the problem lies in creating the variable "sub_all", and it does not seem to capture the information for the individual age groups.
>> Does anyone have any ideas about what went wrong? And what is the way to correct it?
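
One possible reading of the discrepancy, offered as a sketch rather than a confirmed diagnosis: with if abdobes==1 and percent(sub_all), the command keeps only respondents with abdominal obesity and then shows, within each age group, the share of them that falls in each survey cycle. That is a different quantity from the prevalence reported by svy: prop. It is consistent with the approximate bar heights above adding up to about 1 within each age group (for ages 8-12: .18 + .13 + .165 + .18 + .17 + .175 = 1.00). The two quantities can be compared side by side with tabulate, using the variable names from this message; aweights give point estimates only, with no design-based standard errors.

```stata
* Share of obese respondents by survey cycle within each age group:
* each sub_all column sums to 100, like the bars in the graph
tabulate sddsrvyr sub_all [aweight = wtmeccombined] if abdobes == 1, ///
    column nofreq

* Prevalence of abdominal obesity by survey cycle for one age group
* (sub_all == 2 is ages 18-40): each row sums to 100 across abdobes
tabulate sddsrvyr abdobes [aweight = wtmeccombined] if sub_all == 2, ///
    row nofreq
```

If the first table reproduces the graph and the second reproduces the svy: prop figures, the issue is the choice of percentage base rather than the construction of sub_all.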
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/