
From: Lee Sieswerda <Lee.Sieswerda@tbdhu.com>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: st: RE: output svymean
Date: Wed, 23 Jul 2003 19:00:14 -0400

I am not familiar with SAS. As Brian Sayer suggested, in Stata I would use the -post- or -file- commands to get an output dataset after -svytab-. Here's an example using -post- where I extract what I need from the estimates and variance matrices left behind by -svytab-. Then I calculate the confidence intervals based on the formula on page 85 of the [SVY] manual. I've included two samples: one using the subpop() option and one without. It will probably wrap badly after I send it, so you may have to manually unwrap some long lines.

Best,
Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
Lee.Sieswerda@tbdhu.com

**********************************
capture log close
cd d:\Data\nwosdus\2001
log using cage.log, replace
set more off
tempname memhold
tempfile cage
postfile `memhold' region grade sex percent stderr lowci upci N using cage, replace
gen dum = 1

* First, calculate for all grades and regions...
quietly svytab cage_2 dum, per se ci
local df = e(df_r)
local N = e(N)
matrix myV = e(V)
local se = sqrt(myV[1,1])
matrix myb = e(b)
local est = myb[1,2]
* Upper Confidence Limit
local up = log(`est'/(1-`est')) + ((invttail(`df',0.025)*`se')/(`est'*(1-`est')))
local upci = exp(`up')/(1+ exp(`up'))
* Lower Confidence Limit
local lo = log(`est'/(1-`est')) - ((invttail(`df',0.025)*`se')/(`est'*(1-`est')))
local lowci = exp(`lo')/(1+ exp(`lo'))
post `memhold' (0) (0) (0) (100*`est') (100*`se') (100*`lowci') (100*`upci') (`N')

* Then calculate by region...
* Subregions for subpop command already generated in crdata1.do
forvalues r = 1(1)3 {
    quietly svytab cage_2 dum, per se ci subpop(subreg`r')
    local df = e(df_r)
    local N = e(N_sub)
    matrix myV = e(V)
    local se = sqrt(myV[1,1])
    matrix myb = e(b)
    local est = myb[1,2]
    * Upper Confidence Limit
    local up = log(`est'/(1-`est')) + ((invttail(`df',0.025)*`se')/(`est'*(1-`est')))
    local upci = exp(`up')/(1+ exp(`up'))
    * Lower Confidence Limit
    local lo = log(`est'/(1-`est')) - ((invttail(`df',0.025)*`se')/(`est'*(1-`est')))
    local lowci = exp(`lo')/(1+ exp(`lo'))
    post `memhold' (`r') (0) (0) (100*`est') (100*`se') (100*`lowci') (100*`upci') (`N')
}
postclose `memhold'
set more on
log close
******************************************

-----Original Message-----
From: Rita Luk [mailto:Rita_Luk@camh.net]
Sent: Wednesday, July 23, 2003 9:39 AM
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: output svymean

Hello all,

I am hoping that you can help me solve an issue that is causing me to pull out what is left of my thinning hair. I normally use SAS, but am required to use STATA for a current project. I am trying to create a database of results produced from the survey tabulation procedure (svytab). I have included the code that I would use in SAS proc freq below, so that someone who is familiar with both packages can see what I am up to. However, because of the complex survey design I cannot use SAS for this project.

I plan on doing A LOT of single tabulations and cross tabulations and want to produce an output dataset that might look like:

OBS  NAME1  VALUE1  NAME2  VALUE2  COUNT  PERCENT1  PERCENT2
1    sex    female  .      .       50     0.25      .
2    sex    male    .      .       150    0.75      .
3    sex    female  age    old     25     0.50      0.33
4    sex    female  age    young   25     0.50      0.20
5    sex    male    age    old     50     0.33      0.67
6    sex    male    age    young   100    0.67      0.80

As you can see, this dataset would result from running two different tabulations. The first two observations would come from a single variable tabulation of sex.
The subsequent 4 observations (3-6) would have resulted from a cross tabulation of the dichotomous variables age and sex. Ideally I want my program/macro to be able to handle any categorical variables automatically, regardless of the number of categories.

In Stata, by using several e() and matrix commands, I can get the percents, and with a little data manipulation the counts and names, but I cannot get the values in a variable form. In addition, I can only do this for the crosstabs (svytab), but not for the single variable frequencies, because the svyprop command does not seem to give me saved estimates. Furthermore, the amount of coding needed just to get the percents and counts seems excessive. There must be a quicker way. I am doing something like:

*this gives me the row percentages in variable form (unfortunately attached to my raw database)
svytab sex age, row
matrix mrowpct=e(b)'
svmat mrowpct, name(rowpct)

*this gives me the col percentages in variable form (unfortunately attached to my raw database)
svytab sex age, col
matrix mcolpct=e(b)'
svmat mcolpct, name(colpct)

and that is just to get TWO of the variables in my output dataset!!! There has got to be a simpler way, since in SAS just one option will give a full output dataset of everything I need; all one needs to do (as demonstrated below) is massage the output to look the way I want. Stata must have an analogous feature!

I am not asking for someone to code this for me. I need to learn how to use this program. I just need some hints as to how to do this, commands etc. (i.e., is there a simple way to get output datasets other than piecing together a whole bunch of matrices?)

Thanks for any help you can provide,
Charles

For those that want to see what I would do in SAS, to get an idea of what I am doing here:

I would first create the macro (I think Stata users call these programs?)
/**FOR SINGLE VARIABLE TABLES USE CODE=1, FOR CROSSTABS USE CODE=2**/
%macro outdata (var1, var2,code);
  %if &code=1 %then %do;
    proc freq data=mydata;
      tables &var1 / out=predata;
    run;
  %end;
  %if &code=2 %then %do;
    proc freq data=mydata;
      tables &var1*&var2 / out=predata outpct;
    run;
  %end;
  data predata;
    set predata;
    rename &var1=VALUE1 &var2=VALUE2 pct_row=PERCENT1 pct_col=PERCENT2;
    NAME1="&var1";
    NAME2="&var2";
  run;
  data final;
    set final predata;
  run;
%mend;

Then run the macro on my data:

%outdata (sex, ,1);
%outdata (sex,age,2);

This program will work regardless of missing values, categories, or number of categories to produce the output that I want. In fact, almost all of the output that I want is created simply by adding the "out=" option to the tables statement of proc freq. The rest of the program is simply changing the names. Essentially I am hoping that Stata has the equivalent of the "OUT=" option for its svy commands, but I cannot seem to find it.

Thanks,
Charles

_______________________________________________
J. Charles Victor BSc, MSc, PhD (candidate)
Department of Public Health Sciences
University of Toronto
Toronto, Ontario

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
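
Note on the confidence limits in the -post- example: the code builds the interval on the logit scale and then transforms it back to a proportion. As a sketch of what those lines compute, read directly off the code rather than quoted from the [SVY] manual, and using notation introduced here for illustration: \hat{p} is the estimated proportion (`est'), \hat{s} its standard error (`se'), \nu the design degrees of freedom (e(df_r)), and t_{0.975,\nu} the value returned by invttail(`df',0.025).

```latex
\[
\hat{f} = \ln\!\left(\frac{\hat{p}}{1-\hat{p}}\right),
\qquad
f_{\pm} = \hat{f} \pm \frac{t_{0.975,\nu}\,\hat{s}}{\hat{p}\,(1-\hat{p})},
\qquad
\left[\,p_{\text{low}},\; p_{\text{up}}\,\right]
= \left[\frac{e^{f_{-}}}{1+e^{f_{-}}},\; \frac{e^{f_{+}}}{1+e^{f_{+}}}\right]
\]
```

Here f_{-} and f_{+} correspond to the locals `lo' and `up', and p_{low} and p_{up} to `lowci' and `upci', which the code multiplies by 100 before posting.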
