I realize this topic has been addressed in many forms on Statalist,
so my question is more one of 'state of Stata': has a user-written
program been developed recently, or is one planned?
I have data for which I need to regularly report summary stats by a
categorical variable. I can't seem to get the number of missing values
reported consistently for each category. For example, I have a continuous
variable, CD8 counts (cd8a), and a categorical variable, HIV status (hivst),
which have the following missing values:
. nmissing hivst
hivst 1
. by hivst: nmissing cd8a
-> hivst = hiv negative
cd8a 10
-> hivst = hiv positive
cd8a 7
However, if I use:
. tabstat cd8a, by(hivst) s(n mean sd p50 min max) f(%9.2f) long miss col(variables)
I get:
HIV Status      stats |  CD8 Absolute
----------------------+--------------
HIV Negative        N |           101
                 mean |        546.33
                   sd |        459.88
                  p50 |        469.00
                  min |          0.00
                  max |       3768.00
----------------------+--------------
HIV Positive        N |            67
                 mean |        782.72
                   sd |        508.08
                  p50 |        629.00
                  min |          0.00
                  max |       3621.00
----------------------+--------------
Missing             N |             1
----------------------+--------------
Total               N |           169
                 mean |        638.48
                   sd |        491.39
                  p50 |        536.00
                  min |          0.00
                  max |       3768.00
-------------------------------------
Is there now a way to get an extra line under HIV Negative and HIV Positive
showing the number of missing cd8a values in each category, without doing
this manually for several dozen variables? This would also give me a
consistent N for Total.
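For reference, the per-category counts I am after can be produced separately
along the lines below. This is only a rough sketch: the cd8a_miss indicator
is a made-up helper name, and I am assuming the other measurement variables
would simply be added to the varlist in the loop. It gives the counts in a
table of their own rather than as an extra row in the tabstat output above,
which is what I would really like.

```stata
* Sketch only: the *_miss indicators are hypothetical helper variables
foreach v of varlist cd8a {    // add the other measurement variables here
    generate byte `v'_miss = missing(`v')    // 1 if missing, 0 otherwise
}

* Sum of each indicator = number of missing values within each hivst level;
* the missing option keeps the observations where hivst itself is missing
tabstat *_miss, by(hivst) statistics(sum) format(%9.0f) missing
```

Note that *_miss picks up every existing variable whose name ends in _miss,
so the suffix would need to be changed if such variables are already present.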