The Stata listserver

st: RE: RE: RE: Using egen or similar


From   Lee Sieswerda <Lee.Sieswerda@tbdhu.com>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: Using egen or similar
Date   Mon, 25 Aug 2003 10:57:38 -0400

You're welcome, Amani. I was thinking about it on the drive home on Friday,
and it occurred to me that I forgot to mention that the -merge- portion is
only really necessary if you want to see categories for which there are zero
observations.

Lee

Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
Lee.Sieswerda@tbdhu.com




-----Original Message-----
From: siyama@who.int [mailto:siyama@who.int] 
Sent: Saturday, August 23, 2003 10:38 AM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: Using egen or similar


Dear Lee,

Many thanks for your help. I think I solved the problem with your suggestion
to use "egen, concat". Vars m1-m6 are coded 0/1, so I started with the
following:

for X in num 2/6: recode mX 1=X 
egen new=concat(m1 m2 m3 m4 m5 m6)

and I did get the desired output as follows:

        new |      Freq.     Percent        Cum.
------------+-----------------------------------
     000000 |         21        7.07        7.07
     100000 |          2        0.67        7.74
     103000 |          1        0.34        8.08
     103456 |          3        1.01        9.09
     123000 |          3        1.01       10.10
     123400 |          2        0.67       10.77
     123450 |          2        0.67       11.45
     123456 |        263       88.55      100.00
------------+-----------------------------------
      Total |        297      100.00
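
For readers on more recent Stata releases, the same two steps are commonly written with -forvalues- instead of -for-. This is only a minimal sketch: the three rows typed in with -input- are made-up illustrative data, not the real dataset.

```stata
* Toy data: three illustrative rows, not the real dataset
clear
input byte(m1 m2 m3 m4 m5 m6)
1 1 1 0 0 0
1 0 1 1 1 0
0 0 0 0 0 0
end

* Recode each 1 to the variable's position (m1 is already coded 0/1)
forvalues x = 2/6 {
    recode m`x' (1 = `x')
}

* Concatenate into one string pattern and tabulate
egen new = concat(m1-m6)
tabulate new
```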

Once again, my thanks for your good suggestion and time.

Amani


-----Original Message-----
From: Lee Sieswerda [mailto:Lee.Sieswerda@tbdhu.com]
Sent: Saturday, 23 August 2003 00:00
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: RE: Using egen or similar


I think this will give you pretty close to what you want.

* Reduce your master data to only the six
* variables of interest, and concatenate
* into a new variable
use your_master_data.dta
keep m1-m6
egen bincats = concat(m1-m6)
keep bincats
sort bincats
save temp1.dta, replace

* Generate a new dataset with all
* possible categories, in this case
* you've got 2^6 possible categories

clear
set obs 64
gen cat1 = 0
gen cat2 = 0
gen cat3 = 0

* Okay, this block is a hack job. I know
* there is a nice algorithm for 
* generating all of the permutations
* of a binary outcome over n events, but
* I can't put my hands on it right now
replace cat1 = 1 in 33/64
replace cat2 = 1 in 17/32
replace cat2 = 1 in 49/64
replace cat3 = 1 in 9/16
replace cat3 = 1 in 25/32
replace cat3 = 1 in 41/48
replace cat3 = 1 in 57/64
egen cat4 = fill(0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1)
egen cat5 = fill(0 0 1 1 0 0 1 1)
egen cat6 = fill(0 1 0 1)
egen bincats = concat(cat1-cat6)

* Now generate the kind of
* labels you want, more or less
forvalues n = 1/6 {
	replace cat`n' = cat`n'*`n'
}
egen numcats = concat(cat1-cat6)
keep numcats bincats

* Now merge your original data back into these
* categories
sort bincats
merge bincats using temp1.dta

* Drop the categories not needed
drop if _merge==1

* And tabulate
tab numcats
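
The hand-written -replace- and -fill()- block above can also be produced arithmetically. The following is a sketch under one assumption: the row number minus one is read as a six-digit binary number, with cat1 as the leading digit. It is one possible approach and not necessarily the algorithm referred to in the comments.

```stata
* All 2^6 = 64 patterns of six 0/1 variables
clear
set obs 64
forvalues n = 1/6 {
    * Binary digit n (counting from the left) of _n - 1
    gen byte cat`n' = mod(floor((_n - 1) / 2^(6 - `n')), 2)
}
egen bincats = concat(cat1-cat6)
list bincats in 1/4
```

This should reproduce the same ordering as the manual version: cat1 switches to 1 from row 33 onward, and cat6 alternates 0, 1 on every row.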


There may also be a clever way to do this with string functions rather than
with -merge-. 
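
One way the string-function idea might look is sketched below. It builds the label directly from m1-m6, so no second dataset or -merge- is involved. The toy rows are made up for illustration, and the sketch assumes missing values should be shown as "0".

```stata
* Toy data: three illustrative rows, not the real dataset
clear
input byte(m1 m2 m3 m4 m5 m6)
1 1 1 0 0 0
1 0 1 1 1 0
0 0 0 0 0 0
end

* Append the position number where the variable is 1, else "0"
* (assumption: anything other than 1, including missing, becomes "0")
gen str6 numcats = ""
forvalues n = 1/6 {
    replace numcats = numcats + cond(m`n' == 1, "`n'", "0")
}
tabulate numcats
```

Because the original 0/1 variables are left untouched, this version does not need the -recode- step. Like any plain -tabulate-, it will not list patterns that have zero observations.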


Regards,
Lee

Lee Sieswerda, Epidemiologist
Thunder Bay District Health Unit
Lee.Sieswerda@tbdhu.com





-----Original Message-----
From: siyama@who.int [mailto:siyama@who.int] 
Sent: Friday, August 22, 2003 2:43 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: Using egen or similar


Dear Stata-listers,

I have six binary variables (m1,..,m6) coded 0/1

I wish to see the frequency of each pattern of occurring 1s, while preserving
the variable sequence. I am not sure whether any of the "egen" functions can
perform the task (and I prefer not to use egen..=group(..)).

I started off with the following:

for X in num 2/6: recode mX 1=X

then, if there is a suitable egen function, it would be

egen new=fun(m1 m2 m3 m4 m5 m6)

then tabulate new would yield

	new		freq
	------		------
	1,2,3		63
	1,3,4,5		23
	1,4,5		30
	2,5,6		20
.....etc

It is as if I were running "xtdes", but with variables rather than panel time entries.

Many thanks for your help in advance.  

Amani

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


