
From: "Jun Xu" <mystata@hotmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: st: RE: loop [ignore previous one (from this message), sorry!!!]

Date: Sun, 16 Feb 2003 01:13:30 -0600

Stata-listers/Nick,

This email is rather long, but I need the space to explain the problem clearly, so please bear with me :( Thanks a lot.

Thanks a lot for your help. I have managed to write an ado file that seemed to work fine until I found that it does not work with a varlist of more than 8 variables. I have attached our previous Q&A emails at the end.

Notes:

`imax': a macro that I computed as 2^`nvars'-1.

`nvars': a macro containing the number of variables in the varlist.

The part of the ado file that is relevant to my question:

************************************************************************

......

......

......

*create this response pattern variable and replace its values later
quietly gen `pat02'=.
format `pat02' %0`nvars'.0f

*create other tempvars
forval i = 1 / `imax' {
    qui inbase 2 `i'
    local which : di %0`nvars'.0f `r(base)'
    *local a="`which'"
    *di "`which'"
    forval j = 1 / `nvars' {
        local char = substr("`which'",`j',1)
        *I only invoke `which' once within this loop
        .......
        .......
        .......
    }
    quietly replace `pat02'=`which' if _n==`i'

    *I don't see what's wrong here. I have used the following three
    *lines to check the instantaneous value of `which', and it is a
    *binary code every time, but `pat02'[`i'] is no longer binary after
    *the binary code 100000000. So I strongly suspect that either I
    *misused the replace statement, or there is something unique to
    *"local which : di %0`nvars'.0f `r(base)'", the command that Nick
    *taught me in his email answering my question, which I might not
    *understand well. The puzzle is that there are no problems with 8
    *variables or fewer, but with 9 variables or more the binary codes
    *take values other than 0 and 1, as you will see in the output.

    *di _n(1) "`which'"
    *di `pat02'[`i']
    *di %0`nvars'.0f `which'

    *replacing values of other tempvar
    ......
    ......
    ......
}

......

......

......

************************************************************************

One function of the ado file presented above is to go through all combinations of variables in the varlist and calculate the listwise nonmissing sample size for each. For example, if the syntax is:

permlist var1 var2 var3 var4 var5 var6, patmis(001001)

It will produce the listwise nonmissing sample size for var3 var6. A "0" within the patmis() option indicates exclusion from the listwise nonmissing sample size calculation, and a "1" indicates inclusion.
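As a rough illustration of how a patmis() pattern could map to a listwise count, here is a minimal sketch. It is not the code of the ado file above: the auto dataset shipped with Stata, the variable list, and the names pat, chosen, and nmiss are assumptions made only for this example.

```stata
* Hypothetical stand-alone sketch, not the ado file's code
sysuse auto, clear
local vars price mpg rep78 headroom trunk weight
local pat 001001
local nvars : word count `vars'

* keep the variables whose digit in the pattern is 1
local chosen
forvalues j = 1/`nvars' {
    if substr("`pat'", `j', 1) == "1" {
        local v : word `j' of `vars'
        local chosen `chosen' `v'
    }
}

* an observation counts only if none of the chosen variables is missing
egen byte nmiss = rowmiss(`chosen')
count if nmiss == 0
display "`chosen': " r(N) " complete cases"
```

With this pattern the third and sixth variables (rep78 and weight) are selected, mirroring the var3 var6 example.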

It will also calculate the listwise nonmissing sample size for every combination of these six variables (var1-var6), that is, choosing 1 to 6 out of 6, which gives 2^6 - 1 = 63 combinations once the possibility of choosing none is excluded, and present the results in a table using tabdisp. The results follow; the only difference is that I used 9 variables here to show the problem. I include only the portion where the problems start:

***************************************************

pat02 | nomiscnt

------------------------

......

......

......

011101110 | 1387

011101111 | 1387

011110000 | 1192

011110001 | 1192

011110010 | 1192

011110011 | 1192

011110100 | 1192

011110101 | 1192

011110110 | 1192

011110111 | 1192

011111000 | 1191

011111001 | 1191

011111010 | 1191

011111011 | 1191

011111100 | 1191

011111101 | 1191

011111110 | 1191

011111111 | 1191

100000000 | 1388

100000008 | 1388

100000096 | 1388

100000104 | 1388

100000112 | 1388

100001000 | 1387

100001008 | 1387

100001104 | 1387

100001112 | 1387

100010000 | 1192

100010008 | 1192

100010096 | 1192

100010104 | 1192

100010112 | 1192

100011000 | 1191

100011008 | 1191

......

......

......

***************************************************

What I didn't include in the syntax section earlier in this email (but did include in my ado file) is the tabdisp of "pat02" (the binary codes indicating inclusion in or exclusion from my listwise nonmissing sample size calculation) against "nomiscnt" (the nonmissing case counts). The most puzzling part is why the binary codes work only with 8 or fewer variables (and up to the code 100000000), but not with 9 or more (any binary code greater than 100000000). I can only think of the following possibilities:

1. I am doing something wrong with inbase

2. something associated with "local which : di %0`nvars'.0f `r(base)'"

3. quietly replace `pat02'=`which' if _n==`i'

Or perhaps the way I created `which' prevents me from doing

quietly replace `pat02'=`which' if _n==`i'

I have tried varlists with 9 and 10 variables, and different data sets. I don't see any problem with the immediate value of `which' in each pass of forval. I just cannot get it right for `pat02' using

quietly replace `pat02'=`which' if _n==`i'

and it seems strange that this won't work, especially that it fails only for more than 8 variables.
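One thing that may be worth checking here, offered as a hedged sketch rather than a diagnosis of the ado file: -generate- creates a float unless told otherwise, and a float stores integers exactly only up to 16,777,216, so a 9-digit code such as 100000010 cannot be held exactly. The variable names below are made up for the example, and the sketch assumes the default storage type has not been changed with -set type-.

```stata
clear
set obs 1

* default storage type (float, unless changed with -set type-)
gen xfloat = 100000010

* the same code stored as a double, and as a string
gen double xdouble = 100000010
gen str9 xstr = "100000010"

format xfloat xdouble %09.0f
list

* float() rounds its argument to float precision
display %09.0f float(100000010)
```

If this is what is happening, the float copy shows 100000008 while the double and string copies keep every digit, which would suggest creating the pattern variable as a double or as a string.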

Last email from Nick Cox attached:

**********************************************************************

Jun Xu posted twice, here labelled (1) and (2):

(1)

> puzzled by a problem when writing an ado file. Suppose I have
> var1 var2 var3 var4 var5......vark, and I want to do the
> following loop:
> *********************************************************************
> var1
> var2
> var3
> ...
> ...
> ...
> vark
> var1 var2
> var1 var3
> var1 var4
> ...
> ...
> var1 vark
> var2 var3
> var2 var4
> ...
> ...
> var(k-1) vark
> ...
> ...
> var1 var2 var3
> var1 var2 var4
> var1 var2 var5
> ...
> var1 var2 var3....vark
> *********************************************************************
> Basically, what I want to do is like stepwise exhausting
> all combinations
> in a systematic way from univariate, bivariate, trivariate, to
> multivariate....Or, I can say for every variable in the
> variable list, there
> is an indicator variable associated with it. I either take
> this variable in or
> out in each run. And there should be 2^k possibilities. I
> have no idea how
> to handle that. Could anyone give me some hint? Many
> thanks in advance.

(2)

I think I might not have explained my problems clearly. I have k indicator variables (coded as 1 or 0) and I would like to know the response patterns (for example for latent class analysis) to these k variables. For example,

var1 var2 var3 ....vark
1 0 0 0
0 1 0 0
...
1 1 0 0
...
...
...
1 1 1 1

I would like to know, for each response pattern, how many cases there are, and have this programmed into an ado file. My key problem here is how to run through all the combinations (univariate, bivariate, and trivariate). One possibility is that I use the following code (or a revised version to fit into an ado file):

******************************************************************
clear
for num 1/6: set obs 100\ gen xX=invnorm(uniform()) \ gen DxX=xX>0.6
gen pattern=0
local i=1
while `i'<6 {
replace pattern=pattern+Dx`i'*10^(6-`i')
local i=`i'+1
}
aorder
list Dx1-Dx6 pattern
sort pattern
list pattern
gen count=1
collapse (sum) count, by(pattern)
******************************************************************

The resulting data matrix looks like:

============================
pattern   count
0         16
10        10
100       5
110       6
1000      8
1010      7
1100      2
1110      1
10000     11
10010     3
10100     2
10110     2
11000     2
11010     2
11100     1
100000    7
100010    1
100100    2
101000    4
101010    1
110000    1
110010    2
110100    2
111000    1
111010    1
============================

Here the problem is that it only presents the response patterns that have at least one case, and it's hard to handle their order (now listed in numerical order, from small to big). But what if I want to go through "each" combination (2^k possible ways) in a systematic way and list all response pattern frequencies even though some of them have zero cases? What I meant by a systematic way is like:

*********************************************************************
var1
var2
var3
...
...
...
vark
var1 var2
var1 var3
var1 var4
...
...
var1 vark
var2 var3
var2 var4
...
...
var(k-1) vark
...
...
var1 var2 var3
var1 var2 var4
var1 var2 var5
...
var1 var2 var3....vark
*********************************************************************

or in binary coding

*********************************************************************
1 0 0 0 0 .....0
0 1 0 0 0 .....0
0 0 1 0 0 .....0
...
...
...
0 0 0 0 0 .....1
1 1 0 0 0 .....0
1 0 1 0 0 .....0
1 0 0 1 0 .....0
...
...
1 1 1 0 0 .....0
1 0 1 1 0 .....0
...
...
...
...
1 1 1 1 1 .....1
*********************************************************************

Here I didn't present the summarize command that could grab the number of cases for each response pattern. But basically I will run through each combination and calculate the frequency for that particular combination even though there might be zero cases. Thanks a lot

1. To get a tabulation of patterns with some instances,

egen all = concat(var1-vark)

tab all
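To make the concat approach concrete, here is a small sketch on simulated data. The variables D1-D3, the seed, and the 0.6 cutoff are assumptions for illustration only; runiform() is used here, and older releases would use uniform() as in the quoted code.

```stata
clear
set seed 12345
set obs 100

* three made-up 0/1 indicator variables
forvalues j = 1/3 {
    gen byte D`j' = runiform() > 0.6
}

* one string per observation, e.g. "010", then tabulate the patterns
egen all = concat(D1-D3)
tab all
```

Because the pattern is held as a string, leading zeros are preserved, but only patterns that occur at least once appear in the table.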

2. The following program suggests some possible lines of attack.

program permlist, rclass
    version 8
    syntax varlist
    tokenize `varlist'
    local nvars : word count `varlist'
    local imax = 2^`nvars' - 1

    * one pass per nonempty subset, coded as a binary number
    forval i = 1 / `imax' {
        qui inbase 2 `i'
        * pad with leading zeros to one digit per variable
        local which : di %0`nvars'.0f `r(base)'
        local vars
        * variable j is chosen if digit j is 1
        forval j = 1 / `nvars' {
            local char = substr("`which'",`j',1)
            if `char' {
                local vars "`vars'``j'' "
            }
        }
        local vlist `"`vlist'"`vars'" "'
    }

    * reorder the subsets by the number of variables chosen
    local varlist
    forval i = 1 / `nvars' {
        foreach w of local vlist {
            local nv : word count `w'
            if `i' == `nv' {
                local varlist `"`varlist'"`w'" "'
            }
        }
    }

    return local varlist `"`varlist'"'
end

I use the undocumented -inbase- command

to get the binary equivalent of 1 ... 2^k - 1 (I omit the

null case in which none of the variables are chosen).

It is important to get leading zeros explicit.
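A minimal sketch of the padding step, assuming four variables and the number 5 purely for illustration:

```stata
local nvars 4
quietly inbase 2 5

* without padding the result has only three digits: 101
display "`r(base)'"

* padded to one digit per variable: 0101
local which : di %0`nvars'.0f `r(base)'
display "`which'"
```

Without the leading zero, the first digit would be read as belonging to the first variable instead of the second.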

-inbase- is in Stata 8; for Stata 7 or Stata 6 type

. findit inbase

or use the search method of your choice

to find it in Bill Gould's files. In Stata

. type http://www.stata.com/users/wgould/inbase/inbase.ado

Then each variable is or is not chosen according

to whether each digit is 1 or 0.

Then we need to sort for your purposes according

to the number of variables chosen.

The whole list is left behind in memory

in the form (e.g. for a b c d)

"d " "c " "b " "a " ... "a b c d "

I think the above program should also

work with very minor modifications in Stata 7.

3. For implementation of a different, and less

general, technique see -allpossible- on SSC.

Nick

n.j.cox@durham.ac.uk



Follow-Ups:

st: RE: RE: loop [ignore previous one (from this message), sorry!!!], from "Nick Cox" <n.j.cox@durham.ac.uk>

st: Re: RE: loop, from "Michael Blasnik" <michael.blasnik@verizon.net>
