The Stata listserver

st: RE: loop [ignore previous one (from this message), sorry!!!]


From   "Jun Xu" <mystata@hotmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: RE: loop [ignore previous one (from this message), sorry!!!]
Date   Sun, 16 Feb 2003 01:13:30 -0600

Stata-listers/Nick,

This email is rather long, but I need the space to explain the problem clearly, so please bear with me :( Thanks a lot.

Thanks a lot for your help. I have managed to write an ado file that seemed to work fine until I found that it fails with a varlist of more than 8 variables. I have attached our previous Q&A emails at the end.

Notes:

`imax': a macro that I computed as 2^`nvars'-1.
`nvars': a macro containing the number of variables in the varlist.

The part of the ado file that is relevant to my question:
************************************************************************
......
......
......
*create the response pattern variable; its values are replaced later
quietly gen `pat02'=.
format `pat02' %0`nvars'.0f

*create other tempvars

forval i = 1 / `imax' {
qui inbase 2 `i'


local which : di %0`nvars'.0f `r(base)'

*local a="`which'"
*di "`which'"

forval j = 1 / `nvars' {
local char = substr("`which'",`j',1)

*I only invoke `which' once within this loop
.......
.......
.......

}


quietly replace `pat02'=`which' if _n==`i'

*I don't see what's wrong here. I used the following three lines to check
*the value of `which' at each pass, and it is a binary code every time,
*but `pat02'[`i'] stops being binary after the binary code 100000000. So I
*strongly suspect that either I misused the replace statement, or there is
*something particular about "local which : di %0`nvars'.0f `r(base)'", the
*command that Nick taught me in his reply to my question, which I might not
*understand well. The puzzle is that there are no problems with 8 variables
*or fewer, but with 9 variables or more the binary codes take values other
*than 0 and 1, as you will see in the output.


*di _n(1) "`which'"
*di `pat02'[`i']
*di %0`nvars'.0f `which'

*replacing values of other tempvar

......
......
......

}
......
......
......
************************************************************************

One function of the ado file above is to go through all combinations of the variables in the varlist and calculate the listwise nonmissing sample size for each. For example, if the syntax is:

permlist var1 var2 var3 var4 var5 var6, patmis(001001)

it will produce the listwise nonmissing sample size for var3 and var6. A "0" within the patmis() option indicates exclusion from the listwise nonmissing sample size calculation, and a "1" indicates inclusion.
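
As a minimal sketch of what a single patmis() pattern means, the following counts the observations that are nonmissing on every variable flagged with a 1. It is not the code from the ado file: the auto dataset, the three variable names, and the tempvar name `ok' are stand-ins chosen for illustration.

```stata
* Illustrative only: auto.dta and these variable names are assumptions
sysuse auto, clear
local vars   price mpg rep78
local patmis 011

* collect the variables whose digit in the pattern is 1
local chosen
local j = 1
foreach v of local vars {
    if substr("`patmis'", `j', 1) == "1" {
        local chosen `chosen' `v'
    }
    local ++j
}

* markout sets the marker to 0 wherever any chosen variable is missing
tempvar ok
gen byte `ok' = 1
markout `ok' `chosen'
count if `ok'
```

Here -count- leaves the listwise nonmissing sample size for the chosen variables in r(N).
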
It will also calculate the listwise nonmissing sample size for every
combination of these six variables (var1-var6), that is, choosing 1 to 6
out of 6, which gives 2^6 - 1 combinations once the possibility of choosing
none is excluded, and present them in a table using tabdisp. The results follow; the only difference is that I used 9 variables to show the problem here. I have included only the portion where the problems start:

***************************************************
pat02 | nomiscnt
------------------------
......
......
......
011101110 | 1387
011101111 | 1387
011110000 | 1192
011110001 | 1192
011110010 | 1192
011110011 | 1192
011110100 | 1192
011110101 | 1192
011110110 | 1192
011110111 | 1192
011111000 | 1191
011111001 | 1191
011111010 | 1191
011111011 | 1191
011111100 | 1191
011111101 | 1191
011111110 | 1191
011111111 | 1191
100000000 | 1388
100000008 | 1388
100000096 | 1388
100000104 | 1388
100000112 | 1388
100001000 | 1387
100001008 | 1387
100001104 | 1387
100001112 | 1387
100010000 | 1192
100010008 | 1192
100010096 | 1192
100010104 | 1192
100010112 | 1192
100011000 | 1191
100011008 | 1191
......
......
......
***************************************************
What I didn't include in the syntax section above (but did include in my ado file) is a tabdisp of "pat02" (the binary codes indicating inclusion in or exclusion from the listwise nonmissing sample size calculation) against "nomiscnt" (the nonmissing case counts). The most puzzling part is why the binary codes only work for 8 or fewer variables (up to and including 100000000), and not for 9 or more (any binary code greater than 100000000). I can only think of the following possibilities:
1. I am doing something wrong with inbase
2. something associated with "local which : di %0`nvars'.0f `r(base)'"
3. quietly replace `pat02'=`which' if _n==`i'

Or perhaps the way I created `which' prevents me from doing
quietly replace `pat02'=`which' if _n==`i'

I have tried varlists with 9 and 10 variables and different data sets. I don't see any problem with the value of `which' at each pass of forval. I just cannot get it right for `pat02' using

quietly replace `pat02'=`which' if _n==`i'

and it seems strange that this fails, especially that it fails only with more than 8 variables.
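
One explanation that fits the table above, offered as a sketch rather than a confirmed diagnosis of the ado file: -gen `pat02'=.- creates a float unless -set type- says otherwise, and a float holds integers exactly only up to 2^24 = 16,777,216. The largest 8-digit binary code, 11111111, is below that limit. A 9-digit code such as 100000010 is above it and is rounded to the nearest representable value (a multiple of 8 in this range), which would account for values like 100000008 and 100000096. The variable names below are made up for illustration.

```stata
* Illustrative only: pat_float, pat_double and pat_str are hypothetical names
clear
set obs 2
gen float  pat_float  = .
gen double pat_double = .
gen str9   pat_str    = ""

* store the same 9-digit binary code in each storage type
local which 100000010
replace pat_float  = `which'   in 1
replace pat_double = `which'   in 1
replace pat_str    = "`which'" in 1

* compare what each variable actually holds
format pat_float pat_double %09.0f
list in 1
```

If this is the cause, creating the pattern variable as a double, or keeping the pattern as a string, should avoid the rounding. A double holds integers exactly up to 2^53, so binary codes of up to 15 digits are comfortably safe.
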

Last email from Nick Cox attached:
**********************************************************************
Jun Xu posted twice, here labelled (1) and (2):


(1)

> puzzled by a problem when writing an ado file. Suppose I have
> var1 var2 var3 var4 var5......vark, and I want to do the
> following loop:
> ******************************************************************
> var1
> var2
> var3
> ...
> ...
> ...
> vark
> var1 var2
> var1 var3
> var1 var4
> ...
> ...
> var1 vark
> var2 var3
> var2 var4
> ...
> ...
> var(k-1) vark
> ...
> ...
> var1 var2 var3
> var1 var2 var4
> var1 var2 var5
> ...
> var1 var2 var3....vark
> ******************************************************************
> Basically, what I want to do is step through all combinations
> in a systematic way, from univariate, bivariate, and trivariate to
> multivariate. Put differently, for every variable in the variable
> list there is an indicator variable associated with it: I either take
> the variable in or leave it out in each run, so there should be 2^k
> possibilities. I have no idea how to handle that. Could anyone give
> me a hint? Many thanks in advance.

(2)

I think I might not have explained my problem clearly. I have k indicator
variables (coded as 1 or 0) and I would like to know the response patterns
(for example, for latent class analysis) for these k variables. For example,

var1 var2 var3 ....vark
1    0    0         0
0    1    0         0
...
1    1    0         0
...
...
...
1    1    1         1

I would like to know, for each response pattern, how many cases there are,
and to program this into an ado file. My key problem is how to run through
all the combinations (univariate, bivariate, trivariate, and so on).

One possibility is to use the following code (or a revised version that
fits into an ado file):

******************************************************************
clear
for num 1/6: set obs 100\ gen xX=invnorm(uniform()) \ gen DxX=xX>0.6
gen pattern=0
local i=1
while `i'<6 {
	replace pattern=pattern+Dx`i'*10^(6-`i')
	local i=`i'+1
}

aorder
list Dx1-Dx6 pattern
sort pattern
list pattern
gen count=1
collapse (sum) count, by(pattern)
***********************************************************
The resulting data matrix looks like:

============================
pattern	count
0	16
10	10
100	5
110	6
1000	8
1010	7
1100	2
1110	1
10000	11
10010	3
10100	2
10110	2
11000	2
11010	2
11100	1
100000	7
100010	1
100100	2
101000	4
101010	1
110000	1
110010	2
110100	2
111000	1
111010	1
=================================


Here the problem is that it only presents the response patterns that have
at least one case, and it's hard to control their order (they are now
listed in numerical order, from smallest to largest).
But what if I want to go through each combination (2^k possible ways) in a
systematic way and list the frequency of every response pattern, even
though some of them have zero cases? What I mean by a systematic way is:

******************************************************************
var1
var2
var3
...
...
...
vark
var1 var2
var1 var3
var1 var4
...
...
var1 vark
var2 var3
var2 var4
...
...
var(k-1) vark
...
...
var1 var2 var3
var1 var2 var4
var1 var2 var5
...
var1 var2 var3....vark
******************************************************************

or in binary coding
****************************************************************
1 0 0 0 0 .....0
0 1 0 0 0 .....0
0 0 1 0 0 .....0
...
...
...
0 0 0 0 0 .....1
1 1 0 0 0 .....0
1 0 1 0 0 .....0
1 0 0 1 0 .....0
...
...
1 1 1 0 0 .....0
1 0 1 1 0 .....0
...
...
...
...
1 1 1 1 1 .....1
***********************************************

Here I have not shown the summarize command that would grab the number of
cases for each response pattern. But basically I will run through each
combination and calculate the frequency for that particular combination,
even though there might be zero cases. Thanks a lot.

1. To get a tabulation of patterns with some instances,

egen all = concat(var1-vark)
tab all

2. The following program suggests some possible lines of attack.

program permlist, rclass
    version 8
    syntax varlist
    tokenize `varlist'
    local nvars : word count `varlist'
    local imax = 2^`nvars' - 1

    * build one subset of variables per binary number 1 ... 2^k - 1
    forval i = 1 / `imax' {
        qui inbase 2 `i'
        * pad with leading zeros to one digit per variable
        local which : di %0`nvars'.0f `r(base)'
        local vars
        forval j = 1 / `nvars' {
            local char = substr("`which'",`j',1)
            if `char' {
                local vars "`vars'``j'' "
            }
        }
        local vlist `"`vlist'"`vars'" "'
    }

    * reorder the subsets by the number of variables chosen
    local varlist
    forval i = 1 / `nvars' {
        foreach w of local vlist {
            local nv : word count `w'
            if `i' == `nv' {
                local varlist `"`varlist'"`w'" "'
            }
        }
    }

    return local varlist `"`varlist'"'
end

I use the undocumented -inbase- command
to get the binary equivalent of 1 ... 2^k - 1 (I omit the
null case in which none of the variables are chosen).
It is important to get leading zeros explicit.
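
A small sketch of why the leading zeros matter, assuming -inbase- is installed and leaves its result in r(base) as in the program above; the width of 4 stands for four hypothetical variables:

```stata
* -inbase- returns the converted number without padding
qui inbase 2 5
local raw `r(base)'

* pad to a fixed width so that digit j always refers to variable j
local which : di %04.0f `r(base)'
di "raw: `raw'   padded: `which'"
```

Without the padding, substr("`raw'", 1, 1) would pick up the digit belonging to the second of the four variables rather than the first.
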

-inbase- is in Stata 8; for Stata 7 or Stata 6 type

. findit inbase

or use the search method of your choice
to find it in Bill Gould's files. In Stata

. type http://www.stata.com/users/wgould/inbase/inbase.ado

Then each variable is or is not chosen according
to whether each digit is 1 or 0.

Then we need to sort for your purposes according
to the number of variables chosen.

The whole list is left behind in memory
in the form (e.g. for a b c d)

"d " "c " "b " "a " ... "a b c d "

I think the above program should also
work with very minor modifications in Stata 7.

3. For implementation of a different, and less
general, technique see -allpossible- on SSC.

Nick
n.j.cox@durham.ac.uk







