Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Joerg Luedicke <joerg.luedicke@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: if expression involving a variable in a while loop
Date: Wed, 13 Jul 2011 15:14:20 -0400

Are the weights sorted from high to low within each observation across the 100 variables? If not, your result will depend on some arbitrary sort order in your data. If I understand correctly, you want to get at the biggest weights that sum up to .95, not any weights out of the 100 that sum up to .95. If that's correct, I would just use -reshape- to do the whole thing.

Consider the code below. The first part sets up toy data similar to yours (if the weights are not sorted within each observation). The second part suggests -reshape-ing to long format, sorting the weights, creating the desired variables, and -reshape-ing back to wide. This code, however, preserves the original variables. If the original variables do not matter, just create a new -var_id- before reshaping back to wide.

/********************************/
//setting up toy data
clear all
set obs 2
gen id=_n
expand 100
bys id: gen var_id=_n
bys id: gen var=uniform()/50
reshape wide var, i(id) j(var_id)

//actual code starts here
reshape long var, i(id) j(var_id)
gsort +id -var                  // largest weights first within id
bys id: gen sum95=sum(var)      // running sum of the sorted weights
bys id: gen d95=sum95<=.95      // 1 while the running sum is at most .95
drop if d95==0
bys id: egen sum=max(sum95)     // largest running sum not exceeding .95
bys id: egen n_95=total(d95)    // number of weights kept
drop sum95 d95
reshape wide var, i(id) j(var_id)
/********************************/

J.

On Wed, Jul 13, 2011 at 2:00 PM, Daifeng He <dhe.statlist@gmail.com> wrote:
> Dear all,
>
> I have a question about loops involving a variable in the "if expression."
>
> I have 100 variables named var1 var2 ... var100, which sum up to 1.
> These are share variables and most of the weight is carried by the
> first few variables. I would like to truncate at 0.95 because I don't
> want to carry all these 100 variables around.
>
> So specifically, I would like to generate two variables: a) the sum of
> the first few of those 100 variables so that the sum is just over
> 0.95; b) the number of variables used in the sum.
> E.g., for the first obs, if var1+var2 = 0.94 and var1+var2+var3 = 0.97,
> then my first variable should be 0.97 and my second variable should be 3.
> Similarly, for another obs, if var1-var10 sum up to 0.85 and var1-var11
> sum up to 0.96, then my first variable would be 0.96 and the second
> variable would be 11.
>
> Here is my code:
>
> gen sum=0
> gen n_95=.
> local i=1
>
> while sum<=0.95 {
>
> replace sum=sum+var`i'
>
> local `i'=`i'+1
>
> replace n_95=`i'
>
> }
>
> This code runs but doesn't give me the variables I want. When
> explaining "if exp in a loop," the Stata manual says "If the expression
> refers to any variables, their values in the first observation are
> used unless explicit subscripts are specified," so I think my problem
> might be associated with my "while sum<=0.95" code line because "sum"
> is a variable, not a macro. But I am not sure how to explicitly
> subscript. I've desperately tried replacing "sum" by "sum[_n]",
> "sum[n]", "sum[`n']", or "sum[_`n']" in the while loop and none of
> them worked. Furthermore, I am not sure whether my problem is entirely
> the subscript issue, because some of the sum values I got are bigger
> than 1! (I have no idea what happens here -- my var1-var100
> strictly sum up to 1.)
>
> . list sum n_95 in 1/5
>
>      +-----------------+
>      |      sum   n_95 |
>      |-----------------|
>   1. | 1.242837      1 |
>   2. | 1.583333      1 |
>   3. | .7551287      1 |
>   4. | .9887668      1 |
>   5. | .6393646      1 |
>      +-----------------+
>
> I'd greatly appreciate any help on this.
>
> Regards,
> Daifeng
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
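
A loop-based alternative to the -reshape- approach, for the case where the share variables are already in the desired order, might look like the sketch below. It is not the code from either message. In the quoted loop, -local `i'=`i'+1- appears to define a macro named 1 rather than increment i, and -while- looks only at sum in the first observation, which would explain n_95 staying at 1 and sums above 1. The sketch instead loops over the variables with -forvalues- and puts the condition in an -if- qualifier, which Stata evaluates for each observation. The five-variable toy data and its values are made up for illustration, and the sketch assumes there are no missing values.

```stata
* Toy data (hypothetical values): 2 observations, 5 shares summing to 1
clear
input var1 var2 var3 var4 var5
.50 .44 .03 .02 .01
.60 .20 .10 .08 .02
end

gen double sum = 0
gen int n_95 = 0

* One pass over the variables in their stored order.
* Both replaces test the running sum as it stood before this step,
* so an observation stops changing once its sum exceeds 0.95.
forvalues i = 1/5 {
    replace n_95 = n_95 + 1     if sum <= 0.95
    replace sum  = sum + var`i' if sum <= 0.95
}

list sum n_95
```

With these toy values, the first observation should end with a sum of about .97 and n_95 = 3, and the second with a sum of about .98 and n_95 = 4. For data like that in the question, the loop would run over 1/100 instead of 1/5.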

**References**:
- **st: if expression involving a variable in a while loop**
  *From:* Daifeng He <dhe.statlist@gmail.com>
