
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: the use of loop function in Stata


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: the use of loop function in Stata
Date   Tue, 13 Mar 2012 10:36:27 +0000

Looking again at Rosie's loop, and setting aside the -histogram- call, the statements inside the loop with corrected syntax look like this:  

sum `var'
egen std`var'= sd(`var')
gen sds`var'=0.2*std`var'
bysort AI4: sum `var'
bysort AI4: egen mins`var'=min(`var')
egen min`var'=max(mins`var')
bysort AI4: egen maxs`var'=max(`var')
egen max`var'=min(maxs`var')
gen insamp`var'= (`var'>= (min`var'- sds`var') & `var'<= max`var'+ sds`var')

One repeated device here is using -egen- to put a single constant into a variable. That is rarely necessary. Consider this revision: 

sum `var'
local sdover5 = 0.2 * r(sd)

bysort AI4: sum `var'

bysort AI4: egen mins`var'=min(`var')
sum mins`var', meanonly 
local maxmin = r(max) 

bysort AI4: egen maxs`var'=max(`var')
sum maxs`var', meanonly
local minmax = r(min) 

gen insamp`var'= (`var'>= (`maxmin'- `sdover5') & `var'<= `minmax'+ `sdover5')

The new idea here is using the saved results that -summarize- leaves in r(), which are put into locals. (If maximum precision is needed, scalars would be better.)
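As a rough sketch of the scalar alternative just mentioned, the same statements could hold the intermediate results in scalars. The use of -tempname- and the particular scalar names are assumptions of this sketch, not part of the original suggestion; temporary names are used only because scalars and variables share a namespace. As before, this is meant to sit inside the loop over `var'.

```stata
* Sketch only: scalars keep full double precision.
* tempname gives names that cannot clash with variables in the dataset.
tempname sdover5 maxmin minmax

sum `var'
scalar `sdover5' = 0.2 * r(sd)

bysort AI4: egen mins`var' = min(`var')
sum mins`var', meanonly
scalar `maxmin' = r(max)

bysort AI4: egen maxs`var' = max(`var')
sum maxs`var', meanonly
scalar `minmax' = r(min)

gen insamp`var' = (`var' >= (`maxmin' - `sdover5') & `var' <= `minmax' + `sdover5')
```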

Next consider whether all the new variables are going to be needed after the loop. I guess not, although clearly that's Rosie's choice. Going with this guess, the inside of the loop could be further edited to

sum `var'
local sdover5 = 0.2 * r(sd)

bysort AI4: sum `var'

bysort AI4: egen mins = min(`var')
sum mins, meanonly 
local maxmin = r(max) 

bysort AI4: egen maxs = max(`var')
sum maxs, meanonly
local minmax = r(min) 

gen insamp`var'= (`var' >= (`maxmin' - `sdover5') & `var' <= `minmax' + `sdover5')

drop mins maxs 
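
Put together with a -foreach- opening, the whole loop might look like the sketch below. It assumes, as in the original question, variables matching lp* and a grouping variable AI4; the -histogram- call is still set aside.

```stata
foreach var of varlist lp* {
    * overall SD, scaled by 0.2
    sum `var'
    local sdover5 = 0.2 * r(sd)

    * summaries within groups of AI4, for inspection
    bysort AI4: sum `var'

    * largest of the group minimums
    bysort AI4: egen mins = min(`var')
    sum mins, meanonly
    local maxmin = r(max)

    * smallest of the group maximums
    bysort AI4: egen maxs = max(`var')
    sum maxs, meanonly
    local minmax = r(min)

    * indicator: 1 if within the common range widened by 0.2 SD, 0 otherwise
    gen insamp`var' = (`var' >= (`maxmin' - `sdover5') & `var' <= `minmax' + `sdover5')

    * the helper variables must go before the next pass creates them again
    drop mins maxs
}
```

The -drop- at the end of each pass matters here: without it, -egen- would fail on the second variable because mins and maxs would already exist.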

Nick 
[email protected] 

Nick Cox

By "loop function" you mean here the -foreach- command. Your opening
syntax can be simplified to

foreach var of varlist lp* {

but that's cosmetic. The most evident bug is that you are using the
same character to delimit start and end of references to local macro
names.

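The syntax is `macname' not 'macname': a left single quote (the backtick) opens the reference and a right single quote (the apostrophe) closes it.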
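
A minimal illustration of the difference, using a made-up local named macname:

```stata
local macname "lp1"

* left quote then right quote: the macro is expanded, so this shows lp1
display "`macname'"

* apostrophes on both sides: nothing is expanded, so this shows 'macname'
display "'macname'"
```

With apostrophes on both sides, something like std'var' is passed through literally and is not a legal variable name, which is consistent with the "invalid name" message.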

On Mon, Mar 12, 2012 at 6:29 PM, Rosie Chen <[email protected]> wrote:

> Dear all, this is the first time I use loop function combined with global macro, and I got an error message of "invalid name" after running the syntax below. Could any of you help identify where the problem is? Thank you very much.
>
>
> ***I have lp1, lp2, lp3, and lp4, and would like to do the same analysis (within the loop) on each of the lp variables.
> global lp "lp*"
> display "$lp"
>
> foreach var of varlist $lp{
>
>     histogram 'var',  by(AI4, col(1))
>     sum 'var'
>     egen std'var'= sd('var')
>     gen sds'var'=0.2*std'var'
>     bysort AI4: sum 'var'
>     bysort AI4: egen mins'var'=min('var')
>     egen min'var'=max(mins'var')
>     bysort AI4: egen maxs'var'=max('var')
>     egen max'var'=min(maxs'var')
>     gen insamp'var'= ('var'>= (min'var'- sds'var') & 'var'<= max'var'+ sds'var')
> }
>
