st: RE:Re:i-1 in forvalues loop

From   Viktor Emonds <>
To   "" <>
Subject   st: RE:Re:i-1 in forvalues loop
Date   Fri, 21 Oct 2011 10:29:44 +0200

Thanks a lot for your informative replies, Steve, I'll try the scalar approach!




Date: Thu, 20 Oct 2011 17:04:12 -0400
From: Steven Samuels <>
Subject: Re: st: RE: i-1 in forvalues loop

Well, I see that my statement was too strong, so:

In fact, to make the starting value a constant, the statement must be either

. scalar bal1= ceil(3167*runiform())
. local bal1= ceil(3167*runiform())

and the other values must be defined as scalars/locals as well.

But I recommend scalars, because with a local macro you must quote each reference: `bal1', `bal`i''.
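
The difference in how the two are referenced can be seen in a small sketch. It assumes the integer interval 3167 from the statements above; the names lbal1, lbal2, ... for the local macros are hypothetical and chosen only to keep the two versions apart.

```stata
* Sketch only: the same starting value and increments held two ways.
scalar bal1  = ceil(3167*runiform())   // scalar: referenced by name
local  lbal1 = ceil(3167*runiform())   // local: must be quoted when used

display scalar(bal1)
display `lbal1'

forvalues i = 2/3 {
    * `=`i'-1' evaluates i-1 on the fly, giving the previous index
    scalar bal`i'  = scalar(bal`=`i'-1') + 3167
    local  lbal`i' = `lbal`=`i'-1'' + 3167
    display scalar(bal`i') "  " `lbal`i''
}
```

Writing scalar(bal1) rather than plain bal1 is a precaution: if the dataset also contains a variable named bal1, Stata would use the variable in an expression unless the scalar() function is used.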


You are using a fractional interval in your example because 15 does not divide evenly into the total number of students (which must be 15 x 3166.467 = 47,497). Leslie Kish (Survey Sampling, 1965, Wiley, page 115) presents better solutions to this problem. A simple one is to enlarge the total to the next higher multiple of 15, or 3167 x 15 = 47,505. This adds eight fictitious students, an increase of less than 0.02%, and makes the sampling interval 3167. You could, if you wish, disperse the extra students to the schools with the largest size; this will alter the selection probabilities for individual schools only slightly.
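
The arithmetic behind the enlarged total can be checked with a few display commands. This is only a check of the numbers quoted above, with 47,497 taken as the total implied by the fractional interval.

```stata
display 15*3166.4667                      // approximately 47,497
display ceil(47497/15)                    // 3167, the integer interval
display 3167*15                           // 47505, the enlarged total
display 3167*15 - 47497                   // 8 fictitious students
display 100*(3167*15 - 47497)/47497       // percent increase, below 0.02
```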

I think it a waste of effort to make the intervals too exact, because the advance counts are likely to be out of date when you do the study. See WE Deming, Sample Design in Business Research, Wiley, 1960, Chapter Six, and elsewhere, for examples of rough counts. If you will be sub-sampling students, then you will need Deming's methods to keep the probabilities of selection constant across schools in the same stratum. 

A specific mistake in your code: the starting value is not a random start. -egen- with mean() averages the uniforms over all observations and puts that same value in every observation, so it is practically a constant (close to half the interval, about 1583). Assuming that you take the advice above, the command should be:

. scalar bal1= ceil(3167*runiform()) //corrected
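
What the original egen line does can be checked on a toy dataset. This is a sketch with a hypothetical 1,000 observations and an arbitrary seed; neither comes from the thread.

```stata
clear
set obs 1000                              // hypothetical number of observations
set seed 2011                             // arbitrary seed, for reproducibility
egen bal1 = mean(3166.4667*runiform())    // the original statement
summarize bal1                            // Std. dev. is 0: one value everywhere
* runiform() averages about 0.5, so the value is expected to land
* near 3166.4667/2, not uniformly anywhere in the interval.

scalar start = ceil(3167*runiform())      // a single random integer from 1 to 3167
display scalar(start)
```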

But I agree with Nick's thoughts about style. I would use scalars, not variables, to hold the "ball" choices.
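
Putting the pieces together, the selection with scalars might look like the sketch below. It is not code from the thread: the six-school dataset, the seed and the choice of three selections (instead of 15) are made up so the example runs on its own, and the half-open comparison replaces inrange() as an assumption, so that a ball landing exactly on a boundary picks only one school.

```stata
clear
input studentsj3
12
8
25
5
6
4
end

set seed 2011                             // arbitrary seed
local k = 3                               // selections per stratum (15 in the thread)
gen long lotto = sum(studentsj3)          // cumulative number of students
local interval = lotto[_N]/`k'            // 60/3 = 20 in this toy example

* Random start, then add the fixed interval to the previous ball
scalar bal1 = ceil(`interval'*runiform())
forvalues i = 2/`k' {
    scalar bal`i' = scalar(bal`=`i'-1') + `interval'
}

* A school wins if a ball falls within its range of cumulative counts
gen byte winnaar = 0
forvalues i = 1/`k' {
    replace winnaar = 1 if scalar(bal`i') > lotto - studentsj3 & scalar(bal`i') <= lotto
}
list
```

Using lotto - studentsj3 as the lower bound avoids the missing value that lotto[_n-1] returns in the first observation. A school larger than the interval (25 students here) can be hit by two balls; this sketch simply flags it once.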


On Oct 18, 2011, at 10:52 AM, Viktor Emonds wrote:

Let me try to explain again: I have a file with schools and I know the number of students in the target population. I have everything neatly sorted, and for sampling I just need to give each school a chance of being selected that is proportional to the number of students in the target year in that school, determine a random starting point to make my first pick, and add a constant interval.

In the loop with bal, I basically try to draw the 'winning numbers' by taking the constant (bal1) and adding the interval (3166.467), storing the new constant in bal2, bal3 ....bal15. The way I envisioned doing this was by running a loop for bal2-bal15, adding the interval to the value of the previous bal (the i-1th bal). Is there a way to do so?



I don't understand what you are trying to do. I comment only on obvious Stata problems. 
`i-1' would only work if "i-1" were the name of a local macro, but it can't be such a name, as minus signs are not allowed in Stata names.

    gen bal`i' = bal`=`i'-1' * 3166.467

would at first sight work, as then Stata knows to evaluate the expression

    `i' - 1

on the fly.
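
A minimal illustration of the two usual ways to get the previous index inside a loop; the macro name prev is hypothetical.

```stata
forvalues i = 2/4 {
    * Inline evaluation: `=exp' is replaced by the value of exp
    display "i = `i', previous index = `=`i'-1'"

    * Equivalent two-step version with an explicit local macro
    local prev = `i' - 1
    display "i = `i', previous index = `prev'"
}
```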
Your code largely consists of putting constants into variables, which is legal but not especially good style. 
Note that

    gen lotto=sum(studentsj3) //sum of target population

produces a _cumulative_ sum: only in the last observation will this be the actual sum, as your comment implies. Whether the comment or the code is what you want only you can say.
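
The distinction is easy to see on a three-observation example; the data values and the variable name totaal are made up for illustration.

```stata
clear
input studentsj3
100
250
50
end

gen lotto = sum(studentsj3)           // running sum: 100, 350, 400
egen totaal = total(studentsj3)       // overall total: 400 in every observation
list
```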

From: Viktor Emonds
Sent: Tuesday, 18 October 2011 15:55
Subject: i-1 in forvalues loop


I am trying to draw a sample using a random-start, fixed-interval systematic sampling procedure in each explicit stratum. The data in each stratum are already sorted by all the implicit stratifiers, with serpentine sorting for a variable of particular interest. Now I just need to do the actual sampling, and I tried to start by doing the following:

use ethnicframe31 //the specific explicit stratum
gen lotto=sum(studentsj3) //sum of target population
egen bal1= mean(3166.4667*runiform()) //random starting point
forvalues i=2/15 { //Draw 'lotto balls' by adding the fixed interval
    gen bal`i'= bal`i-1'*3166.4667
}

gen winnaar=0 //Identify 'winning' schools
forvalues i=1/15 {
    replace winnaar=1 if inrange(bal`i',lotto[_n-1],lotto)
}

Apparently, the `i-1' in the first loop is not understood. What am I doing wrong here? Thanks in advance!