

From: Steven Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: i-1 in forvalues loop
Date: Wed, 19 Oct 2011 08:27:38 -0400

You are using a fractional interval in your example because 15 does not divide evenly into the total number of students (which must be 15 x 3166.467 = 47,497). Leslie Kish (Survey Sampling, 1965, Wiley, page 115) presents better solutions to this problem. A simple one is to enlarge the total to the next higher multiple of 15, or 3167 x 15 = 47,505. This adds eight fictitious students, an increase of less than 0.02%, and makes the sampling interval 3167. You could, if you wish, disperse the extra students to the schools with the largest size; this will alter the selection probabilities for individual schools only slightly.

I think it a waste of effort to make the intervals too exact, because the advance counts are likely to be out of date when you do the study. See WE Deming, Sample Design in Business Research, Wiley, 1960, Chapter Six, and elsewhere, for examples of rough counts. If you will be sub-sampling students, then you will need Deming's methods to keep the probabilities of selection constant across schools in the same stratum.

A specific mistake in your code: the starting value is not a proper random start. Taking the mean of the uniform draws over all observations collapses them into a single average that will sit near half the interval (about 1,583), not a value drawn evenly from the whole interval. Assuming that you take the advice above, the command should be:

    . gen bal1= ceil(3167*runiform())

But I agree with Nick's thoughts about style. I would use scalars, not variables, to hold the "ball" choices.

Steve

On Oct 18, 2011, at 10:52 AM, Viktor Emonds wrote:

Let me try to explain again: I have a file with schools and I know the number of students in the target population. I have everything neatly sorted, and for sampling I just need to give each school a chance of being selected proportional to the number of students in the target year in that school, determine a random starting point to make my first pick, and add a constant interval.
In the loop with bal, I basically try to draw the 'winning numbers' by taking the constant (bal1) and adding the interval (3166.467), storing the new constant in bal2, bal3, ..., bal15. The way I envisioned doing this was by running a loop for bal2-bal15, adding the interval to the value of the previous bal (the i-1th bal). Is there a way to do so?

Best,
Viktor

______________________________

I don't understand what you are trying to do. I comment only on obvious Stata problems.

`i-1' would only work if "i-1" were the name of a local macro, but it can't be such a name, as minus signs are not allowed in Stata names.

    gen bal`i' = bal`=`i'-1' * 3166.467

would at first sight work, as then Stata knows to evaluate the expression `i' - 1 on the fly.

Your code largely consists of putting constants into variables, which is legal but not especially good style.

Note that

    gen lotto=sum(studentsj3) //sum of target population

produces a _cumulative_ sum: only in the last observation will this be the actual sum, as your comment implies. Whether the comment or the code is what you want only you can say.

Nick

________________________________________

From: Viktor Emonds
Sent: Tuesday 18 October 2011 15:55
To: statalist@hsphsun2.harvard.edu
Subject: i-1 in forvalues loop

Hey,

I am trying to draw a sample with a random-start, fixed-interval systematic sampling procedure in each explicit stratum. The data in each stratum are already sorted by all the implicit stratifiers, with serpentine sorting for a variable of particular interest.
Now I just need to do the actual sampling and tried to start by doing the following:

    use ethnicframe31 //the specific explicit stratum
    gen lotto=sum(studentsj3) //sum of target population
    egen bal1= mean(3166.4667*runiform()) //random starting point
    forvalues i=2/15 { //Draw ' lotto balls' by adding the fixed interval
        gen bal`i'= bal`i-1'*3166.4667
    }
    gen winnaar=0 //Identify ' winning' schools
    forvalues i=1/15{
        replace winnaar=1 if inrange(bal`i',lotto[_n-1],lotto)
    }

Apparently, the `i-1' in the first loop is not understood. What am I doing wrong here?

Thanks in advance!

Best,
Viktor

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
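
The advice in this thread (an integer interval, a genuinely random start, scalars instead of variables for the "balls") can be put together roughly as follows. This is only a sketch and not code posted by anyone in the thread. The toy frame, the seed, and the names school, size, order, lower, k, extra, start and bal are assumptions made for illustration; only studentsj3, lotto and winnaar come from the original post. It also assumes that no school is as large as the interval, so that each ball can hit only one school.

```stata
* Sketch only: a toy frame stands in for ethnicframe31 (an assumption).
clear
set seed 20111019
set obs 200
gen school = _n
gen long studentsj3 = 100 + ceil(300*runiform())   // hypothetical enrolment counts

* Integer interval: round the stratum total up to the next multiple of 15.
quietly summarize studentsj3, meanonly
scalar total = r(sum)
scalar k     = ceil(scalar(total)/15)               // sampling interval
scalar extra = 15*scalar(k) - scalar(total)         // fictitious students added

* Give the fictitious students to the largest school, then restore the frame order.
gen long size  = studentsj3
gen long order = _n
gsort -size order
quietly replace size = size + scalar(extra) in 1
sort order

* Each school owns the range (lower, lotto] on the cumulated size scale.
gen long lotto = sum(size)                          // cumulative sum; the total sits in the last observation
gen long lower = lotto - size

* Random start in 1..k, held in a scalar rather than a variable.
scalar start = ceil(scalar(k)*runiform())

* Draw the 15 balls: start, start + k, ..., start + 14k.
gen byte winnaar = 0
forvalues i = 1/15 {
    scalar bal = scalar(start) + (`i' - 1)*scalar(k)
    quietly replace winnaar = 1 if scalar(bal) > lower & scalar(bal) <= lotto
}

count if winnaar == 1
```

Using lower instead of lotto[_n-1] sidesteps the missing value that lotto[_n-1] yields in the first observation. If the balls were kept as variables, as in the original post, the previous one could be referenced with bal`=`i'-1'', as Nick describes.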

**References**:
- **st: i-1 in forvalues loop** *From:* Viktor Emonds <Viktor.Emonds@soc.kuleuven.be>
- **st: RE: i-1 in forvalues loop** *From:* Viktor Emonds <Viktor.Emonds@soc.kuleuven.be>
