st: Creating a second output data set

From   Bryan Sayer
Subject   st: Creating a second output data set
Date   Tue, 06 Sep 2011 16:53:01 -0400

I need to create an output data set that will differ in the content and number of observations from the input file. The observations will be created one at a time, based on the input data set.

Specifically, I am creating all combinations of N objects taken two at a time. I will probably also do permutations.

The input data set (to start with) consists of N records with two variables: the primary sampling unit (PSU) and a size variable associated with the PSU (a count variable). I want to create two output data sets. The first has one record for each combination of two PSUs, with the associated joint probability. The second has the same structure as the input data set but adds the marginal probability, calculated as the sum of the joint probabilities involving that PSU (accumulated as each combination is created).

The part I am stuck on is how to output the data set of combinations. Can someone point me to a program that outputs a file as calculations are made?
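
To make the task concrete, here is a minimal sketch in Python (not Stata) of writing each combination to an output file as it is computed, while accumulating the marginals for a second file. Within Stata itself, the postfile, post, and postclose commands are the usual tools for posting observations to a new data set one at a time. The function name write_pps_outputs, the column names, and above all the joint probability formula are assumptions for illustration: the formula assumes two successive draws without replacement, each proportional to size, which may not be the scheme intended here. The data are the example stratum listed further down.

```python
import csv
from itertools import combinations


def write_pps_outputs(sizes, pairs_file, marginals_file):
    """Write one row per PSU pair as it is computed, then the marginals.

    sizes: dict mapping PSU name -> size measure (hypothetical input layout).
    pairs_file, marginals_file: writable text streams (e.g. open files).
    Returns the dict of accumulated marginal probabilities.
    """
    total = sum(sizes.values())
    # Single-draw selection probabilities, proportional to size.
    p = {psu: size / total for psu, size in sizes.items()}
    marginal = {psu: 0.0 for psu in sizes}

    pairs = csv.writer(pairs_file)
    pairs.writerow(["psu_i", "psu_j", "joint_prob"])
    for i, j in combinations(sizes, 2):
        # ASSUMPTION: two successive draws without replacement, so the pair
        # can be drawn as (i then j) or (j then i).
        joint = p[i] * p[j] * (1 / (1 - p[i]) + 1 / (1 - p[j]))
        pairs.writerow([i, j, f"{joint:.6f}"])  # output the row immediately
        marginal[i] += joint                    # accumulate for both members
        marginal[j] += joint

    out = csv.writer(marginals_file)
    out.writerow(["psu", "size", "marginal_prob"])
    for psu, size in sizes.items():
        out.writerow([psu, size, f"{marginal[psu]:.6f}"])
    return marginal


if __name__ == "__main__":
    import sys

    stratum = {"LUWEERO": 12466, "KAMPALA": 3459, "TORORO": 2815, "KAMULI": 549}
    # Both outputs go to the screen here; pass two open files to save them.
    write_pps_outputs(stratum, sys.stdout, sys.stdout)
```

Passing two handles opened with open(..., "w", newline="") in place of sys.stdout would give the two separate output files. Switching from combinations to permutations would mainly mean replacing itertools.combinations with itertools.permutations and using the ordered-draw probability for each row.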

(For those interested, this is for probability proportional to size (PPS) sampling. See, for example, Levy and Lemeshow, "Sampling of Populations," chapter 11.)

Here is an example of one stratum:

Input data set (with marginal probability added)

District      Size    pi(i)
LUWEERO     12,466    0.916858
KAMPALA      3,459    0.542857
TORORO       2,815    0.448739
KAMULI         549    0.091546
Total       19,289

Output data set:


Bryan Sayer
Monday to Friday, 8:30 to 5:00
Phone: (614) 442-7369
FAX:  (614) 442-7329
