


Re: st: Creating a second output data set

From   Roger Newson <[email protected]>
To   [email protected]
Subject   Re: st: Creating a second output data set
Date   Tue, 06 Sep 2011 21:58:10 +0100

I think you are looking for the -postfile- utility. In Stata, type

help postfile

to find out more.
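
As a rough sketch of how -postfile- might be used for this problem, the do-file below posts one observation per combination of two PSUs and accumulates the marginal probabilities in the input data set as each combination is created. The variable names (district, size, p, marginal), the macro names (pairs, pairfile) and the scalar pij are assumptions made for illustration. The joint-probability formula is also an assumption: it treats the sample as two successive PPS draws without replacement, which the original question does not state, although under that assumption the accumulated marginals should agree with the pi(i) column quoted in the question.

```stata
* Sketch only: names and the joint-probability formula are assumptions,
* not taken from the original question.
clear
input str10 district long size
"LUWEERO" 12466
"KAMPALA"  3459
"TORORO"   2815
"KAMULI"    549
end

* Single-draw selection probability, proportional to size
quietly summarize size
generate double p = size / r(sum)

* Marginal probability, accumulated as each combination is created
generate double marginal = 0

* Declare the second output data set: one observation per pair of PSUs
tempname pairs
tempfile pairfile
postfile `pairs' str10 psu_i str10 psu_j double joint using `pairfile'

local N = _N
local last = `N' - 1
forvalues i = 1/`last' {
    local next = `i' + 1
    forvalues j = `next'/`N' {
        * Assumed formula: i drawn first then j, or j drawn first then i
        scalar pij = p[`i']*p[`j']/(1 - p[`i']) + p[`j']*p[`i']/(1 - p[`j'])
        * Write one observation to the output file
        post `pairs' (district[`i']) (district[`j']) (scalar(pij))
        * Add the joint probability to both PSUs in the pair
        quietly replace marginal = marginal + scalar(pij) in `i'
        quietly replace marginal = marginal + scalar(pij) in `j'
    }
}
postclose `pairs'

* Inspect the combinations, then return to the input data with marginals
preserve
use `pairfile', clear
list, noobs
restore
list district size marginal, noobs
```

The posted file only exists on disk once -postclose- has been issued, so the -use- has to come after it. To keep the combinations permanently, replace the temporary file with a named file in the -postfile- command. For permutations rather than combinations, the inner loop could run over all j not equal to i, with the formula adjusted to the ordered case.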


Best wishes


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]

Opinions expressed are those of the author, not of the institution.

On 06/09/2011 21:53, Bryan Sayer wrote:
I need to create an output data set that will differ in the content and
number of observations from the input file. The observations will be
created one at a time, based on the input data set.

Specifically, I am creating all combinations of N objects taken two at a
time. I will probably also do permutations.

The input data set (to start with) consists of N records with two
variables, the primary sampling unit (PSU) and a size variable
associated with the PSU (a count variable). I want to create two output
data sets. One is each combination of PSU with the associated joint
probability. The second has the same structure as the input data set but
includes the marginal probability, calculated as the sum of the joint
probabilities associated with the PSU (which are accumulated as each
combination is created).

The part I am stuck on is how to output the data set of combinations.
Can someone point me to a program that outputs a file as calculations
are made?

(For those interested, this is for probability proportional to size
(PPS) sampling. See, for example, Levy and Lemeshow, "Sampling of
Populations", chapter 11.)

Here is an example of one stratum:

Input data set (with marginal probability added)

District     Size     pi(i)
LUWEERO    12,466   0.916858
KAMPALA     3,459   0.542857
TORORO      2,815   0.448739
KAMULI        549   0.091546
Total      19,289

Output data set:

