Statalist The Stata Listserver


Re: st: FW: How to make 'contract' create a row of zero frequencies

From   n j cox
Subject   Re: st: FW: How to make 'contract' create a row of zero frequencies
Date   Mon, 22 May 2006 21:11:29 +0100

To show the issue: suppose Pauline's data looks like

centre month
A 1
A 2
B 2
B 3

contract centre month, zero

will produce

centre month _freq
A 1 1
A 2 1
A 3 0
B 1 0
B 2 1
B 3 1

Thus, -contract- gives you counts for
the Cartesian product of {A, B} and {1, 2, 3},
i.e. all possible pairs.
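
For anyone who wants to reproduce this, here is a minimal sketch that types in the four observations above and runs the same command. The storage types (str1, byte) are my assumption; the original post does not say how the variables are stored.

```stata
* toy data from the example above; storage types are assumed
clear
input str1 centre byte month
"A" 1
"A" 2
"B" 2
"B" 3
end

* one row per (centre, month) pair, including pairs that never occur
contract centre month, zero
list, sepby(centre)
```

The listing should show the six rows given above, with _freq equal to 0 for (A, 3) and (B, 1).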

If you look at the code using -viewsource-,
you will see that -fillin- is doing the hard work.
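
As a rough check on that reading, the sketch below gets the zero rows two ways, once from the zero option and once by hand with -fillin-, and compares the results with -cf-. This is my paraphrase of the idea, not a copy of the source code of -contract-; the tempfile names raw and byzero are arbitrary, and the whole block should be run in one go from a do-file so that the temporary files stay defined.

```stata
* toy data as in the example above; storage types are assumed
clear
input str1 centre byte month
"A" 1
"A" 2
"B" 2
"B" 3
end
tempfile raw byzero
save `raw'

* route 1: let -contract- supply the zero rows itself
contract centre month, zero
sort centre month
save `byzero'

* route 2: count only the observed pairs, then fill in the rest by hand
use `raw', clear
contract centre month
fillin centre month
replace _freq = 0 if _fillin
drop _fillin
sort centre month

* -cf- stops with an error if any value differs between the two results
cf _all using `byzero'
```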

Now suppose Pauline has also centres C, D, etc.
in mind but that they are not in the dataset. How
can -contract- be forced (or, more diplomatically,
persuaded) to take those into account?

The first answer is that -contract- has no
extrasensory ability to know what might have been
the case but isn't. Nor are there handles to specify such
details on the fly.

You need a way to work around this limitation of
-contract-. In addition to Roger's solution, another
way to do it is to add pseudo-observations

centre month
C 1
D 1

before -contract-. Then, after -contract-, you have to
make sure that you retract your lie, as each
pseudo-observation will have been counted once.

replace _freq = 0 if centre == "C" & month == 1
replace _freq = 0 if centre == "D" & month == 1
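
Put together, the pseudo-observation route might look like the sketch below. It uses the toy data from the top of this post; the way the extra observations are created (-set obs- followed by -replace-) is just one possibility and is my choice, not something the method depends on.

```stata
* toy data as in the example above; storage types are assumed
clear
input str1 centre byte month
"A" 1
"A" 2
"B" 2
"B" 3
end

* add two empty observations and turn them into pseudo-observations
set obs `=_N + 2'
replace centre = "C" if _n == _N - 1
replace centre = "D" if _n == _N
replace month = 1 if missing(month)

contract centre month, zero

* each pseudo-observation was counted once: take that back
replace _freq = 0 if inlist(centre, "C", "D") & month == 1
list, sepby(centre)
```

Because of the zero option, the other months for C and D already come out as zero, so only the two pseudo-observations need correcting.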

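Yet another way is to make the fix after -contract-.

Just add new observations like those above
and then fill in the remaining pairs, setting every
missing frequency to zero. (The observations you add
yourself have missing _freq but _fillin equal to 0,
so test _freq rather than _fillin.)

fillin centre month
replace _freq = 0 if missing(_freq)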
drop _fillin
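
A self-contained sketch of this second route, again on the toy data and again with -set obs- as my own choice for adding the extra rows. Note that the rows added by hand are existing observations as far as -fillin- is concerned, so they get _fillin equal to 0 while their _freq is still missing; for that reason the sketch tests for missing _freq.

```stata
* toy data as in the example above; storage types are assumed
clear
input str1 centre byte month
"A" 1
"A" 2
"B" 2
"B" 3
end

contract centre month, zero

* append one row per absent centre; _freq is missing in these new rows
set obs `=_N + 2'
replace centre = "C" if _n == _N - 1
replace centre = "D" if _n == _N
replace month = 1 if missing(month)

* create the remaining (centre, month) pairs, then zero every missing count
fillin centre month
replace _freq = 0 if missing(_freq)
drop _fillin
list, sepby(centre)
```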

In addition to the manual entry on -fillin-, see
"Filling in the gaps" in SJ 5(1) 2005.

Pauline asked


> I have a large dataset containing information on individuals attending for
> screening at several centres.
> I am writing a program to summarise monthly attendance, to be run
> regularly by one of our administrators. So I first 'contract' my data to
> create a new dataset of the number of attendances for each centre for
> each month. I used the following statement:
> contract centre3 mrec, freq(count) zero
> (centre3 = centre code, mrec = month code)
> I have discovered that when there are no attendances in a month at any of
> the centres, Stata completely excludes this month rather than create a
> zero count for each centre.
> This then completely throws out my summary statistics. Is there any way I
> can force 'contract' to create a count of zero for each centre for any
> month with zero attendance at all centres?

and Roger Harbord suggested

One solution could be to use -tabcount- by Nick Cox, available from SSC.
You specify a list of values which define the categories of a variable that
are to be counted, and zero frequencies are then included. The -replace-
option makes it function much like -contract-, except zero frequencies are
retained for the specified values.
