st: Missing cases after contract with large weight

From: "Friedrich Huebler" <[email protected]>
To: [email protected]
Subject: st: Missing cases after contract with large weight
Date: Thu, 20 Dec 2007 16:32:05 -0500

```
I discovered that -contract- returns missing frequencies when a weight
with large numbers is used. What is the explanation for this and is
-contract- designed to work this way?

Below are two examples for -contract- with weights. In the first
example, the weight is not too large and the -contract-ed dataset
contains the same frequencies as those reported by -tabulate-.

. sysuse auto, clear
. replace weight = weight * 10000
. tab rep78 [fw=weight], m

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 | 62,000,000        2.77        2.77
          2 |268,300,000       12.01       14.78
          3 |989,700,000       44.29       59.08
          4 |516,600,000       23.12       82.20
          5 |255,500,000       11.43       93.63
          . |142,300,000        6.37      100.00
------------+-----------------------------------
      Total | 2234400000      100.00

. contract rep78 [fw=weight]
. clist

       rep78       _freq
  1.       1    62000000
  2.       2   268300000
  3.       3   989700000
  4.       4   516600000
  5.       5   255500000
  6.       .   142300000

In the second example, the weight variable is multiplied by 100,000
instead of 10,000. When -contract- is used with this larger weight,
some frequencies in the resulting dataset are missing.

. sysuse auto, clear
. replace weight = weight * 100000
. tab rep78 [fw=weight], m

     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |620,000,000        2.77        2.77
          2 | 2683000000       12.01       14.78
          3 | 9897000000       44.29       59.08
          4 | 5166000000       23.12       82.20
          5 | 2555000000       11.43       93.63
          . | 1423000000        6.37      100.00
------------+-----------------------------------
      Total |22344000000      100.00

. contract rep78 [fw=weight]
. clist

       rep78       _freq
  1.       1   620000000
  2.       2           .
  3.       3           .
  4.       4           .
  5.       5           .
  6.       .  1423000000

Perhaps it is a coincidence, but the frequencies that appear in the
-tabulate- output and are missing from the -contract- output exceed
the largest possible number that can be held in a variable of datatype
long: 2,147,483,620. Can -contract- be modified to allow larger
weights?

Thanks,

Friedrich
```
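
The pattern in the second example would fit -contract- storing its frequency variable as a long, as the post suspects. That is an assumption here, not something the post confirms. The sketch below first shows that a value above the long limit becomes missing when forced into a long, then builds the same frequency table with -collapse-, on the further assumption that -collapse- stores sums as double. The variable name too_big is hypothetical, chosen only for illustration.

```stata
* Largest value a long can hold (the post quotes 2,147,483,620)
display %15.0fc c(maxlong)

sysuse auto, clear
replace weight = weight * 100000

* A value above the long limit is stored as missing
generate long too_big = 2683000000
count if missing(too_big)
drop too_big

* Possible workaround: sum the weights within each rep78 group.
* With frequency weights, the weighted count per group equals the
* sum of the weights, so no [fw=] qualifier is needed here.
collapse (sum) _freq = weight, by(rep78)
format _freq %15.0fc
list
```

If these assumptions hold, the listed _freq values should match the -tabulate- frequencies from the second example, including the row for missing rep78.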