Statalist The Stata Listserver


st: Increase in Observations

From   Renuka Metcalfe <>
Subject   st: Increase in Observations
Date   Sun, 18 Feb 2007 20:06:48 +0000 (GMT)

Dear Maarten

Thank you. The commands I used to bring
meanworkplacex1 and meanworkplacex2 by firm id (this
variable is called serno) and occupation (this variable
is called xsoc2000) from the matched
employer-employee data are as follows:

First I must mention that meanworkplacex1 was derived
from own individual x1 (let us call this variable x1)
and meanworkplacex2 from own individual x2 (let us call
this variable x2). Since the collapse command works out
the means anyway, I issued

. collapse (mean) x1 x2, by (serno xsoc2000)

I have removed all missing values etc. beforehand.

I then sorted this file by serno and merged it with the
main employer-only file. However, the observations always
go up to about 4660 in my analysis, and the total
observations are about 7700, even though the employer-only
file, as I said, had only about 2200 observations.

I believe there is another way of bringing the
above-mentioned variables from the matched
employee-employer data. Perhaps that method of bringing
these 2 variables would not increase the observations
beyond about 2000. I would be grateful if anyone would
please let me know the alternative way of doing this.
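
One possible explanation, sketched here with made-up toy
data (the two firms, the occupation codes and the values are
purely illustrative, and the names mx1_ and mx2_ are
hypothetical): after -collapse- by serno and xsoc2000 the
file has one row per firm-occupation pair, so serno no longer
identifies a row uniquely, and a merge on serno alone can
return more rows than there are firms. Assuming one row per
firm is what is wanted, -reshape wide- is one way to get
there before merging:

*------------ begin example --------------
clear
input serno xsoc2000 x1 x2
1 11 10 1
1 11 20 3
1 12 30 5
2 11 40 7
2 12 50 9
2 12 60 11
end
* means by firm and occupation, stored under new names
collapse (mean) mx1_=x1 mx2_=x2, by(serno xsoc2000)
* one row per firm-occupation pair, so serno repeats
duplicates report serno
* spread the occupation means across columns: one row per firm
reshape wide mx1_ mx2_, i(serno) j(xsoc2000)
* errors out if serno were still not unique
isid serno
list
*------------- end example ---------------

In this toy case the collapsed file has 4 rows and the
reshaped file has 2, one per firm, with variables such as
mx1_11 and mx1_12 holding the occupation-specific means.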

Many thanks

--- Renuka Metcalfe <> wrote:
> I want to bring "meanworkplacex1 by firm and
> occupations" and "meanworkplacex2 by firm and
> occupations" from a matched employee and employer data
> to the same employer only data (which only comprises
> about 2200). When I use the collapse command to
> bring 2 variables the observations increase to about
> 4660. I do not think the observations should go
> beyond about 2000. I would be grateful, if anyone
> could let me know, how I could bring the observations
> to less than 2200.

This seems to be a problem with your data and you do not
give us the exact commands you used, so it is very hard for
us to solve this problem. There are two things I can do given
the information you have given us: 1) remark that the by()
option in -collapse- treats a missing value as a level of its
own, which might increase the number of cases to more than
you expect, and 2) give you an example that works with data
that is available to everyone (with Stata):

*------------ begin example --------------
sysuse auto, clear
collapse (mean) price mpg if rep78 < ., by(rep78) 
*------------- end example --------------- 
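
To illustrate the first point, here is a minimal sketch with
the same data. rep78 is missing for 5 of the 74 cars in
auto.dta, so leaving out the -if- condition should give one
extra group for the missing value:

*------------ begin example --------------
sysuse auto, clear
* without the if condition, missing rep78 forms its own group
collapse (mean) price mpg, by(rep78)
count
list

* compare: excluding the missing values drops that group
sysuse auto, clear
collapse (mean) price mpg if rep78 < ., by(rep78)
count
*------------- end example ---------------

The first -count- should report 6 observations and the
second 5.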

Hope this helps,

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715
