Michael Blasnik

<statalist@hsphsun2.harvard.edu>

Re: st: Re: Unix stata big dataset

Fri, 30 Nov 2007 08:46:10 -0500

I'm sorry but your answers are still insufficient -- I will make one last try to help, but my patience (and the list bandwidth) may be getting taxed at this point... Please see my comments below.

Michael

----- Original Message ----- From: <ncdcta00@uniroma2.it>

1) You say you have 18 million observations -- is that for both datasets or just one dataset? How many observations are in the smaller dataset?

in just one, in the other I have 2 million.

2) In your example data, do you expect to have 3 observations for id=1, or 9 observations (all combinations of the 3 observations in each dataset)? If you want three observations, how do you tell which observations to match from each dataset?

I use the id that is the same in each data set, and I use the option unmatched(master) in joinby command.

This does not answer my question -- I didn't ask you what option you specify in joinby, I asked a specific question about your example to try to determine if joinby was really needed.
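To make question 2 concrete, here is a minimal sketch with made-up data. The variables x and y, the values, and the tempfile name other are all hypothetical and are not taken from the poster's datasets. It assumes each dataset holds three observations for id 1 and shows what joinby does in that case.

```stata
* Hypothetical "using" dataset: three observations for id 1
clear
input id x
1 10
1 20
1 30
end
tempfile other
save `other'

* Hypothetical "master" dataset: three more observations for id 1
clear
input id y
1 1
1 2
1 3
end

* joinby forms all pairwise combinations within each id
joinby id using `other', unmatched(master)

* 3 master rows x 3 using rows = 9 rows for id 1
count if id == 1
list, sepby(id)
```

If only three observations per id are wanted, joinby is probably not the right tool. Some additional key (a date or a sequence number, for example) would be needed to say which observation pairs with which, and merge on id plus that key would then apply.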

3) Do you want all observations for both datasets, or are there many observations in the larger dataset that don't match and you don't need?

I need all the observations.

This answer is not consistent with your prior answer. You are specifying unmatched(master), which means that you are not asking for observations in the using dataset that do not match. How many observations have matching ids in the two datasets? How many using dataset observations are not matched?
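One way to answer the last two questions is to run the join with unmatched(both) and tabulate the _merge variable that joinby then creates. The sketch below uses tiny made-up datasets. The ids, the variables x and y, and the tempfile name small are hypothetical, so treat it as an illustration rather than a recipe for the 18 million observation file.

```stata
* Hypothetical "using" dataset with ids 1, 2 and 4
clear
input id x
1 10
2 20
4 40
end
tempfile small
save `small'

* Hypothetical "master" dataset with ids 1, 1 and 3
clear
input id y
1 1
1 2
3 3
end

* Keep unmatched observations from both sides so they can be counted
joinby id using `small', unmatched(both)

* _merge: 1 = master only, 2 = using only, 3 = matched in both
tabulate _merge
```

With unmatched(master) instead, the rows with _merge equal to 2 (using dataset observations with no matching id) would not be kept, which is why that option does not fit "I need all the observations".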

