
Re: st: Re: Unix stata big dataset

From   "Michael Blasnik" <>
To   <>
Subject   Re: st: Re: Unix stata big dataset
Date   Fri, 30 Nov 2007 08:46:10 -0500


I'm sorry but your answers are still insufficient -- I will make one last try to help, but my patience (and the list bandwidth) may be getting taxed at this point... Please see my comments below.


----- Original Message ----- From: <>

1) You say you have 18 million observations -- is that for both
datasets or just one dataset?  How many observations are in the smaller
dataset?
In just one; in the other I have 2 million.

2) In your example data, do you expect to have 3 observations for id=1,
or 9 observations (all combinations of the 3 observations in each
dataset)?  If you want three observations, how do you tell which
observations to match from each dataset?
I use the id that is the same in each dataset, and I use the option unmatched(master) in the joinby command.

This does not answer my question -- I didn't ask you what option you specify in joinby, I asked a specific question about your example to try to determine if joinby was really needed.
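
To illustrate why the distinction in question 2 matters, here is a minimal sketch using made-up toy data (the variables a and b and the tempfile names are hypothetical, not taken from the poster's files). It assumes three observations with id=1 in each dataset and shows that joinby forms every pairwise combination within the id:

```stata
* Toy master data: three observations, all with id == 1 (hypothetical values)
clear
input id a
1 10
1 20
1 30
end
tempfile toy_master
save `toy_master'

* Toy using data: three more observations with id == 1 (hypothetical values)
clear
input id b
1 100
1 200
1 300
end
tempfile toy_using
save `toy_using'

* joinby pairs every master row with every using row that shares the id
use `toy_master', clear
joinby id using `toy_using', unmatched(master)
count    // 9 observations (3 x 3), not 3
```

If only three observations per id are wanted, some additional variable must identify which observation in one dataset corresponds to which in the other, and joinby on id alone is probably not the right tool.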

3) Do you want all observations for both datasets, or are there many
observations in the larger dataset that don't match and you don't need?
I need all the observations.

This answer is not consistent with your prior answer. You are specifying unmatched(master) which means that you are not asking for observations in the using dataset that do not match. How many observations have matching ids in the two datasets? How many using dataset observations are not matched?
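
One possible way to answer these counting questions before attempting the full join is sketched below. The file names master.dta and using.dta and the key variable id are assumptions for illustration, and the merge 1:1 syntax requires Stata 11 or later. The idea is to reduce each file to one row per id with a frequency count, so that only the id variable is ever held in memory:

```stata
* Assumed names: master.dta, using.dta, key variable id (all hypothetical)

* One row per id in the using file, with its observation count
use id using using.dta, clear
contract id, freq(n_using)
tempfile using_ids
save `using_ids'

* One row per id in the master file, with its observation count
use id using master.dta, clear
contract id, freq(n_master)

* Compare the two id lists
merge 1:1 id using `using_ids'
tabulate _merge    // 1 = id in master only, 2 = id in using only, 3 = id in both

* Number of observations a joinby on id would create from the matched ids
generate double n_pairs = n_master * n_using if _merge == 3
summarize n_pairs
display r(sum)
```

Summing n_using over the rows with _merge == 2 would give the number of using observations that have no match in the master data.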
