Hi Pamela,
A colleague recently told me about an interesting technique
for dealing with this problem. At first glance it sounds nice,
but I have no idea about its statistical properties. Basically,
you replace each observation that has a missing value with two
observations: one with value 1 and one with value 2. You
attach to the first a weight equal to prob1 and to the second
a weight equal to prob2. All complete observations receive a
weight of 1. Again, it sounds nice to me, but if anyone else
on statalist warns you not to use it, then I will bow to
superior wisdom.
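
To make the bookkeeping concrete, here is a minimal sketch of that
expansion. It is written in plain Python rather than Stata, purely as an
illustration; the function name expand_missing, the dictionary layout,
and the weight key "w" are my own assumptions, and the rows are the nine
sectors from your example. It says nothing about whether the resulting
estimates or standard errors are statistically sound.

```python
# Hypothetical illustration of the "two weighted copies" idea.
# Names (expand_missing, "w") are assumptions, not an established routine.

def expand_missing(rows):
    """Split each row whose st is missing (None) into two weighted copies."""
    out = []
    for r in rows:
        if r["st"] is None:
            # One copy per possible value, weighted by its probability.
            out.append({"sect": r["sect"], "st": 1, "w": r["prob1"]})
            out.append({"sect": r["sect"], "st": 2, "w": r["prob2"]})
        else:
            # Complete observations keep their value and get weight 1.
            out.append({"sect": r["sect"], "st": r["st"], "w": 1.0})
    return out


# The example data for one year and region; None marks a missing st.
rows = [
    {"sect": 1, "st": 4},
    {"sect": 2, "st": 3},
    {"sect": 3, "st": None, "prob1": 0.30, "prob2": 0.70},
    {"sect": 4, "st": 8},
    {"sect": 5, "st": 0},
    {"sect": 6, "st": None, "prob1": 0.45, "prob2": 0.55},
    {"sect": 7, "st": 3},
    {"sect": 8, "st": 5},
    {"sect": 9, "st": None, "prob1": 0.48, "prob2": 0.52},
]

expanded = expand_missing(rows)

# Weighted total and weighted mean of st over the expanded data.
total_weight = sum(r["w"] for r in expanded)
weighted_sum = sum(r["w"] * r["st"] for r in expanded)
weighted_mean = weighted_sum / total_weight
```

Because prob1 and prob2 sum to 1 for every sector, the two copies
together still count as one observation in any weighted total.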
HTH,
Maarten
-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands
visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214
+31 20 5986715
http://home.fsw.vu.nl/m.buis/
-----------------------------------------
Pamela Mueller wrote:
I need to fill in missing values in my dataset. For most sectors I know
how many start-ups there were in each year, but the figure was not
reported if the number of start-ups (st) is less than 3 (hence one or two).
Therefore, I know that each missing value is either 1 or 2, and I know the
probability of each. I also know how often 1 and 2 should each occur in
total in each year.
For each year and region the data look like this:
sect   st   prob1   prob2
1      4
2      3
3      .    0.30    0.70
4      8
5      0
6      .    0.45    0.55
7      3
8      5
9      .    0.48    0.52