# Re: st: impute with draws from random distribution

From: Joerg Luedicke
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: impute with draws from random distribution
Date: Wed, 22 Jun 2011 10:30:56 -0400

On Wed, Jun 22, 2011 at 4:03 AM, D-Ta <altruist81@gmx.de> wrote:
> Dear All,
> I am looking for a convenient solution to the following problem. That is the
> type of the sample I am working with:
> id      x1      x2      participant     programm        time to participation
> 1       5       23      1               1               3.5
> 2       6       42      1               2               5.7
> 3       73      7       0               .               .
> 4       35      2       0               .               .
> 5       5       6       1               1               12
> 6       34      34      1               1               3.5
> 7       34      34      1               2               8.1
> The sample consists of individuals (with covariates x1 and x2) who can
> either be participating in programm 1, programm 2, or be non-participants.
> The non-participants are my control group. One of the control variables
> that I would like to condition on in a subsequent matching step is time to
> participation. By definition, time to participation is not observed for the
> non-participants. Hence, I would like to create hypothetical values in that
> variable for the group of non-participants. It is standard in the literature
> to randomly draw from the distribution of the participants.
> Since there are two groups of participants, there are also two different
> distributions in the start dates. I would like to assign two values in the
> time to participation to each non-participant (hypothetical time to
> participation in programm 1 and hypothetical time to participation in
> programm 2)
> Any suggestions how to do this??
I agree with Maarten that this makes no sense at all. Let's say your
aim is to create a balanced sample via the use of propensity scores.
Now let's further assume that "time to participation" would be the
only variable of concern. If you now impute some values and predict
your propensity scores, any subsequent analysis would implicitly claim
that treatment assignment is ignorable conditional on the observed time
that elapsed between timepoint x and the start of the program. But
for the non-participants there was no program start in the first
place, hence no elapsed time and thus nothing to balance.
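
For reference, the mechanics of the draw described in the question are easy to write down. Below is a minimal sketch in Python (not Stata), using the seven example rows from the question. The function names, the seed, and the choice to resample with replacement from each program's observed times are assumptions made for illustration; none of them is specified in the thread. The sketch only shows how such values could be generated. It does not make the generated values meaningful, for the reason given above.

```python
import random

# Example rows from the question: (id, participant, programm, time to participation).
# None stands in for Stata's missing value ".".
rows = [
    (1, 1, 1, 3.5),
    (2, 1, 2, 5.7),
    (3, 0, None, None),
    (4, 0, None, None),
    (5, 1, 1, 12.0),
    (6, 1, 1, 3.5),
    (7, 1, 2, 8.1),
]


def observed_times(rows, programm):
    """Collect the observed times of participants in one program."""
    return [t for (_id, part, prog, t) in rows if part == 1 and prog == programm]


def draw_hypothetical_times(rows, seed=12345):
    """For every non-participant, draw one time per program.

    Each draw is taken with replacement from that program's observed
    times (an assumed reading of "randomly draw from the distribution
    of the participants"). The seed is hypothetical and fixed only so
    that the result is reproducible.
    """
    rng = random.Random(seed)
    pools = {p: observed_times(rows, p) for p in (1, 2)}
    draws = {}
    for (pid, part, _prog, _t) in rows:
        if part == 0:
            draws[pid] = {p: rng.choice(pools[p]) for p in (1, 2)}
    return draws


draws = draw_hypothetical_times(rows)
for pid, times in draws.items():
    print(pid, times)
```

Each non-participant ends up with two values, one per program, as requested in the question. Because the values are resampled from the participants themselves, they carry no information about the non-participants.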

J.

