Chunling Lu --
This strikes me as a very bad idea. What can you possibly hope to
gain by imputing number of visits from the information {visits>0} in
this context? Even if there were a clear reason to do this, it cannot
be done.
Here is some actual data:
visits last |
       year |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      5,266       87.16       87.16
          1 |        137        2.27       89.42
          2 |         85        1.41       90.83
          3 |         65        1.08       91.91
          4 |         66        1.09       93.00
          5 |         32        0.53       93.53
          6 |         63        1.04       94.57
          7 |         16        0.26       94.84
          8 |         23        0.38       95.22
          9 |          9        0.15       95.37
         10 |         34        0.56       95.93
         11 |          1        0.02       95.95
         12 |         70        1.16       97.10
         13 |          5        0.08       97.19
         14 |          4        0.07       97.25
         15 |         16        0.26       97.52
         16 |          4        0.07       97.58
         17 |          1        0.02       97.60
...
which you would see as
  anyvisits |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      5,266       87.16       87.16
          1 |        776       12.84      100.00
------------+-----------------------------------
      Total |      6,042      100.00
Using your method, you would estimate lambda = -log(5266/6042) = 0.137,
but this implies that the expected tabulation of number of visits looks like:
     visits |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      5,266       87.16       87.16
          1 |        724       11.98       99.14
          2 |         50        0.82       99.96
          3 |          2        0.04      100.00
          4 |          0        0.00      100.00
...
which ain't even close to right.
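For anyone who wants to check the arithmetic, here is a minimal sketch in Python (not Stata) that backs lambda out of the share of zeros and then computes the frequencies a Poisson with that mean would predict. The counts 5,266 and 6,042 come from the tables above; the variable and function names are my own.

```python
import math

# Counts taken from the tabulations above.
n_total = 6042   # all respondents
n_zero = 5266    # respondents with no visits

# Under a Poisson model, P(y = 0) = exp(-lambda), so lambda = -log(P(y = 0)).
lam = -math.log(n_zero / n_total)

def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Expected frequencies for 0 to 4 visits if the Poisson assumption held.
expected = {k: n_total * poisson_pmf(k, lam) for k in range(5)}

print(f"lambda = {lam:.3f}")
for k, freq in expected.items():
    print(f"{k:>6} | {freq:10.0f} {100 * freq / n_total:10.2f}")
```

Setting the predicted count for one visit against the 137 actually observed shows how badly a single Poisson fits these data.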
If you really need number of visits in your data, I think your only way
forward is "cold deck imputation" or "statistical matching".
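To give a rough idea of what cold deck imputation could look like, here is a minimal sketch in Python, under several assumptions of my own: the donor distribution is taken from the positive rows of the actual data shown above (only 1 through 17 visits, since the table is truncated), each respondent with anyvisits = 1 gets a visit count drawn at random from that donor distribution, and the function name cold_deck_impute and the five example recipients are hypothetical. A real application would use a complete external donor file and would probably match on covariates.

```python
import random

# Donor frequencies of positive visit counts, copied from the table above.
# ASSUMPTION: the table is truncated after 17 visits, so larger counts
# are missing from this donor pool.
donor_freq = {
    1: 137, 2: 85, 3: 65, 4: 66, 5: 32, 6: 63, 7: 16, 8: 23, 9: 9,
    10: 34, 11: 1, 12: 70, 13: 5, 14: 4, 15: 16, 16: 4, 17: 1,
}

def cold_deck_impute(anyvisits, donor_freq, seed=0):
    """Impute a visit count for each 0/1 indicator in anyvisits.

    Zeros stay zero; ones receive a count drawn from the donor
    distribution, with probability proportional to donor frequency.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    values = list(donor_freq)
    weights = list(donor_freq.values())
    return [
        rng.choices(values, weights=weights)[0] if a == 1 else 0
        for a in anyvisits
    ]

# Hypothetical recipients: 1 = saw a doctor in the period, 0 = did not.
recipients = [0, 1, 0, 1, 1]
imputed = cold_deck_impute(recipients, donor_freq)
print(imputed)
```

The imputed counts are only as good as the donor data, so the donor source should cover the same recall period and a comparable population.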
On 5/25/07, Chunling Lu <chunling_lu@harvard.edu> wrote:
David, thanks for the information. But I think we may be able to work
something out here. We know that individuals either did not see a doctor or
saw a doctor at least once in the last 30 days. So we may calculate
probability(y>=1) (where y is the number of visits) = 1-probability(y=0) in
the last 30 days. Using the Poisson distribution for counts, we know that
p(y=0)=exp(-lambda), so we may then derive the value of lambda, which is the
mean number of visits. What do you think about this? Thanks very much. Chunling
-----Original Message-----
From: David Greenberg
You can't, unless you are confident that those who visited a doctor within
the last 30 days did so only once. David Greenberg, Sociology Department,
New York University
----- Original Message -----
From: Chunling Lu <chunling_lu@harvard.edu>
> I have a question "When was the last time you visited a doctor" with the
> following categories: (1) in the last 30 days, (2) between 1 month and
> less than 1 year ago. I would now like to derive the average number
> of visits for the last 30 days. How should I model it and how can I do it
> in Stata?