RE: st: RE: Output summary stats before doing other iteration

From: Larraine Becker <[email protected]>
To: [email protected]
Subject: RE: st: RE: Output summary stats before doing other iteration
Date: Mon, 15 Sep 2008 10:36:37 +1000

```
Thanks Eva.

Sorry for HTML posting - I've hopefully changed this.  Please let me
know if it didn't.

Thanks for the advice - I'll try to answer and explain a bit more to
clarify what I am trying to do.

First, there is no -if- statement in my logit regression.

Basically I'm trying to see if people who had caesareans are more likely
to have another caesarean.  So, I'm randomly replacing the number of
people who had previous caesareans and then predicting future
caesareans.  I want to do this many times and just use the average.

Not sure if this makes sense?  I'm basically the "programmer", so it's
not my project - I need to just clarify a few things with the project

But, my main question is just trying to repeat this procedure more than
once, and providing me with the predicted values and the mean of all
observations for each iteration.  I can then later on calculate the
overall mean.

I will try what you've suggested - if you have any further ideas, please
let me know.

Thanks,
Larraine

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Eva Poen
Sent: Friday, 12 September 2008 5:31 PM
To: [email protected]
Subject: Re: st: RE: Output summary stats before doing other iteration

Larraine,

please don't use html for your postings to statalist (see the advice
in the faq).

It would be interesting to see the -if- statement in your logit
regression. Is it -if dropouts==1-? If I understand correctly, your
procedure is
- generate random variable for sorting
- run logistic regression on a (random) subsample of your data, which
excludes all observations that have been used in previous iterations
- generate predicted values for _all_ observations in the dataset.

First, whenever you work with random numbers, you should set the
random number seed in order to make your results reproducible:

set seed 123

for example.
Next, it seems perfectly sufficient to save the estimation result
during the loop. You can use

estimates store iteration`i'

within your loop, and then generate predicted values as and if you need
them:

estimates for iteration723 : predict onehat723

See -help estimates-. Do you really want to generate 1000 variables
with predicted values? If you tell us what you ultimately want to
achieve, we might be able to suggest something more suitable. It looks
as if you are doing a simulation exercise of some sort; there might be
a more direct way.

Hope this helps,
Eva

2008/9/12 Larraine Becker <[email protected]>:
> Just a correction, the program below is wrong...it should be:
>
>
>
> forvalues i=1(1)10 {
>
> use "U:\CS\combined dataset_2006.dta", clear
>
> generate random`i' = uniform()
>
> sort anyprevcs random`i'
>
> generate dropouts = 0
>
> replace dropouts =1 if anyprevcs==1 & (_N - _n) < 3575
>
>
>
> logit .......(I've deleted the variables, as there are too many to put here!)
>
>
>
> replace anyprevcs=0 if dropouts==1
>
> predict onehat`i'
>
> summarize onehat`i'
>
> drop dropouts
>
> }
>
>
>
>
>
> ________________________________
>
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Larraine Becker
> Sent: Friday, 12 September 2008 4:35 PM
> To: [email protected]
> Subject: st: Output summary stats before doing other iteration
>
>
>
> Hi all,
>
>
>
> I'm doing 1000 iterations of a logistic regression.  I have to output the
> predicted value each time before it carries on with the next iteration,
> otherwise I lose the first 999 predicted values!  I'm sure there is a way
> to go about this, but how can I save the predicted value each time so I
> end up with a table with 1000 predicted values?
>
>
>
> My program is as follows:
>
>
>
> forvalues i=1(1)10 {
>
> use "U:\CS\combined dataset_2006.dta", clear
>
> generate random`i' = uniform()
>
> sort anyprevcs random`i'
>
> generate dropouts = 0
>
> replace dropouts =1 if anyprevcs==1 & (_N - _n) < 3575
>
>
>
> logit .......(I've deleted the variables, as there are too many to put here!)
>
>
>
> replace anyprevcs=0 if dropouts==1
>
> predict onehat`i'
>
> summarize onehat`i'
>
> gen n=_n
>
> egen predicted`i'=mean(onehat`i')*n
>
> drop dropouts n
>
> }
>
>
>
> Thanks,
>
> Larraine

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
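
One possible way to keep the mean predicted value from every iteration, as asked in the thread, is to post it to a results file inside the loop instead of creating a new variable each time. The sketch below is only an illustration under stated assumptions: the shipped `nlsw88` dataset, the model `logit union south grade`, the variable names `u`, `meanhat` and `iteration`, and the count of 200 recoded observations are all stand-ins for the poster's own data, elided logit specification, and the figure 3575. It follows the order of the posted loop (fit, recode a random subset, predict) and uses -postfile-, which is a swapped-in technique and not something suggested in the thread itself.

```stata
* Sketch only. nlsw88, "logit union south grade" and the count 200 are
* assumptions standing in for the poster's dataset, elided model and 3575.
set seed 123                          // reproducible random draws

tempname results
tempfile means
postfile `results' int iteration double meanhat using "`means'"

forvalues i = 1/10 {
    sysuse nlsw88, clear              // fresh copy of the data each iteration
    generate double u = runiform()    // uniform() is the older name for this function
    sort south u                      // random order within south == 0 and south == 1
    generate byte dropouts = south == 1 & (_N - _n) < 200
    logit union south grade           // stand-in for the elided logit command
    replace south = 0 if dropouts == 1
    predict double onehat             // predicted probabilities after the recode
    summarize onehat, meanonly
    post `results' (`i') (r(mean))    // store this iteration's mean
}
postclose `results'

use "`means'", clear                  // one row per iteration
list, clean
summarize meanhat                     // overall mean across iterations
```

Because the sketch relies on temporary names and local macros, it needs to be run as a single do-file rather than line by line. Only one number per iteration is kept here; if the individual predicted values are needed later, storing the estimates as Eva describes remains an option.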