
From: Larraine Becker <[email protected]>
To: [email protected]
Subject: RE: st: RE: Output summary stats before doing other iteration
Date: Tue, 16 Sep 2008 15:49:46 +1000

Eva and Martin,

Thanks for all the help. I've got something working!

Larraine

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Eva Poen
Sent: Monday, 15 September 2008 9:59 PM
To: [email protected]
Subject: Re: st: RE: Output summary stats before doing other iteration

Larraine,

ok, so you want to see if people who previously have had a c-section
are more likely to have one again. And your strategy to do that is to
- randomly replace some of your data with ones where there have been
  zeros in the variable for "previous c-section";
- estimate a logit model, with the dependent variable c-section;
- predict probability;
- repeat.
Did I get that right? I couldn't quite see from your very first
postings how the variables you replace affect the regression; you need
to tell us the full -logit- command you are using.

I'm not quite sure I understand what you need the randomised element
for, though. If, in a regression model, you have c-section as
dependent variable, and previous c-section as regressor, then the
marginal effect for your previous c-section dummy is what you are
after. Note that this marginal effect is non-linear and will depend on
the values of all other covariates.

Also, are there any women in your sample who gave birth for the first
time? In your random replacements, you'd assign some of them the value
of one when they cannot possibly have had a previous c-section. I'm
not sure that will make sense for your interpretations. Someone with
expertise in medical statistics will be able to help you out here.

If you really want to go down the simulation route, then -simulate- is
the way to go, as Martin said. The manual entry should get you a long
way. However, I don't quite see what you can learn from this exercise.
You'd get 1000 estimation results, and what these are will highly
depend on the fraction of data you replace, among other things.
How do you then interpret your 1000 estimations of the coefficient on
previous c-section? You will really need to show us the -logit-
command before we can help any further, I think.

Eva

2008/9/15 Larraine Becker <[email protected]>:
> Thanks Eva.
>
> Sorry for HTML posting - I've hopefully changed this. Please let me
> know if it didn't.
>
> Thanks for the advice - I'll try to answer and explain a bit more to
> clarify what I am trying to do.
>
> First, there is no -if- statement in my logit regression.
>
> Basically I'm trying to see if people who had caesareans are more likely
> to have another caesarean. So, I'm randomly replacing the number of
> people who had previous caesareans and then predicting future
> caesareans. I want to do this many times and just use the average.
>
> Not sure if this makes sense? I'm basically the "programmer", so it's
> not my project - I need to just clarify a few things with the project
> leader as well.
>
> But, my main question is just trying to repeat this procedure more than
> once, and providing me with the predicted values and the mean of all
> observations for each iteration. I can then later on calculate the
> overall mean.
>
> I will try what you've suggested - if you have any further ideas, please
> let me know.
>
> Thanks,
> Larraine
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Eva Poen
> Sent: Friday, 12 September 2008 5:31 PM
> To: [email protected]
> Subject: Re: st: RE: Output summary stats before doing other iteration
>
> Larraine,
>
> please don't use html for your postings to statalist (see the advice
> in the faq).
>
> It would be interesting to see the -if- statement in your logit
> regression. Is it -if dropouts==1-?
> If I understand correctly, your
> procedure is
> - generate random variable for sorting
> - run logistic regression on a (random) subsample of your data, which
>   excludes all observations that have been used in previous iterations
> - generate predicted values for _all_ observations in the dataset.
>
> First, whenever you work with random numbers, you should set the
> random number seed in order to make your results reproducible:
>
> set seed 123
>
> for example.
> Next, it seems perfectly sufficient to save the estimation result
> during the loop. You can use
>
> estimates store iteration`i'
>
> within your loop, and then generate predicted values as and if you need
> them:
>
> estimates for iteration723 : predict onehat723
>
> See -help estimates-. Do you really want to generate 1000 variables
> with predicted values? If you tell us what you ultimately want to
> achieve, we might be able to suggest something more suitable. It looks
> as if you are doing a simulation exercise of some sort; there might be
> a more direct way.
>
> Hope this helps,
> Eva
>
> 2008/9/12 Larraine Becker <[email protected]>:
>> Just a correction, the program below is wrong...it should be:
>>
>> forvalues i=1(1)10 {
>>     use "U:\CS\combined dataset_2006.dta", clear
>>     generate random`i' = uniform()
>>     sort anyprevcs random`i'
>>     generate dropouts = 0
>>     replace dropouts =1 if anyprevcs==1 & (_N - _n) < 3575
>>
>>     logit .......(I've deleted the variables, as there are too many to put here!)
>>
>>     replace anyprevcs=0 if dropouts==1
>>     predict onehat`i'
>>     summarize onehat`i'
>>     drop dropouts n
>> }
>>
>> ________________________________
>>
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Larraine Becker
>> Sent: Friday, 12 September 2008 4:35 PM
>> To: [email protected]
>> Subject: st: Output summary stats before doing other iteration
>>
>> Hi all,
>>
>> I'm doing 1000 iterations of a logistic regression. I have to output the
>> predicted value each time before it carries on with the next iteration,
>> otherwise I lose the first 999 predicted values! I'm sure there is a way
>> to go about this, but how can I save the predicted value each time so I
>> end up with a table with 1000 predicted values?
>>
>> My program is as follows:
>>
>> forvalues i=1(1)10 {
>>     use "U:\CS\combined dataset_2006.dta", clear
>>     generate random`i' = uniform()
>>     sort anyprevcs random`i'
>>     generate dropouts = 0
>>     replace dropouts =1 if anyprevcs==1 & (_N - _n) < 3575
>>
>>     logit .......(I've deleted the variables, as there are too many to put here!)
>>
>>     replace anyprevcs=0 if dropouts==1
>>     predict onehat`i'
>>     summarize onehat`i'
>>     gen n=_n
>>     egen predicted`i'=mean(onehat`i')*n
>>     drop dropouts n
>> }
>>
>> Thanks,
>>
>> Larraine

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
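
For readers who land on this thread with the same question (keeping one summary number per iteration when the dataset is reloaded each time), one possible approach is -postfile-. The sketch below is only an illustration and not the solution Larraine ended up with, which the thread does not show. It runs on toy data: the variables cs and age, the sample size, and the cutoff of 100 are assumptions made up for the example, since the real -logit- command and covariates were never posted. Only anyprevcs and the order of the steps come from the posted loop.

```stata
clear all
set seed 123                       // reproducible random numbers, as Eva advised

* Toy data standing in for the real file (cs and age are hypothetical)
set obs 2000
generate anyprevcs = runiform() < 0.3
generate age = 20 + floor(20*runiform())
generate cs = runiform() < invlogit(-2 + 1.5*anyprevcs + 0.03*age)
tempfile base
save `base'

* Results file: one row per iteration, holding the mean predicted probability
tempname memhold
tempfile results
postfile `memhold' iter meanhat using `results'

forvalues i = 1/10 {
    use `base', clear
    generate double u = runiform()
    sort anyprevcs u               // anyprevcs==1 ends up last, in random order
    * flag anyprevcs==1 cases among the last 100 observations (100 is assumed)
    generate byte dropouts = anyprevcs == 1 & (_N - _n) < 100
    quietly logit cs anyprevcs age // hypothetical model; the real one was elided
    replace anyprevcs = 0 if dropouts == 1
    predict double onehat, pr      // predicted probabilities after the recode
    quietly summarize onehat, meanonly
    post `memhold' (`i') (r(mean)) // keep only the iteration's mean
}
postclose `memhold'

use `results', clear
list
summarize meanhat                  // overall mean across iterations
```

Because each iteration posts a single row, nothing is lost when the data are reloaded at the top of the loop, and there is no need to create one variable of predictions per iteration. -simulate-, as Martin and Eva suggested, is an alternative if the loop body is wrapped in a program.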

