In addition to the other solutions, have a look at the
existing -egen- function -rndsub()- in the -egenmore-
package on SSC (-ssc install egenmore-).
Nick
n.j.cox@durham.ac.uk
Maarten Buis
> To sample approx. 80% you could make a selection variable by
> -gen select = uniform()>.2- and then -regress price mpg
> foreign if select==1- to run a regression on the selected
> part only, and use for instance -predict yhat if select==0-
> to get the statistics you want. Have a look at the
> -simulate- command to repeat the procedure and store the estimates.
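
A rough sketch of how those pieces might fit together, not a
definitive solution. Assumptions that are mine and not from the
posts above: the program name -splitreg-, the returned names, the
seed, the output file name, and the use of auto.dta as stand-in
data. It also uses -runiform()-, the newer name for -uniform()-,
and -export excel-, which is only available in more recent Stata
releases. The held-out errors are summarized as a mean squared
prediction error per replication; keep the individual errors
instead if those must be reported.

```stata
* Sketch only: -splitreg- and the r() names are hypothetical.
capture program drop splitreg
program define splitreg, rclass
    tempvar select yhat err2
    // flag roughly 80% of observations as the estimation sample
    generate byte `select' = runiform() > .2
    // fit the model on the ~80% part only
    regress price mpg foreign if `select' == 1
    return scalar r2        = e(r2)
    return scalar rmse      = e(rmse)
    return scalar b_mpg     = _b[mpg]
    return scalar b_foreign = _b[foreign]
    // predict for the held-out ~20% and average the squared errors
    predict double `yhat' if `select' == 0
    generate double `err2' = (price - `yhat')^2
    summarize `err2', meanonly
    return scalar mspe = r(mean)
end

sysuse auto, clear
set seed 12345
// 100 random splits; results replace the data in memory
simulate r2=r(r2) rmse=r(rmse) b_mpg=r(b_mpg) b_foreign=r(b_foreign) ///
    mspe=r(mspe), reps(100): splitreg
summarize
// one row per replication, written to a spreadsheet
export excel using "splitreg_results.xlsx", firstrow(variables) replace
```

Because the split lives in a temporary variable, no observations
are dropped and nothing needs to be saved per replication; each
call simply draws a fresh 80/20 split of the same data.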
Yang Li
> I am required to randomly partition my sample into two groups
> with an 80%/20% split, and run a normal OLS regression on the
> 80% set (report R-squared, parameters, significance indicators,
> MSE/(var expected)). Then, for each observation in my 20% set,
> I need to use the parameters calculated from the 80% set to
> produce and report the estimation error for the dependent
> variable. This process has to be run 100 times.
>
> I encountered the following difficulties:
> 1. How to keep both partitioned samples (80% and 20%) for
> further estimation. (I can only find the command "sample", but
> it drops the observations and does not allow me to keep the
> remaining 20% for further testing.)
> 2. How to output specific estimation results (e.g. the
> R-squared of 'reg') to a spreadsheet (e.g. Excel). (I can
> access the estimated results stored in e( ), but how can I
> output them automatically to Excel for reporting purposes?)
> 3. How to do it automatically 100 times. (How could I store
> each of the 100 partitioned samples separately? Is a do-file
> enough to handle this?)