I don't think you need any section
of the manual as support here, but
FWIW Stata's -sample- doesn't do this:
it keeps the sampled observations and
drops the rest. The unofficial -swor-
(-search swor-) will do it.
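To see what -sample- does on its own, here is a minimal sketch.
The auto dataset and the sample size of 25 are just my choices
for illustration, not anything from the original question.

```stata
* Illustration only: -sample- keeps the draw and drops everything else
sysuse auto, clear
set seed 280352
* draw 25 observations without replacement; the other 49 are dropped
sample 25, count
count
```

After this only the 25 sampled observations remain in memory, so
there is no second sample left to compare against.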
But best of all is to think from first
principles. Suppose we decide on
a validation sample of 500: then we
should be explicit about a random
number seed for reproducibility.
Your seed choice may naturally differ,
but here's one:
set seed 280352
Then we pick some random numbers
and shuffle:
gen random = uniform()
sort random
The first 500 observations, now in
random order, are one sample:
gen byte validation = _n <= 500
Your validation sample has
-validation- equal to 1 and the other
(estimation) sample has -validation-
equal to 0. Subsequent analyses
can then be restricted to either one:
... if validation
... if !validation
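Putting the steps together, here is a self-contained sketch.
The simulated 1500 observations and the -id- variable are my
assumptions, added so the split can be checked and the original
order restored; runiform() is the newer name for uniform() in
more recent versions of Stata.

```stata
* Sketch only: simulated data standing in for the real dataset
clear
set obs 1500
set seed 280352
* remember the original order (hypothetical identifier)
gen long id = _n
* shuffle by sorting on a random number
gen double random = runiform()
sort random
* first 500 observations in random order form the validation sample
gen byte validation = _n <= 500
* restore the original order and tidy up
sort id
drop random
tabulate validation
```

The tabulation should show 500 observations with validation 1
and 1000 with validation 0.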
Having written that down, I now
remember that this is already an FAQ:
How can I take random samples from an existing dataset?
http://www.stata.com/support/faqs/stat/sampling.html
Nick
n.j.cox@durham.ac.uk
Richard Hiscock
> I would be grateful for some direction to the area in the stata manual
> that explains how to do the following
> I am trying to split a dataset (n ~1500) into an estimation sample and a
> validation sample by random sampling (n = 400-500) from the dataset
>
> Later I wish to compare results with that using bstrap techniques