Statalist


st: R: Bootstrapping new observations to add to an existing dataset


From   "Carlo Lazzaro" <carlo.lazzaro@tiscalinet.it>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: R: Bootstrapping new observations to add to an existing dataset
Date   Mon, 22 Jun 2009 07:45:15 +0200

Dear Davide,
First of all, I would point you to -help bsample- (in Stata 9.2 SE).
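
As a rough sketch of how -bsample- could be used to grow a 200-observation dataset to 1000 observations: the toy variables var1 and var2, the seed, the target size and the marker variable resampled are all assumptions made for illustration, not part of Davide's data. Because -bsample- cannot draw more observations than are in memory, the sketch first duplicates the rows with -expand- and then resamples them.

```stata
* Hypothetical stand-in for the real 200-observation dataset
clear
set seed 12345
set obs 200
generate var1 = uniform()
generate var2 = 2*var1 + invnorm(uniform())

* Keep a copy of the original 200 observations
tempfile original
save `original'

* Make 4 copies of every row (800 rows), then draw 800 rows with replacement;
* each draw is equally likely to be any one of the original 200 rows
expand 4
bsample
generate byte resampled = 1

* Append the original observations to the 800 resampled ones
append using `original'
replace resampled = 0 if missing(resampled)
count
```

Note that this resamples whole observations, so the relationships between var1, var2, ... are preserved in the new rows; drawing each variable separately would break them.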

Otherwise, if you are interested in bootstrapping the sample mean of
your dataset (say, creating 1000 bootstrap replications from your
original dataset of 200 observations), things are easier:
-----------------------begin example------------------------------
* create a 200-observation example dataset
set obs 200
g A=uniform()
sum A
* bootstrap the sample mean with 1000 replications
bootstrap r(mean), reps(1000): sum A
---------------------end example---------------------------------
-help bootstrap- will give you a full list of options (seed, saving the
replications to a separate file, and so on).
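
For instance, the seed() and saving() options mentioned above might be combined as in the following sketch. The file name bsmeans, the seed values and the toy variable A are assumptions chosen for illustration only.

```stata
* Hypothetical 200-observation example dataset
clear
set seed 2009
set obs 200
generate A = uniform()

* Bootstrap the sample mean: fix the seed for reproducibility and
* save the 1000 replicate means to bsmeans.dta in the working directory
bootstrap mean=r(mean), reps(1000) seed(12345) saving(bsmeans, replace) nodots: summarize A

* Inspect the saved bootstrap replications
use bsmeans, clear
summarize mean
```

The saved file holds one observation per replication, so it can be used afterwards for histograms or percentile intervals.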

Another option would be to assume that your sample of 200 observations comes
from a given distribution (e.g., Gamma); hence, you can simulate accordingly.
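
A minimal sketch of this parametric route, assuming a Gamma distribution fitted by the method of moments (shape = mean^2/variance, scale = variance/mean). The variable A, the "true" parameters 2 and 3 and the scalar names a_hat and b_hat are made up for illustration, and rgamma() requires a more recent Stata than version 9 (10.1 or later, as far as I know).

```stata
* Hypothetical stand-in for the original 200 observations
clear
set seed 12345
set obs 200
generate A = rgamma(2, 3)

* Method-of-moments estimates of the Gamma shape and scale
summarize A
scalar a_hat = r(mean)^2 / r(Var)
scalar b_hat = r(Var) / r(mean)

* Add 800 new observations drawn from the fitted Gamma
set obs 1000
replace A = rgamma(scalar(a_hat), scalar(b_hat)) in 201/1000
summarize A
```

Whether a Gamma (or any other family) is a sensible choice depends on the data, so this is only as good as the distributional assumption.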

-help simulate- will introduce you to the Monte Carlo approach.
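
A small sketch of what a -simulate- run might look like: the program name gammamean, the Gamma parameters (2, 3) and the seed are hypothetical, and rgamma() again assumes a Stata version more recent than 9.

```stata
* Hypothetical program: draw 200 Gamma values and return their mean
capture program drop gammamean
program define gammamean, rclass
    drop _all
    set obs 200
    tempvar x
    generate `x' = rgamma(2, 3)
    summarize `x'
    return scalar mean = r(mean)
end

* Run the program 1000 times and collect the 1000 sample means
set seed 12345
simulate mean = r(mean), reps(1000) nodots: gammamean
summarize mean
```

After -simulate- finishes, the dataset in memory contains one observation per replication, here the 1000 simulated sample means.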

HTH and Kind Regards,
Carlo

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Davide Cantoni
Sent: Monday, 22 June 2009 5:58
To: statalist@hsphsun2.harvard.edu
Subject: st: Bootstrapping new observations to add to an existing dataset

Hello, I am stuck while thinking about this issue and I would
appreciate your suggestions. I have a dataset which I use for
simulation purposes, to test whether my do-files run correctly. The
issue is that this dataset is too short for many applications, as it
has only 200 observations.

What I want to do is expand this dataset to include more observations,
while keeping the same (unknown to me) data generating process that
created the first 200 observations. So I was thinking of proceeding in a
bootstrap manner, by drawing the values of each of the variables
(var1, var2, etc.) for the new observations from the empirical
distributions of var1, var2, ... in the first 200 observations. Yet I
have no idea how to implement this. I'm grateful for any ideas.
Thanks for your interest,

Davide
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/





