st: Draw a random sample of my data and merge with original paneldata

Subject   st: Draw a random sample of my data and merge with original paneldata
Date   Mon, 1 Oct 2012 17:49:37 +0200 (CEST)

Dear Friends,
I have a small problem with generating a new dataset.
I first use the commands "sample" and "set seed" to generate a new dataset.
The reason is that US firms account for more than 50% of the dataset, which
affects the cross-country results very strongly. However, in terms of
worldwide industry business volume, US firms account for only 29%.
Therefore, I draw a random sample of the US firms so that they account for
29% of the dataset. I have panel data with countryID, firmID and year.
After setting the seed and drawing the random sample, I would like to merge
the randomly drawn dataset of US firms (with random firmID and random
years) with my original panel data (with countryID, firmID and year). But
how can I merge the datasets so that only the randomly sampled US firms are
kept and the other US firms are dropped? How can I generate a variable that
marks only the randomly sampled US firms within the original panel
dataset?
Please help. Thank you in advance.
Mehmet Altun

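My commands look like:
#delimit ;
use all_data8;

/* keep one observation per firm */
by firmID, sort: gen firms = _n;
keep if firms==1;

/* keep US firms only (countryID 244 = USA) */
keep if countryID==244;
sort firmID, stable;
set seed 260581;

/* draw a 63% random sample of the remaining observations */
sample 63;
sort year;
save usfirms_1, replace;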
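
For the merge step asked about above, one possible approach is sketched here. It is only a sketch under assumptions: it assumes usfirms_1.dta holds one row per sampled US firm (as the commands above produce), that firmID uniquely identifies a firm, and that a Stata version with the m:1 merge syntax (Stata 11 or later) is available. The variable name in_sample and the file name us_sample_ids are hypothetical and chosen only for illustration.

```stata
#delimit cr   // switch back to line-end delimiting for this sketch

* Reduce the sampled firms to an ID list with a flag (hypothetical names)
use usfirms_1, clear
keep firmID
gen byte in_sample = 1
save us_sample_ids, replace

* Attach the flag to every firm-year of the original panel
use all_data8, clear
merge m:1 firmID using us_sample_ids, keep(master match) nogenerate
replace in_sample = 0 if missing(in_sample)

* Keep all non-US firms plus the randomly sampled US firms
keep if countryID != 244 | in_sample == 1
```

If the full panel should stay intact, the last line could be left out and in_sample used as a condition in later commands instead.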