Statalist



Re: st: generating random numbers from a specified list


From   "Eva Poen" <eva.poen@gmail.com>
To   Statalist <statalist@hsphsun2.harvard.edu>
Subject   Re: st: generating random numbers from a specified list
Date   Fri, 15 Aug 2008 17:14:41 +0100

Jochen,

Is your list of numbers equally spaced, in intervals of five? If so,
this is quite easy to do.
Let's assume your list starts at 10000, has intervals of five and goes
to 10500 (101 values).

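set seed 123
* uniform() lies in [0,1), so floor(101*uniform()) is 0, 1, ..., 100 with equal
* probability; round(100*uniform(),1) would pick 0 and 100 only half as often
bysort firm (year) : gen random = 10000 + 5*floor(101*uniform()) if _n==1
bysort firm (year) : replace random = random[1]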

and you are done.
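A minimal, self-contained sketch of the equally spaced case, in case you want to try it on toy data first. The firm and year values are taken from the example in the question; -runiform()- is the newer name for -uniform()- and is assumed to be available in your Stata. It uses floor(101*runiform()) so that each of the 101 values is equally likely, and the two -assert- lines check that every draw lies on the grid and is constant within firm.

```stata
clear
input long firm int year
1000129 1975
1000129 1976
1000129 1977
1000530 1993
1000530 1994
end

set seed 123
* one draw per firm from 10000, 10005, ..., 10500, copied to all years
bysort firm (year) : gen random = 10000 + 5*floor(101*runiform()) if _n==1
bysort firm (year) : replace random = random[1]

* every value must lie on the grid ...
assert inrange(random, 10000, 10500) & mod(random, 5) == 0
* ... and must not vary within firm
bysort firm (random) : assert random[1] == random[_N]

sort firm year
list, sepby(firm)
```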
If it is not equally spaced, you could just add a row identifier to
your second data file (where you keep the list of numbers):

gen id = _n
sort id
save, replace

Say you have got 478 numbers in that list. Now you can create an
integer random number in the interval [1,478] in your original
dataset:

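set seed 123
* 1+floor(478*uniform()) is 1, 2, ..., 478 with equal probability;
* 1+round(477*uniform(),1) would pick 1 and 478 only half as often
bysort firm (year) : gen id = 1 + floor(478*uniform()) if _n==1
bysort firm (year) : replace id = id[1]
sort id firm year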

and then -merge- on your id variable. Make sure you inspect _merge
afterwards, since you don't want to keep the observations that only
appear in your second data file.

Hope this helps,
Eva


2008/8/15 Jochen Späth <jochen.spaeth@iaw.edu>:
> Dear statalisters,
>
> While trying to build a new dataset that is (almost) entirely random, I am confronted with the following problem: How can I tell Stata to generate a variable by randomly picking numbers from a (previously) specified list? Here's a minimal example of what I want to do:
>
>
> The data currently look as follows:
>
> firm    year  az_ges  az_ges_vz
> 1000129 1975    15      11
> 1000129 1976    14      11
> 1000129 1977    8       6
> 1000129 1978    26      20
>
> 1000530 1993    12      9
> 1000530 1994    29      22
> 1000530 1995    26      20
> 1000530 1996    14      11
> 1000530 1997    18      14
>
> and so on, where firm is the cross-section identifier and year the time variable. Note that the data are organised in long format.
> Now, I'd like to add a variable which contains, for each firm, a number picked at random from a specified list, e.g. list = (10000, 10005, 10010, and so on). This "random" number should be the same for a given firm across years. The list exists already, albeit in a separate data file that can't be merged with the data set at hand because it lacks the identifiers firm and year.
>
> Any suggestions?
>
> Thanks,
>
> Jochen
>
> -------------------------------------------------------------------------------------------
>
> Jochen Späth
> Dipl.-Volkswirt
> Institut für Angewandte Wirtschaftsforschung (IAW) Tübingen
> Ob dem Himmelreich 1
> 72074 Tübingen
> Tel.: +49-(0)7071-9896-14
> Fax: +49-(0)7071-9896-99
> EMail: jochen.spaeth@iaw.edu
> IAW-Homepage: www.iaw.edu
