Re: st: select three non-identical random elements


From   Tomáš Houška <xbender@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: select three non-identical random elements
Date   Thu, 4 Apr 2013 13:56:31 +0200

Hello Nick,

thank you for your tip, but it does not solve my problem.

Let me give a bigger picture, so people might direct me to a generally
more efficient way to approach the problem.

I have panel sales data containing prices over time in different
cities (market_name). Each city belongs to some region (there are 5
regions overall, about 50 cities).

Example of data structure
Market_name / region / period / product/ price
1 / 1 / 2001m1 / 1 / 2.3
1 / 1 / 2001m2 / 1 / 2.4
...
1 / 1 / 2001m1 / 2 / 1.8
...
2 / 1 / 2001m1 / 1 / 2.8
...
5 / 2 / 2001m1 / 1 / 1.9
5 / 2 / 2001m1 / 2 / 3.2
...

My aim is to create an instrumental variable by matching cities between
regions. Eventually I would like to get a new price variable for each
combination of market-period-product that holds the price of
the same product in the same period, but in a market from a different
region. So in the example above, you would pair product 1 from
market_name==1 and period==tm(2001m1) with product 1 in
market_name==5 (since it is in a different region) in the same period.
For the first observation, the value of the new price variable would
then be 1.9 (since that is the price of the same product in the same
period in a different market). Similarly, you would use the value 3.2
where the original price is 1.8.

Since each region has several markets (from 3 to 15), my aim was to
control only the matching between regions and to randomize the matching
between markets within each region pair. That is, I would only specify
that e.g. markets from region 1 are to be substituted by markets from
region 2, and the actual matching of markets would be random (so market 1
would be randomly assigned a market from region 2 and the price
variable would get that market's prices). This way I get one set of
matches that I can work with, and by changing -seed- I could easily get a
different matching, allowing me to test whether it works better in my
demand model or not.

I am currently generating a new variable "market_iv1", which is the
result of random matching between markets in a given region pair. As a
result, some markets within region 1 have the same match from
region 2 (that is fine). But I would also like to create variables
"market_iv2" and "market_iv3", which should hold a different match from
the same region for each market (i.e. if region 1 is being
substituted by region 2, and for market 1 the match in
"market_iv1" was market 5, then "market_iv2" and "market_iv3"
should be different markets from region 2).

The point of my original question was to find out how to ensure that,
if market 1 in region 1 was matched with market 5 from region 2, then
the second match of market 1 in region 1 will be a market other than
market 5.
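
One possible way to get three non-identical matches is to draw from the list of candidate markets without replacement, removing each pick from the list before the next draw. The following is only a sketch for a single market: it assumes the local `mw' already holds the region-2 market ids (as returned by -levelsof-) and that the region has at least three markets. The names pool, pick1, pick2 and pick3 are made up for illustration.

```stata
* Sketch: draw three distinct markets from one region's list.
* Assumes local mw holds the candidate market ids (from -levelsof-)
* and contains at least three of them.
local pool `mw'
forvalues k = 1/3 {
    * number of candidates still available
    local n : word count `pool'
    * random position between 1 and n
    local rand = floor(`n'*runiform()) + 1
    local pick`k' : word `rand' of `pool'
    * remove the pick so it cannot be drawn again
    local pool : list pool - pick`k'
}
display "`pick1' `pick2' `pick3'"
```

The three picks could then be written to "market_iv1", "market_iv2" and "market_iv3" with -replace- for the market in question.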

I am currently doing just the matching part and planned on using
-merge- to get the prices according to the match. My current code is
this:
/*
regions:
wc = west coast, mw = midwest, se = southeast, ne = northeast, ec=east coast
*/
* generate variable holding the matched market
gen market_iv1 = .

* prepare list for random selection
levelsof market_name if region==1, local(wc)
local wc_size:word count `wc'
levelsof market_name if region==2, local(mw)
local mw_size:word count `mw'
levelsof market_name if region==3, local(se)
local se_size:word count `se'
levelsof market_name if region==4, local(ne)
local ne_size:word count `ne'
levelsof market_name if region==5, local(ec)
local ec_size:word count `ec'

* loop over markets; assign each a random market from the paired region
levelsof market_name, local(market_id)
qui foreach market of local market_id {
    * look up the region of the current market
    sum region if market_name==`market', meanonly
    local reg=r(mean)
    * region 1 (wc) draws from region 4 (ne)
    if `reg'==1 {
        local rand=floor((`ne_size'-1+1)*runiform() + 1)
        local replacement: word `rand' of `ne'
        replace market_iv1=`replacement' if market_name==`market'
    }
    * region 2 (mw) draws from region 3 (se)
    if `reg'==2 {
        local rand=floor((`se_size'-1+1)*runiform() + 1)
        local replacement: word `rand' of `se'
        replace market_iv1=`replacement' if market_name==`market'
    }
    * region 3 (se) draws from region 5 (ec)
    if `reg'==3 {
        local rand=floor((`ec_size'-1+1)*runiform() + 1)
        local replacement: word `rand' of `ec'
        replace market_iv1=`replacement' if market_name==`market'
    }
    * region 4 (ne) draws from region 5 (ec)
    if `reg'==4 {
        local rand=floor((`ec_size'-1+1)*runiform() + 1)
        local replacement: word `rand' of `ec'
        replace market_iv1=`replacement' if market_name==`market'
    }
    * region 5 (ec) draws from region 1 (wc)
    if `reg'==5 {
        local rand=floor((`wc_size'-1+1)*runiform() + 1)
        local replacement: word `rand' of `wc'
        replace market_iv1=`replacement' if market_name==`market'
    }
}
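
The -merge- step mentioned above could look roughly like the sketch below. It assumes that "market_iv1" has already been filled and that market_name, period and product together uniquely identify an observation. The variable name price_iv1 and the tempfile name prices are made up for illustration.

```stata
* Sketch: attach the price of the matched market as price_iv1.
* Assumes (market_name, period, product) uniquely identifies rows.
preserve
keep market_name period product price
* rename so the lookup file is keyed by the matched market id
rename market_name market_iv1
rename price price_iv1
tempfile prices
save `prices'
restore
* several markets may share a match, hence m:1
merge m:1 market_iv1 period product using `prices', keep(master match) nogenerate
```

Observations whose matched market has no price for that product and period would end up with a missing price_iv1.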


I hope I was able to explain my aim clearly. Any help is appreciated.
Thank you
Tomas