st: Random simulations of flow matrix with constraints


From   Gordon Lee <gordon@gordonlee.me>
To   statalist@hsphsun2.harvard.edu
Subject   st: Random simulations of flow matrix with constraints
Date   Fri, 22 Feb 2013 01:26:18 +0000

I have data on traffic flows amongst a set of stations.

In Stata, I would like to create 100 random simulations that preserve
the total number of incoming and outgoing links for each station. This
follows Roth et al. (2011), under "the null model".
Link: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0015923

In other words, this is a matrix with constrained row sums and column sums.
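
Written out, the constraints would read roughly as follows. This is a sketch that assumes, as in the illustrative data, no flows from a station to itself; the symbol X_{ij} for the simulated flow from station i to station j is introduced here for illustration, while o_i and d_j stand for the totals labelled oA, dB and so on in the table below.

```latex
\[
\sum_{j \neq i} X_{ij} = o_i \quad \text{for every start station } i,
\qquad
\sum_{i \neq j} X_{ij} = d_j \quad \text{for every end station } j,
\qquad
X_{ij} \in \{0, 1, 2, \dots\}.
\]
```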

---My data set (illustrative only)---

startstationid     endstationid     total_out     total_in
---------------------------------------------------------------------------
A                     B                     oA              dB
A                     C                     oA              dC
A                     D                     oA              dD
B                     A                     oB              dA
B                     C                     oB              dC
B                     D                     oB              dD
C                     A                     oC              dA
C                     B                     oC              dB
C                     D                     oC              dD
D                     A                     oD              dA
D                     B                     oD              dB
D                     C                     oD              dC

...where total_out is the sum of traffic flows from the startstation,
and total_in is the sum of traffic flows into the endstation. All
items take on integer values.


****
dofile: https://www.dropbox.com/s/c9s4gmqtd6ycmnz/do%20file.txt
dataset: https://www.dropbox.com/s/zx9oufzekusnsuj/dataset.dta

I tried the following commands, but Stata tells me "'invalid name
r(198);". I have not been able to fix this.

(note: 533361588 is the sum of all flows in the data)

forvalues j = 1/100 {
gen X'j' = 0
forvalues i = 1/533361588 {
gen Y = X'j'
sort startstationid
egen X_out=total(X'j'), by(startstationid)
sort endstationid
egen X_in=total(X'j'), by(endstationid)

gen constrained_out = 0
replace constrained_out = 1 if total_out=X_out_clone
gen constrained_in = 0
replace constrained_in = 1 if total_in=X_in_clone

gen constrained = max(constrained_out, constrained_in)

generate double u = runiform() if constrained!=1
sort u

replace X'j' = Y + 1 if _n == 1

drop Y X_out X_in X_out_clone X_in_clone constrained_out constrained_in constrained u
}
}
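
A possible reading of the error, offered as a sketch rather than a tested fix: Stata expands a local macro only when it is written with a left quote and an apostrophe (`j'), so X'j' is taken literally and is not a valid variable name. Two further points in the commands above would likely fail next: the comparisons need == rather than =, and X_out_clone and X_in_clone are referenced and dropped but never created. The version below applies those changes to the same one-unit-at-a-time idea. It runs on hypothetical toy data (three stations with made-up flows), and the names flow, open, u, chk_out and chk_in are illustrative and do not come from the original do-file.

```stata
* Toy data (hypothetical): 3 stations, all ordered pairs, made-up flows
clear
set seed 2013
input str1 startstationid str1 endstationid long flow
"A" "B" 2
"A" "C" 1
"B" "A" 1
"B" "C" 3
"C" "A" 2
"C" "B" 1
end

* Row and column targets, named as in the post
bysort startstationid: egen long total_out = total(flow)
bysort endstationid:   egen long total_in  = total(flow)

* Total flow to distribute (533361588 in the real data, 10 here)
quietly summarize flow
local N = r(sum)

forvalues j = 1/2 {
    gen long X`j' = 0
    forvalues i = 1/`N' {
        quietly {
            * running row and column sums of the simulated flows
            bysort startstationid: egen long X_out = total(X`j')
            bysort endstationid:   egen long X_in  = total(X`j')

            * a cell stays open while neither target has been reached
            gen byte open = (X_out < total_out) & (X_in < total_in)

            * random draw for open cells only; missing values sort last
            gen double u = runiform() if open
            sort u

            * add one unit to the chosen open cell, if any cell is still open
            replace X`j' = X`j' + 1 if _n == 1 & !missing(u)

            drop X_out X_in open u
        }
    }
}

* Check: simulated sums never exceed the targets
bysort startstationid: egen long chk_out = total(X1)
bysort endstationid:   egen long chk_in  = total(X1)
assert chk_out <= total_out & chk_in <= total_in
```

With the real data the inner loop would run 533361588 times, each pass including a sort, so this is likely to be very slow. The procedure can also stall before every total is met, because the only remaining capacity may sit on a pair that is not in the data (for example, a station to itself); the guard on missing(u) then leaves the matrix unchanged for the remaining passes.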


Your help is most appreciated.


Gordon