
st: Random simulations of flow matrix with constraints

From: Gordon Lee
To: statalist@hsphsun2.harvard.edu
Subject: st: Random simulations of flow matrix with constraints
Date: Fri, 22 Feb 2013 01:26:18 +0000

```I have data on traffic flows amongst a set of stations.

What I would like to do in Stata is create 100 random simulations
that preserve the total number of incoming and outgoing links for each
station. This is in the footsteps of Roth et al (2011) under "the null

In other words, this is a matrix with constrained row sums and column sums.

---My data set (illustrative only)---

startstationid     endstationid     total_out     total_in
---------------------------------------------------------------------------
A                     B                     oA              dB
A                     C                     oA              dC
A                     D                     oA              dD
B                     A                     oB              dA
B                     C                     oB              dC
B                     D                     oB              dD
C                     A                     oC              dA
C                     B                     oC              dB
C                     D                     oC              dD
D                     A                     oD              dA
D                     B                     oD              dB
D                     C                     oD              dC

...where total_out is the sum of traffic flows from the startstation,
and total_in is the sum of traffic flows into the endstation. All
items take on integer values.

****
dofile: https://www.dropbox.com/s/c9s4gmqtd6ycmnz/do%20file.txt
dataset: https://www.dropbox.com/s/zx9oufzekusnsuj/dataset.dta

I tried the following commands, but Stata tells me "'invalid name
r(198);". I can't seem to fix this.

(note: 533361588 is the sum of all flows in the data)

forvalues j = 1/100 {
gen X'j' = 0
forvalues i = 1/533361588 {
gen Y = X'j'
sort startstationid
egen X_out=total(X'j'), by(startstationid)
sort endstationid
egen X_in=total(X'j'), by(endstationid)

gen constrained_out = 0
replace constrained_out = 1 if total_out=X_out_clone
gen constrained_in = 0
replace constrained_in = 1 if total_in=X_in_clone

gen constrained = max(constrained_out, constrained_in)

generate double u = runiform() if constrained!=1
sort u

replace X'j' = Y + 1 if _n == 1

drop Y X_out X_in X_out_clone X_in_clone constrained_out constrained_in constrained u
}
}

Your help is most appreciated.

Gordon
```
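
The allocation loop in the post above can be sketched on toy data as follows. This is a minimal sketch, not a tested fix for the linked do-file: the variable names s, e, flow and avail and the toy flow values are made up for illustration. It relies on two Stata conventions: a local macro is expanded with a left single quote and an apostrophe (X`j', not X'j'), which is a likely cause of the "invalid name" message, and equality in an if qualifier is tested with ==, not =.

```stata
* Toy data: 4 stations, every ordered pair with start != end.
* s, e and flow are hypothetical stand-ins for startstationid,
* endstationid and the observed flows.
clear
set seed 12345
set obs 16
gen byte s = ceil(_n/4)
gen byte e = mod(_n - 1, 4) + 1
drop if s == e
gen long flow = 1 + mod(3*s + 5*e, 4)

* Row and column totals that each simulation must not exceed
bysort s: egen long total_out = total(flow)
bysort e: egen long total_in  = total(flow)

* Total number of flow units to hand out
quietly summarize flow
local N = r(sum)

forvalues j = 1/2 {
    gen long X`j' = 0
    forvalues i = 1/`N' {
        * Current simulated row and column sums
        bysort s: egen long X_out = total(X`j')
        bysort e: egen long X_in  = total(X`j')

        * A cell is available if neither its row nor its column is full
        gen byte avail = (X_out < total_out) & (X_in < total_in)
        quietly count if avail
        if r(N) == 0 {
            * Nothing left to fill: stop this simulation early
            drop X_out X_in avail
            continue, break
        }

        * Pick one available cell at random; missing u sorts last
        gen double u = runiform() if avail
        sort u
        quietly replace X`j' = X`j' + 1 in 1

        drop X_out X_in avail u
    }
}
```

Each pass re-sorts the data, so looping over 533361588 units this way would be very slow. The one-unit-at-a-time scheme can also stall before every unit is placed, which is why the sketch stops when no cell is available instead of assuming the totals are met exactly.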