Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Creating square matrix with obs counts from 2 non unique id variables


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Creating square matrix with obs counts from 2 non unique id variables
Date   Tue, 19 Apr 2011 17:44:42 +0100

This is ugly but it works.

The idea is that you clone the data and then flip the variables round
in the clone. That ensures a square matrix. You can take out the fake
entries by rounding down as each was given a very small weight.

Someone should be able to improve on this.

clear
input obs ID1 ID2
1     10  12
2     10  11
3     11  12
4     11  10
end
local N = _N
local Np1 = _N + 1
expand 2
replace ID1 = ID2[_n - `N'] in `Np1'/L
replace ID2 = ID1[_n - `N'] in `Np1'/L
gen wt = cond(_n <= `N', 1, 1e-9)

(1)

collapse (sum) wt, by(ID1 ID2)
replace wt = round(wt)

OR

(2)

tab ID1 ID2 [iw=wt], matcell(foo)
levelsof ID1, local(names)
mata : st_matrix("foo", round(st_matrix("foo")))
matrix rownames foo = `names'
matrix colnames foo = `names'

On Tue, Apr 19, 2011 at 4:47 PM, Diogo L Pinheiro
<diogo.pinheiro@pubpolicy.gatech.edu> wrote:

> I've been trying to create a square matrix with counts of observations from
> two non unique ID variables. In this case, one id variable has the code for
> an organization an individual was and one for where the individual is. For
> example:
>
>
> Observation    ID1         ID2
> 1.                  10          12
> 2.                  10          11
> 3.                  11          12
> 4.                  11          10
>
> In the example above, observation 1 was at organization #10 at one time and
> organization #12 at another. Observation 2 was at 10 and then 11. And so on.
>
>
> From that, I want to create a square matrix that would look something like:
>
>
>              10           11           12
>
> 10           0              1            1
> 11           1              0            1
> 12           0              0            0
>
> So that I could have all the counts of each combination of organizations.
>
> The main problem is getting a square matrix in the end. I can create a non
> square matrix in this case by using
> tab ID2, gen(newID)
> collapse (sum) newID, by(ID1)
>
> but the problem is that this won't be a square matrix, as not every ID
> number shows up in both variables.. In the example above it would be missing
> a line (since not every ID number present in ID1 is present in ID2, and vice
> versa).
>
> So how could I get a square matrix from the situation above? I've found
> commands to create similar matrices, but they require unique IDs and only
> provide binary results for the existing combinations.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index