Re: st: creating a numeric matrix from string variables


From   wgould@stata.com (William Gould, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: creating a numeric matrix from string variables
Date   Fri, 05 Jun 2009 09:04:49 -0500

In the thread about using Mata to create a matrix of the number of 
agreements between firms, and in response to my last posting, 
Joe. J. said the first part was "exactly what I was looking for" 
and wondered politely why I had felt obligated to add the 
second part.  To remind you, in the final iteration of the first 
part, the resulting data looks like this, 

              +-------------------------------------------------+
              | company   f_11A   f_11K   f_12Z   f_14T   f_21S |
              |-------------------------------------------------|
           1. |     11A       0       0       2       1       0 |
           2. |     11K       0       0       1       0       1 |
           3. |     12Z       2       1       0       1       1 |
           4. |     14T       1       0       1       0       0 |
           5. |     21S       0       1       1       0       0 |
              +-------------------------------------------------+

and in the part I felt obligated to add, I recorded the data like this:

              +-----------------------------------+
              | c1   c2   company1   company2   n |
              |-----------------------------------|
           1. |  1    2        11A        11K   0 |
           2. |  1    3        11A        12Z   2 |
           3. |  1    4        11A        14T   1 |
           4. |  1    5        11A        21S   0 |
           5. |  2    3        11K        12Z   1 |
              |-----------------------------------|
           6. |  2    4        11K        14T   0 |
           7. |  2    5        11K        21S   1 |
           8. |  3    4        12Z        14T   1 |
           9. |  3    5        12Z        21S   1 |
          10. |  4    5        14T        21S   0 |
              +-----------------------------------+
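
For readers who want to move from the first organization to the second without Mata, here is a minimal sketch using -reshape-. It is not the Mata code from the earlier postings, only one way that should give the same ten pairs. The data are typed in from the listing above; the variable c1 and c2 index columns are not recreated.

```stata
* Sketch only: rebuild the first organization from the listing above
clear
input str3 company f_11A f_11K f_12Z f_14T f_21S
"11A" 0 0 2 1 0
"11K" 0 0 1 0 1
"12Z" 2 1 0 1 1
"14T" 1 0 1 0 0
"21S" 0 1 1 0 0
end

* The row company becomes company1; the suffix of f_* becomes company2
rename company company1
reshape long f_, i(company1) j(company2) string
rename f_ n

* The matrix is symmetric, so keep each unordered pair once
* (this also drops the diagonal, a company paired with itself)
keep if company1 < company2

sort company1 company2
list company1 company2 n
```

The -keep if company1 < company2- line relies on the matrix being symmetric; if agreements were directional, one would keep both orderings instead.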

There were two reasons for my unasked-for suggestion.

First, Joe mentioned something about merging in the characteristics of the
firms and, looking at the first organization, that looked hard to do.
Obviously, it would be easy to merge in the characteristics of one of the
parties -- the one recorded in the variable company -- but what about the
characteristics of the other parties?  In the first observation, for 
instance, also needed would be the characteristics of 12Z and 14T.

That led me to the second organization, where each observation records a pair
of parties, so that the parties are on equal footing.  One could easily merge
the characteristics of company1 and of company2 into the data.
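
A sketch of that double merge follows. The firm-level file and its variable size are hypothetical, invented here only to have something to merge; the pair data are typed in from the second listing. The -merge m:1- syntax assumes Stata 11 or later.

```stata
* Hypothetical firm characteristics: one row per company (size is made up)
clear
input str3 company size
"11A" 120
"11K"  45
"12Z" 300
"14T"  80
"21S"  60
end
tempfile firms
save `firms'

* Pair-level data, as in the second organization
clear
input str3 company1 str3 company2 n
"11A" "11K" 0
"11A" "12Z" 2
"11A" "14T" 1
"11A" "21S" 0
"11K" "12Z" 1
"11K" "14T" 0
"11K" "21S" 1
"12Z" "14T" 1
"12Z" "21S" 1
"14T" "21S" 0
end

* Merge in the characteristics of the first party ...
rename company1 company
merge m:1 company using `firms', keep(master match) nogenerate
rename company company1
rename size size1

* ... and then, the same way, those of the second party
rename company2 company
merge m:1 company using `firms', keep(master match) nogenerate
rename company company2
rename size size2

sort company1 company2
list company1 company2 n size1 size2
```

The temporary renaming to company is just so the key has the same name in both datasets; each party's characteristics get a 1 or 2 suffix afterwards so they do not collide.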

Second, I thought to myself, how would I analyze these data?  I should hasten
to add that I am not an expert, or at least have no reason to think that I am,
since I don't even know the details of the problem.  Were those details
revealed, I could prove I'm not an expert.  Anyway, I wasn't sure what I would
do with these data in the first organization, but how to analyze them in
the second organization seemed obvious to me.  I could use logistic or
Poisson regression, or even a hurdle model.  To be explained would be whether
(or how many) agreements there were between pairs of companies based on their
characteristics and, presumably, interactions of their characteristics.
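
To make that concrete, here is a rough sketch of the three models on pair-level data. Everything in it is an assumption for illustration: the data are simulated placeholders (not the firms in this thread), size1 and size2 stand in for whatever characteristics were merged in, and the hurdle model is fit as two separate parts. Factor-variable notation needs Stata 11 or later, and -tpoisson- needs a release that includes it (older releases used -ztp-).

```stata
* Simulated placeholder data: 200 hypothetical pairs
clear
set seed 12345
set obs 200
generate size1 = runiform()
generate size2 = runiform()
generate n = rpoisson(exp(-0.5 + size1 + size2))

* How many agreements: Poisson regression with an interaction
poisson n c.size1##c.size2

* Whether there is any agreement at all: logistic regression
generate byte any = n > 0
logit any c.size1##c.size2

* A two-part hurdle: the logit above for zero versus positive,
* then a zero-truncated Poisson for the positive counts
tpoisson n c.size1##c.size2 if n > 0
```

Which of these is appropriate depends on details of the problem that have not been given; the sketch only shows that the second organization drops straight into the estimation commands.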

-- Bill
wgould@stata.com