Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: -svmat- with matrix colnames based on factor variable names?


From   <[email protected]>
To   <[email protected]>
Subject   Re: st: -svmat- with matrix colnames based on factor variable names?
Date   Sat, 12 Feb 2011 10:49:48 -0000

Thanks to Nick Cox, Austin Nichols and Scott Merryman for their
responses. Scott's message pointed, inter alia, to a useful Statalist
posting by himself, and also to a Stata News article in which graphing
after -margins- is usefully accomplished using Roger Newson's -parmest-
(on SSC). The small, but perhaps important, difference in my case is
that the x-axis variable in my proposed graphs is categorical (factor
variable values) rather than continuous (age in years in the News and
Scott's examples).  As Nick and Austin suggest, some pre- or
post-processing of the colnames/factor varlist is inevitable. Austin
kindly offered some assistance were I to suggest an example based on
public data. Here's one inspired by Scott's earlier example [I've added
use of variable hsizgp, which is categorical taking values 1(1)5]:

. webuse nhanes2f, clear

. logit diabetes i.hsizgp  i.black i.female age i.female#c.age, nolog

Logistic regression                               Number of obs   =      10335
                                                  LR chi2(8)      =     391.83
                                                  Prob > chi2     =     0.0000
Log likelihood = -1803.1523                       Pseudo R2       =     0.0980

------------------------------------------------------------------------------
    diabetes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hsizgp |
          2  |  -.1797069   .1280784    -1.40   0.161    -.4307359    .0713221
          3  |     .11717   .1621709     0.72   0.470    -.2006792    .4350192
          4  |   .3391857   .1847375     1.84   0.066    -.0228931    .7012645
          5  |  -.0473484   .1967049    -0.24   0.810    -.4328829    .3381862
             |
     1.black |   .6699473   .1286628     5.21   0.000     .4177728    .9221217
    1.female |   1.435121    .489943     2.93   0.003     .4748507    2.395392
         age |   .0766945   .0066901    11.46   0.000      .063582    .0898069
             |
female#c.age |
          1  |  -.0213554   .0079371    -2.69   0.007    -.0369117    -.005799
             |
       _cons |  -7.403102   .4346565   -17.03   0.000    -8.255013   -6.551191
------------------------------------------------------------------------------


. margins , at(female = 1  age = 40 hsizgp = (1(1)5) ) post

Predictive margins                                Number of obs   =      10335
Model VCE    : OIM

Expression   : Pr(diabetes), predict()

1._at        : hsizgp          =           1
               female          =           1
               age             =          40

2._at        : hsizgp          =           2
               female          =           1
               age             =          40

3._at        : hsizgp          =           3
               female          =           1
               age             =          40

4._at        : hsizgp          =           4
               female          =           1
               age             =          40

5._at        : hsizgp          =           5
               female          =           1
               age             =          40

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .0250719   .0038488     6.51   0.000     .0175283    .0326154
          2  |   .0210404    .002844     7.40   0.000     .0154662    .0266145
          3  |   .0280953   .0041529     6.77   0.000     .0199558    .0362348
          4  |   .0348208   .0053377     6.52   0.000     .0243591    .0452825
          5  |    .023942   .0040474     5.92   0.000     .0160092    .0318748
------------------------------------------------------------------------------

. matrix at = e(at)

. mat li at

at[5,10]
        1b.      2.      3.      4.      5.     0b.      1.     0b.      1.
    hsizgp  hsizgp  hsizgp  hsizgp  hsizgp   black   black  female  female     age
r1       1       0       0       0       0       .       .       0       1      40
r2       0       1       0       0       0       .       .       0       1      40
r3       0       0       1       0       0       .       .       0       1      40
r4       0       0       0       1       0       .       .       0       1      40
r5       0       0       0       0       1       .       .       0       1      40

. matnames at // -matnames- from SSC (by Austin Nichols)

. di r(c)
 `":1b.hsizgp"' `":2.hsizgp"' `":3.hsizgp"' `":4.hsizgp"' `":5.hsizgp"'
`":0b.black"' `":1.black"' `":0b.female"' `":1.female"' `":age"'

. svmat at

. de at*

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------
at1             float  %9.0g                  
at2             float  %9.0g                  
at3             float  %9.0g                  
at4             float  %9.0g                  
at5             float  %9.0g                  
at6             float  %9.0g                  
at7             float  %9.0g                  
at8             float  %9.0g                  
at9             float  %9.0g                  
at10            float  %9.0g                  

. matrix at2 = e(at)

. svmat at2, name(col)
invalid syntax
r(198);

The workaround I was seeking was a quick/easy/automated way of going
from what is shown as the contents of r(c) after -matnames- to a list of
valid variable names that preserves the essential information. The new
list can be used to re-label the matrix prior to -svmat-. The new names
could be based, for instance, on the old ones but drop the ":" and "."
from the elements of r(c), and add some prefix ("_", say) to each of the
elements. My efforts using -subinstr- and also regular expression
functions were not productive, but that is probably a comment on my
skills. 
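
One way the renaming described above might be automated is sketched
below. It is a minimal illustration rather than a tested solution, and
it rests on some assumptions: Stata 11 or later, -margins, ..., post-
just run as in the log above, and the extended macro function
-: colnames- used in place of -matnames-. Since -: colnames- returns
the names without the equation prefix, there is no ":" to remove. The
"_" prefix and the choice of "_" to replace "." are illustrative only.

```stata
* Assumes -margins ..., post- has just been run, so e(at) exists
matrix at = e(at)

* Column names as stored, e.g. 1b.hsizgp 2.hsizgp ... age
local oldnames : colnames at

* Build a parallel list of legal variable names
local newnames
foreach n of local oldnames {
    * "1b.hsizgp" becomes "_1b_hsizgp"; "age" becomes "_age"
    local clean = subinstr("`n'", ".", "_", .)
    local newnames `newnames' _`clean'
}

* Re-label the matrix, then let -svmat- take variable names from it
matrix colnames at = `newnames'
svmat at, names(col)
list _1b_hsizgp-_age in 1/5
```

With the example above this should produce variables _1b_hsizgp,
_2_hsizgp, ..., _0b_black, _1_black, _0b_female, _1_female and _age,
so the factor level and base marker survive in each name.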

Thanks, Stephen

------------------------------

Date: Fri, 11 Feb 2011 13:02:58 -0600
From: Scott Merryman <[email protected]>
Subject: Re: st: -svmat- with matrix colnames based on factor variable names?

On Fri, Feb 11, 2011 at 11:22 AM,  <[email protected]> wrote:
> I'm having difficulty using a list of factor variable names to
> construct the column names of a matrix post-estimation.
>
> This arises when using -margins, over(.)- with multiple -at(.)-
> options. I want to collect all the estimates in a matrix with one
> row per at/over combination, with additional columns also containing
> values for the at values chosen. To keep track, I want the column
> names for the additional at columns to contain names that identify
> the factor variables.
>
> Overall aim: once I have created the summary matrix, I want to
> convert each of the columns to a variable, with all the at variables
> -- together with other variables containing estimates and SE -- to be
> saved to a separate mini-dataset, and used for graphing or table
> processing.

Perhaps the thread  "Graphic displays or results from margins" at:

http://www.stata.com/statalist/archive/2010-09/msg00807.html

or the example in the vol 25 number 3 of The Stata News (
http://www.stata.com/news/statanews.25.3.pdf ) would be useful.

Scott
===============
From: Austin Nichols <[email protected]>
Subject: Re: st: -svmat- with matrix colnames based on factor variable names?

Stephen--
Why do you need the names option on -svmat-?  You can also write a
little loop to save the matrix to variables/obs on your data, without
putting illegal names on variables. But either way, you will have to
figure out what you want to call a variable that wants to be named
0b.edlevel or whatever.
If you provide an example with public data, I will come up with a few
solutions for you.
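
The "little loop" idea might look something like the sketch below. It
is an illustration under assumptions, not code from this thread: it
assumes -margins, ..., post- has been run so that e(at) exists, copies
it to a matrix called A (an arbitrary name, chosen so that it cannot be
read as an abbreviation of an existing variable), and uses the
hypothetical stub atcol for the new variables. The original column
names are kept as variable labels, where characters such as "." are
allowed.

```stata
* Assumes -margins ..., post- has just been run, so e(at) exists
matrix A = e(at)
local nr = rowsof(A)
local nc = colsof(A)
local cn : colnames A

* Make sure there are enough observations to hold every matrix row
if _N < `nr' {
    set obs `nr'
}

forvalues j = 1/`nc' {
    * Neutral, legal variable name; row i of column j goes in observation i
    quietly generate double atcol`j' = A[_n, `j'] in 1/`nr'
    * Keep the factor-variable name (e.g. 1b.hsizgp) as the variable label
    local lab : word `j' of `cn'
    label variable atcol`j' "`lab'"
}
describe atcol*
```

This sidesteps -svmat- altogether, at the cost of variable names that
carry no information themselves.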
===============
------------------------------

Date: Fri, 11 Feb 2011 18:13:36 +0000
From: Nick Cox <[email protected]>
Subject: st: RE: -svmat- with matrix colnames based on factor variable names?

I don't have a positive solution but I do note -svmat- was written for
version 4 and so it is indeed unlikely to support factor variables. So
any pre-processing for -svmat- would have to find a work-around for
that. 

Nick 
[email protected]
+++++++++++++++

Stephen
-------------------------------------
Professor Stephen P. Jenkins  <[email protected]>
Department of Social Policy and STICERD
London School of Economics and Political Science
Houghton Street
London WC2A 2AE, U.K.
Tel. +44 (0)20 7955 6527
Survival Analysis using Stata:
http://www.iser.essex.ac.uk/survival-analysis
Downloadable papers and software: http://ideas.repec.org/e/pje7.html


