st: -svmat- with matrix colnames based on factor variable names?

Sat, 12 Feb 2011 18:09:09 -0000

[Reposted as original post (12 February 2011 10:50) hasn't appeared in the Archives]

Thanks to Nick Cox, Austin Nichols and Scott Merryman for their responses.

Scott's message pointed, inter alia, to a useful Statalist posting by himself, and also to a Stata News article in which graphing after -margins- is usefully accomplished using Roger Newson's -parmest- (on SSC). The small, but perhaps important, difference in my case is that the x-axis variable in my proposed graphs is categorical (factor variable values) rather than continuous (age in years in the News and Scott's examples).

As Nick and Austin suggest, some pre- or post-processing of the colnames/factor varlist is inevitable. Austin kindly offered some assistance were I to suggest an example based on public data. Here's one inspired by Scott's earlier example [I've added use of variable hsizgp, which is categorical, taking values 1(1)5]:

. webuse nhanes2f, clear

. logit diabetes i.hsizgp i.black i.female age i.female#c.age, nolog

Logistic regression                               Number of obs   =      10335
                                                  LR chi2(8)      =     391.83
                                                  Prob > chi2     =     0.0000
Log likelihood = -1803.1523                       Pseudo R2       =     0.0980

------------------------------------------------------------------------------
    diabetes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hsizgp |
          2  |  -.1797069   .1280784    -1.40   0.161    -.4307359    .0713221
          3  |     .11717   .1621709     0.72   0.470    -.2006792    .4350192
          4  |   .3391857   .1847375     1.84   0.066    -.0228931    .7012645
          5  |  -.0473484   .1967049    -0.24   0.810    -.4328829    .3381862
             |
     1.black |   .6699473   .1286628     5.21   0.000     .4177728    .9221217
    1.female |   1.435121    .489943     2.93   0.003     .4748507    2.395392
         age |   .0766945   .0066901    11.46   0.000      .063582    .0898069
             |
female#c.age |
          1  |  -.0213554   .0079371    -2.69   0.007    -.0369117    -.005799
             |
       _cons |  -7.403102   .4346565   -17.03   0.000    -8.255013   -6.551191
------------------------------------------------------------------------------

. margins , at(female = 1 age = 40 hsizgp = (1(1)5) ) post

Predictive margins                                Number of obs   =      10335
Model VCE    : OIM

Expression   : Pr(diabetes), predict()

1._at        : hsizgp          =           1
               female          =           1
               age             =          40

2._at        : hsizgp          =           2
               female          =           1
               age             =          40

3._at        : hsizgp          =           3
               female          =           1
               age             =          40

4._at        : hsizgp          =           4
               female          =           1
               age             =          40

5._at        : hsizgp          =           5
               female          =           1
               age             =          40

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .0250719   .0038488     6.51   0.000     .0175283    .0326154
          2  |   .0210404    .002844     7.40   0.000     .0154662    .0266145
          3  |   .0280953   .0041529     6.77   0.000     .0199558    .0362348
          4  |   .0348208   .0053377     6.52   0.000     .0243591    .0452825
          5  |    .023942   .0040474     5.92   0.000     .0160092    .0318748
------------------------------------------------------------------------------

. matrix at = e(at)

. mat li at

at[5,10]
        1b.      2.      3.      4.      5.     0b.      1.     0b.      1.
    hsizgp  hsizgp  hsizgp  hsizgp  hsizgp   black   black  female  female     age
r1       1       0       0       0       0       .       .       0       1      40
r2       0       1       0       0       0       .       .       0       1      40
r3       0       0       1       0       0       .       .       0       1      40
r4       0       0       0       1       0       .       .       0       1      40
r5       0       0       0       0       1       .       .       0       1      40

. matnames at // -matnames- from SSC (by Austin Nichols)

. di r(c)
 `":1b.hsizgp"' `":2.hsizgp"' `":3.hsizgp"' `":4.hsizgp"' `":5.hsizgp"'
 `":0b.black"' `":1.black"' `":0b.female"' `":1.female"' `":age"'

. svmat at

. de at*

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
at1             float  %9.0g
at2             float  %9.0g
at3             float  %9.0g
at4             float  %9.0g
at5             float  %9.0g
at6             float  %9.0g
at7             float  %9.0g
at8             float  %9.0g
at9             float  %9.0g
at10            float  %9.0g

. matrix at2 = e(at)

. svmat at2, name(col)
invalid syntax
r(198);

The workaround I was seeking was a quick/easy/automated way of going from what is shown as the contents of r(c) after -matnames- to a list of valid variable names that preserves the essential information. The new list can then be used to re-label the matrix prior to -svmat-. The new names could be based, for instance, on the old ones, but with the ":" and "." dropped from the elements of r(c) and some prefix ("_", say) added to each element.

My efforts using -subinstr- and also regular expression functions were not productive, but that is probably a comment on my skills.

Thanks,
Stephen

------------------------------

Date: Fri, 11 Feb 2011 13:02:58 -0600
From: Scott Merryman <scott.merryman@gmail.com>
Subject: Re: st: -svmat- with matrix colnames based on factor variable names?

On Fri, Feb 11, 2011 at 11:22 AM, <S.Jenkins@lse.ac.uk> wrote:
> I'm having difficulty using a list of factor variable names to construct
> the column names of a matrix post-estimation.
>
> This arises when using -margins, over(.)- with multiple -at(.)- options.
> I want to collect all the estimates in a matrix with one row
> per at/over combination, with additional columns also containing values
> for the at values chosen. To keep track, I want the column names for the
> additional at columns to contain names that identify the factor
> variables.
>
> Overall aim: once I have created the summary matrix, I want to convert
> each of the columns to a variable, with all the at variables -- together
> with other variables containing estimates and SE -- to be saved to a
> separate mini-dataset, and used for graphing or table processing.

Perhaps the thread "Graphic displays or results from margins" at:
http://www.stata.com/statalist/archive/2010-09/msg00807.html
or the example in the vol 25 number 3 of The Stata News
( http://www.stata.com/news/statanews.25.3.pdf ) would be useful.
Scott

===============

From: Austin Nichols <austinnichols@gmail.com>
Subject: Re: st: -svmat- with matrix colnames based on factor variable names?

Stephen--
Why do you need the names option on -svmat-? You can also write a little loop to save the matrix to variables/obs on your data, without putting illegal names on variables. But either way, you will have to figure out what you want to call a variable that wants to be named 0b.edlevel or whatever. If you provide an example with public data, I will come up with a few solutions for you.

===============

------------------------------

Date: Fri, 11 Feb 2011 18:13:36 +0000
From: Nick Cox <n.j.cox@durham.ac.uk>
Subject: st: RE: -svmat- with matrix colnames based on factor variable names?

I don't have a positive solution but I do note -svmat- was written for version 4 and so it is indeed unlikely to support factor variables. So any pre-processing for -svmat- would have to find a work-around for that.

Nick
n.j.cox@durham.ac.uk
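
One way the renaming requested in this thread might be done is sketched below. It is only an illustration under stated assumptions, not a solution posted by anyone in the thread: it reads the column names with the extended macro function -: colnames- (so there is no leading ":" to remove, unlike r(c) from -matnames-), replaces each "." with "_" using subinstr(), and prefixes "_". The naming scheme and the macro names oldnames, newnames, old and new are choices made here for the example.

```stata
* Sketch: give e(at) legal column names, then save it with -svmat-.
webuse nhanes2f, clear
quietly logit diabetes i.hsizgp i.black i.female age i.female#c.age
quietly margins, at(female = 1 age = 40 hsizgp = (1(1)5)) post
matrix at = e(at)

* Column names as stored, e.g. 1b.hsizgp 2.hsizgp ... age
local oldnames : colnames at

* Build the new names: each "." becomes "_" and "_" is prefixed,
* so 1b.hsizgp becomes _1b_hsizgp (assumed naming scheme)
local newnames
foreach old of local oldnames {
    local new = "_" + subinstr("`old'", ".", "_", .)
    local newnames `newnames' `new'
}

* Re-label the matrix, then create one variable per column
matrix colnames at = `newnames'
svmat at, names(col)
describe `newnames'
```

With the "_" prefix, a plain variable whose name matches the tail of a reserved word (a variable called n or cons, say) would map to _n or _cons, which Stata does not allow as variable names; a prefix such as at_ would avoid that.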

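Austin's suggestion of a little loop that saves the matrix to variables without putting illegal names on them could look roughly like the following. Again this is a sketch under assumptions rather than code from the thread: the matrix is copied to A, the variables are called at1, at2, ... as -svmat- would name them, and the original factor-variable column names are kept as variable labels so that the information is not lost.

```stata
* Sketch: copy e(at) into variables by looping, keeping old names as labels.
webuse nhanes2f, clear
quietly logit diabetes i.hsizgp i.black i.female age i.female#c.age
quietly margins, at(female = 1 age = 40 hsizgp = (1(1)5)) post
matrix A = e(at)

local oldnames : colnames A
local k = colsof(A)
local r = rowsof(A)

forvalues j = 1/`k' {
    * one new variable per matrix column, missing beyond row `r'
    quietly generate double at`j' = .
    forvalues i = 1/`r' {
        quietly replace at`j' = A[`i', `j'] in `i'
    }
    * store the original column name (e.g. 1b.hsizgp) as the label
    local lbl : word `j' of `oldnames'
    label variable at`j' "`lbl'"
}
describe at1-at`k'
```

Because the labels are free text, nothing has to be decided about how to turn 0b.black and the like into legal names; the cost is that the variable names themselves stay uninformative.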
