Statalist



st: RE: extracting rownames of a matrix into a variable (preparatory work for MICE)


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: extracting rownames of a matrix into a variable (preparatory work for MICE)
Date   Fri, 19 Sep 2008 16:48:42 +0100

This is asked every few months. See for example 

http://www.stata.com/statalist/archive/2008-05/msg01224.html

and the ensuing thread pointing to -svmat2- as a canned solution. 
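
For readers who want to see the idea without installing anything, here is a minimal sketch using official Stata commands only. It assumes a small made-up matrix A standing in for ges_ols, and a new string variable called rowname; both names and the numbers are illustrative and not taken from the thread. The extended macro function -: rownames- retrieves the row names, which are then written into the string variable after -svmat-.

```stata
* Hypothetical 3 x 2 matrix standing in for ges_ols (values are made up)
clear
matrix A = (1, .04 \ .04, 1 \ .16, .02)
matrix colnames A = e3his e220ex
matrix rownames A = e3his e220ex tr10emp_*

* Matrix columns become variables; the row names are not carried over
svmat A, names(col)

* Fetch the row names and store them, one per observation, in a string variable
local rn : rownames A
generate str32 rowname = ""
forvalues i = 1/`= rowsof(A)' {
    local name : word `i' of `rn'
    quietly replace rowname = "`name'" in `i'
}

list rowname e3his e220ex
```

-svmat2- packages the same steps. As far as I recall, its rnames() option takes the name of the new variable, but that is an assumption to check against its help file.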

Nick 
[email protected] 

Gresch,Cornelia

I want to convert a matrix to variables in such a way that the row names are not lost but appear in a new variable as observations (i.e. as the contents of a first variable in string format). Does anybody have an idea how to do this?

Background to the question: I am working on an imputation model with MICE (command ice). Since the dataset is very large, I need to decide which variables to use as independent variables. I therefore want to create a matrix containing the R² (or LL) of bivariate regression models for all possible combinations of the variables. The column names of this matrix correspond to the dependent variables, and the row names to the independent variables, of the bivariate regression models.

Because the final matrix will be very large, I would like to convert it into variables (which can then easily be used to pick about 15-25 independent variables to use as predictors in the final imputation model). Using the command "svmat" results in the expected dataset, but the row names (which are necessary to identify the appropriate predictors) get lost.

Below you can see the code and the resulting problem in more detail:
(N.B.: the variables are "e3his", "e220ex" and "t_kfts" (which are metric); "tr10emp" and "e121a" (which are categorical, with corresponding dummies (tr10emp_1, tr10emp_2, ...)); and "slspaet" and "tr06sex" (which are binary).)


matrix drop _all
foreach av of var e3his e220ex t_kfts {
    // wildcard lists for the dummy sets of the categorical variables
    foreach katuv of any tr10emp e121a {
        local `katuv' = "`katuv'_*"
    }
    // start the column for this dependent variable with a placeholder row
    matrix `av' = (1)
    foreach uv of any e3his e220ex t_kfts /* continuous IVs */ ///
        "`tr10emp'" "`e121a'" /* polytomous IVs */ ///
        slspaet tr06sex /* dummy IVs */ {
        quietly reg `av' `uv'
        // "Fehler" means "error"; it is overwritten on the next line
        local r2 = "Fehler"
        local r2 = e(r2)
        matrix input R = (`r2')
        matrix colnames `av' = `av'
        matrix rownames R = `uv'
        // append this R-squared as a new row
        matrix `av' = (`av' \ R)
    }
    // join the per-DV columns side by side
    if ("`av'" == "e3his") matrix ges_ols = (`av')
    if ("`av'" ~= "e3his") matrix ges_ols = (ges_ols, `av')
}
matrix list ges_ols

ges_ols[8,3]
               e3his     e220ex     t_kfts
       r1          1          1          1
    e3his          1  .04320558  .05278807
   e220ex  .04320558          1  .01621444
   t_kfts  .05278807  .01621444          1
tr10emp_*  .16007611  .02387149  .16865035
       r1          0          0          0
  slspaet  .01107509  .00863178  4.931e-06
  tr06sex  1.640e-06  .00104972  .00829559


If I convert this matrix to variables I get the following dataset:

svmat ges_ols, names(col)
list
     +--------------------------------+
     |    e3his     e220ex     t_kfts |
     |--------------------------------|
  1. |        1          1          1 |
  2. |        1   .0432056   .0527881 |
  3. | .0432056          1   .0162144 |
  4. | .0527881   .0162144          1 |
  5. | .1600761   .0238715   .1686504 |
     |--------------------------------|
  6. |        0          0          0 |
  7. | .0110751   .0086318   4.93e-06 |
  8. | 1.64e-06   .0010497   .0082956 |
     +--------------------------------+


But I also need a variable holding the names of the variables in string format, as I have them in the matrix (at least that's the only way I can see to identify the 15-25 best-fitting predictors). Does anybody have an idea how to do this? Alternatively, I tried to put the names into the matrix as strings, which does not work either. Nor can I transpose the matrix (rows to columns and vice versa) and extract it that way, since some of the independent variables are sets of dummies (e.g. tr10emp_*) and therefore cannot be used as variable names.

