Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Mata - generalised cross products of the form X'SX

From	David Kantor <[email protected]>
To	[email protected]
Subject	Re: st: Mata - generalised cross products of the form X'SX
Date	Tue, 08 Feb 2011 09:29:54 -0500

At 07:12 AM 2/8/2011, Gordon Hughes wrote:

I am starting a new thread, but this is really a follow-up to mypost on performance monitoring. Following Bill Gould's helpfulsuggestion I have identified that most of the time is consumed inloops in my likelihood evaluation that calculate matrices of theform Z=X'SX where S is symmetric but not a diagonal matrix. Ofcourse, I am doing things outside loops where I can.
This is a standard form in any GLS type calculation. They are notdirectly amenable to use with the Mata function --cross()-- but theycould be got into that form by forming by factorising S=QQ', thenforming P=Q'X, and finally using P=cross(P,P).
Two questions follow:
A. Is there any Mata function that does this more directly? MaybeStata Corp might consider extending --cross()-- to handle suchcases. The command --matrix glsaccum-- does this in Stata.
B. Roughly, under what conditions might this sequence of stepsreduce execution time and/or memory use on the assumption that thedimension of S is large relative to cols(X)? In a related context Ihave to calculate X'B'BX where B is square but not symmetric andfound that use of --cross()-- does not save much (if any) time,though it is probably more efficient on memory. Hence, the overheadof factorisation may not pay off. On the other hand, the initialfactorisation would only need to be done once per function call,whereas the steps involving X have to be performed thousands oftimes per function call.
Gordon Hughes
[email protected]

I don't know enough to answer your question, but the situation lookssimilar to something I encountered some time ago, which may bevaluable to consider. This is about plain Stata code, not Mata.

This is in my mahascore.ado, as part of the mahapick module on SSC.For each observation, I wanted a value computed by first forming avector `X', based on values of specified variables in theobservation, and then computing...

 `X'' * `invcov' * `X' `X'

The "obvious" thing to do was to do that exact computation in a loop,looping over the observations: (Code samples omit the matter oftaking the sqrt of the result. Also, these are excerpts that refer tothings defined in sections of omitted code.)


local vref "\`v'[\`refobs']"

quietly gen double `gen' = .
tempname X Y

forvalues obsno = 1/`=_N' {
 matrix `X' = J(`nvars', 1, .)
 matrix rownames `X' = `varlist'

 foreach v of local varlist {
  local rownum = rownumb(matrix(`X'), "`v'")
  matrix `X'[`rownum', 1] = `v'[`obsno'] - `vref'
 }

 matrix `Y' = `X'' * `invcov' * `X'  // `Y' is 1x1
 quietly replace `gen' = (scalar(`Y'[1,1])) in `obsno'
}

----

Later, I learned of a much quicker way to get the same result. Thisis based on the corresponding calculation found in _Dif_mbase inpsmatch2 by Leuven & Sianesi. Instead of looping over theobservations, the looping is over the rows and columns of `invcov':


local rows : rownames `invcov'
local cols : colnames `invcov'

local icrows = rowsof(`invcov')
local iccols = colsof(`invcov')

local vjref "\`vj'[\`refobs']"
local vkref "\`vk'[\`refobs']"

quietly gen double `gen' = 0
forvalues j = 1 / `icrows' {
 forvalues k = 1 / `icrows' {  /* or iccols -- the same */
  local vj = word("`rows'", `j')
  local vk = word("`rows'", `k') /* or cols -- should be the same */
  quietly replace `gen' = `gen' + ///
   `invcov'[`j,', `k'] * (`vj' - `vjref') * (`vk' - `vkref')
 }
}

----

The latter is mathematically equivalent to the former, but muchfaster. It gave results that are almost identical to the former; thedifferences were minuscule. I can't say which was closer to the trueresult, but I suspect it was the latter -- a question that is similarto a matter that Bill Gould recently wrote about; he may want tochime in here as well.


I hope this is useful or interesting.
--David

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

References:
- st: Mata - generalised cross products of the form X'SX
  - From: Gordon Hughes <[email protected]>

Prev by Date: RE: Antwort: Re: st: xtivreg2 with endogenous binary regressors
Next by Date: Re: Antwort: Re: st: xtivreg2 with endogenous binary regressors
Previous by thread: st: Mata - generalised cross products of the form X'SX
Next by thread: Re: st: Mata - generalised cross products of the form X'SX
Index(es):
- Date
- Thread