Statalist



st: RE: Computing local variance


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Computing local variance
Date   Sun, 22 Feb 2009 18:30:26 -0000

With just one trick, creating a pseudo-time variable, you can exploit
the existence of -mvsumm- from SSC, which codes this, and indeed a more
general approach. 

-mvsumm- is not blisteringly fast -- it predates Mata -- but it should
save almost all your programming time. 
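
A minimal sketch of the pseudo-time trick follows. After sorting on X, the
observation number can stand in for time, so the data can be -tsset- and
passed to -mvsumm-. The option names stat(), window() and generate() and
the centred placement of results for odd windows are assumptions to be
checked against -help mvsumm- after installing. The sketch uses an odd
window of 2k+1 observations, which differs slightly from the 2k window in
the question below, and it ignores missing values.

```stata
* Install once from SSC
ssc install mvsumm

* Pseudo-time variable: position of each observation in the X ordering
sort X
gen long t = _n
tsset t

* Half-width k as in the original post; window of 2k+1 observations
quietly count if !missing(X)
local k = ceil(sqrt(r(N))/2)
local w = 2*`k' + 1

* Moving SD of Y over w consecutive observations (assumed mvsumm syntax)
mvsumm Y, stat(sd) window(`w') generate(SD_Y)
```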

Nick 
n.j.cox@durham.ac.uk 

Benjamin Villena Roldan

I have two continuous variables, X and Y. I'm trying to do the following:
1. Sort the data on X.
2. For each observation of X, compute the local variance of Y by a
nearest-neighbor approach: take the 2k closest observations to X[i],
i.e. the observations between X[i-k+1] and X[i+k].
3. I'm implementing this approach with a -forvalues- loop such as

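 * Missing values of X sort last, so the first r(N) observations are usable
 sort X
 count if X!=.
 * Half-width of the window, and the last observation with a full window
 local k=ceil(r(N)^0.5/2)
 local K=r(N)-`k'
 gen SD_Y=.
 forv i=`k'/`K' {
     * Window runs from observation i-k+1 to observation i+k
     local k0=`i'-`k'+1
     local k1=`i'+`k'
     qui summ Y in `k0'/`k1'
     qui replace SD_Y=r(sd) in `i'
 }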
 
So, I have two questions about this code:
1. I need to repeat this procedure several times and it is very
time-consuming. Is there a way to speed up the execution? How much time
would I gain by implementing similar code in C++?
2. There are missing values in X and Y. How can I restrict the sort
command to nonmissing values of both variables? A simple answer is to do
-keep if X!=. & Y!=.-
Can I do it without dropping data?
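
Both questions could also be approached in plain Stata without a loop. The
sketch below is not the method of either poster: it swaps the loop for
running sums, so each window's SD comes from differences of cumulative sums
of Y and Y^2. The names touse, yc, s1, s2, w1 and w2 are hypothetical
helpers introduced for illustration. Y is centred first because the
one-pass variance formula can lose precision when the mean is large
relative to the spread. It assumes enough usable observations that k is no
larger than K.

```stata
* Flag usable observations instead of dropping the others
gen byte touse = !missing(X, Y)

* Usable observations first, in ascending order of X
gsort -touse X

* Centre Y on its overall mean (the SD is unaffected by the shift)
quietly summarize Y if touse
local n = r(N)
gen double yc = Y - r(mean) if touse

* Window from i-k+1 to i+k, as in the original loop
local k = ceil(sqrt(`n')/2)
local K = `n' - `k'
local m = 2*`k'

* Running sums of yc and yc^2 over the usable observations
gen double s1 = sum(yc) if touse
gen double s2 = sum(yc^2) if touse

* Window sums as differences of running sums; the sum before obs 1 is 0
gen double w1 = s1[_n+`k'] - cond(_n == `k', 0, s1[_n-`k']) in `k'/`K'
gen double w2 = s2[_n+`k'] - cond(_n == `k', 0, s2[_n-`k']) in `k'/`K'

* Sample SD; max() guards against tiny negative values from rounding
gen double SD_Y = sqrt(max(0, (w2 - w1^2/`m')/(`m' - 1))) in `k'/`K'

drop yc s1 s2 w1 w2
```

Because nothing is dropped, observations with missing X or Y simply end up
at the bottom of the dataset with SD_Y missing. Results should agree with
the loop version up to rounding when the same observations are used.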

