Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.


From: Sergiy Radyakin <serjradyakin@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: How can I get the second last non-missing value?
Date: Thu, 13 Jun 2013 11:15:46 -0400

Rebecca, the code should be rather simple, since it corresponds exactly to the task posted in the question. The idea is that we work with each row (observation) separately: that is the loop over i. Within each row we move backwards from the last variable towards the beginning until we hit the first non-missing value. That's the loop over j. It does include the last variable (cols(V)-0=cols(V)). From this point on we need to move one value to the left and proceed until we find the next non-missing value or hit the wall (this is the loop over k). If we find it, we post it into the result; otherwise the result remains empty.

The purpose of the view is really not to save memory, but to be able to loop by column number in a simple fashion, without worrying about the actual order of the variables in the list (they can be v3 v2 v5 for a dataset of v1 v2 v3 v4 v5, and I want to abstract from this indexing problem).

Robert has posted an absolutely simple and imho superior solution using -replace-. I like it a lot. The only thing is that (I imagine, no testing done) it might incur a performance penalty under some circumstances. In particular, when the number of variables is large (think thousands), that solution would need to do that many replaces (or twice that many, but you get the idea). So consider a case where you have no missings at all. My solution would take only one iteration of the j and k loops for each row, and you end up with your result pretty quickly. Robert's solution would go from the beginning to the end and would need to do those thousands of replaces. It doesn't matter with a small number of variables, where the efficiency of built-in commands more than compensates. However, I see it is definitely possible to change Robert's solution to move backwards as well, in which case it would be an absolute winner in both performance and readability. It would also make it compatible with pre-Mata versions of Stata such as 5-8, and I would imagine earlier versions as well.
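A minimal sketch of what a backward-moving -replace- loop could look like. Robert's posted code is not reproduced in this message, so this is not his solution; the example data, the variable names v1-v5, prelast and nfound, and the early-exit check are all assumptions made for illustration.

```stata
* Hypothetical example data (not from the thread)
clear
input v1 v2 v3 v4 v5
1 2 3 4 5
1 2 . 4 .
. . 3 . .
. . . . .
end

* Build the variable list in reverse order (assumed names v1-v5)
unab vars : v1-v5
local rev
foreach v of local vars {
    local rev `v' `rev'
}

generate prelast = .          // second-last non-missing value
generate byte nfound = 0      // non-missing values seen so far, capped at 2

foreach v of local rev {
    * Second non-missing value from the right: store it.
    * This must run before the counter is updated.
    quietly replace prelast = `v' if nfound == 1 & !missing(`v')
    quietly replace nfound = nfound + 1 if nfound < 2 & !missing(`v')
    * Stop as soon as every observation has its answer
    quietly count if nfound < 2
    if r(N) == 0 continue, break
}
drop nfound
list
```

With no missing values at all, the loop in this sketch stops after two variables, whatever the length of the list. In the example data the third and fourth observations never reach two non-missing values, so the loop runs over all five variables and prelast stays missing for them.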
As a matter of fact, -replace- was present in Stata 1.0 :) as can be seen here: http://www.ats.ucla.edu/stat/sca/Stata1/m-r.pdf and I hope looping was also possible back then. It would be super interesting to see Stata 1.0 working live somewhere, perhaps on StataCorp's YouTube channel?

Best, Sergiy Radyakin

On Thu, Jun 13, 2013 at 10:13 AM, Rebecca Pope <rebecca.a.pope@gmail.com> wrote:
> Any time I would have saved by using Mata would have been completely
> lost to time figuring out how to accomplish it in Mata.
>
> Sergiy, would you mind providing a little explanation for what your
> code is doing? I made some notes below about what I think is going on,
> but I just want to make sure I'm following you.
>
> ***
> mata
>
> void prelast() { <= like -capture program drop-?, you're just
> clearing out any previous definition of the program?
> V=. <= define a null matrix V
> st_view(V,.,st_local("varlist")) <= make a view onto the data
> (presumably here to save memory?), all observations on the variables
> given in the Stata local macro varlist (supplied elsewhere)
> R=.
> st_view(R,.,st_local("result")) <= this bit confused me at first b/c
> I thought the variable had to exist already, but you handle this by
> generating a result variable with all values missing before you run
> prelast(), correct?
>
> for(i=1;i<=rows(V);i++) { <= loosely, for every observation in the dataset
>
> for(j=0;j<cols(V);j++) { <= loosely, for all variables given in
> `varlist' except the last one
> if (missing(V[i,cols(V)-j])==0) { <= j is increasing so the column
> index here is decreasing, in effect, counting backwards
> // found last non-missing
>
> if (cols(V)-j-1<1) break; //nothing before
>
> for(k=cols(V)-j-1;k>=1;k--) { <= lost me here, why increment k,
> don't you know you want cols(V)-j-1 since cols(V)-j is the last
> non-missing value?
> if (missing(V[i,k])==0)
> R[i,1]=V[i,k] <= replace the ith observation (row) in the R vector
> with the appropriate value from V
> break;
> }
>
> break;
> }
> }
> }
> }
>
> end
> ***
>
> Thanks,
> Rebecca
>
> < snip >
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/

**References**:

- **st: How can I get the second last non-missing value?** *From:* duygu yıldırım <dyg_math@hotmail.com>
- **Re: st: How can I get the second last non-missing value?** *From:* Rebecca Pope <rebecca.a.pope@gmail.com>
- **Re: st: How can I get the second last non-missing value?** *From:* Sergiy Radyakin <serjradyakin@gmail.com>
- **Re: st: How can I get the second last non-missing value?** *From:* Rebecca Pope <rebecca.a.pope@gmail.com>
