From: Jeph Herrin <stata@spandrel.net>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: reverse lookup
Date: Wed, 09 Jan 2013 09:54:04 -0500

Yes, the sort works as well for strings.

cheers,
Jeph

On 1/8/2013 10:19 PM, Nick Cox wrote:

The approach in my earlier code could be extended.

    program vallookup, sort
        version 8.2
        syntax varname [if] [in] [, local(str) scalar(str) ]
        marksample touse, strok
        qui count if `touse'
        local nuse = r(N)
        if `nuse' == 0 error 2000
        capture confirm numeric variable `varlist'
        if _rc == 0 {
            su `varlist' if `touse', meanonly
            if r(min) != r(max) {
                di as err "specification not satisfied by single value"
                exit 498
            }
            di r(min)
            if "`local'" != "" {
                c_local `local' = r(min)
            }
            if "`scalar'" != "" {
                scalar `scalar' = r(min)
            }
        }
        else {
            sort `touse' `varlist'
            if `varlist'[_N - `nuse' + 1] != `varlist'[_N] {
                di as err "specification not satisfied by single value"
                exit 498
            }
            di `varlist'[_N]
            if "`local'" != "" {
                c_local `local' = `varlist'[_N]
            }
            if "`scalar'" != "" {
                scalar `scalar' = `varlist'[_N]
            }
        }
    end

On Tue, Jan 8, 2013 at 10:58 PM, Jeph Herrin <stata@spandrel.net> wrote:

String variables are a problem all their own. I usually do something
like:

    encode strvar, gen(strvar_coded)
    sum strvar_coded if period==1
    local rate1= ///
        cond(`r(min)'==`r(max)',`=: label (strvar_coded) `=r(min)'',"")

which however can run into trouble if there are too many values to
-strvar-.

On 1/8/2013 4:48 PM, Nick Cox wrote:

Here's a sketch. (Also, what about string variables?)

    program vallookup
        version 8.2
        syntax varname(numeric) [if] [in] [, local(str) scalar(str) ]
        marksample touse, strok
        qui count if `touse'
        if r(N) == 0 error 2000
        capture confirm numeric variable `varlist'
        su `varlist' if `touse', meanonly
        if r(min) != r(max) {
            di as err "specification not satisfied by single value"
            exit 498
        }
        di r(min)
        if "`local'" != "" {
            c_local `local' = r(min)
        }
        if "`scalar'" != "" {
            scalar `scalar' = r(min)
        }
    end

On Tue, Jan 8, 2013 at 9:33 PM, Jeph Herrin <stata@spandrel.net> wrote:

Yes, the Mata construct is the ideal. And obviously, one must have 1-1
mapping; this I usually check by:

    sum rate if period==1
    local rate=cond(`r(min)'==`r(max)',r(min),.)

I was thinking of writing some programs to do lookups like this, since I
have been doing so many, and thought I'd ask first for an alternative.

thanks,
Jeph

On 1/8/2013 2:27 PM, Nick Cox wrote:

My short answer is that yes, this is awkward, but you are working with
the most obvious way to do it in Stata. The problem is that in general

    ... if <condition>

is not guaranteed to identify precisely one observation. It might yield
one, or zero or more than one.

In your case you need == in your code and can use

    su rate if period == 1, meanonly
    local value = r(min)

The misnamed -meanonly- is quieter and more efficient. If the condition
identifies precisely one observation, then clearly r(min), r(mean),
r(max) will be identical.

The problem is discussed from a different angle in

SJ-6-4  dm0025  Stata tip 36: Which observations? Erratum
        N. J. Cox
        Q4/06   SJ 6(4):596   (no commands)
        correction of example code for Stata tip 36

SJ-6-3  dm0025  Stata tip 36: Which observations?
        N. J. Cox
        Q3/06   SJ 6(3):430--432   (no commands)
        tip for identifying which observations satisfy some specified
        condition

Mata is not surprisingly less awkward here:

    : y = 1::10
    : x = runiform(10,1)
    : x , y
                      1             2
         +-----------------------------+
       1 |  .5044846558             1  |
       2 |  .0174561641             2  |
       3 |   .680281796             3  |
       4 |  .9221656218             4  |
       5 |  .1094441491             5  |
       6 |  .7122591983             6  |
       7 |   .765775156             7  |
       8 |  .0226029507             8  |
       9 |  .9540165765             9  |
      10 |  .2686450339            10  |
         +-----------------------------+
    : select(x, y :== 1)
      .5044846558

Nick

On Tue, Jan 8, 2013 at 7:07 PM, Jeph Herrin <stata@spandrel.net> wrote:

I've just written the same awkward code for the untoldth time, and I'm
thinking there must be a better way to do it. The problem is to get a
particular value of a variable into a local which corresponds to a
particular value of another variable.
I think this is usually called reverse lookup. For example, I might have
-period- and -rate- and want to store the value of -rate- which
corresponds to period = 1. My lazy solution is

    sum rate if period = 1
    local rate1 `=r(mean)'

That is, I summarize a single observation, then put the mean in a local.
Is there a better way to do this?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
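
A minimal sketch of the lookup pattern discussed in this thread, for readers who want to try it. It assumes a toy dataset with made-up values in which each -period- has exactly one -rate-. The -count- plus -assert- step is only one possible way to enforce the 1-1 mapping mentioned above; it is not code from the thread.

```stata
* Toy data (hypothetical values): one rate per period
clear
input period rate
1 0.50
2 0.02
3 0.68
end

* Stop with an error unless the condition picks out exactly one observation
quietly count if period == 1
assert r(N) == 1

* With a single matching observation, r(min), r(mean) and r(max) coincide
summarize rate if period == 1, meanonly
local rate1 = r(min)
display `rate1'
```

Checking r(N) directly catches both the zero-match and the multiple-match cases, whereas comparing r(min) with r(max) only catches multiple matches whose values differ.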

