
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: reverse lookup


From   Jeph Herrin <stata@spandrel.net>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: reverse lookup
Date   Wed, 09 Jan 2013 09:54:04 -0500

Yes, the sort works as well for strings.

I'm thinking of a program that will return the list of matches, along with a flag regarding uniqueness, because in some contexts it's good to have the set of matching values. But no need to post it here (grin).

cheers,
Jeph

On 1/8/2013 10:19 PM, Nick Cox wrote:
The approach in my earlier code could be extended.

program vallookup, sort
	version 8.2
	syntax varname [if] [in] [, local(str) scalar(str) ]

	marksample touse, strok
	qui count if `touse'
	local nuse = r(N)
	if `nuse' == 0 error 2000

	capture confirm numeric variable `varlist'

	if _rc == 0 {
		su `varlist' if `touse', meanonly
		if r(min) != r(max) {
			di as err "specification not satisfied by single value"
			exit 498
		}
	
		di r(min)
		if "`local'" != "" {
			c_local `local' = r(min)
		}
		if "`scalar'" != "" {
			scalar `scalar' = r(min)
		}
	}
	else {
		sort `touse' `varlist'
		if `varlist'[_N - `nuse' + 1] != `varlist'[_N] {
			di as err "specification not satisfied by single value"
			exit 498
		}

		di `varlist'[_N]
		if "`local'" != "" {
			c_local `local' = `varlist'[_N]
		}
		if "`scalar'" != "" {
			scalar `scalar' = `varlist'[_N]
		}
	}
end
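
A usage sketch for the program above, assuming -vallookup- has already been defined in the current session. The toy dataset and the names -grade-, -rate1- and -g2- are invented for illustration and do not come from the thread.

```stata
* Toy data; variable names and values are hypothetical
clear
input period rate str8 grade
1 0.25 "low"
2 0.40 "mid"
3 0.55 "high"
end

* Numeric lookup: display the value and copy it into local macro rate1
vallookup rate if period == 1, local(rate1)
display "`rate1'"

* String lookup, stored in a scalar rather than a local
vallookup grade if period == 2, scalar(g2)
display g2

* A condition matching several differing values should exit with error 498
capture noisily vallookup rate if period >= 2
display _rc
```

Because the program refuses to return anything unless all selected observations share one value, the caller does not need a separate uniqueness check.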


On Tue, Jan 8, 2013 at 10:58 PM, Jeph Herrin <stata@spandrel.net> wrote:
String variables are a problem all their own. I usually do something like:

  encode strvar, gen(strvar_coded)
  sum strvar_coded if period==1
  local rate1 = ///
     cond(`r(min)'==`r(max)', "`: label (strvar_coded) `=r(min)''", "")

This can run into trouble, however, if -strvar- has too many distinct values.



On 1/8/2013 4:48 PM, Nick Cox wrote:

Here's a sketch. (Also, what about string variables?)

program vallookup
         version 8.2
         syntax varname(numeric) [if] [in] [, local(str) scalar(str) ]

         marksample touse, strok
         qui count if `touse'
         if r(N) == 0 error 2000

         capture confirm numeric variable `varlist'

         su `varlist' if `touse', meanonly
         if r(min) != r(max) {
                 di as err "specification not satisfied by single value"
                 exit 498
         }

         di r(min)
         if "`local'" != "" {
                 c_local `local' = r(min)
         }
         if "`scalar'" != "" {
                 scalar `scalar' = r(min)
         }
end

On Tue, Jan 8, 2013 at 9:33 PM, Jeph Herrin <stata@spandrel.net> wrote:

Yes, the Mata construct is the ideal. And obviously, one must have a 1-1
mapping; this I usually check by:

   sum rate if period==1
   local rate=cond(`r(min)'==`r(max)',r(min),.)

I was thinking of writing some programs to do lookups like this, since I
have been doing so many, and thought I'd ask first for an alternative.

thanks,
Jeph


On 1/8/2013 2:27 PM, Nick Cox wrote:


My short answer is that yes, this is awkward, but you are working with
the most obvious way to do it in Stata. The problem is that in general

... if <condition>

is not guaranteed to identify precisely one observation. It might
yield one, or zero or more than one.

In your case you need == in your code and can use

su rate if period == 1, meanonly
local value = r(min)

The misnamed -meanonly- is quieter and more efficient. If the
condition identifies precisely one observation, then clearly r(min),
r(mean), r(max) will be identical.
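
A minimal sketch of that idea, assuming toy data with the thread's -period- and -rate- variables (the values are made up). It adds a check on r(N), which -summarize, meanonly- also returns, so the local is set only when exactly one observation satisfies the condition.

```stata
* Toy data; the values are hypothetical
clear
input period rate
1 0.25
2 0.40
3 0.55
end

summarize rate if period == 1, meanonly

* r(N) is the number of observations used by -summarize-
if r(N) == 1 {
    local value = r(min)
    display "rate for period 1 is `value'"
}
else {
    display as error "condition matched " r(N) " observations, not exactly one"
}
```

Testing r(N) is stricter than comparing r(min) with r(max): it also rejects several observations that happen to share the same value.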

The problem is discussed from a different angle in

SJ-6-4  dm0025  . . . . . . . . . .  Stata tip 36: Which observations? Erratum
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/06   SJ 6(4):596                                    (no commands)
        correction of example code for Stata tip 36

SJ-6-3  dm0025  . . . . . . . . . . . . . .  Stata tip 36: Which observations?
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q3/06   SJ 6(3):430--432                               (no commands)
        tip for identifying which observations satisfy some
        specified condition

Mata is, not surprisingly, less awkward here:

: y = 1::10

: x = runiform(10,1)

: x , y
                     1             2
        +-----------------------------+
      1 |  .5044846558             1  |
      2 |  .0174561641             2  |
      3 |   .680281796             3  |
      4 |  .9221656218             4  |
      5 |  .1094441491             5  |
      6 |  .7122591983             6  |
      7 |   .765775156             7  |
      8 |  .0226029507             8  |
      9 |  .9540165765             9  |
     10 |  .2686450339            10  |
        +-----------------------------+

: select(x, y :== 1)
     .5044846558
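
The same select() idea can be applied to variables in the dataset and the result passed back to a Stata local. This is a sketch under assumptions: the toy data and the local name -rate1- are hypothetical, and the local is set only when exactly one row matches.

```stata
* Toy data; the values are hypothetical
clear
input period rate
1 0.25
2 0.40
3 0.55
end

mata:
// read the two variables into column vectors
period = st_data(., "period")
rate   = st_data(., "rate")

// keep the rows of rate for which period equals 1
match = select(rate, period :== 1)

// copy the value into Stata local rate1 only if the match is unique
if (rows(match) == 1) st_local("rate1", strofreal(match))
end

display "`rate1'"
```

If no row or more than one row matches, -rate1- is left undefined and the final -display- shows an empty string.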

Nick

On Tue, Jan 8, 2013 at 7:07 PM, Jeph Herrin <stata@spandrel.net> wrote:

I've just written the same awkward code for the untoldth time, and I'm
thinking there must be a better way to do it.

The problem is to get a particular value of a variable into a local which
corresponds to a particular value of another variable. I think this is
usually called reverse lookup. For example, I might have -period- and -rate-
and want to store the value of -rate- which corresponds to period = 1. My
lazy solution is


    sum rate if period = 1
    local rate1 `=r(mean)'

That is, I summarize a single observation, then put the mean in a local.
Is there a better way to do this?
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/



