Statalist The Stata Listserver



RE: st: FW: argmax for -summaraize-


From   "Feiveson, Alan H. (JSC-SK311)" <alan.h.feiveson@nasa.gov>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: FW: argmax for -summaraize-
Date   Tue, 25 Jul 2006 11:22:14 -0500

Thanks, Joseph, Jeph and Maarten for your helpful suggestions. I
incorporated these ideas into the following program, which returns
scalars holding the observation numbers of the minimum and maximum. I
found the easiest way to get those was to sort the original observation
number ("ord" in the program below) along with the variable of interest.
Then, after the bysorts, I re-sorted on "ord" to restore the original
order.

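program define argmax1
// finds observations with `x' = max and `x' = min
// usage: argmax1 varname name_for_min name_for_max [if] [in]
version 9.2
syntax anything [if] [in] 
tokenize `anything'
args x kmin kmax 
tempvar ind ord rep N
// mark the observations selected by the if/in qualifiers
qui gen byte `ind'=0
qui replace `ind'=1 `if' `in'
// record the original order (long, so large datasets do not overflow)
qui gen long `ord'=_n
bysort `ind' (`x'): gen long `rep'=_n
bysort `ind' (`x'): gen long `N'=_N
summ `ord' if `ind'==1 & `rep'==1,meanonly
scalar `kmin'=r(mean)
summ `ord' if `ind'==1 & `rep'==`N',meanonly
scalar `kmax'=r(mean)
// restore the original order
sort `ord'
end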
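
One caveat with the sorting approach: Stata sorts missing values last, so if the variable has missing values inside the selected sample, the last sorted observation is a missing one rather than the maximum. The following is a minimal sketch of the same sort-and-restore idea on a toy dataset, with missing values excluded from the sample marker. The variable names (x, ord, ind, rep, N) and the data are made up for illustration and are not part of the program above.

```stata
clear
input x
3
.
-1
7
2
end
generate long ord = _n                  // remember the original order
generate byte ind = !missing(x)         // sample marker: leave out missing x
bysort ind (x ord): generate long rep = _n   // rank within the marked sample
by ind: generate long N = _N                 // size of each ind group
summarize ord if ind == 1 & rep == 1, meanonly
scalar kmin = r(mean)                   // observation number of the minimum
summarize ord if ind == 1 & rep == N, meanonly
scalar kmax = r(mean)                   // observation number of the maximum
sort ord                                // restore the original order
display "min at obs " kmin ", max at obs " kmax
```

Adding ord to the sort key makes ties break deterministically: among tied maxima, the one with the highest original observation number is reported.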




-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Joseph
Coveney
Sent: Tuesday, July 25, 2006 2:05 AM
To: Statalist
Subject: Re: st: FW: argmax for -summaraize-

Alan H. Feiveson wrote:

Hello - Does anyone know an efficient way to identify the observation at
which a particular variable is minimum or maximum (subject to `if'
and/or `in') ?

Apparently -summarize- does not return this value. I see nothing in
-egen- nor  does "findit argmax" produce anything. I can program this
myself by looping through the observations, but that is not efficient. In
particular, one cannot guarantee that anything like

summ x
local xmax=r(max)
if x == `xmax' {
...

will work because of rounding. I also wish to avoid -preserve-,
-collapse-, etc.

--------------------------------------------------------------------------------

Couldn't you just -generate- a 0/1 indicator variable?  Then just use
the indicator in your Boolean expression:  -if indicator_variable . . .-

Generating such a variable (1) allows for both -if- and -in-, (2) won't
be affected by missing values in the target variable, and (3) doesn't
appear to be liable to rounding errors regardless of whether the target
variable is single- or double-precision:  one (and only one) maximum
observation is identified in each of 1000 200-observation datasets.

Joseph Coveney

clear
set more off
set seed `=date("2006-07-25", "ymd")'
set matsize 10000
tempname A
tempvar a max
set obs 200
generate double `a' = . // double-precision
generate byte `max' = 0
forvalues i = 1/1000 {
    quietly replace `a' = uniform()
    summarize `a', meanonly
    quietly replace `max' = (`a' == r(max)) if (1==1) in 1/200
    summarize `max', meanonly
    matrix define `A' = (nullmat(`A') \ r(sum))
}
drop _all
svmat byte `A', names(col)
assert c1 == 1
*
clear
set obs 200
generate float `a' = . // single-precision
generate byte `max' = 0
forvalues i = 1/1000 {
    quietly replace `a' = uniform()
    summarize `a', meanonly
    quietly replace `max' = (`a' == r(max)) if (1==1) in 1/200
    summarize `max', meanonly
    matrix define `A' = (nullmat(`A') \ r(sum))
}
drop _all
svmat byte `A', names(col)
assert c1 == 1
exit
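
As a follow-on sketch (not part of the posted code), the same indicator idea can also return the observation number of the maximum without any sorting. It compares the variable against r(max) directly rather than against a macro copy, since a macro holds a formatted string that may not reproduce the value exactly. The variable names (x, obsno, ismax), the seed, and the 1/100 range are hypothetical choices for illustration.

```stata
clear
set obs 200
set seed 12345
generate float x = uniform()
generate long obsno = _n                       // original observation number
summarize x in 1/100, meanonly                 // leaves the maximum in r(max)
generate byte ismax = (x == r(max)) in 1/100   // 1 at the maximum, missing outside the range
summarize obsno if ismax == 1, meanonly
scalar kmax = r(min)                           // lowest observation number attaining the maximum
display "max of x in 1/100 is at observation " kmax
```

Taking r(min) of the observation numbers gives a single answer even when several observations tie for the maximum.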


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



