
Re: st: flagging significant values in a variable


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: flagging significant values in a variable
Date   Sat, 3 Mar 2012 17:35:07 +0000

Threading is what your mailer does. It's not inherent in Statalist's
operation. (Archiving is a separate matter.)

But that's of no consequence: we all overlook previous emails from
time to time.

On the question of introducing a tolerance, I stand by my earlier
comment.

If I understand you correctly the code you posted earlier was really
intended as a code sketch and readers were expected to be perceptive
enough to realise that. That really wasn't clear to me. I would be
surprised if it was clear to anybody else. But let's concentrate on the
code and spell out what your approach implies: a loop over
observations.

forval i = 1/`=_N' {
        if lci[`i'] - natlci[`i'] > `tol' {
                replace tag = 1 in `i'
        }
        else if lci[`i'] - natlci[`i'] < -`tol' {
                replace tag = 2 in `i'
        }
        else {
                replace tag = 0 in `i'
        }
}

However, this loop really isn't necessary as the whole thing can be
done in one line.

replace tag = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))

If that's over-compressed, there is a version in about three lines,
similar in spirit to Graham's posting yesterday. Either will be much
faster than the loop.
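
As a rough sketch only, a three-line version might look something like the following. It is not Graham's code. The data are the four rows from the original question, the variable name tag2 is made up for the comparison, and the tolerance is kept purely as the example value already in this thread.

```stata
clear
input rate lci uci region
0.9727 0.9583 0.9849 1
0.9713 0.9523 0.9867 2
0.9835 0.9667 0.9971 3
0.9790 0.9741 0.9836 99
end

* national value copied to every observation, as in the code quoted below
egen natlci = total(lci * (region == 99))
local tol .0001

* one-line version using cond()
gen byte tag = cond(lci - natlci > `tol', 1, 2 * (lci - natlci < -`tol'))

* three-line version using the -if- qualifier (tag2 is a hypothetical name)
gen byte tag2 = 0
replace tag2 = 1 if lci - natlci > `tol'
replace tag2 = 2 if lci - natlci < -`tol'

* the two versions should agree in every observation
assert tag == tag2
list region lci natlci tag tag2
```

Both versions work on all observations at once through the -if- qualifier or an expression, so no loop is needed. Note that missing values of lci would count as greater than any number and so be tagged 1; exclude them explicitly if that matters.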

If you prefer a loop over observations here, that's your prerogative.

(To concentrate on one specific code question, I have left in your tolerance.)

Nick

On Sat, Mar 3, 2012 at 3:20 PM, Partho Sarkar <partho.ss+lists@gmail.com> wrote:
> True, I had overlooked the earlier solutions, because the question
> appears on 2 separate threads. I answered the one w/o any answers at
> the time, w/o having seen the other thread. (Btw this raises some
> issues about duplicate threads, possibly unintendedly so, which often
> confuse!)
>
> The code was just a sketch of an idea. I assumed (mistakenly perhaps)
> that the user would realize the need to qualify the if loops in
> practice (start off the loops with a foreach statement to loop through
> all the observations). The tolerance given is also only an example!
> All this based on what I think is a legitimate interpretation of the
> original question!
>
> Partho
>
> On Sat, Mar 3, 2012 at 8:22 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> This post overlooks earlier solutions posted yesterday. I see no need
>> to complicate anything by introduction of a tolerance, which seems
>> based on an idea that the rates are exact decimals to 4 d.p.
>>
>> Also, the code won't work as intended because it confuses the -if-
>> command and the -if- qualifier.
>>
>> FAQ     . . . . . . . . . . . . . . . . . . . . .  if command vs. if qualifier
>>        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  J. Wernow
>>        6/00    I have an if command in my program that only seems
>>                to evaluate the first observation, what's going on?
>>                http://www.stata.com/support/faqs/lang/ifqualifier.html
>>
>> Nick
>>
>> On Sat, Mar 3, 2012 at 2:10 PM, Partho Sarkar <partho.ss+lists@gmail.com> wrote:
>>> Tim,
>>>
>>> I am afraid you haven't spelt it out very clearly! Based on one
>>> possible interpretation, this would be one way to do it (shown only
>>> for the LCI (renamed lci) variable):
>>>
>>> ---------------------------START CODE-------------------------------------------
>>>
>>> egen natlci=total(lci*(region==99)) // generates a value for each obs., equal to national value
>>> local tol .0001  // define tolerance for "significantly lower or higher"
>>> gen byte tag= .
>>> if lci-natlci>`tol' {
>>> replace tag=1
>>> }
>>> else if lci-natlci< -`tol' {
>>> replace tag= 2
>>> }
>>> else {
>>> replace tag = 0
>>> }
>>>
>>> ---------------------------END CODE-------------------------------------------
>>>
>>> Hope this helps
>>>
>>> Partho
>>>
>>> From      Tim Evans <Tim.Evans@wmciu.nhs.uk>
>>> To        "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
>>> Subject   st: flagging significant values in a variable
>>> Date      Fri, 2 Mar 2012 09:24:46 +0000
>>>
>>> Hi,
>>>
>>> I have a dataset that has variables of rates, LCI and UCI for a
>>> number of regions in addition to a national average (rate, LCI, UCI)
>>> so that it looks like this:
>>>
>>> rate    LCI     UCI     region
>>> 0.9727  0.9583  0.9849  1
>>> 0.9713  0.9523  0.9867  2
>>> 0.9835  0.9667  0.9971  3
>>> 0.9790  0.9741  0.9836  99
>>>
>>> What I would like to do is generate a flag beside each row that
>>> will flag up entries where they are significantly higher (1) or lower
>>> (2) or not significantly different (0) to region 99 - I'm unsure as to
>>> the code here and would appreciate any advice. I'm using Stata 11.2.
>>
