Statalist The Stata Listserver


Re: st: vreverse when not all possible values of a variable are used by respondents

From   n j cox <>
Subject   Re: st: vreverse when not all possible values of a variable are used by respondents
Date   Thu, 31 May 2007 22:06:02 +0100

The FAQ advises

"Say what command(s) you are using. If they are not part of official Stata, say where they come from: the STB/SJ, SSC, or other archives."

In this case, -vreverse- rang a bell. So I fired up -findit vreverse-
and found it on SSC, purporting to be mine. A bell rang again.
One hand clapped. In the darkness, a frog croaked. I met once more a child I had forgot. Well, not quite. I did remember it really.

How do you answer a question like Christopher's? Or how should he
answer it himself? Trying out some examples is clearly a good idea.
Looking at the help is another: here is the definition:

"Suppose that in the observations specified varname varies between minimum min and maximum max. Then

newvar = min + max - varname

and value labels are mapped accordingly."

Looking at the code is another; it's just more Stata.

. viewsource vreverse.ado

shows that the help is correct. Here is the key segment:

su `varlist' if `touse', meanonly
local max = r(max)
local min = r(min)
local type : type `varlist'
gen `type' `generate' = (`min') + (`max') - `varlist' if `touse'

(That's more long-winded than necessary for the obvious purpose
because of what I need to use later.)

In short, -vreverse- takes the observed min and max
and uses those to reverse. So 3-5 will get reversed to
5-3, which Christopher does not want. -vreverse- knows
nothing about what might have been. Like Stata, it is not
strong on theology or metaphysics. It is an old-fashioned empiricist
or positivist working strictly from the data in sight, or
more precisely in memory.
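
A small sketch may make the arithmetic concrete. It does not call -vreverse- itself; it just applies the same min + max - varname rule to a toy variable. The variable name item and the intended 1-5 range are assumptions made up for illustration.

```stata
* toy data: an item that could range 1-5, but only 3-5 were used
clear
input byte item
3
4
5
end

* reversal based on the observed range (what the formula above does)
summarize item, meanonly
generate byte observed = r(min) + r(max) - item

* reversal based on the intended 1-5 range (what Christopher wants)
generate byte intended = 1 + 5 - item

list, noobs
```

With only 3-5 observed, the first rule maps 5 to 3, while the second maps 5 to 1.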

The issue arises elsewhere in Stata, particularly when people
want possible values that might have occurred, but didn't,
to be tabulated with frequency zero. There was a thread on
that recently.

There are at least two ways forward that I can see. One
is that I should add an option to -vreverse- so that the user
can spell out what the range really is. As Christopher points
out, he deals with a ragbag of scales, some 0-4, some 1-5,
etc. and so the putative range would have to be specified for
each variable, like it (Likert?) or not. Equivalently,
somebody could clone my program and add the option,
naming it something else.

Another is to look afresh at the problem. -vreverse- was
written in 2003, so a different take may be apparent now,
especially if -vreverse- was really a bad idea.

All these little programs for little problems may be
useful if they save someone busy a lot of head-scratching
on how to do it. Equally, they can be a distraction if users
are discouraged from learning more general tools that would
help in solving many other problems.

What is the problem? An integer variable is coded the
wrong way round, so far as you are concerned. What might
be integers 1/5 is coded 5/1. If that were all the problem,

gen new = 6 - old

is the solution. The justification for -vreverse- is
that it does more than this: it also fixes value labels,
variable label, format and characteristics. Note that
it is not superintelligent about variable labels. If
-old- has a variable label
"math ability scale: 1 = Newton 5 = simpleton"
then -vreverse- will not reach inside and flip them round.
Nor will what I am going to propose fix that.

Now in 2007 there is an official Stata command that
does most of this, namely -clonevar-. So a recipe
starts like this, for a 1/5 variable:

clonevar new = old
replace new = 6 - old

That takes care of everything except any value
labels that need to be fixed. -clonevar- fixes
format, variable label and characteristics.
(If you don't know what characteristics are,
you probably don't need to know yet.) -clonevar-
is StataCorp's name rather than the original
proposal of -clone-. -clone- is really too good a name
to assign to a command that is only a little deal.
In fact, in Stata 13 you will be able to type

. clone WWG

and a virtual Bill Gould will be added to
your machine to advise you on Mata. Now
that will be really handy.

Back to value labels: We could just
type them out in a single line, but that's
going to be tedious whenever several variables
are involved. Let's define the new labels
and apply them:

forval i = 1/5 {
    local j = 6 - `i'
    local lbl : label (old) `j'
    label def new `i' `"`lbl'"', modify
}
label val new new
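
To try the recipe end to end, here is a sketch using the auto data shipped with Stata, with the same value labels as in Christopher's example. The names rep and flipped are only illustrative choices, and the range 1-5 is stated by hand rather than read from the data.

```stata
sysuse auto, clear
label define rep 1 "one" 2 "two" 3 "three" 4 "four" 5 "five"
label values rep78 rep

* copy the variable with its labels, format and characteristics, then reverse
clonevar flipped = rep78
replace flipped = 6 - rep78

* new value i takes the label that old value 6 - i carried
forvalues i = 1/5 {
    local j = 6 - `i'
    local lbl : label (rep78) `j'
    label define flipped `i' `"`lbl'"', modify
}
label values flipped flipped

tabulate rep78 flipped, missing
```

Because the range is given explicitly, dropping the observations with rep78 < 3 before running this would still send 5 to 1.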

Now generalise this code so that it will work
with a bunch of variables all coded 1-5:

foreach v of var foo bar buzz blair brown bush twig {
    clonevar new_`v' = `v'
    replace new_`v' = 6 - `v'

    forval i = 1/5 {
        local j = 6 - `i'
        local lbl : label (`v') `j'
        label def new_`v' `i' `"`lbl'"', modify
    }

    label val new_`v' new_`v'
}

Evidently, your variable names will be different, and there
could in principle be conflicts with names in use. For
variables coded 0-4, you just change the magic numbers:

foreach v of var something else altogether {
    clonevar new_`v' = `v'
    replace new_`v' = 4 - `v'

    forval i = 0/4 {
        local j = 4 - `i'
        local lbl : label (`v') `j'
        label def new_`v' `i' `"`lbl'"', modify
    }

    label val new_`v' new_`v'
}

Well, in fact this is how programs often are born. Here is a sketch
of one with more pretensions:

*! 1.0.0 NJC 31 May 2007
program reversev
    version 9
    syntax varlist(numeric) [if] [in] , min(int) max(int)

    marksample touse
    qui count if `touse'
    if r(N) == 0 error 2000

    * the stated range must be a proper one
    if `min' > `max' {
        di as err "`min' > `max'"
        exit 498
    }
    else if `min' == `max' {
        di as err "`min' = `max'"
        exit 498
    }

    * only integer-valued variables make sense here
    foreach v of local varlist {
        capture assert `v' == int(`v') if `touse'
        if _rc {
            di as err "non-integer values in `v'"
            exit 498
        }
    }

    quietly foreach v of local varlist {
        clonevar new_`v' = `v' if `touse'
        replace new_`v' = `min' + `max' - `v' if `touse'

        * new value i takes the label of old value min + max - i
        forval i = `min'/`max' {
            local j = `min' + `max' - `i'
            local lbl : label (`v') `j'
            label def new_`v' `i' `"`lbl'"', modify
        }

        label val new_`v' new_`v'
    }
end

What the new names are is wired in. An ambition might be to
give the user more control over the names. This code is not

So, there are at least two paths here: add an extra hook
to -vreverse-, or write some fresh code.

Finally, note that Kyle Longest posted -revrs- on SSC recently for
a similar problem. I looked at its help to see what it does,
and found it a bit vague, but you could go further.


Christopher W. Ryan, MD

I am analyzing a survey that contains the Toronto Alexithymia scale, the
Interpersonal Reactivity Index, and the Psychological Mindedness Scale.
These are typical ordinal "Likert-type" scales, with values on each item
ranging from 1-5, 0-4, etc, depending on the scale.

Many of the items contained no responses at one end of the scale or the
other. For example, on many items, none of the respondents chose 0, 1,
or 2; everyone chose 3 or above.

I used the vreverse package to reverse the items that needed reversing,
but I wonder now if that was inappropriate when the full range of values
was not used by respondents. If an item *can* range from 1 to 5, but
only 3-5 were actually *used*, would the original 5s be recoded to 1s in
the new variable (the desired behavior), or into 3s (undesired)?

I think it is the latter case:

sysuse auto, clear
label define rep 1 one 2 two 3 three 4 four 5 five
label values rep78 rep
vreverse rep78, generate(flipped)
tab rep78, nolabel
tab flipped, nolabel
tab rep78 flipped, nolabel

*now drop all the cases where rep78 is less than 3, and do it all again.
drop if rep78<3
vreverse rep78, generate(flipped)
tab rep78, nolabel
tab flipped, nolabel
tab rep78 flipped, nolabel

Do I understand the behavior of vreverse properly?