The Stata listserver

st: Re: Copy string variable as value label


From   Friedrich Huebler <huebler@rocketmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: Re: Copy string variable as value label
Date   Mon, 25 Apr 2005 10:22:46 -0700 (PDT)

Nick and Roger,

Thank you. With -sencode- my original code can be replaced by a
single line.

. sencode var, gen(newvar) gsort(mean)
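For anyone who wants to reproduce this, here is a minimal sketch using the example data from my original question (quoted at the end of this message). It assumes -sencode- has already been installed from SSC (-ssc install sencode-); the -list- and -label list- calls are only there to inspect the result.

```stata
* Example data from the original question
clear
input str1 var mean
a 1.5
b 1.2
b 1.2
b 1.2
c 1.8
c 1.8
end

* Encode var so that the integer codes follow the sort order of mean
* (assumes -sencode- is installed: ssc install sencode)
sencode var, gen(newvar) gsort(mean)

* Show the underlying codes next to the string values
sort newvar
list var mean newvar, nolabel sepby(newvar)
label list
```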

Friedrich

--- Roger Newson <roger.newson@kcl.ac.uk> wrote:
> -sencode- can indeed solve Friedrich's problem (using the -gsort()- option
> to encode in an arbitrary order). The current version of -sencode-
> (downloadable from SSC) uses file manipulation only in 2 places:
> 
> 1. There is an initial -preserve- and a final -restore, not-, in case the
> user presses -Break- in the middle of executing -sencode-.
> 
> 2. In order for -sencode- to work if the -label()- option is given as an
> existing label, -sencode- uses -label save- to save the existing label to a
> temporary file, and then uses -file- to read that temporary file and find
> the highest integer with an existing label, so that any additional string
> values encoded are allocated integers even higher. I couldn't find a better
> way, at least in Stata 7 or 8, to obtain the highest labelled integer for
> an existing label.
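The second point can be sketched roughly as follows. This is not the code inside -sencode-, only an illustration of the -label save- plus -file- idea. The label name mylab and its contents are hypothetical, and the parsing assumes that -label save- writes one line per labelled value of the form "label define mylab # text, modify", so that the integer is the fourth word on each line.

```stata
* Hypothetical existing value label, for illustration only
capture label drop mylab
label define mylab 1 "b" 2 "a" 7 "c"

* Save the label definitions to a temporary file
tempfile labfile
label save mylab using `"`labfile'"'

* Read the file line by line, keeping the largest integer seen
tempname fh
local maxval .
file open `fh' using `"`labfile'"', read text
file read `fh' line
while r(eof) == 0 {
    * assumed line layout: the labelled integer is the fourth word
    local val : word 4 of `line'
    capture confirm integer number `val'
    if _rc == 0 {
        if missing(`maxval') | `val' > `maxval' local maxval `val'
    }
    file read `fh' line
}
file close `fh'

display "highest labelled integer in mylab: `maxval'"
```

Label text containing unbalanced quotes could defeat this simple word-based parsing. Later Stata releases document r(max) among the results left behind by -label list-, which would make the file step unnecessary; check -help label- for your version.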
> 
> Roger
> 
> 
> At 15:46 25/04/2005, Nick Cox wrote (in reply to Friedrich Huebler):
> >This problem, or at least a relative of
> >it, can be attacked, I think, using Roger Newson's -sencode-.
> >
> >His solution includes a certain amount of file manipulation.
> >In my version of the problem when I looked
> >at it two years ago I didn't find any need
> >for that, but I haven't looked closely enough to work
> >out what aspects of the problem Roger solves that I
> >don't or indeed vice versa.
> >
> >There doesn't seem to be a help file for my resulting program,
> >but the code is a bit more general than yours.
> >
> >program seqencode, sortpreserve
> >*! NJC 1.0.0 1 May 2003
> >         version 8
> >         syntax varname(string) [if] [in], Generate(str) [ Label(str) Unique ]
> >
> >         local limit = cond(c(flavor) == "Small", 1000, 65536)
> >
> >         quietly {
> >                 marksample touse, strok
> >                 count if `touse'
> >                 if r(N) == 0 error 2000
> >
> >                 // variable is new?
> >                 confirm new variable `generate'
> >
> >                 // label is new?
> >                 if "`label'" == "" local label "`generate'"
> >                 capture label list `label'
> >                 if _rc != 111 {
> >                         di as err "label `label' already defined"
> >                         exit 110
> >                 }
> >
> >                 if "`unique'" != "" {
> >                         // each value `touse' mapped to its own -label-
> >                         replace `touse' = -`touse'
> >                         sort `touse' `_sortindex'
> >
> >                         // define labels
> >                         count if `touse'
> >                         if `r(N)' > `limit' error 134
> >                         forval i = 1 / `r(N)' {
> >                                 label def `label' `i' ///
> >                                         `"`= `varlist'[`i']'"', modify
> >                         }
> >
> >                         gen long `generate' = _n  if `touse'
> >                 }
> >                 else {
> >                         // get first occurrences
> >                         tempvar first
> >                         bysort `touse' `varlist' (`_sortindex') : ///
> >                                 gen byte `first' = -(_n == 1 & `touse')
> >                         sort `first' `_sortindex'
> >
> >                         // define labels
> >                         count if `first'
> >                         if `r(N)' > `limit' error 134
> >                         forval i = 1 / `r(N)' {
> >                                 label def `label' `i' ///
> >                                         `"`= `varlist'[`i']'"', modify
> >                         }
> >
> >                         // copy values from first occurrences
> >                         gen long `generate' = _n  if `touse'
> >                         bysort `touse' `varlist' (`generate'): ///
> >                                 replace `generate' = `generate'[1]
> >                 }
> >
> >                 compress `generate'
> >
> >                 // assign labels
> >                 label val `generate' `label'
> >                 label var `generate' `"`: variable label `varlist''"'
> >         }
> >end
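A usage sketch for the program above, applied to Friedrich's example data. It assumes -seqencode- has been defined in the current session from the code as posted, and it rests on a reading of that code in which values are numbered in order of first occurrence in the current sort order, hence the -sort mean- before the call. With no -label()- option, the value label takes the name of the new variable.

```stata
* Example data from the original question
clear
input str1 var mean
a 1.5
b 1.2
b 1.2
b 1.2
c 1.8
c 1.8
end

* Values are numbered in order of first occurrence,
* so put the data in the desired order first
sort mean
seqencode var, generate(newvar)

* Show the underlying codes and the label definitions
sort newvar
list var mean newvar, nolabel sepby(newvar)
label list newvar
```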
> >
> >Nick
> >n.j.cox@durham.ac.uk
> >
> >Friedrich Huebler
> >
> > > When a string variable is converted to a numeric variable with
> > > -encode-, the numeric values follow the sort order of the string
> > > variable. I would like to -encode- a string variable based on the
> > > sort order of another variable. My original data is like this:
> > >
> > > var   mean
> > > a     1.5
> > > b     1.2
> > > b     1.2
> > > b     1.2
> > > c     1.8
> > > c     1.8
> > >
> > > I would like to create the variable "newvar" like this, using the
> > > sort order of the variable "mean":
> > >
> > > var   mean   newvar   (label for newvar)
> > > b     1.2    1        b
> > > b     1.2    1        b
> > > b     1.2    1        b
> > > a     1.5    2        a
> > > c     1.8    3        c
> > > c     1.8    3        c
> > >
> > > My solution is shown below. Creating "newvar" itself is simple but
> > > there must be a better way to assign the labels.
> > >
> > > sort mean
> > > egen newvar = group(mean)
> > > lab def newvar 1 "temp"
> > > levels(newvar), local(levels)
> > > foreach l of local levels {
> > >   gen temp = ""
> > >   replace temp = var if newvar==`l'
> > >   levels(temp), local(templabel)
> > >   lab def newvar `l' `templabel', modify
> > >   drop temp
> > > }
> > > lab val newvar newvar
> > >
> > > How can this code be improved? Thank you for your suggestions.
