
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: renaming variables based on long labels


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: renaming variables based on long labels
Date   Thu, 21 Jul 2011 10:45:33 -0500

By the way -rename- has been much extended in Stata 12. At some point
I will write up as a Statalist posting a kind of dictionary comparing
-renvars- examples and -rename- examples for anyone using 12 who is
curious.

Nick

On Thu, Jul 21, 2011 at 10:14 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Elan:
>
> You can use -abbrev(,)- followed by -subinstr()-. I don't think that
> is absolutely guaranteed to produce distinct variable names, but you
> can try. If you are happy with 32 character names the combinatorics
> would be in your favour, although using the same elements such as
> "Relationship" works the other way.
>
> Nick
>
> On Thu, Jul 21, 2011 at 9:51 AM, Cohen, Elan <cohened@upmc.edu> wrote:
>> Nick,
>>
>> My -su- example was simply meant to illustrate that Stata _does_ have machinery for recognizing unique variable names, or at least unique combinations of characters.  Here's another example.
>>
>> local pre somethinglong
>> g `pre'abcd = 0
>> g `pre'abdd = 0
>> su `pre'*
>> * The variables display as:
>> something~cd
>> something~dd
>>
>> g `pre'aacd = 0
>> su `pre'*
>> * The variables display as:
>> somethin~bcd
>> something~dd
>> somethin~acd
>>
>> Somehow Stata knows to abbreviate the variables differently based on the other relevant variables.  I'm imagining I could apply this same machinery to -rename- (obviously replacing '~' with say '_').  I can't view the source code for -summarize- so I'm not sure how Stata does this.
>>
>> Does anyone know of another command that does something similar with source code available?
>>
>> Thanks,
>>
>> - Elan
>>
>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
>>> statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>>> Sent: Thursday, July 21, 2011 10:37
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: renaming variables based on long labels
>>>
>>> You are confusing quite different facts.
>>>
>>> In your code, -renvars- (SJ) is adding the prefix "longprefix" to the
>>> variable names in the -bg2- dataset. But it so happens that all the
>>> variable names so created remain legal. The longest is just 18
>>> characters long so the upper limit of 32 is not biting at all.
>>>
>>> However, the names are long enough for Stata's abbreviation machinery
>>> to be used when you call -summarize-, but this is not a matter of
>>> changing the variable names, and in any case the character ~ is not
>>> allowed within variable names.
>>>
>>> I imagine your problem is perfectly soluble, but will require more ad
>>> hoc code from you. Example:
>>>
>>> foreach var of varlist VAR* {
>>>         local label `=strtoname("`:var lab `var''")'
>>>         local label : subinstr local label "Relationship" "Reln", all
>>>         local label : subinstr local label "Democrat" "D", all
>>>         ....
>>> }
>>>
>>> On Thu, Jul 21, 2011 at 9:00 AM, Cohen, Elan <cohened@upmc.edu> wrote:
>>> >
>>> > I have a dataset with variable names VAR1-VAR40.  All variables have
>>> (pretty long) labels.  I'm using
>>> >
>>> > foreach var of varlist VAR* {
>>> >        rename `var' `=strtoname("`:var lab `var''")'
>>> > }
>>> >
>>> > to rename each variable according to its label.  I'm running into a
>>> problem because the variable labels are not unique within the first 32
>>> characters and so I'm getting "Relationship_of_interviewee_to_s already
>>> defined".
>>> >
>>> > Can Stata automatically generate a unique name based on existing variables?
>>>  This seems like it's possible based on the following output:
>>> >
>>> > webuse bg2
>>> > renvars, prefix(longprefix)
>>> > su
>>> >
>>> > where Stata truncates variable names and automatically finds a unique
>>> string to display for each variable (-findit renvars-).
>>> >
>
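
One way to sidestep the "already defined" error described in this thread is to test each candidate name before renaming and append a numeric suffix on a collision. The following is only a minimal sketch of that idea, not something proposed by the posters: the toy data, the labels, and the _2, _3, ... suffix scheme are assumptions made for illustration. It relies on strtoname() truncating at 32 characters and on -confirm new variable- to detect names already in use. Any -subinstr- shortening of the label (e.g. "Relationship" to "Reln") could be applied before the strtoname() call.

```stata
* Toy data standing in for VAR1-VAR40 (labels are hypothetical)
clear
set obs 1
forvalues j = 1/3 {
    generate VAR`j' = `j'
}
label variable VAR1 "Relationship of interviewee to subject A"
label variable VAR2 "Relationship of interviewee to subject B"
label variable VAR3 "Age"

foreach var of varlist VAR* {
    local label : variable label `var'
    * leave unlabelled variables alone
    if `"`label'"' == "" continue

    * strtoname() returns a legal name, truncated to 32 characters
    local stub = strtoname(`"`label'"')
    local new `stub'
    local i = 1

    * while the candidate is not usable as a new name, try stub_2, stub_3, ...
    capture confirm new variable `new'
    while _rc {
        local ++i
        local suffix _`i'
        * trim the stub so that stub + suffix still fits in 32 characters
        local new = substr("`stub'", 1, 32 - strlen("`suffix'")) + "`suffix'"
        capture confirm new variable `new'
    }
    rename `var' `new'
}

describe
```

The suffix is the least informative part of the resulting name, so shortening recurring words in the label first keeps more of the distinguishing text inside the 32-character limit.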

