Re: st: renaming variables based on long labels

Thu, 21 Jul 2011 10:45:33 -0500

By the way, -rename- has been much extended in Stata 12. At some point
I will write up as a Statalist posting a kind of dictionary comparing
-renvars- examples and -rename- examples for anyone using 12 who is
curious.

Nick

On Thu, Jul 21, 2011 at 10:14 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> Elan:
>
> You can use -abbrev(,)- followed by -subinstr()-. I don't think that
> is absolutely guaranteed to produce distinct variable names, but you
> can try. If you are happy with 32 character names the combinatorics
> would be in your favour, although using the same elements such as
> "Relationship" works the other way.
>
> Nick
>
> On Thu, Jul 21, 2011 at 9:51 AM, Cohen, Elan <cohened@upmc.edu> wrote:
>> Nick,
>>
>> My -su- example was simply meant to illustrate that Stata _does_
>> have machinery for recognizing unique variable names, or at least
>> unique combinations of characters. Here's another example.
>>
>> local pre somethinglong
>> g `pre'abcd = 0
>> g `pre'abdd = 0
>> su `pre'*
>> * The variables display as:
>> something~cd
>> something~dd
>>
>> g `pre'aacd = 0
>> su `pre'*
>> * The variables display as:
>> somethin~bcd
>> something~dd
>> somethin~acd
>>
>> Somehow Stata knows to abbreviate the variables differently based on
>> the other relevant variables. I'm imagining I could apply this same
>> machinery to -rename- (obviously replacing '~' with, say, '_'). I
>> can't view the source code for -summarize-, so I'm not sure how Stata
>> does this.
>>
>> Does anyone know of another command that does something similar with
>> source code available?
>>
>> Thanks,
>>
>> - Elan
>>
>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>>> Sent: Thursday, July 21, 2011 10:37
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: Re: st: renaming variables based on long labels
>>>
>>> You are confusing quite different facts.
>>>
>>> In your code, -renvars- (SJ) is adding the prefix "longprefix" to the
>>> variable names in the -bg2- dataset. But it so happens that all the
>>> variable names so created remain legal. The longest is just 18
>>> characters long so the upper limit of 32 is not biting at all.
>>>
>>> However, the names are long enough for Stata's abbreviation machinery
>>> to be used when you call -summarize-, but this is not a matter of
>>> changing the variable names, and in any case the character ~ is not
>>> allowed within variable names.
>>>
>>> I imagine your problem is perfectly soluble, but will require more ad
>>> hoc code from you. Example:
>>>
>>> foreach var of varlist VAR* {
>>>     local label `=strtoname("`:var lab `var''")'
>>>     local label : subinstr local label "Relationship" "Reln", all
>>>     local label : subinstr local label "Democrat" "D", all
>>>     ....
>>> }
>>>
>>> On Thu, Jul 21, 2011 at 9:00 AM, Cohen, Elan <cohened@upmc.edu> wrote:
>>> >
>>> > I have a dataset with variable names VAR1-VAR40. All variables have
>>> > (pretty long) labels. I'm using
>>> >
>>> > foreach var of varlist VAR* {
>>> >     rename `var' `=strtoname("`:var lab `var''")'
>>> > }
>>> >
>>> > to rename each variable according to its label. I'm running into a
>>> > problem because the variable labels are not unique within the first
>>> > 32 characters and so I'm getting "Relationship_of_interviewee_to_s
>>> > already defined".
>>> >
>>> > Can Stata automatically generate a unique name based on existing
>>> > variables? This seems like it's possible based on the following
>>> > output:
>>> >
>>> > webuse bg2
>>> > renvars, prefix(longprefix)
>>> > su
>>> >
>>> > where Stata truncates variable names and automatically finds a
>>> > unique string to display for each variable (-findit renvars-).
>>> >

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
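
A minimal sketch of one way to get distinct names when labels collide
within the first 32 characters. This is not the -abbrev()- route
suggested in the thread but a plain numeric-suffix fallback: if the name
returned by -strtoname()- is already taken, the base is cut to 29
characters and _1, _2, ... is appended until -confirm new variable-
accepts it. The toy data, the three labels, and the locals lab, base,
new and i are all assumptions made up for illustration.

```stata
* Sketch only: toy data and labels are hypothetical, not from the thread.
clear
set obs 1
forvalues j = 1/3 {
    generate VAR`j' = 0
}
label variable VAR1 "Relationship of interviewee to spouse of head"
label variable VAR2 "Relationship of interviewee to sibling of head"
label variable VAR3 "Age of interviewee"

foreach var of varlist VAR* {
    local lab : variable label `var'
    * strtoname() makes a legal name and truncates it at 32 characters
    local base = strtoname(`"`lab'"')
    local new `base'
    local i = 1
    * Nonzero _rc means the candidate is already in use (or not legal)
    capture confirm new variable `new'
    while _rc {
        * Keep 29 characters so "_" plus a two-digit counter still fits
        local new = substr("`base'", 1, 29) + "_`i'"
        local ++i
        capture confirm new variable `new'
    }
    rename `var' `new'
}
describe
```

The 29-character cut leaves room for an underscore and a two-digit
counter inside the 32-character limit, which should be enough for 40
variables. Empty labels, or labels containing double quotes, would
probably need extra handling that this sketch does not attempt.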

