



st: Re: removing characters from string-formatted variables mixed in with numeric-formatted variables


From   Doug Hess <douglasrhess@gmail.com>
To   "Cohen, Elan" <cohened@upmc.edu>
Subject   st: Re: removing characters from string-formatted variables mixed in with numeric-formatted variables
Date   Fri, 22 Jun 2012 12:06:09 -0400

Somebody replied to me off list with the following code, which worked
wonderfully. The -ds- command lists "variables matching name
patterns or other characteristics" (see -help ds-); here it selects
only the string variables and leaves their names in r(varlist), so
-destring- is applied to those variables alone. This differs only
slightly from Elan's suggestion below, which also works.

ds *, has(type string)
display "`r(varlist)'"
destring `r(varlist)', replace ignore("'")
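
For anyone who prefers to keep an explicit loop, as in the original
question, the type check can also be done per variable. This is only a
minimal sketch, assuming every string variable in the dataset holds
quoted numbers; it relies on -confirm string variable- under -capture-
to skip variables that are already numeric, and the loop name v is
arbitrary.

```stata
* Sketch: strip single quotes from string variables only, then convert.
* Assumes all string variables are meant to be numeric.
foreach v of varlist _all {
    * -confirm- sets _rc to 0 only when the variable is a string
    capture confirm string variable `v'
    if _rc == 0 {
        replace `v' = subinstr(`v', "'", "", .)
        destring `v', replace
    }
}
```

If a string variable holds genuine text, -destring- leaves it as a
string, but note that the -replace- step would still strip any
apostrophes from it.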

Thank you.

-Doug


On Fri, Jun 22, 2012 at 11:28 AM, Cohen, Elan <cohened@upmc.edu> wrote:
> Doug,
>
> I believe the following one-liner should work for you:
>
> destring *, replace ignore("'")
>
> HTH,
>
> - Elan
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Doug Hess
> Sent: Friday, June 22, 2012 11:03
> To: statalist@hsphsun2.harvard.edu
> Cc: Doug Hess
> Subject: st: removing characters from string-formatted variables mixed in with numeric-formatted variables
>
> Hello,
>
> I imported into Stata from text files a data set of survey responses
> for a large national survey. Many of the variables have single quotes
> around numeric values. For instance, a variable may include the values
> '-9', '1', '2' instead of simply -9, 1, 2.  However, not every
> variable includes these characters for numeric values. (Not sure why!)
> Thus, Stata formats some variables as string and some as numeric
> during the import (using the import "text data from a spreadsheet"
> menu). However, the order of the variables is not strings first,
> numeric second. It's all hodgepodge.
>
> I want to remove all the stray single quote marks. So, after poking
> around on Statalist I tried using the -replace- command, the
> -subinstr- function, and a loop:
>
> local abc = "control bedrms region smsa metro3 lmed lmeda lmedb fmr"
> /* Note I truncated this list, there are dozens of variables in the
> dataset I wish to clean up. */
>    foreach varname of local abc {
>        replace `varname'=subinstr(`varname',"'","",.)
>        destring `varname', replace
>        }
>
> However, this loop stops when it runs into a variable formatted as
> numeric. Given that there are dozens of these variables, I don't want
> to use the -order- command one by one to put the string variables
> first (or last). Is there a way to use the format of the variables
> with -if- to limit the -order- command or -replace- command? Or other
> ideas?
>
> Thank you. (Note: I subscribe to the list's digest mode, so cc'ing me
> on any responses would be helpful.)
>
> Doug
> douglasrhess@gmail.com
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


