


RE: st: Need Help with converting String Variables to Numeric Variables

From   "Impavido, Gregorio" <>
Subject   RE: st: Need Help with converting String Variables to Numeric Variables
Date   Tue, 19 Jun 2012 11:01:52 -0400

In addition to the condition suggested by Nick, you could make the substitution before you -destring-.

Make a list of the target entries to be substituted with

gen var2 = var  // not to alter the original data
tab var2 if regexm(var2, "[^0-9 .]")  // or
tab var2 if missing(real(var2))

and then replace

replace var2 = "0" if var2=="targetvalues"

This is clearly not convenient if you have many target values, as you would need many -replace- statements. However, it would spare you the use of -force- in -destring- (I personally lack confidence in that option).
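
The substitute-then-destring steps above can be sketched as a short do-file. This is only an illustration: the five observations and the three target strings are made up, and the loop is just one way to avoid typing a separate -replace- for every target value.

```stata
* Sketch only: the data and the target strings below are invented for illustration
clear
set obs 5
generate str20 var = ""
replace var = "1.5"           in 1
replace var = "<0.01"         in 2
replace var = "less than 0.1" in 3
replace var = "< 4"           in 4
* observation 5 is left empty, i.e. genuinely missing

generate var2 = var                  // work on a copy, not the original

* list the nonempty entries that do not convert to a number
tabulate var2 if missing(real(var2)) & var2 != ""

* one -replace- per target value, written as a loop
foreach target in "<0.01" "less than 0.1" "< 4" {
    replace var2 = "0" if var2 == "`target'"
}

destring var2, replace               // -force- is no longer needed
list var var2
```

After the loop every entry of var2 is either a numeric string or empty, so -destring- should convert it without -force-, and the empty entry becomes numeric missing rather than zero.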

-----Original Message-----
From: [] On Behalf Of Nick Cox
Sent: Tuesday, June 19, 2012 4:25 AM
Subject: Re: st: Need Help with converting String Variables to Numeric Variables

The table I suggested was the wrong way round.

tab numvar if missing(strvar)

should be

tab strvar if missing(numvar)

but the mention of -if missing(strvar)- might have alerted you to the
key trick: making your conversions conditional on the value of a

For example, empty strings "", one or more spaces " ", "  ", etc.,
periods ".", stray text "foo", and inequalities such as "<4" will all map
to numeric missing under -destring, force-. It may be that, at its simplest,

replace numvar = 0 if strpos(strvar, "<") == 1

is enough to get what you want, but you need to look at your data.
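
As a sketch of the conditional approach, the do-file below converts with -force-, inspects what became missing, and then sets only the flagged entries to zero. The data are invented, numvar is a hypothetical name for the new variable, and the extra "less than" condition is an assumption based on the values quoted in the original question.

```stata
* Sketch only: the data below are invented for illustration
clear
set obs 5
generate str20 strvar = ""
replace strvar = "1.5"           in 1
replace strvar = "<0.01"         in 2
replace strvar = "less than 0.1" in 3
replace strvar = "foo"           in 4
* observation 5 is left empty, i.e. genuinely missing

* everything nonnumeric becomes numeric missing in the new variable
destring strvar, generate(numvar) force

* look at which string values were mapped to missing
tabulate strvar if missing(numvar)

* set to zero only the entries that start with "<" or "less than"
replace numvar = 0 if strpos(strvar, "<") == 1 | strpos(strvar, "less than") == 1

* anything still missing here is empty or stray text and needs a decision
list strvar numvar if missing(numvar)
```

Conditioning on the string variable keeps stray text such as "foo" and genuinely empty values as missing, instead of turning every missing value into zero.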

On Tue, Jun 19, 2012 at 3:38 AM, Dudekula, Anwar <> wrote:
> Dear Nick and Daniel,
> Thanks a lot for the response.
> Unfortunately, I have missing values in the original variable.
> Hence -destring- with the -force- option generates additional missing values in the new variable, on top of those that correspond to missing values in the original variable.
> The practical problem is that the original variable is a biomarker, and a missing value is being created for an observation like "<0.01" in the original variable.
> In Stata a missing value is treated as a huge number, but in fact this value should be equal to zero.

Nick Cox []

> I agree with Daniel. But I would check what is being mapped to zero.
> destring numvar, force gen(strvar)
> tab numvar if missing(strvar)

> On Mon, Jun 18, 2012 at 11:58 PM, daniel klein
> <> wrote:
>> Can't you just -destring- your string variable, specifying the -force-
>> option, and -replace- the created missing values with 0 in the new
>> variable? Or am I missing something here?
> Anwar
>> I am working on a hospital data set in which one of the variables
>> has string values as observations.
>> I am able to convert numeric string values to numeric values using
>> the -destring- command.
>> But some of the observations are "< 4", "<0.04", "less than 0.1",
>> which need to be converted to zero.
>> In Stata there is an option to convert nonnumeric values to missing values,
>> but I need to convert them to zero.

*   For searches and help try:

