
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Need Help with converting String Variables to Numeric Variables


From   "Impavido, Gregorio" <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: Need Help with converting String Variables to Numeric Variables
Date   Tue, 19 Jun 2012 11:01:52 -0400

In addition to the condition suggested by Nick, you could make the substitution in the string variable before you -destring-.

First list the entries that need to be substituted, with

gen var2 = var  // not to alter the original data
tab var2 if regexm(var2, "[^0-9 .]")  // or
tab var2 if missing(real(var2))

and then replace

replace var2 = "0" if var2=="targetvalues"

This is clearly not convenient if you have many target values, as each one needs its own -replace-. However, it spares you the use of -force- in -destring- (an option I personally lack confidence in).
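
The steps above can be put together as a small do-file. This is only a sketch under assumptions: the example data are made up, and the three target strings are the ones quoted from the original question later in this thread. It uses -inlist()- so that one -replace- covers several targets, which is a variation on the single-value -replace- shown above; note that -inlist()- accepts only a handful of string arguments, so a long list of targets would still need several statements.

```stata
* Hypothetical example data; var and var2 follow the names used above
clear
input str15 var
"1.5"
"<0.01"
"< 4"
"less than 0.1"
""
"12"
end

* Work on a copy so the original data are not altered
generate var2 = var

* List entries that real() cannot read as numbers (empty strings excluded)
tabulate var2 if missing(real(var2)) & var2 != ""

* One replace covers several target values
replace var2 = "0" if inlist(var2, "<0.01", "< 4", "less than 0.1")

* No -force- needed: if any nonnumeric text remained, -destring- would
* refuse and leave var2 as a string
destring var2, replace
confirm numeric variable var2
```

Because -destring- is run without -force-, any target value that was overlooked shows up as a variable that stays string, rather than as silently created missing values.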


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Nick Cox
Sent: Tuesday, June 19, 2012 4:25 AM
To: [email protected]
Subject: Re: st: Need Help with converting String Variables to Numeric Variables

The table I suggested was the wrong way round.

tab numvar if missing(strvar)

should be

tab strvar if missing(numvar)

but the mention of -if missing(strvar)- might have alerted you to the
key trick: making your conversions conditional on the value of a
variable.

For example, empty strings "", one or more spaces " ", "  ", etc.,
periods ".", stray text "foo", inequalities such as "<4" will all map
to numeric missing under -destring, force-. It may be, at the easiest,
that

replace numvar = 0 if strpos(strvar, "<") == 1

is enough to get what you want, but you need to look at your data.
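
A minimal sketch of that conditional approach, run as a do-file. The example data are hypothetical, and the "less than" rule is an assumption added to cover the third kind of entry mentioned in the original question; it is not part of the one-line -replace- above.

```stata
* Hypothetical example data; strvar and numvar follow the names used above
clear
input str15 strvar
"1.5"
"<0.01"
"< 4"
"less than 0.1"
""
"."
"12"
end

* Convert what can be converted; everything else becomes numeric missing
destring strvar, generate(numvar) force

* Inspect exactly which strings ended up as missing before changing anything
tabulate strvar if missing(numvar), missing

* Set only the "below some limit" entries to 0
replace numvar = 0 if missing(numvar) & strpos(trim(strvar), "<") == 1
replace numvar = 0 if missing(numvar) & strpos(lower(trim(strvar)), "less than") == 1

list, clean
```

Entries that were empty or "." in the string variable stay missing, because neither condition matches them.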

On Tue, Jun 19, 2012 at 3:38 AM, Dudekula, Anwar <[email protected]> wrote:
> Dear Nick and Daniel,
>
> Thanks a lot for the response.
>
> Unfortunately, I have missing values in the original variable.
>
> Hence the new variable generated by the destring command with the force option contains additional missing values, on top of those corresponding to the missing values in the original variable.
>
> The practical problem is that the original variable is a biomarker, and a missing value is being created for an observation like "<0.01" in the original variable.
>
> In Stata a missing value is a huge number, but in fact this value should be equal to zero.

Nick Cox [[email protected]]

> I agree with Daniel. But I would check what is being mapped to zero.
>
> destring numvar, force gen(strvar)
> tab numvar if missing(strvar)

> On Mon, Jun 18, 2012 at 11:58 PM, daniel klein
> <[email protected]> wrote:
>
>> Can't you just -destring- your string variable, specifying the -force-
>> option, and -replace- the created missing values with 0 in the new
>> variable? Or am I missing something here?
>
> Anwar
>
>> I am working on a hospital data set in which one of the variables
>> holds its observations as strings.
>>
>> I am able to convert numeric string values to numeric values using
>> the destring command.
>>
>> But some of the observations are "< 4", "<0.04", "less than 0.1",
>> which need to be converted to zero.
>>
>> Stata has an option to convert nonnumeric values to missing values,
>> but I need to convert them to zero.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


