
Re: st: destring numeric and non-numeric data

From   Tirthankar Chakravarty <>
Subject   Re: st: destring numeric and non-numeric data
Date   Fri, 21 Aug 2009 16:13:59 +0100

Use the -sieve()- function in Nick Cox's excellent -egenmore- (SSC):
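* install once if needed: ssc install egenmore
clear
input str8 percent str20 words
"45%" "1234 sjkhdfjh kjdfk"
"45%" "1234 sjkhdfjh kjdfk"
"45%" "1234 sjkhdfjh kjdfk"
"45%" "1234 sjkhdfjh kjdfk"
"45%" "1234 sjkhdfjh kjdfk"
end
* keep only the digit characters of each string
egen percentnum = sieve(percent), keep(numeric)
egen wordsnum = sieve(words), keep(numeric)
* convert the resulting digit strings to numeric variables
destring percentnum wordsnum, replace
li, clean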
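
If installing a user-written package is not an option, official Stata may be enough here. The sketch below is not from the original reply: it assumes variables named Score and SampleSize as described in the question, and the three observations are made up for illustration. It uses the -ignore()- option of -destring- to drop the percent sign, and -word()- with -real()- to pick out the leading count.

```stata
* Illustrative data shaped like the question's description (values are made up)
clear
input str8 Score str20 SampleSize
"95%" "106 patients"
"88%" "42 patients"
"100%" "7 patients"
end

* ignore() strips the listed characters before the conversion is attempted
destring Score, generate(score_num) ignore("%")

* Alternative for the count: take the first word and convert it to a number
generate samplesize_num = real(word(SampleSize, 1))

list, clean
```

The -word()- route assumes the number always comes first and is separated from the text by a space; if the real data are less regular, -sieve()- with -keep(numeric)- is the more forgiving choice.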


On Fri, Aug 21, 2009 at 4:04 PM, Taylor Cook<> wrote:
> I am working with CMS's Hospital Compare data for the first time. One
> of the sets lists the recommended treatment for a condition (e.g., aspirin
> for heart attack), the percent of patients with the condition that
> received the treatment (Score), and the total number of patients who
> presented with the condition (SampleSize).
> The variables I am interested in, Score and SampleSize, are both
> string variables and, here is the tricky part, CMS recorded the data
> with numeric and non-numeric symbols. For example, all of the scores
> are "95%" and the sample size is "106 patients." These percent symbols
> and the word "patient" have made it difficult to destring. Any
> suggestions would be greatly appreciated.
> Thanks,
> Taylor

To every ω-consistent recursive class κ of formulae there correspond
recursive class signs r, such that neither v Gen r nor Neg(v Gen r)
belongs to Flg(κ) (where v is the free variable of r).
