RE: st: Help with string problem

From: Fred Wolfe
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Help with string problem
Date: Fri, 25 Aug 2006 10:51:30 -0500

That is a great -egen- function. But it doesn't seem to omit HEX(A0) completely, unless I have done something wrong, which is always likely.

. use fwbids,clear
. egen apatkey2 = sieve(apatkey), keep(a n o)
. gen l1 = length(apatkey)
. gen l2 = length(apatkey2)

. egen apatkey3 = sieve(apatkey2), omit(space)
. gen l3 = length(apatkey3)

. egen apatkey4 = sieve(apatkey3), keep(a n)
. gen l4 = length(apatkey4)

     +---------------------------------------------------------------------------------------+
     | apatkey        greger   apatkey2       l1   l2   apatkey3       l3   apatkey4      l4 |
     |---------------------------------------------------------------------------------------|
  1. | ABI000000-01        1   ABI000000-01   12   12   ABI000000-01   12   ABI00000001   11 |
  2. | AHR000000           1   AHR000000      12   11   AHR000000      11   AHR000000      9 |
  3. | AHR360227           1   AHR360227      12   11   AHR360227      11   AHR360227      9 |
  4. | ALB431118           1   ALB431118      12   11   ALB431118      11   ALB431118      9 |
  5. | ALD771122           1   ALD771122      12   11   ALD771122      11   ALD771122      9 |
     +---------------------------------------------------------------------------------------+
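
For comparison, here is a minimal sketch of stripping HEX(A0) directly with -subinstr()- and -char(160)- rather than through -egen-. It runs on a one-observation toy dataset, not on fwbids. The toy value and the names nA0, apatkey5 and l5 are made up for illustration, and it is only an assumption that the extra characters in rows 2 to 5 are char(160) bytes plus a blank.

```stata
* Toy data, not the fwbids file: one key padded with two char(160)
* bytes and one ordinary blank (an assumed pattern, for illustration).
clear
set obs 1
gen str12 apatkey = "AHR000000" + char(160) + char(160) + " "

* Count the char(160) bytes: the length falls by one per byte removed.
gen nA0 = length(apatkey) - length(subinstr(apatkey, char(160), "", .))

* Strip every char(160), then trim leading and trailing blanks.
gen apatkey5 = trim(subinstr(apatkey, char(160), "", .))
gen l5 = length(apatkey5)

list apatkey5 nA0 l5
```

If I read the -egenmore- help correctly, omit() in -sieve()- takes literal characters rather than class names, so omit(space) would drop the letters s, p, a, c and e and leave blanks alone. That would be consistent with l3 equal to l2 above, but it is worth checking against the help file.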

At 10:13 AM 8/25/2006, Nick Cox wrote:

"you" here presumably meaning Fred's collaborators.

There is a home-grown -egen- function called -sieve()-
in -egenmore- from SSC that could be used to keep
alphanumeric characters only.

Nick
n.j.cox@durham.ac.uk

Rafal Raciborski

> you could also use the clean() function in excel first, which removes
> all nonprintable characters, before pasting into stata.

Fred Wolfe
National Data Bank for Rheumatic Diseases
Wichita, Kansas
Tel +1 316 263 2125
fwolfe@arthritis-research.org

