Stata The Stata listserver

RE: st: -trim()-ed but cannot -destring()-:- hidden text characters?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: -trim()-ed but cannot -destring()-:- hidden text characters?
Date   Wed, 27 Apr 2005 22:17:37 +0100

Glad you solved your problem. I seize 
the opportunity to plug the multi-author 
FAQ on text editors and Stata:

http://fmwww.bc.edu/repec/bocode/t/textEditors.html

Now on -charlist- and -destring-, you are indeed incorrect, but 
forgivably so! 

-charlist- is a user-written program from SSC, 
which I wrote during the lifetime of Stata 7. 
-destring- is an official Stata command. 

It is true that -charlist- did not display any 
exotic characters that are obviously visible 
in the font I tried in the Results window. 

. charlist rank
 0123456789 

However, occurrences of char(160) are 
indicated by the results left in memory 
by -charlist-: 

. ret li

macros:
             r(chars) : " 0123456789 "
          r(sepchars) : "  0 1 2 3 4 5 6 7 8 9   "
             r(ascii) : "32 48 49 50 51 52 53 54 55 56 57 160 "

And -destring- is empowered once that information is 
available: 

. destring rank, ignore(`=char(160)') replace
rank: characters   removed; replaced as int
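
For anyone who wants to try this without the original data, here is a 
minimal sketch. The variable name rank follows the post, but the values 
are invented. It is written in the style of Stata releases before 14, 
as in the command above; in Unicode-aware releases a non-breaking space 
in UTF-8 data would be uchar(160) instead, and I have not checked how 
char(160) behaves inside ignore() there. 

```stata
* Toy data: two "numbers", each carrying a hidden char(160)
clear
set obs 2
generate str8 rank = "12" + char(160)
replace rank = char(160) + "7" in 2

* Without ignore(), -destring- declines and rank stays a string
destring rank, replace
confirm string variable rank

* Telling -destring- to ignore char(160) lets the conversion go through
destring rank, ignore(`=char(160)') replace
confirm numeric variable rank
list rank
```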

I understand char(160) to be what HTML calls 
a non-breaking space, so it is indeed difficult
to see on screen. It is also the highest character 
code present, so it is listed last, and once we realise 
that we can see that Stata did its level best to show it. 
Look carefully at the results of -ret li-: r(chars) ends 
with what looks like an extra space, and r(ascii) ends with 160. 
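
If -charlist- is not installed, the same kind of information can be 
approximated with a loop over character codes. This is only a rough 
sketch of the idea, not how -charlist- itself is written; the data are 
invented and the local macro name found is arbitrary. It looks for 
single bytes, so in Stata 14 or later a UTF-8 non-breaking space would 
presumably be reported as two codes rather than as 160 alone. 

```stata
* Toy data: one value with a hidden char(160), one with a leading space
clear
set obs 2
generate str6 rank = "10" + char(160)
replace rank = " 9" in 2

* Collect every code from 1 to 255 that occurs somewhere in rank
local found
forvalues i = 1/255 {
    quietly count if strpos(rank, char(`i')) > 0
    if r(N) > 0 {
        local found `found' `i'
    }
}
display "codes present: `found'"
```

Any code in the result other than 32 and 48 to 57 is a candidate 
for -destring, ignore()-. 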

How and why char(160) differs from char(32) 
I am hoping someone will be able to explain. 
Perhaps the two characters exist for the convenience of word 
processing software. 
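
Whatever the reason for its existence, the practical difference inside 
Stata is easy to demonstrate: -trim()- strips char(32) but leaves 
char(160) alone, which is why a -trim()-ed string can still refuse to 
convert. A small sketch, using the made-up value "42": 

```stata
* Ordinary spaces are trimmed, so two characters remain
display length(trim(char(32) + "42" + char(32)))

* char(160) is not trimmed, so all four characters remain
display length(trim(char(160) + "42" + char(160)))

* real() therefore returns missing for the char(160) version ...
display real(trim(char(160) + "42" + char(160)))

* ... until char(160) is removed explicitly
display real(subinstr(char(160) + "42" + char(160), char(160), "", .))
```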

This also explains why -list- was no apparent help here. 
-list- showed char(160) but not distinguishably. 
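
One way to make -list- more helpful is to flag the affected 
observations, or to swap char(160) for something visible before 
listing. A sketch with invented data; the names has160 and shown and 
the marker "<160>" are arbitrary choices of mine: 

```stata
* Toy data: only the first value carries a hidden char(160)
clear
set obs 2
generate str8 rank = "10" + char(160)
replace rank = "9" in 2

* Flag observations containing char(160) and show it as a visible marker
generate byte has160 = strpos(rank, char(160)) > 0
generate str20 shown = subinstr(rank, char(160), "<160>", .)
list rank has160 shown
```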

Nick 
n.j.cox@durham.ac.uk 

Daniel Egan
 
> Thanks to Ronan, Roger, and Dan Blanchette (offlist) for assistance
> with this. In the end a text editor was used to erase the hidden
> characters, and Dan Blanchette wrote a nifty little program (specific
> to my dataset?) which evicted the unwanted characters.
> 
> As a final note for clarity's sake, I did in fact -list-, -browse-,
> and even -charlist- the string variables. -charlist- did not return
> any non-numeric elements, thus -destring, ignore()- was impotent.  As
> far as I know therefore, the only means to detect these characters is
> with a text editor.
> 
> If I am incorrect, please let me know. 
> 
> Cheers, 
> Dan
> 
>  
> 
> On 4/26/05, Ronán Conroy <rconroy@rcsi.ie> wrote:
> > Daniel Egan wrote:
> > 
> > > Hello list,
> > >
> > > I am having a strange problem converting strings into numerical
> > > format. I have a string which by all appearances is just a number.
> > >
> > 
> > Daniel's data did indeed contain odd characters which are invisible
> > under normal circumstances. A good text editor will show them and allow
> > you to zap them. Actually, any text editor will do this; a good text
> > editor will do it free.
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 



