Statalist



Re: st: is there a -hexdump- command for variables?


From   Sergiy Radyakin <serjradyakin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: is there a -hexdump- command for variables?
Date   Thu, 12 Nov 2009 18:00:39 -0500

Dear Ada,

For numeric variables, setting the display format to %21x is quite useful:
. sysuse auto
(1978 Automobile Data)

. format price %21x

. list price in 1/5

     +-----------------------+
     |                 price |
     |-----------------------|
  1. | +1.0030000000000X+00c |
  2. | +1.28d0000000000X+00c |
  3. | +1.dae0000000000X+00b |
  4. | +1.2d00000000000X+00c |
  5. | +1.e930000000000X+00c |
     +-----------------------+
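
The same format can be tried on single values with -display-, which may make the notation easier to read. This is only a small sketch: the values 4099 and 3799 are price[1] and price[3] from auto.dta, and the float(0.1) comparison is an extra illustration that is not part of the listing above.

```stata
* 4099 = 0x1003, i.e. 1.003 (hex) times 2^12; the exponent 00c is also hex
display %21x 4099

* 3799 is price[3] in auto.dta
display %21x 3799

* %21x shows every bit, so a float and a double holding "0.1" display differently
display %21x 0.1
display %21x float(0.1)
```

The first command should reproduce the first row of the listing, +1.0030000000000X+00c.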


For string variables see below.

Best regards,
    Sergiy Radyakin

net from http://www.adeptanalytics.org/radyakin/stata/hexstring
or
http://www.adeptanalytics.org/radyakin/stata/hexstring/hexstring.exe
for Windows users
or
http://www.adeptanalytics.org/radyakin/stata/hexstring/hexstring.zip
for single file download


 -hexstring- a command to create a new string variable containing ASCII codes
                 of values of another string variable.

Author:	Sergiy Radyakin, Consultant, DECRG, The World Bank
Date:  	12nov2009

Syntax:
	hexstring varname, generate(newvarname) [separator("something") decimal]

Explanation:
	varname - the name of an existing string variable
	newvarname - the name of the new variable to be generated to hold
	             the codes
	separator - if specified, this character will separate the byte codes
	             in the resulting strings
	decimal - if specified, codes are written in decimal; the default (if
	             this option is not specified) is hexadecimal (base 16)

The command issues a warning for every observation where there is not enough
space to store the codes of all the characters in the resulting variable.
Stata's current limit for string variables is 244 characters.

-hexstring- will store as many characters as possible.
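
To see what the byte codes of a single string look like without installing anything, a rough equivalent can be sketched in Mata. This is an illustration only and not how -hexstring- itself is implemented; it assumes the built-in Mata functions ascii(), inbase(), strofreal() and invtokens(), and it does not pad one-digit hexadecimal codes or enforce the string length limit.

```stata
mata:
s     = "AMC Concord"
codes = ascii(s)                      // row vector with one byte code per character
invtokens(strofreal(codes))           // decimal codes, separated by spaces
invtokens(inbase(16, codes))          // the same codes written in base 16
invtokens(inbase(16, codes), ".")     // base 16 with "." as the separator
end
```

Each code takes two or three characters plus a separator, which is why long strings can run into the length limit mentioned above.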

Use command -hexstring_vars- for multiple variables.

Syntax:
	hexstring_vars varlist, stub(string) [separator("something") decimal]

Explanation:
	varlist - is the list of variables to be converted
	stub - is the prefix for the new variable name, e.g. if stub is "codes",
	             then variable "make" will be encoded into "codes_make"
Other options are described above in the help for -hexstring-.

hexstring_test.do and hexstring_vars_test.do are examples of use of
these commands.

Examples:

.
. sysuse auto
. hexstring make, generate("make_hex")

. list make make_hex in 1/5, notrim

       make                                          make_hex
  1.   AMC Concord           41 4d 43 20 43 6f 6e 63 6f 72 64
  2.   AMC Pacer                   41 4d 43 20 50 61 63 65 72
  3.   AMC Spirit               41 4d 43 20 53 70 69 72 69 74
  4.   Buick Century   42 75 69 63 6b 20 43 65 6e 74 75 72 79
  5.   Buick Electra   42 75 69 63 6b 20 45 6c 65 63 74 72 61

.
. display "Decimal character codes"
Decimal character codes

. hexstring make, generate("make_dec") decimal

. list make make_dec in 1/5, notrim

       make                                                   make_dec
  1.   AMC Concord               65 77 67 32 67 111 110 99 111 114 100
  2.   AMC Pacer                          65 77 67 32 80 97 99 101 114
  3.   AMC Spirit                   65 77 67 32 83 112 105 114 105 116
  4.   Buick Century   66 117 105 99 107 32 67 101 110 116 117 114 121
  5.   Buick Electra     66 117 105 99 107 32 69 108 101 99 116 114 97

.
. hexstring make, generate("make_hexdot") separator(".")

. list make make_hexdot in 1/5, notrim

       make                                       make_hexdot
  1.   AMC Concord           41.4d.43.20.43.6f.6e.63.6f.72.64
  2.   AMC Pacer                   41.4d.43.20.50.61.63.65.72
  3.   AMC Spirit               41.4d.43.20.53.70.69.72.69.74
  4.   Buick Century   42.75.69.63.6b.20.43.65.6e.74.75.72.79
  5.   Buick Electra   42.75.69.63.6b.20.45.6c.65.63.74.72.61


*** END OF FILE ***
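
Once the codes show which characters are present, flagging and stripping everything that is not a letter or digit could be sketched with Stata's regular expression functions. This is a hedged example rather than part of -hexstring-: the variable names has_other and make_clean are made up for illustration, and the character class assumes plain ASCII letters and digits are the only characters to keep.

```stata
sysuse auto, clear

* flag observations holding anything other than A-Z, a-z and 0-9
generate byte has_other = regexm(make, "[^A-Za-z0-9]")

* regexr() replaces only the first match, so repeat until nothing is left
clonevar make_clean = make
quietly count if regexm(make_clean, "[^A-Za-z0-9]")
while r(N) > 0 {
    quietly replace make_clean = regexr(make_clean, "[^A-Za-z0-9]", "")
    quietly count if regexm(make_clean, "[^A-Za-z0-9]")
}

list make has_other make_clean in 1/5
```

Note that spaces count as non-alphanumeric here, so "AMC Concord" becomes "AMCConcord"; add a space to the character class if spaces should be kept.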




On Thu, Nov 12, 2009 at 3:26 PM, Ada Ma <heu034@googlemail.com> wrote:
> Hi Statalisters,
>
> The -hexdump- command screens a file to check out what characters
> appear within a file.  I am hoping to do something similar - but only
> with a subset of the variables within the data.
>
> I know that I can do it by saving the variables in a separate file,
> and do a -hexdump- on that file.  I am just wondering if there is
> another command which would allow me to skip that step.
>
> I have a couple of dozen string variables and I want to check that
> they only have alphabetical and numerical characters within them, and
> strip out the characters which aren't.  It's kind of hard to know what
> I need to strip out if I don't know what is in there to be stripped out.
>  Thus my question above.
>
> Many thanks for your help in advance!!
>
> Regards,
> Ada
>
>
> --
> Ada Ma
> Research Fellow
> Health Economics Research Unit
> University of Aberdeen, UK.
> http://www.abdn.ac.uk/heru/
> Tel: +44 (0) 1224 555189
> Fax: +44 (0) 1224 550926
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



