Statalist



st: RE: dropping digits from variables in Stata 10


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: dropping digits from variables in Stata 10
Date   Mon, 23 Feb 2009 11:13:50 -0000

These identifiers should be read in and maintained as string variables.
Then you can extract the first three digits using e.g.
-substr(str_zipcode, 1, 3)-.  Here -str_zipcode- is a string variable
version of a numeric -zipcode-. 

Retrospective surgery to get a string variable with leading zeros from a
numeric variable with integer values is 

gen str_zipcode = substr("0000000", 1, 7 - length(string(zipcode))) + string(zipcode)

or (more elegantly) 

gen str_zipcode = string(zipcode, "%07.0f") 

However, note that unless your variable was read in as a long or double
you may have lost accuracy in the final digits (a float holds integers
exactly only up to 16,777,216), so re-doing the input and insisting on
string variables is likely to be by far the safest way.
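
As a minimal sketch of the two steps on made-up data: the three zipcode
values and the variable name -zip3- are assumptions for illustration only,
not taken from the original dataset.

```stata
* Hypothetical toy data; values are invented for illustration
clear
input long zipcode
37845
1234567
560001
end

* Pad to seven characters with leading zeros
gen str7 str_zipcode = string(zipcode, "%07.0f")

* Take the first three characters of the padded string
gen str3 zip3 = substr(str_zipcode, 1, 3)

* Check the first observation: 37845 should become "0037845" and "003"
assert str_zipcode == "0037845" in 1
assert zip3 == "003" in 1

list zipcode str_zipcode zip3
```

This only repairs the display of values that survived intact as numbers; it
cannot recover digits already lost on input.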

Nick 
[email protected] 

Ekaterina Hertog

1) I have a variable consisting of 7-digit zipcodes. I want to 
create a second variable which will consist only of the first 3 digits 
of each zipcode, and I cannot find a way to do it.
2) Some of the zipcodes start with 00, e.g. 0037845, and Stata drops the 
leading 00, turning such zipcodes into 5-digit numbers (e.g. 37845). I need
to make Stata understand that these 00 are meaningful and restore them 
to the zipcodes, and I cannot find how to do this. They were 
present in my original csv file, but when I converted it into a Stata 
file using Stat/Transfer they were gone.
