The Stata listserver

RE: st: How blanks are treated when vars are read in as string from an ASCII raw data file


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: How blanks are treated when vars are read in as string from an ASCII raw data file
Date   Thu, 24 Nov 2005 10:45:57 -0000

This is the dreaded loop over observations. In 
most problems it is unnecessary, including this
one. Thanks to the smart people at StataCorp, 
you can go 

gen newvar = strvar1 + substr("000",1, 3 - length(strvar2)) + strvar2 

and the loop comes free as part of the innards of -generate-. 

In Clare's problem, trim(strvar1), trim(strvar2) and 
length(trim(strvar2)) might be needed and would do no 
harm. 
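
A minimal sketch of the one-line approach with trim() added, on made-up data that mirrors Richard's listing below. The leading blank in observation 4 is put there deliberately to imitate the problem, and the newvar2 variant using string() and real() is an illustrative alternative that is not part of the original advice; it assumes strvar2 always holds a number.

```stata
clear
input str3 strvar1 str3 strvar2
"B"   "1"
"B"   "120"
"CCH" "7"
"CCH" "23"
"CCH" "213"
"UW"  "23"
"UW"  "232"
end

* imitate a value that arrived with a leading blank
replace strvar2 = " 23" in 4

* pad the trimmed numeric part to width 3 with leading zeros;
* -generate- works over all observations, so no explicit loop is needed
gen newvar = trim(strvar1) + substr("000", 1, 3 - length(trim(strvar2))) + trim(strvar2)

* alternative (assumption: strvar2 is always numeric; otherwise real()
* returns missing and the result would contain ".")
gen newvar2 = trim(strvar1) + string(real(trim(strvar2)), "%03.0f")

list
```

Without the trim() calls, observation 4 would keep its blank and come out as "CCH 23" rather than "CCH023", because length(" 23") is already 3 and no zeros are added.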

Nick 
[email protected] 

Richard Palmer-Jones
 
> * if your variables are strvar1 & strvar2
> gen str6 newvar = ""
> qui des
> local N = r(N)
> forval i = 1/`N' {
>     local t1 = strvar1[`i']
> 
>     local t2 = strvar2[`i']
>     local leng2 = length("`t2'")
>     local dum2 = ""
>     if `leng2' == 1 {
>         local dum2 = "00"
>     }
>     else if `leng2' == 2 {
>         local dum2 = "0"
>     }
>     local var = "`t1'`dum2'`t2'"
>     replace newvar = "`var'" in `i'
> }
> list
> 
>      | strvar1   strvar2   newvar |
>      |----------------------------|
>   1. |       B         1     B001 |
>   2. |       B       120     B120 |
>   3. |     CCH         7   CCH007 |
>   4. |     CCH        23   CCH023 |
>   5. |     CCH       213   CCH213 |
>      |----------------------------|
>   6. |      UW        23    UW023 |
>   7. |      UW       232    UW232 |
> 
> Richard
> 
> On 11/24/05, Ian Watson <[email protected]> wrote:
> > Clare
> >
> > I've reproduced your problem and compared it with yesterday's solution
> > and can only come up with one suggestion.
> >
> > When you infile the string and then split it, using the substr function,
> > the right hand component (which I called num in yesterday's post) has
> > leading blanks on it. These are then replaced by leading 0s using the
> > subinstr function.
> >
> > However, when you infile the string as two strings, Stata possibly
> > strips the leading blanks from it. Even though it has the designation of
> > a str3 type, it may not have the same "contents" as num did (which was
> > also a str3 type) because that latter was created from substr. That
> > is " 23" and "23" look the same on the screen, but they're not the same
> > data.
> >
> > This is only a guess, and I can't find an easy way to test it. But it
> > suggests you're better off reading your string in as a full string, then
> > splitting it, rather than as two strings. At least that works (even if
> > the reason is not altogether clear to me why).
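
Ian's guess that " 23" and "23" can look alike on screen while being different data can be checked directly. The following is only a sketch on two hand-made observations; the variable names num, len and same_as_23 are chosen for illustration and do not come from Clare's data.

```stata
clear
set obs 2
gen str3 num = " 23" in 1
replace num = "23" in 2

* length() and an equality test tell the two values apart,
* even when a listing shows them looking much the same
gen len = length(num)
gen same_as_23 = (num == "23")
list, clean

* once trimmed, the two values are identical
assert trim(num) == "23"
```

Running length() on the variable as read in by each method would show whether leading blanks survived the infile.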
