
RE: st: String function


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: String function
Date   Mon, 24 Oct 2005 13:15:53 +0100

Just a picky footnote: the initial -generate- 
and repeated -replace- can be avoided here. 

gen var1copy = var1
gen comma = 0
gen length = length(var1)
summ length
local length = r(max)
forvalues i = 1/`length' {
	replace comma = comma + 1 if substr(var1,1,1)==","
	replace var1 = substr(var1,2,.)
}

can thus be slimmed to 

gen comma = 0
gen length = length(var1)
summ length, meanonly 
forvalues i = 1/`r(max)' {
	replace comma = comma + (substr(var1,`i',1) == ",") 
}

or, if you were paid by the reciprocal of the number of 
lines written, to 

gen comma = 0 
forvalues i = 1/`=substr("`: type var1'",4,.)' { 
	replace comma = comma + (substr(var1,`i',1) == ",") 
}
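
Both loops rely on a true comparison evaluating to 1 and a false one to 0, so adding (substr(var1,`i',1) == ",") bumps the counter only where character `i' is a comma. The same count can probably be had with no loop at all, by comparing the length of the string with and without its commas. The following is a minimal sketch, not something from the original exchange: the toy data and the name comma2 are made up for illustration, and it assumes var1 is an ordinary str# variable.

```stata
* Hypothetical toy data; the variable name var1 mirrors the thread.
clear
input str27 var1
"one, two, three"
"one"
",,,,"
end

* Loop version: each true comparison adds 1, each false one adds 0.
gen comma = 0
gen length = length(var1)
summ length, meanonly
forvalues i = 1/`r(max)' {
	replace comma = comma + (substr(var1,`i',1) == ",")
}

* Loop-free sketch: count commas as the length lost when every
* comma is deleted (the final . in subinstr() means "all occurrences").
gen comma2 = length(var1) - length(subinstr(var1, ",", "", .))

* The two counts should agree in every observation.
assert comma == comma2
```

If the -assert- passes silently, the two approaches agree on the data at hand.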

Nick 
[email protected] 

Rafal Raciborski
 
> I did not find an earlier answer either but the below works fine.
> rafal
> 
> 
> . use comma, clear
> 
> . list
> 
>      +-----------------------------+
>      |                        var1 |
>      |-----------------------------|
>   1. |             one, two, three |
>   2. |                         one |
>   3. | one, two, three three, four |
>   4. |          one one, two two,  |
>   5. |                     ,, one  |
>      |-----------------------------|
>   6. |                             |
>   7. |                 one, two,,, |
>   8. |                        ,,,, |
>   9. |                       ,,.., |
> 10. |          ,. .. . ,, ,-, =.  |
>      +-----------------------------+
> 
> . gen var1copy = var1
> (1 missing value generated)
> 
> . gen comma = 0
> 
> . gen length = length(var1)
> 
> . summ length
> 
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>       length |        10        10.8    8.534896          0         27
> 
> . local length = r(max)
> 
> . forvalues i = 1/`length' {
>   2.         replace comma = comma + 1 if substr(var1,1,1)==","
>   3.         replace var1 = substr(var1,2,.)
>   4. }
> 
> <snip>
> 
> . list
> 
>      +-----------------------------------------------------+
>      | var1                      var1copy   comma   length |
>      |-----------------------------------------------------|
>   1. |                    one, two, three       2       15 |
>   2. |                                one       0        3 |
>   3. |        one, two, three three, four       3       27 |
>   4. |                 one one, two two,        2       18 |
>   5. |                            ,, one        2        7 |
>      |-----------------------------------------------------|
>   6. |                                          0        0 |
>   7. |                        one, two,,,       4       11 |
>   8. |                               ,,,,       4        4 |
>   9. |                              ,,..,       3        5 |
> 10. |                 ,. .. . ,, ,-, =.        5       18 |
>      +-----------------------------------------------------+
