Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Sergiy Radyakin <serjradyakin@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: loop until "0 real changes made"
Date: Mon, 29 Jul 2013 16:16:36 -0400

-Clonevar- uses the information that the width of the result is known,
so compared to unassisted -generate- it saves, basically, a -compress-
cycle. However, the pure -generate- with type specified is still about
10% faster than -clonevar- in your example (single CPU Stata):

. forval i=1/100 {
      timer on 1
      clonevar s2 = s
      timer off 1

      timer on 2
      gen `:type s' s4 = s
      timer off 2

      drop s2 s4
  }
r; t=44.60 15:58:21

. timer list
   1:     23.10 /      100 =       0.2310
   2:     21.47 /      100 =       0.2147

Best, Sergiy

On Mon, Jul 29, 2013 at 3:42 PM, Robert Picard <picard@netbox.com> wrote:
> Perhaps an example will explain why...
>
> * --------------- begin example ---------------------------
> clear
> set obs 1000000
>
> set rms on
>
> gen s = string(uniform(),"%21x")
> clonevar s2 = s
> gen s3 = s
>
> gen `:type s' s4 = s
>
> * --------------- end example -----------------------------
>
>
>
> On Mon, Jul 29, 2013 at 3:34 PM, Sergiy Radyakin <serjradyakin@gmail.com> wrote:
>> On Mon, Jul 29, 2013 at 3:19 PM, Robert Picard <picard@netbox.com> wrote:
>>> Here's a more complete example of how to continue making substitutions
>>> until there are no more changes. I'm with Nick on using -clonevar-
>>> when making an exact copy of a variable, it is faster than -generate-
>>
>> Pardon my ignorance, but how is -clonevar- (implemented as an ado
>> program) possibly faster than -generate- (built-in), if it is using
>> -generate- inside and on top of that does some other things?? (like
>> copying labels, formats, etc, which are not necessary for this
>> exercise).
>>
>> From clonevar.ado ( 1.0.1 13oct2004):
>> gen `type' `newvar' = `varname' `if' `in'
>>
>> Sergiy
>>
>> .
>>> Also, avoid -regexr()- in Stata 13, it's slow as molasses.
>>>
>>> * --------------- begin example ---------------------------
>>> clear
>>> set obs 100000
>>>
>>> gen AD1 = string(uniform(),"%21x")
>>> gen AD2 = string(uniform(),"%21x")
>>> list in 1/5
>>>
>>> foreach v of var AD* {
>>>     local more 1
>>>     while `more' {
>>>         clonevar stemp = `v'
>>>         replace `v' = subinstr(`v',"0X-","X-",.)
>>>         count if `v' != stemp
>>>         local more = r(N)
>>>         drop stemp
>>>     }
>>> }
>>> list in 1/5
>>> * --------------- end example -----------------------------
>>>
>>>
>>> On Mon, Jul 29, 2013 at 12:34 PM, Sergiy Radyakin
>>> <serjradyakin@gmail.com> wrote:
>>>> Nick's solution with two variables is the most generic approach that
>>>> is useful in situations where it is difficult to predict if any
>>>> changes are going to happen as a result of your code. It certainly is
>>>> going to work here as well (I would only use a tempvar instead of AD2
>>>> and generate instead of clonevar).
>>>>
>>>> However, why would you do this recoding to non-Turkish characters?
>>>> Stata works with Turkish characters like with any other for which a
>>>> corresponding ANSI page is available and proper font is installed:
>>>>
>>>> http://radyakin.org/statalist/2013072901/turkish.png
>>>> http://radyakin.org/statalist/2013072901/turkish.do
>>>>
>>>> The ANSI page for Turkish is 1254. And I would try e.g.:
>>>> replace `v'=regexr(`v', "`=char(196)'+`=char(158)'","`=char(208)'")
>>>> instead of
>>>> replace `v'=regexr(`v', "`=char(196)'+`=char(158)'","G")
>>>>
>>>>
>>>> Best, Sergiy Radyakin
>>>>
>>>> On Mon, Jul 29, 2013 at 10:06 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> Plus the "+" if needed.
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>>
>>>>> On 29 July 2013 15:05, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>> One answer is not to use regular expressions here at all. Use
>>>>>> -subinstr()- with statements like
>>>>>>
>>>>>> replace `v' = subinstr(`v', "`=char(195)'`=char(135)'","C", .)
>>>>>>
>>>>>> Another answer is to set up a count of changes and stop when you hit zero.
>>>>>>
>>>>>> clonevar AD2 = AD
>>>>>>
>>>>>> foreach v of var AD {
>>>>>>     replace AD2 = AD
>>>>>>     <work with AD>
>>>>>>     count if AD2 != AD
>>>>>>     if r(N) == 0 continue, break
>>>>>> }
>>>>>>
>>>>>> Nick
>>>>>> njcoxstata@gmail.com
>>>>>>
>>>>>> On 29 July 2013 14:48, Haluk Vahaboglu <vahabo@hotmail.com> wrote:
>>>>>>
>>>>>>> I am using Stata 12.1 for Linux-64 bit and dealing with Turkish characters in string variables. I convert these Turkish characters (ı, ş, ü etc) to readable equivalents (i, s, u etc). Doing this with the code below:
>>>>>>>
>>>>>>> foreach v of var AD {
>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(135)'","C")
>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(176)'","I")
>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(167)'","c")
>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(182)'","o")
>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(177)'","i")
>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(158)'","G")
>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(159)'","g")
>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(156)'","U")
>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(188)'","u")
>>>>>>>     replace `v'=regexr(`v', "`=char(197)'+`=char(158)'","S")
>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(150)'","O")
>>>>>>>     replace `v'=regexr(`v', "`=char(197)'+`=char(159)'","s")
>>>>>>> }
>>>>>>>
>>>>>>> However, this code cannot accomplish the conversion at the first time. Therefore, I have to do it 5 to 10 times to get a (0 real changes made) message.
>>>>>>> My question is: can I make this loop run automatically until I get the (0 real changes made) message which indicates that all characters are converted.
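For reference, here is a minimal sketch of the -subinstr()- route Nick suggests, written as a loop over the byte pairs listed in the original question. As far as I know, -regexr()- replaces only the first match in each string, which would explain why repeated runs were needed, whereas -subinstr()- with "." as the last argument replaces every occurrence in one pass. The two test observations and the local macro name map are made up for illustration and are not from the thread.

```stata
* Sketch only: replace UTF-8 byte pairs with ASCII letters in a single pass.
clear
set obs 2
gen str10 AD = ""
* hypothetical test data: the byte pairs below encode C-cedilla, g-breve, C-cedilla
replace AD = char(195) + char(135) + "a" + char(196) + char(159) + char(195) + char(135) in 1
replace AD = "plain" in 2

* each triple is: first byte, second byte, ASCII replacement (values from the thread)
local map 195 135 C  196 176 I  195 167 c  195 182 o  196 177 i  196 158 G ///
          196 159 g  195 156 U  195 188 u  197 158 S  195 150 O  197 159 s

foreach v of varlist AD {
    tokenize `map'
    while "`1'" != "" {
        * "." as the fourth argument means: replace all occurrences
        quietly replace `v' = subinstr(`v', char(`1') + char(`2'), "`3'", .)
        macro shift 3
    }
}
list
```

Because every occurrence is handled at once, no outer "repeat until zero changes" loop should be needed for this particular recoding.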
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
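Putting the pieces of the thread together, a minimal sketch of the "loop until no more changes" pattern could look like this. It follows Robert's example but uses a tempvar and -generate- with an explicit storage type, as suggested above. The seed, the smaller number of observations, the use of runiform() in place of uniform(), and the tempvar name before are my own choices for illustration.

```stata
* Sketch only: repeat a substitution until a pass changes nothing.
clear
set obs 1000
set seed 12345                          // assumption: any seed will do
gen AD1 = string(runiform(), "%21x")
gen AD2 = string(runiform(), "%21x")

tempvar before
foreach v of varlist AD* {
    local more 1
    while `more' {
        * copy with the storage type given explicitly, so -generate-
        * does not have to work out the string width itself
        quietly gen `:type `v'' `before' = `v'
        quietly replace `v' = subinstr(`v', "0X-", "X-", .)
        * number of observations changed by this pass
        quietly count if `v' != `before'
        local more = r(N)
        drop `before'
    }
}
list in 1/5
```

The loop ends as soon as a pass leaves every observation unchanged, which is the programmatic equivalent of seeing "(0 real changes made)".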

**Follow-Ups**:
- **Re: st: loop until "0 real changes made"** *From:* Robert Picard <picard@netbox.com>
- **Re: st: loop until "0 real changes made"** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **st: loop until "0 real changes made"** *From:* Haluk Vahaboglu <vahabo@hotmail.com>
- **Re: st: loop until "0 real changes made"** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: loop until "0 real changes made"** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: loop until "0 real changes made"** *From:* Sergiy Radyakin <serjradyakin@gmail.com>
- **Re: st: loop until "0 real changes made"** *From:* Robert Picard <picard@netbox.com>
- **Re: st: loop until "0 real changes made"** *From:* Sergiy Radyakin <serjradyakin@gmail.com>
- **Re: st: loop until "0 real changes made"** *From:* Robert Picard <picard@netbox.com>
