Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Robert Picard <picard@netbox.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: loop until "0 real changes made"
Date: Mon, 29 Jul 2013 18:19:31 -0400

You are indeed correct that -clonevar- cannot possibly be faster than
a properly constructed -generate- command since it is implemented as
an ado and ultimately calls -generate-, a built-in command. You
therefore raise the issue of how much overhead is involved in calling
-clonevar-. I would not worry about it:

. clear
. set obs 10
obs was 0, now 10
. timer clear
. gen s = string(uniform(),"%21x")
. local repeat 10000
. forvalues i=1/`repeat' {
  2. timer on 1
  3. clonevar s2 = s
  4. timer off 1
  5. timer on 2
  6. gen `:type s' s4 = s
  7. timer off 2
  8. drop s2 s4
  9. }
.
. timer list
   1:      0.54 /    10000 =      0.0001
   2:      0.06 /    10000 =      0.0000
. dis "overhead for a single call = " (r(t1) - r(t2)) / `repeat'
overhead for a single call = .0000482

On Mon, Jul 29, 2013 at 4:16 PM, Sergiy Radyakin <serjradyakin@gmail.com> wrote:
> -Clonevar- uses the information that the width of the result is known,
> so compared to unassisted -generate- it saves, basically a -compress-
> cycle. However the pure -generate- with type specified is still about
> 10% faster than -clonevar- in your example (single CPU Stata):
>
> . forval i=1/100 {
>   2.
> . timer on 1
>   3. clonevar s2 = s
>   4. timer off 1
>   5.
> . timer on 2
>   6. gen `:type s' s4 = s
>   7. timer off 2
>   8.
> . drop s2 s4
>   9. }
> r; t=44.60 15:58:21
>
> .
> . timer list
>    1:     23.10 /      100 =      0.2310
>    2:     21.47 /      100 =      0.2147
>
> Best, Sergiy
>
>
>
> On Mon, Jul 29, 2013 at 3:42 PM, Robert Picard <picard@netbox.com> wrote:
>> Perhaps an example will explain why...
>>
>> * --------------- begin example ---------------------------
>> clear
>> set obs 1000000
>>
>> set rms on
>>
>> gen s = string(uniform(),"%21x")
>> clonevar s2 = s
>> gen s3 = s
>>
>> gen `:type s' s4 = s
>>
>> * --------------- end example -----------------------------
>>
>>
>>
>> On Mon, Jul 29, 2013 at 3:34 PM, Sergiy Radyakin <serjradyakin@gmail.com> wrote:
>>> On Mon, Jul 29, 2013 at 3:19 PM, Robert Picard <picard@netbox.com> wrote:
>>>> Here's a more complete example of how to continue making substitutions
>>>> until there are no more changes. I'm with Nick on using -clonevar-
>>>> when making an exact copy of a variable, it is faster than -generate-
>>>
>>> Pardon my ignorance, but how is -clonevar- (implemented as an ado
>>> program) possibly faster than -generate- (built-in), if it is using
>>> -generate- inside and on top of that does some other things?? (like
>>> copying labels, formats, etc, which are not necessary for this
>>> exercise).
>>>
>>> From clonevar.ado ( 1.0.1 13oct2004):
>>> gen `type' `newvar' = `varname' `if' `in'
>>>
>>> Sergiy
>>>
>>> .
>>>> Also, avoid -regexr()- in Stata 13, it's slow as molasses.
>>>>
>>>> * --------------- begin example ---------------------------
>>>> clear
>>>> set obs 100000
>>>>
>>>> gen AD1 = string(uniform(),"%21x")
>>>> gen AD2 = string(uniform(),"%21x")
>>>> list in 1/5
>>>>
>>>> foreach v of var AD* {
>>>>     local more 1
>>>>     while `more' {
>>>>         clonevar stemp = `v'
>>>>         replace `v' = subinstr(`v',"0X-","X-",.)
>>>>         count if `v' != stemp
>>>>         local more = r(N)
>>>>         drop stemp
>>>>     }
>>>> }
>>>> list in 1/5
>>>> * --------------- end example -----------------------------
>>>>
>>>>
>>>> On Mon, Jul 29, 2013 at 12:34 PM, Sergiy Radyakin
>>>> <serjradyakin@gmail.com> wrote:
>>>>> Nick's solution with two variables is the most generic approach that
>>>>> is useful in situations where it is difficult to predict if any
>>>>> changes are going to happen as a result of your code.
>>>>> It certainly is going to work here as well (I would only use a
>>>>> tempvar instead of AD2 and generate instead of clonevar).
>>>>>
>>>>> However, why would you do this recoding to non-Turkish characters?
>>>>> Stata works with Turkish characters like with any other for which a
>>>>> corresponding ANSI page is available and proper font is installed:
>>>>>
>>>>> http://radyakin.org/statalist/2013072901/turkish.png
>>>>> http://radyakin.org/statalist/2013072901/turkish.do
>>>>>
>>>>> The ANSI page for Turkish is 1254. And I would try e.g.:
>>>>> replace `v'=regexr(`v', "`=char(196)'+`=char(158)'","`=char(208)'")
>>>>> instead of
>>>>> replace `v'=regexr(`v', "`=char(196)'+`=char(158)'","G")
>>>>>
>>>>>
>>>>> Best, Sergiy Radyakin
>>>>>
>>>>> On Mon, Jul 29, 2013 at 10:06 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>> Plus the "+" if needed.
>>>>>> Nick
>>>>>> njcoxstata@gmail.com
>>>>>>
>>>>>>
>>>>>> On 29 July 2013 15:05, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>> One answer is not to use regular expressions here at all. Use
>>>>>>> -subinstr()- with statements like
>>>>>>>
>>>>>>> replace `v' = subinstr(`v', "`=char(195)'`=char(135)'","C", .)
>>>>>>>
>>>>>>> Another answer is to set up a count of changes and stop when you hit zero.
>>>>>>>
>>>>>>> clonevar AD2 = AD
>>>>>>>
>>>>>>> foreach v of var AD {
>>>>>>>     replace AD2 = AD
>>>>>>>     <work with AD>
>>>>>>>     count if AD2 != AD
>>>>>>>     if r(N) == 0 continue, break
>>>>>>> }
>>>>>>>
>>>>>>> Nick
>>>>>>> njcoxstata@gmail.com
>>>>>>>
>>>>>>> On 29 July 2013 14:48, Haluk Vahaboglu <vahabo@hotmail.com> wrote:
>>>>>>>
>>>>>>>> I am using Stata 12.1 for Linux-64 bit and dealing with Turkish characters in string variables. I convert these Turkish characters (ı, ş, ü etc) to readable equivalents (i, s, u etc).
>>>>>>>> Doing this with the code below:
>>>>>>>>
>>>>>>>> foreach v of var AD {
>>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(135)'","C")
>>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(176)'","I")
>>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(167)'","c")
>>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(182)'","o")
>>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(177)'","i")
>>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(158)'","G")
>>>>>>>>     replace `v'=regexr(`v', "`=char(196)'+`=char(159)'","g")
>>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(156)'","U")
>>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(188)'","u")
>>>>>>>>     replace `v'=regexr(`v', "`=char(197)'+`=char(158)'","S")
>>>>>>>>     replace `v'=regexr(`v', "`=char(195)'+`=char(150)'","O")
>>>>>>>>     replace `v'=regexr(`v', "`=char(197)'+`=char(159)'","s")
>>>>>>>> }
>>>>>>>>
>>>>>>>> However, this code cannot accomplish the conversion at the first time. Therefore, I have to do it 5 to 10 times to get a (0 real changes made) message.
>>>>>>>> My question is: can I make this loop run automatically until I get the (0 real changes made) message which indicates that all characters are converted.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
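For the original question, a loop may not be needed at all. -regexr()- replaces only the first match in each observation per call, which would explain why several passes were required, while -subinstr()- with a missing value as its last argument replaces every occurrence at once. What follows is a minimal sketch along the lines Nick suggested, not a tested solution for the real data. It assumes Stata 12 or 13 and a string variable AD holding UTF-8 byte pairs; the toy observation and the local macro `map' are made up for illustration, and the byte codes are copied from the original post.

```stata
* Sketch only; run from a do-file (the /// continuation needs one).
* Assumes Stata 12/13 and a string variable AD holding UTF-8 byte pairs.
clear
set obs 1
* Toy data (hypothetical): C-cedilla, "a", dotless i, C-cedilla.
gen AD = char(195) + char(135) + "a" + char(196) + char(177) + char(195) + char(135)

* Triples of: first byte, second byte, ASCII replacement (from the original post).
local map 195 135 C  196 176 I  195 167 c  195 182 o  196 177 i  196 158 G ///
          196 159 g  195 156 U  195 188 u  197 158 S  195 150 O  197 159 s

tokenize `map'
while "`1'" != "" {
    * A missing last argument tells subinstr() to replace all occurrences.
    replace AD = subinstr(AD, char(`1') + char(`2'), "`3'", .)
    macro shift 3
}
list
```

If a "repeat until nothing changes" loop is still wanted for other kinds of edits, the -clonevar-/-count- pattern quoted above remains the generic approach.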

**References**:

- **st: loop until "0 real changes made"**, *From:* Haluk Vahaboglu <vahabo@hotmail.com>
- **Re: st: loop until "0 real changes made"**, *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: loop until "0 real changes made"**, *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: loop until "0 real changes made"**, *From:* Sergiy Radyakin <serjradyakin@gmail.com>
- **Re: st: loop until "0 real changes made"**, *From:* Robert Picard <picard@netbox.com>
- **Re: st: loop until "0 real changes made"**, *From:* Sergiy Radyakin <serjradyakin@gmail.com>
- **Re: st: loop until "0 real changes made"**, *From:* Robert Picard <picard@netbox.com>
- **Re: st: loop until "0 real changes made"**, *From:* Sergiy Radyakin <serjradyakin@gmail.com>
