Re: st: What is EGEN_Varname and EGEN_SVarname ?


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: What is EGEN_Varname and EGEN_SVarname ?
Date   Tue, 31 Jul 2012 09:13:01 -0500

A fuller comment is difficult, given that you have not formally declared
exactly what you want to support and that non-trivial code needs to be
tested through a detailed script. But a very quick glance at your code
identifies various problems that you may want to know about. (As
indicated earlier, I am not in sympathy with the overall goal. Like
Nick Winter, I recommend just looping over -egen- calls, which in my
experience is more efficient and less error-prone than trying to
program like this.)
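
For concreteness, here is a minimal sketch of what looping over -egen-
calls could look like for the two examples in the quoted message below.
The new variable names (test1, test2, price_p10, price_p90) are only
illustrative, and -total()- is used in place of the older -sum()-.

```stata
sysuse auto, clear

* Example 1 as a loop: one -egen- call per (new variable, source) pair
local newvars test1 test2
local sources price mpg
local n : word count `sources'
forvalues i = 1/`n' {
    local new : word `i' of `newvars'
    local src : word `i' of `sources'
    by foreign, sort: egen `new' = total(`src')
}

* Example 2 as a loop: one -egen- call per percentile
foreach p of numlist 10 90 {
    by foreign, sort: egen float price_p`p' = pctile(price), p(`p')
}
```

Each call can then carry its own variable type and options, which the
braces syntax would otherwise have to encode somehow.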

1. As pointed out before, your local macro -types- will be more than
244 characters long, so -strpos()- will sometimes fail to do what you
want it to do. Most commands that allow a variable type to be specified
just pass it to -generate- and let that report any syntax error in the
type. Another way to test a variable type is to note that, for a valid
type, one of the two -generate- statements below should always work:

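tempvar foo
* try the type as numeric first, then as string
capture gen `type' `foo' = 1
if _rc capture gen `type' `foo' = "1"
* _rc is still nonzero only if both attempts failed
if _rc {
      di as err "`type' invalid variable type"
      exit 198
}
drop `foo'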

2. You seem to be presuming that at most a single explicit variable
type will be sufficient for a multiple call to -egen-. In practice that
will be a fair assumption for most numeric problems, but it is likely
to be wrong for at least some string problems.

3. As in #1, some of your other manipulations will fail to work
properly with strings longer than 244 characters. (That -wordcount()-
will bite you was raised in a different thread yesterday.)

4. As I understand it, your syntax does not test for -if-, -in- or
missing values at the outset, but lets each individual -egen- call
sort that out. That's your call as program author, but note that it
could lead to inconsistencies across different input variables,
especially with missing values. It's a more usual Stata standard to
use -marksample- to identify the subset of observations with
non-missing values on all variables specified.

5. Your -program- definition lacks a -version- statement.

6. There is some rather ad hoc parsing. The -parse- command would make
some of your coding easier (and easier to follow).
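
To make #3, #4 and #5 concrete, here is a rough sketch rather than a
tested replacement for your program. The program name -commontotal- and
its -prefix()- option are made up for illustration, and -version 12-
assumes that is the release you are writing for. It uses extended macro
functions rather than -wordcount()- and -word()-, on the understanding
that they do not share the limit on string expressions.

```stata
capture program drop commontotal
program define commontotal
    version 12                   // #5: tie behaviour to one Stata release
    syntax varlist(numeric) [if] [in], Prefix(name)

    * #4: one common sample for every call: -if- and -in- are honoured
    * and observations missing on any variable in varlist are excluded
    marksample touse

    * #3: extended macro functions instead of wordcount() and word()
    local n : word count `varlist'
    forvalues i = 1/`n' {
        local v : word `i' of `varlist'
        egen double `prefix'`v' = total(`v') if `touse'
    }
end

sysuse auto, clear
commontotal price rep78 if foreign, prefix(tot_)
```

Here tot_price and tot_rep78 are computed over the same observations,
even though rep78 has missing values and price does not.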

Nick

On Mon, Jul 30, 2012 at 5:50 PM, Pradipto Banerjee
<pradipto.banerjee@adainvestments.com> wrote:
> In case it helps to explain why I was asking about EGEN_Varname and EGEN_SVarname: this is the code for the -egenmult- command I have written, which repeatedly calls -egen-. It works as follows:
>
> . sysuse auto, clear
>
> Example # 1
> . by foreign, sort: egenmult {test1 test2} = sum({price mpg})
>
> It is equivalent to:
>
> . sort foreign
> . by foreign: egen test1 = sum(price)
> . by foreign: egen test2 = sum(mpg)
>
> Example # 2
> . drop test1 test2
> . by foreign, sort: egenmult float {test1 test2} = pctile(price), p({10 90})
>
> It is equivalent to:
>
> . sort foreign
> . by foreign: egen float test1 = pctile(price),p(10)
> . by foreign: egen float test2 = pctile(price),p(90)
>
> I've been trying to see if I'm missing anything in the code below. Thanks.
>
> -----
> program define egenmult, byable(onecall) sortpreserve
>         local fullexp `0'
>         gettoken type 0 : 0, parse(" ")
>         local restexp `"`0'"'
>
>         /* gen a list of all possible types */
>         local types "byte int long float double"
>         forvalues i=1(1)244 {
>                 local types "`types' str`i'"
>         }
>         local typegiven strpos(`"`types'"',`"`type'"')
>         if `typegiven' > 0 local fullexp `restexp'
>         else local type
>
>         /* get the number & locations of the { } */
>         local partexp = `"`fullexp'"'
>         local curvefnd = 0
>         local numbrak = 0
>         local allopenpos
>         local allclosepos
>         local alloptions
>         while `curvefnd' == 0 {
>                 local openpos = strpos(`"`partexp'"',"{")
>                 if `openpos'>0 {
>                         local allopenpos `allopenpos' `openpos'
>                         local closepos = strpos(`"`partexp'"',"}")
>                         if `closepos' == 0 error 198
>                         local allclosepos `allclosepos' `closepos'
>                         local numbrak = `numbrak'+1
>                         if `"`alloptions'"'=="" {
>                                 local newoptions = substr(`"`partexp'"',`openpos'+1,`closepos'-`openpos'-1)
>                                 local alloptions `"`newoptions'"'
>                                 local numloop = wordcount(`"`newoptions'"')
>                         }
>                         else {
>                                 local newoptions = substr(`"`partexp'"',`openpos'+1,`closepos'-`openpos'-1)
>                                 if `numloop'!=wordcount(`"`newoptions'"') error 198
>                                 local alloptions `"`alloptions'"' `"`newoptions'"'
>                         }
>                 }
>                 else {
>                         local curvefnd = 1
>                 }
>                 local partexp = subinstr(`"`partexp'"',"{","!",1)
>                 local partexp = subinstr(`"`partexp'"',"}","!",1)
>         }
>         local tollen = strlen(`"`partexp'"')
>
>         /* recreate & run the individual commands */
>         forvalues iloop=1(1)`numloop' {
>                 forvalues jbrak=1(1)`numbrak' {
>                         if `jbrak'==1 {
>                                 local jthbrakopen : word `jbrak' of `allopenpos'
>                                 local newexp = substr(`"`partexp'"',1,`jthbrakopen'-1)
>                                 local repexp : word `jbrak' of `"`alloptions'"'
>                                 local repexp = word(`"`repexp'"',`iloop')
>                                 local newexp = `"`newexp'"' + " " + `"`repexp'"' + " "
>                         }
>                         else {
>                                 local jthbrakopen : word `jbrak' of `allopenpos'
>                                 local jprevbrak = `jbrak'-1
>                                 local jprevbrakclose : word `jprevbrak' of `allclosepos'
>                                 local newexp = `"`newexp'"' + " " + substr(`"`partexp'"',`jprevbrakclose'+1,`jthbrakopen'-`jprevbrakclose'-1)
>                                 local repexp : word `jbrak' of `"`alloptions'"'
>                                 local repexp = word(`"`repexp'"',`iloop')
>                                 local newexp = `"`newexp'"' + " " + `"`repexp'"' + " "
>                         }
>                 }
>                 local jprevbrakclose : word `numbrak' of `allclosepos'
>                 local newexp = `"`newexp'"' + substr(`"`partexp'"',`jprevbrakclose'+1,`tollen'-`jprevbrakclose')
>                 if _by() by `_byvars': egen `type' `newexp'
>                 else egen `type' `newexp'
>         }
>
> end
>
> -----
>