Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.


RE: st: What is EGEN_Varname and EGEN_SVarname ?


From   Pradipto Banerjee <pradipto.banerjee@adainvestments.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: What is EGEN_Varname and EGEN_SVarname ?
Date   Mon, 30 Jul 2012 17:50:08 -0500

In case it helps to explain why I was asking about EGEN_Varname and EGEN_SVarname: below is the -egenmult- command I have written, which repeatedly calls -egen-. It works as follows:

. sysuse auto, clear

Example # 1
. by foreign, sort: egenmult {test1 test2} = sum({price mpg})

This is equivalent to:

. sort foreign
. by foreign: egen test1 = sum(price)
. by foreign: egen test2 = sum(mpg)

Example # 2
. drop test1 test2
. by foreign, sort: egenmult float {test1 test2} = pctile(price), p({10 90})

This is equivalent to:

. sort foreign
. by foreign: egen float test1 = pctile(price),p(10)
. by foreign: egen float test2 = pctile(price),p(90)
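
For comparison, a plain loop over two parallel lists gives the same result as Example # 1 without any brace parsing. This is only a sketch, not part of -egenmult-: the local names newvars and sources are made up for the illustration, and total() is used because it is the documented name of the -egen- function that older code calls sum().

```stata
sysuse auto, clear

* parallel lists: the i-th new variable is built from the i-th source
local newvars test1 test2
local sources price mpg

local n : word count `newvars'
forvalues i = 1/`n' {
        local new : word `i' of `newvars'
        local src : word `i' of `sources'
        by foreign, sort: egen `new' = total(`src')
}
```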

I've been trying to see if I'm missing anything in the code below. Thanks.

-----
program define egenmult, byable(onecall) sortpreserve
        /* keep the whole command line; compound quotes protect any embedded quotes */
        local fullexp `"`0'"'
        gettoken type 0 : 0, parse(" ")
        local restexp `"`0'"'

        /* gen a list of all possible types */
        local types "byte int long float double"
        forvalues i=1(1)244 {
                local types "`types' str`i'"
        }
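        /* match the first token as a whole word; strpos() would also accept fragments such as "oat" */
        local typegiven : list posof "`type'" in types
        if `typegiven' > 0 local fullexp `"`restexp'"'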
        else local type

        /* get the number & locations of the { } */
        local partexp = `"`fullexp'"'
        local curvefnd = 0
        local numbrak = 0
        local allopenpos
        local allclosepos
        local alloptions
        while `curvefnd' == 0 {
                local openpos = strpos(`"`partexp'"',"{")
                if `openpos'>0 {
                        local allopenpos `allopenpos' `openpos'
                        local closepos = strpos(`"`partexp'"',"}")
                        if `closepos' == 0 error 198
                        local allclosepos `allclosepos' `closepos'
                        local numbrak = `numbrak'+1
                        if `"`alloptions'"'=="" {
                                local newoptions = substr(`"`partexp'"',`openpos'+1,`closepos'-`openpos'-1)
                                local alloptions `"`newoptions'"'
                                local numloop = wordcount(`"`newoptions'"')
                        }
                        else {
                                local newoptions = substr(`"`partexp'"',`openpos'+1,`closepos'-`openpos'-1)
                                if `numloop'!=wordcount(`"`newoptions'"') error 198
                                local alloptions `"`alloptions'"' `"`newoptions'"'
                        }
                }
                else {
                        local curvefnd = 1
                }
                local partexp = subinstr(`"`partexp'"',"{","!",1)
                local partexp = subinstr(`"`partexp'"',"}","!",1)
        }
        local tollen = strlen(`"`partexp'"')

        /* recreate & run the individual commands */
        forvalues iloop=1(1)`numloop' {
                forvalues jbrak=1(1)`numbrak' {
                        if `jbrak'==1 {
                                local jthbrakopen : word `jbrak' of `allopenpos'
                                local newexp = substr(`"`partexp'"',1,`jthbrakopen'-1)
                                local repexp : word `jbrak' of `"`alloptions'"'
                                local repexp = word(`"`repexp'"',`iloop')
                                local newexp = `"`newexp'"' + " " + `"`repexp'"' + " "
                        }
                        else {
                                local jthbrakopen : word `jbrak' of `allopenpos'
                                local jprevbrak = `jbrak'-1
                                local jprevbrakclose : word `jprevbrak' of `allclosepos'
                                local newexp = `"`newexp'"' + " " + substr(`"`partexp'"',`jprevbrakclose'+1,`jthbrakopen'-`jprevbrakclose'-1)
                                local repexp : word `jbrak' of `"`alloptions'"'
                                local repexp = word(`"`repexp'"',`iloop')
                                local newexp = `"`newexp'"' + " " + `"`repexp'"' + " "
                        }
                }
                local jprevbrakclose : word `numbrak' of `allclosepos'
                local newexp = `"`newexp'"' + substr(`"`partexp'"',`jprevbrakclose'+1,`tollen'-`jprevbrakclose')
                if _by() by `_byvars': egen `type' `newexp'
                else egen `type' `newexp'
        }

end

-----

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Pradipto Banerjee
Sent: Monday, July 30, 2012 5:25 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: What is EGEN_Varname and EGEN_SVarname ?

Thanks, Billy. After reading your email, I realized that perhaps my questions weren't clear. So, I thought to rewrite them:

1. One of the things I specifically asked was where the -egen- code figures out that it was called with -by- or -bysort-.

2. On a standalone basis, what does it mean to use "_all" as in `"`args'"' =="_all"? I am already AWARE of the usage(s) of _all in -drop _all-, -program drop _all- etc. This usage of `"`args'"' =="_all" is NOT documented in the programming manual!

3. In the -egen- code, what does the following line achieve, or intend to achieve?

local args : subinstr local args "`_sortindex'"  "", all word

4. Regarding your comment "The global macros need to be defined so they can be passed to the other functions (which is why Nick explicitly mentioned that they are there because local macros would not work in this context).", I was specifically asking: why do they need to be passed on to other functions and why do they need to be defined twice?

5. Yes, I have read the Programming Manual - and none of the questions I have asked can be resolved using the Manual. Kit Baum's book, as the name suggests, is an Introduction. Please let me know if that book specifically addresses any of the questions I asked.

Best.


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of William Buchanan
Sent: Monday, July 30, 2012 5:10 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: What is EGEN_Varname and EGEN_SVarname ?

Hi Pradipto,

I'll try to answer what I can, but it may not be much and others on the list
would likely have much more refined/precise answers.

1. The lines of code:

  if _by() {
                local byopt "by(`_byvars')"
                local cma ","
        }

might be what you are looking for, but you should also read the help file
for the -byable- option in the programming manual.  This will better explain
things.

2. Not sure about all of your question here, but you should consider the
context in which this line occurs

if `"`args'"' == "_all" | `"`args'"' == "*" {
                version 7.0, missing
                unab args : _all
                local args : subinstr local args "`_sortindex'"  "", all word
                version 6.0, missing
        }

Which seems like it is parsing different ways that you could include the
entire varlist in your dataset.  And _all is used in more than just -drop-.
For example, -log query _all- or -program drop _all- are other uses of
"_all".

3. The global macros need to be defined so they can be passed to the other
functions (which is why Nick explicitly mentioned that they are there
because local macros would not work in this context).
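
A minimal sketch of why a local cannot do this job, assuming nothing about the internals of -egen- itself: the program names outerprog and innerprog and the global DEMO_Name are hypothetical. A local is private to the program that defines it, whereas a global stays visible to any program called afterwards, which is presumably also why such globals are set before the call and blanked again once the call has returned.

```stata
capture program drop outerprog
capture program drop innerprog

program define outerprog
        /* a local is private to outerprog */
        local secret "visible only inside outerprog"
        /* a global can be read by any program called from here */
        global DEMO_Name "set by outerprog"
        innerprog
        /* clear the global so it does not linger after the call */
        global DEMO_Name
end

program define innerprog
        display `"local seen by innerprog:  [`secret']"'
        display `"global seen by innerprog: [$DEMO_Name]"'
end

outerprog
```

Run as a do-file, innerprog should show empty brackets for the local and the text set by outerprog for the global.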


Have you read the programming manual for Stata yet?  If not I would highly
suggest going through that and possibly looking into other resources (Kit
Baum's book is pretty awesome).  Otherwise, you could probably just as
easily create a work around using tempvars that you could combine into a
single variable that is stored at the end of the routine with a loop.

HTH,
Billy




-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Pradipto Banerjee
Sent: Monday, July 30, 2012 1:47 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: What is EGEN_Varname and EGEN_SVarname ?

Billy,

I'm primarily trying to understand Stata syntax within the -egen- code to
write my "own" function rather than modify the -egen- code. Perhaps you can
help me. For instance,


1. How does the egen function handle the "by" part:

. by var1: egen ...
. bysort var1 (var2): egen ...

i.e., where in the egen code, does it figure out that it needs to sort
before doing a by operation etc (i.e. where does it figure out that it was
supplied with -by- instead of -bysort- etc). Similarly, in another part of
the function, it's written

. capture noisily _g`fcn' `type' `dummy' = (`args') `if' `in' /*
                */ `cma' `byopt' `options'

Why is the "by" part done this way, i.e. why isn't this given as

. capture noisily by _byvars: _g`fcn' `type' `dummy' = (`args') `if' `in' , `options'


2. In the part of the code where it is written:

(a) . local args : subinstr local args "`_sortindex'"  "", all word

Why are we changing all instances of _sortindex to blanks? Isn't the
sortindex typically like __000000 (or something similar), and it isn't
normally provided when a person is calling the egen function?

(b) . if `"`args'"' =="_all"

Usually, I have seen _all used as in -drop _all-. What does the above part of
the code mean?

3. (a) Why do we need to define EGEN_SVarname and EGEN_Varname to `name' and
_sortindex before calling the _g`fcn', AND (b) why do we need to define them
again to Null?
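
On 2(a) and 2(b), a small sketch may help. The test `"`args'"' == "_all" is an ordinary string comparison: it asks whether the user literally typed _all as the argument. A program declared with -sortpreserve- holds a temporary variable whose name is stored in the local _sortindex, so expanding _all inside such a program would presumably pick that variable up as well, and the -subinstr- line strips it out again. The sketch below mimics this outside any program, with mpg standing in for the temporary variable (an assumption made only so that the example runs interactively).

```stata
sysuse auto, clear

* expand the keyword _all into the full list of variable names
unab args : _all
display "`args'"

* remove one name as a whole word; mpg stands in for `_sortindex'
local args : subinstr local args "mpg" "", all word
display "`args'"
```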

Thanks






-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of William Buchanan
Sent: Monday, July 30, 2012 1:00 PM
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: What is EGEN_Varname and EGEN_SVarname ?

Pradipto,

Another important thing to consider, with regards to Nick's advice, is that
he is basically trying to prevent you from making any changes to your
installed copy of Stata that might "break" it.  If you're changing the
global macros in the egen files then you run the risk of making a change
that prevents egen from working correctly in the future.  If you wanted to
see what those variables are, you could do something like:

viewsource egen.ado (copy the material in the window into the do file editor
or another text editor of your choice)

save your clone as myegen.ado to your personal directory (if you're unsure
where this is you can use the -sysdir- command)

pick the function that you were trying to modify and do the same thing
(e.g., copy the source into a new file myconcat.ado for example).

Now if you make a call to -myegen- make sure that you are using the function
that you've duplicated and you can change the global macros in your
duplicate files.

The point, however, is to avoid editing any of the files that are native to
the Stata installation unless you're willing to risk making potentially
harmful changes.  Additionally, Nick also provided you with an example
earlier illustrating how you could make the egen subroutine that you were
interested in byable, so you could just start working with that code and
modifying it as necessary.
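
A minimal sketch of a byable program, assuming a hypothetical command name showby (this is not Nick's example). With byable(onecall) the program is called once, _by() reports whether a by prefix was used, and the local _byvars holds the by-variables. As far as I understand it, any sorting requested with -bysort- or with -by ..., sort- is done by the prefix before the program starts, so the program itself never needs to tell the two apart.

```stata
capture program drop showby

program define showby, byable(onecall)
        if _by() {
                /* `_byvars' is filled in by Stata for byable programs */
                display as text "called with by-variables: `_byvars'"
        }
        else {
                display as text "called without a by prefix"
        }
end

sysuse auto, clear
showby
by foreign, sort: showby
```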

HTH,
Billy



-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Monday, July 30, 2012 9:41 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: What is EGEN_Varname and EGEN_SVarname ?

I am mixing explanations and advice on different levels. Feel free to take
the explanations when correct and ignore the advice if it doesn't appeal.
Otherwise I think Maarten's reply to you captures my attitude well. It is a
defining characteristic of a discussion list that people aren't obliged to
offer exactly the kind of support that posters of questions might prefer.

Nick

On Mon, Jul 30, 2012 at 8:48 AM, Pradipto Banerjee
<pradipto.banerjee@adainvestments.com> wrote:
> Nick
>
> I don't understand your line of thought ... you actually want me NOT to
> advance my understanding of programming in Stata? Why are some folks allowed
> to understand advanced programming and others not?
>
> Sorry to say this, but one huge negative of Stata is that there is so much
> undocumented stuff - and for efficient programming we have to repeatedly
> rely on email lists like this.
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: Saturday, July 28, 2012 9:33 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: What is EGEN_Varname and EGEN_SVarname ?
>
> These are names of globals defined in -egen- to communicate between
> programs, which manifestly can't be done through locals, apart from
> using non-documented features. If you're following my earlier advice
> you will leave them in peace.
>
> Nick
>
> On 27 Jul 2012, at 16:03, Pradipto Banerjee
> <pradipto.banerjee@adainvestments.com> wrote:
>
>> I noticed some of the codes, e.g. _ggroup2 use EGEN_Varname and
>> EGEN_SVarname and recently one of the codes shared by Nick (called
>> ereplace) use these. What is EGEN_Varname and EGEN_SVarname ?
> *
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

 This communication is for informational purposes only. It is not intended
to be, nor should it be construed or used as, financial, legal, tax or
investment advice or an offer to sell, or a solicitation of any offer to
buy, an interest in any fund advised by Ada Investment Management LP, the
Investment advisor.  Any offer or solicitation of an investment in any of
the Funds may be made only by delivery of such Funds confidential offering
materials to authorized prospective investors.  An investment in any of the
Funds is not suitable for all investors.  No representation is made that the
Funds will or are likely to achieve their objectives, or that any investor
will or is likely to achieve results comparable to those shown, or will make
any profit at all or will be able to avoid incurring substantial losses.
Performance results are net of applicable fees, are unaudited and reflect
reinvestment of income and profits.  Past performance is no guarantee of
future results. All financial data and other information are not warranted
as to completeness or
accuracy and are subject to change without notice.

Any comments or statements made herein do not necessarily reflect those of
Ada Investment Management LP and its affiliates. This transmission may
contain information that is confidential, legally privileged, and/or exempt
from disclosure under applicable law. If you are not the intended recipient,
you are hereby notified that any disclosure, copying, distribution, or use
of the information contained herein (including any reliance thereon) is
strictly prohibited. If you received this transmission in error, please
immediately contact the sender and destroy the material in its entirety,
whether in electronic or hard copy format.

