
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: Regression by industry and year excluding firm i


From   "Sarah Edgington" <sedging@ucla.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Regression by industry and year excluding firm i
Date   Fri, 13 Dec 2013 11:41:13 -0800

Ahmed,
As an aside, this strikes me as one of those instances where you would
benefit a great deal from debugging your code on a subset of your data.  You
need enough data for your regressions to run without errors but I'd try
getting the loop working on a subset of a few hundred observations rather
than the whole data set.  That will run much more quickly.  The resulting
predictions will be nonsense but they'll serve as a proof of concept.  Once
you're happy that you have code that does what you expect you can run it on
the whole dataset with a certain amount of confidence that even if it takes
a very long time, you'll get the results that reflect your intended process.
-Sarah
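
A minimal sketch of that subset-first workflow, not anyone's actual code from this thread. It assumes the shipped auto dataset as a stand-in for the real data, with foreign playing the role of the industry-year group and price and weight standing in for the model variables; all of those names are illustrative only.

```stata
* Sketch only: debug a leave-one-out regression loop on a small subset.
* Assumption: auto.dta stands in for the real data; foreign plays the
* role of the industry-year group, price/weight the model variables.
sysuse auto, clear
set seed 12345
sample 40, count              // keep 40 random observations for a quick run

gen double b0 = .
gen double b1 = .

forvalues i = 1/`=_N' {
    * same group as observation i, excluding observation i itself
    local same foreign[`i'] == foreign
    quietly count if `same' & _n != `i'
    if r(N) > 10 {
        capture quietly regress price weight if `same' & _n != `i'
        if _rc == 0 {
            quietly replace b0 = _b[_cons] in `i'
            quietly replace b1 = _b[weight] in `i'
        }
    }
}

* predictions from a tiny subset are nonsense, but they prove the loop works
gen double pred_price = b0 + b1*weight
summarize b0 b1 pred_price
```

Once the loop behaves as expected on the subset, dropping the -sample- line runs the same code on the full data.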

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Fernando Rios
Avila
Sent: Friday, December 13, 2013 11:21 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Regression by industry and year excluding firm i

Ahmed,
In addition to Nick Cox's comments, keep in mind that, based on your
explanation, you need to run 95000 regressions, which will be very time
consuming. But computer time is "cheap".
I would suggest, however, that you clarify whether each observation
represents a different firm, which is the assumption that both your code and
Nick's make in handling the problem.
Fernando
HTH
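
One way to check that assumption and to count the regressions before starting, as a rough sketch. It assumes the shipped auto dataset as a stand-in, with make as a hypothetical firm identifier and foreign as the industry-year group; the real data would need its own identifier variable.

```stata
* Sketch only: is there one observation per firm within each group,
* and how many leave-one-out regressions would the loop run?
* Assumption: make is a hypothetical firm id, foreign the group.
sysuse auto, clear

* -isid- exits with an error if make and foreign do not uniquely identify rows
capture isid make foreign
display cond(_rc == 0, "one observation per firm within group", ///
    "firms repeat within group")

* one regression per observation whose group has more than 10 other members
bysort foreign: gen long n_group = _N
count if n_group - 1 > 10
display "regressions to run: " r(N)
```

If firms repeat within a group, excluding only the current observation (_n != `i') would leave that firm's other observations in the estimation sample.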

On Fri, Dec 13, 2013 at 2:12 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> Sorry, no.
>
> The code hasn't finished running, so
>
> 1. Good news. No obvious bug.
>
> 2. I'd expect that code to be slow. You want a regression for every 
> observation.
>
> I don't think you've demonstrated anything wrong with my code, so I 
> can't possibly fix it. That doesn't mean the code must be right, but 
> you need to show me incorrect results first. The point is that your 
> code would, I imagine, have been even slower had it been correct.
> Several of the changes I made would have speeded up things compared 
> with your code.
>
> I don't have your data to test anything, but without wanting to seem 
> arrogant, I think you need to be confident that I made a mistake 
> before you change my code.
>
> Nick
> njcoxstata@gmail.com
>
>
> On 13 December 2013 19:01, Abdalla, Ahmed <ahmed.abdalla@kcl.ac.uk> wrote:
>> Dear Nick
>> Many Thanks for that.
>> I understand your code now. I ran it. However, STATA has been running the
>> loop for more than 40 minutes now and I got no output !!!
>> I will explain more:
>> I have a model:
>> wce = b0 + b1*wlag_ce + b2*wato + b3*wlag_acc + b4*wacc + b5*wdsale + b6*wndsale
>>
>> I want to run this model using all observations in a particular
>> industry-year excluding firm i. Expected wce for firm i is measured using
>> the coefficients I obtain from the industry-year regressions multiplied by
>> the actual values of the variables in the model for firm i.
>> As far as I understand, your code should achieve my target, but it took a
>> long time and didn't give any results!
>> I even tried other code that worked well and gave me results in seconds,
>> but it doesn't exclude firm i from the estimation. Here is that code:
>> egen sic2id=group(sic_2 datadate)
>> egen count=count(sic2id), by(sic2id)
>> drop if count<10
>> drop count
>> drop sic2id
>> egen sic2id=group(sic_2 datadate)
>>
>> gen b0=.
>> gen b1= .
>> gen b2=.
>> gen b3=.
>> gen b4=.
>> gen b5=.
>> gen b6=.
>>
>> sum sic2id
>> scalar max2=r(max)
>> local k=max2
>> set more off
>> forvalues x=1(1)`k'{
>> capture reg wce wlag_ce wato wlag_acc wacc wdsale wndsale if sic2id==`x'
>> capture replace b0= _b[_cons]
>> capture replace b1= _b[wlag_ce]
>> capture replace b2= _b[wato]
>> capture replace b3= _b[wlag_acc]
>> capture replace b4= _b[wacc]
>> capture replace b5= _b[wdsale]
>> capture replace b6= _b[wndsale]
>> }
>>
>> I would appreciate it if you could explain what was wrong with your code
>> and update the new code I have posted here to exclude firm i.
>>
>>
>>
>>
>> ________________________________________
>> From: owner-statalist@hsphsun2.harvard.edu 
>> <owner-statalist@hsphsun2.harvard.edu> on behalf of Nick Cox 
>> <njcoxstata@gmail.com>
>> Sent: 13 December 2013 18:03
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: Regression by industry and year excluding firm i
>>
>> Remarks
>>
>> 1. If you are cycling over observations, you don't need a variable 
>> containing observation numbers, nor to use -levelsof-.
>>
>> 2. -in- is always faster than the corresponding -if-.
>>
>> 3. wlag_ce=!=. is presumably a typo, but to Stata it will be illegal
>> syntax.
>>
>> 4. -capture replace b0= _b[_cons]- will end with the last intercept 
>> calculated. I guess you don't want that.
>>
>> 5. Checking for missing values is redundant as -regress- will never 
>> include them.
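
Remarks 2 and 4 can be illustrated with a small sketch. It assumes the shipped auto dataset as a stand-in, with foreign as the group and price and weight as the model variables; none of these names come from the original data.

```stata
* Sketch only: -replace- without a qualifier versus -replace ... in-.
* Assumption: auto.dta stands in for the real data.
sysuse auto, clear
gen double b0 = .

quietly regress price weight if foreign == 0
* no qualifier: every observation receives this intercept
replace b0 = _b[_cons]

quietly regress price weight if foreign == 1
* again no qualifier: the earlier value is overwritten everywhere,
* so b0 ends up holding only the last intercept calculated
replace b0 = _b[_cons]

* with -in-, only the named observation is changed
gen double b0_i = .
replace b0_i = _b[_cons] in 1
count if !missing(b0_i)
```

Using -in `i'- rather than -if obs == `i'- also avoids evaluating a condition on every observation at each pass through the loop.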
>>
>> With these and some other small tricks, here is an attempt at 
>> rewriting your code.
>>
>> local X wlag_ce wato wlag_acc wacc wdsale wndsale
>> tokenize "`X'"
>>
>> forval j = 0/6 {
>> gen b`j'=.
>> }
>>
>> forval i = 1/`=_N' {
>> local same sic_2[`i'] == sic_2 & datadate[`i'] == datadate
>> qui count if `same' & _n != `i'
>>
>> if r(N) > 10 {
>> reg wce `X' if `same' & _n != `i'
>> }
>>
>> quietly if _rc == 0 {
>> replace b0 = _b[_cons] in `i'
>> forval j = 1/6 {
>> replace b`j' = _b[``j''] in `i'
>> }
>> }
>> }
>>
>> gen pred_ce= b0 + b1*wlag_ce + b2*wato + b3*wlag_acc + ///
>> b4*wacc + b5*wdsale + b6*wndsale
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 13 December 2013 17:33, Abdalla, Ahmed <ahmed.abdalla@kcl.ac.uk> wrote:
>>> Dear Statalist
>>> I run a regression to estimate core earnings for each variable in my
>>> dataset. The regression is run using all observations in a particular
>>> industry year EXCLUDING firm i. Expected core earnings for firm i is
>>> estimated using the coefficients multiplied by the actual values of
>>> variables in the model for firm i.
>>> I run the following code.
>>>
>>> First: I get an error message for macro length being exceeded.
>>> Second: When I try other commands for looping, the loop runs but it
>>> gives me an error message for invalid syntax.
>>> My problem is how to exclude firm i. I would welcome any suggestions
>>> on running regressions by industry and year while excluding firm i from
>>> the estimation.
>>>
>>>
>>> gen obs= [_n]
>>> gen runn=1
>>>
>>> gen b0=.
>>> gen b1= .
>>> gen b2=.
>>> gen b3=.
>>> gen b4=.
>>> gen b5=.
>>> gen b6=.
>>>
>>> levelsof obs,local(levels)
>>> foreach x of local levels{
>>> gen mark=1 if obs==runn
>>> gen sic_lp= sic_2 if obs ==runn
>>> qui summ sic_lp
>>> replace sic_lp = r(mean) if sic_lp==.
>>> gen datadate_lp= datadate if obs == runn
>>> qui summ datadate_lp
>>> replace datadate_lp = r(mean) if datadate_lp==.
>>> format datadate_lp %d
>>> gen sample =1 if sic_lp== sic_2 & datadate_lp== datadate & sale !=. & wce !=. & wlag_ce=!=. & wato !=. & wacc !=. & wlag_acc!=. & wdsale !=. & wndsale !=.
>>> egen sample_sum= sum(sample) if mark != 1
>>> capture reg wce wlag_ce wato wlag_acc wacc wdsale wndsale if sample==1 & mark != 1 & sample_sum >10
>>> capture replace b0= _b[_cons]
>>> capture replace b1= _b[wlag_ce] if obs==runn
>>> capture replace b2= _b[wato] if obs==runn
>>> capture replace b3= _b[wlag_acc] if obs==runn
>>> capture replace b4= _b[wacc] if obs==runn
>>> capture replace b5= _b[wdsale] if obs==runn
>>> capture replace b6= _b[wndsale] if obs==runn
>>> drop mark sic_lp datadate_lp sample sample_sum
>>> replace runn= runn+1
>>> }
>>>
>>> gen pred_ce= b0+ b1*wlag_ce + b2*wato +b3*wlag_acc + b4*wacc + b5*wdsale + b6*wndsale
>>>
>>>
>>> I appreciate your help
>>>
>>>
>>>
>>>
>>>
>>>
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/

