Re: st: xt: unit-specific trends

From   Austin Nichols <>
Subject   Re: st: xt: unit-specific trends
Date   Thu, 19 Apr 2012 09:48:19 -0400

László Sándor <>:
No need to run regressions, loop, etc.
You can just use a little algebra and by:
though it will be faster and more accurate in Mata.
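
The algebra in question is presumably the closed-form OLS slope, cov(y,t)/var(t), computed within each unit. What follows is only a sketch of that idea for the simple detrending case (one trend regressor, no level shift), not necessarily the code originally posted. The variable names (id, t, y, detrended) are made up for illustration, and the sketch assumes no missing values.

```stata
* Toy panel: two units, six periods each (all names are placeholders)
clear
set obs 12
generate id = ceil(_n/6)
bysort id: generate t = _n
set seed 12345
generate double y = 2*id + 0.5*id*t + rnormal()

* Within-unit means, then the OLS slope S_ty/S_tt for each unit
bysort id: egen double ybar = mean(y)
bysort id: egen double tbar = mean(t)
generate double ty = (t - tbar)*(y - ybar)
generate double tt = (t - tbar)^2
bysort id: egen double sty = total(ty)
bysort id: egen double stt = total(tt)

* Residual from regressing y on t and a constant, unit by unit
generate double detrended = (y - ybar) - (sty/stt)*(t - tbar)

* Check against -regress- for one unit
quietly regress y t if id == 1
predict double r1 if e(sample), resid
assert reldif(detrended, r1) < 1e-8 if id == 1
```

-egen- is used here only for readability; the same sums can be built with by: and sum(), which avoids the -egen- overhead Nick mentions below.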

If you decide to move into Mata, see also e.g.

2012/4/19 László Sándor <>:
> Quick comments on this:
> I forgot to flag that the residual variable needs to exist beforehand
> for -genbump- below; the program only replaces its values.
> More importantly: The operation is still far, far from linear in the
> number of individuals (N in the panel — T is fixed). I could again
> finish a 1% subsample in around 10 minutes or so, but my bold attempt
> at 10% overnight still only finished 4 out of the 8 variables to be
> transformed this way in 10 or 11 hours.
> Maybe caching and memory are an issue here, but if anybody (StataCorp?)
> has another explanation, a comment would be helpful.
> Maybe firing up _regress and _predict all the time is very costly? Or
> the marksample is not fast enough with the by option? (Does the code
> know that once it finished with seven consecutive rows there is
> nothing to check further below "whether" `touse' is 1 anywhere else? I
> guessed byable commands produce efficient subscripting for some
> underlying Mata code…) Or even the byable command does not use MP
> resources efficiently? (Still, even remaining serial, the speed-up
> could be much closer to linear, no?)
> I thought individual-specific trends are almost as trendy nowadays as
> fixed-effects — I wonder if they could be done much faster.
> Thanks,
> Laszlo
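
For what it is worth, here is a rough Mata sketch of the per-unit regression with a trend and a level shift. It is only an illustration under stated assumptions, not tested on data of this size: the function name unit_detrend and all variable names are hypothetical, the data are assumed sorted by id with no missing values, and the output variable must already exist. It returns y minus the fitted intercept and trend, which equals the residual plus b_post*post.

```stata
* Toy panel (names are placeholders): two units, seven periods each
clear
set obs 14
generate id = ceil(_n/7)
bysort id: generate year = _n
generate byte post = year > 4
set seed 2012
generate double y = id + 0.4*id*year + 2*post + rnormal()
generate double ry = .

mata:
// Hypothetical helper: per-panel OLS of y on (constant, trend, bump)
void unit_detrend(string scalar idvar, string scalar yvar, string scalar trendvar, string scalar bumpvar, string scalar outvar)
{
    real matrix    info, X, Xi
    real colvector id, y, out, yi, b
    real scalar    i

    id   = st_data(., idvar)
    y    = st_data(., yvar)
    X    = st_data(., (trendvar, bumpvar))
    info = panelsetup(id, 1)            // requires data sorted by id
    out  = J(rows(y), 1, .)
    for (i = 1; i <= rows(info); i++) {
        yi = panelsubmatrix(y, i, info)
        Xi = J(rows(yi), 1, 1), panelsubmatrix(X, i, info)
        b  = invsym(cross(Xi, Xi)) * cross(Xi, yi)
        // remove intercept and trend, keep the level shift
        out[|info[i,1] \ info[i,2]|] = (yi - Xi[.,2]*b[2]) :- b[1]
    }
    st_store(., outvar, out)
}
end

sort id year
mata: unit_detrend("id", "y", "year", "post", "ry")

* Check one unit against -regress-
quietly regress y year post if id == 1
assert reldif(ry, y - _b[_cons] - _b[year]*year) < 1e-8 if id == 1
```

The loop touches each row once, so the work should grow roughly linearly in the number of units. With calendar years as the trend, centering year within unit before forming the cross products would probably be safer numerically.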
> 2012/4/18 László Sándor <>:
>> In case anyone cares, this is what I came up with. (Detrends, demeans,
>> and also allows for a level shift.) And this is faster, as I expected.
>> program define genbump, byable(recall, noheader)
>>     version 11
>>     * the variable named in resid() must already exist
>>     syntax =/exp [if] [in], trend(varname) bump(varname) resid(varname)
>>     marksample touse, novarlist   // restricts to the current by-group
>>     tempvar res
>>     quietly {
>>         _regress `exp' `trend' `bump' if `touse'
>>         _predict `res', resid
>>         * residual plus the level shift, written into the existing variable
>>         replace `resid' = `res' + _b[`bump']*`bump' if `touse'
>>     }
>> end
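
A call to the program above would presumably look like the sketch below. It assumes -genbump- has been loaded and borrows the variable names (nid, year, post, assets) from the loop quoted further down; rassets is created first because the program only replaces values.

```stata
* Assumes genbump is defined and nid, year, post, assets exist in memory
sort nid year
generate double rassets = .

* One regression per nid; each unit's residual plus level shift goes into rassets
by nid: genbump = assets, trend(year) bump(post) resid(rassets)
```

Unlike the earlier loop, which wrapped -regress- in -capture-, this would likely stop with an error at the first unit that has too few observations to fit the regression.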
>> 2012/4/18 László Sándor <>:
>>> Thanks, Nick,
>>> I left out a crucial part: I need to run it for observations in the
>>> 10K magnitude (full sample: 400K, but I also try to sample down).
>>> I just had the 200 / 4 mins as a measure of speed.
>>> I would really love to see this speed up.
>>> So I should make the residual-generation a separate command, and make
>>> it byable (but no egen), then? Any other trick up your sleeve?
>>> Gratefully, as always,
>>> Laszlo
>>> On Wed, Apr 18, 2012 at 7:56 PM, Nick Cox <> wrote:
>>>> If a total task takes 3-4 minutes, dots to show progress are
>>>> pointless, in my view.
>>>> -egen- is for convenience. Writing an -egen- function will not speed
>>>> things up; it will just slow things down. Nick
>>>> 2012/4/19 László Sándor <>:
>>>>> Or a quick idea: Shall I write an -egen- extension instead? Or would
>>>>> all the benefits come from its byability anyway?
>>>>> 2012/4/18 László Sándor <>:
>>>>>> Let me get back to this now that I know how fast I am doing using -_dots-.
>>>>>> Now I know it takes 3-4 minutes to loop through 200 cases while all I
>>>>>> do each time is a trivial regression on 4-7 observations and
>>>>>> predicting the residuals.
>>>>>> I would greatly welcome suggestions on how to speed this up relative
>>>>>> to the code below. Most likely the cost lies in checking every case
>>>>>> against the -if- condition when only a few satisfy it. After a single
>>>>>> sort those cases would come in contiguous blocks, which could help,
>>>>>> but I am out of ideas on how to exploit that. Would making the code
>>>>>> "byable" at least use some features of MP?
>>>>>> Thanks!
>>>>>> Laszlo
>>>>>> sum nid, d
>>>>>> _dots 0
>>>>>> forval i = 1/`r(max)' {
>>>>>>     foreach v of varlist assets liabs netassets koejd {
>>>>>>         * trend and level shift for unit `i' only
>>>>>>         cap reg `v' year post if nid == `i'
>>>>>>         if _rc == 0 {
>>>>>>             qui predict resid, resid
>>>>>>             * keep the level shift: residual + b_post*post
>>>>>>             qui replace r`v' = resid + _b[post]*post if e(sample)
>>>>>>             drop resid
>>>>>>         }
>>>>>>     }
>>>>>>     _dots `i' 0
>>>>>> }
>>>>>> 2012/4/13 László Sándor <>:
>>>>>>> Hi all,
>>>>>>> I am trying to demean and detrend my panel data allowing for unit
>>>>>>> specific trends (using Stata 11.0 MP for Windows). I found some
>>>>>>> previous posts about this, but I am not satisfied with the speed of
>>>>>>> the solutions. I would be most happy with a "byable" solution, like
>>>>>>> this pseudocode:
>>>>>>> bys id: {
>>>>>>> reg var t
>>>>>>> pred dtrended_var, res
>>>>>>> }
>>>>>>> I know this is not possible. However, looping through my ids and if
>>>>>>> conditions is not feasible either (or I collect them into a local with
>>>>>>> -levelsof-?). Actually, with all the if conditions, it is not
>>>>>>> attractive either, let alone feasible. (Or if I sort by id, I can use
>>>>>>> -in- ranges in the balanced subset, which I presume to be much
>>>>>>> faster?)
>>>>>>> Or shall I just loop over a new id of consecutive integers, created
>>>>>>> by applying -egen, group()- to the old id (or do the same with -in-
>>>>>>> ranges)?
>>>>>>> I had some hopes about -xtdata- or -areg-, but to no avail. Yet I look
>>>>>>> for some guidance on doing this the right way, if even the simple
>>>>>>> -areg- could have been made faster by "orders of magnitude" from Stata
>>>>>>> 11 to 12…
>>>>>>> Thank you for any thoughts,
>>>>>>> Laszlo
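
The idea of sorting once and then using -in- ranges, raised in the original question, could be sketched as follows. The names (id, t, y, detrended, n_in_unit) are placeholders, the data are a toy panel, and whether this beats the -if- loop on a large panel would need timing; it still runs one -regress- per unit.

```stata
* Toy panel; ids need not be consecutive integers
clear
set obs 12
generate id = 10*ceil(_n/6)
bysort id: generate t = _n
set seed 12345
generate double y = id + 0.3*t + rnormal()
generate double detrended = .

* One sort, then walk the contiguous block of rows belonging to each id
sort id t
by id: generate long n_in_unit = _N
local lo = 1
while `lo' <= _N {
    local hi = `lo' + n_in_unit[`lo'] - 1
    capture regress y t in `lo'/`hi'
    if _rc == 0 {
        tempvar r
        quietly predict double `r' in `lo'/`hi', resid
        quietly replace detrended = `r' in `lo'/`hi'
        drop `r'
    }
    local lo = `hi' + 1
}

* Check one unit against the -if- version
quietly regress y t if id == 10
predict double check if e(sample), resid
assert reldif(detrended, check) < 1e-8 if id == 10
```

Because the block boundaries come from n_in_unit, no -levelsof- or regrouping of the id is needed, and unbalanced units are handled the same way as balanced ones.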
