Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: algorithmic question : running sum and computations

From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: algorithmic question : running sum and computations
Date: Fri, 17 Aug 2012 12:43:56 +0100

Using your data as a sandpit

.  clear

.  input id    date str1 product quantity

            id       date    product   quantity
  1.  1       1           A           10
  2.  1       2           A           -10
  3.  1       1           B            100
  4.  1       2           B            -50
  5.  1       4           C            15
  6.  1       8           C            100
  7.  1       9           C            -115
  8.  1      10          C            10
  9.  1      11          C            -10
 10.  end

it seems that we are interested in the length of time it takes for
cumulative quantity to return to 0. -sum()- is there for cumulative
sums:

.  bysort id product (date) : gen cumq = sum(q)
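
To see what -sum()- is doing within each group, the running sum can
also be built by hand. This is only a sketch for checking: cumq2 is a
made-up name, and it assumes quantity has no missing values (-sum()-
treats missings as zero, whereas the hand-built version would
propagate them).

```stata
* Hypothetical check variable cumq2: running sum built step by step
bysort id product (date) : gen cumq2 = quantity if _n == 1
* -replace- works through observations in order, so cumq2[_n-1]
* already holds the updated running total
by id product : replace cumq2 = cumq2[_n-1] + quantity if _n > 1
* Should agree with the -sum()- result when nothing is missing
assert cumq2 == cumq
```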

In one jargon, we are interested in "spells" defined by the fact that
they end in 0s for cumulative quantity. In Stata it is easiest to work
with initial conditions defining spells, so we negate the date
variable to reverse time:

.  gen negdate = -date

As dates can be repeated for the same individual, treating data as
panel data requires another fiction, that panels are defined by
individuals and products:

.  egen panelid = group(id product)

Now we can -tsset- the data:

.  tsset panelid negdate
       panel variable:  panelid (unbalanced)
        time variable:  negdate, -11 to -1, but with a gap
                delta:  1 unit

-tsspell- from SSC, which you must install, is a tool for handling
spells. It requires -tsset- data; the great benefit of that is that it
handles panels automatically. (In fact almost all the credit belongs
to StataCorp.) Here the criterion is that a spell is defined by
starting with -cumq == 0-:

.  tsspell, fcond(cumq == 0)

-tsspell- creates three variables with names by default _spell _seq
_end. _end is especially useful: it is an indicator variable for end
of spells (beginning of spells when time is reversed). You can read
more in the help for -tsspell-.
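
As -tsspell- is user-written, it has to be installed once before the
command above will run. A minimal sketch, assuming internet access
from within Stata:

```stata
* One-time installation of the user-written package from SSC
ssc install tsspell
* Documentation for the options and the variables it creates
help tsspell
```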

.  sort id product date

.  l id product date cumq _*

     +---------------------------------------------------+
     | id   product   date   cumq   _spell   _seq   _end |
     |---------------------------------------------------|
  1. |  1         A      1     10        1      2      1 |
  2. |  1         A      2      0        1      1      0 |
  3. |  1         B      1    100        0      0      0 |
  4. |  1         B      2     50        0      0      0 |
  5. |  1         C      4     15        2      3      1 |
     |---------------------------------------------------|
  6. |  1         C      8    115        2      2      0 |
  7. |  1         C      9      0        2      1      0 |
  8. |  1         C     10     10        1      2      1 |
  9. |  1         C     11      0        1      1      0 |
     +---------------------------------------------------+

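You want the mean length of completed spells. With time reversed, each
completed spell starts at an observation with cumq == 0, and its
length is the value of _seq in the observation tagged by _end == 1: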

.  egen meanlength = mean(_seq/ _end), by(id)

This is my favourite division trick: _seq / _end is _seq if _end is 1
and missing if _end is 0; missings are ignored by -egen-'s -mean()-
function, so you get the mean length for each individual. It is
repeated for each observation for each individual so you could go

. egen tag = tag(id)
. l id meanlength if tag
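
A note on units: _seq counts observations within a spell, not elapsed
days. In this example the two happen to give the same mean (2, 3 and 2
observations versus 1, 5 and 1 days both average 7/3), but they need
not agree in general. If elapsed days are wanted, here is a minimal
sketch using only official commands. It assumes the variables id,
product, date and cumq from above; newspell, spellno, days and
meandays are names made up for illustration.

```stata
* A new spell starts at the first observation of each id-product
* group and after any observation where the running sum is back at 0
bysort id product (date) : gen byte newspell = (_n == 1) | (cumq[_n-1] == 0)
* Number the spells within each id-product group
by id product : gen spellno = sum(newspell)
* Elapsed days, recorded once per spell, only if the spell ends at 0
bysort id product spellno (date) : gen days = date[_N] - date[1] if _n == _N & cumq == 0
* Mean over completed spells; missings are ignored by -mean()-
egen meandays = mean(days), by(id)
```

With the sandpit data, days is 1 for product A, 5 and 1 for the two
spells of product C, and missing for product B, so meandays is 7/3 for
id 1. The test cumq == 0 is exact here because the quantities are
integers; with fractional quantities a tolerance may be needed.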

I wrote a tutorial on spells.

 SJ-7-2  dm0029  . . . . . . . . . . . . . . Speaking Stata: Identifying spells
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q2/07   SJ 7(2):249--265                                 (no commands)
        shows how to handle spells with complete control over
        spell specification

which is accessible at
http://www.stata-journal.com/sjpdf.html?articlenum=dm0029

Its principles underlie -tsspell-, but -tsspell- is not even
mentioned, for which there is a mundane explanation. Explaining some
basics as clearly and carefully as I could produced a paper that was
already long and detailed, and adding detail on -tsspell- would just
have made that worse.

For more on spells, see Rowling (1997, 1998, 1999, etc.).

Nick

On Fri, Aug 17, 2012 at 11:30 AM, Francesco <[email protected]> wrote:
> Dear Statalist,
>
> I am stuck with a little algorithmic problem and I cannot find a
> simple (or elegant) solution...
>
> I have a panel dataset as (date in days) :
>
> ID    DATE    PRODUCT QUANTITY
> 1       1           A           10
> 1       2           A           -10
>
> 1       1           B            100
> 1       2           B            -50
>
> 1       4           C            15
> 1       8           C            100
> 1       9           C            -115
>
> 1      10          C            10
> 1      11          C            -10
>
>
>
> and I would like to know the average time (in days) it takes an
> individual to complete a full round trip (the variation in
> quantity is zero).
> For example, for the first id we can see that we have
>
> ID PRODUCT delta_DATE delta_QUANTITY
> 1         A               1=2-1                  0=10-10
> 1         C               5=9-4                  0=15+100-115
> 1         C               1=11-10               0=10-10
>
> so on average individual 1 takes (1+5+1)/3=2.3 days to complete a full
> round trip. Indeed I can discard product B because there is no round
> trip, that is 100-50 is not equal to zero.
>
> My question is therefore ... do you have an idea how to obtain this
> simply in Stata ? I have to average across thousands of individuals... :)
