
From |
n j cox <n.j.cox@durham.ac.uk> |

To |
n j cox <n.j.cox@durham.ac.uk> |

Subject |
Re: st: running sum restarting after missing value |

Date |
Fri, 20 Jul 2007 15:09:41 +0100 |

Just as a follow-up: Given that you have -tsset- your data

. tsset id year
       panel variable:  id, 1 to 1
        time variable:  year, 1984 to 1989

and installed -tsspell- from SSC, then spells are defined by -var1- being non-missing:

. tsspell , cond(!missing(var1))

This automatically produces three variables, by default _seq, _spell, _end.

. l

     +---------------------------------------------------+
     | id   year   var1   sum_var   _seq   _spell   _end |
     |---------------------------------------------------|
  1. |  1   1984      1         1      1        1      0 |
  2. |  1   1985      1         2      2        1      0 |
  3. |  1   1986      1         3      3        1      1 |
  4. |  1   1987      .         .      0        0      0 |
  5. |  1   1988      1         1      1        2      0 |
     |---------------------------------------------------|
  6. |  1   1989      1         2      2        2      1 |
     +---------------------------------------------------+

You could then type

. bysort id _spell (year) : gen sum_var1 = sum(var1)

n j cox wrote:

This is confusing. You say you want a running sum, but your example does not show one. I am going to trust your initial statement and ignore your example.

-sum()- is trained to ignore missings. Usually this is a feature, but not for your purposes.

One take on this is that you evidently regard runs of non-missing values as distinct spells. Identifying such spells explicitly would enable you to do something like this:

    by id spell : gen sum_var1 = sum(var1)

and with one bound you are then home free.
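As a rough illustration of the spell idea using official Stata only, here is a minimal sketch on the example data from the original question. The variable names newspell, spell and total_var1 are made up for this sketch and do not come from -tsspell- or from the post. It assumes the data are identified by id and year.

```stata
* Example data from the original question
clear
input id year var1
1 1984 1
1 1985 1
1 1986 1
1 1987 .
1 1988 1
1 1989 1
end

* A spell starts wherever var1 is non-missing and the previous value
* within the same id is missing (or there is no previous observation)
bysort id (year) : gen byte newspell = !missing(var1) & missing(var1[_n-1])

* Number the spells within id; observations with missing var1 get no spell
by id : gen spell = sum(newspell) if !missing(var1)

* Running sum within each spell
bysort id spell (year) : gen sum_var1 = sum(var1) if !missing(var1)

* Spell total copied to every observation of the spell
by id spell : gen total_var1 = sum_var1[_N] if !missing(var1)

sort id year
list id year var1 spell sum_var1 total_var1, sepby(id)
```

On these six observations sum_var1 should come out as 1, 2, 3, ., 1, 2. total_var1 should come out as 3, 3, 3, ., 2, 2, which is what the example output in the question shows.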

-tsspell- on SSC is one user-written tool for working with spells. I took a step back from that to explain, in excruciating detail, how to do it all (and more) from first principles in

Cox, N.J. 2007. Identifying spells. Stata Journal 7(2): 249-265.

In this particular problem there are various direct ways of restarting, without identifying spells of non-missing values as distinct spells. Here's one:

    gen sum_var1 = .
    bysort id (year) : replace sum_var1 = cond(missing(var1[_n-1]), var1, var1 + sum_var1[_n-1])

For sources of information on -cond()-, type -search cond()-.

In words, if the previous one is missing, the sum becomes the present value; otherwise add the present value to the sum so far. Missings will map to missings on this rule.
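To see the rule at work, here is a small self-contained sketch that applies the same -cond()- approach to the example data from the original question. The -assert- lines are additions for checking only. The sketch relies on -replace- working through the observations in order, so that sum_var1[_n-1] has already been updated when the current observation is processed.

```stata
* Example data from the original question
clear
input id year var1
1 1984 1
1 1985 1
1 1986 1
1 1987 .
1 1988 1
1 1989 1
end

gen sum_var1 = .

* Restart after a missing value; otherwise add to the sum so far.
* For the first observation of each id, var1[_n-1] is missing,
* so the sum starts at the present value.
bysort id (year) : replace sum_var1 = ///
    cond(missing(var1[_n-1]), var1, var1 + sum_var1[_n-1])

* Spot checks on the example data
assert sum_var1 == 3 if year == 1986
assert missing(sum_var1) if year == 1987
assert sum_var1 == 1 if year == 1988

list, sepby(id)
```

The running sum should read 1, 2, 3, ., 1, 2: the missing value in 1987 maps to missing, and the sum restarts at 1 in 1988.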

As usual, note that -replace- entails a previous -generate-.

The connections between this work on "identifying spells" and the work of any other past or present students of Hogwarts Academy are allusive, elusive and illusory.

By the way, it seems that you have yet to read to the end of the Statalist FAQ:

http://www.stata.com/support/faqs/res/statalist.html#spell

Nick

n.j.cox@durham.ac.uk

Erasmo Giambona

I am trying to create a running sum using the sum function by group. My problem is that I would like STATA to restart summing again after each missing value and match the total with all previous observations.

For example:

id   year   var1
 1   1984      1
 1   1985      1
 1   1986      1
 1   1987      .
 1   1988      1
 1   1989      1

My output should look like

id   year   var1   sum_var1
 1   1984      1          3
 1   1985      1          3
 1   1986      1          3
 1   1987      .          .
 1   1988      1          2
 1   1989      1          2

I tried the following, but it doesn't get me what I need.

by id: gen sum_var1=sum(var1) if var1!=.

