
Re: st: running sum restarting after missing value

From   n j cox
Subject   Re: st: running sum restarting after missing value
Date   Fri, 20 Jul 2007 14:39:51 +0100

This is confusing. You say you want a running sum, but your example
does not show one. I am going to trust your initial statement and
ignore your example.

-sum()- is trained to ignore missings: it treats them as zero and
carries on. Usually this is a feature, but not for your purposes.
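
As a quick illustration on the data from the question (a sketch only; the variable names s_all and s_if are made up here), -sum()- carries its total straight across the missing value, and adding an -if- qualifier merely blanks out the result for that observation without restarting the sum:

```stata
* Toy data from the question.
clear
input id year var1
1 1984 1
1 1985 1
1 1986 1
1 1987 .
1 1988 1
1 1989 1
end

* sum() treats missing as zero, so the total carries across the gap.
bysort id (year) : gen s_all = sum(var1)
* s_all is 1, 2, 3, 3, 4, 5

* Excluding the missing observation blanks it out but does not restart.
by id : gen s_if = sum(var1) if !missing(var1)
* s_if is 1, 2, 3, ., 4, 5

list, sepby(id)
```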

One take on this is that you evidently regard runs of non-missing values
as distinct spells. Identifying such spells explicitly would enable
you to do something like this:

by id spell : gen sum_var1 = sum(var1)

and with one bound you are then home free.
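
A minimal sketch of that route, assuming the toy data from the question and using the illustrative names begin and spell (these are choices made here, not output of -tsspell-): mark where each run of non-missing values starts, number the runs within id, then sum within each run.

```stata
* Toy data from the question.
clear
input id year var1
1 1984 1
1 1985 1
1 1986 1
1 1987 .
1 1988 1
1 1989 1
end

* Mark the first observation of each run of non-missing values.
* (var1[_n-1] is missing when _n == 1, so the first test is a belt-and-braces extra.)
bysort id (year) : gen byte begin = !missing(var1) & (_n == 1 | missing(var1[_n-1]))

* Number the spells within id; observations with missing var1 get spell 0.
by id : gen spell = cond(missing(var1), 0, sum(begin))

* Running sum within each spell, left missing where var1 is missing.
bysort id spell (year) : gen sum_var1 = sum(var1) if !missing(var1)

sort id year
list, sepby(id)
* spell is 1, 1, 1, 0, 2, 2 and sum_var1 is 1, 2, 3, ., 1, 2
```

Note that -by id spell :- needs the data sorted on id and spell, which is why the last step uses -bysort- with year in parentheses to keep the time order within each spell.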

-tsspell- on SSC is one user-written tool for working with spells. I took a step back from that to explain, in excruciating detail, how to do it all (and more) from first principles in

Cox, N.J. 2007. Identifying spells. Stata Journal 7(2): 249-265.

In this particular problem there are various direct ways of
restarting, without identifying spells of non-missing values as
distinct spells. Here's one:

gen sum_var1 = .

bysort id (year) : replace sum_var1 = ///
    cond(missing(sum_var1[_n-1]), var1, var1 + sum_var1[_n-1])

For sources of information on -cond()-, -search cond()-.

In words, if the previous one is missing, the sum becomes the present value; otherwise add the present value to the sum so far. Missings
will map to missings on this rule.
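
To check that rule against the data in the question, here is a small self-contained sketch. The -cond()- call is written out from the description in words above, so treat it as one reading of the rule rather than the only possible form. The -///- line continuation works in a do-file; interactively, type the command on one line.

```stata
* Toy data from the question.
clear
input id year var1
1 1984 1
1 1985 1
1 1986 1
1 1987 .
1 1988 1
1 1989 1
end

gen sum_var1 = .

* -replace- works through the observations in order, so sum_var1[_n-1]
* already holds the updated value for the previous observation.
bysort id (year) : replace sum_var1 = ///
    cond(missing(sum_var1[_n-1]), var1, var1 + sum_var1[_n-1])

* The sum restarts after the gap: 1, 2, 3, ., 1, 2
assert sum_var1 == 3 in 3
assert missing(sum_var1) in 4
assert sum_var1 == 1 in 5

list, sepby(id)
```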

As usual, note that -replace- entails a previous -generate-.

The connections between this work on "identifying spells" and work of any other past or present students of Hogwarts Academy are allusive, elusive and illusory.

By the way, it seems that you have yet to read to the end of the
Statalist FAQ:


Erasmo Giambona

I am trying to create a running sum using the sum function by group.
My problem is that I would like STATA to restart summing again after
each missing value and match the total with all previous observations.
For example:

id year var1
1 1984 1
1 1985 1
1 1986 1
1 1987 .
1 1988 1
1 1989 1

My output should look like

id year var1 sum_var1
1 1984 1 3
1 1985 1 3
1 1986 1 3
1 1987 . .
1 1988 1 2
1 1989 1 2

I tried the following, but it doesn't get me what I need.

by id: gen sum_var1=sum(var1) if var1!=.
