# Re: st: creating new variable

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: creating new variable
Date: Wed, 12 Oct 2011 18:10:42 +0100

```
Yes, it's possible. Here's one way:

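* At the first observation of each year within gvkey, _n-1 is the number of earlier observations (no five-year cutoff)
bysort gvkey (appyear) : gen newvar = _n-1 if appyear != appyear[_n-1]
* Carry that count forward to the remaining observations of the same year
bysort gvkey : replace newvar = newvar[_n-1] if missing(newvar)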

See

SJ-2-1  pr0004  . . . . . . . . . . Speaking Stata:  How to move step by: step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
Q1/02   SJ 2(1):86--102                                  (no commands)
explains the use of the by varlist : construct to tackle
a variety of problems with group structure, ranging from
simple calculations for each of several groups to more
advanced manipulations that use the built-in _n and _N

and

FAQ     . . . . . . . . . . . . . . . . . . . . . . . Replacing missing values
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
9/05    How can I replace missing values with previous or
following nonmissing values or within sequences?
http://www.stata.com/support/faqs/data/missing.html

for the ideas behind the first and second commands.

Nick

On Wed, Oct 12, 2011 at 5:17 PM, S.H. Former <S.H.Former@uvt.nl> wrote:

> I have a question regarding the composition of my database. I want to know whether it is possible to create the following variable: one that counts previous observations, but only up to five years back. Let me explain using a part of my database, which looks as follows:
>
> gvkey   appyear
> 1004    1985
> 1004    2001
> 1010    1977
> 1010    1977
> 1010    1977
> 1010    1977
> 1010    1977
> 1010    1977
> 1010    1977
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1978
> 1010    1979
> 1010    1979
> 1010    1979
> 1010    1979
> 1010    1979
> 1010    1979
> 1010    1979
> 1010    1979
> 1010    1979
>
> Now I want a new variable that counts, separately for each gvkey, the number of observations in the previous five years. So the new variable should look like this:
>
> gvkey   appyear   New Variable   Comment
> 1004    1985      0              no observations before
> 1004    2001      0              previous observation is more than five years ago
> 1010    1977      0              no observations before
> 1010    1977      0              no observations before
> 1010    1977      0              no observations before
> 1010    1977      0              no observations before
> 1010    1977      0              no observations before
> 1010    1977      0              no observations before
> 1010    1977      0              no observations before
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1978      7              number of observations in 1977 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
> 1010    1979      18             number of observations in 1977 and 1978 by gvkey 1010
>
> Of course, the comments need not be included; they are only there to make the problem clear. I hope you can help me.

```
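
The two commands in the reply count every earlier-year observation within a gvkey, however old, so on the sample data the 2001 row for gvkey 1004 would get 1 rather than the requested 0. What follows is a minimal sketch of one way to impose the five-year window; it is not from the thread. The variable name prev5 and the shortened toy data are assumptions made for illustration, and the loop assumes appyear has no missing values. It is a brute-force loop over observations, so it may be slow on a large dataset.

```stata
clear
* toy data: a shortened version of the example in the question
input long gvkey int appyear
1004 1985
1004 2001
1010 1977
1010 1977
1010 1978
1010 1978
1010 1978
1010 1979
end

* prev5 (hypothetical name): number of observations with the same gvkey
* whose appyear falls in the five years before this observation's appyear
gen long prev5 = .

* brute force: handle one observation at a time
quietly forvalues i = 1/`=_N' {
    * count matching observations; the result is left in r(N)
    count if gvkey == gvkey[`i'] & inrange(appyear, appyear[`i'] - 5, appyear[`i'] - 1)
    replace prev5 = r(N) in `i'
}

list, sepby(gvkey)
```

On this toy data prev5 should be 0 for both gvkey 1004 rows and for the 1977 rows, 2 for the 1978 rows, and 5 for the 1979 row, which is the pattern asked for in the question. Because each observation is compared against the whole dataset, the result does not depend on the sort order.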