


RE: st: how to summarize while accounting for duplicate values

From   "Nick Cox"
Subject   RE: st: how to summarize while accounting for duplicate values
Date   Mon, 22 Feb 2010 12:01:03 -0000

The Stata FAQ "How do I list observations in a group that differ on a
variable?" explains a related technique. The problem there is just the
complement of the one here.

bysort id (somevar) : gen byte same = somevar[1] == somevar[_N] 

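This tags panels within which -somevar- is identical with 1 and those in
which it differs with 0. Because observations are sorted on -somevar-
within -id-, the first and last values agree only when every value in the
panel agrees.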
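
As a rough sketch of how that check might be used, the example below builds a small toy dataset and lists the panels in which a supposedly constant variable actually varies. The variable names (-year-, -lifespan-, -offspring-) and the data values are made up for illustration and are not from the original post.

```stata
* Toy data: hypothetical names and values, for illustration only
clear
input id year lifespan offspring
1 2001 12 0
1 2002 12 2
1 2003 12 1
2 2001  9 1
2 2002  9 3
3 2001 15 0
3 2002 14 2
end

* 1 if lifespan is constant within id, 0 if it varies
bysort id (lifespan) : gen byte same = lifespan[1] == lifespan[_N]

* Show the panels where the "constant" variable is not constant
list id year lifespan if !same, sepby(id)
```

In this toy data only id 3 is listed, since its -lifespan- is recorded as 15 in one year and 14 in another. Running such a check first seems prudent before summarizing on one record per individual.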


Thank you to those who suggested using -egen- and using the first record
for each individual.

Quoting Martin Weiss:

> If this is indeed Dan's intent, he may also like -egen, tag()-.
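
A minimal sketch of the -egen, tag()- route, assuming individuals are identified by -id- and that -lifespan- (a made-up name, with made-up values) is constant within each individual:

```stata
* Toy data: hypothetical names and values, for illustration only
clear
input id year lifespan
1 2001 12
1 2002 12
1 2003 12
2 2001  9
2 2002  9
3 2001 15
3 2002 15
end

* Naive summary counts every annual record, so longer panels weigh more
summarize lifespan

* tag(id) marks exactly one observation per distinct id with 1, others 0
egen byte first = tag(id)

* Summary based on one record per individual
summarize lifespan if first
```

With this toy data the second -summarize- uses 3 observations, one per individual, rather than all 7 annual records.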

Quoting Jeph Herrin:

> If your individuals are identified by the variable -id-,
> then create an indicator for one record per -id-:
>    bys id: gen first=_n==1
> Then all you have to do is condition on -first-
>   sum birthday if first
> to get summary stats on one record per individual.

>> I am trying to summarize a dataset in Stata 10.1 on individuals'
>> reproductive success throughout their lives. The database is set up
>> with each individual having an annual entry with multiple variables.
>> Some variables' values change annually (e.g. # offspring born), while
>> other variables have duplicated values for each year, as they do not
>> change over the course of an individual's life (e.g. age of first
>> reproduction, lifespan). How do I get Stata to provide summary
>> statistics for variables of interest in the population that account
>> for duplicated values for the variables that don't change annually?
