



Re: st: filling in existing ids and generating new ids for unique actors


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: filling in existing ids and generating new ids for unique actors
Date   Fri, 7 Sep 2012 16:47:34 +0100

See

FAQ     . . . . . .  Listing observations in a group that differ on a variable
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        11/01   How do I list observations in a group that differ
                on a variable?
                http://www.stata.com/support/faqs/data/diff.html

Nick
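The approach in that FAQ can be sketched for this thread's variables as follows. This is a minimal sketch, not code from the FAQ itself: it assumes -actor- is a string variable, that the fill-in step has already been run, and that -differ- (a hypothetical name introduced here) is not already in use. Note that -by- puts all observations with an empty -actor- into a single by-group, so a missing -actor- could account for at most one of the reported contradictions.

```stata
* Flag by-groups in which actor_id is not constant.
* After sorting on actor_id within actor, the first and last values
* differ exactly when the group holds more than one distinct value.
* -differ- is a hypothetical variable name used only in this sketch.
bysort actor (actor_id) : gen byte differ = actor_id[1] != actor_id[_N]

* List the offending observations, one block per actor.
sort actor year
list actor year actor_id if differ, sepby(actor)

* Observations with an empty actor name share one by-group;
* count them to see whether that group needs separate handling.
count if missing(actor)
```

Once the conflicting groups are visible, deciding which actor_id is correct for each remains a matter of judgment about the data.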

On Fri, Sep 7, 2012 at 3:50 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
> In some of my cases, the observation for -actor- is rightfully missing.
> Could this missing "value" for -actor- produce the error message?:
> 5 contradictions in 1079 by-groups
> assertion is false
> r(9);
> Example 1 in the [D] manual entry for -assert- seems to suggest this.
> Kind regards,
> Erik.
>
> ----------------------------------------
>> From: erikaadland@hotmail.com
>> To: statalist@hsphsun2.harvard.edu
>> Subject: RE: st: filling in existing ids and generating new ids for unique actors
>> Date: Fri, 7 Sep 2012 14:37:37 +0000
>>
>> Having run the code:
>> bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)
>> and followed up with the check:
>> by actor : assert actor_id[1] == actor_id[_N]
>>
>> I got the following error message:
>> 5 contradictions in 1079 by-groups
>> assertion is false
>> r(9);
>> I searched for a technique to identify these 5 contradictions so that I can fix the problem, but could not find one.
>> Any suggestions on how to proceed?
>> Kind regards,
>> Erik.
>>
>>
>>
>> ----------------------------------------
>> > Date: Fri, 7 Sep 2012 14:12:01 +0100
>> > Subject: Re: st: filling in existing ids and generating new ids for unique actors
>> > From: njcoxstata@gmail.com
>> > To: statalist@hsphsun2.harvard.edu
>> >
>> > Comments embedded below.
>> >
>> > Nick
>> >
>> > On Fri, Sep 7, 2012 at 1:03 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
>> > > Dear Statalist.
>> > > I have an unbalanced panel dataset.
>> > > The structure is as follows:
>> > > year actor_id actor
>> > > 2000 . Paul
>> > > 2001 . Paul
>> > > 2002 . Paul
>> > > 2000 . Sarah
>> > > 2001 1 Sarah
>> > > 2002 1 Sarah
>> > > 2000 . Simon
>> > > 2001 2 Simon
>> > > 2002 2 Simon
>> > > I have 2 problems:
>> > > 1. I want to fill in the missing existing actor_id for those actors that already have an actor_id in some years but not others.
>> >
>> > That's
>> >
>> > bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)
>> >
>> > But follow with a check:
>> >
>> > by actor : assert actor_id[1] == actor_id[_N]
>> >
>> > For the principles, see
>> >
>> > SJ-2-1 pr0004 . . . . . . . . . . Speaking Stata: How to move step by: step
>> > . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
>> > Q1/02 SJ 2(1):86--102 (no commands)
>> > explains the use of the by varlist : construct to tackle
>> > a variety of problems with group structure, ranging from
>> > simple calculations for each of several groups to more
>> > advanced manipulations that use the built-in _n and _N
>> >
>> > > 2. I want to generate a new unique actor_id for those actors that have no actor_id in the dataset. This actor_id needs to be different from those already existing for other actors in the dataset.
>> > > The variable -actor- lists the unique name for each actor and this unique name could be used as a basis for assigning the actor_id.
>> >
>> > su actor_id, meanonly
>> > local max = r(max)
>> > egen new_actor_id = group(actor) if missing(actor_id)
>> > replace actor_id = new_actor_id + `max' if missing(actor_id)
>> >
>> > What this does:
>> >
>> > 1. Find the largest actor_id in use. So, it will be safe to use higher numbers.
>> >
>> > 2. Use -egen-'s -group()- to generate new ids for those without them.
>> > These will run 1, 2, 3, ...
>> >
>> > 3. New actor_id = new id + maximum for those without them.
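The two answers above can be tried end to end on the example data from the original question. This is a sketch under stated assumptions: the variable types in -input- are guesses (the thread does not give them), and dropping -new_actor_id- at the end is an addition for tidiness rather than part of the suggested code.

```stata
clear
* Example data from the original question; storage types are assumed.
input int year float actor_id str10 actor
2000 . "Paul"
2001 . "Paul"
2002 . "Paul"
2000 . "Sarah"
2001 1 "Sarah"
2002 1 "Sarah"
2000 . "Simon"
2001 2 "Simon"
2002 2 "Simon"
end

* Problem 1: within each actor, nonmissing ids sort first,
* so each missing id can be copied from the observation before it.
bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)

* Check that each actor now has a single id (all-missing groups also pass).
by actor : assert actor_id[1] == actor_id[_N]

* Problem 2: number the remaining actors 1, 2, 3, ... and shift
* those numbers above the largest id already in use.
summarize actor_id, meanonly
local max = r(max)
egen new_actor_id = group(actor) if missing(actor_id)
replace actor_id = new_actor_id + `max' if missing(actor_id)
drop new_actor_id

sort actor year
list, sepby(actor)
```

With these data the largest existing id is 2, so Paul, the only actor without an id, is assigned 3, while Sarah and Simon keep 1 and 2 in every year.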

