
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



RE: st: filling in existing ids and generating new ids for unique actors


From   Erik Aadland <erikaadland@hotmail.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: filling in existing ids and generating new ids for unique actors
Date   Fri, 7 Sep 2012 13:40:15 +0000

Thank you so much again, Nick, for your solutions and the reference.
Your input on Stata technique is very helpful.
Sincerely,
Erik.


----------------------------------------
> Date: Fri, 7 Sep 2012 14:12:01 +0100
> Subject: Re: st: filling in existing ids and generating new ids for unique actors
> From: njcoxstata@gmail.com
> To: statalist@hsphsun2.harvard.edu
>
> Comments embedded below.
>
> Nick
>
> On Fri, Sep 7, 2012 at 1:03 PM, Erik Aadland <erikaadland@hotmail.com> wrote:
> > Dear Statalist.
> > I have an unbalanced panel dataset.
> > The structure is as follows:
> > year actor_id actor
> > 2000 . Paul
> > 2001 . Paul
> > 2002 . Paul
> > 2000 . Sarah
> > 2001 1 Sarah
> > 2002 1 Sarah
> > 2000 . Simon
> > 2001 2 Simon
> > 2002 2 Simon
> > I have 2 problems:
> > 1. I want to fill in the missing existing actor_id for those actors that already have an actor_id in some years but not others.
>
> That's
>
> bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)
>
> But follow that with a check:
>
> by actor : assert actor_id[1] == actor_id[_N]
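
A minimal sketch of how the fill-in and the check might be tried on the example data from the question. The -input- block and the storage types (int, byte, str5) are assumptions made only so that the example is self-contained; the two key commands are the ones quoted above.

```stata
* Recreate the example data from the question (storage types are assumed)
clear
input int year byte actor_id str5 actor
2000 . "Paul"
2001 . "Paul"
2002 . "Paul"
2000 . "Sarah"
2001 1 "Sarah"
2002 1 "Sarah"
2000 . "Simon"
2001 2 "Simon"
2002 2 "Simon"
end

* Within each actor, sort on actor_id so that nonmissing values come first
* (numeric missings sort last), then copy the previous value downwards.
bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)

* Check that first and last values agree within each actor.
* This fails if an actor carries two different ids.
by actor : assert actor_id[1] == actor_id[_N]

* Restore panel order and inspect the result
sort actor year
list, sepby(actor)
```

On these data Sarah should end up with 1 and Simon with 2 in every year, while Paul stays missing because there is no existing id to copy.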
>
> For the principles, see
>
> SJ-2-1 pr0004 . . . . . . . . . . Speaking Stata: How to move step by: step
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
> Q1/02 SJ 2(1):86--102 (no commands)
> explains the use of the by varlist : construct to tackle
> a variety of problems with group structure, ranging from
> simple calculations for each of several groups to more
> advanced manipulations that use the built-in _n and _N
>
> > 2. I want to generate a new unique actor_id for those actors that have no actor_id in the dataset. This actor_id needs to be different from those already existing for other actors in the dataset.
> > The variable -actor- lists the unique name for each actor and this unique name could be used as a basis for assigning the actor_id.
>
> su actor_id, meanonly
> local max = r(max)
> egen new_actor_id = group(actor) if missing(actor_id)
> replace actor_id = new_actor_id + `max' if missing(actor_id)
>
> What this does:
>
> 1. Find the largest actor_id in use. So, it will be safe to use higher numbers.
>
> 2. Use -egen-'s -group()- to generate new ids for those without them.
> These will run 1, 2, 3, ...
>
> 3. New actor_id = new id + maximum for those without them.
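
The three steps above can be sketched end to end on the example data from the question. This is only an illustration under stated assumptions: the -input- block, the storage types, and the two closing checks are added here and are not part of the original reply.

```stata
* Recreate the example data from the question (storage types are assumed)
clear
input int year byte actor_id str5 actor
2000 . "Paul"
2001 . "Paul"
2002 . "Paul"
2000 . "Sarah"
2001 1 "Sarah"
2002 1 "Sarah"
2000 . "Simon"
2001 2 "Simon"
2002 2 "Simon"
end

* Problem 1: fill in existing ids within each actor
bysort actor (actor_id) : replace actor_id = actor_id[_n-1] if missing(actor_id)
by actor : assert actor_id[1] == actor_id[_N]

* Problem 2, step 1: largest id already in use (2 in these data)
su actor_id, meanonly
local max = r(max)

* Step 2: number the actors that still have no id as 1, 2, 3, ...
egen new_actor_id = group(actor) if missing(actor_id)

* Step 3: shift the new numbers above the existing maximum
replace actor_id = new_actor_id + `max' if missing(actor_id)
drop new_actor_id

* Added checks (assumptions, not in the original reply):
* every observation now has an id, and each id belongs to one actor only
assert !missing(actor_id)
bysort actor_id (actor) : assert actor[1] == actor[_N]

sort actor year
list, sepby(actor)
```

Here Paul is the only actor without an id, so -group()- gives him 1 and his final actor_id is 1 + 2 = 3.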
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  

