
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Re: Create Variable of Groupings


From   "Joseph Coveney" <[email protected]>
To   <[email protected]>
Subject   st: Re: Create Variable of Groupings
Date   Sat, 2 Nov 2013 17:38:36 +0900

Lisa Wang wrote:

I want to create a new variable that groups my observations so I can
do something like a panel analysis.

I have variables: identifiers date amount id3. id3 is a concatenation
of identifiers and date.

For instance,

identifiers | date | amount | id3
1007 | 17aug2006 | 10 | 1007 17030
1007 | 17aug2006 | 7 | 1007 17030
1007 | 17aug2006 | 8.5 | 1007 17030
2049 | 26may2009 | 10 | 2049 18043
2049 | 26may2009 | 7 | 2049 18043
2049 | 12mar2007 | 7 | 2049 17237
2049 | 12mar2007 | 7 | 2049 17237
2049 | 12mar2007 | 7 | 2049 17237

I would like it to output event_id = 1 for 1007 17030, 2 for 2049
18043, 3 for 2049 17237 etc etc....down the page.

But at this point it seems to give me 2681 for 1007 17030, 5130 for
2049 18043 (ie. it is not sequential).

I tried this:
- bysort id* date : gen event_id = _n - but that gives me numbering
WITHIN groups
and also tried:
- egen event_id = group(id3) - but it was not sequential. Do you think
I need to do a by or sort beforehand?


Thank you in advance for all your helpful suggestions as I am
currently stuck and can't proceed.

--------------------------------------------------------------------------------

See the line of code below, starting at "Begin here".

Joseph Coveney

. input long identifiers str9 date double amount str1 id3

      identifiers       date      amount        id3
  1. 1007 17aug2006 10 1007 17030
  2. 1007 17aug2006 7 1007 17030
  3. 1007 17aug2006 8.5 1007 17030
  4. 2049 26may2009 10 2049 18043
  5. 2049 26may2009 7 2049 18043
  6. 2049 12mar2007 7 2049 17237
  7. 2049 12mar2007 7 2049 17237
  8. 2049 12mar2007 7 2049 17237
  9. end

. quietly replace id3 = string(identifiers) + ///
>     " " + string(date(date, "DMY"))

. 
. *
. * Begin here
. *
. generate byte event_id = sum(id3 != id3[_n-1])

. 
. list, noobs sepby(event_id)

  +-------------------------------------------------------+
  | identi~s        date   amount          id3   event_id |
  |-------------------------------------------------------|
  |     1007   17aug2006       10   1007 17030          1 |
  |     1007   17aug2006        7   1007 17030          1 |
  |     1007   17aug2006      8.5   1007 17030          1 |
  |-------------------------------------------------------|
  |     2049   26may2009       10   2049 18043          2 |
  |     2049   26may2009        7   2049 18043          2 |
  |-------------------------------------------------------|
  |     2049   12mar2007        7   2049 17237          3 |
  |     2049   12mar2007        7   2049 17237          3 |
  |     2049   12mar2007        7   2049 17237          3 |
  +-------------------------------------------------------+

. 
. exit

end of do-file
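
The one-liner above compares each row's id3 with the previous row's and keeps a running count of the changes, so it relies on all rows of an event being adjacent. It also stores the counter as a byte, which holds integers only up to 100; with thousands of events (the question mentions values such as 2681 and 5130) a larger type such as long would presumably be needed. What follows is a minimal sketch, not part of the original reply, of a variant that numbers events in order of first appearance even when rows of the same id3 are scattered. The toy data and the helper variables obs_order and first_obs are hypothetical.

```stata
* Toy data (hypothetical): rows of the same id3 are deliberately not adjacent
clear
input str10 id3
"1007 17030"
"2049 18043"
"1007 17030"
"2049 17237"
"2049 18043"
end

* Remember the original row order
generate long obs_order = _n

* For each id3, record the row where it first appears
bysort id3 (obs_order): generate long first_obs = obs_order[1]

* Bring each event's rows together, ordered by first appearance
sort first_obs obs_order

* Running count of changes in first_obs; long avoids the byte limit of 100
generate long event_id = sum(first_obs != first_obs[_n-1])

list, noobs sepby(event_id)
```

This leaves the rows grouped by event rather than in their original order; a final "sort obs_order" would restore the original order while keeping the event_id values.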
