st: counting number of new cases in every year

From	Navid Asgari <[email protected]>
To	"'[email protected]'" <[email protected]>
Subject	st: counting number of new cases in every year
Date	Fri, 11 May 2012 00:48:01 +0800

Dear Statalist,

I have a dataset which looks like this:


     +----------+
     | Year   P |
     |----------|
  1. | 1995   A |
  2. | 1995   B |
  3. | 1995   A |
  4. | 1995   C |
  5. | 1995   D |
     |----------|
  6. | 1995   A |
  7. | 1995   E |
  8. | 1995   A |
  9. | 1996   B |
 10. | 1996   A |
     |----------|
 11. | 1996   A |
 12. | 1996   M |
 13. | 1996   A |
 14. | 1996   H |
 15. | 1996   A |
     |----------|
 16. | 1996   C |
     +----------+

I use the following to count the number of new values of variable "P" that exist in 1996 but not in 1995:

gen id = _n
reshape long P , i(id)
bysort P (Year id) : gen seq = _n

count if Year==1996 & seq==1
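
For reference, here is a minimal, self-contained sketch of that count on the two-year listing above. It assumes the data are already in long form exactly as listed (one row per Year and P), so the reshape step is left out; that omission is an assumption, not part of the original commands.

```stata
* Enter the two-year example by hand (values copied from the listing).
clear
input int Year str1 P
1995 "A"
1995 "B"
1995 "A"
1995 "C"
1995 "D"
1995 "A"
1995 "E"
1995 "A"
1996 "B"
1996 "A"
1996 "A"
1996 "M"
1996 "A"
1996 "H"
1996 "A"
1996 "C"
end

* Observation number, used only to break ties within a year.
gen long id = _n

* Within each value of P, number the observations in time order.
bysort P (Year id): gen seq = _n

* A value is new in 1996 if its first occurrence falls in 1996.
count if Year == 1996 & seq == 1
```

With this listing the count is 2, because only M and H first appear in 1996.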

Now I want to do the same thing for more than two successive years (e.g. 1993, 1994, 1995, 1996). So, the values of variable "P" in every year will be compared with the values in the previous year (1994 with 1993, then 1995 with 1994, and so forth).


The complexity lies in the fact that this comparison has to be done separately for each unique value of another variable, and the starting and ending years vary across groups. In fact, this is what the structure of the real data looks like:


     +---------------------+
     | Year   P    company |
     |---------------------|
  1. | 1995   A   Company1 |
  2. | 1995   B   Company1 |
  3. | 1995   A   Company1 |
  4. | 1995   C   Company1 |
  5. | 1995   D   Company1 |
     |---------------------|
  6. | 1995   A   Company1 |
  7. | 1995   E   Company1 |
  8. | 1995   A   Company1 |
  9. | 1996   B   Company1 |
 10. | 1996   A   Company1 |
     |---------------------|
 11. | 1996   A   Company1 |
 12. | 1996   M   Company1 |
 13. | 1996   A   Company1 |
 14. | 1996   H   Company1 |
 15. | 1996   A   Company1 |
     |---------------------|
 16. | 1996   C   Company1 |
 17. | 1993   G   Company2 |
 18. | 1993   G   Company2 |
 19. | 1993   M   Company2 |
 20. | 1993   K   Company2 |
     |---------------------|
 21. | 1993   A   Company2 |
 22. | 1993   B   Company2 |
 23. | 1994   C   Company2 |
 24. | 1994   M   Company2 |
 25. | 1994   K   Company2 |
     |---------------------|
 26. | 1994   L   Company2 |
     +---------------------+

So, for every group defined by the variable company, the code should count the number of new values of variable "P" in each year that did not exist in the previous year.
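
One possible way to sketch this in Stata is shown below. It is only an illustration under stated assumptions: "the previous year" is taken to mean the calendar year Year - 1 within the same company, and the first year observed for a company is left missing because there is nothing to compare it with. The variable names firstyear, isnew, newcount, and pick are made up for the example.

```stata
* Enter the example data by hand (values copied from the listing).
clear
input int Year str1 P str8 company
1995 "A" "Company1"
1995 "B" "Company1"
1995 "A" "Company1"
1995 "C" "Company1"
1995 "D" "Company1"
1995 "A" "Company1"
1995 "E" "Company1"
1995 "A" "Company1"
1996 "B" "Company1"
1996 "A" "Company1"
1996 "A" "Company1"
1996 "M" "Company1"
1996 "A" "Company1"
1996 "H" "Company1"
1996 "A" "Company1"
1996 "C" "Company1"
1993 "G" "Company2"
1993 "G" "Company2"
1993 "M" "Company2"
1993 "K" "Company2"
1993 "A" "Company2"
1993 "B" "Company2"
1994 "C" "Company2"
1994 "M" "Company2"
1994 "K" "Company2"
1994 "L" "Company2"
end

* Keep one observation per distinct company, value of P, and year.
bysort company P Year: keep if _n == 1

* First year observed for each company (assumed to have no baseline).
bysort company (Year): gen int firstyear = Year[1]

* A value is flagged as new if the same company has no record of it in
* the year immediately before; left missing in the company's first year.
bysort company P (Year): gen byte isnew = (_n == 1 | Year - Year[_n-1] > 1) if Year > firstyear

* Number of new values per company and year; stays missing when every
* flag in the group is missing (the first year).
egen newcount = total(isnew), by(company Year) missing

* Show one line per company and year.
egen byte pick = tag(company Year)
list company Year newcount if pick, noobs
```

On the example data this gives 2 new values for Company1 in 1996 (M and H) and 2 for Company2 in 1994 (C and L). Note that under the calendar-year assumption, if a company has no observations at all in some year, every value in the following year is counted as new; whether that is the desired behavior depends on the real data.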

Thanks in advance,
Navid

NUS Business School


