The Stata listserver

st: RE: unique value count in several variables


From   "Wanli Zhao" <[email protected]>
To   <[email protected]>
Subject   st: RE: unique value count in several variables
Date   Sun, 19 Jun 2005 17:26:35 -0400

For anyone interested, here is a report on my runs. I have a large panel
of about 1600 cross-sectional units over 11 years. Scott's program
generates an nvals variable with the single value 1005 for every
gvkey-year (I do not know what that number means). Nick's modification
seems to work, but the run time is unacceptable. I interrupted the program,
and the values look correct for the part that had finished.
Nick's original "reshape" program also gave me an error message as follows:
[reshape error
(note: j = ssic1 ssic2)
i (gvkey year sid) indicates the top-level grouping such as subject id.
j (_j) indicates the subgrouping such as time.
xij variable is K.
Thus, the following variable(s) should be constant within i:
      nvals
nvals not constant within i (gvkey year sid) for 28662 values of i:]

I guess the problem is that my ssic1 and ssic2 have many missing values.
Thanks.

Wanli Zhao
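
A possible explanation and a loop-free alternative, offered as a sketch and not as anything proposed in the thread: Scott's commands call -levelsof- with no restriction to a gvkey-year, so the single value 1005 is probably the number of distinct codes in the whole dataset. Nick's loop restricts each call to one group, but the condition group == `i' is evaluated over all observations once per group, which is likely why it is slow on a panel this size. The sketch below instead stacks psic and ssic into one column, keeps one row per distinct code within each gvkey-year, and counts the rows. It uses the toy data from this thread; the names obs, sic, which, and counts are made up for illustration, and -merge m:1- assumes Stata 11 or later.

```stata
* Toy data from the thread: one gvkey-year with six segments
clear
input gvkey psic ssic year subno
1223 4767 4743 1999 1
1223 4767 4763 1999 2
1223 4757 4767 1999 3
1223 4767 4753 1999 4
1223 4777 4787 1999 5
1223 4767 4743 1999 6
end

preserve
    keep gvkey year psic ssic
    * obs is a hypothetical row identifier needed by -reshape-
    gen long obs = _n
    rename psic sic1
    rename ssic sic2
    * Stack both code variables into a single column named sic
    reshape long sic, i(obs) j(which)
    * Missing codes are not counted as a distinct value
    drop if missing(sic)
    * Keep one row per distinct code within each gvkey-year
    bysort gvkey year sic: keep if _n == 1
    * The number of remaining rows is the number of distinct codes
    bysort gvkey year: gen long nvals = _N
    bysort gvkey year: keep if _n == 1
    keep gvkey year nvals
    tempfile counts
    save `counts'
restore

* Attach the per-group counts to the original observations
merge m:1 gvkey year using `counts', nogenerate
list, noobs
```

On the toy data every row gets nvals = 7, the count asked for in the original question. A gvkey-year whose codes are all missing would have no row in the counts file and so would end up with nvals missing after the merge.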


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Sunday, June 19, 2005 8:06 AM
To: [email protected]
Subject: st: RE: RE: RE: RE: unique value count in several variables

Please remove the "gen" from the last line of the loop. 

Nick
[email protected] 

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of Nick Cox
> Sent: 19 June 2005 12:37
> To: [email protected]
> Subject: st: RE: RE: RE: unique value count in several variables
> 
> 
> I too am fond of -levelsof-. For the problem mentioned, this would 
> need to be embedded in a loop over groups, somewhat as follows:
> 
> gen nvals = . 
> egen group = group(Gvkey year)
> su group, meanonly
> qui forval i = 1/`r(max)' { 
> 	levelsof psic if group == `i', local(p) 
> 	levelsof ssic if group == `i', local(s)
> 	local total: list s | p
> 	local total:list uniq total
> 	local count:list sizeof total
> 	replace gen nvals = `count' if group == `i' 
> }
> 
> Nick
> [email protected]
> 
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]]On Behalf Of Scott Merryman
> > Sent: 19 June 2005 12:30
> > To: [email protected]
> > Subject: st: RE: RE: unique value count in several variables
> > 
> > 
> > In addition to Nick's suggestion of using -reshape-, another 
> > possibility is to use -levelsof- and the macro extended functions 
> > (assuming your cross sections are not too large):
> > 
> > 
> > . l, noobs
> > 
> >   +------------------------------------+
> >   | gvkey   psic   ssic   year   subno |
> >   |------------------------------------|
> >   |  1223   4767   4743   1999       1 |
> >   |  1223   4767   4763   1999       2 |
> >   |  1223   4757   4767   1999       3 |
> >   |  1223   4767   4753   1999       4 |
> >   |  1223   4777   4787   1999       5 |
> >   |------------------------------------|
> >   |  1223   4767   4743   1999       6 |
> >   +------------------------------------+
> > 
> > . levelsof psic, local(p)
> > 4757 4767 4777
> > 
> > . levelsof ssic, local(s)
> > 4743 4753 4763 4767 4787
> > 
> > . local total: list s | p
> > 
> > . local total:list uniq total
> > 
> > . local count:list sizeof total
> > 
> > . gen nvals = `count'
> > 
> > . l, noobs
> > 
> >   +--------------------------------------------+
> >   | gvkey   psic   ssic   year   subno   nvals |
> >   |--------------------------------------------|
> >   |  1223   4767   4743   1999       1       7 |
> >   |  1223   4767   4763   1999       2       7 |
> >   |  1223   4757   4767   1999       3       7 |
> >   |  1223   4767   4753   1999       4       7 |
> >   |  1223   4777   4787   1999       5       7 |
> >   |--------------------------------------------|
> >   |  1223   4767   4743   1999       6       7 |
> >   +--------------------------------------------+
> > 
> > 
> > Scott
> > 
> > 
> > > -----Original Message-----
> > > From: [email protected] [mailto:owner- 
> > > [email protected]] On Behalf Of Wanli Zhao
> > > Sent: Saturday, June 18, 2005 3:17 PM
> > > To: [email protected]
> > > Subject: st: RE: unique value count in several variables
> > > 
> > > Thanks, Nick. I looked into the suggestions and I think I might have
> > > confused you about my problem. My panel data look like this:
> > > Gvkey  psic  ssic  year  subno
> > > 1223   4767  4743  1999  1
> > > 1223   4767  4763  1999  2
> > > 1223   4757  4767  1999  3
> > > 1223   4767  4753  1999  4
> > > 1223   4777  4787  1999  5
> > > 1223   4767  4743  1999  6
> > > 
> > > Using the command -unique-, I can count the distinct values of psic and
> > > ssic by gvkey by year. So for psic it's 3 and for ssic it's 5. What I
> > > want is to count the distinct values of psic and ssic taken together,
> > > by gvkey by year. In this case, it's 7 (4767, 4757, 4777, 4743, 4763,
> > > 4753, 4787). How can I generate a new variable for this purpose? I hope
> > > I'm clear now. Please help.
> > > 
> > > Thanks.
> > > Wanli Zhao
> > > 
> > 
> > 
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/support/faqs/res/findit.html
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> > 
> 
> 

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




