The Stata listserver

st: RE: RE: RE: unique value count in several variables


From   "Wanli Zhao" <zhaowl@temple.edu>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: RE: unique value count in several variables
Date   Sun, 19 Jun 2005 19:08:29 -0400

Nick,
I did not have nvals beforehand. I finally modified your "reshape" program the
way I had done it manually in EViews, and it worked. I just replaced the missing
values with a number (I used 99) and ran your program, and nvals now shows the
right number (although of course it counts the missing value as a distinct sic).
So the only complication in my case is that the missing value needs to be a number.
Thanks again.

Wanli
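
A possible way to avoid recoding missing values to 99 is sketched below. Nick's original "reshape" program is not reproduced in this message, so this is not his code: it is a minimal sketch of the same general idea, assuming the variable names psic, ssic, gvkey, year and subno from the example data. The names sic, which and tag are made up for illustration, and the last ssic value has been set to missing only to show the behaviour. It relies on -egen, tag()- skipping observations with missing values unless its -missing- option is given, so missing codes are not counted as a distinct sic.

```stata
clear
input gvkey psic ssic year subno
1223 4767 4743 1999 1
1223 4767 4763 1999 2
1223 4757 4767 1999 3
1223 4767 4753 1999 4
1223 4777 4787 1999 5
1223 4767    . 1999 6
end

* Stack psic and ssic into one long variable (hypothetical name: sic)
rename psic sic1
rename ssic sic2
reshape long sic, i(gvkey year subno) j(which)

* Tag one observation per distinct sic within gvkey-year;
* observations with missing sic get tag == 0 and so are not counted
egen byte tag = tag(gvkey year sic)
egen nvals = total(tag), by(gvkey year)
drop tag

* Return to the original layout; nvals is constant within gvkey-year
reshape wide sic, i(gvkey year subno) j(which)
rename sic1 psic
rename sic2 ssic
```

With these six rows nvals should be 7 for every observation, and the missing ssic stays missing rather than becoming a spurious code.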

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
Sent: Sunday, June 19, 2005 5:54 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: RE: unique value count in several variables

Scott's program does not claim to subdivide by your key and year and it does
not do so. 

What you call "Nick's original program" appears to be my first code as
modified by you. It was based on the idea that -nvals- did not exist
beforehand, and indeed the purpose of the code is to create -nvals-. In your
case, you appear to have used it after creating -nvals- in some other way.
That won't work. At a minimum, you need to drop -nvals- first. 
It is possible also that complications you didn't tell us about have not
been taken into account in modifying the code, as you are here using
variable names not previously explained. 
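
As a minimal sketch of the "drop -nvals- first" point: the snippet below assumes a leftover nvals from an earlier attempt (the toy dataset and the value 1005 are stand-ins for illustration only) and clears it before any code that is meant to create the variable.

```stata
clear
set obs 3
generate nvals = 1005      // stand-in for a leftover variable from an earlier run

* Remove any existing nvals; -capture- suppresses the error if it is absent
capture drop nvals

* Stops with an error if nvals still exists, so later code can create it safely
confirm new variable nvals

generate nvals = .
```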

Naturally, people often simplify their problem for Statalist to show the
essence of it. That's great for the people who answer the questions. 
However, the original posters then need to add back the complications in
exactly the right way. 

Otherwise put, there is nothing in this report that looks to me like a bug
in Scott's code or mine given the original example you specified. 

You are right that the second approach will be slower than the first.
There's a lot of looping and testing -if-. 
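
For reference, the looping approach referred to here, with the stray "gen" removed from the -replace- line as corrected elsewhere in this thread, can be sketched as follows. The data are the six example rows from the original question, and the variable is spelled gvkey as in that listing.

```stata
clear
input gvkey psic ssic year subno
1223 4767 4743 1999 1
1223 4767 4763 1999 2
1223 4757 4767 1999 3
1223 4767 4753 1999 4
1223 4777 4787 1999 5
1223 4767 4743 1999 6
end

gen nvals = .
egen group = group(gvkey year)
su group, meanonly
qui forval i = 1/`r(max)' {
    * distinct values of each variable within this gvkey-year group
    levelsof psic if group == `i', local(p)
    levelsof ssic if group == `i', local(s)
    * union of the two lists, then its length
    local total: list s | p
    local total: list uniq total
    local count: list sizeof total
    replace nvals = `count' if group == `i'
}
drop group
```

Each pass through the loop evaluates -if group == `i'- against every observation three times, which is why the running time grows quickly with the number of gvkey-year groups.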

Nick
n.j.cox@durham.ac.uk 

Wanli Zhao
 
> I should report on my runs for anyone interested. I have a large
> panel, about 1600 cross-sections and 11 years. Scott's program
> generates an nvals variable with a single value, 1005 (I do not know
> what it means), for every gvkey-year. Nick's modification seems to
> work, but the running time is unacceptable. I interrupted the program,
> and the values seem correct for the part that finished.
> Nick's original "reshape" program also gave me the following error
> message:
> [reshape error
> (note: j = ssic1 ssic2)
> i (gvkey year sid) indicates the top-level grouping such as subject 
> id.
> j (_j) indicates the subgrouping such as time.
> xij variable is K.
> Thus, the following variable(s) should be constant within i:
>       nvals
> nvals not constant within i (gvkey year sid) for 28662 values of i:]
> 
> I guess the problem is that my ssic1 and ssic2 have many missing 
> values.
> Thanks.
> 
> Wanli Zhao
> 
> 
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: Sunday, June 19, 2005 8:06 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: RE: RE: unique value count in several variables
> 
> Please remove the "gen" from the last line of the loop. 
> 
> Nick
> n.j.cox@durham.ac.uk
> 
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
> > Sent: 19 June 2005 12:37
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: RE: RE: RE: unique value count in several variables
> > 
> > 
> > I too am fond of -levelsof-. For the problem mentioned, this would 
> > need to be embedded in a loop over groups, somewhat as follows:
> > 
> > gen nvals = . 
> > egen group = group(Gvkey year)
> > su group, meanonly
> > qui forval i = 1/`r(max)' { 
> > 	levelsof psic if group == `i', local(p) 
> > 	levelsof ssic if group == `i', local(s)
> > 	local total: list s | p
> > 	local total:list uniq total
> > 	local count:list sizeof total
> > 	replace gen nvals = `count' if group == `i' 
> > }
> > 
> > Nick
> > n.j.cox@durham.ac.uk
> > 
> > > -----Original Message-----
> > > From: owner-statalist@hsphsun2.harvard.edu
> > > [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Scott Merryman
> > > Sent: 19 June 2005 12:30
> > > To: statalist@hsphsun2.harvard.edu
> > > Subject: st: RE: RE: unique value count in several variables
> > > 
> > > 
> > > In addition to Nick's suggestion of using -reshape-, another 
> > > possibility is to use -levelsof- and the macro extended functions 
> > > (assuming your cross sections are not too large):
> > > 
> > > 
> > > . l, noobs
> > > 
> > >   +------------------------------------+
> > >   | gvkey   psic   ssic   year   subno |
> > >   |------------------------------------|
> > >   |  1223   4767   4743   1999       1 |
> > >   |  1223   4767   4763   1999       2 |
> > >   |  1223   4757   4767   1999       3 |
> > >   |  1223   4767   4753   1999       4 |
> > >   |  1223   4777   4787   1999       5 |
> > >   |------------------------------------|
> > >   |  1223   4767   4743   1999       6 |
> > >   +------------------------------------+
> > > 
> > > . levelsof psic, local(p)
> > > 4757 4767 4777
> > > 
> > > . levelsof ssic, local(s)
> > > 4743 4753 4763 4767 4787
> > > 
> > > . local total: list s | p
> > > 
> > > . local total:list uniq total
> > > 
> > > . local count:list sizeof total
> > > 
> > > . gen nvals = `count'
> > > 
> > > . l, noobs
> > > 
> > >   +--------------------------------------------+
> > >   | gvkey   psic   ssic   year   subno   nvals |
> > >   |--------------------------------------------|
> > >   |  1223   4767   4743   1999       1       7 |
> > >   |  1223   4767   4763   1999       2       7 |
> > >   |  1223   4757   4767   1999       3       7 |
> > >   |  1223   4767   4753   1999       4       7 |
> > >   |  1223   4777   4787   1999       5       7 |
> > >   |--------------------------------------------|
> > >   |  1223   4767   4743   1999       6       7 |
> > >   +--------------------------------------------+
> > > 
> > > 
> > > Scott
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: owner-statalist@hsphsun2.harvard.edu
> > > > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Wanli Zhao
> > > > Sent: Saturday, June 18, 2005 3:17 PM
> > > > To: statalist@hsphsun2.harvard.edu
> > > > Subject: st: RE: unique value count in several variables
> > > > 
> > > > Thanks, Nick. I looked into the suggestions and I think I might have
> > > > confused you on my problem. My panel data is like this:
> > > > Gvkey  psic  ssic  year  subno
> > > > 1223   4767  4743  1999  1
> > > > 1223   4767  4763  1999  2
> > > > 1223   4757  4767  1999  3
> > > > 1223   4767  4753  1999  4
> > > > 1223   4777  4787  1999  5
> > > > 1223   4767  4743  1999  6
> > > > 
> > > > Using the command unique, I can count the distinct values of psic and
> > > > of ssic by gvkey by year: 3 for psic and 5 for ssic. What I want is to
> > > > count the distinct values of psic and ssic combined, by gvkey by year.
> > > > In this case, it's 7 (4767, 4757, 4777, 4743, 4763, 4753, 4787).
> > > > How do I generate a new variable for this purpose? I hope I'm clear
> > > > now. Please help.
> > > > 
> > > > Thanks.
> > > > Wanli Zhao
> > > > 
> > > 
> > > 
> > > *
> > > *   For searches and help try:
> > > *   http://www.stata.com/support/faqs/res/findit.html
> > > *   http://www.stata.com/support/statalist/faq
> > > *   http://www.ats.ucla.edu/stat/stata/
> > > 

© Copyright 1996–2014 StataCorp LP