
From: "Nick Cox" <n.j.cox@durham.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: RE: unique value count in several variables

Date: Sun, 19 Jun 2005 22:54:25 +0100

Scott's program does not claim to subdivide by your key and year
and it does not do so.

What you call "Nick's original program" appears to be my first code
as modified by you. It was based on the idea that -nvals- did not
exist beforehand, and indeed the purpose of the code is to create
-nvals-. In your case, you appear to have used it after creating
-nvals- in some other way. That won't work. At a minimum, you need
to drop -nvals- first. It is possible also that complications you
didn't tell us about have not been taken into account in modifying
the code, as you are here using variable names not previously
explained.

Naturally, people often simplify their problem for Statalist to show
the essence of it. That's great for the people who answer the
questions. However, the original posters then need to add back the
complications in exactly the right way.

Otherwise put, there is nothing in this report that looks to me like
a bug in Scott's code or mine given the original example you
specified.

You are right that the second approach will be slower than the first.
There's a lot of looping and testing -if-.

Nick
n.j.cox@durham.ac.uk

Wanli Zhao

> I feel I need to report on my running for people interested.
> I have a large panel, about 1600 cross-section and 11 years.
> Scott's program generates nvals variable with a single value 1005
> ( I do not know what it means) for all the gvkey-year. Nick's
> modification seems to work. The problem is the time is unacceptable.
> I broke the program and the values seem correct for finished part.
> Nick's original "reshape" program also gave me an error message as
> follows:
> [reshape error
> (note: j = ssic1 ssic2)
> i (gvkey year sid) indicates the top-level grouping such as subject id.
> j (_j) indicates the subgrouping such as time.
> xij variable is K.
> Thus, the following variable(s) should be constant within i:
> nvals
> nvals not constant within i (gvkey year sid) for 28662 values of i:]
>
> I guess the problem is that my ssic1 and ssic2 have many missing values.
> Thanks.
>
> Wanli Zhao
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
> Sent: Sunday, June 19, 2005 8:06 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: RE: RE: RE: unique value count in several variables
>
> Please remove the "gen" from the last line of the loop.
>
> Nick
> n.j.cox@durham.ac.uk
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
> > Sent: 19 June 2005 12:37
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: RE: RE: RE: unique value count in several variables
> >
> >
> > I too am fond of -levelsof-. For the problem mentioned, this would
> > need to be embedded in a loop over groups, somewhat as follows:
> >
> > gen nvals = .
> > egen group = group(Gvkey year)
> > su group, meanonly
> > qui forval i = 1/`r(max)' {
> >     levelsof psic if group == `i', local(p)
> >     levelsof ssic if group == `i', local(s)
> >     local total: list s | p
> >     local total:list uniq total
> >     local count:list sizeof total
> >     replace gen nvals = `count' if group == `i'
> > }
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > > -----Original Message-----
> > > From: owner-statalist@hsphsun2.harvard.edu
> > > [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Scott
> > > Merryman
> > > Sent: 19 June 2005 12:30
> > > To: statalist@hsphsun2.harvard.edu
> > > Subject: st: RE: RE: unique value count in several variables
> > >
> > >
> > > In addition to Nick's suggestion of using -reshape-, another
> > > possibility is to use -levelsof- and the macro extended functions
> > > (assuming your cross sections are not too large):
> > >
> > >
> > > . l, noobs
> > >
> > >   +------------------------------------+
> > >   | gvkey   psic   ssic   year   subno |
> > >   |------------------------------------|
> > >   |  1223   4767   4743   1999       1 |
> > >   |  1223   4767   4763   1999       2 |
> > >   |  1223   4757   4767   1999       3 |
> > >   |  1223   4767   4753   1999       4 |
> > >   |  1223   4777   4787   1999       5 |
> > >   |------------------------------------|
> > >   |  1223   4767   4743   1999       6 |
> > >   +------------------------------------+
> > >
> > > . levelsof psic, local(p)
> > > 4757 4767 4777
> > >
> > > . levelsof ssic, local(s)
> > > 4743 4753 4763 4767 4787
> > >
> > > . local total: list s | p
> > >
> > > . local total:list uniq total
> > >
> > > . local count:list sizeof total
> > >
> > > . gen nvals = `count'
> > >
> > > . l, noobs
> > >
> > >   +--------------------------------------------+
> > >   | gvkey   psic   ssic   year   subno   nvals |
> > >   |--------------------------------------------|
> > >   |  1223   4767   4743   1999       1       7 |
> > >   |  1223   4767   4763   1999       2       7 |
> > >   |  1223   4757   4767   1999       3       7 |
> > >   |  1223   4767   4753   1999       4       7 |
> > >   |  1223   4777   4787   1999       5       7 |
> > >   |--------------------------------------------|
> > >   |  1223   4767   4743   1999       6       7 |
> > >   +--------------------------------------------+
> > >
> > >
> > > Scott
> > >
> > >
> > > > -----Original Message-----
> > > > From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
> > > > statalist@hsphsun2.harvard.edu] On Behalf Of Wanli Zhao
> > > > Sent: Saturday, June 18, 2005 3:17 PM
> > > > To: statalist@hsphsun2.harvard.edu
> > > > Subject: st: RE: unique value count in several variables
> > > >
> > > > Thanks, Nick. I looked into the suggestions and I think I might
> > > > have confused you on my problem. My panel data is like this:
> > > >
> > > > Gvkey  psic  ssic  year  subno
> > > > 1223   4767  4743  1999  1
> > > > 1223   4767  4763  1999  2
> > > > 1223   4757  4767  1999  3
> > > > 1223   4767  4753  1999  4
> > > > 1223   4777  4787  1999  5
> > > > 1223   4767  4743  1999  6
> > > >
> > > > Using command unique, I can count the distinct values of psic
> > > > and ssic by gvkey by year. So for psic it's 3 and for ssic it's
> > > > 5. what I want is to count the distinct values of both psic and
> > > > ssic by gvkey by year. In this case, it's 7 (4767, 4757, 4777,
> > > > 4743, 4763, 4753, 4787). How to generate a new variable for my
> > > > purpose? Hope I'm clear now. Pls help.
> > > >
> > > > Thanks.
> > > > Wanli Zhao

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
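
The points made in the reply above (that -nvals- must not exist before the code creates it, and that looping over groups with -if- is slow) can be illustrated with a loop-free sketch. This is not the -reshape- code from the earlier post, which is not reproduced in this thread; it is one possible way to get the same count, assuming numeric psic and ssic, the variable names from the original example, and Stata 11 or later (for -merge m:1-). The names obsno, sic, which and first are hypothetical helpers introduced only for this illustration.

```stata
* Sketch only: count distinct values of psic and ssic combined,
* by gvkey and year, without looping over groups.

* Example data from the original question (replace with your own data)
clear
input gvkey psic ssic year subno
1223 4767 4743 1999 1
1223 4767 4763 1999 2
1223 4757 4767 1999 3
1223 4767 4753 1999 4
1223 4777 4787 1999 5
1223 4767 4743 1999 6
end

* nvals must not exist beforehand; drop any earlier attempt
capture drop nvals

preserve
keep gvkey year psic ssic
gen long obsno = _n                 // hypothetical helper: unique row id
rename psic sic1                    // hypothetical stub name "sic"
rename ssic sic2
reshape long sic, i(obsno) j(which) // stack psic and ssic into one column
drop if missing(sic)                // missing codes are not counted as values

* first == 1 on exactly one observation per distinct (gvkey, year, sic)
egen byte first = tag(gvkey year sic)
egen nvals = total(first), by(gvkey year)

keep gvkey year nvals
duplicates drop                     // one row per gvkey-year
tempfile counts
save `counts'
restore

* attach the counts to the original observations
merge m:1 gvkey year using `counts', nogenerate

list, noobs
```

For the six example observations this should give nvals = 7 on every row, matching the count in the original question. A gvkey-year group in which both variables are always missing ends up with missing nvals after the merge; whether that should be recoded to 0 depends on the application.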

**Follow-Ups**: **st: RE: RE: RE: unique value count in several variables** *From:* "Wanli Zhao" <zhaowl@temple.edu>

