
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: Counting unique values across a set of variables: Re-sent
Date: Mon, 24 May 2004 17:34:52 +0100

Another way to do it:

gen ZZ = 0
qui forval i = 1/`=_N' {
    foreach v of var B-Y {
        local list `"`list' `"`=`v'[`i']'"'"'
        local uniq : list uniq list
    }
    replace ZZ = `: list sizeof uniq' in `i'
    local list
}

The single, double, and compound double quotes
require a little care here.

This is the sometimes deprecated loop over
observations, which nevertheless has a certain
charm in this case.

Nick
n.j.cox@durham.ac.uk

P.S. in the previous message, add a final -renpfix-
to get your variable names back to the status quo ante.

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Nick Cox
> Sent: 24 May 2004 17:16
> To: statalist@hsphsun2.harvard.edu
> Subject: st: RE: Counting unique values across a set of variables:
> Re-sent
>
>
> I think this is easiest through a
>
> -reshape-
> do stuff
> -reshape-
>
> sequence, otherwise known as the Stata twostep.
>
> First we -rename- variables, so that
> they have a common prefix, say
>
> foreach v of var B-Y {
>     rename `v' S_`v'
> }
>
> Then we -reshape- to long:
>
> reshape long S_ , i(A) string
>
> Now our count of distinct strings is
>
> bysort A S_ : gen Z = _n == 1
> by A : replace Z = sum(Z)
> by A : replace Z = Z[_N]
>
> Now we -reshape- back
>
> reshape wide S_ , i(A) string
>
> and then -Z- is an extra variable
> in the dataset.
>
> Note that this counts "."
> as a value like any other. (And
> indeed also "", " ", " ", etc.)
>
> If you want to subtract 1 because "."
> is not of interest then one
> way to do that is
>
> gen countperiod = 0
> foreach v of var B-Y {
>     replace countperiod = countperiod + (`v' == ".")
> }
>
> replace Z = Z - (countperiod > 0)
>
> Nick
> n.j.cox@durham.ac.uk
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of CM
> > Sent: 24 May 2004 16:53
> > To: statalist@hsphsun2.harvard.edu
> > Subject: st: Counting unique values across a set of
> variables: Re-sent
> >
> >
> > Hi all,
> >
> > I checked findit but don't believe I found what I
> > need.
> >
> > Each row in my data represents a respondent. Besides
> > the first column "A" representing ID, the other
> > columns (call them B thru Y) contain strings or "." I
> > need to create a variable in column Z that counts the
> > number of unique strings found for any given
> > respondent in B thru Y. Advice?
> >
> > Thanks in advance,
> > CM
> > *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
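
For anyone trying the observation loop from the top of this message, here is a minimal sketch on a toy dataset. The three rows, the shortened variable range B-D (standing in for B-Y), and the name ZZ_nodot are all assumptions made up for illustration; they are not from the original question. The sketch applies the loop as posted and then the "." adjustment from the quoted message.

```stata
* Toy data (hypothetical): A is the ID, B-D stand in for B-Y
clear
input str2 A str5 B str5 C str5 D
"r1" "x" "y" "x"
"r2" "." "." "."
"r3" "a" "." "b"
end

* Loop over observations, as in the message above
gen ZZ = 0
qui forval i = 1/`=_N' {
    foreach v of var B-D {
        * append this cell's value to the macro list, in compound quotes
        local list `"`list' `"`=`v'[`i']'"'"'
        * reduce the list to its distinct elements
        local uniq : list uniq list
    }
    replace ZZ = `: list sizeof uniq' in `i'
    * clear the list before the next observation
    local list
}

* Optional: do not count "." as a value (ZZ_nodot is a hypothetical name)
gen countperiod = 0
foreach v of var B-D {
    replace countperiod = countperiod + (`v' == ".")
}
gen ZZ_nodot = ZZ - (countperiod > 0)

list A B C D ZZ ZZ_nodot
```

On this toy data ZZ should come out as 2, 1 and 3 for r1, r2 and r3, since "." is counted as a value like any other, and ZZ_nodot as 2, 0 and 2.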

