# RE: st: Re: Making sure identifiers are unique

From: "louis boakye-yiadom"
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Re: Making sure identifiers are unique
Date: Tue, 08 Mar 2005 16:07:46 +0000

Michael,
Thanks a lot. I'll do that.

Louis

I guess you need to look at the help for collapse some more. -collapse- calculates whatever summary statistics you specify for each unique combination of the by variable list and then collapses the dataset to one observation for each of these unique combinations. It will, by design, always result in the by varlist becoming a unique identifier.
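The behaviour described above can be seen on a small example. The following is only a sketch: the four observations are made-up toy values, and the variable names simply follow the ones used in this thread.

```stata
* Sketch only: toy data with hypothetical values
clear
input region district
1 1
1 1
1 2
2 1
end

* Here the pair (1,1) appears twice, so on these data
*   by region district: assert _N==1
* would fail, as it did on the original data set

gen x = 1
collapse (count) x, by(region district)

* After -collapse- there is one observation per region-district pair,
* and x holds the number of original observations in each pair
sort region district
by region district: assert _N==1
list
```

Because -collapse- keeps exactly one observation per combination of the by variables, the final assertion cannot fail, whatever the original data looked like. The count in x is what shows which pairs were repeated beforehand.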

Michael Blasnik
michael.blasnik@verizon.net

----- Original Message -----
From: "louis boakye-yiadom" <louisby@hotmail.com>
To: <statalist@hsphsun2.harvard.edu>
Sent: Tuesday, March 08, 2005 10:48 AM
Subject: st: Making sure identifiers are unique

Dear all,
I've been trying to determine the identifiers of a data set and to ensure they're unique. Suspecting that the variables "region" and "district" are the identifiers, I gave the commands below and got the output shown:
. sort region district
. by region district: assert _N==1
assertion is false
r(9);

Because I'm more interested in the "district"-level data, I wanted to know whether a collapsed version of the data would have unique identifiers. I therefore gave the following set of commands and got the results shown:
. gen x=1
. collapse (count) x, by (region district)
. sort region district
. by region district: assert _N==1

My question is: What can account for the collapsed data being uniquely identified by "region" and "district", whilst the original data are not? I'm using version 8.2.

Many thanks,
Louis
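One way to see why the first assertion failed is to inspect the repeated region-district pairs in the original data directly. This is a minimal sketch, assuming the uncollapsed data are in memory. The variable name dup is hypothetical, and it is an assumption that -duplicates- and -isid- behave this way in version 8.2.

```stata
* Sketch, assuming the original (uncollapsed) data are in memory

* Summarize how many observations share each region-district pair
duplicates report region district

* Flag the repeated pairs; dup is a hypothetical variable name
duplicates tag region district, generate(dup)
sort region district
list region district if dup > 0

* -isid- exits with an error if the variables do not uniquely identify
* the observations; -capture noisily- shows the message without stopping
capture noisily isid region district
```

If the listing shows several observations per pair, then "region" and "district" alone do not identify observations in the original data, and some further variable is needed to distinguish them.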
