Thanks for the quick response. This is very helpful.

Ana

On Oct 28, 2011, at 11:24 AM, Nick Cox wrote:

1. -egen- has a -mode()- function.

    egen mode = mode(regionname), by(regioncode)

2. For that you need something like

    egen tag = tag(mode regioncode)
    egen ndistinctvalues = total(tag), by(mode)

See also for a review

SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
        (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
        shows how to answer questions about distinct observations
        from first principles; provides a convenience command

3.

    ssc inst groups
    help groups
    groups regionname regioncode

(there are other ways, but I like this one)

Nick
n.j.cox@durham.ac.uk

Vitorino, Maria Ana

Suppose I have the following data:

    regioncode  regionname
    X           AAA
    Y           BBB
    Z           CCC
    X           .
    X           AAA
    Y           BBB
    Z           .
    Z           AAA
    Z           CCC
    Z           CCC

Assume also that the regioncode variable is correct but there are some errors and missing values in the regionname variable.

1) Is there an efficient way to fix the entries in the regionname variable? (For this we need to assume that the correspondence between regioncode and regionname that occurs more frequently is the correct one.)

I usually deal with this type of issue using several lines of code, so I'm wondering if there is a more efficient way making use of some Stata commands that I'm not familiar with.

Also, if, after correcting the mistakes, I want to

2) check if the correspondence between the two variables is unique

3) create a table with regionname, regioncode and frequency of observations (but not a two-way table)

What is the most efficient way?
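The three suggestions in the reply can be strung together into one short do-file. What follows is only a sketch built on the sample data from the question, not code posted in the thread. It makes several assumptions: both variables are strings, the missing names are stored as empty strings, no regioncode has a tie for its most frequent name (in a tie -egen, mode()- returns missing unless an option such as minmode is given), and the uniqueness check asks whether each code ends up with exactly one name. For step 3 it uses the official -contract- command in place of -groups-, simply so that nothing needs installing from SSC. The variable names ndistinct and freq are illustrative.

```stata
* Sample data from the question; missing names entered as empty strings (assumption)
clear
input str1 regioncode str3 regionname
X AAA
Y BBB
Z CCC
X ""
X AAA
Y BBB
Z ""
Z AAA
Z CCC
Z CCC
end

* 1. Replace each name by the most frequent name within its regioncode
egen mode = mode(regionname), by(regioncode)
replace regionname = mode

* 2. Count distinct (regioncode, regionname) pairs per regioncode;
*    the assert stops with an error if any code maps to more than one name
egen tag = tag(regioncode regionname)
egen ndistinct = total(tag), by(regioncode)
assert ndistinct == 1

* 3. One row per (regioncode, regionname) pair with its frequency
*    (-contract- replaces the data in memory, so -preserve- first if needed)
contract regioncode regionname, freq(freq)
list, noobs
```

If the real data can contain ties, adding minmode or maxmode to the -egen, mode()- call is one way to force a choice, at the cost of picking a name arbitrarily.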

