st: String variable behaving badly

From	Anna Reimondos <[email protected]>
To	[email protected]
Subject	st: String variable behaving badly
Date	Thu, 11 Oct 2012 21:29:48 +1100

Dear Statalist

I am currently cleaning a survey dataset with a variety of numeric as
well as string variables. I recently discovered some very odd
behaviour with one of the string variables.

An extract of the data containing two variables (an ID variable and
the problematic string variable) is available here:

http://wikisend.com/download/508418/stringdata.dta

The dataset contains the responses of the 23 people who answered a
question about who they believe is the most influential sportsperson
in Australia. All 23 gave the same answer, 'Evonne Goolagong Cawley'
(a famous sportswoman).

The problem is that when I run a simple tabulate of the variable there
are two entries for Evonne Goolagong Cawley instead of just one, and I
don't understand why. In the dataset you can see that the first two
respondents are somehow being identified as having a different answer
from the rest, even though the spelling looks exactly the same. I have
tried trimming the variable and triple-checking the spelling, but I
can't get to the bottom of this and it is driving me up the wall.
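
For concreteness, the steps described above might look roughly like
the following. This is only a sketch: the variable names id and answer
are placeholders (the post does not name the variables), and it
assumes stringdata.dta has been saved in the working directory. The
final length check is an extra diagnostic that the post does not
mention.

```stata
* Sketch only: "id" and "answer" are assumed variable names.
use stringdata, clear

* The tabulation that shows two rows for what looks like one answer.
tabulate answer

* Trim leading and trailing blanks, then tabulate again.
replace answer = trim(answer)
tabulate answer

* Extra check (not in the original post): compare string lengths
* across observations that appear identical on screen.
generate len = length(answer)
list id answer len in 1/5
```

If the lengths differ between rows that look the same, the strings
are not in fact identical, whatever the cause turns out to be.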

For reference, this issue is affecting other entries as well:
responses that look exactly the same to me are not recognised as such.

Any help would be much appreciated.

I am using Stata 12.1.

Thanks!


Anna

Follow-Ups:
- Re: st: String variable behaving badly
  - From: Nick Cox <[email protected]>
