st: String variable behaving oddly


From   Anna Reimondos <areimondos@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: String variable behaving oddly
Date   Thu, 11 Oct 2012 21:33:37 +1100

Dear Statalist

I am currently cleaning a survey dataset with a variety of numeric as
well as string variables. I recently discovered some very odd
behaviour with one of the string variables that I have to deal with
before I can finish my work.


In the example below there are 23 responses from people who answered a
question about who they believe is the most influential sports person
in Australia. All these 23 people answered the same thing 'Evonne
Goolagong Cawley' (some famous sports lady).


The problem is that when I do a simple tab of the variable there are
two entries for Evonne Goolagong Cawley instead of just one. I don't
understand what is happening.


. tab var1

   [F4a] Most influential sportspeople: |
                              1st choice |      Freq.     Percent        Cum.
 ----------------------------------------+-----------------------------------
                 Evonne Goolagong Cawley |          2        8.70        8.70
                 Evonne Goolagong Cawley |         21       91.30      100.00
 ----------------------------------------+-----------------------------------
                                   Total |         23      100.00


Two respondents are somehow being identified as having a different
answer to the rest of the people even though the spelling is exactly
the same. I have tried trimming the data, triple checking the spelling
and so on, but can't get to the bottom of this and it is driving me
up the wall.
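
A minimal sketch of one possible check, assuming the difference comes from
an invisible character (for example a tab or a non-breaking space) that
trimming does not remove. This is only a guess and is not confirmed by the
data. The names len1 and var1_clean are made up for illustration, and
char(160) assumes the data are stored in a single-byte encoding such as
Latin-1.

```stata
* Assumption: var1 is the string variable tabulated above.
* len1 and var1_clean are hypothetical names used only in this sketch.

* 1. If the two "identical" entries differ in length, an invisible
*    character is probably present in one of them.
generate len1 = length(var1)
tabulate var1 len1

* 2. Replace tabs (ASCII 9) and non-breaking spaces (byte 160 in
*    Latin-1 data) with ordinary spaces, then trim and collapse blanks.
generate var1_clean = subinstr(var1, char(9), " ", .)
replace var1_clean = subinstr(var1_clean, char(160), " ", .)
replace var1_clean = itrim(trim(var1_clean))

* 3. Tabulate again to see whether the duplicate entries have merged.
tabulate var1_clean
```

If the lengths in step 1 are the same for both entries, the difference
would have to be a character substitution rather than an extra character,
and the cleaning in step 2 may not be enough.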

Just for reference this 'issue' is affecting other entries as well,
where what I think looks like exactly the same response is not
recognised as such.

Any help would be much appreciated.

I have a copy of the dataset (just an extract) if anyone is interested.