Joao Pedro W. de Azevedo
> I've just joined two datasets using the append command in STATA7,
> but I'm having a problem.
> When I tabulate the string variable region, which was present in
> both datasets, I get the following simplified output:
>
> region | Freq.
> ---------------------------------------+-----------------
> Aztec | 18
> Barnsley and Doncaster | 18
> Bedfordshire | 18
> Birmingham | 18
> Aztec | 15
> Barnsley and Doncaster | 15
> Bedfordshire | 15
> Birmingham | 15
>
> The problem here is that for some reason STATA does not recognize
> that the names are the same, and tabulates them as if the two Aztec
> regions were completely different.
> For some random reason, I copied the entire column with this
> particular variable to Excel, and then copied it back to the same
> STATA file, creating a new variable (var158). Please note that I did
> not modify this variable in any way while I had it in Excel.
> When I tabulate this new variable, STATA does recognize that the
> names of the regions are the same and produces the correct output
> (below).
>
>
> var158 | Freq.
> -------------------------------------+----------------
> Aztec | 33
> Barnsley and Doncaster | 33
> Bedfordshire | 33
> Birmingham | 33
>
> I would like to know if anyone could give me an idea of what is
> happening, and how I could fix this within STATA itself.
I suspect hidden leading spaces. " Aztec" and "Aztec" are
different strings, and " Aztec" will sort before "Aztec".
In one case, your leading spaces were preserved; in the other, not.
-trim()- trims leading (and trailing) spaces.
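
A minimal sketch of how this might be checked and fixed, assuming the
variable is called region as in the question and that leading spaces
really are the cause (nleading is a hypothetical name used only for
illustration):

```stata
* count observations whose value changes when spaces are trimmed
count if region != trim(region)

* hypothetical helper variable: number of leading spaces in each value
generate nleading = length(region) - length(ltrim(region))
tabulate nleading

* strip leading and trailing spaces, then tabulate again
replace region = trim(region)
tabulate region
```

If the count is zero, leading spaces are not the explanation and
something else (for example, other invisible characters) would need
checking.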
Nick
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/