The Stata listserver

Re: st: Re: destringing values led to Stata recoding them as missing


From   Suzy <[email protected]>
To   [email protected]
Subject   Re: st: Re: destringing values led to Stata recoding them as missing
Date   Fri, 27 Aug 2004 23:26:01 -0400

Dear Daniel,
I used destring because I wasn't able to analyze the data as is: I would get error messages about not being able to analyze strings. These values are codes that represent disorders, so you are correct. But since I am a fairly new user of Stata, I just assumed that it couldn't read those values because of the dashes or letters, since the values that were purely numeric were read and analyzed with no problem.

Daniel Egan wrote:


Hi Suzy,

Just to be clear, are you sure you want to create numeric values? The usual
reason for destringing a variable is that it IS a numeric variable that has
typos which cause it to be regarded as text. Is this a continuous
variable that does have a numeric (linear, etc.) relationship? If the values
of these string variables represent different disorders, you should have a good
methodological reason for making them numeric. Otherwise, keep them in an
"apples and oranges" arrangement of strings, i.e. diabetes (1003) is not
"one more than" malaria (1002)...

In essence, if you want to use each of these variables as categoricals, they
are fine as is - as strings. You will be able to analyze them as strings, in
a categorical or dummy variable sense.
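
A minimal sketch of that idea, assuming one of the seven variables is a
string variable. The names dx1, disc_disorder, and dx1_cat are made up for
illustration; 72200 is the example code from the original question.

```stata
* Sketch only: dx1 is a hypothetical name for one of the string variables.

* One-way frequency table of the codes, most frequent first
tabulate dx1, sort

* Indicator (dummy) variable for a single code, compared as a string;
* observations with an empty dx1 are left missing rather than set to 0
generate byte disc_disorder = (dx1 == "72200") if !missing(dx1)

* Integer-coded copy of dx1 with the original strings kept as value labels
encode dx1, generate(dx1_cat)
```

The integers that encode assigns are arbitrary category identifiers, so
dx1_cat would still be treated as categorical, not as a quantity.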


I may be way off here, but just wanted to make sure you knew you could
analyze them as is.....

Apologies if I am being obvious.

Dan

----- Original Message -----
From: "Suzy" <[email protected]>
To: <[email protected]>
Sent: Friday, August 27, 2004 5:44 PM
Subject: st: destringing values led to Stata recoding them as missing


| Dear Statalisters;
|
| I have seven variables of over 300,000 observations each. Within each
| variable, I have over 2000 different values. These datapoints
| represent specific codes - for example : (72200 = intervertebral disc
| disorder). Within each of these seven variables, there are datapoints
| (values) with dashes or letters (e.g., 4109- or V2389). The majority
| of the values though, are purely numeric (23405). I used the destring
| option so that I could analyze the data and Stata treated all those
| datapoints that included dashes and letters as missing. Now there is a
| period . where there used to be a value. I have two questions:
|
| 1. Will the restring option restore the datapoints?
|
| 2. How can I successfully "destring" these values so that I can include
| them in my analysis?
|
| Any help and/or specific code would be very helpful as I am only
| marginally competent with Stata basics.
|
| Thank you!
| Suzy
|
|
| *
| * For searches and help try:
| * http://www.stata.com/support/faqs/res/findit.html
| * http://www.stata.com/support/statalist/faq
| * http://www.ats.ucla.edu/stat/stata/