Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: "Lukas Bösch" <L.Boesch@gmx.de>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: problems with encode
Date: Wed, 01 Jun 2011 16:40:38 +0200

Hello,

Thank you for your quick reply. Unfortunately, Stata does not accept the command you suggested. If I enter

    destring y1990-y2009, replace dpcomma

Stata says:

    option dpcomma not allowed

I am working with Stata 10.0. Might this be the reason why the command does not work?

I looked through all the options listed for -destring- in the Stata help before asking Statalist. These are generate, replace, ignore, force, float, and percent. I tried them, apart from ignore and force, but they don't work. They either create missings, or Stata tells me:

    contains nonnumeric characters; no generate/replace

Thank you
Lukas

-------- Original Message --------
> Date: Wed, 1 Jun 2011 11:08:46 +0100
> From: Nick Cox <n.j.cox@durham.ac.uk>
> To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
> Subject: st: RE: problems with encode

> As you have found out, -encode- will take values like "0,43" and map them
> to integers with value labels. They will look the same as before, but in
> principle the approach is quite wrong for such data.
>
> As you say, you need to -destring- such variables. What you need with your
> data is to spell out the -dpcomma- option.
>
> Note that with -destring- you can operate on several variables at once.
>
> . destring y1990-y2009, replace dpcomma
>
> All this is documented in the help for -destring-.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Lukas Bösch
>
> I am having problems working with data that were stored as string
> variables and that I converted to numeric variables with encode. In this
> case the % of a country's surface is stored as a string variable
> (y1990-y2009), and I am encoding it into a numeric variable (v1990-v2009).
>
> Here is one example:
>
> encode y1990, gen (v1990)
>
> I did this for 1990-2009 and 130 countries, but only show the first two
> countries and the first 4 years.
>
> country  y1990  y1991  y1992  y1993  v1990  v1991  v1992  v1993
> Afghan   0,43   0,43   0,43   0,43   0,43   0,43   0,43   0,43
> Algeria  6,31   6,31   6,31   6,31   6,31   6,31   6,31   6,31
>
> This seems to work fine, but when I reshape the data into long form, it
> doesn't work any more.
>
> drop y1990-y2009:
> reshape long v, i(id) j(year);
>
> year  country      v
> 1990  Afghanistan  0,83
> 1991  Afghanistan  0,83
> 1992  Afghanistan  0,83
> 1993  Afghanistan  0,83
> 1990  Algeria      5,62
> 1991  Algeria      5,88
> 1992  Algeria      6,17
> 1993  Algeria      5,92
>
> Reshaping the string variable works fine, though.
>
>
> Another problem I am having with encoded data is the following:
>
> The human development index is measured every 5 years (1990, 1995, 2000,
> 2005, 2009). I have stored it as a string variable and want to have it for
> the whole time period (1990-2009). In order to do this I just copy the
> values into the following years. For example, the HDI of 1990 is copied
> into 1991, 1992, 1993 and 1994. Here again, I start with encoding the data.
>
> encode v1990, gen(value1990)
>
> I am doing this for all 5 years.
>
> country  v1990  v1995  v2000  value1990  value1995  value2000
> Norway   0,838  0,869  0,906  0,838      0,869      0,906
> Austra   0,819  0,887  0,914  0,819      0,887      0,819
>
> The next step is to generate the missing years and to copy the value of
> the existing years.
>
> gen value1991 = value1990
>
> value1991
> 106
> 103
>
> In this case, Stata creates a ranking and doesn't copy the value.
>
>
> I am wondering how to deal with these problems. If I do these operations,
> the reshaping and copying, with the string variables, everything works
> fine, but I can't calculate with string variables and my aim is to fit a
> regression model.
> I have read in the help file that, if the string variable contains numeric
> values simply stored as strings, which is my case, I should use the
> destring command or generate with the real() function.
> I tried those, but they don't work:
>
> destring v1990, replace
>
> Stata says: v1990 contains nonnumeric characters; no replace
>
> and the same with destring, gen()
>
> In the case of generate value1990 = real(v1990)
>
> Stata just generates missings.
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
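
One possible workaround when -destring- rejects -dpcomma-, as reported above for Stata 10.0, is to swap the decimal comma for a period in the string variables and then -destring- as usual. This is only a sketch: it assumes that y1990-y2009 are still string variables holding values such as "0,43", and that the comma is used solely as a decimal separator (no thousands separators or other nonnumeric characters). It is not taken from the thread.

```stata
* Sketch only: assumes y1990-y2009 are string variables like "0,43"
* and that the comma appears solely as a decimal separator.
foreach v of varlist y1990-y2009 {
    * replace every comma with a period; "." as the last argument means all occurrences
    replace `v' = subinstr(`v', ",", ".", .)
}

* the strings now look like "0.43", so plain -destring- can convert them
destring y1990-y2009, replace
```

If the original string variables have already been dropped and only the encoded versions remain, -decode- can be used to recover the strings from the value labels before applying the same steps.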

**Follow-Ups**:
- **Re: st: RE: problems with encode** *From:* Maarten Buis <maartenlbuis@gmail.com>

**References**:
- **st: problems with encode** *From:* "Lukas Bösch" <L.Boesch@gmx.de>
- **st: RE: problems with encode** *From:* Nick Cox <n.j.cox@durham.ac.uk>
