Re: st: RE: problems with encode


From   "Lukas Bösch" <L.Boesch@gmx.de>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: problems with encode
Date   Wed, 01 Jun 2011 16:40:38 +0200

Hello

Thank you for your quick reply.

Unfortunately, Stata does not accept the command you suggested.
If I enter:
           destring y1990-y2009, replace dpcomma
Stata says:
           option dpcomma not allowed

I am working with Stata 10.0. Might this be the reason why the command does not work?
I looked through all the options listed for -destring- in the Stata help before asking Statalist: generate, replace, ignore, force, float, percent. I tried all of them apart from ignore and force, but they don't work. They either create missing values, or Stata tells me:
contains nonnumeric characters; no generate/replace.

Thank you

Lukas
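
A possible workaround where -destring- rejects -dpcomma- (the option appears to have been added in a release later than Stata 10, which is an assumption worth checking against the version history): replace the decimal comma with a point in the string variables first, then run a plain -destring-. The sketch below uses a two-variable toy dataset built from the values quoted in this thread; the variable range y1990-y1991 stands in for y1990-y2009.

```stata
* Toy data: percentages stored as strings with a decimal comma
clear
input str8 country str4 y1990 str4 y1991
"Afghan"  "0,43" "0,43"
"Algeria" "6,31" "6,31"
end

* Swap the decimal comma for a point in every string variable
foreach v of varlist y1990-y1991 {
    replace `v' = subinstr(`v', ",", ".", .)
}

* The strings are now plain numbers, so -destring- needs no extra option
destring y1990-y1991, replace

describe y1990 y1991
list
```

This only works as written if the strings contain nothing but digits and a single decimal comma; thousands separators, blanks or percent signs would have to be stripped as well.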

-------- Original Message --------
> Date: Wed, 1 Jun 2011 11:08:46 +0100
> From: Nick Cox <n.j.cox@durham.ac.uk>
> To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
> Subject: st: RE: problems with encode

> As you have found out, -encode- will take values like "0,43" and map them
> to integers with value labels. They will look the same as before, but in
> principle the approach is quite wrong for such data. 
> 
> As you say, you need to -destring- such variables. What you need with your
> data is to spell out the -dpcomma- option. 
> 
> Note that with -destring- you can operate on several variables at once. 
> 
> . destring y1990-y2009, replace dpcomma
> 
> All this is documented in the help for -destring-. 
> 
> Nick 
> n.j.cox@durham.ac.uk 
> 
> Lukas Bösch
> 
> I am having problems working with data that were stored as string
> variables and that I converted to numeric variables with encode. In this
> case the % of a country's surface is stored as string variables
> (y1990-y2009), and I am encoding them into numeric variables (v1990-v2009).
> 
> Here is one example:
> 
> encode y1990, gen(v1990)
> 
> I did this for 1990-2009 and 130 countries, but only show the first two
> countries and the first 4 years.
> 
> country	y1990	y1991	y1992	y1993	v1990	v1991	v1992	v1993	
> Afghan	0,43	0,43	0,43	0,43	0,43	0,43	0,43	0,43	
> Algeria	6,31	6,31	6,31	6,31	6,31	6,31	6,31	6,31	
> 
> This seems to work fine, but when I reshape the data into long form, it
> doesn't work any more.
> 
> drop y1990-y2009;
> reshape long v, i(id) j(year);
> 
> year    country        v
> 1990    Afghanistan    0,83
> 1991    Afghanistan    0,83
> 1992    Afghanistan    0,83
> 1993    Afghanistan    0,83
> 1990    Algeria        5,62
> 1991    Algeria        5,88
> 1992    Algeria        6,17
> 1993    Algeria        5,92
> 
> Reshaping the string variable works fine, though.
> 
> 
> Another problem I am having with encoded data is the following:
> 
> The human development index is measured every 5 years (1990, 1995, 2000,
> 2005, 2009). I have stored it as a string variable and want to have it for
> the whole time period (1990-2009). In order to do this I just copy the
> values into the following years. For example, the HDI of 1990 is copied
> into 1991, 1992, 1993 and 1994. Here again, I start with encoding the data.
> 
> encode v1990, gen(value1990)
> 
> I am doing this for all 5 years.
> 
> country   v1990   v1995   v2000   value1990   value1995   value2000
> Norway    0,838   0,869   0,906   0,838       0,869       0,906
> Austra    0,819   0,887   0,914   0,819       0,887       0,819
> 
> The next step is to generate the missing years and to copy the value of
> the existing years.
> 
> gen value1991 = value1990
> 
> value1991
> 106
> 103
> 
> In this case, Stata creates a ranking and doesn't copy the value.
> 
> 
> I am wondering how to deal with these problems. If I do these operations,
> the reshaping and copying, with the string variables, everything works
> fine, but I can't calculate with string variables and my aim is to fit a
> regression model. 
> I have read in the help file that, if the string variable contains numeric
> values simply stored as strings, which is my case, I should use -destring-
> or generate with the real() function.
> I tried those, but they don't work:
> 
> destring v1990, replace
> 
> Stata says: v1990 contains nonnumeric characters; no replace
> 
> and the same with destring, gen()
> 
> In the case of generate value1990 = real(v1990)
> 
> Stata just generates missing values.
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
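
On the "ranking" reported in the quoted message: -encode- sorts the distinct strings alphabetically, numbers them 1, 2, ..., and attaches the original text as value labels, so a copy made with -generate- carries the integer code, not the decimal value. The sketch below illustrates this on a two-row toy dataset (country names and HDI values taken from the thread; the variable name code1990 is made up for illustration) and then converts the text itself with real() after swapping the comma for a point. real() returns missing for "0,838" because it does not read a comma as a decimal separator.

```stata
* Toy data: HDI stored as a string with a decimal comma
clear
input str9 country str5 v1990
"Norway"    "0,838"
"Australia" "0,819"
end

* -encode- stores integer codes; the strings survive only as value labels
encode v1990, gen(code1990)
list country v1990 code1990, nolabel

* Convert the text itself: replace the comma, then let real() parse it
gen double value1990 = real(subinstr(v1990, ",", ".", .))

* Copying now carries the actual value forward
gen double value1991 = value1990
list country value1990 value1991
```

With the full 130-country data the codes run much higher, which would be consistent with values such as 106 and 103 showing up in value1991.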

