Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: RE: re. strings and non-string: the month case |
Date | Thu, 21 Nov 2013 18:17:29 +0000 |
What's more your previous variable is garbage, as first you -encode-d it without a label specified, so that "April" meant 1 "August" meant 2 etc. and then you defined the labels so that 1 means "January". This means that "April" originals are now "January" "August" originals are now "February" etc. so make sure you start again and destroy the false creations.

Nick
njcoxstata@gmail.com

On 21 November 2013 17:36, Joe Canner <jcanner1@jhmi.edu> wrote:
> As Richard Goldstein noted, you must use the -label- option, which implies that you must also put the label definition before the -encode- statement:
>
> . label define label_mon 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
> . enc Month, gen(MONTHS_NUM) label(label_mon)
> . label values MONTHS_NUM label_mon
>
> Otherwise, encode will use the alphabetical order of months; hence, April is #1, August is #2, etc...
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of PAPANIKOLAOU P.
> Sent: Thursday, November 21, 2013 12:24 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: RE: st: RE: re. strings and non-string: the month case
>
> Dear All,
> Thank you so much.
> I have followed your suggestions through: 1. Use the encode with the generate option, creating a numerical variable. Use label define and label values, see below. Following this, I run the tab command to get the frequencies, which are presented in the correct order (jan-Dec).
> But, when I do label list, I do not understand why the list is in different order (1 April, 2 August and so forth) and with the full names of the month. I would have expected to see that the label list would present the months in the jan-dec order. I am confused why this has happened. Would you please let me know.
> Many thanks,
> Panos
>
> enc Month, gen(MONTHS_NUM)
>
> label define label_mon 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
>
> . label values MONTHS_NUM label_mon
>
> . tab MONTHS_NUM
>
>       Month |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>         Jan |        393        8.92        8.92
>         Feb |        330        7.49       16.41
>         Mar |        340        7.72       24.13
>         Apr |        348        7.90       32.03
>         May |        379        8.60       40.64
>         Jun |        376        8.54       49.17
>         Jul |        368        8.35       57.53
>         Aug |        418        9.49       67.01
>         Sep |        423        9.60       76.62
>         Oct |        328        7.45       84.06
>         Nov |        347        7.88       91.94
>         Dec |        355        8.06      100.00
> ------------+-----------------------------------
>       Total |      4,405      100.00
>
> . label list MONTHS_NUM
> MONTHS_NUM:
>            1 April
>            2 August
>            3 December
>            4 February
>            5 January
>            6 July
>            7 June
>            8 March
>            9 May
>           10 November
>           11 October
>           12 September
>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox
>> Sent: Thursday, November 21, 2013 11:22 AM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: RE: re. strings and non-string: the month case
>>
>> understanding what it does and doesn't do is important.
>>
>> Joe is creating a daily date from monthly information alone and then extracting the month. That works because it is what is wanted, but the dates themselves are 1 Jan 1960, 1 Feb 1960, and so forth as without a day of the month Stata assumes 1 and without a year it assumes 1960.
>> You discard that day and year information in this case, but know how the trick was done.
>>
>> Nick
>> njcoxstata@gmail.com
>>
>>
>> On 21 November 2013 16:04, Joe Canner <jcanner1@jhmi.edu> wrote:
>>> Panos,
>>>
>>> In conjunction with -encode-, as suggested by Richard Goldstein, you might want to use the following statement to convert month names to month numbers:
>>>
>>> . gen month_num=month(date(MONTH,"M"))
>>
>> PAPANIKOLAOU P.
>>
>>> I have got a string variable: MONTH, where records the months of the year (e.g., January, February and so forth) on micro-data basis.
>>> Would you please assist me how best I would attach a set of label values where 1 refers to January, 2 is for February and so forth.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
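
Pulling the thread's advice together, the "start again" route might look like the sketch below. It is not code posted by anyone in the thread. It assumes the original string variable is called Month and holds full English month names, as the -label list- output above suggests. Under that assumption the label text has to be the full names too: with labels such as "Jan", -encode- would find no match for "January" and, by default, would add new codes to the label instead. The -noextend- option is included so that -encode- refuses to encode rather than extend the label silently. The `///` line continuations work in a do-file, not when typing interactively.

```stata
* Sketch only: assumes the string variable is named Month and holds
* full English month names ("January", "February", ...).

* Destroy the earlier, wrongly labelled results.
* -capture- suppresses the error if something does not exist.
capture drop MONTHS_NUM
capture label drop MONTHS_NUM
capture label drop label_mon

* Define the value label first; its text must match the strings in Month.
label define label_mon 1 "January" 2 "February" 3 "March" 4 "April" ///
    5 "May" 6 "June" 7 "July" 8 "August" 9 "September" 10 "October"  ///
    11 "November" 12 "December"

* Encode using that label. -noextend- stops -encode- from adding
* extra codes for strings not found in label_mon (e.g. misspellings).
encode Month, generate(MONTHS_NUM) label(label_mon) noextend

* Check: both listings should now run January to December.
label list label_mon
tabulate MONTHS_NUM
```

Because -encode- attaches label_mon to the new variable itself, a separate -label values- call is not needed afterwards. If short names are preferred for display, the label text can be changed once the encoding is done, for example with `label define label_mon 1 "Jan" 2 "Feb", modify` and so on for the remaining months.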