RE: st: RE: RE: Cut function


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: RE: Cut function
Date   Mon, 2 Aug 2010 18:01:56 +0100

Your problem keeps changing. 

Now it is to round into one-minute intervals. I see no reason why -floor()- and/or -ceil()- could not help. 
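
A minimal sketch of how -floor()- could do this, assuming the time variable is numeric and holds clock times as HHMMSS (so 93157 means 9:31:57). The variable names day, time and minute are illustrative only; the values are those in the example quoted below.

```stata
* Sketch only: day (string) and time (numeric HHMMSS) are assumed names
clear
input str9 day long time
"03jan2000" 93157
"03jan2000" 93201
"03jan2000" 93248
"03jan2000" 93305
"03jan2000" 93602
"03jan2000" 93805
"03jan2000" 94000
end

* drop the seconds: round down to the start of the minute, e.g. 93157 -> 93100
generate long minute = 100 * floor(time/100)

list day time minute
```

Rounding up with -ceil()- in the same way would need care under this encoding, because a time such as 95930 would become 96000, which is not a valid clock time. Converting to seconds or minutes since midnight first avoids that.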

It also appears that you want to insert extra observations. Why? What good would that do? 

Nick 
n.j.cox@durham.ac.uk 

Katia Bobulova

Dear Nick,

Thank you very much for your help, but I don't think that the ceil()
function is the right solution for my problem.

I will try to explain better what I am trying to do with an example.

    Data            Time
03jan2000       93157
03jan2000       93201
03jan2000       93248
03jan2000       93305
03jan2000       93602
03jan2000       93805
03jan2000       94000

I want to divide time into 1-min intervals, so I would like to have this result:

    Data            Time
03jan2000       93000
03jan2000       93100
03jan2000       93200
03jan2000       93200
03jan2000       93300
03jan2000       93400
03jan2000       93500
03jan2000       93600
03jan2000       93700
03jan2000       93800
03jan2000       93900
03jan2000       94000

The cut function doesn't work either, because in that case I would not
get intervals such as 93500 and 93700, which were not in
the "original" time.

Is there someone who can help me?
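
One possible way to obtain rows for minutes with no observations is sketched below. It is not a method proposed in this thread: it assumes time is numeric HHMMSS, counts observations per minute with -contract-, and then uses -tsset- and -tsfill- to add the absent minutes. The names day, time, minute, mins, ddate and n_obs are all illustrative.

```stata
* Sketch only: day (string) and time (numeric HHMMSS) are assumed names
clear
input str9 day long time
"03jan2000" 93157
"03jan2000" 93201
"03jan2000" 93248
"03jan2000" 93305
"03jan2000" 93602
"03jan2000" 93805
"03jan2000" 94000
end

* bin each time to the start of its minute, then count observations per bin
generate long minute = 100 * floor(time/100)
contract day minute, freq(n_obs)

* minutes since midnight, so that consecutive minutes differ by exactly 1
generate int mins = 60 * floor(minute/10000) + mod(floor(minute/100), 100)

* numeric daily date to serve as the panel identifier
generate int ddate = date(day, "DMY")
format ddate %td

* declare days as panels and minutes as time, then fill the gaps within each day
tsset ddate mins
tsfill

* the minutes that were added have no observations
replace n_obs = 0 if missing(n_obs)
replace minute = 10000 * floor(mins/60) + 100 * mod(mins, 60)

list ddate minute n_obs, sepby(ddate)
```

With these data the minutes 93400, 93500, 93700 and 93900 should appear with n_obs equal to 0. Note that -tsfill- only fills between the first and last observed minute of each day, so an interval before the first observation (such as 93000 in the desired result) would have to be added some other way.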

2010/7/28 Nick Cox <n.j.cox@durham.ac.uk>:
> The reference here to -ceiling()- was a typo. The ceiling function is implemented as -ceil()-, as another reference indicates. Sorry if that floored you.

Nick Cox

> You can have anything you like with zero observations: it just won't be visible in your dataset.
>
> More seriously:
>
> I don't understand your precise problem but I've never wanted to use -egen, cut()- (as compared with -egen, group()- which I use very frequently).
>
> If I want to coarsen, I always want to use constant intervals and to be totally clear which way things were rounded. I thus turn to -floor()- or -ceiling()-. Of course, other people often want otherwise.
>
> To give a non-time series example:
>
> . sysuse auto
> (1978 Automobile Data)
>
> . clonevar mpg2 = mpg
>
> . replace mpg2 = 5 * floor(mpg/5)
> (60 real changes made)
>
> . tab mpg2
>
>    Mileage |
>      (mpg) |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>         10 |          8       10.81       10.81
>         15 |         27       36.49       47.30
>         20 |         20       27.03       74.32
>         25 |         12       16.22       90.54
>         30 |          4        5.41       95.95
>         35 |          2        2.70       98.65
>         40 |          1        1.35      100.00
> ------------+-----------------------------------
>      Total |         74      100.00
>
> Thus with -floor()- you can round down; your definition will be transparent once you know what -floor()- does; and the resulting values will be automatically self-explanatory. Rounding up just requires -ceil()- instead.
>
> There is more at
>
> SJ-3-4  dm0002  . . . . . . . . Stata tip 2: Building with floors and ceilings
>        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>        Q4/03   SJ 3(4):446--447                                 (no commands)
>        tips for using floor() and ceil()
>
> Some years ago I suggested to StataCorp that -floor()- and -ceil()- be extended to allow two arguments so that -floor(mpg, 5)- would have the effect above, but while I am still waiting it's easy enough to apply rounding to any interval.
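
As a sketch of applying the same idea to any interval, the width can be held in a local macro so that rounding down and rounding up differ only in the function used. The macro name width and the new variable names are illustrative, not part of the original example.

```stata
* Sketch only: width, mpg_floor and mpg_ceil are illustrative names
sysuse auto, clear

* interval width; change this to round to a different interval
local width 5

* round down and round up to multiples of the width
generate mpg_floor = `width' * floor(mpg/`width')
generate mpg_ceil  = `width' * ceil(mpg/`width')

list mpg mpg_floor mpg_ceil in 1/5
```

The two results agree whenever mpg is already an exact multiple of the width, and otherwise differ by exactly one width.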

Katia Bobulova

> I used -egen, cut()- to divide the time into 5-min intervals.
>
> My 5-min intervals start from 9:30. However, on some days the first
> recorded time is later, for example 9:40.
>
> Is there a way to keep the time intervals that have zero observations?
