
Re: st: Segmenting a dataset

From	David Kantor <[email protected]>
To	[email protected]
Subject	Re: st: Segmenting a dataset
Date	Thu, 17 May 2007 16:10:14 -0400

At 03:46 PM 5/17/2007, Morrison Hodges wrote:

I have a dataset of 10 variables and 5000 observations. I need to calculate
the median of each variable in groups of 30 observations, i.e., the median
of each variable in observations 1-30, then the median for 31-60, then
61-90, etc. I know I can get the median from the p50 value of -summarize-,
but I'm not sure how to obtain consecutive segments of 30 observations each
to perform -summarize- on. Can anyone help?
Thanks, Morry Hodges

Do you just want to see what the medians are? If so, run -summarize- with the detail option on each block of observations:
summarize var1 var2 ... in 1/30, det
summarize var1 var2 ... in 31/60, det
etc.

You can do this in a loop, if you prefer:
forvalues j = 1(30)`=_N' {
summarize var1 var2 ... in `j'/`=min(`j'+29, _N)', det
}
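
If only the medians are of interest, the loop can pick them out of r(p50) rather than printing the full detail output. The following is a minimal sketch on a made-up toy dataset; the 100 observations and the contents of var1 and var2 are assumptions for illustration, not the poster's data, and -clear- drops whatever is in memory.

```stata
* Toy data standing in for the real dataset (assumption: names and values are placeholders)
clear
set obs 100
generate var1 = _n
generate var2 = 2 * _n

* Step through the data in blocks of 30; the last block may be shorter
forvalues j = 1(30)`=_N' {
    local last = min(`j' + 29, _N)
    foreach v of varlist var1 var2 {
        * -summarize, detail- leaves the median in r(p50)
        quietly summarize `v' in `j'/`last', detail
        display "`v', obs `j'-`last': median = " r(p50)
    }
}
```

The upper limit is `j'+29 rather than `j'+30 so that consecutive blocks do not share an observation.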

----

On the other hand, do you want the values deposited in the dataset? If so, first create a "group" variable that numbers the blocks of 30 observations (1 for observations 1-30, 2 for 31-60, and so on):
gen int group = ceil(_n / 30)

Now if you want the values deposited into the data as constants by group...
bysort group: egen med1 = median(var1)
and so on for the other variables.
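
The "and so on" can be written as a -foreach- loop over the variables. This is only a sketch on toy data: the variable names, the 100 observations, and the med_ prefix are assumptions for illustration. The group variable is built here so that observations 1-30 form group 1, 31-60 form group 2, and so on.

```stata
* Toy data standing in for the real dataset (assumption: names and values are placeholders)
clear
set obs 100
generate var1 = _n
generate var2 = 2 * _n

* Number the blocks of 30 consecutive observations: 1-30 -> 1, 31-60 -> 2, ...
generate int group = ceil(_n / 30)

* One new variable per input variable, holding the group median as a constant
foreach v of varlist var1 var2 {
    bysort group: egen med_`v' = median(`v')
}
```

Note that -bysort- sorts the data by group, so the order of observations within a group is not guaranteed to be preserved; generate an index such as gen long obsno = _n beforehand if the original order matters.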

If you want just a set of collapsed values...
collapse (median) med1 = var1 (median) med2 = var2 ... , by(group)
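
As a rough illustration of what -collapse- leaves behind, here is a sketch on toy data; the data and the extra count variable n are assumptions added for illustration. Keeping a count is one way to see that the final group may contain fewer than 30 observations. Note that -collapse- replaces the data in memory, so save the dataset or use -preserve- first.

```stata
* Toy data standing in for the real dataset (assumption: names and values are placeholders)
clear
set obs 100
generate var1 = _n
generate var2 = 2 * _n
generate int group = ceil(_n / 30)

* One row per group: the medians, plus the number of observations behind them
collapse (median) med1 = var1 med2 = var2 (count) n = var1, by(group)
list
```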

I hope this helps.
--David
