Re: st: making quantiles based on controls

From   Austin Nichols <[email protected]>
To   [email protected]
Subject   Re: st: making quantiles based on controls
Date   Fri, 15 Oct 2010 00:02:31 -0400

Delahanty, Ryan <[email protected]>:

First, I don't think that code does anything like what you want; try:

webuse nhanes2, clear
g bmi=weight/height^2*10000
su bmi, d
generate bmiquart = 0
replace bmiquart=1        if             21<bmi<=23
replace bmiquart=2        if             23<bmi<=26
replace bmiquart=3        if             26<bmi
ta bmiquart, sum(bmi)
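
To spell out why the chained comparisons misbehave: Stata evaluates `21<bmi<=23` left to right as `(21<bmi)<=23`. The inner comparison yields 0 or 1, which is always <=23, so each such -replace- hits every observation, and in the run above everyone with bmi<=26 ends up in category 2. A minimal sketch on made-up values (the data and the variable names `chained` and `explicit` are for illustration only):

```stata
clear
input bmi
20
22
25
30
end
* Chained form: parsed as (21<bmi)<=23, i.e. 0<=23 or 1<=23, always true
generate byte chained = 21<bmi<=23
* Explicit form: true only when bmi lies in (21, 23]
generate byte explicit = bmi>21 & bmi<=23
list
```

The explicit two-sided condition (or `inrange()` when both ends are closed) is what the original -replace- lines would need.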

Next, anyone with a missing bmi has a bmi "higher" than any finite
value, so an open-ended comparison like the last is generally a bad idea.
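
To make the missing-value point concrete, here is a small sketch on made-up data (not the nhanes2 file; the names `open` and `capped` are illustrative). Stata sorts numeric missing (.) above every number, so `26<bmi` is true for a missing bmi unless the condition is capped:

```stata
clear
input bmi
22
27
.
end
* Open-ended: the missing bmi is counted as "greater than 26"
generate byte open = 26<bmi
* Capped: bmi<. excludes missing values from the top category
generate byte capped = 26<bmi & bmi<.
list
```

Adding `if bmi<.` to the -generate- instead would leave the new variable missing, rather than 0, wherever bmi is missing.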

Lastly, and more constructively, you might try a variant on:

webuse nhanes2, clear
g bmi=weight/height^2*10000
_pctile bmi if race==1, nq(100)
g qw20=(bmi<=r(r20)) if bmi<.
loc j 20
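* r(r100) is not returned with nq(100); it evaluates to missing,
* so qw100 reduces to bmi>r(r80) among nonmissing bmi
foreach i in 40 60 80 100 {
 g qw`i'=(bmi>r(r`j') & bmi<=r(r`i')) if bmi<.
 loc j=`i'
}
g qw=qw20+2*qw40+3*qw60+4*qw80+5*qw100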
ta qw race, col

or

webuse nhanes2, clear
g bmi=weight/height^2*10000
_pctile bmi if race==1, nq(5)
forv i=1/5 {
 g qw`i'=(bmi<=r(r`i')) if bmi<.
}
egen qw=rowtotal(qw?)
replace qw=6-qw
ta qw race, col

which could be turned into a -program- with a modicum of effort (and
can also be extended to allow svy weights easily).
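
One way such a -program- might look, as a rough sketch only: the name `refxtile`, its `generate()` and `nquantiles()` options, and the simulated data are all assumptions made for illustration, not an existing command. The -if-/-in- qualifiers select the reference group (e.g. controls) used to estimate the cutpoints, while every observation with a nonmissing value is classified.

```stata
capture program drop refxtile
program define refxtile
    version 11
    * -if-/-in- define the reference group; generate() names the new variable
    syntax varname(numeric) [if] [in], GENerate(name) [NQuantiles(integer 4)]
    confirm new variable `generate'
    if `nquantiles' < 2 {
        display as error "nquantiles() must be at least 2"
        exit 198
    }
    marksample touse
    * cutpoints come from the reference group only
    _pctile `varlist' if `touse', nq(`nquantiles')
    * everyone with a nonmissing value starts in the lowest group ...
    quietly generate int `generate' = 1 if `varlist' < .
    local last = `nquantiles' - 1
    * ... and moves up one group for each cutpoint exceeded
    forvalues i = 1/`last' {
        quietly replace `generate' = `i' + 1 if `varlist' > r(r`i') & `varlist' < .
    }
end

* Usage on simulated data: quartiles of bmi defined among controls (case==0)
clear
set seed 2010
set obs 1000
generate byte case = runiform() < 0.3
generate bmi = rnormal(26, 4) + 2*case
refxtile bmi if case==0, generate(bmiq) nquantiles(4)
tabulate bmiq case, column
```

Among controls the groups are of (nearly) equal size by construction; cases are spread according to where they fall relative to the control cutpoints. Since -_pctile- accepts weights, a weight could presumably be passed through in the same way, though that is not shown here.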

On Thu, Oct 14, 2010 at 11:46 AM, Delahanty, Ryan
<[email protected]> wrote:
> I spend way too much time turning things into tertiles, quartiles, etc.
> So say I have to split a continuous variable like bmi into quartiles based on CONTROL values, I have to do this:
> (In this example affection status of 1 = case, 0 = control).
> by affection, sort: summarize bmi, detail
> (which tells me the splits, and then I use them like so)
> generate bmiquart = 0
> replace bmiquart=1        if             21<bmi<=23
> replace bmiquart=2        if             23<bmi<=26
> replace bmiquart=3        if             26<bmi
> and then I generate the dummies for bmiquart as I do the regression with something like this:
> xi, prefix(): logistic affection age income education i.bmiquart
> But the above is kind of cumbersome when splitting data a billion different ways. I'd like to substitute the first five lines and instead use a command like this:
> xtile bmiquart = bmi, nquantiles(4)
> The problem with this is that this will just split it based on bmi for cases and controls. Is there a simple command or switch that will allow the quantiles to be generated based on the distribution in controls rather than both?
> Thanks,
> Ryan
