
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: pctile and xtile question again
Date: Thu, 17 Jan 2008 17:44:54 -0000

I have comments on two levels.

First, on how to do this. As always, it is easiest for list members to see code in terms of datasets everyone can use.

Your first bit seems rather indirect. I would use -centile- instead. Individual percentiles are left behind in memory as r-class results by -centile-. Thus you need not put them into a variable and then take them out again, or create any variables you only need for one purpose.

    . sysuse auto
    . centile weight, centile(70)
    . gen byte weight_group = weight > r(c_1) if weight < .

Then you can proceed directly to something like

    . egen mpg_group = xtile(mpg), by(weight_group) nq(3)
    . egen both_group = group(mpg_group weight_group), label

Remember the request to explain where non-official commands you use come from. Thus -egen, xtile()- is a user-written function (by Ulrich Kohler) in the -egenmore- package on SSC.

Extending this to two percentiles:

    . centile weight, centile(30 70)
    . gen byte weight_group = cond(weight < r(c_1), 1, cond(weight < r(c_2), 2, 3)) if weight < .

and you can proceed as before

    . egen mpg_group = xtile(mpg), by(weight_group) nq(3)
    . egen both_group = group(mpg_group weight_group), label

Note that in the auto dataset there are not in fact any missing values for -weight-, but excluding them explicitly is usually the right thing to do, and at worst does nothing. In fact, with two variables, a double restriction

    ... if weight < . & mpg < .

is usually the right thing to do, and at worst it does nothing and will not bite.

Second, on why you are doing this. It may be impertinent, but I am curious. Under what circumstances must you do precisely this? Categorisation by quantiles throws away data. Seemingly arbitrary quantiles or numbers of quantiles do that capriciously. When is this the right thing to do in any data analysis?

Nick
[email protected]

Rajesh Tharyan
==============

I have two variables x and y, which I have to put into 6 groups.
I am using the code below (Code I) to first cut the x variable into 2 groups based on its 70th percentile value. And then, for each group of the x variable, I cut the y variable into 3 equal groups, and finally put the two together to form the final six groups.

What I would like to do is cut the y variable for each group of x based on the 30th and 70th percentile values. The code (Code II) below is my present solution and it seems very complicated. Any suggestions are very much appreciated. Is it possible to cut at specified percentiles?

Code I

*************start********************
* this bit cuts the x variable into two groups based on the 70th percentile value
pctile xu=x, nq(10) genp(xx)
replace xu=. if xx~=70
sort xu
(Is this step necessary? I get slightly different numbers if I sort and when I do not sort; for example, for one group I get 481 with and 477 without sorting)
xtile xc = x, cutpoints(xu)
drop xx xu
* this bit cuts the y variable into three groups for each group of x
egen yc=xtile(y), by(xc) nq(3)
* forming the final 6 groups
gen gp=10*xc+yc
****************end*******************

Code II

************start*********
pctile xu=x, nq(10) genp(xx)
replace xu=. if xx~=70
sort xu
xtile xc = x, cutpoints(xu)
drop xx xu
pctile xmmu=y if xc==1, nq(10) genp(yy)
replace xmmu=. if yy~=30 & yy~=70
pctile xmmu1=y if xc==2, nq(10) genp(yy1)
replace xmmu1=. if yy1~=30 & yy1~=70
xtile yc=y if xc==1, cutpoints(xmmu)
xtile yc1=y if xc==2, cutpoints(xmmu1)
replace yc=yc1 if yc==. & xc==2
drop xmmu xmmu1 yc1 yy yy1
gen gp=10*xc+yc
***********end************

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
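
The question of cutting y at specified percentiles within each group of x can also be handled with -centile- inside a loop over the groups. What follows is only a minimal sketch on the auto dataset, assuming that weight stands in for x and mpg for y. The variable names weight_group, mpg_group and both_group are illustrative, and sending ties at a cut point to the upper mpg group is a choice made here, not something the thread prescribes.

```stata
sysuse auto, clear

* two groups of weight, split at its 70th percentile
* (values equal to the percentile fall in the lower group, coded 0)
centile weight, centile(70)
gen byte weight_group = weight > r(c_1) if weight < .

* within each weight group, cut mpg at its own 30th and 70th percentiles
gen byte mpg_group = .
forvalues g = 0/1 {
    centile mpg if weight_group == `g', centile(30 70)
    * assumption: values equal to a cut point go to the upper group
    replace mpg_group = cond(mpg < r(c_1), 1, cond(mpg < r(c_2), 2, 3)) ///
        if weight_group == `g' & mpg < .
}

* combine into six groups and check the cell counts
egen both_group = group(weight_group mpg_group), label
tabulate weight_group mpg_group
```

Only official commands are used, so nothing needs to be installed from SSC. The same loop extends to any other list of percentiles by changing centile() and nesting further cond() calls.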

**Follow-Ups**:
- **st: RE: RE: pctile and xtile question again** *From:* "Rajesh Tharyan" <[email protected]>

**References**:
- **st: Censored variables** *From:* "Andreas Drichoutis" <[email protected]>
- **st: pctile and xtile question again** *From:* "Rajesh Tharyan" <[email protected]>
