
Re: st: estimating expenditure quartiles for subgroups of survey dat

From   Steven Samuels <>
Subject   Re: st: estimating expenditure quartiles for subgroups of survey dat
Date   Tue, 17 Jun 2008 13:36:39 -0400


Dan, this sounds like a useful approach, especially if there are additional covariates. Your analysis plan has changed from

1. Display and compare quantiles of your subgroups.

to

1. Find the best fitting gamma model for each subgroup.
2. Assess how well the models fit.
3. If the models fit, compare the parameters of the best fitting models from each subgroup.
4. If the parameters differ, describe how the fitted distributions differ (e.g. with density plots).

To assess the gamma fits, you can use -qgamma- and -pgamma-, both downloadable from SSC. Also, -streg- will fit a generalized (three-parameter) gamma, which can serve as an additional check on the two-parameter fit. If the gamma model fits poorly in one or more of the subgroups, you can always go back to your original plan.
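
A minimal sketch of these checks for one subgroup, assuming hypothetical variable names -expend- (expenditure) and -group- (subgroup indicator coded 0/1). The bare -pgamma- and -qgamma- calls are an assumption about their syntax, so check each command's help file for options. Recent Stata releases name the generalized gamma distribution(ggamma); older releases used distribution(gamma) for the same model. The sketch ignores weights and clustering.

```stata
* Hypothetical variable names: expend, group (0/1)
* gammafit, pgamma and qgamma are user-written; install them from SSC first

* Two-parameter gamma fit for subgroup 0, users only
gammafit expend if group == 0 & expend > 0

* Probability and quantile plots of the data against a fitted gamma
pgamma expend if group == 0 & expend > 0
qgamma expend if group == 0 & expend > 0

* Generalized (three-parameter) gamma as an additional check
stset expend if expend > 0
streg if group == 0, distribution(ggamma)
```

Repeating the same commands with group == 1 gives the second set of estimates and plots to compare.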


On Jun 14, 2008, at 7:08 AM, wrote:

Thanks, Steve, for some valuable suggestions.

Reflecting on this, it occurs to me that I could use gammafit to test the
location (and shape) parameters of the subpops' expenditures -- the gamma
appears to describe the distribution for people with expenditure -- and a
logit to test the probability of use.
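
A rough sketch of that two-part idea, assuming hypothetical variable names -expend-, -group- (coded 0/1), -wt- (sampling weight) and -psu- (cluster identifier). The -gammafit- options shown (alphavar(), betavar(), cluster() and pweights) are written from memory of its help file and should be treated as assumptions to verify there.

```stata
* Hypothetical variable names: expend, group (0/1), wt, psu

* Part 1: probability of any use, by subgroup
generate byte anyuse = expend > 0 if !missing(expend)
svyset psu [pweight = wt]
svy: logit anyuse group

* Part 2: gamma parameters among users, each allowed to differ by subgroup
* (option names assumed; see -help gammafit-)
gammafit expend [pweight = wt] if expend > 0, ///
    alphavar(group) betavar(group) cluster(psu)
```

The coefficients on -group- in the two gamma equations then indicate whether the fitted parameters differ between the subgroups.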

I also dug back into the archives and came across a posting by Nick Cox
(distribution fitting curiosity, December 2006) that gets to the same end
by a different route.

Date: Fri, 13 Jun 2008 10:36:00 -0400
From: Steven Samuels <>
Subject: Re: st: estimating expenditure quartiles for subgroups of survey

I assume that you used -pctile- to compute your weighted quartiles.

I would not recommend hypothesis tests for percentiles of descriptive
survey data with clustering and weights, even if I knew what tests to
use (I don't). The distributions, including percentiles, of several
finite populations will never be identical, and null hypotheses of
equality are false a priori. (The exception is hypotheses about
superpopulations.) Your question appears to be: how different are
the expenditure distributions in the subpopulations? If so, I think
that confidence intervals are a better approach. Download Roger
Newson's -somersd- package from SSC. It contains -cendif-, which
will find confidence intervals for pairwise differences in
percentiles and will accept probability weights and clusters.
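
As an illustration only, a call might look like the following, assuming hypothetical variable names -expend-, -group- (exactly two values), -wt- and -psu-. With more than two subgroups, restrict each call to one pair with an -if- qualifier.

```stata
* Hypothetical variable names: expend, group (two values), wt, psu
ssc install somersd

* Confidence intervals for differences in quartiles between the two groups
cendif expend [pweight = wt], by(group) centile(25 50 75) cluster(psu)
```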

Confining yourself to a small set of quantiles could mislead. If
sample size permits, enlarge the set of percentiles that you feed to
-pctile- and -cendif-. You might also check weighted histograms for
multiple modes and other anomalies.
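
A sketch of what that could look like, with hypothetical variable names -expend-, -group- (coded 0/1), -wt- and -psu-. Because -histogram- accepts only frequency weights, the sampling weights are rounded to integers here; that is an approximation chosen for this example, not something prescribed above.

```stata
* Hypothetical variable names: expend, group (0/1), wt, psu

* Weighted percentiles 5, 10, ..., 95 for one subgroup; results in r(r1)-r(r19)
_pctile expend [pweight = wt] if group == 0, percentiles(5(5)95)
return list

* Confidence intervals for differences at the same percentiles
cendif expend [pweight = wt], by(group) centile(5(5)95) cluster(psu)

* Approximate weighted histograms, one panel per subgroup
generate long fw = round(wt)
histogram expend [fweight = fw], by(group)
```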

-Steve
