

From: Fred Wolfe <fwolfe@arthritis-research.org>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Comparing overlapping groups
Date: Wed, 3 Oct 2012 04:31:53 -0500

Thanks very much, David.

Fred

On Tue, Oct 2, 2012 at 10:05 AM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Dear Fred,
>
> If the 4 definitions were mutually exclusive subsets, you could use a
> regression that has indicator variables for FM2, FM3, and FM4 (the
> constant term would handle FM1, or you could include an indicator for
> FM1 and turn off the constant). The result would be equivalent to a
> one-way analysis of variance with 4 groups.
>
> Since the definitions overlap (though you have not said how many of
> the overlaps are present in your data or the numbers of observations
> in the overlaps --- if all 2442 observations meet at least one of the
> 4 definitions, you could have as many as 15 subgroups), you could
> start with a regression model that has indicators for FM2, FM3, and
> FM4. The constant will give you an average for FM1, and the
> coefficients of the three indicators will give incremental effects,
> relative to FM1. The results may not be satisfactory, and they may be
> difficult to interpret. A better approach, along the lines of main
> effects and interactions, would also include indicators for each of
> the subsets that involve 2 or more of the definitions. Then, for
> example, you could get an estimate of the level of phq_sss among
> people who meet only FM1, an increment for people who meet both FM1
> and FM2, and further increments for people who meet FM1, FM2, and FM3
> and people who meet all 4 definitions.
>
> I hope this discussion is helpful.
>
> David Hoaglin
>
> On Tue, Oct 2, 2012 at 10:06 AM, Fred Wolfe
> <fwolfe@arthritis-research.org> wrote:
>> Dear Statalisters,
>>
>> I am analyzing a medical condition (FM) that has 4 different
>> definitions for the same condition. A person can be in 1 or more of
>> four definition-defined groups (FM1, FM2, FM3, FM4). There are 2442
>> observations.
>>
>> I am interested in the value of a dependent variable, phq_sss, according
>> to each group definition.
>>
>> For the first two definitions, I get these results
>>
>> . regress phq_sss i.wsp
>>
>>       Source |       SS       df       MS              Number of obs =    2442
>> -------------+------------------------------           F(  1,  2440) =  605.51
>>        Model |  7621.27967     1  7621.27967           Prob > F      =  0.0000
>>     Residual |  30711.1417  2440  12.5865335           R-squared     =  0.1988
>> -------------+------------------------------           Adj R-squared =  0.1985
>>        Total |  38332.4214  2441  15.7035729           Root MSE      =  3.5478
>>
>> ------------------------------------------------------------------------------
>>      phq_sss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        1.wsp |   6.247731   .2538992    24.61   0.000      5.74985    6.745611
>>        _cons |   2.728905   .0751615    36.31   0.000     2.581518    2.876292
>> ------------------------------------------------------------------------------
>>
>> . regress phq_sss i.mwsp
>>
>>       Source |       SS       df       MS              Number of obs =    2442
>> -------------+------------------------------           F(  1,  2440) =  229.25
>>        Model |  3292.19831     1  3292.19831           Prob > F      =  0.0000
>>     Residual |  35040.2231  2440  14.3607472           R-squared     =  0.0859
>> -------------+------------------------------           Adj R-squared =  0.0855
>>        Total |  38332.4214  2441  15.7035729           Root MSE      =  3.7896
>>
>> ------------------------------------------------------------------------------
>>      phq_sss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>       1.mwsp |   10.37138   .6849863    15.14   0.000     9.028161    11.71459
>>        _cons |   3.144753   .0771774    40.75   0.000     2.993413    3.296093
>> ------------------------------------------------------------------------------
>>
>> There are two additional definitions that are not shown.
>>
>> So the difference for group members as opposed to non-group members
>> in the two analyses above is:
>> wsp 6.2
>> mwsp 10.4
>> (there will be 2 other groups).
>>
>> My question is, how do I tell if the results are statistically
>> different between the 4 groups, given the overlapping membership in
>> the groups. I have a feeling that some sort of permutation test is the
>> way to get such an answer. I'd appreciate suggestions.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/
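
The "main effects and interactions" approach suggested in the quoted reply might look roughly like the following in Stata. This is only a sketch, not the analysis anyone in the thread actually ran. It uses wsp and mwsp, the two definition indicators that appear in the output above, and assumes both are coded 0/1; the other two definitions are not named in the thread, so they are left out rather than guessed.

```stata
* Sketch only. wsp and mwsp are the two definition indicators shown in the
* thread (assumed here to be coded 0/1); the other two definitions are not
* named there, so they are omitted.

* How many observations fall in each combination of the two definitions?
tabulate wsp mwsp

* Main effects plus interaction: one mean of phq_sss per membership pattern.
* Stata omits terms for any combination that has no observations.
regress phq_sss i.wsp##i.mwsp

* Estimated mean of phq_sss in each wsp-by-mwsp cell.
margins wsp#mwsp
```

With all four definitions the same pattern would extend to a four-way factorial, giving up to 15 non-empty membership subgroups when every observation meets at least one definition. Cells with few or no observations would make some terms inestimable, so the cross-tabulation is worth checking first.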

