



st: Re: Very small sample and multivariate analysis?


From   "Joseph Coveney" <jcoveney@bigplanet.com>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Re: Very small sample and multivariate analysis?
Date   Tue, 18 Sep 2012 11:27:01 +0900

Yupa wrote:

I need some advice. I have a dataset with 25 observations and three
variables: a biomarker (continuous variable), a first dummy coded 0/1
(group) and a second dummy coded 0/1.
The distribution of the biomarker isn't normal. I found a statistically
significant difference in the biomarker level between group 0 and group 1
with the Mann-Whitney test, but not with the t test.
A referee asked for a multivariate analysis to account for the
contribution of the second dummy variable...
Which approach/analysis should I consider?

--------------------------------------------------------------------------------

In addition to Jay's advice, you can try one of the approaches below:

1. If you're not including a term for interaction, then rank-transform the 
   biomarker values and perform a two-way factorial ANOVA on the ranks.

       egen double biomarker_rank = rank(biomarker)
       anova biomarker_rank first_dummy second_dummy

   Note: if you're interested in *stratifying* on the second dummy, then you can
   use -emh-, a user-written command that can generalize the Mann-Whitney test
   to do this.  It can be installed from SSC (-ssc install emh-).

      emh biomarker first_dummy, anova strata(second_dummy) trans(modridit)

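2. Transform the biomarker so that the residuals after -regress- or -anova- are
   approximately normal to your satisfaction.  For many biomarkers, a logarithmic
   transform is a good place to start.  (-ln()- returns missing for zero, so
   the example adds 1 first; that offset is unnecessary if all values are
   positive.)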

      generate double ln_biomarker = ln(biomarker + 1)
      regress ln_biomarker i.first_dummy i.second_dummy
      predict double ln_biomarker_res, residuals
      qnorm ln_biomarker_res
      pnorm ln_biomarker_res

3. Use a permutation (randomization) test.

      program define tester
          version 12.1

          anova `0'
          test first_dummy
      end

      permute biomarker F=r(F), reps(1000) nodots ///
          seed(`=date("2012-09-18", "YMD")'): ///
          tester biomarker first_dummy second_dummy

    Again, if you're stratifying on second_dummy, then use -permute- with
    its -strata()- option, so that biomarker values are shuffled only within
    each level of second_dummy.
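
    As a rough sketch of that stratified version, assuming the -tester-
    program defined above and the same variable names, the call might look
    something like the following.  The seed is carried over from the example
    above and is arbitrary.

      * Sketch only: reuses -tester- from above; permutes biomarker
      * within levels of second_dummy rather than across all 25 rows
      permute biomarker F=r(F), reps(1000) nodots ///
          strata(second_dummy) ///
          seed(`=date("2012-09-18", "YMD")'): ///
          tester biomarker first_dummy second_dummy

    With only 25 observations, some strata-by-group cells may be very small,
    so it's worth tabulating first_dummy against second_dummy before relying
    on the stratified result.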

Joseph Coveney


