st: RE: simple program to compare spearman coeffs using bootstrap - please help!


From   "Chong, Qi Lin Andrew" <qchong@middlebury.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: simple program to compare spearman coeffs using bootstrap - please help!
Date   Thu, 29 Apr 2010 02:12:02 -0400

Hi all, 

Sorry to bother you again. This is a rudimentary program for what I'm trying to do. I have put the data in the form eff1a eff1b eff2a eff2b, where these belong to the same observation (in this case, they are four efficiency scores attached to one firm). I just want to compute a Spearman coefficient for eff1a eff1b and for eff2a eff2b separately, and then take the difference.

Can anyone help with the basic problems in this program? 

capture program drop spearmandiff
program define spearmandiff, rclass
spearman eff1a eff1b
return scalar def1=r(rho)
spearman eff2a eff2b
return scalar def2=r(rho)
gen diff = def1-def2
save diff 
end

bootstrap diff=r(diff), reps(1000) : spearmandiff

The program is wrong, but it illustrates that I just want to draw a sample, compute the Spearman coefficient for the first two variables and for the last two, and then take the difference. The final bootstrap result should give me a mean difference and an SD.

I basically just want to calculate the statistical significance of the difference between the two Spearman coefficients. I am not sure 1) how to save the different values of diff that I get from the different bootstrap samples, and 2) how to make sure they end up in the final bootstrap output. Will this give me the confidence interval I need to evaluate statistical significance?
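
A minimal sketch of one way the program above might be repaired, offered as an illustration rather than a definitive answer. It assumes the goal is a bootstrap confidence interval for rho(eff1a, eff1b) minus rho(eff2a, eff2b), resampling whole firms. The simulated data, the seeds, and the file name bsdiff are assumptions made only so the example runs on its own. The key changes are that the difference is returned with -return scalar- instead of being generated as a variable, and that -bootstrap- is asked for r(diff).

```stata
* Hypothetical toy data standing in for four efficiency scores per firm
clear
set seed 12345
set obs 200
generate eff1a = 1 + abs(rnormal())
generate eff1b = eff1a + 0.5*abs(rnormal())
generate eff2a = 1 + abs(rnormal())
generate eff2b = eff2a + 1.5*abs(rnormal())

capture program drop spearmandiff
program define spearmandiff, rclass
    tempname rho1 rho2
    * Spearman's rho for the first pair, held in a temporary scalar
    spearman eff1a eff1b
    scalar `rho1' = r(rho)
    * Spearman's rho for the second pair
    spearman eff2a eff2b
    scalar `rho2' = r(rho)
    * Return the difference; no variable is generated and no file is saved here
    return scalar diff = `rho1' - `rho2'
end

* Resample firms (rows) 1,000 times and collect r(diff) from each resample;
* saving() writes one observation per replicate to bsdiff.dta
bootstrap diff = r(diff), reps(1000) seed(2010) saving(bsdiff, replace): spearmandiff

* Report all available bootstrap confidence intervals for the difference
estat bootstrap, all
```

The replicate values end up both in the -bootstrap- output table (observed difference and bootstrap standard error) and in the saved dataset. An interval that excludes zero would suggest the two coefficients differ, under the usual bootstrap assumptions.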

Thanks for any help you can render! 

Andrew 

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of Roger Newson [r.newson@imperial.ac.uk]
Sent: Wednesday, April 28, 2010 8:28 AM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: comparing differences in Kendall's tau or Spearman's coefficient using somersd and/or bootstrap

Yes, this sounds like what you ought to be doing. Correlations between
X1 and Y1 can only be compared with correlations between X2 and Y2 using
firms with all 4 variables.
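
A short sketch of what restricting to firms with all 4 variables could look like in Stata before any bootstrapping. The toy data and the variable names eff1a eff1b eff2a eff2b (taken from the question) are illustrative assumptions.

```stata
* Hypothetical toy data: four scores per firm, some values set to missing
clear
set seed 12345
set obs 100
generate eff1a = 1 + abs(rnormal())
generate eff1b = 1 + abs(rnormal())
generate eff2a = 1 + abs(rnormal())
generate eff2b = 1 + abs(rnormal())
replace eff2b = . in 1/10

* Keep only firms observed on all four variables, so that both
* correlations are computed on the same set of firms
keep if !missing(eff1a, eff1b, eff2a, eff2b)
count
```

Without this step, -spearman- would drop missing values separately for each pair, so the two coefficients could be based on different firms.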

I hope this helps.

Best wishes

Roger


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

On 28/04/2010 05:37, Chong, Qi Lin Andrew wrote:
> Hi all,
>
> thanks a lot to Dr. Newson for responding to my question. To him and all others, I am unsure how I should proceed with the bootstrap command. As Dr. Newson explained accurately, I am "trying to measure the tau-a correlation between "Efficiency definition 1" (ED1) in Year A and ED1 in Year B, and then measure the tau-a correlation between "Efficiency Definition 2" (ED2) in Year A and ED2 in Year B, and then calculate a confidence interval for the difference between the 2 taus."
>
> If I were to reorder the data set so that each firm has "ED1 YearA" "ED1 YearB" "ED2 YearA" "ED2 YearB" lined up next to each other, i.e. ed1a ed1b ed2a ed2b, how should I go about estimating this interval?
>
> If it is easy to give me the exact commands, I would very much appreciate it, but if not, any advice would be much appreciated. I am trying to calculate the statistical significance of their difference, so should I be drawing bootstrap samples of firms with ed1a ed1b ed2a ed2b all together, and then calculating one difference between ed1a ed1b & ed2a ed2b for each sample?
>
> Would this involve some kind of programming using bootstrap alone and w/o somersd? Also, does it matter that Spearman is in terms of RANKS, so if I happen to draw the same sample twice from say the 6000 observations I have, then there would be a tie in the bootstrap drawn sample (6000 obs from original 6000 but with replacement)? Would this tie be a problem?
>
> Many thanks for your help!
>
> Andrew
>
>
>
> ________________________________________
> From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of Roger Newson [r.newson@imperial.ac.uk]
> Sent: Monday, April 26, 2010 6:08 AM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: comparing differences in Kendall's tau or Spearman's coefficient using somersd
>
> As I understand it, you are trying to measure the tau-a correlation
> between "Efficiency definition 1" (ED1) in Year A and ED1 in Year B, and
> then measure the tau-a correlation between "Efficiency Definition 2"
> (ED2) in Year A and ED2 in Year B, and then calculate a confidence
> interval for the difference between the 2 taus.
>
> The somersd package unfortunately does not yet do this. The examples in
> the manual are more similar to the case where there is a third
> efficiency definition (ED3), which is thought to be a "gold standard".
> We would then compare the tau-a between ED3 and ED1 with the tau-a
> between ED3 and ED2, and find out which is greater.
>
> To do what you want to do, the best answer will be to use either the
> jackknife or the bootstrap. Both of these are available in Stata.
> However, your query has drawn attention to an important limitation of
> the -somersd- package. It would be an improvement if -somersd- could
> output the delta-jackknife pseudovalues or the delta-jackknife influence
> function to an output dataset (or resultsset) with 1 observation per
> cluster and data on the pseudovalues or influence function. This is one
> of many improvements I would like to make to -somersd- when I have the time.
>
> I hope this helps.
>
> Best wishes
>
> Roger
>
> On 25/04/2010 19:11, Chong, Qi Lin Andrew wrote:
>> The scores are bounded from 1 to infinity, and a half-normal
>> distribution has been assumed for them in the original stochastic
>> frontier (and a normal distribution for the errors). A score of say
>> 1.25 indicates a firm incurs 25% more costs than the most efficient
>> firm it can be compared
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

