If I were using the estimated parameter for each individual as the
outcome variable in a regression, then I would probably not correct the
confidence intervals for the regression parameters using the sample
standard errors of the individual tau-as. The standard errors of the
individual tau-as are calculated assuming that the multiple bivariate
(X,Y)-pairs for each individual are sampled independently from a
sub-population of such bivariate measurements belonging to that
individual. Even if that assumption is not quite true, the individual
tau-as, and their joint distribution with other individual-specific
variables, may still be useful to know about.
Having said that, I am curious to know why anybody would want to use the
individual tau-as as the outcome variable in a regression model, and
what question they might hope to answer by doing this.
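
For concreteness, the kind of second-stage regression being discussed
might look like the sketch below. The variable names correl, age and
female are hypothetical and are not taken from the original dataset. No
adjustment based on the standard errors of the individual tau-as is
applied.

```stata
* Assumption: one observation per individual, with the estimated tau-a
* stored in correl and two hypothetical covariates, age and female.
regress correl age female
```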
I hope this helps.
Best wishes
Roger
Roger Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
www.imperial.ac.uk/nhli/r.newson/
Opinions expressed are those of the author, not of the institution.
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Kelvin Foo
Sent: 17 November 2006 22:52
To: [email protected]
Subject: Re: st: RE: RE: running -ktau- and storing its results for each
observation
Thanks Roger, I'll try out the packages you suggested.
Your mention of confidence intervals and P-values has set me thinking
about a statistical issue which I previously did not consider... After
computing Kendall's tau parameter for each individual, I plan to use
them as the dependent variable in my estimation command, mainly
ordinary least squares regressions. After running the estimation
command, will I need to adjust the standard errors of the estimated
coefficients (since confidence intervals and p-values would mean that
the ktau estimate is not 100% precise)?
I know that if the explanatory variable is derived from some previous
estimation, then its standard error in the second-step regression has
to be corrected, by bootstrapping or other methods. I'm not sure if
the same applies for the dependent variable. Can anyone
comment/advise?
Thanks.
Kelvin
On 11/17/06, Newson, Roger B <[email protected]> wrote:
> I would agree that you should start by reshaping the data to long.
> However, the next step might be to use -parmby- (part of the -parmest-
> package) together with -somersd- (part of the -somersd- package) to
> create an output dataset (or resultsset) with 1 observation per
> individual ID and data on estimates, confidence intervals and P-values
> for that individual's tau-a parameter. That way, you have a confidence
> interval and a P-value for each estimated individual tau-a, not just an
> estimate.
>
> The -somersd- and -parmest- packages can be downloaded from SSC using
> the -ssc- command. More documents about -somersd- and -parmest- can be
> downloaded from my website (see my signature below).
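
A minimal sketch of the -parmby- and -somersd- approach described above
follows. It assumes the data have already been reshaped to long form,
with one row per individual-item pair and variables named identifier,
rank90 and rank05 (names taken from the original question). The output
variable names estimate, min95, max95 and p are the -parmest- defaults as
I understand them, so check them against the package help files.

```stata
* somersd and parmest are user-written packages, installed from SSC.
ssc install somersd
ssc install parmest

* Replace the data in memory with one observation per individual,
* holding that individual's tau-a estimate, confidence limits and P-value.
* The taua option requests Kendall's tau-a instead of Somers' D.
parmby "somersd rank90 rank05, taua", by(identifier) norestore

* Inspect the resultsset (variable names assumed to be parmest defaults).
list identifier parm estimate min95 max95 p
```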
>
> I hope this helps.
>
> Roger
>
>
> Kelvin Foo
>
>> I have a dataset where individuals rank items from 1 to 10 in the
>> years 1990 and 2005. I would like to find the Kendall's tau statistic
>> for each individual's rankings of items between these two years, and
>> store the results in a new variable. In my dataset, the individuals
>> are the observations and the rankings are stored in 20 variables,
>> named A_90, B_90, ..., J_90, A_05, B_05, ..., J_05. A to J are the 10 items
>> and '90', '05' are the years.
>>
>> How can I carry out this task?
>>
>> My guess is to first use -reshape- to have 1990's rankings stored in
>> one variable (rank90), and 2005's rankings in another variable
>> (rank05). Each individual would have an identifier number associated
>> with him, and this will appear 10 times in the reshaped long format.
>>
>> Next, I was thinking of running
>>
>> generate correl=. // variable for storing Kendall tau results
>> by identifier: ktau rank90 rank05
>>
>> But how do I get Stata to store the r(tau_a) result in correl for each
>> individual before moving on to the next one?
>>
>> Or is there an alternative way in which I can find the Kendall tau for
>> each observation, given that my rankings are stored across different
>> variables?
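
One possible way to do what the question asks, using only official Stata
commands, is sketched below. It assumes that identifier is a numeric ID
variable and that the wide data hold the rankings in A_90 to J_90 and
A_05 to J_05, as described in the question. The new names rank90A and so
on are introduced here only so that -reshape- can use the item letter as
the suffix. This is an illustration, not a tested solution for the
actual dataset.

```stata
* Rename so that the item letter becomes the suffix that -reshape- strips off.
foreach item in A B C D E F G H I J {
    rename `item'_90 rank90`item'
    rename `item'_05 rank05`item'
}

* Long form: one row per individual-item pair (identifier, item, rank90, rank05).
reshape long rank90 rank05, i(identifier) j(item) string

* Run -ktau- separately for each individual and store r(tau_a) in correl.
generate double correl = .
levelsof identifier, local(ids)
foreach i of local ids {
    quietly ktau rank90 rank05 if identifier == `i'
    quietly replace correl = r(tau_a) if identifier == `i'
}
```

After the loop, correl takes the same value on all 10 rows belonging to
an individual, so one row per individual can be kept for later analysis.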