Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nahla Betelmal <nahlaib@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sign test output
Date: Thu, 17 Jan 2013 10:21:32 +0000

Dear Nick,

Thank you for the comments. The variable I am testing is not binary, and the literature of my field is concerned with whether the mean (median) of this variable is different from zero. So U is the mean if the variable is normally distributed, or the median if the distribution is not normal.

From my readings in statistics, I know that in order to decide whether to use parametric or non-parametric tests, the normality of the data should be checked first. Shapiro-Wilk is used to test normality when the number of observations is less than 30. Otherwise, we should use Kolmogorov-Smirnov for a large sample (as in my sample).

So, when the test accepts the null (normality), we should use the parametric test (i.e. the t-test), which examines the mean. On the other hand, if the null of normality is rejected, we should use the non-parametric test (the sign test) instead, which examines the median (as in my case).

Also, regarding the comment about "robust", I meant exactly what you said (I used the term loosely).

Thanks for suggesting that I read again; I certainly will.

Many thanks again

Nahla

On 17 January 2013 09:49, Nick Cox <njcoxstata@gmail.com> wrote:
> Your t-test is testing a quite different hypothesis. If the two states
> 0 and 1 of a binary variable have equal frequencies, then its mean is
> 0.5, not 0.
>
> That aside, the t-test can not be more appropriate for a binary
> variable than what you have done already, and this is predictable in
> advance, as a distribution with two distinct states is not a normal
> distribution. You do not need a Kolmogorov-Smirnov test to tell you
> that.
>
> For the record, what I suggested is best not described as a robust
> test. It was calculating a confidence interval, and I showed that for
> your data the result was robust to the method of calculation, meaning
> merely not sensitive. The word "robust" was used informally.
>
> You never define what you mean by u, so I am not commenting on any
> details about u.
>
> I recommend that you read (or re-read) a good introductory text on
> statistics, as you appear confused on some basic matters.
>
> Nick
>
> On Thu, Jan 17, 2013 at 7:52 AM, Nahla Betelmal <nahlaib@gmail.com> wrote:
>
>> Thank you Maarten and Nick for the great help.
>>
>> So, in this case I would reject the null in favour of the alternative
>> u>0 as p value 0.000. However, using t-test on the same sample
>> provided the opposite (i.e. accept the null).
>>
>> ttest DA_T_1 == 0
>>
>> One-sample t test
>> ------------------------------------------------------------------------------
>> Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
>> ---------+--------------------------------------------------------------------
>>   DA_T_1 |     346    1.564346     1.68628    31.36663   -1.752338     4.88103
>> ------------------------------------------------------------------------------
>>     mean = mean(DA_T_1)                                           t =   0.9277
>> Ho: mean = 0                                     degrees of freedom =      345
>>
>>     Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
>>  Pr(T < t) = 0.8229         Pr(|T| > |t|) = 0.3542          Pr(T > t) = 0.1771
>>
>>
>> I think this is due to the distribution of the sample, so I performed
>> K-S normality test. It shows that data is not normally distributed,
>> hence I should use the non-parametric sign test instead of t-test. In
>> other words I would reject the null u=0 in favor of u>0 , right?
>>
>>
>> ksmirnov DA_T_1 = normal((DA_T_1-DA_T_1_mu)/ DA_T_1_s)
>>
>> One-sample Kolmogorov-Smirnov test against theoretical distribution
>>            normal((DA_T_1-DA_T_1_mu)/ DA_T_1_s)
>>
>>  Smaller group       D       P-value  Corrected
>>  ----------------------------------------------
>>  DA_T_1:             0.4878    0.000
>>  Cumulative:        -0.4330    0.000
>>  Combined K-S:       0.4878    0.000      0.000
>>
>>
>> N.B. Thank you so much Nick for the robust test you mentioned, I will
>> use that as well
>>
>> Many thanks
>>
>> Nahla
>>
>> On 16 January 2013 09:33, Nick Cox <njcoxstata@gmail.com> wrote:
>>> In addition, it could be as or more useful to think in terms of
>>> confidence intervals. With this sample size and average, 0.5 lies well
>>> outside 95% intervals for the probability of being positive, and that
>>> is robust to method of calculation:
>>>
>>> . cii 346 221
>>>
>>>                                                          -- Binomial Exact --
>>>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
>>> -------------+---------------------------------------------------------------
>>>              |        346    .6387283    .0258248        .5856497    .6894096
>>>
>>> . cii 346 221, jeffreys
>>>
>>>                                                          ----- Jeffreys -----
>>>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
>>> -------------+---------------------------------------------------------------
>>>              |        346    .6387283    .0258248        .5871262    .6880204
>>>
>>> . cii 346 221, wilson
>>>
>>>                                                          ------ Wilson ------
>>>     Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
>>> -------------+---------------------------------------------------------------
>>>              |        346    .6387283    .0258248        .5868449    .6875651
>>>
>>> Nick
>>>
>>> On Wed, Jan 16, 2013 at 9:13 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>>>> On Wed, Jan 16, 2013 at 9:38 AM, Nahla Betelmal wrote:
>>>>> I have generated this output using non-parametric test "one sample
>>>>> sign test" with null: U=0 , & Ua > 0
>>>>>
>>>>> However, I do not understand the output. where is the p-value? is it
>>>>> 0.5 in all cases or the 0.000 ( as in the first and third cases) and
>>>>> 1.000 as in the second case?
>>>>>
>>>>> . signtest DA_T_1= 0
>>>>>
>>>>> Sign test
>>>>>
>>>>>         sign |    observed    expected
>>>>> -------------+------------------------
>>>>>     positive |         221         173
>>>>>     negative |         125         173
>>>>>         zero |           0           0
>>>>> -------------+------------------------
>>>>>          all |         346         346
>>>>>
>>>>> One-sided tests:
>>>>>   Ho: median of DA_T_1 = 0 vs.
>>>>>   Ha: median of DA_T_1 > 0
>>>>>       Pr(#positive >= 221) =
>>>>>          Binomial(n = 346, x >= 221, p = 0.5) =  0.0000
>>>>
>>>> The p-value is the last number, so in your case 0.0000. The stuff
>>>> before the p-value tells you how it is computed: it is based on the
>>>> binomial distribution, and in particular it is the chance of observing
>>>> 221 successes or more in 346 trials when the chance of success at each
>>>> trial is .5. For this test this chance is the p-value, and it is very
>>>> small, less than 0.00005. If you type in Stata -di binomialtail(346,
>>>> 221, 0.5)- you will see that this chance is 1.381e-07, i.e.
>>>> 0.0000001381.
> *
> * For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

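The one-sided sign-test p-value discussed in the thread is a binomial upper tail, so it can be cross-checked outside Stata. The following is a minimal Python sketch, assuming only the counts shown in the signtest output (346 non-zero observations, 221 of them positive). The function name `sign_test_upper_tail` is hypothetical and is not part of Stata or of any library.

```python
from math import comb


def sign_test_upper_tail(n_nonzero: int, n_positive: int) -> float:
    """Return P(X >= n_positive) for X ~ Binomial(n_nonzero, 0.5).

    Under the null hypothesis that the median is zero, each non-zero
    observation is positive with probability 0.5, so every one of the
    2**n_nonzero sign patterns is equally likely.
    """
    # Count the sign patterns with at least n_positive positive signs.
    favourable = sum(comb(n_nonzero, k) for k in range(n_positive, n_nonzero + 1))
    # Exact integer arithmetic until the final division avoids underflow.
    return favourable / 2 ** n_nonzero


# Counts taken from the signtest output quoted in the thread.
p_value = sign_test_upper_tail(346, 221)
print(f"{p_value:.3e}")
```

The printed value should agree with the figure of about 1.381e-07 that -di binomialtail(346, 221, 0.5)- is reported to give in the thread, which Stata rounds to 0.0000 in the signtest output.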
**Follow-Ups**:
- **Re: st: sign test output** *From:* Maarten Buis <maartenlbuis@gmail.com>
- **Re: st: sign test output** *From:* Nick Cox <njcoxstata@gmail.com>

**References**:
- **st: sign test output** *From:* Nahla Betelmal <nahlaib@gmail.com>
- **Re: st: sign test output** *From:* Maarten Buis <maartenlbuis@gmail.com>
- **Re: st: sign test output** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: sign test output** *From:* Nahla Betelmal <nahlaib@gmail.com>
- **Re: st: sign test output** *From:* Nick Cox <njcoxstata@gmail.com>
