


Re: st: Missing observations

From   David Hoaglin
Subject   Re: st: Missing observations
Date   Fri, 21 Jun 2013 07:53:36 -0400

One answer to the question "What is not clear about this?" is "Almost
everything."  The people who are trying to help are groping in the
dark, because we do not know enough about the data and the aims of the
analysis.

"9 variables" suggests that the data are 9 measures on the same units
(corporations?  mutual funds?), in which some measures are missing on
some (even many) units.  OTOH, "1-yr raw return 'group'" suggests that
each variable contains data on a separate group (i.e., a particular
unit belongs to exactly one group).  These are completely different
types of data, usually analyzed in different ways.  I do not
immediately see time series, as Nick did; that is yet another type of
data.

We do not know what form your data take, so we are trying to guess.
You know your data, and you are trying to puzzle out how our
suggestions could apply to your data (though they may not be
applicable).  This process is very inefficient for all concerned
(count the messages in this thread, and ask how much progress has been
made).

Please describe the data and the aim of the analysis in enough detail
that we can at least understand the problem.  If the actual details
are sensitive and cannot be shared, please describe an equivalent set
of data that does not have such restrictions.

David Hoaglin

On Thu, Jun 20, 2013 at 2:44 PM, Csaba Kertai wrote:
> Thank you Nick. Could you let me know what is not clear about this, please? Let me explain what I want to do in another way. I have 9 variables, each having a different number of values. These 9 variables are return variables (e.g. 1-year raw return, 2-year raw return, etc.), and I need to compare the means, medians, 25th/75th/90th percentiles, and the percentage of positive values (within one 'group') of these variables to see whether, say, the median difference between the 1-yr raw return 'group' and the 2-yr raw return 'group' is significant. For this, I have to use traditional parametric tests (i.e. the t-test) and non-parametric bootstrapping.
> Could you help me with this, please? I've been scouring the Internet for a solution to testing percentile differences but it seems that there's not much on this particular issue.
> There are basically three things I cannot get my head round: how to test the difference in medians between 2 'groups' (I tried 'signrank' and 'signtest', but these are paired tests), the difference in percentiles between two 'groups', and the difference in the percentage of positive values between 2 'groups'.
> So you say that one solution could be to stack the 9 variables on top of each other and then group them by, say, inserting a second column (grouping variable) with numbers that will identify the 9 groups?
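
To make the stacking idea in the quoted question concrete, here is a minimal sketch in Python (not Stata), offered only as an illustration. It assumes the return variables can be treated as independent groups, which is exactly the point this thread has not settled; if the 9 measures come from the same units, a paired resampling scheme would be needed instead. The column names `ret1` and `ret2`, the helper functions, and the numbers are all hypothetical.

```python
import random
import statistics

def stack(columns):
    """Stack unequal-length columns into (group, value) rows, dropping missing values (None)."""
    return [(g, v) for g, col in columns.items() for v in col if v is not None]

def share_positive(values):
    """Fraction of values that are strictly positive."""
    return sum(v > 0 for v in values) / len(values)

def bootstrap_diff(x, y, stat, reps=2000, seed=12345):
    """Percentile-bootstrap 95% interval for stat(x) - stat(y).

    Assumption: x and y are independent samples, so each is resampled separately.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        bx = rng.choices(x, k=len(x))  # resample with replacement within group x
        by = rng.choices(y, k=len(y))  # resample with replacement within group y
        diffs.append(stat(bx) - stat(by))
    diffs.sort()
    lo = diffs[int(0.025 * reps)]
    hi = diffs[int(0.975 * reps) - 1]
    return stat(x) - stat(y), (lo, hi)

# Made-up illustration data: two return variables with missing values.
columns = {
    "ret1": [0.12, -0.05, 0.30, None, 0.08, -0.11, 0.02],
    "ret2": [0.25, 0.10, None, None, -0.20, 0.40, 0.15],
}
rows = stack(columns)  # long format: one (group, value) pair per non-missing value
ret1 = [v for g, v in rows if g == "ret1"]
ret2 = [v for g, v in rows if g == "ret2"]

median_diff, median_ci = bootstrap_diff(ret1, ret2, statistics.median)
share_diff, share_ci = bootstrap_diff(ret1, ret2, share_positive)
print("median difference:", median_diff, "95% interval:", median_ci)
print("difference in share positive:", share_diff, "95% interval:", share_ci)
```

Any statistic that takes a list of values can be passed as `stat`, so the same function could handle a 25th, 75th, or 90th percentile. An interval that excludes zero would suggest a difference, subject to the independence assumption stated above.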
