Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: which statistical analysis to use

From: David Hoaglin
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: which statistical analysis to use
Date: Wed, 18 Apr 2012 06:57:16 -0400

```
Dear Deborah,

The example of the 3 companies did not come through in your message.
If I understand the data, each company ranks the 7 skills that it
considers most important, assigning a score of 1 to the most important
and so on to the score of 7 for the 7th most important (and a score of
0 for the 20 skills that it considers less important).  Or does the
order run from 7 to 1 (see below)?

Thus, each company's response is a ranking (but of only 7 of the 27
skills).  Your problem is not one of multiple responses (as when the
instruction says, "Choose all that apply.").

I have not worked with such data on rankings, and I can't provide
references to the statistical literature.  (I'm curious, though, and
will look when I get a chance.)  One complication is that, within a
company, the 7 scores are correlated.  Another issue is that a
company's scores provide an ordering, but not a measure of importance
on some absolute scale (such a measure would make the analysis
easier).

A preliminary analysis could ignore the correlation and average the
companies' scores for each skill, including the 0s.  For that average
to make sense, the scores must run from 7 (most important) down to 1,
so that the 0 for an unranked skill falls below every ranked score.
Those means should be all right, but their standard errors will be
wrong (maybe too large) because of the correlation.

It might also be of interest to make the response dichotomous,
replacing the positive scores with 1.  You could then see how many
subsets of the 27 skills are actually present in your data and cluster
the companies according to the subset that they consider important.
If the set of companies has no structure (e.g., different industries),
you would have a frequency count for the subsets.  Similarly, you
could count the companies that included a particular skill among their
7.  If the companies have some structure, you could take that into
account.

BTW, what chi-squared test did you try?

David Hoaglin

On Tue, Apr 17, 2012 at 7:59 AM, Deborah Beckers
<deborahbeckers@hotmail.com> wrote:
> Hello everybody,
>
>
> I'm having a problem with statistical analysis for my thesis. I am using Stata 11 for Windows.
> My data consists of a survey filled in by 360 companies, and the question I want to use is a question where they get a list of 27 employee skills, and they have to choose the 7 most important skills, by giving them a score from 1 to 7. The other skills (which they find less important) are not given any score (they are zero in my data). The data for that question thus looks somewhat as follows (example for 3 companies, one row per company):
>
>
> My question is: what kind of statistical analysis should I do, and how, to find out whether certain skills are ranked as more (or less) important than others by the companies, and if this difference is significant?
> I tried to do this with a chi-square test for goodness of fit, but I got the error: too many variables specified.

```
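
The preliminary analyses suggested in the reply above (averaging the scores per skill after putting them on a 7-to-1 scale, then dichotomizing the responses and counting skills and subsets) can be sketched roughly as follows. This is only an illustration in plain Python, not Stata code from the thread. The toy data (3 companies, 5 skills, 3 ranked skills each) are invented because the original example did not come through, and the assumption that 1 means "most important" in the raw data is exactly the point the reply asks about. The names `scores`, `N_RANKED`, and `reverse_code` are hypothetical.

```python
from collections import Counter

# Hypothetical toy data: one row per company, one column per skill.
# The real survey has 360 companies, 27 skills, and 7 ranked skills each.
# ASSUMPTION: 1 = most important; 0 = skill not ranked by that company.
N_RANKED = 3
scores = [
    [1, 2, 3, 0, 0],
    [2, 1, 0, 3, 0],
    [1, 0, 2, 3, 0],
]


def reverse_code(row, n_ranked):
    """Flip ranks so the most important skill gets n_ranked; unranked stays 0."""
    return [n_ranked + 1 - s if s > 0 else 0 for s in row]


reversed_scores = [reverse_code(row, N_RANKED) for row in scores]
n_companies = len(scores)
n_skills = len(scores[0])

# Mean reversed score per skill, including the 0s for unranked skills.
mean_score = [
    sum(row[j] for row in reversed_scores) / n_companies
    for j in range(n_skills)
]

# Dichotomous response: 1 if the company ranked the skill at all, else 0.
chosen = [[1 if s > 0 else 0 for s in row] for row in scores]

# Number of companies that included each skill among their ranked skills.
skill_counts = [sum(row[j] for row in chosen) for j in range(n_skills)]

# Frequency of each distinct subset of skills (as a tuple of column indices).
subset_counts = Counter(
    tuple(j for j, flag in enumerate(row) if flag) for row in chosen
)

print("mean score per skill:", [round(m, 2) for m in mean_score])
print("companies per skill: ", skill_counts)
print("subset frequencies:  ", dict(subset_counts))
```

As the reply notes, the means from this kind of calculation ignore the correlation among a company's scores, so any standard errors computed naively from them would not be trustworthy.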