

From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: which statistical analysis to use
Date: Wed, 18 Apr 2012 06:57:16 -0400

Dear Deborah,

The example of the 3 companies did not come through in your message.

If I understand the data, each company ranks the 7 skills that it considers most important, assigning a score of 1 to the most important and so on up to a score of 7 for the 7th most important (and a score of 0 to the 20 skills that it considers less important). Or does the order run from 7 to 1 (see below)? Thus, each company's response is a ranking (but of only 7 of the 27 skills). Your problem is not one of multiple responses (as when the instruction says, "Choose all that apply.").

I have not worked with such data on rankings, and I can't provide references to the statistical literature. (I'm curious, though, and will look when I get a chance.)

One complication is that, within a company, the 7 scores are correlated. Another issue is that a company's scores provide an ordering, but not a measure of importance on some absolute scale (such a measure would make the analysis easier).

A preliminary analysis could ignore the correlation and average the companies' scores for each skill, including the 0s. Because 0 is less than 1, though, the scoring should first be reversed so that the most important skill has a score of 7. Those means should be all right, but their standard errors will be wrong (maybe too large) because of the correlation.

It might also be of interest to make the response dichotomous, replacing the positive scores with 1. You could then see how many subsets of the 27 skills are actually present in your data and cluster the companies according to the subset that they consider important. If the set of companies has no structure (e.g., different industries), you would have a frequency count for the subsets. Similarly, you could count the companies that included a particular skill among their 7. If the companies have some structure, you could take that into account.

BTW, what chi-squared test did you try?
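As an illustration only, the preliminary summaries described above might be sketched as follows. This is in Python rather than Stata, the data are made up (3 companies, 5 skills, top 2 ranked, instead of 360 companies, 27 skills, top 7), and every name is hypothetical. It assumes the original coding gives 1 to the most important skill and 0 to skills that were not chosen. It computes only means and counts; no standard errors are attempted, since the scores within a company are correlated.

```python
from collections import Counter

# Hypothetical toy data, one row per company, one column per skill.
# Assumed coding: 1 = most important, TOP_K = least important of those
# chosen, 0 = not chosen.
TOP_K = 2
scores = [
    [1, 2, 0, 0, 0],
    [2, 0, 1, 0, 0],
    [1, 2, 0, 0, 0],
]


def reverse_score(row, top_k):
    """Recode so the most important skill gets top_k; unchosen skills stay 0."""
    return [top_k + 1 - s if s > 0 else 0 for s in row]


def mean_scores(rows):
    """Average each skill's score over all companies, zeros included."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def chosen_subset(row):
    """Dichotomize a response: the set of skill indices with a positive score."""
    return frozenset(i for i, s in enumerate(row) if s > 0)


# Mean reversed score per skill (higher = more important on average).
recoded = [reverse_score(r, TOP_K) for r in scores]
means = mean_scores(recoded)

# How often each distinct subset of chosen skills occurs.
subset_counts = Counter(chosen_subset(r) for r in scores)

# How many companies included each skill among their chosen ones.
skill_counts = [sum(1 for r in scores if r[i] > 0) for i in range(len(scores[0]))]

print(means)
print(subset_counts)
print(skill_counts)
```

With the real data, `TOP_K` would be 7 and each row would have 27 entries. If the original scores already run from 7 (most important) down to 1, the reversal step would be skipped.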
David Hoaglin

On Tue, Apr 17, 2012 at 7:59 AM, Deborah Beckers <deborahbeckers@hotmail.com> wrote:
> Hello everybody,
>
> I'm having a problem with statistical analysis for my thesis. I am using stata 11 for windows.
> My data consists of a survey filled in by 360 companies, and the question I want to use is a question where they get a list of 27 employee skills, and they have to choose the 7 most important skills, by giving them a score from 1 to 7. The other skills (which they find less important) are not given any score (they are zero in my data). The data for that question thus looks somewhat as follows (example for 3 companies, one row per company:
>
> My question is: what kind of statistical analysis should I do, and how, to find out whether certain skills are ranked as more (or less) important than others by the companies, and if this difference is significant?
> I tried to do this with a chi-square test for goodness of fit, but i got the error: too many variables specified.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

**References**: **st: which statistical analysis to use**, *From:* Deborah Beckers <deborahbeckers@hotmail.com>
