
From: John Bunge <jota.be@web.de>
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: Computation of Correlation Coefficients - COMPLEX
Date: Wed, 07 Nov 2007 18:36:20 +0100

Dear Nick and Bill,

Thank you for your replies and your advice, which I will keep in mind in the future. I think Bill misunderstood me, so I want to make my problem more explicit by giving a stylized version of my dataset:

    . list

         +---------------------------+
         | cid   deid   year   dec   |
         |---------------------------|
         |   1      1   1980    -1   |
         |   1      2   1980     0   |
         |   1      3   1980     1   |
         |   1      4   1980     1   |
         |   .      .      .     .   |
         |   .      .      .     .   |
         |   1   4000   1999    -1   |
         |   2      1   1980     0   |
         |   2      2   1980    -1   |
         |   2      3   1980     0   |
         |   2      4   1980     1   |
         |   .      .      .     .   |
         |   .      .      .     .   |
         |   2   4000   1999    -1   |
         |   .      .      .     .   |
         |   .      .      .     .   |
         | 200      1   1980    -1   |
         | 200      2   1980     0   |
         | 200      3   1980     0   |
         | 200      4   1980    -1   |
         |   .      .      .     .   |
         |   .      .      .     .   |
         | 200   4000   1999     1   |
         +---------------------------+

where cid is the number of the country (1-200), deid is the number of the decision (1-4000), and dec is the decision (1 for 'yes', 0 for 'abstain', and -1 for 'no').

(I do not know whether this table will be displayed the way I see it in my e-mail programme; I can only hope that it is shown accurately.)

The correlation coefficients (cc's) I want to compute for the decisions are: between country 1 and 2, between country 1 and 3, ..., between country 1 and 200, ..., between country 199 and 200. The total number of cc's will be (200*199)/2 = 19,900.

Note that I need these coefficients for every single year, not over all decisions during the whole period 1980-1999. So in the end I will have the coefficients for the country pair 1-2 (and for all other country pairs, too) for 1980, for 1981, ..., and for 1999. That is, I will have 19,900*20 = 398,000 coefficients.

Any helpful suggestions are highly appreciated. Sorry for having posed the problem too hastily the first time.

Best,
John.
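The per-year, per-country-pair computation described above could be set up roughly as follows. This is only a sketch under stated assumptions, not a tested solution for the real data: it assumes the long layout in the listing (variables `cid`, `deid`, `year`, `dec`, one row per country and decision), country numbers running from 1 to 200, and a hypothetical output file name `pair_corr`. It reshapes so that each country's decisions sit in their own variable, then loops over years and country pairs and posts each coefficient to a new dataset.

```stata
* Sanity check on the counts quoted above (19,900 pairs; 398,000 pair-years).
display comb(200, 2)
display comb(200, 2) * 20

* Assumption: data in memory are in the long layout cid, deid, year, dec.
* After this reshape there is one row per decision and one variable per
* country: dec1, dec2, ..., dec200.
reshape wide dec, i(year deid) j(cid)

* Collect results in a new dataset; "pair_corr" is a hypothetical file name.
tempname results
postfile `results' year cid1 cid2 rho using pair_corr, replace

levelsof year, local(years)
foreach y of local years {
    forvalues i = 1/199 {
        local first = `i' + 1
        forvalues j = `first'/200 {
            * capture skips pairs with no usable observations in that year
            * or with a country variable that does not exist.
            capture correlate dec`i' dec`j' if year == `y'
            if _rc == 0 {
                post `results' (`y') (`i') (`j') (r(rho))
            }
        }
    }
}
postclose `results'

* Inspect the stored coefficients.
use pair_corr, clear
list in 1/10
```

With 398,000 calls to `correlate` this is likely to be slow, and a pair in which one country gave the same answer to every decision in a year will have a missing coefficient, because the correlation is undefined when one variable is constant.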
-----------------------------------------------------------------
From: statalist@hsphsun2.harvard.edu
Sent: 07.11.07 17:03:04
To:
Subject: RE: st: Computation of Correlation Coefficients - COMPLEX

Like perhaps some others, I wasn't clear on what John wanted, but now that Bill has offered a concrete interpretation, I am going to complicate things by saying what my guess would be:

for each decision in each year:
    take all pairs of countries, and for each pair,
    create data points based on the scores given: (-1, -1), (-1, 0), etc.
    work out the correlation for that bivariate set.

I am not clear why that might be interesting or useful, but it's what John's words seem to imply.

I agree with Bill's general advice.

Nick
n.j.cox@durham.ac.uk

William Gould

John Bunge <jota.be@web.de> wrote,

> I want to compute correlation coefficients within the following setting:
>
> I have the variables CID (No. of country; 1-200), DEID (No. of decision;
> 1-4000), YEAR and DEC (no, abstain, yes; expressed as: -1,0,1).
>
> There were several decisions per year in which more or less all the
> countries took part.
>
> I want to compute the correlation in these decisions between every country
> pair for all single years.

I think John has a dataset that looks something like this,

    . list

         +-------------------------+
         | cid   deid   year   dec |
         |-------------------------|
      1. |   1      1   1990     1 |
      2. |   1      2   1990     0 |
      3. |   1      1   1991     0 |
      4. |   1      2   1991    -1 |
      5. |   1      1   1992     0 |
         |-------------------------|
      6. |   1      2   1992     1 |
      7. |   2      1   1990     0 |
      8. |   2      2   1990    -1 |
      9. |   2      1   1991     1 |
     10. |   2      2   1991     1 |
         |-------------------------|
     11. |   2      1   1992    -1 |
     12. |   2      2   1992     0 |
         +-------------------------+

As I understand it, John wants to correlate dec in 1990 with 1991, 1990 with 1992, etc., matching decisions on (cid, deid).

The answer is, of course, -correlate-, but -correlate- correlates variables in the same observation. So I need a dataset with dec in 1990, 1991, and 1992 in the same observation.
The first step is to convert the data to the wide form:

    . reshape wide dec, i(cid deid) j(year)
    (note: j = 1990 1991 1992)

    Data                               long   ->   wide
    ---------------------------------------------------------------------
    Number of obs.                       12   ->       4
    Number of variables                   4   ->       5
    j variable (3 values)              year   ->   (dropped)
    xij variables:
                                        dec   ->   dec1990 dec1991 dec1992
    ---------------------------------------------------------------------

Now the data look like this,

    . list

         +------------------------------------------+
         | cid   deid   dec1990   dec1991   dec1992 |
         |------------------------------------------|
      1. |   1      1         1         0         0 |
      2. |   1      2         0        -1         1 |
      3. |   2      1         0         1        -1 |
      4. |   2      2        -1         1         0 |
         +------------------------------------------+

and I can obtain the correlations by typing

    . corr dec*
    (obs=4)

                 |  dec1990  dec1991  dec1992
    -------------+---------------------------
         dec1990 |   1.0000
         dec1991 |  -0.4264   1.0000
         dec1992 |   0.0000  -0.8528   1.0000

-- Bill
wgould@stata.com

P.S. John also wrote,

> two days ago I posted a query, unfortunately there came no reply
> on it.

and added, "If the problem is not expressed clearly, please give me advice".

I would like to do just that, not only for John, but for others who ask questions that do not receive an answer.

In this case, John asked the question too concisely. John had an excellent summary of his problem, but didn't go the extra step of including an example to make it easy for me to answer his question. Instead, I HAD TO CONCOCT THE EXAMPLE and I spent more time doing that than actually answering the question.

Questioners, understand: For those of us answering questions, the satisfaction is in the answering. We are loath to work on the asking part.

John had a dynamite opening. I love it when questions are concise, because then I can quickly decide whether I have anything to contribute. To make it more likely John received an answer, however, John then needed to continue to set the problem up for me. Give me a small example.
Make everything explicit so that all I have to do is say, type this.

After John's concise intro, he could have added,

    For instance, here's a small dataset with 2 countries, 3 years,
    and 2 decisions:

        <insert listing here>

    What I want is the correlation of decisions in 1990 and 1991,
    1990 and 1992, and 1991 and 1992, calculating the correlation
    across country. For instance, the correlation in 1990 and 1991
    would be based on the correlation of

        dec in 1990   dec in 1991
        --------------------------
             1             0       <- from obs 1 & 3; cid=1, deid=1
             0            -1       <- from obs 2 & 4; cid=1, deid=2
             0             1       <- etc...
            -1             1

Remember, when asking a question, you are playing on our sympathies and our desire to show off. Those who answer cannot help but be more sympathetic when it appears you have invested time in formulating the question.

I hope this is helpful.

--
John Bunge
Debt and Finance Analysis Unit
United Nations Conference on Trade and Development (UNCTAD)
Palais des Nations
1211 Genève 10
Switzerland
Office: +41 229175902
Mobile: +41 762901769

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
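In the spirit of Bill's advice to post a small, explicit example, his two-country, three-year illustration can be written as a self-contained do-file. This is a sketch that simply types in the twelve observations from his listing with `input` and then repeats his `reshape` and `correlate` steps; nothing here goes beyond what the thread already shows.

```stata
* Recreate Bill's 12-observation example dataset.
clear
input cid deid year dec
1 1 1990  1
1 2 1990  0
1 1 1991  0
1 2 1991 -1
1 1 1992  0
1 2 1992  1
2 1 1990  0
2 2 1990 -1
2 1 1991  1
2 2 1991  1
2 1 1992 -1
2 2 1992  0
end

* Long -> wide: one row per (cid, deid), one dec variable per year.
reshape wide dec, i(cid deid) j(year)
list

* Correlate the yearly decision variables across the four (cid, deid) rows.
correlate dec*
```

Posting a block like this alongside a question lets anyone answering start from the same data with a single paste.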

**Follow-Ups**: **RE: st: Computation of Correlation Coefficients - COMPLEX** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>

