st: RE: Sorting Data

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Sorting Data
Date	Wed, 13 Nov 2002 17:33:23 -0000

Mkenda A.F
>
> I have the following type of data that I want to sort in a
> way that I will explain.
>
> 		Table 1 ( Year 1980)
> Country     vote1   vote2   vote3   vote4   vote5 etc
> USA          Y       N       Y       Y       A
> Albania      N       Y       N       N       Y
> Algeria      N       Y       Y       A       Y
> Argentina    Y       N       Y       Y       Y
> ETC
>
>
> This data is of the UN general assembly voting on different
> resolutions and
> it repeats itself for twenty years (1980 to 2000).
>
> I want to construct a variable for each country to represent
> the number of times that each country votes in line with the
> way the USA votes. I am thinking of doing the following:
>
> In the first step, to assign one (unit) to each vote that is
> the same as the vote cast (or not cast) by the USA and zero
> to each vote that differs from the US vote on the same
> resolution.
> Thus the data above would look like:
>
> 		Table 2 (Year 1980)
> Country   vote1   vote2    vote3   vote4    vote5  etc
> Albania    0       0         0       0       0
> Algeria    0       0         1       0       0
> Argentina  1       1         1       1       0
>
>
> Thereafter I am thinking of making a row-wise summation to get a new
> variable for the frequency that
> a country voted in tune with the USA.
>
>
> I'll very much appreciate if anyone can suggest to me a
> simple program for
> reorganising the
> data from Table 1 above to Table 2 and making the row-wise
> summation for
> each country for each year.

I am not clear how you are holding the multiple years,
so I will focus on one way of dealing with your
example for one year:

. l

       Country      vote1      vote2      vote3      vote4      vote5
  1.       USA          Y          N          Y          Y          A
  2.   Albania          N          Y          N          N          Y
  3.   Algeria          N          Y          Y          A          Y
  4. Argentina          Y          N          Y          Y          Y
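
To reproduce the listing above, the example could be typed in
with -input-. This is only a sketch: the storage types (str9,
str1) are my assumption from the values shown, not something
given in the question.

```stata
* Sketch: recreate the one-year example so the steps below can be run.
* Storage types are assumed from the values in the listing.
clear
input str9 Country str1 vote1 str1 vote2 str1 vote3 str1 vote4 str1 vote5
"USA"       "Y" "N" "Y" "Y" "A"
"Albania"   "N" "Y" "N" "N" "Y"
"Algeria"   "N" "Y" "Y" "A" "Y"
"Argentina" "Y" "N" "Y" "Y" "Y"
end
list
```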

First we -reshape- to long:

. reshape long vote , i(Country)

. l

       Country         _j       vote
  1.   Albania          1          N
  2.   Albania          2          Y
  3.   Albania          3          N
  4.   Albania          4          N
  5.   Albania          5          Y
  6.   Algeria          1          N
  7.   Algeria          2          Y
  8.   Algeria          3          Y
  9.   Algeria          4          A
 10.   Algeria          5          Y
 11. Argentina          1          Y
 12. Argentina          2          N
 13. Argentina          3          Y
 14. Argentina          4          Y
 15. Argentina          5          Y
 16.       USA          1          Y
 17.       USA          2          N
 18.       USA          3          Y
 19.       USA          4          Y
 20.       USA          5          A

Now we want to flag the USA. In this example,
it is last alphabetically, but in the full
data set Zambia, Zimbabwe, etc., would sort
after it. So we flag it with an indicator
variable:

. gen byte isUSA = Country == "USA"

Now within each vote (indexed by -_j-,
created by -reshape-) we compare
each country's vote with the USA's.
Because our indicator variable is
included in the sort, the USA will be
last in each block, so -vote[_N]-
is the USA's vote:

. bysort _j (isUSA) : gen agreeswithUSA = vote == vote[_N]

. l

       Country         _j       vote     isUSA  agreesw~A
  1.   Albania          1          N         0          0
  2.   Algeria          1          N         0          0
  3. Argentina          1          Y         0          1
  4.       USA          1          Y         1          1
  5.   Albania          2          Y         0          0
  6.   Algeria          2          Y         0          0
  7. Argentina          2          N         0          1
  8.       USA          2          N         1          1
  9.   Albania          3          N         0          0
 10.   Algeria          3          Y         0          1
 11. Argentina          3          Y         0          1
 12.       USA          3          Y         1          1
 13.   Albania          4          N         0          0
 14.   Algeria          4          A         0          0
 15. Argentina          4          Y         0          1
 16.       USA          4          Y         1          1
 17.   Albania          5          Y         0          0
 18.   Algeria          5          Y         0          0
 19. Argentina          5          Y         0          0
 20.       USA          5          A         1          1
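
The comparison with -vote[_N]- is only right if every block of
-_j- contains exactly one USA observation. A minimal check,
assuming the variables are named as above, might be:

```stata
* Sketch: within each resolution, the last observation must be the USA
* and the one before it (if any) must not be, i.e. exactly one USA row.
bysort _j (isUSA) : assert isUSA[_N] == 1
bysort _j (isUSA) : assert isUSA[_N - 1] == 0 if _N > 1
```

If either -assert- fails, some resolution has no USA vote recorded,
or has more than one, and -agreeswithUSA- should not be trusted
for that block.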

Now we -reshape- back to wide:

. reshape wide vote agreeswithUSA , i(Country) j(_j)

and sum the agreement indicators across variables, within each observation:

.  egen Nagreements = rsum(agrees*)

. l Country vote? Nagreements

       Country      vote1      vote2      vote3      vote4      vote5  Nagreem~s
  1.   Albania          N          Y          N          N          Y          0
  2.   Algeria          N          Y          Y          A          Y          1
  3. Argentina          Y          N          Y          Y          Y          4
  4.       USA          Y          N          Y          Y          A          5
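
For several years, the same idea should carry over if the years
are stacked in one data set. The following is a sketch under
assumptions of mine, not something I know about your data: one
observation per country and year in the original wide layout, a
variable named -year- (hypothetical), and vote1, vote2, ... padded
with empty strings in years with fewer resolutions.

```stata
* Sketch only: -year- and -resolution- are assumed names.
* Start from the wide layout, one observation per Country and year.
reshape long vote , i(Country year) j(resolution)

* drop padding for resolutions that do not exist in a given year
drop if vote == ""

gen byte isUSA = Country == "USA"

* within each year and resolution, the USA sorts last
bysort year resolution (isUSA) : gen byte agreeswithUSA = vote == vote[_N]

* running sum within country and year, then keep the final total
bysort Country year : gen Nagreements = sum(agreeswithUSA)
by Country year : replace Nagreements = Nagreements[_N]
```

This leaves the data long, with -Nagreements- repeated on every
observation for a country and year. If the USA has no recorded
vote on some resolution, -vote[_N]- will be another country's
vote, so that case needs checking first.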

There's more on -by:- in Stata Journal 2(1), 2002 and
more on -reshape- in [R] reshape and the Stata FAQs.
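
If -reshape- is inconvenient, the same count can probably be had
in the original wide layout with a loop. This is an alternative
sketch, not the method shown above; it assumes you start again
from the wide data and that vote* matches only the vote variables.

```stata
* Sketch: count agreements with the USA without reshaping.
* Assumes the original wide data, with exactly one USA observation.
gen byte isUSA = Country == "USA"
sort isUSA
* the USA is now the last observation, so `v'[_N] is its vote
gen Nagreements = 0
foreach v of varlist vote* {
    quietly replace Nagreements = Nagreements + (`v' == `v'[_N])
}
list Country Nagreements
```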


Nick
[email protected]
