Stata The Stata listserver

Re: st: data manipulation help


From   Toyoto Iwata <[email protected]>
To   [email protected]
Subject   Re: st: data manipulation help
Date   Fri, 13 Aug 2004 15:45:37 +0900

Dear Yumin,

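This is a least-elegant-but-it-works method.

1. Suppose "insheet" gives you data like this, with the real column
names sitting in the first row. The code below moves them into the
variable names.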
 
     +----------------------------------------------+
     |         v1           v2     v3     v4     v5 |
     |----------------------------------------------|
  1. |    country     variable   1960   1961   2002 |
  2. | Afganistan          Aid      1      2      3 |
  3. | Afganistan   Population      4      5      6 |
  4. | Afganistan         Wage      7      8      9 |
  5. |    Albania          Aid     10     11     12 |
     |----------------------------------------------|
  6. |    Albania   Population     13     14     15 |
  7. |    Albania         Wage     16     17     18 |
  8. |      China          Aid     19     20     21 |
  9. |      China   Population     22     23     24 |
 10. |      China         Wage     25     26     27 |
     +----------------------------------------------+

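/* Rename the year columns after the values in row 1: v3 -> v1960, etc. */
foreach v of varlist v3 - v5 {
    local a = `v'[1]
    rename `v' v`a'
}

/* Rename the first two columns likewise: v1 -> country, v2 -> variable. */
foreach v of varlist v1 - v2 {
    local a = `v'[1]
    rename `v' `a'
}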

drop in 1
(1 observation deleted)

list 
/* Output omitted. I recommend running list after each of the
steps below to see what it does. */
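
Step 1 can be tried without the WDI file. This is only a sketch on
made-up data: the "input" block stands in for what "insheet" returned
above, and it assumes the year columns came in as numeric.

```stata
clear
* Toy stand-in for the insheet result: row 1 holds the real column names.
input str12 v1 str12 v2 v3 v4 v5
"country" "variable" 1960 1961 2002
"Albania" "Aid"        10   11   12
"Albania" "Wage"       16   17   18
end

* Year columns get a "v" prefix because a name cannot start with a digit.
foreach v of varlist v3-v5 {
    local a = `v'[1]
    rename `v' v`a'
}

* The text in row 1 of v1 and v2 is already a valid variable name.
foreach v of varlist v1-v2 {
    local a = `v'[1]
    rename `v' `a'
}

* Row 1 has done its job.
drop in 1
describe
```

The varlist in each "foreach" is expanded before the loop starts, so
renaming inside the loop does not disturb it.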

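2. Reshape the yearly data to long form.
gen id = _n
gen vid = mod(id - 1,3) + 1 
/* "vid" numbers the variables within each country (1 = Aid,
2 = Population, 3 = Wage) and is used in step 3. It assumes every
country has exactly these three rows, in this order. */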

reshape long v, i(id) j(year)
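
The question mentions that some countries have more variables than
others, and in that case numbering rows with mod() can mislabel them.
One possible alternative, sketched on made-up data, numbers the
variables by name with egen's group() function. The name "vid_mod" is
hypothetical and appears only for comparison.

```stata
clear
* Toy data: Albania has no Population row.
input str12 country str12 variable
"Afghanistan" "Aid"
"Afghanistan" "Population"
"Afghanistan" "Wage"
"Albania"     "Aid"
"Albania"     "Wage"
end

gen id = _n

* Position-based numbering: correct only if every country has 3 rows.
gen vid_mod = mod(id - 1, 3) + 1

* Name-based numbering: the same variable name always gets the same id.
egen vid = group(variable)

list country variable vid_mod vid
```

Here Albania's Wage row gets vid_mod == 2 but vid == 3, which matches
Afghanistan's Wage row.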

3. Reshape to wide form on the variable id.

sort country year variable
replace id = _n
bysort country year: gen cy = _n
gen cyc = 1 if cy == 1
gen cyid = sum(cyc)
drop id variable cy cyc

reshape wide v, i(cyid) j(vid)

If you want meaningful names, rename v1-v3
to "aid", "population", and "wage".
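
For comparison, the same result might be reached with two reshapes
keyed on the names themselves, which also copes with countries that
lack some variables. This is a sketch on made-up data, not a tested
recipe for the real WDI file, and it assumes every value of "variable"
is a legal variable-name suffix (no spaces or punctuation).

```stata
clear
* Toy data after step 1; Albania has no Population row.
input str12 country str12 variable v1960 v1961 v2002
"Afghanistan" "Aid"         1  2  3
"Afghanistan" "Population"  4  5  6
"Afghanistan" "Wage"        7  8  9
"Albania"     "Aid"        10 11 12
"Albania"     "Wage"       16 17 18
end

* Long form: one row per country, variable, and year.
reshape long v, i(country variable) j(year)

* Wide form: one column per variable name; "string" because j holds text.
reshape wide v, i(country year) j(variable) string

* Drop the "v" stub from the new column names.
rename vAid aid
rename vPopulation population
rename vWage wage

list, sepby(country)
```

Albania's "population" is simply missing in every year, and no helper
ids (id, vid, cyid) are needed.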

> > I have been trying to save data from World
> > Development
> > Indicators (WDI, online) into a cross-section
> > time-series format. The original data look like the
> > following
> > 
> > Country      Variable Name  1960  1961  ...  2002
> > Afghanistan  Aid            X     X     ...  X
> > Afghanistan  Population     X     X     ...  X
> > ...
> > Afghanistan  Wage           X     X     ...  X
> > Albania      Aid            X     X     ...  X
> > Albania      Population     X     X     ...  X
> > ...
> > Albania      Wage           X     X     ...  X
> > 
> > The ideal format I would like to have would be:
> > 
> > Country  Aid  Population  ...  Wage
> > Afghanistan  1960  X  X  ...  X
> > Afghanistan  1961  X  X  ...  X
> > ...
> > Afghanistan  2002  X  X  ...  X
> > Albania      1960  X  X  ...  X
> > Albania      1961  X  X  ...  X
> > ...
> > Albania      2002  X  X  ...  X
> > ...
> > China        1960  X  X  ...  X
> > China        1961  X  X  ...  X
> > ...
> > China        2002  X  X  ...  X
> > 
> > The problem is that some countries have data on more
> > variables than do other countries. Any kindly
> > suggestion on how to execute the data manipulation
> > trick in stata or on how to save WDI online properly
> > would be tremendously appreciated. Many thanks!
> > 
> > Best,
> > Yumin
> > 
> > 




