Statalist The Stata Listserver



Re: st: (st) Automatical aggregation from 4 digit codes to 3,2 digitcodes?


From   Philipp Rehm <[email protected]>
To   [email protected]
Subject   Re: st: (st) Automatical aggregation from 4 digit codes to 3,2 digitcodes?
Date   Thu, 23 Nov 2006 10:33:31 +0100

This may work:

* inputting your example data:
clear
input ID year var1
1111 72 2
1112 72 1
1113 72 4
1121 72 3
1111 73 1
1112 73 2
1113 73 3
1121 72 4
end

* doing the trick:
foreach N in 3 2 {
preserve
gen ID_`N'd=real(substr(string(ID),1,`N'))
collapse (sum) var1, by(ID_`N'd year)
rename ID_`N'd ID
tempfile temp_`N'
save `temp_`N'', replace
restore
}

foreach N in 3 2 {
append using `temp_`N''
}

sort year ID
list

     +--------------------+
     |   ID   year   var1 |
     |--------------------|
  1. |   11     72     14 |
  2. |  111     72      7 |
  3. |  112     72      7 |
  4. | 1111     72      2 |
  5. | 1112     72      1 |
     |--------------------|
  6. | 1113     72      4 |
  7. | 1121     72      4 |
  8. | 1121     72      3 |
  9. |   11     73      6 |
 10. |  111     73      6 |
     |--------------------|
 11. | 1111     73      1 |
 12. | 1112     73      2 |
 13. | 1113     73      3 |
     +--------------------+
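
The prefix in the loop is cut out of the string form of ID. It can probably also be computed arithmetically. Here is a minimal sketch, assuming ID is numeric and always has exactly four digits; the variable names ending in _alt are made up for this illustration:

```stata
* assumption: ID is numeric and every code has exactly 4 digits
clear
input ID
1111
1112
1113
1121
end

foreach N in 3 2 {
    * string-based prefix: keep the first N characters of the code
    gen ID_`N'd = real(substr(string(ID), 1, `N'))
    * arithmetic prefix: divide away the trailing digits and round down
    gen ID_`N'd_alt = floor(ID / 10^(4 - `N'))
    * both versions should agree on every observation
    assert ID_`N'd == ID_`N'd_alt
}

list
```

If the codes can differ in length, the string version is the safer choice, because the arithmetic version depends on the fixed width of 4.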


Just a few things:
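- The basic trick I use is to -collapse- the data set once per level and then to -append- the results.
- Note that the -append-s have to stay outside the first loop. Otherwise the 3-digit totals would already be in memory when the 2-digit totals are computed, and they would be counted twice.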
- Quite often, one wants to specify weights in a collapse. That's easy to add.
- If you want the 1-digit aggregates as well, just add a "1" into both -foreach- loops.
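
As a sketch of the last point, here are the same two loops with a "1" added, run on the example data exactly as typed in above. Nothing else is changed, and the -append-s still come only after all levels have been built:

```stata
* same example data as above
clear
input ID year var1
1111 72 2
1112 72 1
1113 72 4
1121 72 3
1111 73 1
1112 73 2
1113 73 3
1121 72 4
end

* build the 3-, 2- and 1-digit totals, each from the 4-digit data only
foreach N in 3 2 1 {
    preserve
    gen ID_`N'd = real(substr(string(ID), 1, `N'))
    collapse (sum) var1, by(ID_`N'd year)
    rename ID_`N'd ID
    tempfile temp_`N'
    save `temp_`N'', replace
    restore
}

* append only now, so that no total is fed into a later -collapse-
foreach N in 3 2 1 {
    append using `temp_`N''
}

sort year ID
list
```

With these data the 1-digit rows should simply repeat the 2-digit totals, since every code starts with "11".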

HTH,
Philipp




[email protected] wrote:

Hello. Stata Users.
I need your help.
I have a panel data set containing ID (4-digit), year and var1, to keep the case simple.
I just want to add var1 at the 2- and 3-digit level by aggregating up from the 4-digit
codes. For example, my data set currently looks like this (sorted by year and ID):

 ID   year   var1
 1111   72    2
 1112   72    1
 1113   72    4
 1121   72    3
   .     .    .
  ..     ..   ..

 1111   73     1
 1112   73     2
 1113   73     3
 1121   72     4


 I would like to add var1 as follows:

 ID     year     var1
 11     72     ???(7+??+....)
 111    72     7 (2+1+4)
 1111   72     2
 1112   72     1
 1113   72     4
 112    72     ?? (3+?+?...)
 1121   72     3
   .     .    .
  ..     ..   ..
 11     73     ?? (6+?)
 111    73     6 (1+2+3)
 1111   73     1
 1112   73     2
 1113   73     3
 112    73     ? (4+..)
 1121   72     4


As you see, the aggregated values should be summed by year and ID.
Since I have a lot of codes, it would take forever to generate each 3- and 2-digit
level manually. Is there a way to create the 3- and 2-digit level codes and the
corresponding new var1 automatically?
Thanks. Any comments and helpful remarks would be appreciated.
WT





*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



