# st: Re: Data management, years of schooling

From: "Joseph Coveney"
To: "Statalist"
Subject: st: Re: Data management, years of schooling
Date: Wed, 4 Feb 2009 12:55:40 +0900

```
There are probably better ways, but something like the code below should do it.
(Note that I'd normally prefer something more like

generate byte education_yrs = mod(hi_educ, 10) + ///
    7 * inrange(hi_educ, 21, 24) + ///
    11 * inrange(hi_educ, 31, 35)

because it would be easier to maintain--more self-documenting--but there's an
outside chance that it is somewhat slower in execution, perhaps even
noticeably so if you've got a very large amount of data.)

Joseph Coveney

. clear *

. set more off

. input hhid hi_educ years

hhid    hi_educ      years
1. 1       11      1
2. 2       21      8
3. 3       17      7
4. 4       16      6
5. 5       24      11
6. 6       31      12
7. 7       32      13
8. 8       13      3
9. 9       22      9
10. end

. generate byte education_yrs = mod(hi_educ, 10) + ///
>         7 * floor(hi_educ / 20) + ///
>         4 * floor(hi_educ / 30)

. list, noobs separator(0)

+-----------------------------------+
| hhid   hi_educ   years   educat~s |
|-----------------------------------|
|    1        11       1          1 |
|    2        21       8          8 |
|    3        17       7          7 |
|    4        16       6          6 |
|    5        24      11         11 |
|    6        31      12         12 |
|    7        32      13         13 |
|    8        13       3          3 |
|    9        22       9          9 |
+-----------------------------------+

. exit

Ronnie Babigumira wrote:

I have an interesting data management problem. My data look like this

[see below]

Where hi_educ is the highest level of education for household. From this I
would like to extract the number of years of schooling.

Now, for values of 17 or below, the years of schooling is the last digit;
for values between 21 and 24, it is 7 + the last digit;
for values between 31 and 35, it is 11 + the last digit.

What I would like to end up with is something like this

hhid hi_educ years
1 11 1
2 21 8
3 17 7
4 16 6
5 24 11
6 31 12
7 32 13
8 13 3
9 22 9

I am stuck here
gen str3 test = ""
replace test  = substr(string(hi_educ), -1,.) if inrange(hi_educ,11,17)

I would appreciate any help
```
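
For anyone who wants to check the two formulas in the reply against each other, here is a minimal do-file sketch. It re-enters the nine sample households from the thread and uses `assert` to confirm that both versions reproduce the requested `years` column. The variable names `yrs_floor` and `yrs_range` are made up for this illustration and do not appear in the original posts. The `///` line continuation works in a do-file, not when typed at the command prompt.

```stata
* Sketch only: sample data copied from the thread
clear
input hhid hi_educ years
1 11 1
2 21 8
3 17 7
4 16 6
5 24 11
6 31 12
7 32 13
8 13 3
9 22 9
end

* Arithmetic version: last digit, plus 7 from code 20 up, plus 4 more from code 30 up
generate byte yrs_floor = mod(hi_educ, 10) + ///
    7 * floor(hi_educ / 20) + ///
    4 * floor(hi_educ / 30)

* Range-based version: last digit, plus an offset tied to each documented range
generate byte yrs_range = mod(hi_educ, 10) + ///
    7 * inrange(hi_educ, 21, 24) + ///
    11 * inrange(hi_educ, 31, 35)

* assert stops with an error if any observation fails the comparison
assert yrs_floor == years
assert yrs_range == years
```

The two versions agree only on the documented codes (11 to 17, 21 to 24, 31 to 35). For a stray code such as 25, the arithmetic version gives 12 while the range-based version gives 5, so it may be worth tabulating `hi_educ` for unexpected values before relying on either one.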