The Stata listserver

st: RE: adding data for identical individuals


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: adding data for identical individuals
Date   Wed, 15 Jan 2003 22:08:43 -0000

Rodrigo Briceño
>
> I have a database with incomes and a database with socioeconomic
> characteristics of individuals. The income database has 9,000 records
> and the socioeconomic one has 19,000. I merged the two databases using
> some unique identifiers shared between them.
>
> With the new database I need to construct a new variable (income
> deciles), and for this I need the total income for the household
> (assigned to each individual of the household). The problem is that
> not every observation in the new database corresponding to the same
> household has income observations in the merged database (these could
> be children, people with no reported income, etc.).
> I have already thought of a procedure to get my statistical analysis
> right: add up the incomes for each household, since each household
> could have several sources of income (salaries, transfers, etc.).
> First of all, I don't know how to add those values (into a new
> variable independent of the source, I guess), nor how to assign the
> sum (for each household) to each of the members of the household. I
> need this to make the deciles. Can somebody help me with the commands
> or the steps that I need to follow?
>
> Example:
> HH  Member  Age  Sex  SourceIncome  Total Income
>  1       1   37    1            11          1500
>  1       1   42    2            11          3000
>  1       1   42    2            53           400
>  1       2   14    2             .             .
>  1       2   25    2             .             .
>
> You can find identical values in the variable "member" because it is
> possible for a household to be composed of several families.
>

The total income in a household is

. bysort hh : egen hhincome = sum(total_income)

This treats all missings as 0, so a household with no
reported income at all gets a total of 0. Short of some
elaborate imputation exercise, this appears to be the
only thing you can do.
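
As a minimal sketch, the command can be tried on the five example
observations from the question, typed in by hand. The variable names
(hh, total_income and so on) are assumptions about how the merged
dataset is named. The sketch uses total(), the name under which the
egen function sum() has been documented since Stata 9; it likewise
treats missings as 0.

```stata
* Toy data: the five example observations (hand-typed for illustration)
clear
input hh member age sex sourceincome total_income
1 1 37 1 11 1500
1 1 42 2 11 3000
1 1 42 2 53  400
1 2 14 2  .    .
1 2 25 2  .    .
end

* Household total, repeated on every observation of the household;
* total() ignores missing values, which amounts to treating them as 0
bysort hh : egen hhincome = total(total_income)

list hh member total_income hhincome, sepby(hh)
```

Every observation of household 1, including the two with no reported
income, should carry the same household total.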

When you say you want deciles, I guess that you
want a grouping into 10 groups using -xtile-. One
way to do this is to use just one observation from
each household, and then to smear the results
across all observations for each household.

. egen tag = tag(hh)
. xtile dechhincome = hhincome if tag, nq(10)
. bysort hh (dechhincome) : replace dechhincome = dechhincome[1]
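
The three steps above can be sketched end to end on simulated data.
Everything about the data here is hypothetical (50 households of 3
members, uniform incomes, every third person a non-earner) and runiform()
assumes a reasonably recent Stata; only the last four commands come from
the advice above, with total() standing in for sum().

```stata
* Simulated data: 50 hypothetical households with 3 members each
clear
set seed 12345
set obs 150
generate hh = ceil(_n / 3)
generate total_income = 1000 * runiform()
* make the third member of each household a non-earner
replace total_income = . if mod(_n, 3) == 0

* household total on every observation
bysort hh : egen hhincome = total(total_income)

* tag one observation per household, so each household counts once
egen tag = tag(hh)
xtile dechhincome = hhincome if tag, nq(10)

* missing sorts last, so the tagged (nonmissing) decile comes first
* within each household and can be copied to the other members
bysort hh (dechhincome) : replace dechhincome = dechhincome[1]

tabulate dechhincome if tag
```

Running -xtile- on all observations instead would give deciles of
individuals, with large households counted more than once; tagging
gives deciles of households, which appears to be what is wanted here.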

One magic word here is -by:-. For manipulations involving
groups, -by:- is invaluable. Use the manual index
to see various sections on -by:-. Alternatively,
there is an overall tutorial in the Stata Journal:

Cox, N. J. 2002. Speaking Stata: How to move step by: step.
Stata Journal 2(1): 86-102.
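
As a small illustration of what -by:- does, here is a sketch on
hypothetical toy data (two households, variable names assumed). Under
-by:-, _n and _N count within each group rather than over the whole
dataset, and subscripts such as [1] refer to the first observation of
the group.

```stata
* Hypothetical toy data: two households
clear
input hh total_income
1 1500
1 3000
1  400
2  800
2    .
end

* number of observations in each household
bysort hh : generate hhsize = _N

* rank within household, from lowest income (missing sorts last)
bysort hh (total_income) : generate order = _n

* lowest reported income, copied to every member of the household
bysort hh (total_income) : generate lowest = total_income[1]

list, sepby(hh)
```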

Nick
[email protected]
