Re: st: reshaping long panel into wide to get rowtotals
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: reshaping long panel into wide to get rowtotals
Date: Wed, 25 May 2011 18:52:52 +0100
I don't see that you need to -reshape-. It sounds as if you should be
using -collapse- to group related observations. But a deeper point is
that you shouldn't expect firm advice because you haven't explained
what you regard as a unit: is it a household on a particular day? It is
not clear from this example whether you have repeated observations for
each household.
A minor point is that -rsum()- is now undocumented in favour of
-rowtotal()-, although the two are identical in effect.
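
As a minimal sketch of the -collapse- route, and only under the assumption
that the unit is one household (hhnr) on one day (mydate) and that the three
value variables were created as in the original post, something like this
might do. The variable share200 is a hypothetical name used for illustration.

```stata
* Assumption: a unit is one household (hhnr) on one day (mydate).
* -collapse- replaces the data in memory, so -preserve- or -save- first.

* Sum repeated observations within each household-day group;
* missing values contribute nothing, so all-missing groups get 0.
collapse (sum) valc200 valc150 valcrest, by(hhnr mydate)

* Row total across the three categories (-rowtotal()- treats missings as 0).
egen tot_val = rowtotal(valc200 valc150 valcrest)

* Hypothetical share variable; left missing when the total is 0.
generate share200 = valc200 / tot_val if tot_val > 0
```

If the unit turns out to be the household over all dates, by(hhnr) alone
would be the analogous choice.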
FWIW, I continue my personal campaign against the expression "a data"
when you mean "a dataset".
Nick
On Wed, May 25, 2011 at 6:41 PM, ABDUL ADAM <[email protected]> wrote:
> I have a panel data with long format that looks like this:
>
>      +--------------------------------------------------------------+
>      |  hhnr      mydate   valc200    valc150   valcrest    tot_val |
>      |--------------------------------------------------------------|
>  18. | 16414   10jul2006         .          .          .          0 |
>  19. | 16414   10jul2006         .   1120.958          .   1120.958 |
>  20. | 16531   10jul2006         .          .   1199.145   1199.145 |
>  21. | 16531   10jul2006         .          .          .          0 |
>  22. | 16545   10jul2006         .          .   1535.672   1535.672 |
>      |--------------------------------------------------------------|
>  23. | 16820   10jul2006         .          .   1557.154   1557.154 |
>  24. | 17222   10jul2006         .          .          .          0 |
>  25. | 17432   10jul2006         .          .   2796.389   2796.389 |
>  26. | 18116   10jul2006         .    3217.72          .    3217.72 |
>  27. | 18562   10jul2006         .          .    949.102    949.102 |
>      |--------------------------------------------------------------|
>  28. | 18605   10jul2006         .          .   7903.555   7903.555 |
>  29. | 18753   10jul2006         .    1622.18          .    1622.18 |
>  30. | 18914   10jul2006         .   7723.083          .   7723.083 |
>  31. | 18985   10jul2006         .          .   7358.771   7358.771 |
>  32. | 18985   10jul2006         .   2766.125          .   2766.125 |
>      |--------------------------------------------------------------|
>  33. | 19139   10jul2006         .          .          .          0 |
>  34. | 19435   10jul2006         .          .          .          0 |
>  35. | 19459   10jul2006         .   2181.597          .   2181.597 |
>  36. | 19467   10jul2006         .          .   1900.701   1900.701 |
>  37. | 19653   10jul2006         .          .   2373.175   2373.175 |
>      |--------------------------------------------------------------|
>  38. | 20048   10jul2006         .   946.1188          .   946.1188 |
>      +--------------------------------------------------------------+
>
>
> I want to generate a new variable (tot_val) that is the row sum of the three preceding variables (i.e., valc200, valc150 and valcrest). When I use egen tot_val=rsum(valc200 valc150 valcrest), as expected I get a sum that is equal to one of the variables because the other two have missing values. For instance, in row 31 I get a total of 7358.771, which is the same as valc150 in that row. I think my problem is that I need to get observations for the same household (hhnr) into the same row (e.g., hhnr 18985 appears in rows 31 & 32 on the same day) so that I can sum them later. To do this I tried to reshape the data from long to wide, but I get the error "hhnr not unique within mydate"; this is because some households report the purchase of a given item twice on the same date.
>
> Apart from the reshape attempt I feel I could have generated the above variables in a better way instead of:
> gen valc200 = valuewSADT if cc200==1
> gen valc150 = valuewSADT if cc150==1
> gen valcrest = valuewSADT if ccrest==1
>
> My final aim is to produce the totals and use them to derive expenditure shares.
> I would really be GRATEFUL for any explanations/tips.
>