RE: AW: st: AW: generating count and sum variable over two different categorical variables

```

Nick seems to understand your intentions; I honestly do not. That could be
entirely my problem. Anyway, could you provide the number that you want to
come out of the calculation in your example?

HTH
Martin

Thanks, Martin. However, I need the total number of people in each region,
and each region comprises 3 to 10 different districts. Using

> by region yr: egen totpop=total(distr_pop)

adds each district's population once for every case in that district, so
it is counted as many times as there are cases. What I want is to add each
district's population only once per year to get the total regional
population.

+--------------------------------------------------------------+
| pid   distr_~p   district   region     yr   number    totpop |
|--------------------------------------------------------------|
| 221     440674          3        1   1953        7   2802725 |
| 684     440674          3        1   1953        7   2802725 |
| 574     158681          6        1   1953        7   2802725 |
| 770     440674          3        1   1953        7   2802725 |
| 869     440674          3        1   1953        7   2802725 |
|--------------------------------------------------------------|
| 454     440674          3        1   1953        7   2802725 |
| 497     440674          3        1   1953        7   2802725 |
| 790     444041          3        1   1954        1    444041 |
| 802     112982         13        2   1954        1    112982 |
| 767     227937         18        4   1954        1    227937 |
|--------------------------------------------------------------|
|   .     139172          8        .   1953        0    139172 |
+--------------------------------------------------------------+

//M

On 11 Jan 2010, at 17:24, Martin Weiss wrote:

>
> *************
> clear*
>
> input   pid   distr_pop   district   region    yr
> 221     440674          3          1   1953
> 869     440674          3          1   1953
>   .     139172          8          .   1953
> 497     440674          3          1   1953
> 684     440674          3          1   1953
> 574     158681          6          1   1953
> 770     440674          3          1   1953
> 454     440674          3          1   1953
> 767     227937         18          4   1954
> 802     112982         13          2   1954
> 790     444041          3          1   1954
> end
>
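> * Store each variable in the smallest type that holds it.
> compress
>
> * Per region-year: count cases, and total distr_pop over all cases.
> bys region yr: egen number=count(pid)
> by region yr: egen totpop=total(distr_pop)
>
> list, noobs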
> *************
>
>
>
> HTH
> Martin
>
>
> Sorry...
>
>
> . list pid distr_pop district region yr in 50/60
>     | pid   distr_pop   district   region    yr |
>     |---------------------------------------------|
> 50. | 221     440674          3          1   1953 |
> 51. | 869     440674          3          1   1953 |
> 52. |   .     139172          8          .   1953 |
> 53. | 497     440674          3          1   1953 |
> 54. | 684     440674          3          1   1953 |
>     |---------------------------------------------|
> 55. | 574     158681          6          1   1953 |
> 56. | 770     440674          3          1   1953 |
> 57. | 454     440674          3          1   1953 |
> 58. | 767     227937         18          4   1954 |
> 59. | 802     112982         13          2   1954 |
>     |---------------------------------------------|
> 60. | 790     444041          3          1   1954 |
>     +---------------------------------------------+
>
> So what I need to do is generate a variable counting the number of cases
> from each region for each year and also a variable containing the sum of
> the population for each region for each year. There are between 3 and 10
> districts in each region.
>
> Any idea or do I have to program it from scratch?
>
> Regards,
> M
>
>
>
>
>
>
>>
>> As always: Show an excerpt of your data!
>>
>>
>>
>> HTH
>> Martin
>>
>>
>> Dear listers,
>> I'm doing a survival analysis, but I also need to present some graphs at
>> the regional level. My data are set up at the individual level, with
>> categorical variables for year and region. What I need is a count
>> variable for the number of cases in each year AND region. Using the
>> -egen total- command, I'm only able to sum over either year or region,
>> not both, as far as I understand. Is there a way to sum over two
>> categories rather than just one?
>>
>> That is, my data are set up as
>>
>> case year region
>>
>> and I'd like them set up as
>>
>> case year region #cases/year/region
>>
>>
>> Regards,
>> M
>>
>>
>>
>
>
>
>
```
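
The request in the top message (add each district's population once per region and year, not once per case) can be sketched as below. This is only one possible approach, not something proposed in the thread itself: it uses -egen, tag()- to flag a single observation per district within each region-year and then totals the flagged populations. The variable names tag and totpop_region are made up for illustration, and the data are the excerpt posted above.

```stata
* Minimal sketch, assuming the excerpt from the thread.
* tag and totpop_region are hypothetical names, not from the original posts.
clear
input pid distr_pop district region yr
221 440674  3 1 1953
869 440674  3 1 1953
  . 139172  8 . 1953
497 440674  3 1 1953
684 440674  3 1 1953
574 158681  6 1 1953
770 440674  3 1 1953
454 440674  3 1 1953
767 227937 18 4 1954
802 112982 13 2 1954
790 444041  3 1 1954
end

* Number of cases per region-year, as in the thread.
bysort region yr: egen number = count(pid)

* Flag exactly one observation per district within each region-year.
* By default tag() leaves rows with a missing value in any of these
* variables untagged; add the missing option to tag them as well.
egen byte tag = tag(district region yr)

* Untagged rows contribute 0, so each district is added only once.
bysort region yr: egen totpop_region = total(tag * distr_pop)

list, noobs sepby(region yr)
```

With these data, region 1 in 1953 contains districts 3 and 6, so totpop_region should be 440674 + 158681 = 599355 there, rather than the 2802725 shown in the listing above. The row with a missing region is not tagged under the default, so its total comes out as 0 unless the missing option is added to tag().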

