Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Egen to sum across rows (with an if across rows)


From   Nick Cox <[email protected]>
To   "[email protected]" <[email protected]>
Subject   Re: st: Egen to sum across rows (with an if across rows)
Date   Mon, 29 Apr 2013 09:35:25 +0100

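Other fixes, which leave the data unchanged, are to replace

          replace wardtime`cat' = wardtime`cat' + t`j'  if  cat`j' == "`cat'"

with

          replace wardtime`cat' = wardtime`cat' + t`j'  if  t`j' < . & cat`j' == "`cat'"

or

          replace wardtime`cat' = wardtime`cat' + max(t`j', 0)  if  cat`j' == "`cat'"

The first skips missing times. The second adds 0 for them, because
max() ignores missing arguments.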
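
To try the first fix on a small scale, here is a sketch with three time/category pairs rather than 157. The data values are invented for illustration; only the naming pattern (t1-t3, cat1-cat3, wardtimeA-wardtimeD) follows the thread.

```stata
* Toy data: values are made up; the second row has a missing time in category C
clear
input t1 str1 cat1 t2 str1 cat2 t3 str1 cat3
1 "A" 2 "B" 3 "A"
4 "C" . "C" 5 "C"
end

foreach cat in A B C D {

      gen wardtime`cat' = 0

      qui forval j = 1/3 {
             * t`j' < . is false for any missing value, so a missing time
             * is skipped and the running total never turns missing
             replace wardtime`cat' = wardtime`cat' + t`j'  if  t`j' < . & cat`j' == "`cat'"
      }

}

list wardtimeA-wardtimeD
```

Dropping the t`j' < . condition and running it again should show how a single missing time makes the whole total missing for that row.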

Nick
[email protected]


On 29 April 2013 09:29, Lucy GELDER <[email protected]> wrote:
> Many thanks Nick. The code you suggested worked after I tweaked the data to accommodate the missing values I had in both the time and category columns. I did this by:
>
> forvalues i = 1/157{
> replace t`i'=0 if t`i'==.
> replace cat`i'="Z" if cat`i'==""
> }
>
> Lucy
> ________________________________________
> From: [email protected] [[email protected]] On Behalf Of Nick Cox [[email protected]]
> Sent: Monday, 29 April 2013 3:38 PM
> To: [email protected]
> Subject: Re: st: Egen to sum across rows (with an if across rows)
>
> You are correct. Wildcards cannot be used in -if- qualifiers (or -if-
> commands for that matter).
>
> Your syntax needs fixing in other ways. You use -cat- in the -foreach-
> statement but don't refer to it in the loop. That's not illegal in
> itself, but the code couldn't do what you want.
>
> You state different variable names in different places, but the spirit
> of what you want seems clear.
>
> Try this:
>
> foreach cat in A B C D {
>
>       gen wardtime`cat' = 0
>
>       qui forval j = 1/157 {
>              replace wardtime`cat' = wardtime`cat' + t`j'  if  cat`j' == "`cat'"
>       }
>
> }
>
> That's assuming variables -t1-t157- -cat1-cat157-
>
> There is a general review of technique in this territory in
>
> SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
>         (help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
>         Q1/09   SJ 9(1):137--157
>         shows how to exploit functions, egen functions, and Mata
>         for working rowwise; rowsort and rowranks are introduced
>
> .pdf at http://www.stata-journal.com/sjpdf.html?articlenum=pr0046
>
> Nick
> [email protected]
>
>
> On 29 April 2013 08:18, Lucy GELDER <[email protected]> wrote:
>
>> I have a dataset which includes 157 columns of times in hours (t1-t157) and 157 columns of categories, with values A - D (cat1-cat157).
>>
>> I want to sum across the columns by category, so that I end up with four columns timeA-timeD containing the total times for each category.
>>
>> I have tried:
>>
>> foreach cat in A B C D{
>>
>>         egen wardtimeA= rowtotal(wardtime*) if (wardcat*)=="A"
>>
>> }
>>
>> and get the error "wardcat* invalid name". I presume this means I can't use the wild card in the if statement?
>>
>> Does anyone know of a way I can do this without reshaping the data to long? This is a very large dataset and I would prefer to keep it wide if possible.