



RE: st: AW: levelsof problem?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: AW: levelsof problem?
Date   Tue, 27 Jul 2010 18:06:41 +0100

This problem seems to me simpler than is being implied. 

The direct problem is that Joe J needs a varlist to feed to -egen-'s -rowtotal()- function. 

His starting point could be the wildcard *_F which catches all the variable names ending in _F. The difficulty is that this includes the US_F variable which for Joe J is a step too far. (At this point I merely hint at the possibility of numerous obvious political jokes without actually making any of them.) 

The command -unab-, although usually billed as a programmer's command, is useful here. It does just one thing: it unabbreviates (meaning expands) a varlist to all its implied names, so that 

unab all : *_F 

unpacks all the names of the variables ending in _F and puts the result in a local macro. To remove US_F from the list we can turn to macro manipulation 

local US US_F 
local eu : list all - US 

which gives us a macro -eu- containing the desired names. 
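
Put together, the steps above can be sketched as a short do-file. This is only an illustration: the dataset is a shortened version of Joe J's example, and the final -egen- and -list- lines are my assumption about how the macro would then be used. Note that the local macro -eu- and the variable -eu- live in different namespaces, so they do not clash.

```stata
* Shortened version of Joe J's example data
clear
input str2 country eu GE_F NL_F UK_F US_F
US 0 1 1 1 0
NL 1 1 0 1 1
GE 1 0 1 1 1
PT 1 1 1 1 1
end

* Expand the wildcard into full variable names: GE_F NL_F UK_F US_F
unab all : *_F

* Remove US_F from the list, leaving GE_F NL_F UK_F
local US US_F
local eu : list all - US

* Feed the resulting varlist to rowtotal()
egen eutotal = rowtotal(`eu')
list country `eu' eutotal
```
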

Some people might want to emphasise that the varlist expansion is also done by other commands: see e.g. help on -describe, varlist-, -ds-, or -findname- (SJ). But any of those does much more than this one thing, so it is most straightforward to stick to -unab-. 

It also happens that the names of the countries concerned are held as values of Joe J's string variable -country-. The only real problem here is that the list result returned by -levelsof- is complicated by double quote delimiters, but as Tirthankar shows -- and the help file clearly explains -- an option -clean- gets rid of those. 
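
To see the difference that -clean- makes, one can inspect the macro after each call. The following is a minimal sketch on a cut-down version of the example data (only -country- and -eu- are kept); -macro list- is used here simply as a convenient way to show the contents.

```stata
* Cut-down example data: only the variables that -levelsof- needs
clear
input str2 country eu
US 0
NL 1
GE 1
GE 1
PT 1
end

* Without -clean-: each level is wrapped in compound double quotes
levelsof country if eu==1, local(lev)
macro list _lev

* With -clean-: a plain space-separated list, GE NL PT
levelsof country if eu==1, local(lev) clean
macro list _lev
```
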

For Joe J's example dataset 

levelsof country if eu==1, local(lev) clean
egen eutotal = rowtotal(`lev')

should have worked so far as I can see. There is no need, for the example dataset, to spell out the _F suffix, although Tirthankar's code shows how to do it if needed. 
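
One caveat is that a list built from -country- can name a country with no matching variable: in the example data as originally posted there is no PT_F (Tirthankar's script creates one). A defensive variant, which is my own sketch and not something posted in this thread, appends the suffix and keeps only the names that -confirm variable- accepts. The macro name -vars- is arbitrary.

```stata
* Shortened version of Joe J's example data (note: no PT_F variable)
clear
input str2 country eu GE_F NL_F UK_F US_F
US 0 1 1 1 0
NL 1 1 0 1 1
GE 1 0 1 1 1
PT 1 1 1 1 1
end

levelsof country if eu==1, local(lev) clean

* Keep country_F only if such a variable actually exists
local vars
foreach c of local lev {
    capture confirm variable `c'_F
    if _rc == 0 local vars `vars' `c'_F
}

* Here vars contains GE_F NL_F; PT_F is skipped because it does not exist
egen eutotal = rowtotal(`vars')
```

Note that UK_F is not picked up either, because no observation with eu==1 has UK as its country; a selection driven by -country- can only find countries that occur in the data.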

Confusion on names: Joe J mixed references to 

1. -egen, rsum()- and -egen, rowtotal()-. 
2. -levels- and -levelsof-. 

In both cases (just a coincidence, this) the second name has been the preferred name since Stata 9. 

Nick 
n.j.cox@durham.ac.uk 

joe j

Thanks a lot, Tirthankar!

Tirthankar Chakravarty

> Then this (cumbersome) script should do what you want:
> *********************************************
> clear
> input str2 country      eu      GE_F NL_F  UK_F US_F
> US      0       1       1       1       0
> US      0       1       1       1       0
> NL      1       1       0       1       1
> IN      0       1       1       1       1
> GE      1       0       1       1       1
> GE      1       0       1       1       1
> US      0       1       1       1       0
> US      0       1       1       1       0
> US      0       1       1       1       0
> PT      1       1       1       1       1
> end
> g PT_F = 2
> levelsof country if eu==1, local(lev) clean
> local lev2
> foreach x of local lev {
>        local lev2 " `lev2' `x'_F "
> }
> egen eutotal = rowtotal(`lev2')
> *********************************************

joe j 

>> Thanks, Martin. This is not quite what I wanted; The following command
>>  is good enough.
>> egen eutotal=rowtotal(GE_F NL_F  UK_F)
>>
>> The *_F variables need to be selected based on whether they belong to
>> eu or not (GE_F NL_F  UK_F are selected, but not US_F) (The values of
>> *_F variables are not based on whether eu=1 or otherwise).  But there
>> are many groupings, like eu, and a lot of countries, so I was looking
>> for an easy method to select. But it seems to me that manual selection
>> is the only choice.

Martin Weiss 

>>> You could of course -replace- to the values you want based on the -if-
>>> qualifier after the fact:
>>>
>>>
>>> *************
>>> egen eutotal=rowtotal(GE_F NL_F  UK_F)
>>> replace eutotal=. if !eu
>>> *************
>>>
>>>
>>> The reason that your second approach does not work is that Stata expects a
>>> -varlist- while you feed it
>>>
>>> `"GE"' `"NL"' `"PT"'_F
>>>
>>> which it cannot process. Type -ma di- to see the contents of your -macro-s.

joe j

>>> From a data set roughly like the following
>>> clear
>>> input str2 country      eu      GE_F NL_F  UK_F US_F
>>> US      0       1       1       1       0
>>> US      0       1       1       1       0
>>> NL      1       1       0       1       1
>>> IN      0       1       1       1       1
>>> GE      1       0       1       1       1
>>> GE      1       0       1       1       1
>>> US      0       1       1       1       0
>>> US      0       1       1       1       0
>>> US      0       1       1       1       0
>>> PT      1       1       1       1       1
>>> end
>>>
>>> I want to calculate the  row sum of all *_F variables pertaining to eu
>>> countries (all excluding US_F):
>>> egen eutotal=rowtotal(GE_F NL_F  UK_F)
>>>
>>> However, I would prefer to follow some rules in selecting the variables,
>>> like
>>>
>>> levels country if eu==1, local(lev)
>>> egen eutotal=rsum(`lev'_F)
>>>
>>> This doesn't work, however. Any pointers would be appreciated.
