

From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: AW: levelsof problem?
Date: Wed, 28 Jul 2010 17:32:06 +0100

What you didn't tell us turns out to be important. (That's a fact, not a criticism.)

Another possibility you might consider is tagging e.g. all EU variables with a characteristic. See help for -char-. You could do this once and for all with

    foreach v of var DE_* NL_* {
        char def `v'[eu] eu
    }

after which you can do things like

    ds *_F, has(char eu)

and the subset of names of variables with "eu" characteristic defined will be produced and will be accessible as r(varlist). With -findname- (SJ) the syntax would be

    findname *_F, charname(eu)

and -findname- also allows you to put the list of variable names directly into a local macro. -findname- is my attempt to improve on official -ds-, not quite as hubristic as that might appear.

Of course, the varlist will be longer than DE_* NL_*. Characteristics are saved with datasets, an important detail.

You wrote:

> Apparently the following does not work:
>
> egen eutotal = rowtotal(`lev'_D)
> egen eutotal = rowtotal(`lev'_F)

It works precisely as Stata's designers intended, but the effect is just text substitution: the suffix is added to the entire macro text, not to each word of the macro text. As Tirthankar showed, adding it to each word requires more work.

Nick
n.j.cox@durham.ac.uk

joe j

On second thought, I should create a variable:
gen countryF =country+"_F"

and then run the following.
*******
levelsof countryF if eu==1, local(lev) clean
egen eutotal = rowtotal(`lev')
*******

Similarly for variables with _D suffix. Thanks again for all your suggestions.

joe j

> Nick, Tirthankar, many many thanks.
>
> Nick's following suggestion would have worked for me
> *******
> levelsof country if eu==1, local(lev) clean
> egen eutotal = rowtotal(`lev')
> *******
> However, _Fs are not the only variables based on country names; there
> are others with _D suffix and some with no suffix. Apparently the
> following does not work:
>
> *******
> egen eutotal = rowtotal(`lev'_D)
> egen eutotal = rowtotal(`lev'_F)
> *******
> -unab- is a good suggestion, and would be useful at some point.
> However, in addition to US_F there are many countries that I want to
> keep out! So for now I'd have to stick with Tirthankar's tips.
>
> I am sorry about rowtotal/rsum et al mix up. I have an older Stata
> version at home, so I keep switching between the old and new commands
> in my do file:)
>
> Joe.
>
> On Tue, Jul 27, 2010 at 7:26 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>> Correct on the first point, but that's the default. I know Kit Baum hates it, but my impression is that most users don't change it by -set varabbrev off-.
>>
>> I don't understand your second point. If it's that the solution may need modification in so far as the real problem of Joe J may differ from the toy problem, then naturally I agree.
>>
>> Nick
>> n.j.cox@durham.ac.uk
>>
>> Tirthankar Chakravarty
>>
>> Not sure, but I think this:
>>
>> levelsof country if eu==1, local(lev) clean
>> egen eutotal = rowtotal(`lev')
>>
>> will work only if you set -varabbrev- on. The -unab- tip is a good one
>> and I thought about it, but the "US_F" variable could be a moving
>> target (or not).
>>
>> 2010/7/27 Nick Cox <n.j.cox@durham.ac.uk>:
>>
>>> This problem seems to me simpler than is being implied.
>>>
>>> The direct problem is that Joe J needs a varlist to feed to -egen-'s -rowtotal()- function.
>>>
>>> His starting point could be the wildcard *_F which catches all the variable names ending in _F. The difficulty is that this includes the US_F variable which for Joe J is a step too far. (At this point I merely hint at the possibility of numerous obvious political jokes without actually making any of them.)
>>>
>>> The command -unab-, although usually billed as a programmer's command, is useful here. It does just one thing, unabbreviate (meaning expand) a varlist to all its implied names, so that
>>>
>>> unab all : *_F
>>>
>>> unpacks all the names of the variables ending in _F and puts the result in a local macro. To remove US_F from the list we can turn to macro manipulation
>>>
>>> local US US_F
>>> local eu : list all - US
>>>
>>> which gives us a macro -eu- containing the desired names.
>>>
>>> Some people might want to emphasise that the varlist expansion is also done by other commands: see e.g. help on -describe, varlist-, -ds-, or -findname- (SJ). But any of those does much more than this one thing, so it is most straightforward to stick to -unab-.
>>>
>>> It also happens that the names of the countries concerned are held as values of Joe J's string variable -country-. The only real problem here is that the list result returned by -levelsof- is complicated by double quote delimiters, but as Tirthankar shows -- and the help file clearly explains -- an option -clean- gets rid of those.
>>>
>>> For Joe J's example dataset
>>>
>>> levelsof country if eu==1, local(lev) clean
>>> egen eutotal = rowtotal(`lev')
>>>
>>> should have worked so far as I can see. There is no need, for the example dataset, to spell out the _F suffix, although Tirthankar's code shows how to do it if needed.
>>>
>>> Confusion on names: Joe J mixed references to
>>>
>>> 1. -egen, rsum()- and -egen, rowtotal()-.
>>> 2. -levels- and -levelsof-.
>>>
>>> In both cases (just a coincidence, this) the second name has been the preferred name since Stata 9.
>>>
>>> Nick
>>> n.j.cox@durham.ac.uk
>>>
>>> joe j
>>>
>>> Thanks a lot, Tirthankar!
>>>
>>> Tirthankar Chakravarty
>>>
>>>> Then this (cumbersome) script should do what you want:
>>>> *********************************************
>>>> clear
>>>> input str2 country eu GE_F NL_F UK_F US_F
>>>> US 0 1 1 1 0
>>>> US 0 1 1 1 0
>>>> NL 1 1 0 1 1
>>>> IN 0 1 1 1 1
>>>> GE 1 0 1 1 1
>>>> GE 1 0 1 1 1
>>>> US 0 1 1 1 0
>>>> US 0 1 1 1 0
>>>> US 0 1 1 1 0
>>>> PT 1 1 1 1 1
>>>> end
>>>> g PT_F = 2
>>>> levelsof country if eu==1, local(lev) clean
>>>> local lev2
>>>> foreach x of local lev {
>>>>     local lev2 " `lev2' `x'_F "
>>>> }
>>>> egen eutotal = rowtotal(`lev2')
>>>> *********************************************
>>>
>>> joe j
>>>
>>>>> Thanks, Martin. This is not quite what I wanted; The following command
>>>>> is good enough.
>>>>> egen eutotal=rowtotal(GE_F NL_F UK_F)
>>>>>
>>>>> The *_F variables need to be selected based on whether they belong to
>>>>> eu or not (GE_F NL_F UK_F are selected, but not US_F) (The values of
>>>>> *_F variables are not based on whether eu=1 or otherwise). But there
>>>>> are many groupings, like eu, and a lot of countries, so I was looking
>>>>> for an easy method to select. But it seems to me that manual selection
>>>>> is the only choice.
>>>
>>> Martin Weiss
>>>
>>>>>> You could of course -replace- to the values you want based on the -if-
>>>>>> qualifier after the fact:
>>>>>>
>>>>>> *************
>>>>>> egen eutotal=rowtotal(GE_F NL_F UK_F)
>>>>>> replace eutotal=. if !eu
>>>>>> *************
>>>>>>
>>>>>> The reason that your second approach does not work is that Stata expects a
>>>>>> -varlist- while you feed it
>>>>>>
>>>>>> `"GE"' `"NL"' `"PT"'_F
>>>>>>
>>>>>> which it cannot process. Type -ma di- to see the contents of your -macro-s.
>>>
>>> joe j
>>>
>>>>>> From a data set roughly like the following
>>>>>> clear
>>>>>> input str2 country eu GE_F NL_F UK_F US_F
>>>>>> US 0 1 1 1 0
>>>>>> US 0 1 1 1 0
>>>>>> NL 1 1 0 1 1
>>>>>> IN 0 1 1 1 1
>>>>>> GE 1 0 1 1 1
>>>>>> GE 1 0 1 1 1
>>>>>> US 0 1 1 1 0
>>>>>> US 0 1 1 1 0
>>>>>> US 0 1 1 1 0
>>>>>> PT 1 1 1 1 1
>>>>>> end
>>>>>>
>>>>>> I want to calculate the row sum of all *_F variables pertaining to eu
>>>>>> countries (all excluding US_F):
>>>>>> egen eutotal=rowtotal(GE_F NL_F UK_F)
>>>>>>
>>>>>> However, I would prefer to follow some rules in selecting the variables,
>>>>>> like
>>>>>>
>>>>>> levels country if eu==1, local(lev)
>>>>>> egen eutotal=rsum(`lev'_F)
>>>>>>
>>>>>> This doesn't work, however. Any pointers would be appreciated.
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
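The text-substitution point in this thread can be seen in a small sketch. The three-row dataset and the macro name -eulist- below are made up for illustration and are not from the thread; the loop is the same idea as Tirthankar's, only written without the surrounding quotes.

```stata
* Illustrative data only: every eu country here has a matching *_F variable
clear
input str2 country eu GE_F NL_F US_F
GE 1 0 1 1
NL 1 1 0 1
US 0 1 1 0
end

* Collect the eu country codes; -clean- strips the double quotes
levelsof country if eu == 1, local(lev) clean

* Plain substitution: _F is appended to the macro text as a whole,
* so only the last word ends up with the suffix
display "`lev'_F"

* Appending the suffix to each word needs a loop
local eulist
foreach c of local lev {
    local eulist `eulist' `c'_F
}
display "`eulist'"

* Now every word is a variable name, so -rowtotal()- gets a proper varlist
egen eutotal = rowtotal(`eulist')
list country eu eutotal
```

The same loop with _D in place of _F would build the second varlist, provided a matching *_D variable exists for every country code returned by -levelsof-.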

