Statalist


st: RE: Stata coding


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Stata coding
Date   Mon, 13 Apr 2009 13:40:19 +0100

-egen- does not -drop- observations that do not satisfy an -if-
condition; it merely ignores them in the calculation, so the new
variable is missing in those observations.

In this example, I will for illustration take it that -if native == 1-
specifies that people are natives. You should of course substitute your
own correct syntax. 
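
To illustrate the point about -if-, here is a minimal sketch on made-up
data. The variable names and values are purely illustrative and are not
taken from the original question.

```stata
* Toy data: two countries, natives (native == 1) and migrants (native == 0)
clear
input country native attitudes
1 1 2
1 1 4
1 0 9
2 1 6
2 0 1
end

* -egen- with -if- computes the mean over natives only ...
bysort country: egen float m = mean(attitudes) if native == 1

* ... and leaves the result missing elsewhere; no observation is dropped
count
count if missing(m)
list, sepby(country)
```

The first -count- should still report all five observations, and the
second should report only the migrant observations, where m is missing.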

There are various ways to attack your first problem.

Here is one. 

bysort country : egen float nativeattitudes = mean(attitudes) if native == 1

bysort country (nativeattitudes) : replace nativeattitudes = nativeattitudes[1]

Here is another. 

bysort country: egen nativeattitudes = mean(cond(native == 1, attitudes, .))
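
As a quick check that the two approaches agree, here is a sketch on
made-up data. The names nat1 and nat2 and all the values are
hypothetical, chosen only for illustration.

```stata
* Toy data: two countries, natives (native == 1) and migrants (native == 0)
clear
input country native attitudes
1 1 2
1 1 4
1 0 9
2 1 6
2 0 1
end

* Approach 1: native mean, then spread to every observation in the country.
* Missing values sort last, so within each country the first observation
* holds the (nonmissing) native mean.
bysort country: egen float nat1 = mean(attitudes) if native == 1
bysort country (nat1): replace nat1 = nat1[1]

* Approach 2: mean of an expression that is missing for non-natives
bysort country: egen float nat2 = mean(cond(native == 1, attitudes, .))

* Both should hold the native mean in every observation of each country
assert nat1 == nat2
list country native attitudes nat1 nat2, sepby(country)
```

A country with no natives at all would end up with missing values under
either approach.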

The answer to your second problem is yes, if I understand it correctly.
Just put two or more variables in your variable list fed to -by:-. 
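
For instance, a sketch assuming the indicator is called native as above
(groupattitudes is a hypothetical name, and the data are made up):

```stata
* Toy data: two countries, natives (native == 1) and migrants (native == 0)
clear
input country native attitudes
1 1 2
1 1 4
1 0 9
2 1 6
2 0 1
end

* One mean for each combination of country and native status
bysort country native: egen float groupattitudes = mean(attitudes)

list country native attitudes groupattitudes, sepby(country)
```

The same pattern should work with region in place of country. If the
aim is only a graph, something like
-graph bar (mean) attitudes, over(native) over(country)- may be enough
without creating a new variable at all.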

Nick 
n.j.cox@durham.ac.uk 

Rahsaan Maxwell, Ph.D.

Does anyone know if there is a fast way to code new variables in Stata
so that they equal the mean value of a subpopulation's score on
particular other variables?

For example, I am analyzing two main subpopulations, migrants and
natives, both of which are nested in countries and regions.  I want to
calculate a new variable which captures the mean attitudes of natives in
each country/region and then use that as an independent variable in a
model predicting migrant outcomes.

I have used the code below to calculate mean attitudes of the entire
population for country/region but I can't figure out how to make that
calculation apply to only one subgroup yet still have the variable be
valid for the whole population.  (If I add an 'if' clause at the end of
the code it will drop all cases that don't apply).

by country, sort : egen float countryattitudes = mean(attitudes)

by region, sort : egen float regionattitudes = mean(attitudes)

I have been calculating subgroup means 'by hand' and then manually
writing code for the countries, which is slow.  But I have almost 300
regions and multiple variables so I'm wondering if there is a faster way
to do this?

And more generally, is it possible to create a new variable with
multiple values per country/region for means across different groups?
I.e., I want country1 to have one value for the mean attitude of natives
and another value for the mean attitude of migrants.  I want to do this
in order to graph migrant/native means across the countries and regions.




