Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Programming Repetition for categories

From	Tim <[email protected]>
To	[email protected]
Subject	Re: st: Programming Repetition for categories
Date	Thu, 03 Oct 2013 13:06:30 +1000

I would probably make a variable for the categories, then use -by-

. egen cat_SHARE_DEP = cut(SHARE_DEP), at(0, 10000000, 20000000, 50000000, 100000000, 250000000, 1000000000, 10000000000) label

. foreach avg in Q BRANCH A TYPE P MEMB_TOT {
.    bys cat_SHARE_DEP: egen avg_`avg' = mean(`avg')
. }

Then, if you really want separate variables for the different means, you can separate them later, but it's probably not necessary. It will probably be easier to work with -by- and/or -if- to select the category means you want.
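As a rough sketch of what selecting category means with -if- or -by- could look like, here is a small do-file. The toy data are made up and only stand in for the real credit-union file, and only Q is used to keep it short:

```stata
* Toy data (made-up values) standing in for the real file
clear
input double SHARE_DEP double Q
  5000000   1
  8000000   3
 15000000  10
 30000000  20
 40000000  40
end

* Bin SHARE_DEP; the at() breakpoints must be in ascending order.
* -icodes- gives category codes 0, 1, 2, ...
egen cat_SHARE_DEP = cut(SHARE_DEP), at(0 10000000 20000000 50000000 100000000) icodes

* One variable holding the within-category mean
bysort cat_SHARE_DEP: egen avg_Q = mean(Q)

* Pick out one category's mean with an -if- qualifier
summarize avg_Q if cat_SHARE_DEP == 0

* Or skip the new variable and tabulate the means by category
tabstat Q, by(cat_SHARE_DEP) statistics(mean)
```

With these toy values the 0-10 million category has mean 2, the 10-20 million category 10, and the 20-50 million category 30.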

As for your code, the -if SHARE_DEP ...- command evaluates its expression using only the value of SHARE_DEP in the first observation, so only one of your -if- blocks will ever run, and when it does run it operates on the whole dataset, because you have not used a subsetting -if- qualifier in the -egen- command.
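The difference between the -if- command and the -if- qualifier can be seen in a small do-file. This is only an illustration: the data are made up, and avg_all and avg_010 are hypothetical variable names.

```stata
* Toy data (made-up values); the first observation is below 10 million
clear
input double SHARE_DEP double Q
  5000000   1
  8000000   3
 30000000  20
end

* -if- command: the expression is evaluated once, as if it read SHARE_DEP[1].
* The -egen- inside has no qualifier, so it averages every observation.
if SHARE_DEP < 10000000 {
    egen avg_all = mean(Q)
}

* -if- qualifier: the expression is evaluated for each observation,
* so the mean uses only the rows where SHARE_DEP < 10000000.
egen avg_010 = mean(Q) if SHARE_DEP < 10000000

list SHARE_DEP Q avg_all avg_010
```

Here avg_all is 8 in every row (the mean of 1, 3 and 20), while avg_010 is 2 in the first two rows and missing in the third.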


See [U] 11.1.3 if exp and [P] if

Tim BP

On 3/10/2013 12:45, Andrew Hovel wrote:

I am trying to program repeated calculation of means for a set of
variables categorized in bins. I am using Stata 12 for Windows.

I am new to Stata programming, so I'm guessing there is a better way
to do this than I am attempting, but here goes:

I am calculating means of six variables (Q BRANCH A TYPE P MEMB_TOT)
in my data across 7 different categories of another variable,
SHARE_DEP  (represents a value of  total shares and deposits held by
credit unions)

The categories I use are 0-10million, 10-20million, 20-50 million,
50-100million, 100-250million, 250m-1billion, and >1billion

The code I am using is:
  ***average <10m
if SHARE_DEP < 10000000 {
foreach average in Q BRANCH A TYPE P MEMB_TOT {
  egen avg010_`average' = mean(`average')
  }
}
***average 20-50m
  if SHARE_DEP >= 20000000 & SHARE_DEP < 50000000 {
foreach avg in Q BRANCH A TYPE P  MEMB_TOT{
  egen avg2050_`avg' = mean(`avg')
  }
}
***
and so forth through those >1billion.

The problem here is that the means generated for the first step are
equivalent to the whole population mean, not the mean for observations
where SHARE_DEP < 10000000. (I checked this separately using -sum- for
the variables after dropping all observations where SHARE_DEP >
10000000.)
The subsequent if programs don't even execute.

Any help or suggestions for resolving this would be great.

-AH
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


