I work with household survey data and would like to create summary
statistics disaggregated by gender, age and other characteristics. In
my do-file the summary statistics are stored as scalars and then
converted to a matrix. I encountered a problem with the disaggregation
by age because I use varying age ranges. Let's assume I want to
summarize income by gender and age (for ages 20-24 in this case) with
the following data:
age male income
20 1 6
21 1 7
22 0 5
23 0 9
24 1 6
#delimit ;
* Set start and end age (NOTE: THE AGES VARY);
scalar startage = 20;
scalar endage = 24;
* Summarize income by gender;
sum income if male==1;
scalar male = r(mean);
sum income if male==0;
scalar female = r(mean);
* Summarize income by age;
local start = 1;
local end = (endage - startage + 1);
forvalues i = `start'/`end' {;
sum income if age==(`i'+startage-1);
scalar age`i' = r(mean);
};
The problem is in the following step. Because I use varying age
ranges, I cannot list all age scalars individually (age1, age2,
age3, ...) and instead want to refer to them with a wildcard character.
drop _all;
matrix data = startage,endage,male,female,age*;
svmat data;
This leads to this error message:
age* not found
r(111);
Can the matrix be created without listing all scalars individually?
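
A minimal sketch of one possible workaround, not a confirmed solution: rather than a wildcard, the age scalars could be appended to the matrix one at a time in a loop. It assumes the same `;` delimiter as the rest of the do-file, that the local `end` defined earlier is still in scope (same do-file run), and that a matrix expression accepts each scalar name individually, as the fixed names in the failing command do. It would stand in for the `matrix data = ...` line, after `drop _all;`.

```stata
* Start the row vector with the scalars whose names are fixed;
matrix data = startage, endage, male, female;
* Append age1, age2, ... one column per pass through the loop;
forvalues i = 1/`end' {;
    matrix data = data, age`i';
};
* Convert the finished row vector to variables data1, data2, ...;
svmat data;
```

With startage = 20 and endage = 24, this would build a 1 x 9 row vector: the four fixed scalars followed by age1 through age5.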
