Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: Model design question

Subject: st: Model design question
Date: Thu, 20 Dec 2012 11:19:17 -0500

```Hello,

My name is Ted Kaniuka. I am using Stata 12 (2-core), and I work in the field
of Educational Leadership/Economics. I don't have the strongest statistics
background, but I try. My question has to do with design.

I am new to panel models so forgive me if my questions are rudimentary. I
want to regress a set of constructs measured using a survey onto student
achievement scores grouped by each school. The constructs are the
differences between how each school's principal and teachers perceive the
culture. So I have one set of scores for the principal and one set for each
teacher. To establish the difference I subtracted the principal's scores
from those of each teacher in each school. So for one school I can have 20
sets of scores if there are twenty teachers. The problem is that I
only have one overall achievement score for the entire school. There is no
way to link individual teachers' survey results to their students' test
scores. Here is the model and a different explanation:

The difference variables were developed by subtracting the school
principal's domain score from each teacher's domain score.  Each difference
score then becomes a new predictor variable.  This process was
repeated for each of the five domains in the TWC.  A basic regression model
could be:

Yj = β0 + β1X1j + β2X2j + β3X3j + β4X4j + β5X5j

where Xij = the school-level difference for school j between the principal's
and teachers' perceptions of TWC domain i, and Yj = the reading or
math score for school j.

So here is the dilemma. If I treat each school as a single unit, I create a
mean score for all the teachers on each construct, subtract the lone
principal's scores from these group means, and then match the resulting
difference scores with each school's achievement score. I end up with 700
cases. I believe I will need to weight each case, since some schools have
10 teachers while others have 40, and it seems that I should account for this.
I wanted to use a multi-level design, but that does not seem appropriate
here because I would have the same number of cases at each level. I do still
want to control for school building characteristics (wealth, size,
experience level of staff, tenure of staff at the school). Alternatively, I
could go back to the data set that has 28,000 cases, but then a school of
twenty teachers contributes twenty difference scores and twenty identical
achievement scores that are not the scores for each teacher's class. For a
multi-level model this makes more sense, since I have 28,000 cases at one
level and 700 at the other. The large data set has many more cases, but it
seems incorrect to use, since the student achievement scores are for the
school, not the individual teacher.

I hope this is clear. While this is not a coding question, I hope that you
will have the time to provide some feedback.

Ted

```
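
The school-level setup described in the post can be sketched in a few lines. This is only an illustration under stated assumptions: the function names (`collapse_to_school`, `weighted_fit`), the row layout, and the data are all hypothetical, a single domain stands in for the five TWC domains, and weighting by the number of teachers is the idea floated in the post, not an endorsed solution. It also shows that averaging the teacher-minus-principal differences gives the same value as subtracting the principal's score from the teacher mean, so the two constructions in the post coincide at the school level.

```python
from collections import defaultdict
from statistics import fmean


def collapse_to_school(teacher_rows):
    """Collapse teacher-level rows to one record per school.

    Each row is (school_id, teacher_score, principal_score, school_achievement).
    This row layout is an assumption made for illustration.
    """
    diffs = defaultdict(list)
    achievement = {}
    for school_id, teacher_score, principal_score, school_score in teacher_rows:
        # Difference score: teacher minus principal, as described in the post.
        diffs[school_id].append(teacher_score - principal_score)
        achievement[school_id] = school_score
    return {
        school_id: {
            "mean_diff": fmean(d),
            "n_teachers": len(d),  # candidate weight for the school-level case
            "achievement": achievement[school_id],
        }
        for school_id, d in diffs.items()
    }


def weighted_fit(x, y, w):
    """Weighted least-squares (intercept, slope) for a single predictor."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up data: three schools with 3, 2, and 1 teachers.
rows = [
    ("A", 4.0, 3.0, 70.0),
    ("A", 2.0, 3.0, 70.0),
    ("A", 3.0, 3.0, 70.0),
    ("B", 5.0, 3.0, 80.0),
    ("B", 4.0, 3.0, 80.0),
    ("C", 2.0, 4.0, 60.0),
]

schools = collapse_to_school(rows)

# One case per school, weighted by how many teachers contributed to its mean.
x = [s["mean_diff"] for s in schools.values()]
y = [s["achievement"] for s in schools.values()]
w = [s["n_teachers"] for s in schools.values()]
intercept, slope = weighted_fit(x, y, w)
```

Collapsing first leaves exactly one case per school, which matches the single achievement score available for each school. The teacher count is carried along so that it can serve as a weight.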