Re: st: An EXCEL easy STATA difficult problem

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: An EXCEL easy STATA difficult problem
Date: Mon, 8 Jun 2009 08:33:03 +0000 (GMT)

--- On Fri, 5/6/09, Havedd Wadf <kend200905@yahoo.com> wrote:
>     The data set has more than 3000 lines, each of which
> has three parts:
>      X1, ...,Xn,Y1,....,Yn,Z1,...Zt
>      First, for each line, I do a simple regression:
>      Y=a+b*X
>      Second, for each line, I use the Z's and the
> estimated a and b to do some calculations.

Looks like a problem for -reshape- and -statsby-. Below is
an example with just two observations, but it should also
work for 3000 observations.

*-------------- begin example ---------------
// create some example data
drop _all
input x1 x2 x3 y1 y2 y3 z1 z2 z3
1 2 3 4 3 5 7 8 9
3 1 2 4 5 6 6 7 8
end
list

// -reshape- needs an id variable
// here create such an id which is 1 for the first
// observation, 2 for the second, etc.
gen caseid = _n

// reshape the dataset into long format
reshape long x y z, i(caseid) j(sortid)

// store it in a temporary dataset
// so that we can merge the regression coefficients
// in at a later time
tempfile tofill
sort caseid sortid
save `tofill'

// create a dataset with the regression coefficients
// for each observation (caseid)
statsby, by(caseid) clear : regress y x
list

// merge these regression coefficients back into the
// reshaped dataset
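// (in Stata 11 and later the same merge can be written as
//  -merge 1:m caseid using `tofill'-, which needs no prior sort)
sort caseid
merge caseid using `tofill'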

assert _merge == 3
drop _merge
sort caseid sortid
list
*------------------ end example -----------------
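
The question does not say which calculations the second step
involves. As a rough sketch only, assuming the calculation is the
fitted value a + b*Z for each Z, the example above could be
continued as shown below. The variable name zhat and the formula
are assumptions made for illustration; _b_x and _b_cons are the
names -statsby- gives the slope and the constant by default. The
first half repeats the steps above so that the block runs on its
own.

```stata
*-------------- begin example ---------------
// recreate the long dataset with the coefficients merged in
drop _all
input x1 x2 x3 y1 y2 y3 z1 z2 z3
1 2 3 4 3 5 7 8 9
3 1 2 4 5 6 6 7 8
end
gen caseid = _n
reshape long x y z, i(caseid) j(sortid)
tempfile tofill
sort caseid sortid
save `tofill'
statsby, by(caseid) clear : regress y x
sort caseid
merge caseid using `tofill'
assert _merge == 3
drop _merge

// hypothetical second step: evaluate a + b*z for every z,
// using the slope (_b_x) and constant (_b_cons) of that case
gen zhat = _b_cons + _b_x * z

// optionally go back to one line per case
sort caseid sortid
reshape wide x y z zhat, i(caseid) j(sortid)
list caseid _b_x _b_cons zhat1 zhat2 zhat3
*------------------ end example -----------------
```

Any other function of the coefficients and the Z's can be put in
place of the -gen zhat- line; the rest stays the same.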

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://home.fsw.vu.nl/m.buis/
-----------------------------------------
