st: RE: RE: Regression across variables

From: "Nick Cox"
Subject: st: RE: RE: Regression across variables
Date: Wed, 12 Nov 2003 13:37:07 -0000

```
You're correct. I misread this problem.
I have a new problem in that I have
to guess what the Excel syntax does,
but it looks fairly transparent.

You should -reshape-, I suggest.

. reshape long affxr2tag, i(array_id) string

Put the controls in a variable, e.g. with

. egen control = fill(0.25 0.5 1 2 4 0.25 0.5 1 2 4)

or with -repeat()- from -egenmore- on SSC

. egen control = repeat(), v(0.25 0.5 1 2 4)

then

. bysort array_id : regress affxr2tag control

-statsby- could be vital here, to collect the slope
and r-squared from each regression.

Alternatively,

1. Jeroen Weesie wrote a -slope()- for -egen-.

. findit _gslope

I don't think it's what your problem quite needs.

2. Nick Winter wrote a -corr()- for -egen-. That's
in -egenmore- from SSC.

I'd still check the linearity carefully by
looking at a series of graphs.

Nick
n.j.cox@durham.ac.uk

Wallace, John
>
> I was trying to keep my examples general in the belief that it would be
> more broadly useful for others, but for clarity's sake, here's a more
> explicit example.
>
> Some of the developmental arrays made by my company have probes
> complementary (in the DNA sense) to control reagents at specific
> concentrations in the sample fluid.  One way to measure the quality of
> the arrays is to perform a regression of signal for those probes against
> the known concentration of the control reagents in the sample.  I've
> found that the slope and r-squared of the least-squares linear regression
> correlate nicely with other measures of array quality, but computing the
> fit isn't trivial.  At the moment I export the probe intensities from the
> analysis software into Excel, line them up against the concentrations for
> the control reagents, and use Excel's Slope(y,x) and Rsq(y,x) functions
> to get the parameters I'm looking for.
> I would prefer to do that in Stata, for all the reasons we love Stata.
> The data looks like:
>
>        array_id   a~a_x_at   a~b_x_at   a~c_x_at   a~d_x_at   a~e_x_at
>   1.     930877       12.4       22.7       51.5        108      293.5
>   2.     930878        7.6         13       53.1         99      244.2
>   3.     930898       17.7         37       90.4        198      436.6
>   4.     930879       11.5       18.2       55.7        114      277.8
>   5.     930884       11.3       24.1       56.6      126.7      301.3
>   6.     930885       13.3       19.8         57        139      270.1
>
> The variable names are truncated from affxr2taga_x_at, affxr2tagb_x_at, etc.
>
> The controls are at the following concentrations:
> TagA: 0.25 E-12M (i.e. 250 femtomolar)
> TagB: 0.5 E-12M
> TagC: 1.0 E-12M
> TagD: 2.0 E-12M
> TagE: 4.0 E-12M
>
> So, in Excel I would have cells like
> 	A	B	C	D	E
> R1	0.25	0.5	1.0	2.0	4.0
> R2	12.4	22.7	51.5	108	293.5
>
> And in column F I would use =SLOPE(A2:E2,A1:E1) to get the slope of the
> linear regression and =RSQ(A2:E2,A1:E1) to get the coefficient of
> determination.
>
> In Stata terms, each observation would get a value in new variables
> "slope" and "fit".  I've seen some egen functions like rmean() or rsd()
> that work at the observation level like that, calculating values in new
> variables from a function performed "across" variables for each
> observation.
>
> One approach I thought about was using -xpose- to switch observations
> with variables, then generating a new variable "conc" and doing a plain
> ol' regression of array_id vs conc.  That's less attractive though,
> because xpose mangles your dataset (even using the ,varnames option, you
> can't get the original variable names back by running -xpose- again).
>
> It seems to me, from reading your earlier replies, that you think I'd
> like to, for example, calculate how much the 6 measures of a~a_x_at
> correlate with a constant of 0.25.  That's not the case; I'm interested
> in how the slope of (a-e vs pM) varies from array to array.

```
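
The reshape-then-regress approach suggested in the reply can be sketched end to end as a do-file. This is only an illustration under stated assumptions, not the exact commands from the thread: it re-enters the six arrays from the quoted listing, uses the `@` form of -reshape- (so the long variable is named `affxr2tag_x_at` rather than `affxr2tag`), and uses the prefix syntax of -statsby- found in later Stata releases, which differs from the syntax current in 2003. The names `conc`, `slope`, and `fit` follow the question; `tag` is a hypothetical name chosen here.

```stata
* Sketch only: per-array slope and R-squared of signal on concentration.
clear
input long array_id affxr2taga_x_at affxr2tagb_x_at affxr2tagc_x_at affxr2tagd_x_at affxr2tage_x_at
930877 12.4 22.7 51.5 108   293.5
930878  7.6 13   53.1  99   244.2
930898 17.7 37   90.4 198   436.6
930879 11.5 18.2 55.7 114   277.8
930884 11.3 24.1 56.6 126.7 301.3
930885 13.3 19.8 57   139   270.1
end

* "@" marks where the tag letter sits inside each wide variable name;
* the letter ends up in the new string variable tag (hypothetical name)
reshape long affxr2tag@_x_at, i(array_id) j(tag) string

* Concentrations double from tag a to tag e: 0.25, 0.5, 1, 2, 4
generate double conc = 0.25 * 2^(strpos("abcde", tag) - 1)

* Check linearity first: one scatter plot with fitted line per array
twoway (scatter affxr2tag_x_at conc) (lfit affxr2tag_x_at conc), by(array_id)

* One regression per array; -statsby- replaces the data in memory with
* one observation per array holding the slope and R-squared
statsby slope=_b[conc] fit=e(r2), by(array_id) clear : regress affxr2tag_x_at conc

list array_id slope fit
```

Because -statsby- with the `clear` option discards the long data, save the reshaped dataset beforehand, or use the `saving()` option instead, if the intensities are needed afterwards. The resulting `slope` and `fit` should correspond to what Excel's SLOPE and RSQ return for each row.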