
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: Regression across variables
Date: Wed, 12 Nov 2003 13:37:07 -0000

You're correct. I misread this problem. I have a new problem
in that I have to guess what the Excel syntax does, but it looks
fairly transparent.

You should -reshape-, I suggest.

. reshape long affxr2tag, i(array_id) string

Put the controls in a variable, e.g. with

. egen control = fill(0.25 0.5 1 2 4 0.25 0.5 1 2 4)

or with -repeat()- from -egenmore- on SSC

. egen control = repeat(), v(0.25 0.5 1 2 4)

then

. bysort array_id : regress affxr2tag control

-statsby- could be vital here.

Alternatively,

1. Jeroen Weesie wrote a -slope()- for -egen-.

. findit _gslope

I don't think it's what your problem quite needs.

2. Nick Winter wrote a -corr()- for -egen-.
That's in -egenmore- from SSC.

I'd still check the linearity carefully by looking
at a series of graphs.

Nick
n.j.cox@durham.ac.uk

Wallace, John

> Thanks for your reply, Nick
> I was trying to keep my examples general in the belief that it would be
> more broadly useful for others, but for clarity's sake, here's a more
> explicit example.
>
> Some of the developmental arrays made by my company have probes
> complementary (in the DNA sense) to control reagents at specific
> concentrations in the sample fluid. One way to measure the quality of
> the arrays is to perform a regression of signal for those probes against
> the known concentration of the control reagents in the sample. I've
> found that the slope and r-squared of the least-squares linear
> regression correlate nicely with other measures of array quality, but
> computing the fit isn't trivial. At the moment I export the probe
> intensities from the analysis software into Excel, line them up against
> the concentrations for the control reagents, and use Excel's Slope(y,x)
> and Rsq(y,x) functions to get the parameters I'm looking for.
> I would prefer to do that in Stata, for all the reasons we love Stata.
> The data looks like:
>
>       array_id   a~a_x_at   a~b_x_at   a~c_x_at   a~d_x_at   a~e_x_at
>   1.    930877       12.4       22.7       51.5        108      293.5
>   2.    930878        7.6         13       53.1         99      244.2
>   3.    930898       17.7         37       90.4        198      436.6
>   4.    930879       11.5       18.2       55.7        114      277.8
>   5.    930884       11.3       24.1       56.6      126.7      301.3
>   6.    930885       13.3       19.8         57        139      270.1
>
> The variable names are truncated from affxr2taga_x_at,
> affxr2tagb_x_at, etc.
>
> The controls are at the following concentrations:
> TagA: 0.25 E-12M (i.e. 250 femtomolar)
> TagB: 0.5 E-12M
> TagC: 1.0 E-12M
> TagD: 2.0 E-12M
> TagE: 4.0 E-12M
>
> So, in Excel I would have cells like
>          A       B       C       D       E
> R1    0.25     0.5     1.0     2.0     4.0
> R2    12.4    22.7    51.5     108   293.5
>
> And in column F I would use =SLOPE(A2:E2,A1:E1) to get the slope of the
> linear regression and =RSQ(A2:E2,A1:E1) to get the coefficient of
> determination.
>
> In Stata terms, each observation would get a value in new variables
> "slope" and "fit". I've seen some egen commands like rmean() or rsd()
> that work at the observation level like that, calculating values in new
> variables from a function performed "across" variables for each
> observation.
>
> One approach I thought about was using -xpose- to switch observations
> with variables, then generating a new variable "conc" and doing a plain
> ol' regression of array_id vs conc. That's less attractive though,
> because xpose mangles your dataset (even using the ,varnames option, you
> can't get the original variable names back by running -xpose- again).
>
> It seems to me, from reading your earlier replies, that you think I'd
> like to, for example, calculate how much the 6 measures of a~a_x_at
> correlate with a constant of 0.25. That's not the case; I'm interested
> in how the slope of (a-e vs pM) varies from array to array.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
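
The -reshape- then per-array -regress- route suggested in the reply can be sketched end to end as below. This is a minimal illustration, not the poster's exact code. It assumes the full variable names are affxr2taga_x_at through affxr2tage_x_at, as the quoted message states. The names tag, signal, conc, slope and fit are made up for the example. It also uses the prefix form of -statsby- found in current Stata, which may differ from the syntax available when this thread was written. The six rows are the ones listed in the quoted message.

```stata
* Sketch only: variable names assumed from the quoted message.
clear
input long array_id affxr2taga_x_at affxr2tagb_x_at affxr2tagc_x_at affxr2tagd_x_at affxr2tage_x_at
930877 12.4 22.7 51.5 108   293.5
930878  7.6 13   53.1  99   244.2
930898 17.7 37   90.4 198   436.6
930879 11.5 18.2 55.7 114   277.8
930884 11.3 24.1 56.6 126.7 301.3
930885 13.3 19.8 57   139   270.1
end

* Wide to long: one observation per array and tag.
* The string suffixes ("a_x_at", ..., "e_x_at") go into tag (hypothetical name).
reshape long affxr2tag, i(array_id) j(tag) string
rename affxr2tag signal

* Known control concentration (in pM) for each tag, assigned explicitly
* rather than by relying on sort order.
generate double conc = .
replace conc = 0.25 if tag == "a_x_at"
replace conc = 0.5  if tag == "b_x_at"
replace conc = 1    if tag == "c_x_at"
replace conc = 2    if tag == "d_x_at"
replace conc = 4    if tag == "e_x_at"

* One regression per array; keep the slope and R-squared.
* -clear- replaces the data in memory with one observation per array_id.
statsby slope=_b[conc] fit=e(r2), by(array_id) clear : regress signal conc
list array_id slope fit
```

The resulting slope and fit correspond to what =SLOPE() and =RSQ() return in the Excel layout described in the quoted message, since both are ordinary least-squares fits of signal on concentration. To keep the original wide data, run the sketch on a copy (for example inside -preserve- and -restore-) and -merge- the results back on array_id.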

**References**: **st: RE: Regression across variables**, *From:* "Wallace, John" <John_Wallace@affymetrix.com>

