# st: ratios generated from count data

 From Nikolaos Pandis To statalist@hsphsun2.harvard.edu Subject st: ratios generated from count data Date Sun, 21 Dec 2008 05:52:17 -0800 (PST)

```Hi to everyone,

I was wondering if someone would help with my data.

I have counted the number of articles that show nonsignificant and significant results in 5 different journals over a five-year period. The count data are further divided into 4 categories depending on the subject of the article.

The strategy I followed is:

I collapsed my data to avoid the zero values that were generated per monthly journal issue.

I calculated the ratio of nonsignificant to significant counts (nonsig/sig).

I log-transformed the ratio variable, which gave reasonably normally distributed data.

I ran the following analysis:

xi:regress logratio i.journalnumb i.subject,r eform(exp(Coef.))
i.journalnumb     _Ijournalnu_0-4     (naturally coded; _Ijournalnu_0 omitted)
i.subject         _Isubject_1-4       (naturally coded; _Isubject_1 omitted)

Linear regression                                 Number of obs =     100
                                                  F(  7,    92) =   20.48
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.6031
                                                  Root MSE      =  .66097

------------------------------------------------------------------------------
             |               Robust
    logratio | exp(Coef.)   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Ijournaln~1 |   1.411162   .2435852     2.00   0.049      1.00159    1.988217
_Ijournaln~2 |   1.342944   .2330319     1.70   0.093     .9514501    1.895525
_Ijournaln~3 |   4.983856   1.073649     7.46   0.000     3.249011     7.64504
_Ijournaln~4 |   4.879356    .985192     7.85   0.000     3.267424    7.286508
 _Isubject_2 |   1.460614   .2671218     2.07   0.041     1.015758    2.100297
 _Isubject_3 |   .9367424   .1905576    -0.32   0.749     .6253973    1.403086
 _Isubject_4 |   2.381482   .4815867     4.29   0.000     1.593756    3.558547

Questions:

1. My concern relates to the count nature of my data. I am not sure that creating ratios of count data, treating the ratio as a continuous outcome, and then performing linear regression is correct. Is this valid, or would you recommend another approach?

2. If I add an interaction term (journalnumb*subject, not included in the analysis shown above), I get a significant result. I am not sure how to interpret that in conjunction with the rest of the analysis.

3. How would you perform pairwise comparisons of the subgroups created?

Thank you very much for taking the time to read my posting.

Best wishes,

Nikolaos

```
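The strategy described in the post (collapse, form the ratio, take logs, regress) might look roughly like the sketch below. The variable names `nonsig`, `sig`, and `year` are hypothetical, and collapsing to one row per journal, subject, and year is an assumption: it would give 5 x 4 x 5 = 100 cells, which matches the reported number of observations, but the post does not state the collapse level.

```stata
* Hypothetical variables: nonsig and sig hold article counts, year is the
* publication year. Assumes one row per journal issue and subject category.

* Sum the counts over issues so sparse monthly cells are pooled
collapse (sum) nonsig sig, by(journalnumb subject year)

* Ratio of nonsignificant to significant counts (missing when sig == 0)
generate ratio = nonsig/sig

* ln() returns missing when ratio is 0 or missing, so those cells drop out
generate logratio = ln(ratio)

* Same model as in the post, with the options written out in full
xi: regress logratio i.journalnumb i.subject, robust eform("exp(Coef.)")
```

With `eform()`, the reported values are exponentiated coefficients, so each one reads as a multiplicative effect on the nonsig/sig ratio relative to the omitted category.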