


RE: st: longitudinal data

From   Nick Cox
Subject   RE: st: longitudinal data
Date   Mon, 13 Jun 2011 14:24:23 +0100

Without wanting to inhibit Paul Seed's independence in any way, I draw attention to advice in the Statalist FAQ which strongly suggests that you keep threads public. 

Offering public help once need not be an indication that anyone is available for private help afterwards. 


joshua liwindi

Hi Paul,
Thanks for the response; this has given me a starting point.
I won't hesitate to contact you for more help.
I hope you won't mind.


 From: Seed, Paul
Dear Joshua, 

In principle the method of analysis should be specified in the 
study protocol, and will be implied in the power calculation.
You should check what was written at the time.

As you are not experienced in Stata, you might be best keeping 
the data in wide format, and using summary scores.
The most obvious summary score would be the 
sum (or average) of the parasite counts.
You can compute the average using something like 
 egen av_count = rowmean(count1 count2 count7 count14 count21 count28)
depending on your variable names. count0 (pre-randomisation) is deliberately left out.

Use of egen will mean that individual missing counts are ignored, and the average
worked out for the non-missing values. The best way of handling missing data is 
another topic entirely.
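
As a minimal sketch of how egen treats missing counts, the toy data below (invented purely for illustration, not from the trial) can be run as is. It uses rowmean(), the documented name for what older code calls rmean(), and adds rownonmiss() so you can see how many counts each average rests on. The variable name n_obs is hypothetical.

```stata
* Toy data, invented purely to show how missing values are handled
clear
input id count0 count1 count2 count7 count14 count21 count28
1 120  80 40 10 0 0 0
2 200 150  . 30 5 . 0
3  90  60 20  . . . .
end

* Row mean over the non-missing post-baseline counts (count0 left out)
egen av_count = rowmean(count1 count2 count7 count14 count21 count28)

* n_obs (hypothetical name): number of non-missing counts behind each average
egen n_obs = rownonmiss(count1 count2 count7 count14 count21 count28)

list id av_count n_obs
```

In this toy example the average for patient 3 rests on only two counts, which is worth knowing before treating all averages as equally reliable.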

See Matthews, J.N.S., Altman, Douglas G., Campbell, M.J. and 
Royston, Patrick. Analysis of serial measurements in medical 
research. Br Med J, 1990;300:230-235. 

You will only arrive at odds ratios and proportions if your summary score is a simple yes/no
with each patient (or each count) categorised as "unchanged" or "reduced", 
or whatever pair of categories is most appropriate.
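
A sketch of the yes/no route, assuming av_count, count0 and the 0/1 indicators sp_as2 and sp_as3 already exist in memory. The definition of "reduced" used here (average post-baseline count below baseline) is only one possible choice, and the names reduced and arm are hypothetical.

```stata
* Hypothetical definition: "reduced" if the average post-baseline count is below baseline
generate byte reduced = av_count < count0 if !missing(av_count, count0)

* arm (hypothetical name): 0 = placebo, 1 = SP+AS2, 2 = SP+AS3
generate byte arm = sp_as2 + 2*sp_as3

* Proportion reduced within each arm
tabulate arm reduced, row

* Odds ratios for each active arm versus placebo, with 95% confidence intervals
logistic reduced sp_as2 sp_as3
```

Whichever cut-off is used should be the one in the protocol, not one chosen after looking at the data.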

For analysis purposes, you achieve more power if you adjust for the baseline 
disease severity (pre-randomisation). Something like 
 regress av_count sp_as2 sp_as3 count0
will achieve this. I assume that sp_as2 sp_as3 are 1 for the relevant treatment and 0 otherwise.
(I also assume that this is a parallel group trial, and there are no planned crossovers.)
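
As a sketch of how the adjusted comparison might be read off, assuming the variables are named as above: the coefficients on sp_as2 and sp_as3 are the baseline-adjusted differences from placebo, and regress reports a 95% confidence interval and p-value for each. The follow-up commands are optional extras, not part of the original suggestion.

```stata
* Baseline-adjusted comparison of mean counts (ANCOVA-style)
regress av_count sp_as2 sp_as3 count0

* Joint test that neither active arm differs from placebo
test sp_as2 sp_as3

* Difference between the two active arms, with a 95% confidence interval
lincom sp_as3 - sp_as2
```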

See Frison L & Pocock SJ (1992) Repeated measures in clinical trials: 
analysis using mean summary statistics and its implications for 
design. Statistics in Medicine; 11: 1685-1704.

You should also look at distributions and consider whether you should be working 
with the log of the counts, possibly first adding a notional 1 (or 0.5) to cope with any zeros.
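
One way this check might look, assuming the count variables are named as in the egen example. The ln_ prefix and the graph names are hypothetical, and the added constant of 1 is the notional value mentioned above (0.5 would work the same way).

```stata
* Inspect the raw baseline distribution first
summarize count0, detail
histogram count0, name(raw, replace)

* Log-transform every count, adding 1 so that zeros stay defined
foreach v of varlist count0 count1 count2 count7 count14 count21 count28 {
    generate ln_`v' = ln(`v' + 1)
}

* Compare with the transformed distribution
histogram ln_count0, name(logged, replace)
```

If the logged counts look more symmetric, the averaging and regression steps can be repeated on the ln_ variables.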

Confidence intervals for differences in parasite counts will be provided by regression or ...

Paul T Seed, Senior Lecturer in Medical Statistics, 
Division of Women's Health, King's College London and King's Health Partners
020 7188 3642

Date: Sun, 12 Jun 2011 15:38:21 +0000
From: joshua liwindi
Subject: st: longitudinal data

Hi all,
 I would like a hint on the easiest way to analyse these data. I am not very good with Stata, and even less so with biostatistics.
 I have a data set from an anti-malarial study with 3 arms: placebo (SP) and two active arms, (SP+AS3) and (SP+AS2).
 Gametocyte count is measured at days 0, 1, 2, 7, 14, 21 and 28.
 The dataset is in wide format.
 I am interested in the effectiveness of treatment on gametocyte count and its prevalence during the 28 days.
 1. How do I calculate odds ratios and proportions,
 2. p-values of (SP vs AS1) and (SP vs AS2), and
 3. corresponding p-values of the difference in prevalence in those two groups, with their respective 95% confidence intervals, for the days above?

 Do I need to change the data set to a long format, or can I do all this analysis with the current wide format?

