Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: analysing experimental panel data
Joerg Luedicke <firstname.lastname@example.org>
Thu, 18 Oct 2012 12:29:54 -0500
On second thought, there might be a source of misunderstanding here.
The OP stated that:
"We presented participants with two mores set of prices in a randomized order"
The way I read this is that all 2,000 study participants were
confronted with both additional prices, and that the order in which
these prices were presented to them was randomized.
However, I can imagine that the OP meant to say that they drew a
random sample of 1,000 individuals and assigned them to price a, and
the other 1,000 to price b. If the latter is true, then my approach is
of course useless here. In this case it looks like a straightforward
difference-in-difference design to me, as the Econ folks would call
it. That is, you would have a binary variable for treatment a/b and a
binary variable for baseline/follow-up and then use these variables
including their interaction to estimate the differences between
baseline/follow-up for both treatment arms. If these differences are
expected to vary across groups one would just need to include
additional interaction effects.
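A minimal sketch of that difference-in-differences setup, on simulated data, might look like the following. Everything in it is an assumption for illustration: the variable names (arm, post, ui), the effect sizes, and the choice of a Poisson model with person-clustered standard errors are hypothetical and not taken from the OP's data.

```stata
// Hypothetical data: 2,000 people, each randomized to price a or price b
clear
set seed 2012
set obs 2000
gen id  = _n
gen arm = runiform() < .5          // 0 = price a, 1 = price b (assumed coding)
gen ui  = rnormal(0, 0.5)          // person-specific baseline drinking (made up)

// Two rows per person: baseline and follow-up
expand 2
bysort id: gen post = _n - 1       // 0 = baseline, 1 = follow-up

// Made-up effects: both prices reduce drinking, price b a bit more
gen y = rpoisson(exp(0.3 - 0.2*post - 0.1*arm*post + ui))

// Treatment, period, and their interaction; SEs clustered on person
poisson y i.arm##i.post, vce(cluster id)

// Baseline to follow-up change on the log scale, by arm
lincom 1.post                      // arm a
lincom 1.post + 1.arm#1.post       // arm b

// The interaction term is the difference-in-differences
test 1.arm#1.post
```

If the changes are expected to differ across quota groups, the same specification could be extended by interacting a group indicator with both terms, for example i.arm##i.post##i.group with a hypothetical group variable.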
On Thu, Oct 18, 2012 at 11:12 AM, Joerg Luedicke wrote:
> This is quite a general inquiry and there is probably a lot of wriggle
> room in terms of how to analyze these data. There may also be a number
> of details that could matter that don't show up in the post. However,
> here is one possible approach.
> First of all, I would not call this 'panel data' in the sense that time
> as such does not seem to play a role here. I would rather just call it
> hierarchical data. Another thing is that this study probably does not
> qualify as an experiment since there are no randomized
> treatment/control groups (at least that is what I gather from the
> post, so please correct me if I am wrong). So my first intuition here
> would be to fit a multilevel model (aka mixed effects model) with a
> bunch of interaction terms. I only consider varying intercepts here
> but this could of course be extended to varying slopes as well. I also
> only consider 3 groups here, for the sake of simplicity.
> Let's start with generating some data:
> //2k individuals
> set seed 1234
> set obs 2000
> gen id=_n
> gen ei=rnormal() //unit-specific error term
> //3 groups
> gen p1=runiform()
> gen group=cond(p1 < .60, 1, cond(p1 < .80, 2, 3))
> label def gr 1"No cannabis" 2"sometimes" 3"regularly"
> label val group gr
> qui tab group, g(group_)
> //Expanding to 3 observations each
> expand 3
> bys id: gen treat=_n
> label def trt 1"base" 2"min price" 3"tax"
> label val treat trt
> qui tab treat, g(trt_)
> //Generating outcome (count of drinks on a Saturday night)
> //assuming only non-cannabis users care about prices
> gen xb = 0.3 + 0.2*group_2 + 0.4*group_3 - 0.2*trt_2 - 0.2*trt_3 ///
>     + 0.2*group_2*trt_2 + 0.2*group_3*trt_3 + 0.2*group_2*trt_3 ///
>     + 0.2*group_3*trt_2 ///
>     + ei
> gen exp=exp(xb)
> gen y=rpoisson(exp)
> In the above data generation we assume that people who consume
> cannabis drink more than people who don't, and people who use it
> regularly drink even more than people who just use it sometimes. We
> further assume that people who do not use cannabis drink less when
> prices increase, but cannabis users do not care about prices.
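As a rough check on what these assumptions imply on the count scale, the coefficients used in the simulation can be exponentiated. This is only arithmetic on the numbers in the gen xb line, holding the unit-specific error fixed; it is not model output.

```stata
// Multiplicative effects implied by the simulated linear predictor
display exp(-0.2)        // non-users: either price increase vs. baseline
display exp(-0.2 + 0.2)  // cannabis users: price effect and interaction cancel
display exp(0.2)         // "sometimes" vs. non-users at baseline
display exp(0.4)         // "regularly" vs. non-users at baseline
```

So non-users drink roughly 18% less under either price increase (a ratio of about 0.82), while the ratio for both cannabis groups is exactly 1, which matches the description above.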
> We can then fit the model using a multilevel Poisson model:
> //Fitting a multilevel Poisson model
> xtmepoisson y i.group##i.treat || id:
> And can obtain marginal counts for all treatment by cannabis groups:
> //Predicted counts using model fixed effects
> margins i.group##i.treat, predict(fixedonly)
> after which we can compare differences in drinking amounts using
> -test- (possibly with the -mtest- option if we do multiple
> comparisons). However, these are not really marginal counts in the
> sense that they are not population averaged counts because we
> disregard the random error which stems from the variation of
> differences in baseline drinking among the 2k individuals. Getting
> 'real' population averaged effects here is not easy because we can't
> just average over the random effects since the error is only normally
> distributed with a mean of zero at the predictor scale, not the
> outcome scale. However, an easy alternative would be to just fit a
> marginal model:
> //Population averaged model
> xtgee y i.group##i.treat, family(poisson) link(log) i(id) vce(robust)
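The point about the predictor scale versus the outcome scale can be written out. The notation is assumed here for illustration: u_i is the random intercept with variance sigma^2, and x beta is the fixed portion of the linear predictor.

```latex
u_i \sim N(0, \sigma^2)
\quad\Longrightarrow\quad
\mathrm{E}\!\left[\exp(x\beta + u_i)\right]
  = \exp(x\beta)\,\exp\!\left(\sigma^2/2\right)
  \neq \exp(x\beta)
\quad \text{for } \sigma^2 > 0 .
```

With a random intercept only, the population-averaged count therefore exceeds the fixed-portion prediction by the factor exp(sigma^2/2). In the simulation above the unit-specific error has a standard deviation of 1, so that factor would be exp(0.5), about 1.65. Under these assumptions the factor is the same in every group by treatment cell, so ratios between cells are unaffected and only the overall level shifts.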
> And again we can look at the marginal counts:
> //Marginal counts
> margins i.group##i.treat
> and can do some testing, for example:
> //Testing the difference in #drinks between baseline and min-price increase
> //for people who use cannabis sometimes vs. non-users
> test (_b[2.group#1bn.treat]-_b[2.group#2.treat]) = ///
> Depending on what you actually want to test it might be unnecessary to
> go via -margins-. For example the above test is equivalent to the test
> for the group_2#treat_2 interaction term in the model. However, it is
> always a good idea to look at some model predictions to check whether
> they actually make sense etc.
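A self-contained sketch of testing the interaction term directly is given below. It regenerates data along the lines of the simulation above in compact form (the price effect is written as a single term that applies to non-users only, which is algebraically the same linear predictor), and it uses xtset id in place of the i(id) option. The equivalence with the difference in baseline-to-min-price changes is meant on the log scale of the model coefficients.

```stata
// Compact re-creation of the simulated data (assumed to mirror the above)
clear
set seed 1234
set obs 2000
gen id = _n
gen ei = rnormal()                       // unit-specific error term
gen p1 = runiform()
gen group = cond(p1 < .60, 1, cond(p1 < .80, 2, 3))
expand 3
bysort id: gen treat = _n                // 1 = base, 2 = min price, 3 = tax

// Only non-users (group 1) respond to either price increase
gen xb = 0.3 + 0.2*(group==2) + 0.4*(group==3) ///
    - 0.2*(treat>1)*(group==1) + ei
gen y = rpoisson(exp(xb))

// Population averaged model
xtset id
xtgee y i.group##i.treat, family(poisson) link(log) vce(robust)

// Do "sometimes" users change from baseline to min price
// differently from non-users? (log scale)
test 2.group#2.treat
```

The joint version for both cannabis groups and both price scenarios would be testparm i.group#i.treat.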
> On Wed, Oct 17, 2012 at 9:08 PM, Matthew Sunderland
> <email@example.com> wrote:
>> Hi All
>> I am seeking advice on how best to analyse data arising from an experiment. We surveyed 2,000 people asking them to hypothetically purchase and consume alcohol for an imaginary Saturday night.
>> We collected data for three imaginary nights. First we presented participants with a set of alcohol prices reflecting current prices (baseline). We then presented participants with two more sets of prices in a randomized order, reflecting price increases resulting from i) the establishment of a minimum price and ii) an increase in the rate of tax. Participants comprise six quotas, differentiated by gender and recent cannabis and ecstasy use. Alcohol consumption is measured by the number of standard drinks, calculated by us from participant reports of how many items of alcohol they would consume, e.g. glasses of wine, stubbies of beer, etc. About 30% of the participants did not drink at baseline.
>> We'd like to know: Do the two reforms have different impacts? Do people in different quotas respond differently to the reforms? Do people with different levels of base-line drinking respond differently to the reforms?
>> One option we've thought of is to run two sets of fixed effects analyses (which washes out unobserved heterogeneity relating to alcohol consumption and quota membership), each using panel data for drinking at baseline and one of the reforms. Another option is simply to control for baseline consumption. We're thinking of running the analysis in two steps: a logit for whether or not someone drinks, and an OLS regression for drinkers on the log of standard drinks consumed, controlling for the predicted values coming from the logit.
>> Dr Matthew Sunderland
>> Drug Policy Modelling Program, National Drug and Alcohol Research Centre
>> The University of New South Wales
>> Sydney NSW AUSTRALIA 2052
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/