Stata The Stata listserver

Re: st: RE: Multiple responses on plot level data

From   "Ronnie Babigumira" <>
To   <>
Subject   Re: st: RE: Multiple responses on plot level data
Date   Fri, 22 Apr 2005 16:37:17 +0200

Hi all
Nick, Rafael, and Menale, many thanks for your input. Let me try to
take stock of what I know so far. The problem is that I have plot level
data (with some household specific variables). The response variable(s)
are choice of conservation measure (three possible choices: fert, man,
and fanyaju). The data look something like this:

hhd_id	plnum	fert	man	fanyaju  (and a number of predictor variables)
1001	1	0	1	0
1001	2	0	1	1
1001	3	1	0	1
1002	1	0	0	0
1002	2	1	1	1
1003	1	1	0	0

Here is what has been suggested

A. Simplest solution

Estimate a simple probit/logit for each of the three measures (fert,
man, and fanyaju). This would allow us to examine the effect of the
predictor variables on the odds of choosing a particular measure versus
all other measures including none of them. We are not able to say
anything about the odds of A vs B but rather A vs all other choices.

B. Even better
Use a multinomial logit. This is more interesting because in my
understanding, it allows us to examine the effect of the predictor
variables on the odds of choosing A vs B, or A vs C (and all other
combinations). I much prefer this; however, for my data, it presents an
additional challenge.

If each plot could have only one of the 3 then we would be home and
dry. However, as we see for plot 2 of 1001, a plot could have 2, or even
3 (as for plot 2 of 1002), and herein lies my challenge.

Menale has proposed that we add more observations. He proposes that we
create a new variable, choice, which takes on the values 1, 2, 3 for fert,
man, and fanyaju respectively, with one observation per measure used on a
plot. So the data according to his proposal would then look like

hhd_id	plnum	choice
1001	1	2
1001	2	2
1001	2	3
1001	3	1
1001	3	3
1002	1	0
1002	2	1
1002	2	2
1002	2	3
1003	1	1
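
A minimal sketch of this expansion, written in Python rather than Stata purely for illustration. The function name to_long and the list layout are my own assumptions, not part of Menale's proposal; the rows are the example data above. Each plot yields one row per measure used, and a plot with no measure keeps a single row with choice 0.

```python
# Example rows from this post: (hhd_id, plnum, fert, man, fanyaju)
plots = [
    (1001, 1, 0, 1, 0),
    (1001, 2, 0, 1, 1),
    (1001, 3, 1, 0, 1),
    (1002, 1, 0, 0, 0),
    (1002, 2, 1, 1, 1),
    (1003, 1, 1, 0, 0),
]


def to_long(rows):
    """Return one (hhd_id, plnum, choice) row per measure used on a plot.

    Measures are coded 1 = fert, 2 = man, 3 = fanyaju, by position.
    A plot with no measure gets a single row with choice 0.
    (to_long is a hypothetical helper name used only in this sketch.)
    """
    out = []
    for hhd_id, plnum, *flags in rows:
        codes = [i + 1 for i, used in enumerate(flags) if used]
        for code in codes or [0]:
            out.append((hhd_id, plnum, code))
    return out


long_rows = to_long(plots)
for row in long_rows:
    print(*row, sep="\t")
```

Note that the six plots become ten observations, which is where the repeated plot rows come from.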

This would appear OK to me; however, my concern is that now you have
clustering at both the household and plot level. Would this be a
problem?
Nick, on the other hand, proposes that in addition to the 4 clean classes
(0, 1, 2, 3), we add 4, 5, 6 and 7, which reflect different combinations of
the three. Something like this

fert                1
man                 2
fanyaju             3
fert/man            4
man/fanyaju         5
fert/fanyaju        6
fert/man/fanyaju    7
none                0

The data according to Nick's proposal would look something like this

hhd_id	plnum	Choice
1001	1	2
1001	2	5
1001	3	6
1002	1	0
1002	2	7
1003	1	1
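
A minimal sketch of this recoding, in Python rather than Stata purely for illustration. The dictionary name COMBO_CODE is my own assumption; the codes follow the combination table given in this post, and the rows are the example data. Each plot keeps exactly one row, so no observations are added.

```python
# Keys are (fert, man, fanyaju) indicator triples; values are the class
# codes from the combination table in this post.
# (COMBO_CODE is a hypothetical name used only in this sketch.)
COMBO_CODE = {
    (0, 0, 0): 0,  # none
    (1, 0, 0): 1,  # fert
    (0, 1, 0): 2,  # man
    (0, 0, 1): 3,  # fanyaju
    (1, 1, 0): 4,  # fert/man
    (0, 1, 1): 5,  # man/fanyaju
    (1, 0, 1): 6,  # fert/fanyaju
    (1, 1, 1): 7,  # fert/man/fanyaju
}

# Example rows from this post: (hhd_id, plnum, fert, man, fanyaju)
plots = [
    (1001, 1, 0, 1, 0),
    (1001, 2, 0, 1, 1),
    (1001, 3, 1, 0, 1),
    (1002, 1, 0, 0, 0),
    (1002, 2, 1, 1, 1),
    (1003, 1, 1, 0, 0),
]

# One (hhd_id, plnum, choice) row per plot.
coded = [(h, p, COMBO_CODE[(f, m, j)]) for h, p, f, m, j in plots]
for row in coded:
    print(*row, sep="\t")
```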

I have some concerns with this, particularly that choices 4 to 7 are
combinations of choices 1 to 3. Isn't this a problem? (Forgive my
ignorance, but doesn't this point in the direction of the IIA problem?)

Finally, Rafael recommends a different approach. I'm still working through
his posting and will get back to him in a little while.

So, here I am: more ideas, a little more confusion, and certainly none the
wiser. I would appreciate comments.


>>> menale kassie <> 04/22/05 1:01 am >>>

What is the problem if one repeats those plots that have more than one
conservation measure within the household? Assume one household has
three plots. One of the plots has two conservation measures; assume the
others have one conservation measure each. Then when we repeat the plot
with two conservation measures, the household will have four plot
observations. Look at the following example.

hhno	plotno	fert	manure	fanya juu
1	1	1	1	0
1	2	0	1	0
1	3	0	0	1

Then, repeating the first plot, you can construct one dependent variable
as follows

hhno	plot	Y
1	1	1
1	1	2
1	2	2
1	3	3

where 1 = fert, 2 = manure, 3 = fanya juu, and Y is the dependent variable.

Hope this is right.

Menale K

Ronnie Babigumira <> wrote:

Thanks Nick, sorry for the lack of clarity. Measure choice is my response
(dependent is the term I'm more familiar with) variable(s). I do have a
set of predictor (independent) variables which I did not include in my
email.

Yes I did consider logit models for individual responses. I also
considered constructing a dependent variable indicating whether or not
any one of the choices was made.

However, I have been thinking about the mlogit. So you say 8 possible
choices: would that mean 1, 2, 3 for fert, man, fanyaju respectively, and
then 5 to 8 for the different combinations of each? If so, isn't this a
problem, because the classes are not exactly independent (say, 5 if
choices 2 and 3 were made on the plot)?


Ronnie Babigumira
Dept. of Economics and Resource Management
Norwegian University of Life Sciences (UMB)
PO Box 5003,
N-1432 Ås,
>>> 04/21/05 4:30 PM >>>
I am not clear whether you regard
measure choice as a predictor or a response.
And in any case I can imagine
situations in which both views are reasonable.
Soil erosion loss could depend on
conservation measures, while choice of conservation
measures might depend on many things.

If it were the latter, I think it is up to you how you define
a composite response. Nor is it clear to me that
you need do that. If this were my problem
I would look at

(a) logit models for the individual responses


(b) a multinomial logit for the 8 possible choices of
fert, man, and fanyaju.

If conservation measures are predictors, you would
need to look not just at dummies but also at interactions.


Ronnie Babigumira

> Something has been slowly eating away at me and at this point I have
> decided to seek help from the list. I have plot level data on use of
> soil conservation measures and would like to construct a "single"
> dependent variable for each household for use in a multinomial logit.
> The data look something like this
> hhd_id plnum fert man fanyaju
> 1001 1 0 1 0
> 1001 2 0 1 1
> 1001 3 1 0 1
> 1002 1 0 0 0
> 1002 2 1 1 1
> 1003 1 1 0 0
> Where
> hhd_id: Household id
> plnum: Plot number (a household may have more than one plot)
> And fert, man and fanyaju are 3 possible soil conservation measures a
> household may undertake (it is possible that more than one measure may
> be applied to a plot)
> My question is how do I go about constructing a single dependent
> variable for use in a multinomial logit in this case (and
> would this be
> correct). I have considered the simplest case where I
> construct a simple
> dummy for each plot indicating whether or not a household
> used at least
> one of the measures however, I feel that it would be more
> interesting if
> I could say something on the determinants for the use of the
> measures
