
From: "Peter Wright" <Peter.Wright@nottingham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: Nested logit with shares/grouped data
Date: Wed, 09 Nov 2005 10:07:19 +0000

In response to Nick's comment, I have added more explanation of what I attempted below.

To recap the problem: how do you estimate a nested logit model in Stata when the left-hand-side variable is a count (or a market share), i.e. when the dataset records how many sales of each product are made in each time period?

The Stata web site offers advice for the multinomial logit model:

http://www.stata.com/support/faqs/stat/grouped.html

That advice is to put the data in "long" form first and then use frequency weights (fweights) with the mlogit command. The question is whether the same procedure is suitable for a nested logit model.

To implement the model, I ran the following code:

*****************************************************************
* As an example I use the Stata restaurant data.
* However, I collapse it to make it look like a dataset of shares/grouped data.
* The collapsed dataset has information on the choices made by 20 households
* regarding 7 restaurants. 15 yearly samples are taken.
*****************************************************************
clear
use restaurant.dta
gen year=1 if family_id<=20
replace year=2 if family_id>20 & family_id<=40
replace year=3 if family_id>40 & family_id<=60
replace year=4 if family_id>60 & family_id<=80
replace year=5 if family_id>80 & family_id<=100
replace year=6 if family_id>100 & family_id<=120
replace year=7 if family_id>120 & family_id<=140
replace year=8 if family_id>140 & family_id<=160
replace year=9 if family_id>160 & family_id<=180
replace year=10 if family_id>180 & family_id<=200
replace year=11 if family_id>200 & family_id<=220
replace year=12 if family_id>220 & family_id<=240
replace year=13 if family_id>240 & family_id<=260
replace year=14 if family_id>260 & family_id<=280
replace year=15 if family_id>280 & family_id<=300
collapse (sum) chosen (mean) income kids cost rating distance, by(restaurant year)
rename chosen sales
sort year
by year: egen total_sales=sum(sales)
gen market_share=sales/total_sales
*****************************************************************
* The dataset has information on sales (and sales shares) as well
* as some explanatory variables.
* There are 20 households choosing between 7 restaurants.
* The sample is repeated for 15 years.
* This is the kind of dataset that I had in mind.
* How would you run a nested logit model using such shares/grouped data?
*****************************************************************
* If we follow a methodology similar to that suggested for the multinomial
* logit model, we need to expand the data so that it has 7*7 rows for each year.
expand 7
sort year restaurant
* Number the choices 1 to 7.
egen alt_id=fill(1 2 3 4 5 6 7 1 2 3 4 5 6 7)
* Create an artificial chosen variable which is one for each restaurant
* in turn (zero for the others).
sort year alt_id restaurant
gen chosen=0
by year alt_id: replace chosen=1 if _n==alt_id
* You also need a weighting variable to tell Stata how many times each
* restaurant was chosen (from the group of 7).
replace sales=. if chosen==0
by year alt_id: egen sales2=mean(sales)
gen alt_id2=10*year+alt_id
* You can see that the dataset now looks very much like one based on
* individual data. The only difference is that the sample will be
* weighted by sales2.
gen type=0
replace type=1 if restaurant==1| restaurant==2
replace type=2 if restaurant==3| restaurant==4| restaurant==5
replace type=3 if restaurant==6| restaurant==7
* Now specify your nested logit model and run it.
gen incFast=(type==1)*income
gen incFancy=(type==3)*income
gen kidFast=(type==1)*kids
gen kidFancy=(type==3)*kids
nlogit chosen (restaurant = cost rating distance) (type=incFast incFancy kidFast kidFancy) [fweight=sales2], group(alt_id2)
*****************************************************************

This procedure yields the following results:

Nested logit regression
Levels             =          2                 Number of obs      =      2100
Dependent variable =     chosen                 LR chi2(10)        =  -676.381
Log likelihood     = -513.32241                 Prob > chi2        =    1.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
restaurant   |
        cost |  -.2347816   .1384955    -1.70   0.090    -.5062277    .0366645
      rating |   .3833214   .2482818     1.54   0.123    -.1033021    .8699449
    distance |  -.3779229   .2466483    -1.53   0.125    -.8613448    .1054989
-------------+----------------------------------------------------------------
type         |
     incFast |   .0054128    .069671     0.08   0.938    -.1311398    .1419654
    incFancy |   .0715661   .0505795     1.41   0.157    -.0275679       .1707
     kidFast |  -.5918203   .6533741    -0.91   0.365     -1.87241    .6887694
    kidFancy |  -.6183388   .5423909    -1.14   0.254    -1.681405    .4447279
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
type         |
      /type1 |    3.94913   3.423767     1.15   0.249    -2.761329    10.65959
      /type2 |   2.633478   2.804631     0.94   0.348    -2.863497    8.130453
      /type3 |   1.281784   .7357307     1.74   0.081    -.1602222    2.723789
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(3)= -680.30   Prob > chi2 = 1.0000
------------------------------------------------------------------------------

To check these results, I compared them against LIMDEP NLOGIT (which claims to be able to cope with shares/grouped data). I get different results:

Normal exit from iterations. Exit status=0.

+---------------------------------------------+
| FIML: Nested Multinomial Logit Model        |
| Maximum Likelihood Estimates                |
| Dependent variable                    SALES |
| Weighting variable                      ONE |
| Number of observations                  105 |
| Iterations completed                      5 |
| Log likelihood function           -524.2610 |
| Restricted log likelihood         -592.6711 |
| Chi-squared                        136.8201 |
| Degrees of freedom                       10 |
| Significance level                 .0000000 |
| R2=1-LogL/LogL*  Log-L fncn  R-sqrd  RsqAdj |
| No coefficients   -592.6711  .11543  .00486 |
| Constants only.  Must be computed directly. |
|                  Use NLOGIT ;...; RHS=ONE $ |
| At start values   -527.7727  .00665 -.11751 |
| Response data are given as frequencies.     |
+---------------------------------------------+
+---------------------------------------------+
| FIML: Nested Multinomial Logit Model        |
| The model has 2 levels.                     |
| Coefs. for branch level begin with B5       |
| Number of obs.= 15, skipped 0 bad obs.      |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
          Attributes in the Utility Functions
 B2         -.1827804310      .22258969E-01   -8.212   .0000
 B3          .5317320087      .13437809        3.957   .0001
 B4          .5306557987      .13130855        4.041   .0001
          Attributes of Branch Choice Equations
 B5         -.5031389578E-01  .94708096E-01    -.531   .5952
 B6          .6830782466      1.5458378         .442   .6586
 B7          .2425946290E-01  .44919700E-01     .540   .5892
 B8         -.4414619828      .66479462        -.664   .5067
          Inclusive Value Parameters
 TYPE1       .9859433268      .25540989        3.860   .0001
 TYPE2       1.054564664      .25223344        4.181   .0000
 TYPE3       .9397231335      .27169252        3.459   .0005

Is this because you cannot proceed as I suggest above (or because LIMDEP is wrong)? (Incidentally, I think the Stata results are more likely to be correct, as the t-ratios appear too high in LIMDEP.)
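The expand-and-weight bookkeeping described in the message can be sanity-checked on a small toy dataset before running any model. The sketch below is only an illustration under stated assumptions: the data (2 years, 3 alternatives, made-up sales counts) and the variable names set_id, w_tmp, w and n_chosen are hypothetical and are not taken from restaurant.dta. It checks that every copied choice set has exactly one chosen row and that the frequency weights add up to total sales. It says nothing about whether nlogit treats those weights correctly.

```stata
* Toy check of the expand-and-weight step (hypothetical data, not restaurant.dta)
clear
input year alt sales
1 1 5
1 2 3
1 3 2
2 1 4
2 2 4
2 3 2
end

* One copy of each year's choice set per alternative that could be "chosen"
expand 3
* set_id (hypothetical name) runs 1..3 and says which alternative this copy marks as chosen
bysort year alt: gen set_id = _n
gen byte chosen = (alt == set_id)

* Weight of each copied choice set = sales of the alternative it marks as chosen
gen w_tmp = sales if chosen
bysort year set_id: egen w = max(w_tmp)
drop w_tmp

* Check 1: exactly one chosen row per copied choice set
bysort year set_id: egen n_chosen = total(chosen)
assert n_chosen == 1

* Check 2: weights on the chosen rows add up to total sales (10 per year, 20 overall)
summarize w if chosen, meanonly
assert r(sum) == 20
```

The same two checks could be applied to the real data after sales2 and alt_id2 have been built; with 300 families in total, the weights on the chosen rows should sum to 300.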
