# Re: st: reshaping data?

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: reshaping data?
Date: Fri, 26 Oct 2012 15:54:21 +0100

```
I am getting closer, but it seems to me that there is still some
arbitrariness here.

If I am 789 then the others are 790 and 791 and should be P_1 P_2. But
which should be which? I can't see a rationale here.

Nick

On Fri, Oct 26, 2012 at 2:35 PM,  <agostino@unical.it> wrote:
> dear Nick,
> each item is observed each week (for two years, so  I have 106 observations
> for each supermarket in my sample), and belongs to a family. The families
> include from one up to 16 items (the number of items is not fixed for each
> fam). What follows is an example of a family made up of three items
>
>
> Code_item       week    Q       P       Code_family
> 789             1       8       2       1
> 790             1       25      4       1
> 791             1       9       1.3     1
>
> 789             2       12      2       1
> 790             2       2       3       1
> 791             2       20      1.2     1
>
>
> and so forth for  106 times (each supermarket)
> I'd like to regress Q on P (and other ctrl vbls, that I omit for brevity)
> and the prices of the other two items (substitute goods), in other words to
> have a dataset like follows:
>
> Code_item       week    Q       P       Code_family   P_1  P_2
> 789             1       8       2       1              4   1.3
> 790             1       25      4       1              2   1.3
> 791             1       9       1.3     1              2    4
>
> 789             2       12      2       1              3    1.2
> 790             2       2       3       1              2    1.2
> 791             2       20      1.2     1              2     3
>
>
>
>
> Quoting Nick Cox <njcoxstata@gmail.com>:
>
>> Sorry, but this seems to imply as many predictors as observations,
>> which isn't a good idea.
>>
>> Presumably you don't mean that, so you have to tell us more about your
>> data structure for this to be clear to me.
>>
>> Nick
>>
>> On Fri, Oct 26, 2012 at 12:31 PM,  <agostino@unical.it> wrote:
>>>
>>> dear Nick,
>>> I'd like to estimate separate regressions, one for each family, hence the
>>> number of predictors would be  the same for each family
>>>
>>> hope this clarifies
>>> best
>>> M.
>>>
>>>
>>> Quoting Nick Cox <njcoxstata@gmail.com>:
>>>
>>>> In that case I really don't understand what you are seeking. See also
>>>>
>>>> I've never come across a model in which there are a different number
>>>> of predictors for different observations, although I am always happy
>>>> to be educated.
>>>>
>>>> It seems to me that
>>>>
>>>> 1. Either you are applying a standard model, in which case you can
>>>> give literature references.
>>>>
>>>> 2. Or this is a new model, in which case you need to explain how it
>>>> would be set-up and estimated.
>>>>
>>>> Note that it's easy to get variables such as the mean of the other
>>>> prices in the same family. See
>>>>
>>>> FAQ     . . Creating variables recording prop. of the other members of a group
>>>>         . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>>>>         4/05    How do I create variables summarizing for each
>>>>                 individual properties of the other members of a
>>>>                 group?
>>>>                 http://www.stata.com/support/faqs/data/members.html
>>>>
>>>>
>>>> On Thu, Oct 25, 2012 at 3:39 PM,  <agostino@unical.it> wrote:
>>>>>
>>>>>
>>>>> dear Nick,
>>>>> thanks for your prompt reply. Unfortunately, the number of items is not
>>>>> fixed
>>>>> best regards
>>>>> Maria
>>>>>
>>>>>
>>>>> Quoting Nick Cox <njcoxstata@gmail.com>:
>>>>>
>>>>>> This model formulation makes me feel a bit queasy, but I think what
>>>>>> you want is something like this.
>>>>>>
>>>>>> Suppose for concreteness the number of items is fixed at 8. (I don't
>>>>>> see how this will work if the number is not fixed.) So, "for any value
>>>>>> of 8"
>>>>>>
>>>>>> sort code_family code_item
>>>>>> forval i = 1/8 {
>>>>>>        by code_family : gen P`i' = P[`i']
>>>>>> }
>>>>>>
>>>>>> Note that, contrary to your title, there is no -reshape- here as you
>>>>>> want your observations to remain observations; at least that's my
>>>>>> understanding.
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>> On Thu, Oct 25, 2012 at 3:08 PM,  <agostino@unical.it> wrote:
>>>>>>
>>>>>>> I have to estimate the equation
>>>>>>>
>>>>>>> Q1=a+b1P1+b2P2+...bnPn+e
>>>>>>>
>>>>>>>
>>>>>>> Where Q1 is the quantity of item 1 sold by a supermarket during a week,
>>>>>>> P1 is the price of item 1 in that week, the other prices are those of the
>>>>>>> n items belonging to the same family of items. My data set is organized as
>>>>>>> follows:
>>>>>>>
>>>>>>>
>>>>>>> Code_item       week    Q       P       Code_family
>>>>>>> 789             1       8       2       1
>>>>>>> 790             1       25      4       1
>>>>>>> 791             1       9       1.3     1
>>>>>>>
>>>>>>> 792             1       12      2       1
>>>>>>>
>>>>>>> 800             1       7       2       2
>>>>>>> 801             1       20      1.2     2
>>>>>>> 802             1       11      1.6     2
>>>>>>>
>>>>>>> 803             1       12      2       2
>>>>>>>
>>>>>>> And so forth for the other weeks and families...
>>>>>>>
>>>>>>>
>>>>>>> For each item, how can I include in my regression the prices of the other
>>>>>>> (n-1) items of the same family, ignoring the prices of the items belonging
>>>>>>> to other families?
```
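The layout requested in the thread (P_1, P_2 holding the prices of the other items in the same family and week) can be sketched in Stata as below. This is only one possible reading: it assumes the "other" prices are ordered by ascending Code_item, which is a convention chosen here to resolve the ambiguity raised above, not something the thread settles. The names rank, n_items, j and maxothers are made up for illustration.

```stata
* Sketch only: ordering the other items by ascending Code_item is an assumption,
* and rank, n_items, j, maxothers are illustrative names, not from the thread.
clear
input Code_item week Q P Code_family
789 1  8 2   1
790 1 25 4   1
791 1  9 1.3 1
789 2 12 2   1
790 2  2 3   1
791 2 20 1.2 1
end

* Position of each item within its family-week, in Code_item order
bysort Code_family week (Code_item): gen rank = _n
by Code_family week: gen n_items = _N

* The largest family determines how many P_k variables are needed
summarize n_items, meanonly
local maxothers = r(max) - 1

forvalues k = 1/`maxothers' {
    * The k-th "other" item is row k if that row precedes this item, else row k + 1
    gen j = cond(`k' < rank, `k', `k' + 1)
    * Under -by-, subscripts count within the family-week group;
    * smaller families are left missing for the P_k they do not have
    by Code_family week: gen P_`k' = P[j] if j <= _N
    drop j
}

list Code_item week P P_1 P_2, sepby(week)
```

Grouping on week as well as family keeps prices from different weeks apart. Note that P_1 then refers to different items in different observations, so whether its coefficient is interpretable in a regression remains the open question discussed in the thread.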