# Re: st: RE: RE: Creating a group variable based on values in observations

 From Bert Jung
Subject Re: st: RE: RE: Creating a group variable based on values in observations
Date Fri, 20 May 2011 21:32:49 -0400

```
Hi Chris,

I am not sure if I understand your problem but maybe this helps:

If you would like group IDs for all unique values of your
"openmarkets" variable, you could use the group() function of -egen-.
I suspect that requires the values in the openmarkets variable to
always be listed in the same order, since -egen- would treat "2-3-5"
and "5-3-2" as two different groups, although to you and me they're
the same.

If "openmarkets" is a string variable you could also remove the "-"
with -subinstr- or with -destring openmarkets, ignore("-") gen(new)-.
Since the values 1 to 5 seem to uniquely identify the week days (?)
that would be similar to Sarah's suggestion.

Cheers,
Bert

On Fri, May 20, 2011 at 7:49 PM, Sarah Edgington wrote:
> Chris,
> I think there are a number of different ways to solve this problem.
> How many markets are you dealing with?  If it's fewer than 17 here's a
> solution that gets you around the reshaping issue.
> First, create a new market id where market 1=1, market 2=10, market 3=100,
> etc.  Then sum this id within days.  That will give you a group variable
> where each place represents a particular market (starting with market 1 on
> the right) and a 1 or 0 tells you if the market was open or not.  Your day
> one group id would be 11111.  Day two's would be 10110.
>
>        gen double mid=10^(market-1)
>        bysort day: egen double margroup=total(mid)
>
> This only works reliably up to 16 markets because a double stores integers
> exactly only up to 2^53 (about 9.0e15).  In
> principle, though, you could do it in any base and have everything add up to
> create a unique group id.  So if you used 2 as your base instead of 10 (that
> is, change the first line to  gen double mid=2^(market-1) ) you'd be able to
> accommodate more markets.  Doing that you lose the ability to easily look at
> it and read which markets are open straight from the group variable.  That
> doesn't really matter for analytical purposes, though.
>
> -Sarah
>
>
>
> Hi,
>
> I think I have a solution. My data is a bit too big to do this all at once
> (reshape gives a return code telling me productmarket takes on too many
> values) but here is what works in case anyone runs into a similar
> problem:
>
> . gen marketdup = market
> . reshape wide market, i(date) j(marketdup)
> . egen openmarkets = concat(market*), punc(_)
> . encode openmarkets, gen(groupid)
> . drop openmarkets
> . reshape long
> . drop marketdup
>
> Chris
>
>
> Hi Statalist,
>
> I have a problem that's been troubling me for a while now. I have daily
> prices for several products in several markets over time. I use the data to
> measure price dispersion as the coefficient of variation of prices on a day
> for a product. However, not every market is open on every day.
> Systematic differences between the markets that are open (such as average
> distance between markets, percent of markets of type A, etc.) could impact
> price dispersion, so I need to control for this. For each product I would
> like to create a variable that lists which group of markets was open on each
> day (openmarkets in the example below). I could then encode this variable
> and include i.groupid which controls for these differences.
>
> Example data for one of the products:
>
> day     market          openmarkets     groupid
> 1       1               1-2-3-4-5       1
> 1       2               1-2-3-4-5       1
> 1       3               1-2-3-4-5       1
> 1       4               1-2-3-4-5       1
> 1       5               1-2-3-4-5       1
> 2       2               2-3-5           2
> 2       3               2-3-5           2
> 2       5               2-3-5           2
> 3       1               1-3-4-5         3
> 3       3               1-3-4-5         3
> 3       4               1-3-4-5         3
> 3       5               1-3-4-5         3
> 4       2               2-3-5           2
> 4       3               2-3-5           2
> 4       5               2-3-5           2
>
> Any ideas?
>
> Chris
>

```
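Bert's -egen- suggestion can be sketched roughly as follows. This is only an illustration, not code posted in the thread: it re-enters the example data from the original question, builds openmarkets within each day after sorting by market (so the order of the list is always the same), and then numbers the distinct lists with group(). The variable names follow the example table; the running-concatenation step is an assumption about how one might build the string without reshaping.

```stata
* Sketch only: example data re-entered from the original question
clear
input day market
1 1
1 2
1 3
1 4
1 5
2 2
2 3
2 5
3 1
3 3
3 4
3 5
4 2
4 3
4 5
end

* Sort so that markets are always listed in ascending order within a day
sort day market

* Running concatenation: the first row of each day starts the list,
* each later row appends its own market to the previous row's list
by day: gen openmarkets = string(market) if _n == 1
by day: replace openmarkets = openmarkets[_n-1] + "-" + string(market) if _n > 1

* Copy the complete list (held by the last row) to every row of the day
by day: replace openmarkets = openmarkets[_N]

* One integer id per distinct list of open markets
egen groupid = group(openmarkets)

list day market openmarkets groupid, sepby(day)
```

Days 2 and 4 should end up with the same groupid because both have the list 2-3-5. The ids are arbitrary labels, so their numbering need not match the groupid column in the example table, which makes no difference when they enter a model as i.groupid.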