# st: new expression: how to generate groups based on some characteristics, and assign a sample into the correct group

 From "Yi, Bingsheng" To "'statalist@hsphsun2.harvard.edu'" Subject st: new expression: how to generate groups based on some characteristics, and assign a sample into the correct group Date Fri, 21 Jun 2002 11:57:10 -0400

```Dear Statalister,

I sent my problem before, but I did not express it clearly and therefore
received no reply. This time I have tried to state it more clearly, and I
hope that you can help me with the following problem. My sincerest thanks
to you in advance!

I need to assign each firm to an industry that contains at least 10 firms.
A firm should belong to the industry with the longest industry code (the
most digits), so long as that industry contains at least 10 firms. If an
industry has the code 10, then it should contain all the firms whose
2-digit industry code is 10. ind`i' is the i-digit industry code,
i=1,2,3,4. For example, ind4 is the 4-digit industry code. There is only
one firm in industry 1028 and only one in industry 102, but there are 12
firms in industry 10, so the final industry code for firm 12 will be 10,
and industry 10 should contain all the firms with ind2=10. The industry
code for the first 10 firms is 1041, not 104, 10, or 1, because industry
1041 already contains 10 firms, even though industries 104, 10, and 1 each
contain more than 10 firms. ind3grp contains all the firms with the same
3-digit industry code.

obs  q  ind4  ind3  ind2  ind1  industry  size  ind4grp  ind3grp  ind2grp  ind1grp
1       1041  104   10    1     1041      1     small    ?        ?
2       1041  104   10    1     1041      3     small    ?        ?
3       1041  104   10    1     1041      2     small    ?        ?
4       1041  104   10    1     1041      5     middle   ?        ?
5       1041  104   10    1     1041      4     middle   ?        ?
6       1041  104   10    1     1041      9     large    ?        ?
7       1041  104   10    1     1041      8     middle   ?        ?
8       1041  104   10    1     1041      11    large    ?        ?
9       1041  104   10    1     1041      10    large    ?        ?
10      1041  104   10    1     1041      7     middle   ?        ?
11      1044  104   10    1     104       6     N.A.     ?        ?
12      1028  102   10    1     10        18    N.A.     N.A.     ?
.
.
Subsequently I will subdivide the firms in each industry into small,
middle, and large groups according to firm size. The small group contains
the smallest 30% of firms by size within an industry, the middle group
contains the middle 40% (30% to 70%), and the large group contains the
firms whose size is in the largest 30% in that industry. How can I
subdivide the firms in an industry by size and assign each firm to a group
according to its size? I also need to get the median or mean value of q
for each of these groups within an industry.

Take firm 1 as an example: its ind4 is 1041, and there are already 10 firms
with ind4 equal to 1041, so firm 1 should be in industry 1041. According to
its size, it belongs to the small group in industry 1041. Firm 1's
industry-size adjusted q = firm 1's q - the mean q of the small group in
industry 1041. As for firm 11, I first need to assign it to industry 104,
which contains all the firms with the same 3-digit industry code 104. Then
I need to assign firm 11 to a size group within industry 104 (a firm may
belong to different groups under different industry codes). Finally, I need
to get the mean value of q of the firms in the same group as firm 11.

I hope that you will understand my problems now and help me out, I really