Statalist



st: Re: Constructing a group level variable


From   amardeep@ucla.edu
To   statalist@hsphsun2.harvard.edu
Subject   st: Re: Constructing a group level variable
Date   Mon, 15 Feb 2010 11:04:47 -0800

Thanks to Nick and Martin for their replies. The suggestions I received, and their results, are:

1) Use -collapse-: this was not optimal, as it created a dataset of means rather than counts at the school level (unless I was doing something incorrectly...).
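
For reference, -collapse- can return counts if each response category is first turned into a 0/1 indicator and the indicators are then summed by school. What follows is only a minimal sketch on the small example dataset from this thread; the names timep1-timep3 and timef1-timef3 are illustrative assumptions, and the code replaces the data in memory.

```stata
clear
input byte schid int studid byte timepub08 byte timefin08
  2     6910          2          2
  2     6911          2          2
  2     6912          2          3
  2     6913          3          3
  4     7299          2          2
  4     7300          2          2
  4     7301          3          1
  4     7302          2          2
  4     7303          2          2
  4     7304          2          1
  4     7305          1          .
end

* One 0/1 indicator per response category (names are illustrative).
* A missing response yields 0 in all three indicators.
forvalues i = 1/3 {
    generate byte timep`i' = (timepub08 == `i')
    generate byte timef`i' = (timefin08 == `i')
}

* Summing the indicators within school gives the number of responses
* in each category; timep? and timef? match only the new indicators.
collapse (sum) timep? timef?, by(schid)
list, noobs
```

By default -collapse- computes means, which would explain a dataset of means when no statistic is specified.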

2) Use -contract-:

. contract timepub08 timefin08 schid

. sort schid

. l, sepby(schid)

     +-------------------------------------+
     | schid   timep~08   timef~08   _freq |
     |-------------------------------------|
  1. |     2          2          2      15 |
  2. |     2          .          .       8 |
  3. |     2          2          1       4 |
  4. |     2          2          3       1 |
  5. |     2          3          3       5 |
     |-------------------------------------|
  6. |     4          .          .       4 |
  7. |     4          1          3       1 |
  8. |     4          3          1       8 |

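Again, this was not precisely what I was looking for: it gives the frequency of each combination of the two responses within a school, not the count per category for each question.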
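
One possible way to get per-category counts out of -contract- is to contract a single question at a time and then reshape the frequencies to wide form. The sketch below does this for timepub08 only, on the small example dataset from this thread; it is an illustration under those assumptions, not a tested recipe for the full data, and it replaces the data in memory.

```stata
clear
input byte schid int studid byte timepub08 byte timefin08
  2     6910          2          2
  2     6911          2          2
  2     6912          2          3
  2     6913          3          3
  4     7299          2          2
  4     7300          2          2
  4     7301          3          1
  4     7302          2          2
  4     7303          2          2
  4     7304          2          1
  4     7305          1          .
end

* Frequency of each (school, response) pair for one question,
* leaving out missing responses.
contract schid timepub08 if !missing(timepub08)

* Spread the response categories into columns _freq1, _freq2, _freq3.
reshape wide _freq, i(schid) j(timepub08)

* A category that nobody in a school chose comes out as missing;
* treat it as a count of zero.
foreach v of varlist _freq* {
    replace `v' = 0 if missing(`v')
}
list, noobs
```

A category that no student in any school chose would not get a column at all, so the set of _freq variables depends on the data.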

3) Use -reshape-:

reshape wide time*, i(schid) j(studid)

forv i=1/3{
	egen byte timep`i' = anycount(timepub0*), values(`i')
	egen byte timef`i' = anycount(timefin0*), values(`i')
}

drop timepub0* timefin0*
order schid timep* timef*

l, noo

Note: I had to make minor changes to the code (to correct the variable names).
This worked like a charm, although repeating it on my large dataset will take quite a bit of time :-(
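
If the wide reshape is too slow, the same counts can in principle be built while the data stay in long form, using -egen- with total() and by(). The sketch below attaches the school-level counts to every student row of the small example dataset from this thread; the variable names timep1-timep3 and timef1-timef3 are illustrative assumptions.

```stata
clear
input byte schid int studid byte timepub08 byte timefin08
  2     6910          2          2
  2     6911          2          2
  2     6912          2          3
  2     6913          3          3
  4     7299          2          2
  4     7300          2          2
  4     7301          3          1
  4     7302          2          2
  4     7303          2          2
  4     7304          2          1
  4     7305          1          .
end

* total() sums the 0/1 expression within each school, so every student
* row carries the number of responses in category `i' for that school.
forvalues i = 1/3 {
    egen int timep`i' = total(timepub08 == `i'), by(schid)
    egen int timef`i' = total(timefin08 == `i'), by(schid)
}

list, sepby(schid) noobs
```

Because no reshape is involved, the number of variables stays small however many students there are; keeping one row per school afterwards would give a school-level dataset.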

Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                     3530   ->      48
Number of variables                   4   ->    7061
j variable (3530 values)         studid   ->   (dropped)
xij variables:
timepub08 -> timepub083779 timepub083780 ... timepub087567
timefin08 -> timefin083779 timefin083780 ... timefin087567
-----------------------------------------------------------------------------

.
. forv i=1/3{
  2.         egen byte timep`i' = anycount(timepub0*), values(`i')
  3.         egen byte timef`i' = anycount(timefin0*), values(`i')
  4. }

.
. drop timepub0* timefin0*

. order schid timep* timef*

.
. l, noo

  +-------------------------------------------------------------+
  | schid   timep1   timep2   timep3   timef1   timef2   timef3 |
  |-------------------------------------------------------------|
  |     2        0       20        5        4       15        6 |
  |     4       19       70       17       37       60        3 |
  |     6       19       65        4       43       42        3 |
  |     7       10       60       20       35       46        7 |
  |     8       24       79        7       47       59        6 |
  |-------------------------------------------------------------|
  |    10       15       61       15       35       52        3 |
  |    11        4       38        1       10       30        2 |
  |    12        0       35        5       14       26        0 |
  |    16       30       26        2       38       16        2 |
  |    18        5       53       19       31       40        3 |
  |-------------------------------------------------------------|
  |    20       17       53        6       31       41        3 |
  |    27        2       35        8       16       26        2 |
  |    28        8       59       17       19       55        9 |
  |    32        4       42       14       27       27        5 |
  |    33        0       23       11        6       23        4 |
  |-------------------------------------------------------------|
  |    34        7       60        1       14       51        2 |
  |    36       20       33        2       29       22        3 |
  |    38        8       68       18       59       32        2 |
  |    40       10       18        1       20       10        0 |
  |    42       44       95       15       57       79       13 |
  |-------------------------------------------------------------|

Many thanks!


***************************************************************************
This is "so not elegant" :-(


*************
clear*

input byte schid   studid  byte timep08  byte timef08
  2     6910          2          2
  2     6911          2          2
  2     6912          2          3
  2     6913          3          3
  4     7299          2          2
  4     7300          2          2
  4     7301          3          1
  4     7302          2          2
  4     7303          2          2
  4     7304          2          1
  4     7305          1          .

end

reshape wide time*, i(schid) j(studid)

forv i=1/3{
	egen byte timep`i' = anycount(timep0*), values(`i')
	egen byte timef`i' = anycount(timef0*), values(`i')
}

drop timep0* timef0*
order schid timep* timef*

l, noo
*************



HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of
amardeep@ucla.edu
Sent: Wednesday, 10 February 2010 17:25
To: statalist@hsphsun2.harvard.edu
Cc: amardeep@ucla.edu
Subject: st: Constructing a group level variable

Hi all,

I have a dataset that consists of students (studid) in 49 schools
(schid) responding to a survey. They were asked their impressions of the
curriculum ("do you believe time devoted to subject xxx was ....") and
all responses were categorical (with 1 denoting 'not enough', 2 denoting
'just right', and 3 being 'too much'). A slice of the data is:

list    schid studid timepub08 timefin08 in 30/40

     +--------------------------------------+
     | schid   studid   timep~08   timef~08 |
     |--------------------------------------|
 30. |     2     6910          2          2 |
 31. |     2     6911          2          2 |
 32. |     2     6912          2          3 |
 33. |     2     6913          3          3 |
 34. |     4     7299          2          2 |
     |--------------------------------------|
 35. |     4     7300          2          2 |
 36. |     4     7301          3          1 |
 37. |     4     7302          2          2 |
 38. |     4     7303          2          2 |
 39. |     4     7304          2          1 |
     |--------------------------------------|
 40. |     4     7305          1          . |
     +--------------------------------------+

Question: Is there a way to generate a dataset (or collapse this one) to get
school-level variables? I am interested in school-level variables that
capture the number of responses in each category (1 'not enough', 2
'just right', and 3 'too much') for each question (timepub08, timefin08).

Many thanks for the advice.


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




