Statalist



st: data management question


From   Nikhil Jha <n.jha.utd@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: data management question
Date   Wed, 25 Mar 2009 11:16:20 -0500

Dear Stata users,

I have a question related to data management. I have a fairly large
data set in the following format:

id  asc1 asc2 asc3........
___________________________________
1    1    0    0
1    0    1    0
---------------------------------
2    0    1    0
2    1    0    0
2    0    0    1
----------------------------------
:
:
where id is the identifier and asc1, asc2, etc. are associations
related to specific ids.

I would like to put it in this format eventually.

id  asc1 asc2 asc3........
___________________________________
1    1    1    0
2    1    1    1
:
:
My plan was to use reshape wide, for which I first needed the data to look like this:

id  asc1 asc2 asc3........
___________________________________
1    1    1    0
1    1    1    0
---------------------------------
2    1    1    1
2    1    1    1
2    1    1    1
----------------------------------
:
:
That is, if a particular id is ever associated with a given asc, that
column is 1 for all occurrences of that id.

This could probably be done with.....
bysort id : g byte assc1 =  sum(asc1)
or
collapse (sum) asc1-asc2138 , by (id)


But my problem is that there are 2138 asc (i.e. last var is asc2138)
[and not enough memory (see below) for collapse], so I want to
automate this. So I tried to do a loop like:

egen same = group(id)

forvalues i =1/_N{
     local j = 1
     while `j'=same{
     g ascc`j' =1
    continue
    local j = `j'+1
    }
}

But this does not work: Stata 10 reports invalid syntax. Any
pointers (either for fixing this loop or for the original problem)
would be greatly appreciated.
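
For the original problem (one row per id, with each asc column set to 1 if it is 1 in any row of that id), one possible sketch loops over the variables rather than over observations, which avoids collapse. It assumes that the variables are stored contiguously in the order asc1 through asc2138, so that the varlist range asc1-asc2138 picks up all of them, and that each is coded 0/1. The temporary variable is only a scratch name for illustration. This is an untested outline under those assumptions, not a definitive solution.

```stata
* Sketch only: assumes asc1-asc2138 are contiguous in the dataset
* and coded 0/1.
foreach v of varlist asc1-asc2138 {
    tempvar m
    * within-id maximum: 1 if any row of this id has `v' == 1
    egen byte `m' = max(`v'), by(id)
    replace `v' = `m'
    drop `m'
}

* every row of an id is now identical, so keep the first one
bysort id: keep if _n == 1
```

Because only one extra byte variable exists at any time, the memory overhead stays small, at the cost of one pass over the data per variable.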

Thanks,
Nikhil


query mem

Current memory allocation

                   current                                 memory usage
   settable          value     description                 (1M = 1024k)
   --------------------------------------------------------------------
   set maxvar         5000     max. variables allowed           1.909M
   set memory          745M    max. data space                745.000M
   set matsize         400     max. RHS vars in models          1.254M
                                                           -----------
                                                              748.163M
