# Re: st: Contract/Collapse Combination

From: Lucas
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Contract/Collapse Combination
Date: Tue, 22 May 2012 07:37:52 -0700

Not really.  I was using X# in the statistical convention sense, not
indicating that those are the true variable names.  I have about 150
variables in the dataset.  I need a 15-way crosstab.  The variables in
the crosstab have informative names, none of which start/end the same.

I provided too much information in my original post, and it is
distracting from the central question, which I now realize is this:

Is there a way to use the contract command and obtain frequencies for
TWO variables rather than just ONE?  A corollary question would be, Is
there a way to use the contract command and obtain the count of 1's on
TWO separate dichotomous variables?
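
One possible way to get two counts per combination, sketched here as an illustration rather than as a feature of contract, is to let collapse sum two 0/1 indicators within each cell. The variable names (x1, x2, x3, entercol, didnotenter, n_enter, n_noenter, n_total) and the toy data are assumptions made up for the example; the real crosstab variables would go in by().

```stata
* Toy data: x1-x3 and entercol are hypothetical stand-ins for the real variables.
clear
set seed 2012
set obs 1000
generate byte x1 = runiform() < 0.5
generate byte x2 = floor(3 * runiform())
generate byte x3 = runiform() < 0.3
generate byte entercol = runiform() < 0.4

* Complement indicator: 1 for those who did not enter, 0 for those who did.
generate byte didnotenter = 1 - entercol

* One row per observed combination, with the two counts side by side.
collapse (sum) n_enter = entercol n_noenter = didnotenter, by(x1 x2 x3)

* Total number of cases in each cell.
generate long n_total = n_enter + n_noenter
list, sepby(x1)
```

Unlike contract with the zero option, collapse returns only the combinations that actually occur. If empty cells are needed, fillin x1 x2 x3 followed by setting the missing counts to zero is one way to add them.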

I realize this is maybe a question for Stata, but I cannot imagine I
am the only person to ever need such a feature--I'm an original
thinker, *maybe*, but not THAT original.

Such a command would allow one to easily reduce large datasets for
processing in all those models for binary data that allow a frequency
weight and/or a binomial option.
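
To make the two routes concrete, here is a minimal sketch on made-up data. It assumes hypothetical variables x1, x2 and entercol, and the count names n, n_enter and n_total are likewise invented for the example. The first route needs only the single count that contract already produces, used as a frequency weight; the second uses a successes-and-trials layout with glm's binomial family.

```stata
* Toy data: x1, x2 and entercol are hypothetical stand-ins for the real variables.
clear
set seed 2012
set obs 1000
generate byte x1 = runiform() < 0.5
generate byte x2 = floor(3 * runiform())
generate byte entercol = runiform() < 0.4

* Route 1: frequency weights. One row per (x1, x2, entercol) combination.
preserve
contract x1 x2 entercol, freq(n)
logit entercol i.x1 i.x2 [fweight = n]
restore

* Route 2: binomial counts. One row per (x1, x2) cell,
* holding the number of 1's and the number of cases.
collapse (sum) n_enter = entercol (count) n_total = entercol, by(x1 x2)
glm n_enter i.x1 i.x2, family(binomial n_total) link(logit)
```

Both routes fit the same logit model to the same information, so they are expected to agree on the coefficient estimates; which one is more convenient depends on what the target command accepts.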

If someone has had this problem of needing two counts from the
contract command and solved it, I'd love to know what you did.  And,
if Dr. Stata, whoever s/he is, has insight on the likelihood of this
feature being added to contract, I'd love to hear it.

Thanks a bunch.
Sam

On Mon, May 21, 2012 at 3:51 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> This sounds to me like
>
> contract x* entercol
>
> Nick
>
> On Mon, May 21, 2012 at 9:45 PM, Lucas <lucaselastic@gmail.com> wrote:
>> So, I am attempting to construct a file containing a list-format 15
>> (or so) -way crosstab, with frequencies of cases for each combination
>> of values.  I have millions of cases, so this crosstab is appropriate.
>>
>> What would be ideal would be the ability to use the contract command
>> but, instead of only indicating the need for one count,  could ask for
>> the sum of two variables.  Assume I have a dichotomous variable (say,
>> "enter college").  Those who enter college are coded 1, those who do
>> not are coded zero.  I could then construct a new variable, named
>> "DidNotEnter", coded 1 for those who do NOT enter college, and zero
>> for those who do.
>>
>> If I could then write something like:
>>
>> collapse x1 x2 x3 x4 ... xj, freq(EnterCol) freq(DidNotEnter)  zero
>>
>> I could get the totals needed.  The plan is to speed processing of a
>> computationally difficult model by substituting a model of counts for a
>> model of individual cases.  To do this I need the total count of each
>> combination and the count meeting the condition (e.g., entering college).
>> My code above would, if possible, produce a file that allowed me to
>> add the two frequencies to get the total.
>>
>> As far as I can tell, this is not possible.  What seems to be required
>> is to run it with only the "Enter College" freq, then somehow break
>> the two cases (EnterCollege=1 vs. EnterCollege=0) (or, alternatively,
>> to keep EC==1 and run it, and then re-run with EC==0) and somehow
>> combine them, an operation that seems to be begging for error in
>> matching.
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
