
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



Re: st: Collecting Statistics of Averages of Variables


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Collecting Statistics of Averages of Variables
Date   Sat, 29 Sep 2012 02:40:47 +0100

-statsby- can help there too. Do look at the -subsets- option.

Nick
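
One way to write the loop asked about in the quoted message below is to enumerate subsets with a bitmask and collect the results with -postfile-. This is only a minimal sketch and not the -statsby- approach suggested above; the simulated variables v1-v4, the result names combo, mean and sd, and the minimum subset size of 2 are all assumptions made for illustration.

```stata
* Simulated data so the sketch is self-contained (assumption: 4 variables)
clear
set obs 100
set seed 12345
forvalues i = 1/4 {
    generate double v`i' = rnormal()
}

local vars v1 v2 v3 v4
local k : word count `vars'
local nsub = 2^`k' - 1            // number of non-empty subsets

tempname h
tempfile results
postfile `h' str80 combo double(mean sd) using `results'

forvalues m = 1/`nsub' {
    * Bit j of m decides whether the j-th variable is in this subset
    local subset
    local name
    forvalues j = 1/`k' {
        if mod(floor(`m'/2^(`j'-1)), 2) == 1 {
            local v : word `j' of `vars'
            local subset `subset' `v'
            local name `name'`v'
        }
    }
    * Skip single-variable subsets
    local n : word count `subset'
    if `n' < 2 {
        continue
    }
    * Temporary combination variable, summarized and then dropped
    tempvar rm
    egen double `rm' = rowmean(`subset')
    quietly summarize `rm'
    post `h' ("`name'") (r(mean)) (r(sd))
    drop `rm'
}
postclose `h'

* One row per combination: name, mean and SD of the row mean
use `results', clear
assert _N == 11                   // 2^4 - 1 - 4 subsets of size 2 or more
list, clean
```

The same pattern works for any short variable list, but it does not scale to the 70 variables mentioned in the original post: there are 2^70 - 71 subsets of size 2 or more, roughly 1.18 x 10^21, so the subset sizes or the variable list would have to be restricted first.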

On Fri, Sep 28, 2012 at 9:53 PM, robert hartman <rohartman@gmail.com> wrote:
> Thanks, Nick. This looks helpful. Where I'm still a bit hung up is the
> coding (I assume some kind of loop) to put together all the
> combinations, including combinations of varying sizes (e.g.,
> 3-variable and 2-variable combinations).
>
> I assume there's a fairly straightforward way to do this w/ loops, but
> it's not jumping out at me.
>
> Thanks,
> Rob
>
> (BTW, failed to mention, I'm on Stata 11.2)
>
> On Fri, Sep 28, 2012 at 2:24 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>> Check out -statsby-.
>>
>> Nick
>>
>> On 28 Sep 2012, at 18:55, robert hartman <rohartman@gmail.com> wrote:
>>
>>> Hi Listers,
>>> I'm struggling with a couple of programming challenges. I will start
>>> with this one. Any input on efficient ways to code this would be most
>>> appreciated.
>>>
>>> I want to take the mean (or some other "egen"/"collapse"-type summary
>>> statistic) from all possible combinations (subsets of various
>>> sizes) of a set of variables.
>>>
>>> For example, if I have v1, v2, v3, and v4, I would want the mean of
>>> each: v1v2, v1v3, v1v4, v1v2v3, v1v2v4, v1v3v4...v3v2...v3v2v4, etc.
>>>
>>> By v1v2v3, I mean gen v1v2v3=(v1+v2+v3)/3; (assuming the summary
>>> statistic of interest is the mean)
>>> By v1v2v3v4, I mean gen v1v2v3v4=(v1+v2+v3+v4)/4.
>>>
>>> etc.
>>>
>>> I don't need to actually create permanent variables, simply
>>> 1. create a temporary "combination variable" (e.g., v1v2v4 =
>>> rowmean(v1 v2 v4)) for each possible combination,
>>> 2. collect summary statistics of interest from "combination variables."
>>> 3. spit out a file that conveys this information intelligibly
>>>
>>> For example, I may want to write out a file that gives (a) the name of
>>> the combination (e.g., v1v2v4--the column representing the rowmean of
>>> those three variables) and (b) the mean and standard deviation of that
>>> v1v2v4 variable.
>>>
>>> In my particular case, I have about a 70 variable space from which to
>>> create all these subset combinations.
>>>
>>> Any starting points or coding ideas would be helpful.
>>>
>>> Thanks,
>>> Rob Hartman
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP