RE: st: adding observation of means of variables


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: adding observation of means of variables
Date   Thu, 16 Feb 2012 12:34:32 +0000

OK, but there is no need to add the means to the dataset to do that. 
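A minimal sketch of one way to report the means without adding anything to the dataset. The auto dataset shipped with Stata is only a stand-in here for the real data, and the choice of -tabstat- is an assumption about what kind of table is wanted.

```stata
* Sketch only: the auto dataset stands in for the real data
sysuse auto, clear

* Leave the names of all numeric variables in r(varlist)
ds, has(type numeric)

* One row per variable, showing its mean; the data in memory are unchanged
tabstat `r(varlist)', statistics(mean) columns(statistics)
```

Because the table is computed on the fly, the number of observations stays the same and later analyses need no special exclusions.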

Nick 
n.j.cox@durham.ac.uk 

Abhimanyu Arora

Thanks very much for your conscientious advice.

Basically, I had created tables for my paper (which had averages in
the last row). Now that the analysis is complete, I would just like to
make sure the numbers are replicable from A-Z (in Stata itself---I
thought bringing them out as datasets would be ok for my purpose), in
case the referees would like to see where they come from.
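If the goal is a dataset of means that referees can inspect, one possibility is to save the means in a separate file rather than appending them to the data. This is only a sketch: the auto dataset stands in for the real data, and the file name table_means.dta is hypothetical.

```stata
* Sketch only: the auto dataset stands in for the real data
sysuse auto, clear

* Collect the names of the numeric variables
ds, has(type numeric)
local numvars `r(varlist)'

* Build a one-observation dataset of means and save it under a
* hypothetical file name, then bring the original data back
preserve
collapse (mean) `numvars'
save "table_means.dta", replace
restore
```

The original data are restored untouched, so a later -sort- or estimation command cannot pick up a row of summaries by accident.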

On Thu, Feb 16, 2012 at 1:04 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:

> Phil gives accurate advice, and as he said there are other ways to do it.
>
> Here's another:
>
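> * Add one empty observation at the end of the dataset
> set obs `=_N + 1'
>
> * Leave the names of all numeric variables in r(varlist)
> ds, has(type numeric)
>
> * Put each variable's mean into the new last observation
> qui foreach v in `r(varlist)' {
>        su `v', meanonly
>        replace `v' = r(mean) in L
> }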
>
> That said, I think this is a bad idea for working with Stata. No, let me rephrase that: it's a very bad idea. A rule of thumb, blunt though it will seem, is that if you have to ask how to do this you don't yet understand Stata well enough to use it safely.
>
> My advice is not to do this.
>
> It's a spreadsheet practice that matches the way spreadsheets are set up. It's not a good idea for working with statistical software like Stata. The problem is that once those extra observation(s) are added, you _must_ always exclude them from further analyses with the same dataset. Otherwise you just get nonsense results. Add to that the fact that if you -sort- your dataset, or some program or command -sort-s your data as a side-effect (now rare but not impossible), those observations with summaries will typically no longer be at the end of your dataset, so you need to invent extra machinery to keep track of where they are.
>
> Better advice would depend on knowing quite why you want to do this. Keeping means in variables, although there can be redundancy, can be a reasonable idea for some purposes.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Phil Clayton
>
> Not as far as I know, but it's easy to program. Here's one solution:
>
> preserve
> collapse (mean) *
> tempfile means
> save `means'
> restore
> append using `means'
>
> The above assumes that all variables are numeric. If they're not, you could replace:
> collapse (mean) *
> with:
> ds, has(type numeric)
> collapse (mean) `r(varlist)'
>
> On 16/02/2012, at 10:33 PM, Abhimanyu Arora wrote:
>
>> Is there a direct command that appends an observation to the dataset,
>> giving the means of all the numeric variables?
>> Perhaps I am using -findit- not that efficiently, but if I am not
>> mistaken there was one...
