Statalist



st: RE: How to use Postfile without using a new Stata dataset


From   tiago.pereira@incor.usp.br
To   statalist@hsphsun2.harvard.edu
Subject   st: RE: How to use Postfile without using a new Stata dataset
Date   Wed, 3 Oct 2007 18:41:04 -0300 (BRT)

Thank you so much for your quick reply, Nick.

> First off, your title is contradictory. The
> whole point of -postfile- is to create a new
> dataset.

Sorry, Nick, I will get there, sooner or later. I am still learning Stata.

> I don't have a clear sense of your aim, but
> it's not evident that you need alternatives
> to -postfile-.

Yes, I do need an alternative. Here is why.

My objective is to use an approach that calculates some statistics and
stores them in the current dataset.

For example, let's assume that I have the following dataset:

a b c study
26 85 41 1
12 56 23 2
45 85 96 3
45 86 91 4

My objective is to obtain some statistic from -genhwi-, say, the P value
from the exact approach. My dataset should then look like this:

a b c study pvalue
26 85 41 1 0.056
12 56 23 2 0.862
45 85 96 3 0.996
45 86 91 4 0.005

However, to run -genhwi- one must type the counts of each variable for
every observation (the same is true of several other programs I need to
run) and then fill in the dataset manually. Since I am not yet fluent in
Stata, I am aware of only three approaches to obtain what I want:

First approach - Using postfile

-----------------BEGIN------------------------
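* Open postfile handle HWE, writing variable pvalue to nickexample.dta
postfile HWE pvalue using nickexample, replace
* Number of nonmissing values of a, used as the loop limit
quietly summarize a
local number = r(N)
forvalues i = 1/`number' {
    * With one observation selected, r(mean) is that observation's value
    quietly summarize a if _n==`i'
    local a = r(mean)
    quietly summarize b if _n==`i'
    local b = r(mean)
    quietly summarize c if _n==`i'
    local c = r(mean)
    genhwi `a' `b' `c'
    post HWE (r(p_exact))
}
postclose HWE
* Replace the data in memory with the posted results
clear
use nickexample
describe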
-----------------END---------------------------

Second approach - Using tempname

-----------------BEGIN------------------------
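tempname M
quietly {
    * Number of nonmissing values of a, used as the loop limit
    summarize a
    local number = r(N)
    forvalues i = 1/`number' {
        * With one observation selected, r(mean) is that observation's value
        summarize a if _n==`i'
        local a = r(mean)
        summarize b if _n==`i'
        local b = r(mean)
        summarize c if _n==`i'
        local c = r(mean)
        genhwi `a' `b' `c'
        * Append this P value as a new row of the column vector `M'
        matrix `M' = nullmat(`M') \ (r(p_exact))
    }
}
* Copy the vector into the dataset; svmat names the new variable pvalue1
svmat double `M' , name(pvalue)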
-----------------END---------------------------


Third approach - replacing observations

-----------------BEGIN------------------------
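quietly {
    * Start with an empty variable and fill it one observation at a time
    generate pvalue = .
    * Number of nonmissing values of a, used as the loop limit
    summarize a
    local number = r(N)
    forvalues i = 1/`number' {
        * With one observation selected, r(mean) is that observation's value
        summarize a if _n==`i'
        local a = r(mean)
        summarize b if _n==`i'
        local b = r(mean)
        summarize c if _n==`i'
        local c = r(mean)
        genhwi `a' `b' `c'
        replace pvalue = r(p_exact) in `i'
    }
}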
-----------------END---------------------------

The first approach creates a new dataset. That's a problem for me; I do
not want that. The second alternative works only for up to 11,000
observations (I work with more than 100,000 observations), and the third
one, although extremely straightforward in principle, is very
time-consuming.
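
One likely reason the third approach is slow: each -summarize ... if
_n==`i'- checks the condition against every observation, so the loop makes
about three full passes over the data per observation. Below is a minimal
sketch of the same loop using explicit subscripting instead. It assumes, as
in the code above, that -genhwi- is installed and leaves the exact P value
in r(p_exact). It has not been timed on the 100,000-observation data.

```stata
* Sketch only, not timed. Assumes variables a, b, c are in memory and that
* -genhwi- is installed and returns r(p_exact), as in the code above.
generate double pvalue = .
forvalues i = 1/`=_N' {
    * a[`i'] reads observation i directly, with no pass over the data
    quietly genhwi `=a[`i']' `=b[`i']' `=c[`i']'
    quietly replace pvalue = r(p_exact) in `i'
}
```

The loop runs over all _N observations, so it assumes a, b and c have no
missing values. It also assumes pvalue does not already exist; otherwise
-generate- stops with an error.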

Thanks for any comments.

Tiago

