Statalist The Stata Listserver



Re: st: Trying to do some multiple imputation


From   Nicola Orsini <[email protected]>
To   [email protected]
Subject   Re: st: Trying to do some multiple imputation
Date   Tue, 24 Jan 2006 22:01:14 +0100

Mosi,

I recommend reading the help file (help uvis) and Royston's paper on univariate imputation sampling.

The answer to your question is in the help file:

uvis (univariate imputation sampling) imputes missing values in the single variable yvar based on multiple regression on xvarlist.
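
To make the idea concrete, here is a rough sketch of regression-based imputation using only built-in commands. It is not the algorithm uvis itself runs (uvis is more careful, for example about the uncertainty in the regression coefficients); it only illustrates that the filled-in values come from a regression on the other variables in the same dataset plus a random draw. The names xb_sketch and imp_sketch are hypothetical, and das1995r is assumed to be in memory.

```stata
* Simplified illustration only: not what uvis does internally
regress income black male age2 educate

* linear prediction for every observation with complete predictors
predict double xb_sketch, xb

* keep observed incomes; fill the missing ones with prediction + noise
gen double imp_sketch = income
replace imp_sketch = xb_sketch + e(rmse)*invnormal(uniform()) if missing(income)

* observations with a missing predictor remain missing
count if missing(imp_sketch)
```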

Nicola

Mosi A. Ifatunji wrote:

Oh okay,

Are the new values generated from multiple datasets or just from predicting
the missing values from independent variables in the old dataset?

M.


On 1/24/06 2:25 PM, "Nicola Orsini" <[email protected]> wrote:


Mosi,

if you want to have just one imputed variable in the original dataset
it's even easier than what I suggested earlier.

Just type one line

uvis regress income black male age2 educate, gen(imp_income)
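
A quick way to check the result, as a sketch: it assumes that imp_income holds the observed incomes together with the imputed ones (which is how the multiple-imputation example quoted further down uses the generated variable) and adds the seed() option so the run can be reproduced. The seed value is arbitrary.

```stata
* same call, with a seed so the imputed values can be reproduced
uvis regress income black male age2 educate, gen(imp_income) seed(12345)

* how many values were missing before, and how many remain
count if missing(income)
count if missing(imp_income)

* compare the distributions of the original and the imputed variable
summarize income imp_income
```

Any observation that is still missing in imp_income most likely has a missing value in one of the predictors.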

Nicola

Mosi A. Ifatunji wrote:

Nicola,

Okay, so the syntax seems to have worked with the new commands. That is,
after running the syntax I find myself working with a dataset that is
representative of multiple datasets and if I run -tab income- I get a
crosstab with imputed values.

Ultimately, I would like to have a dataset ('das1995r') that has one
observation per variable per person in the dataset. I would like to have a
variable in such a dataset with a variable named, say 'imp_income' (for
imputed income). How do I get from where I am now to where I want to end up?

M.


On 1/24/06 11:29 AM, "Nicola Orsini" <[email protected]> wrote:



Mosi,

1) define a working directory -help cd-

clear
cd "mypath"

2) use the command -use- instead of -sysuse- to open your dataset (help sysuse)

use das1995r

The rest of the lines should be fine. I hope you are using a do-file to
run the analysis.
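
Put together as a do-file, the first part might look like the sketch below. The file name is hypothetical and "mypath" is a placeholder for your own folder. Two details are worth noting: the options gen() and seed() have to stay on the same line as uvis, and if a run stops between -preserve- and -restore- while you are typing commands interactively, the data remain preserved and the next -preserve- fails with "already preserved". Inside a do-file, preserved data are restored automatically when the do-file ends.

```stata
* impute_income.do  (hypothetical file name)
clear
cd "mypath"
use das1995r

forv i = 1(1)5 {
    preserve
    * options must follow the comma on the same line as uvis
    uvis regress income black male age2 educate, gen(income`i') seed(123695`i')
    replace income = income`i'
    save das`i', replace
    restore
}
```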

Best,
Nicola

Mosi A. Ifatunji wrote:


Thanks Nicola,

I have been trying the code that you sent and I am having some trouble with
it. Admittedly, I am not familiar with some of the commands that you're
using and the -help- command doesn't really provide for much more clarity.
So, let me tell you what I am doing (verbatim) and you can tell me if there
is an error in my use of the syntax you have so generously provided.

First, the dataset I am using is called das1995r and it is located in the
main Stata folder. The variable I would like to generate values for is
'income.' The variables I would like to generate 'income' from are:
'black', 'male', 'age2' and 'educate.'

The commands that I am not familiar with have an * after them. I did not
actually put the * in my syntax, but I thought it might help for you to
know my level of novice :-). After looking at your sample, I tried the
following:

clear

sysuse* das1995r

forv* i = 1(1)5 {
preserve*
uvis* regress income black male age2 educate, gen(income`i')
seed(123695`i')
replace income = income`i'
save das`i', replace
restore*
}

[[Here I am told: "already preserved r(621);"]]

forv* i = 1(1)5 {
use das`i', clear
tab income, miss
}

[[Here is where I get the error message that stops the progress. I get a
message that says: "file das1.dta not found." I proceed nonetheless.]]

miset* using das

[[Here is where I figure out that I can go no further for real. I get the
error message: "file das1.dta not found."]]

I am assuming that I am making an error somewhere, but I just don't know
where. As you can see, I skipped the part where you created missing values,
because my values are already missing in the 'income' variable. Other than
that the syntax is the same as you provided it, I think.

Thank you for your time and energy,

M.

=====

Hi Rose,

this is an example of multiple imputation.

clear

* findfile cancer.dta

sysuse cancer

set seed 1234

* set some values of the outcome variable died (0/1) to missing at random

gen u = uniform()
replace died = . if u > 0.6
codebook died

* impute the outcome variable died 5 times using uvis,
* each time generating a new variable (died1, died2, ..., died5)

forv i = 1(1)5 {
    preserve

    * specify a model to predict missing values
    uvis logistic died drug age, gen(died`i') seed(123695`i')
    replace died = died`i'

    * save each imputed dataset as a new file (canc1, canc2, ..., canc5)
    save canc`i', replace
    restore
}

* have a look at the imputed variables saved in the new datasets

forv i = 1(1)5 {
    use canc`i', clear
    tab died, miss
}

* declare the imputed datasets before combining results
miset using canc

* specify the estimation command to be executed on each imputed dataset
* and get the overall estimates
mifit, indiv: logistic died drug age
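
For intuition about what "overall estimates" means, the sketch below pools the coefficient on drug by hand with Rubin's rules: the pooled estimate is the mean of the five estimates, and its variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. I am assuming this is the kind of combination mifit reports; the scalar names are hypothetical, and -logit- is used so that the coefficient is on the log-odds scale.

```stata
* Manual pooling of the coefficient on drug across canc1-canc5
local m = 5
scalar sumb = 0
scalar sumv = 0
forv i = 1(1)`m' {
    use canc`i', clear
    quietly logit died drug age
    scalar b`i' = _b[drug]
    scalar sumb = scalar(sumb) + _b[drug]
    scalar sumv = scalar(sumv) + _se[drug]^2
}
scalar qbar = scalar(sumb)/`m'
scalar ubar = scalar(sumv)/`m'

* between-imputation variance of the five estimates
scalar bvar = 0
forv i = 1(1)`m' {
    scalar bvar = scalar(bvar) + (scalar(b`i') - scalar(qbar))^2
}
scalar bvar = scalar(bvar)/(`m' - 1)

* total variance: within + (1 + 1/m) * between
scalar tvar = scalar(ubar) + (1 + 1/`m')*scalar(bvar)
display "pooled coefficient on drug = " scalar(qbar)
display "pooled standard error      = " sqrt(scalar(tvar))
```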

Best,
Nicola





*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*


