Statalist The Stata Listserver



Re: st: Trying to do some multiple imputation


From   "Mosi A. Ifatunji" <mifatunji@yahoo.com>
To   "Statalist@HARVARD.EDU" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Trying to do some multiple imputation
Date   Tue, 24 Jan 2006 12:08:53 -0600

Nicola,

Okay, so the syntax seems to have worked with the new commands. That is,
after running it I find myself working with a dataset that represents
multiple imputed datasets, and if I run -tab income- I get a frequency
table that includes imputed values.

Ultimately, I would like to have a dataset ('das1995r') that has one
observation per person, with a single value per variable. In that dataset I
would like a variable named, say, 'imp_income' (for imputed income). How do
I get from where I am now to where I want to end up?

M.
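
One possible route to the dataset described above is a single -uvis- call
that writes the completed values straight into a new variable. This is a
minimal sketch, not something taken from the thread. It assumes that -uvis-
is installed, that "mypath" stands for the folder holding das1995r.dta, and
that the variable created by gen() holds the observed values of income
together with the imputed ones (as the -replace income = income`i'- step in
the quoted code suggests). The output file name das1995r_imp is hypothetical.

```stata
* Sketch only: one imputation stored next to the original variable.
* "mypath" is a placeholder for the folder that holds das1995r.dta.
clear
cd "mypath"
use das1995r

* Impute income from the four predictors; the completed values go
* into the new variable imp_income (assumption: observed + imputed).
uvis regress income black male age2 educate, gen(imp_income) seed(1236951)

* Check that imp_income has no missing values left.
tab imp_income, missing

* Save under a new (hypothetical) name so the original file is untouched.
save das1995r_imp, replace
```

A single imputed variable treats the filled-in values as if they had been
observed, so standard errors from analyses of imp_income will tend to be too
small. The five das`i' files combined through -miset- and -mifit- remain the
route for pooled estimates.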


On 1/24/06 11:29 AM, "Nicola Orsini" <nicola.orsini@ki.se> wrote:

> Mosi,
> 
> 1) define a working directory -help cd-
> 
> clear
> cd "mypath"
> 
> 2) use the command -use- instead of -sysuse- to open your dataset (see
> -help sysuse-)
> 
> use das1995r
> 
> The rest of the lines should be fine. I hope you are using a do-file to
> run the analysis.
> 
> Best,
> Nicola
> 
> Mosi A. Ifatunji wrote:
>> Thanks Nicola,
>> 
>> I have been trying the code that you sent and I am having some trouble with
>> it. Admittedly, I am not familiar with some of the commands that you're
>> using and the -help- command doesn't really provide for much more clarity.
>> So, let me tell you what I am doing (verbatim) and you can tell me if there
>> is an error in my use of the syntax you have so generously provided.
>> 
>> First, the dataset I am using is called das1995r and it is located in the
>> main Stata folder. The variable I would like to generate values for is
>> 'income'. The variables I would like to generate 'income' from are: 'black',
>> 'male', 'age2' and 'educate'.
>> 
>> The commands that I am not familiar with have an * after them. I did not
>> actually put the * in my syntax, but I thought it might help for you to know
>> my level of novice :-). After looking at your sample, I tried the following:
>> 
>> clear
>> 
>> sysuse* das1995r
>> 
>> forv* i = 1(1)5 {
>> preserve*
>> uvis* regress income black male age2 educate, gen(income`i') seed(123695`i')
>> replace income = income`i'
>> save das`i', replace
>> restore*
>> }
>> 
>> [[Here I am told: "already preserved r(621);"]]
>> 
>> forv* i = 1(1)5 {
>> use das`i', clear
>> tab income, miss
>> }
>> 
>> [[Here is where I get the error message that stops the progress. I get a
>> message that says: "file das1.dta not found." I proceed nonetheless.]]
>> 
>> miset* using das
>> 
>> [[Here is where I figure out that I can go no further for real. I get the
>> error message: "file das1.dta not found."]]
>> 
>> I am assuming that I am making an error somewhere, but I just don't know
>> where. As you can see, I skipped the part where you created missing values,
>> because my values are already missing in the 'income' variable. Other than
>> that, the syntax is the same as you provided it, I think.
>> 
>> Thank you for your time and energy,
>> 
>> M.
>> 
>> =====
>> 
>> Hi Rose,
>> 
>> this is an example of multiple imputation.
>> 
>> clear
>> 
>> * findfile cancer.dta
>> 
>> sysuse cancer
>> 
>> set seed 1234
>> 
>> * set some values of the outcome variable died (0/1) to missing at random
>> 
>> gen u = uniform()
>> replace died = . if u > 0.6
>> codebook died
>> 
>> * impute the outcome variable died 5 times using uvis,
>> * each time generating a new variable (died1, died2, ..., died5)
>> 
>> forv i = 1(1)5 {
>>     preserve
>> 
>>     * specify a model to predict missing values
>>     uvis logistic died drug age, gen(died`i') seed(123695`i')
>>     replace died = died`i'
>> 
>>     * save each imputed dataset under a new name (canc1, canc2, ..., canc5)
>>     save canc`i', replace
>>     restore
>> }
>> 
>> * have a look at the imputed variable saved in each new dataset
>> 
>> forv i = 1(1)5 {
>>     use canc`i', clear
>>     tab died, miss
>> }
>> 
>> * declare the imputed datasets before combining results
>> miset using canc
>> 
>> * specify the estimation command to be executed on each imputed dataset
>> * and get the overall estimates
>> mifit, indiv: logistic died drug age
>> 
>> Best,
>> Nicola
>> 
>> 
>> 
>> 
>> 
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/support/faqs/res/findit.html
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
> 




