



Re: st: Replace with returned results in svy loop


From   Steven Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Replace with returned results in svy loop
Date   Sun, 29 May 2011 23:56:58 -0400

Kate Schneider:

I would put it more strongly than Richard: substituting the mean for all the data has no justification. 

If you are going to work in Stata, then I suggest you read the error messages and do some digging. It would have helped if you had shown (as requested in the FAQ) exactly what you typed and exactly what Stata returned, so we could see where the error took place.


But in your case, the error message makes the problem easy to find. A type mismatch can occur only in some kind of comparison, and only two kinds of lines in your code contain one. The first is "if `X' == 1 & priceperkilo !=.", but you already know that clause is OK. That leaves "if `r(N)'> 0". What could be wrong? If you look at the -help- for "return", you will see that it never refers to saved results as local macros, and only local macros get single quotes around them (like `X'). If you didn't know that, but did a Google search for "stata r(N)", you would be led to http://www.ats.ucla.edu/stat/stata/faq/returned_results.htm, where returned results are used without quotes. So the answer is: remove the single quotes from around r(N).
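A minimal sketch of that usage, with r(N) referenced directly in an -if- command. It uses the auto dataset shipped with Stata purely for illustration; the variables foreign and price stand in for yours and are not from your data.

```stata
* Illustration only: auto.dta stands in for the poster's survey data
sysuse auto, clear

* -count- leaves the number of matching observations in r(N)
quietly count if foreign == 1 & !missing(price)

* r(N) is used as written, with no macro quotes around it
if r(N) > 0 {
    display "Foreign cars with a nonmissing price: " r(N)
}
```

The same pattern applies to the other results that -return list- shows after an r-class command.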

Although that solves the immediate problem, you are still left with the fact that there is also no justification for substituting the mean for missing values. There are two reasons. First, mean substitution distorts most other sample statistics: for example, the standard deviation, skewness, kurtosis, some quantiles, and correlations with other variables. Second, there are good imputation alternatives, as Richard has already stated.
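To see the distortion for yourself, here is a rough sketch, again on the auto dataset rather than your survey data and ignoring the svy design. The variable rep78 has a few missing values; the names rep78_filled, obs_mean, sd_before and sd_after are made up for this example.

```stata
* Illustration only: unweighted, on auto.dta, not the poster's survey data
sysuse auto, clear

* Standard deviation and mean of the observed (nonmissing) values
quietly summarize rep78
scalar sd_before = r(sd)
scalar obs_mean  = r(mean)

* Copy the variable, then fill the missing values with the observed mean
generate rep78_filled = rep78
replace rep78_filled = scalar(obs_mean) if missing(rep78_filled)

* The filled-in values add nothing to the sum of squares but raise n,
* so the standard deviation shrinks
quietly summarize rep78_filled
scalar sd_after = r(sd)

display "SD of observed values:      " scalar(sd_before)
display "SD after mean substitution: " scalar(sd_after)
```

The scalar() function is used so that Stata does not try to read a scalar name as a variable abbreviation.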


Steve
sjsamuels@gmail.com


On May 29, 2011, at 10:16 PM, Richard Williams wrote:

At 06:53 PM 5/29/2011, Kate Schneider wrote:
> Thank you so much Richard! It totally works! We can't get it to only
> replace the missing values, but I think it is more valid to create a
> new variable with the estimated value anyways. Thanks so much!
> 
> Kate

Richard wrote:

I'll take your word for it. :) I only looked at a small portion of the code and didn't try to figure out the rest. I'd be careful, though, if you aren't getting what you expected. You have to be pretty lucky to accidentally get a result that is better than the one you intended. Also, while I am not crazy about substituting the mean for missing data, I am even less crazy about substituting the mean for all the data, if that is in fact what you have done.

Kate Schneider wrote:
Also, we have amended our code based on your recommendations but are
getting a "type mismatch" error.

Here is what we're trying:

foreach X of varlist $fooditemall {
quietly count if `X'==1 & priceperkilo !=.
       if `r(N)'>0 {
       display "Variable is: " "`X'"
       svy, subpop(`X'): mean priceperkilo
       mat meanpriceperkilo = e(b)
       scalar vmean = meanpriceperkilo[1,1]
       }
quietly count if `X'==1 & priceperkilo ==.
       if `r(N)'>0 {
       replace priceperkilo = varmean if `X'==1 & priceperkilo ==.
       }
}
