Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


st: ambiguity in -if- qualifier


From   "Yu Chen, PhD" <[email protected]>
To   [email protected]
Subject   st: ambiguity in -if- qualifier
Date   Sat, 22 Mar 2014 16:26:10 -0500

I think the meaning of the -if- qualifier is ambiguous in some cases.
Generally, a command is performed on the subset of observations that
meets the -if- condition. However, a command may perform several
tasks, and it is not always clear which subset each task uses. For
example, -generate- seems to evaluate the expression on the full
sample first and then assign the result only to the observations that
meet the -if- condition. -egen-, in contrast, performs the calculation
on the subset that meets the -if- condition, not on the full sample,
and then assigns the result to the new variable on that subset.

For example, see the code below.

sysuse auto
gen mpg2=mpg[_n-1] if foreign==1

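Notice that observation 53, the first foreign car, has a value of 24
for mpg2, which is the mpg of observation 52, a domestic car. This
indicates that the lagged value is taken from the full sample. If the
lag were taken within the foreign subsample only, this value would be
missing. -egen-, by contrast, restricts its calculation to the
subsample that meets the -if- condition.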
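
The contrast can be checked side by side. This is only a minimal
sketch, assuming the auto dataset as shipped with Stata; the variable
names mpg_lag and mpg_mean_foreign are made up for illustration.

```stata
* Sketch: compare -generate- and -egen- under the same -if- qualifier.
* Variable names below are illustrative, not from any official example.
sysuse auto, clear

* -generate-: the subscript [_n-1] points at the previous observation in
* the full dataset, so obs 53 picks up mpg from obs 52 (a domestic car).
generate mpg_lag = mpg[_n-1] if foreign == 1
list make foreign mpg mpg_lag in 52/54
assert mpg_lag[53] == mpg[52]

* -egen-: mean() is computed over the observations that satisfy -if-,
* so the result equals the mean of mpg among foreign cars only.
egen mpg_mean_foreign = mean(mpg) if foreign == 1
summarize mpg if foreign == 1, meanonly
assert abs(mpg_mean_foreign - r(mean)) < 1e-4 if foreign == 1
assert missing(mpg_mean_foreign) if foreign == 0
```

If all the -assert- lines pass silently, -generate- has read across
the subsample boundary while -egen- has stayed inside the subsample.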

There may exist other cases that have similar ambiguities. I would
suggest that Stata have a clear rule to address this issue. If the
rule is already out there, please tell me.
Thank you very much.

Yu Chen