Hewan Belay <firstname.lastname@example.org> asked why -xthtaylor- would drop a
different variable when -xthtaylor- is run at different times. He also
speculated that this was due to a problem with -xthtaylor-.
There is no problem with the results produced by -xthtaylor-; which variable
is dropped is arbitrary.
Changes in the sort order of the data are responsible for this difference.
Still, it is surprising when the variables dropped change from run to run.
Computers compute in finite precision. Among other things, finite
precision mathematics means that changing the order in which a group of
numbers is summed can cause minor differences in the computed sum. These
minor changes in the computed sum can alter which variable is dropped.
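
A minimal illustration of the summation-order effect, written in Python rather than Stata. The values are chosen to exaggerate the effect, and the helper name `running_sum` is hypothetical. In exact arithmetic both orders sum to 2, but in double precision the two totals differ:

```python
def running_sum(values):
    """Add values left to right in double precision, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same four numbers in two different orders.
order_a = [1e16, 1.0, -1e16, 1.0]
order_b = [1.0, 1.0, 1e16, -1e16]

sum_a = running_sum(order_a)  # the first 1.0 is lost when added to 1e16
sum_b = running_sum(order_b)  # both 1.0s are combined before meeting 1e16

print(sum_a, sum_b, sum_a == sum_b)
```

With real data the discrepancy is usually confined to the last few digits, but that is enough to tip a knife-edge comparison.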
Dropping one of a series of perfectly collinear variables is a classic
knife-edge computation. In -xthtaylor- the decision is especially
complicated because it is made on the basis of transformed variables, not
the original variables. In addition, these transforms are computed after
sorting the data by the panel-id variable, which is not unique. I suspect
that minor differences in the computed transforms are triggering a
difference in the knife-edge decision of which variable is dropped.
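
The mechanism can be sketched with a toy example in Python. This is not the algorithm -xthtaylor- uses: the rule in `choose_dropped`, the statistic, and the data are all hypothetical and chosen only to show how the order of tied rows within a panel can flip a decision that is an exact tie in exact arithmetic.

```python
def panel_total(rows, panel_id):
    """Sum the values of one panel in the row order given (double precision)."""
    total = 0.0
    for pid, value in rows:
        if pid == panel_id:
            total += value
    return total

def choose_dropped(stat_x, stat_z):
    """Toy rule: drop the variable with the smaller statistic; ties drop z."""
    return "x" if stat_x < stat_z else "z"

# Two equally valid results of sorting the same rows by a non-unique panel id:
# only the order of the rows within panel 1 differs.
sorted_a = [(1, 1e16), (1, 1.0), (1, -1e16), (1, 1.0), (2, 3.0)]
sorted_b = [(1, 1.0), (1, 1.0), (1, 1e16), (1, -1e16), (2, 3.0)]

stat_z = 2.0  # in exact arithmetic, panel 1 sums to exactly 2.0: a perfect tie

dropped_a = choose_dropped(panel_total(sorted_a, 1), stat_z)
dropped_b = choose_dropped(panel_total(sorted_b, 1), stat_z)
print(dropped_a, dropped_b)
```

Both orderings are correct sorts on the panel id, yet the toy rule drops a different variable under each.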
We will change -xthtaylor- so that the sorts on the panel-id variable depend
deterministically on the original sort order of the data. This will remove
the variation from run to run, unless some other command re-sorts the data
in between runs.
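
One way to picture such a deterministic sort is to break ties on each row's position in the original data. The Python sketch below is only an illustration of that idea, not the actual change to -xthtaylor-. The function name `sort_by_panel` is hypothetical, and the shuffle stands in for a sort routine that may return tied rows in any order.

```python
import random

def sort_by_panel(rows, seed):
    """Sort (panel_id, value) rows by panel_id, breaking ties by original position."""
    # Tag each row with its position in the original data.
    tagged = [(pid, pos, value) for pos, (pid, value) in enumerate(rows)]
    # Shuffle to mimic a sort routine that may return ties in any order.
    random.Random(seed).shuffle(tagged)
    # The key (panel_id, original position) is unique, so the result is fixed.
    tagged.sort(key=lambda row: (row[0], row[1]))
    return [(pid, value) for pid, _pos, value in tagged]

rows = [(2, 3.0), (1, 1e16), (1, 1.0), (2, 5.0), (1, -1e16), (1, 1.0)]
results = [sort_by_panel(rows, seed) for seed in range(5)]
all_identical = all(r == results[0] for r in results)
print(all_identical)
```

In Stata itself, the `stable` option of -sort- serves a similar purpose: observations with tied values of the sort key keep their original relative order.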
If Hewan could privately send me the data and a do-file that reproduces
the posted example, then I can ensure that the fix addresses the problem.
In the meantime, Hewan should simply exclude one of the time dummies to
ensure that the same variables are used across runs and samples.