Dear Statalisters,

I (again) have a question about rescaling. I have a panel data set in which two variables (the dependent variable and one independent variable) are expressed in thousands of dollars, and the other independent variables are all index numbers or percentages. As I'm taking logs but had several negative numbers, I rescaled the whole dataset by adding a constant to all variables such that the most negative value becomes 1. The problem was that I had to add a very large value (over 23367 thousand) to all variables, which meant that after taking logs all variables were nearly the same, as the index numbers were mostly below 1. As a result, I could not test for the endogeneity of one of the variables because of collinearity problems.
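
To illustrate what happens with the large shift, here is a minimal sketch in Python (used only for illustration, not Stata). The index values and the exact shift constant are made up and are not my actual data; the point is only that once a constant of this size is added, the logged index variable has almost no variation left and so is nearly collinear with the constant term.

```python
import math

# Hypothetical shift, roughly the size described above (an assumption)
SHIFT = 23368.0

# Hypothetical index-type values, mostly below 1 (not the real data)
index_values = [0.12, 0.45, 0.80, 0.95, 1.10]

# Log of the shifted values: every entry is dominated by log(SHIFT)
logged = [math.log(v + SHIFT) for v in index_values]

# Range of the logged variable; it is tiny compared with its level
spread = max(logged) - min(logged)

print("logged values:", logged)
print("range of logged values:", spread)
print("log of the shift alone:", math.log(SHIFT))
```

Any variable whose original range is small relative to the shift ends up looking like this, which would be consistent with the collinearity I ran into.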

What I then did was go back to the original dataset and convert the variables expressed in thousands into millions. This also meant that the most negative value occurred in a different variable, and I only had to add 45 to each variable to get positive values for all variables (in order to log them). When I then ran the regression, I got different results. In particular, the suspected endogenous variable became insignificant, which more or less made the endogeneity test redundant (I did it anyhow, but the F-test for the overall regression became insignificant, which is not surprising, I guess).
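
A small sketch of why the two rescalings might not be equivalent, again in Python with made-up dollar amounts (the values and the shift of 23367 are assumptions for illustration, only the 45 comes from what I did). Without a shift, changing units from thousands to millions only subtracts log(1000) from every logged value. With a shift added before logging, the two versions no longer differ by a constant.

```python
import math

# Hypothetical dollar amounts in thousands (not the real data)
thousands = [-23366.0, -5000.0, 1200.0, 40000.0, 250000.0]
millions = [v / 1000.0 for v in thousands]

SHIFT_THOUSANDS = 23367.0  # hypothetical: makes the minimum above equal 1
SHIFT_MILLIONS = 45.0      # the constant added in the second attempt

# Shift-then-log under each choice of units
log_a = [math.log(v + SHIFT_THOUSANDS) for v in thousands]
log_b = [math.log(v + SHIFT_MILLIONS) for v in millions]

# If the two versions were equivalent up to a constant, these gaps
# would all be (nearly) identical; here they vary a lot
diffs = [a - b for a, b in zip(log_a, log_b)]
gap_spread = max(diffs) - min(diffs)

# For comparison: a pure unit change on positive values, no shift,
# always gives exactly log(1000)
pure = [math.log(v) - math.log(v / 1000.0) for v in thousands if v > 0]

print("gaps with shifts:", diffs)
print("spread of those gaps:", gap_spread)
print("gaps without shifts:", pure)
```

So the shifted-and-logged variable is a genuinely different transformation of the data in the two cases, which may be why the coefficients and significance levels changed.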

My question is whether my second approach to the rescaling is acceptable, or whether it is not a valid thing to do.

Sorry for the rather long mail, but I don't know how to describe it in a shorter manner.

Thanks for your help,

Cordula

