Statalist


st: Standardizing values


From   JIBONAYAN RAYCHAUDHURI <jibonayanrc@yahoo.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: Standardizing values
Date   Mon, 24 Aug 2009 11:39:10 -0700 (PDT)

Hi Stata Users,

I am trying to calculate total factor productivity (TFP) for a panel of firms, which are classified by industry. I have a measure of TFP (tfplevpet_imp, which contains imputed values) that I am trying to standardize at the 2-digit industrial classification level (nic2, which ranges from 15 to 35 with some gaps). I am using the following code to do so:

#delimit ;
* Placeholders for the industry-level mean and SD;
gen logtfplevpet_mean_imp=.;
gen logtfplevpet_sd_imp=.;
gsort +nic2 +newyear;
* Mean and SD of log TFP within each 2-digit industry;
foreach num of numlist 15 17 19 21 23 24 25 26 27 29 30
31 32 35 {;
by nic2 : egen logtfplevpet_mean_imp`num' = mean(logtfplevpet_imp) if nic2==`num';
by nic2 : egen logtfplevpet_sd_imp`num' = sd(logtfplevpet_imp) if nic2==`num';
};
* Copy the industry-specific values into the common variables;
foreach num of numlist 15 17 19 21 23 24 25 26 27 29 30
31 32 35 {;
replace logtfplevpet_mean_imp=logtfplevpet_mean_imp`num' if nic2==`num';
replace logtfplevpet_sd_imp=logtfplevpet_sd_imp`num' if nic2==`num';
};

gsort +compname +newyear;
* Standardized log TFP (the variable discussed below);
gen logtfplevpet_stand_imp= logtfplevpet_imp-logtfplevpet_mean_imp/logtfplevpet_sd_imp;

The correlation between the standardized and the non-standardized values is very low, about 0.25, and the mean of the standardized measure is 0.54. This measure of TFP comes from a semi-parametric estimation technique. For another measure of TFP, obtained as the OLS residual from a simple regression, the exact same code gives a correlation between standardized and non-standardized values of 0.99. Also, if instead of standardizing at the 2-digit nic level I standardize over the entire sample, i.e., using the overall mean and variance of all firms in all industries, the TFP measure shows a mean of nearly 0 and an sd of 1, but its correlation with the non-standardized measure is still low, at 0.71. As a side note, when I impute values I use only those nic observations that are used in the standardization. I am very puzzled as to why the correlation between standardized and non-standardized values is so low, when it should always be close to 1. Any comments or suggestions will be highly appreciated.
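
For comparison, here is a minimal sketch of the calculation as described, assuming the goal is the usual z-score (value minus industry mean, divided by industry SD). It runs on made-up data: only nic2 and logtfplevpet_imp are names from the code above, while the toy industries, the sample size, and the variables grp_mean, grp_sd, z_paren, z_noparen and z_all are hypothetical. In Stata, division binds more tightly than subtraction, so x - m/s is evaluated as x - (m/s). Whether that accounts for the low correlations reported here is only a guess, but it is easy to check by comparing the two versions side by side.

```stata
* Toy data: three made-up industries with different means and spreads
clear
set seed 12345
set obs 300
gen nic2 = 15 + 2*mod(_n, 3)
gen logtfplevpet_imp = rnormal(nic2/10, 1 + nic2/20)

* Industry mean and SD in one step each; no loop over industry codes
bysort nic2: egen grp_mean = mean(logtfplevpet_imp)
bysort nic2: egen grp_sd   = sd(logtfplevpet_imp)

* With parentheses: (x - mean)/sd. Without: x - (mean/sd)
gen z_paren   = (logtfplevpet_imp - grp_mean) / grp_sd
gen z_noparen = logtfplevpet_imp - grp_mean / grp_sd

* Whole-sample standardization via egen's std() function
egen z_all = std(logtfplevpet_imp)

* z_paren should have mean 0 and SD 1 within every industry
tabstat z_paren z_noparen, by(nic2) statistics(mean sd)

* z_all is a linear rescaling of the raw variable, so this correlation is 1
correlate logtfplevpet_imp z_paren z_noparen z_all
```

Two things to look for in the output: a correctly standardized variable has mean 0 and SD 1 within each industry, and a whole-sample z-score is a linear transformation of the original, so its correlation with the original is 1 up to rounding. A within-industry z-score, by contrast, need not correlate perfectly with the raw values when industries differ in mean or spread.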

Jibonayan






