Statalist



st: Data Corruption?


From   Ed Blackburne <[email protected]>
To   [email protected]
Subject   st: Data Corruption?
Date   Tue, 23 Oct 2007 15:41:46 -0500

Hello all,

I have searched within the archives and cannot find an answer to my problem. Please correct me if I am missing anything obvious.
I have strange results when calculating (basic) stats for a variable. I assume there is some sort of data corruption, but I have never seen this before, so any pointers would be helpful.

Here is a listing of my data (I have added an if condition to keep the example simple).

. li oil_level gdp if id==211

      +-------------------------------+
      | oil_le~l                  gdp |
      |-------------------------------|
1059. |    548.9   15492.840168784467 |
1060. |    575.7   16248.206511636326 |
1061. |    595.8   16370.615114025704 |
1062. |    635.5    17072.56241927471 |
1063. |    667.8    17501.07679210328 |
      |-------------------------------|
1064. |    694.6   17321.478178718633 |
1065. |    719.3    17792.90476504183 |
1066. |    775.8   18647.050437099166 |
1067. |      818   19551.838349719856 |
1068. |    782.6   19207.297273658987 |
      |-------------------------------|
1069. |    765.9   18931.991301627946 |
1070. |    822.4    19861.88173008627 |
1071. |    865.9   20652.321708883446 |
1072. |    888.8   21615.181061740117 |
1073. |      868    22041.69364428876 |
      |-------------------------------|
1074. |    794.1   21606.154441767136 |
1075. |      746   21955.530458150668 |
1076. |    705.5   21313.547357641153 |
1077. |    704.9   22154.364138248093 |
1078. |    723.3   23671.961781615486 |
      |-------------------------------|
1079. |    720.2   24387.448078662386 |
1080. |    749.3    24951.98356186933 |
1081. |    764.8   25520.697093487765 |
1082. |    796.7   26275.362569684687 |
1083. |    795.3   26927.174510472792 |
      |-------------------------------|
1084. |    781.8   27096.979921878978 |
1085. |    765.6    26688.35205330767 |
1086. |    782.2   27342.667747615116 |
1087. |    789.3   27871.529919979886 |
1088. |    809.8   28802.880024505936 |
      |-------------------------------|
1089. |    807.7   29248.767207023822 |
1090. |    836.5   30097.683751824916 |
1091. |      848   31237.956233486548 |
1092. |    863.8   32297.523769177868 |
1093. |    888.9   33443.544095306104 |
      |-------------------------------|
1094. |    897.6   34364.500620614825 |
1095. |    896.1    34162.90121322729 |
1096. |    897.4    34286.24328012388 |
1097. |    912.3    34875.37198079319 |
1098. |    948.7    36098.15411932452 |
      +-------------------------------+

The listing above is correct and matches the raw data.

Here is a summ of the two series:

. summ oil_level gdp if id==211

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   oil_level |        40      781.27     93.45438     548.9      948.7
         gdp |        40     3019.45     771.8664      1673       4237

Note that the summary statistics for gdp are obviously wrong: the mean, min, and max are nowhere near the listed values. Any ideas?
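As a cross-check outside Stata, the listed gdp values can be summarized independently. The following is a minimal Python sketch, not part of the original post: the values are copied from the listing above (rows 1059 to 1098), and the helper name `summarize` is a hypothetical choice for illustration. Every listed value lies between roughly 15,493 and 36,098, so a correct mean, min, and max must fall in that range too.

```python
import statistics

# gdp values for id==211, copied from the listing in this post (rows 1059-1098)
gdp = [
    15492.840168784467, 16248.206511636326, 16370.615114025704,
    17072.56241927471, 17501.07679210328, 17321.478178718633,
    17792.90476504183, 18647.050437099166, 19551.838349719856,
    19207.297273658987, 18931.991301627946, 19861.88173008627,
    20652.321708883446, 21615.181061740117, 22041.69364428876,
    21606.154441767136, 21955.530458150668, 21313.547357641153,
    22154.364138248093, 23671.961781615486, 24387.448078662386,
    24951.98356186933, 25520.697093487765, 26275.362569684687,
    26927.174510472792, 27096.979921878978, 26688.35205330767,
    27342.667747615116, 27871.529919979886, 28802.880024505936,
    29248.767207023822, 30097.683751824916, 31237.956233486548,
    32297.523769177868, 33443.544095306104, 34364.500620614825,
    34162.90121322729, 34286.24328012388, 34875.37198079319,
    36098.15411932452,
]


def summarize(values):
    """Return the statistics that appear in the summ table."""
    return {
        "obs": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    for name, value in summarize(gdp).items():
        print(f"{name:>4}: {value:.4f}")
```

Comparing these figures with the gdp row of the summ table shows how far the reported statistics are from the listed data.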
FYI the data are stored as:

. desc oil_level gdp

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
oil_level       float  %9.0g                  Oil Consumption (Million Tonnes)
gdp             float  %18.0g                 gdp

Stata version: 9.2. The same problem occurs on both Windows and Linux.


Thanks,

Ed




