Statalist



Re: st: re: SSC Activity, November 2009


From   Roy Wada <roywada@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: re: SSC Activity, November 2009
Date   Sun, 6 Dec 2009 16:27:01 -0800

> I am going to add the following because I don't like the idea of
> someone given a license to keep on doing this:

As I said before I am making this post because this has gone on long
enough and I do not want to see Kit giving an official validation or
imprimatur to download numbers which do not look right. This will only
encourage more manipulations. Judging by the collective silence on
this popular topic, I suspect many people also have uneasy feelings
that there is something amiss about these download numbers. Kit has
spent tremendous time and effort developing ssc and it's very
unfortunate that this is being done through ssc.

Disclaimer: these are numbers obtained and crunched by me. If I made a
gross error, contact me to have it fixed or to issue an addendum. Note
that I am only presenting numbers with possible explanations. I do not
say or imply who did the downloading. I do not say or imply, for
example, that Edwin Leuven and Barbara Sianesi manipulated psmatch2
numbers. That would be silly. psmatch2 is included here because it had
a download count of 500 or more at some point since Jan 2005 and is
still in circulation. Richard Williams will be happy to know that mfx2
is included in the list on pure merit. I did not discriminate.

There are several sources for ssc programs. One is the -ssc install-
command. This method is virtually costless, meaning a program can
easily be downloaded thousands of times. The other is the RePEc
websites. This is a manual method. The only people who go through it
are the ones who really want the program.

I have downloaded the ssc statistics (Kit's monthly plus whatshot, I
used logout for this) and matched them to the RePEc download history.
In order to make them comparable, the RePEc numbers were converted
into a moving 3-month average x 10. The ssc statistics prior to Nov
2007 were also converted into a 3-month average.
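
As a rough illustration of the conversion just described, here is a
minimal Python sketch. The original work was done in Excel, so this is
not the actual procedure: the function names are hypothetical, the
numbers are made up, and a trailing window (the current month plus the
two before it) is an assumption, since the post does not say whether
the average was trailing or centered.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int = 3) -> List[Optional[float]]:
    """Trailing moving average; None until a full window is available."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough months yet
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def rescale_repec(monthly_repec: List[float], window: int = 3,
                  factor: float = 10.0) -> List[Optional[float]]:
    """Smooth the RePEc series, then multiply by 10 to match the ssc scale."""
    smoothed = moving_average(monthly_repec, window)
    return [None if m is None else m * factor for m in smoothed]


# Made-up monthly counts, for illustration only.
repec = [30, 36, 42, 39, 45]
ssc = [310, 350, 420, 400, 440]

repec_scaled = rescale_repec(repec)   # moving 3-month average x 10
ssc_smoothed = moving_average(ssc)    # 3-month average of the ssc counts
```

With these invented inputs the two converted series coincide, which is
the "tracking" case; real data would only be expected to move together
approximately.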

The basic premise is that the two series must track each other. The
Excel graphs are uploaded here. They have sufficient resolution; you
only need to zoom in. If they get knocked off (go offline or do not
work), I will upload the raw numbers.

http://profile.imageshack.us/user/roywada/

The graphs are numbered from right to left. Graph 1 is on the RIGHT.

In graph 1 (on the right side), mat2txt and psmatch2 are clearly
"manipulated". The pink repec line and the blue ssc line diverge for
several months and then re-converge. outreg shows up as it should: the
two lines track each other closely (the pink repec line lags the blue
ssc line).
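
The "diverge and then re-converge" pattern was read off the graphs by
eye. One way to check the same thing numerically is sketched below in
Python. This is only an assumption about how such a check could be
done, not the method used here: the function name, the 6-month
baseline, and the 50% tolerance are all hypothetical, and the numbers
are invented.

```python
from statistics import median
from typing import List


def flag_divergence(ssc: List[float], repec_scaled: List[float],
                    baseline_months: int = 6,
                    tolerance: float = 0.5) -> List[int]:
    """Return the month indices where the ssc/RePEc ratio strays from its
    baseline (median of the first `baseline_months` ratios) by more than
    `tolerance`, measured as a relative deviation.

    Assumes both series are already on the same scale and that every
    RePEc value is positive.
    """
    ratios = [s / r for s, r in zip(ssc, repec_scaled)]
    baseline = median(ratios[:baseline_months])
    return [i for i, ratio in enumerate(ratios)
            if abs(ratio / baseline - 1.0) > tolerance]


# Invented series: the ssc line jumps for three months, then comes back.
ssc = [400, 410, 420, 415, 430, 425, 700, 900, 950, 440]
repec_scaled = [390, 400, 410, 420, 425, 430, 420, 410, 415, 435]

flagged = flag_divergence(ssc, repec_scaled)
print(flagged)  # months 6, 7 and 8 are flagged in this made-up example
```

A series that always diverges by a constant proportion would not be
flagged by this check, because the baseline ratio absorbs a stable gap
between the two lines.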

In graph 2 (the one next to the right-most one), both outreg2 and
tabout are moving up and looking as they should. Could I have known
this and manipulated outreg2 precisely? If I did that, I would have had
to download each of the ado files (4 of them) and maybe 3 help files
40 times each (which is about 20%), which would mean 280 manual
downloads per month for several years. Should I have cleared the
cookies after each visit? I don't know. Even then it would have been a
risky operation because the two lines can easily become divergent and
it will show up. As some people have found out, outreg2 is not even
fully documented, and I don't add functionalities that can be easily
added. The best explanation for outreg2 is that it doesn't have to be
manipulated and I have no interest in doing so.
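
For anyone checking the arithmetic behind the 280 figure above, it is
just the file count times the downloads per file. A short Python
check, using only the numbers stated in the paragraph (the variable
names are mine):

```python
# Numbers taken from the paragraph above.
ado_files = 4            # outreg2 ado files
help_files = 3           # "maybe 3" help files
downloads_per_file = 40  # manual downloads per file per month

manual_downloads_per_month = (ado_files + help_files) * downloads_per_file
print(manual_downloads_per_month)  # 280
```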

The problem is estout. The blue ssc line for estout breaks the trend
around Sep 2008, and breaks again around Jan 2009. Note that the blue
line keeps going up while the pink line is trending down. There is no
good reason for this divergence considering that the two lines have
previously tracked each other. estout was updated around Jan 2009 but
the functionality added at that time overlapped other existing
programs and should not have had that much impact. If you believe what
the pink line is saying, the download numbers for estout have
basically moved sideways since the middle of 2007 and are possibly
trending downward.

If anyone is keeping track of the calendar, the massive manipulation
began around Sep 2008 with mat2txt. It then briefly moved on to
psmatch2. mat2txt subsided around the end of spring. By that time
estout was in full swing. I find the timing to be very interesting. I
also find the choice of programs very interesting.

An outside possibility is that the manipulation was done for the
purpose of casting suspicion in that direction but this seems too much
work just for that.

Roy

Graphs 3, 4, 5, and 6 are discussed at the end of this post (mfx2
is in there). They are included here for completeness' sake:

In graph 3, ivreg2 looks as it should. xml_tab has a surge,
courtesy of its introduction at a Stata conference (by me). A similar
introduction must have happened to mfx2, but I don't know who did that
one.

In graph 4, overid looks okay. gllamm and ranktest have a tendency to
diverge but they always diverge, which means the trend between the
pink and the blue lines holds.

In graph 5, whitetst and xtabond2 look okay. xcollapse has a
singular peak but that sometimes happens to a program that has not
been updated in a while.

In graph 6, xtivreg2 looks okay.