
Re: st: Fraud methods in Stata

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: Fraud methods in Stata
Date	Fri, 26 Sep 2008 11:30:44 -0500

I would search for publications of longstanding Stata user Stephen Evans in this area. He has done very serious work on (possibly fraudulent) medical data.

That said, I remain puzzled by the implication that outliers are prima facie evidence of fraud. My own impression is that fraudulent people wish to create datasets that look genuine, so they are unlikely to add or manufacture outliers unless those outliers somehow serve their purpose, but that's just a guess. The main ways I can think of in which fraudulent data can sometimes be identified are spotting agreement that is "too good to be true" and looking at the patterns of first and last digits in the data. Another obviously related issue is plagiarism of published data.
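
As a rough illustration of the last-digit idea, here is a minimal sketch in Python (not Stata) that compares the last digits of integer-valued readings with a uniform distribution using a Pearson chi-square statistic. The function name `last_digit_chi2` and the example readings are hypothetical, invented for illustration; the approach assumes the measurement is recorded finely enough that last digits should be roughly uniform. A large statistic is only a prompt for closer scrutiny, since innocent rounding produces digit preference too.

```python
from collections import Counter

# 5% critical value of the chi-square distribution with 9 degrees of freedom
# (10 possible last digits minus 1).
CHI2_CRIT_DF9_5PCT = 16.919


def last_digit_chi2(values):
    """Pearson chi-square statistic for uniformity of last digits.

    Assumes `values` are integer-valued readings (hypothetical helper).
    """
    digits = [abs(int(v)) % 10 for v in values]
    counts = Counter(digits)
    expected = len(digits) / 10  # equal expected count for each digit 0..9
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


# Hypothetical readings: every value ends in 0 or 5 (strong digit preference).
rounded = [120, 130, 125, 140, 135, 120, 130, 150, 145, 120] * 5

# Hypothetical readings in which each last digit occurs equally often.
spread = list(range(100, 150))

for name, data in [("rounded", rounded), ("spread", spread)]:
    stat = last_digit_chi2(data)
    flag = "inspect further" if stat > CHI2_CRIT_DF9_5PCT else "no digit preference detected"
    print(f"{name}: chi2 = {stat:.1f} ({flag})")
```

In a multi-centre trial the same statistic could be computed per centre, although with only a few patients per centre the expected counts would be too small for the chi-square approximation to be trusted, so pooling or an exact test would be needed.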

Nick
[email protected]

Williams, Rachael wrote:

I am considering methods of detecting fraud in a hypothetical clinical
trial with a large number of centres, but only a few patients per
centre.
In addition, many variables will be binary.

Would Cook's D be appropriate here?
Is it possible to calculate Mahalanobis' distance in Stata in order to
detect (possibly fraudulent) inliers, outliers and near duplicates in a
dataset?

If anyone has any ideas of other ways to detect possible fraud I would
love to hear from you too!
