
RE: st: Fraud methods in Stata

From   "Lachenbruch, Peter" <[email protected]>
To   <[email protected]>
Subject   RE: st: Fraud methods in Stata
Date   Fri, 26 Sep 2008 10:08:10 -0700

Stephen Evans has done a great deal of work on this.  You might also
contact Jonas Ranstam, Marc Buyse, or others.  They did a survey on
fraud for the ISCB a few years ago.
Fraud shows up in many forms: data that are outliers (or inliers, i.e.
too good to be true), or data that show a marked digit preference (too
many 0s or 5s, too many even numbers, etc.).  I don't remember them all.
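
As a rough illustration of the digit-preference check, the sketch below
tabulates the terminal digit of a set of integer readings and computes a
Pearson chi-square statistic against a uniform distribution over 0-9.
It is written in Python rather than Stata, the readings are invented,
and the function name last_digit_chi2 is hypothetical.  Uniformity of
the last digit is itself an assumption that will not hold for every
measurement, and with so few readings the chi-square approximation is
crude.

```python
from collections import Counter


def last_digit_chi2(values):
    """Pearson chi-square statistic for uniformity of the terminal digit.

    Assumes integer-valued readings (e.g. blood pressure in mmHg).
    Returns (statistic, counts), where counts[d] is the frequency of digit d.
    """
    tally = Counter(abs(int(v)) % 10 for v in values)
    n = len(values)
    expected = n / 10  # uniform over the ten digits 0..9
    counts = [tally.get(d, 0) for d in range(10)]
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts


# Invented readings with an excess of terminal 0s and 5s.
readings = [120, 125, 130, 135, 140, 120, 130, 110, 115, 125,
            120, 140, 135, 130, 122, 118, 120, 125, 130, 150]

stat, counts = last_digit_chi2(readings)
print("counts for digits 0..9:", counts)
print("chi-square statistic:", round(stat, 2))
# Compare the statistic with a chi-square distribution on 9 degrees of
# freedom (the 5% critical value is about 16.92).
```

A large statistic only says that the terminal digits are not uniform.
Honest rounding by a nurse reading a dial produces the same pattern, so
this is a screening tool, not proof of fraud.
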
However, there is a nice little book, published about 10 years ago, in
which Evans had a chapter where he used the Mahalanobis distance.
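
To make the Mahalanobis idea concrete, here is a minimal sketch that
computes the squared Mahalanobis distance of each observation from the
sample centroid.  It is plain Python rather than Stata, it is not taken
from Evans's chapter, and the data (pairs of systolic and diastolic
readings) and all function names are invented for illustration.  Large
distances flag outliers; distances that are all suspiciously small
would point to inliers.  It assumes continuous variables and a
non-singular covariance matrix, so it is not well suited to a dataset
that is mostly binary.

```python
def mean_vector(rows):
    """Column means of a list of equal-length observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu):
    """Sample covariance matrix (divisor n - 1)."""
    n, p = len(rows), len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    p = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, p):
            f = m[r][c] / m[c][c]
            for k in range(c, p + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, p))
        x[i] = (m[i][p] - tail) / m[i][i]
    return x


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of every row from the centroid."""
    mu = mean_vector(rows)
    cov = covariance(rows, mu)
    out = []
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        z = solve(cov, d)  # z = S^{-1} d without forming the inverse
        out.append(sum(di * zi for di, zi in zip(d, z)))
    return out


# Invented (systolic, diastolic) pairs; the last one breaks the pattern.
patients = [(120, 80), (125, 82), (130, 85), (118, 78), (135, 88), (122, 95)]

d2 = mahalanobis_sq(patients)
for row, dist in zip(patients, d2):
    print(row, round(dist, 3))
```

With only a few patients per centre the covariance matrix is better
estimated from the pooled data than centre by centre; the sketch does
not address that choice.
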
In another context, there was an issue at the FDA in which some lab
technicians were suspected of reusing the same data tray in blood
testing to avoid having to rerun a test that had not worked out
properly.  This would lead to almost identical values in the control
wells, so we looked at the Euclidean distance between plates that were
run close together in time.  We also considered a rank procedure.  I
tried to explain the procedure to the lawyers who were involved; they
decided it was too complicated to explain to a jury.
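
The plate comparison can be sketched along the following lines.  This
is only an illustration of the idea in Python, not the procedure used
in the FDA case: the plate labels, the control-well values, and the
function names are all invented, and the plates are assumed to be
sorted by run time already.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length lists of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def consecutive_distances(plates):
    """Distance between each plate's control wells and the next plate's.

    plates: list of (label, control_values), sorted by run time.
    Returns a list of (label, next_label, distance).
    """
    return [(p[0], q[0], euclidean(p[1], q[1]))
            for p, q in zip(plates, plates[1:])]


# Invented control-well readings for four plates run in sequence.
plates = [
    ("P1", [0.512, 0.498, 1.204, 1.187]),
    ("P2", [0.547, 0.471, 1.158, 1.231]),
    ("P3", [0.548, 0.470, 1.158, 1.230]),  # nearly a copy of P2
    ("P4", [0.489, 0.525, 1.221, 1.169]),
]

pairs = consecutive_distances(plates)
for first, second, dist in pairs:
    print(first, second, round(dist, 4))

# The pair with the smallest distance is the one to look at first.
suspect = min(pairs, key=lambda t: t[2])
print("closest pair:", suspect[0], suspect[1])
```

How small a distance is "too small" has to be judged against the
distances seen between plates that are known to be independent runs.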


Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Maarten buis
Sent: Friday, September 26, 2008 9:58 AM
To: [email protected]
Subject: Re: st: Fraud methods in Stata

One way is to use Benford's Law: type -findit Benford's Law- in Stata,
and use Google to learn more.
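
For readers who want to see what such a check involves, here is a
minimal sketch in Python (not one of the Stata packages that -findit-
would turn up).  Benford's Law says that the leading digit d occurs
with probability log10(1 + 1/d); the sketch compares observed
leading-digit counts with that expectation using a Pearson chi-square
statistic.  The function names are hypothetical and the example data
are simply powers of 2, which are known to follow the law closely.

```python
import math
from collections import Counter


def benford_expected(d):
    """Benford probability that the leading digit equals d (1..9)."""
    return math.log10(1 + 1 / d)


def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.2...e+01'.
    return int(f"{abs(x):.15e}"[0])


def benford_chi2(values):
    """Pearson chi-square statistic against Benford's Law.

    Zeros are dropped because they have no leading digit.
    Returns (statistic, counts), where counts maps digit -> frequency.
    """
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_expected(d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat, counts


# Powers of 2 as stand-in data that should agree with the law.
values = [2 ** k for k in range(1, 101)]

stat, counts = benford_chi2(values)
print({d: counts.get(d, 0) for d in range(1, 10)})
print("chi-square statistic:", round(stat, 3))
# Compare with a chi-square distribution on 8 degrees of freedom
# (the 5% critical value is about 15.51).
```

Benford's Law is only a sensible benchmark for quantities that span
several orders of magnitude.  Many clinical variables (age, blood
pressure, binary outcomes) do not, so a departure from the law in such
data says little about fraud.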


--- "Williams, Rachael" <[email protected]> wrote:

> Dear all,
> I am considering methods of detecting fraud in a hypothetical clinical
> trial with a large number of centres, but only a few patients per
> centre. In addition, many variables will be binary.
> Would Cook's D be appropriate here?
> Is it possible to calculate Mahalanobis' distance in Stata in order to
> detect (possibly fraudulent) inliers, outliers and near duplicates in
> a dataset?
> If anyone has any ideas of other ways to detect possible fraud I would
> love to hear from you too!
> Thanks - Rachael

Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room N515

+31 20 5986715

*   For searches and help try:

