Things to consider when fighting collusion fraud

Editor’s note: An abridged version of this article appeared as a web exclusive in Claims Magazine on Oct 11, 2009.

North American companies have developed sophisticated processes and technology to detect claims fraud.

[Figure: Illustrative claim process flow]

A variety of analytical techniques can be applied to claims data, including automated red flags, predictive modeling, rules-based analysis, and data mining. At the most basic level, companies use decision rules to identify fraud at the claim level. Some companies also process data at the provider level, focusing on the intermediaries who sit between the company and the customer. This helps cut down fraud, but its weakness is that it assesses each service provider's behavior in isolation. A key gap in the arsenal is the capability to monitor the service provider network as a composite and to assess pair-wise or group-wise culpability in fraud. This is the type of fraud we term “collusion.” This article answers the following questions.

  • What is collusion?
  • How rampant is collusion fraud?
  • Why are privacy concerns a barrier to catching collusion fraud?
  • What are some new technology innovations in the fight against collusion?

What is collusion?
Collusion is a tacit agreement among two or more entities in the value chain between the insurer and the customer. The purpose of the agreement is to misrepresent or inflate loss events and thus to defraud the insurer. A typical example of collusion fraud would be a third-party adjuster approving fraudulent soft tissue injury claims submitted by a complicit clinic. As with other types of fraud, collusion hurts the industry in two ways. First, there is the direct charge to the insurer for claims that are not legitimate. Second, the inaccurate or non-existent claims corrupt the data used by underwriters.

How rampant is collusion fraud?
Organizational inertia is one reason that insurers do not invest the time, resources, or budget to explore multi-channel fraud. Companies need to take a long view and build a business case around the amount they are actually losing by not actively monitoring the claim process for this type of fraud. While we are not aware of an industry survey on the size of the problem, we highlight two illustrative case studies. In 2008, the Manhattan District Attorney [1] indicted 11 persons who operated a fraudulent “medical mill” that had bilked more than $6.2 million from insurance companies. Those charged included three medical doctors, a chiropractor, two acupuncturists, ten corporations, several ‘runners,’ and one mastermind behind the entire operation. The challenge in diagnosing such fraud is that when all the service providers are essentially validating each other, it becomes extremely hard to isolate inconsistencies at the level of a single service provider. In another example of collusion fraud, a single adjuster approved non-existent claims with a net value of $2.4 million from a rogue tire dealer over a multi-year period [2]. The point is that if an organization is not performing periodic audits across all service providers, then its claim process is at risk of subversion and could be hemorrhaging millions of dollars in profits. We next discuss the operational and technical challenges in mitigating collusion fraud.

Privacy concerns and sharing data
One of the operational challenges in fighting claim fraud lies in gaining access to, and use of, relevant data from multiple service entities. If third-party service providers are involved, then contractual obligations should be enforced to ensure service data are made available. Service partners often believe that service data cannot be shared with partners because of privacy legislation. This is fundamentally incorrect. Our point of view on this is not a legal opinion, but a clarification of the government stance on customer data privacy and security.

Government guidelines are clear about an organization’s release of customer information to a third-party service provider. Customer confidentiality is enforced through a contractual agreement. Information that would identify customers through their name, address, date of birth, telephone number, social security identifier, or credit card number is not relevant and can be suppressed. The remaining service data can be shared, and we particularly point to subsection (e) in Section 6802 of the Gramm-Leach-Bliley Act [3] as being applicable to companies operating under U.S. law.
“Subsections (a) and (b) of this section shall not prohibit the disclosure of non-public personal information … (3) (A) to protect the confidentiality or security of the financial institution’s records pertaining to the consumer, the service or product, or the transaction therein; (B) to protect against or prevent actual or potential fraud, unauthorized transactions, claims, or other liability; (C) for required institutional risk control, or for resolving customer disputes or inquiries.”
A similar bill passed by the Canadian Parliament sets out privacy preservation guidelines, explicitly stating that use of the relevant data is permissible for purposes such as statistical analysis. Similar guidelines also appear in the seventh principle of the UK Data Protection Act of 1998. Data availability is core to fraud mitigation. It is in the best interests of the industry and the consumer that all parties involved provide transparency into their respective processes.

Technology innovations for fighting collusion fraud
Gathering information across all service provider entities involved in the claim process is a technical challenge and, as noted above, also an organizational one, because it can be difficult to engage multiple parties in the initiative. We emphasize that such data sharing is permissible, has precedent, and ultimately is in the consumer’s interest because it helps the insurance industry keep costs down. Once the data are available, they have to be linked across all the service providers on the claims they serviced.

To illustrate, once the data are captured and processed into a usable form, it should be possible for an analyst to generate a profile on a closed claim that includes the date of the claim, the claimant(s) involved in the incident, the incident report, the name of the clinic(s) that appraised the claimant(s), the details of the treatment plan filed, the date of the filing, the adjuster who approved the treatment plan, the amount of the approval, the days of the treatment, the nature of the treatment, and so on. Such a data structure is known as an analytical data mart and is well within the scope of the technology capabilities of insurers. The decision support system that detects collusion patterns depends on this data mart.
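As a rough illustration of the linking step, the sketch below joins records from several sources into one profile per claim. It is a minimal example under stated assumptions, not a description of any insurer's data mart: the function name `build_claim_profiles`, the field names (`claim_id`, `clinic`, `adjuster`, `approved_amount`, and so on), and the idea that every source shares a common claim identifier are all hypothetical.

```python
from collections import defaultdict


def build_claim_profiles(insurer_rows, clinic_rows, adjuster_rows):
    """Link per-source records on a shared claim_id (assumed to exist in
    every source) into one profile dictionary per claim."""
    profiles = defaultdict(
        lambda: {"clinics": [], "adjusters": [], "approved_amount": 0.0}
    )
    # Insurer records carry the claim date and the claimants.
    for row in insurer_rows:
        profile = profiles[row["claim_id"]]
        profile["claim_date"] = row["claim_date"]
        profile["claimants"] = list(row["claimants"])
    # Clinic records carry the treatment plan details.
    for row in clinic_rows:
        profile = profiles[row["claim_id"]]
        profile["clinics"].append(row["clinic"])
        profile["treatment_days"] = row["treatment_days"]
    # Adjuster records carry the approvals; amounts are summed per claim.
    for row in adjuster_rows:
        profile = profiles[row["claim_id"]]
        profile["adjusters"].append(row["adjuster"])
        profile["approved_amount"] += row["approved_amount"]
    return dict(profiles)


# Hypothetical sample data for a single closed claim.
insurer_rows = [{"claim_id": "C1", "claim_date": "2008-01-15", "claimants": ["P1"]}]
clinic_rows = [{"claim_id": "C1", "clinic": "Clinic A", "treatment_days": 30}]
adjuster_rows = [{"claim_id": "C1", "adjuster": "Adjuster 7", "approved_amount": 1200.0}]
profiles = build_claim_profiles(insurer_rows, clinic_rows, adjuster_rows)
```

In practice the customer-identifying fields would be suppressed before sharing, as discussed in the privacy section, and the join key would be whatever claim reference the contracting parties agree on.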

A recently disclosed invention [6] describes a system and process that looks at the ecosystem of all service providers in aggregate and isolates suspicious patterns of aberrant transactions among specific combinations of service providers. The mathematical framework for the ecosystem represents the entities and their interactions as a graph. The system parameters comprise the attributes of the individual entities, their interactions, and the measures establishing the fraud propensity of every sub-graph in the network representation.

The system for assessing collusion risk comprises a data layer, an analytical layer and a reporting layer. The data layer is a structure to store and manage the transactions conducted by the ecosystem of provider entities. The analytical layer conducts the fraud propensity analysis at the singular entity’s level as well as the collusion propensity analysis at the entity ensemble level. The reporting layer is a means to deliver the discoveries of the analysis in a prioritized fashion to the end user.

The data layer compiles the necessary data on the entities, the transactions, and the entity feature attributes that represent the ecosystem. The data elements are arranged as matrices for subsequent analysis.
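One way to picture the matrix form is an entity-by-entity interaction matrix that counts how often two providers appear on the same transaction. The sketch below is an assumption about what such a matrix could look like; the disclosure [6] is not quoted here, and the function name `build_interaction_matrix` and the sample entity names are hypothetical.

```python
def build_interaction_matrix(transactions):
    """Build a symmetric count matrix from a list of (entity_a, entity_b)
    pairs, where each pair is assumed to name two distinct entities that
    interacted on one transaction."""
    entities = sorted({name for pair in transactions for name in pair})
    index = {name: i for i, name in enumerate(entities)}
    size = len(entities)
    matrix = [[0] * size for _ in range(size)]
    for a, b in transactions:
        i, j = index[a], index[b]
        matrix[i][j] += 1  # interactions are treated as undirected,
        matrix[j][i] += 1  # so both cells are incremented
    return entities, matrix


# Hypothetical transactions: an adjuster approving two plans from one
# clinic, and that clinic working once with a doctor.
transactions = [
    ("adjuster_1", "clinic_A"),
    ("adjuster_1", "clinic_A"),
    ("clinic_A", "doctor_X"),
]
entities, matrix = build_interaction_matrix(transactions)
```

Row and column `i` of `matrix` both correspond to `entities[i]`, so a large off-diagonal count points at a pair of providers that transact with each other unusually often.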

In the analytical layer, an entity fraud suspicion estimator computes a fraud suspicion score for each individual entity based on the behavioral attributes of that entity and its relationships with the entities in its neighborhood. The neighborhood in this context comprises the entities that are linked directly or indirectly via transactions to the entity being assessed. The collusion risk of a group of entities is an additive metric that incorporates the fraud suspicion scores across all entities in the group. Identifying all possible groups in the ecosystem and computing their collusion risks are also functions of the analytical layer.
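The text does not give the estimator's formula, so the following is only a sketch of the general shape such a computation might take. The hop limit `max_hops`, the blending `weight`, the per-entity `behavior` scores, and the choice to average the neighbors' scores are all assumptions made for illustration, as are the function names.

```python
from collections import deque


def neighborhood(graph, start, max_hops):
    """Return the entities reachable from start within max_hops links.
    graph maps each entity to the entities it transacted with."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the neighborhood radius
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    seen.discard(start)
    return seen


def suspicion_score(graph, behavior, entity, max_hops=2, weight=0.5):
    """Blend an entity's own behavioral score with the mean score of its
    neighborhood (the blend itself is an illustrative assumption)."""
    near = neighborhood(graph, entity, max_hops)
    if not near:
        return behavior[entity]
    neighbor_mean = sum(behavior[n] for n in near) / len(near)
    return (1 - weight) * behavior[entity] + weight * neighbor_mean


def group_risk(scores, group):
    """Additive collusion risk: the sum of suspicion scores in the group."""
    return sum(scores[entity] for entity in group)
```

For example, with a hypothetical `graph = {"adj": ["clinic"], "clinic": ["adj", "doc"], "doc": ["clinic"]}`, calling `neighborhood(graph, "adj", 1)` yields only the clinic, while a radius of 2 also pulls in the doctor.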

Collusion risk is measured as the cumulative suspicion scores for all entities that are linked based on their interactions in a path through the ecosystem. The path possesses a Markov property in that the suspicion score of any entity in the path is dependent only on the relationships between that entity and the entities that lie within a pre-defined neighborhood. In one embodiment of this invention, all paths between any two entities in the ecosystem are searched using a non-recursive Breadth First Search algorithm to identify all entity ensembles that are then assessed for their collusion risk. The collusion risk for all entities on a possible path is additive and a function of the sum of the fraud suspicion scores for all entities in the path. The collusion risk for each path is inversely proportional to a function of the path cardinality. The dominant collusion risk in a network of service providers is identified by rank ordering the collusion risk scores computed across all paths identified in the system.
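The path search and ranking described above can be sketched as follows. This is a minimal illustration, not the patented method: the queue-based search enumerates simple paths (no repeated entity) in breadth-first order, the interaction graph is assumed to be undirected, and the penalty on path cardinality is assumed here to be a square root, since the text only says the risk is inversely proportional to "a function of" the cardinality. All names are hypothetical.

```python
import math
from collections import deque
from itertools import combinations


def simple_paths(graph, source, target):
    """Enumerate all simple paths from source to target without recursion,
    expanding partial paths from a queue in breadth-first order."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], ()):
            if nxt in path:
                continue  # skip entities already on this path
            if nxt == target:
                paths.append(path + [nxt])
            else:
                queue.append(path + [nxt])
    return paths


def rank_collusion_paths(graph, scores, size_penalty=math.sqrt):
    """Score every path between every pair of entities and rank them.
    Risk = sum of suspicion scores / size_penalty(number of entities)."""
    ranked = []
    for source, target in combinations(sorted(graph), 2):
        for path in simple_paths(graph, source, target):
            risk = sum(scores[e] for e in path) / size_penalty(len(path))
            ranked.append((risk, path))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


# Hypothetical three-entity ecosystem with precomputed suspicion scores.
graph = {"adj": ["clinic"], "clinic": ["adj", "doc"], "doc": ["clinic"]}
scores = {"adj": 0.9, "clinic": 0.8, "doc": 0.1}
ranked = rank_collusion_paths(graph, scores)
```

Here the adjuster and clinic pair ranks first, because adding the low-scoring doctor raises the sum only slightly while the cardinality penalty grows. Enumerating all simple paths grows quickly with network size, so a practical system would presumably cap path length or prune low-scoring entities first.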

The present invention is advantageous over previous systems in that fraud propensity can be detected not only at the individual entity level but also at the entity ensemble level, which helps isolate suspected fraud rings. To learn more about this invention, see the relevant disclosure [6].

The effectiveness of this technology lies in the intuitiveness of the results. In the example of the medical mill described earlier [1], the combination of specific doctors, chiropractors, and acupuncturists linking to the same set of unusual claims would be a dead giveaway that a medical mill was at work.

To learn more about this technology and its integration in the underwriting process, check out the case study below or look up the Claimsgator for Insurance product.

Underwriting process intelligence case study




[1] New York County District Attorney’s Office news release, March 11, 2008.

[2] Uniroyal Goodrich Tire Company v Mutual Trading Corporation, Nos 94-2915 & 94-3799.

[3] Gramm-Leach-Bliley Act, 15 USC, Subchapter I, Sec. 6801-6809 – Disclosure of Non-public Personal Information.

[4] Nearhos, J.; Rothman, M.; and Viveros, M. 1996. “Applying data mining techniques to a health insurance information system”. In Proc. of the 22nd Int’l Conference on Very Large Databases.

[5] P-N Tan, M. Steinbach, V. Kumar, “Introduction to Data Mining”, ISBN-10: 0321321367, Addison-Wesley, 2006.

[6] V. Madhok, L. Reimus, USPTO Application number: 12/816,574, Filing date: Jun 16, 2010.
