How are data and advanced analytics tackling insurance fraud?

15th August 2018
By Dan Lamyman
Co-Founder & Director
Business Intelligence & Advanced Analytics

Insurance fraud is a continuing blight. In 2016, a total of 125,000 fraudulent insurance claims cost the industry and policyholders alike a whopping £1.3bn in the UK alone. It’s also difficult to keep on top of, with sophisticated bands of fraudsters devising seemingly endless ways to cheat the system. For example, a fraud ring of eleven individuals was jailed for 42 years in 2017 for defrauding insurers to the tune of £500,000 by targeting unwitting drivers to crash into and then cashing in on the insurance. Cases like this are forcing insurance companies to invest over £200 million a year in fraud detection (according to the ABI). Fortunately, there may be a solution on the horizon, as AI and big data analytics offer powerful new ways to tackle false insurance claims.

These new technologies offer improved avenues for the gathering and analysis of both the usual structured data and unstructured text data. Machine learning algorithms are able to process data and ascertain patterns in behaviour at a scale and depth that far exceed human analytic capabilities, which can result in significantly more accurate fraud detection.

With this in mind, I thought it’d be good to explore the types of data analytics that are proving themselves to be instrumental in insurance fraud detection.

Vast quantities of unstructured text data gush into insurance companies on a daily basis. Sources range from health, police, and incident reports to live chat sessions, social media interactions and email.

Text Analytics

This data needs to be intricately sifted, analysed, and acted on – and fast. Relevant evidence, such as important emails, must be processed quickly; an oversight or delay could cost the insurance company dearly. By automatically grouping and routing the data, algorithms can ensure valuable information reaches its destination quickly, in a form that is clear, relevant, and easily digestible.

Detecting fraud as quickly as possible is paramount. Business processes are becoming increasingly automated, which means transactions take place much faster. Detection must therefore be faster still, ideally in real time, in order to catch fraud before it is able to do damage.

How does text analysis help in fraud detection, though?

A machine learning data analytics tool is able to comb through text and ascertain patterns indiscernible to the naked human eye. It can learn to identify, for example, certain phrases and descriptions of incidents that appear regularly and indicate potential foul play. For a human to do the same job would be both time-consuming and fraught with potential error. Humans are less able than machine learning algorithms to offer consistent, objective analysis, and are simply unable to identify the same in-depth patterns visible to the algorithm.
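To give a flavour of the idea, here is a minimal sketch in Python of how a tool might learn which two-word phrases turn up more often in fraudulent claims than in genuine ones. It is a deliberately simple log-odds count rather than a production model, and the claim descriptions, labels and function names are all invented for illustration.

```python
from collections import Counter
import math


def bigrams(text):
    """Split a claim description into lowercase two-word phrases."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def phrase_scores(labelled_claims):
    """Return a log-odds score per phrase; positive means more common in fraud."""
    fraud, genuine = Counter(), Counter()
    for text, is_fraud in labelled_claims:
        # Count each phrase once per claim, however often it repeats.
        (fraud if is_fraud else genuine).update(set(bigrams(text)))
    n_fraud = sum(1 for _, is_fraud in labelled_claims if is_fraud)
    n_genuine = len(labelled_claims) - n_fraud
    scores = {}
    for phrase in set(fraud) | set(genuine):
        # Add-one smoothing so a phrase seen on one side only stays finite.
        p_fraud = (fraud[phrase] + 1) / (n_fraud + 2)
        p_genuine = (genuine[phrase] + 1) / (n_genuine + 2)
        scores[phrase] = math.log(p_fraud / p_genuine)
    return scores


def score_claim(text, scores):
    """Sum the scores of every known phrase in a new claim description."""
    return sum(scores.get(phrase, 0.0) for phrase in set(bigrams(text)))


# Invented toy data: (description, was_fraud)
TRAINING = [
    ("driver braked suddenly at roundabout for no reason", True),
    ("car in front braked suddenly and passengers claimed whiplash", True),
    ("tree fell on parked car during storm", False),
    ("reversed into bollard in supermarket car park", False),
]

SCORES = phrase_scores(TRAINING)
```

Calling `score_claim("vehicle ahead braked suddenly", SCORES)` gives a positive score because "braked suddenly" only appears in the fraudulent examples. A real system would need far more data and a proper model, but the principle of learning suspicious phrasing from past claims is the same.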

Predictive Analytics

Predicting fraudulent activity based on previous incidents can be problematic. Criminals know that they need to evolve their methods regularly to avoid detection, which renders traditional predictive models of fraud detection less than helpful. That being said, there are indicators which, although seemingly absent on the surface, persist at a deeper level.

The difficulty with predictive analytics is flagging potential indicators of fraud without disrupting the customer experience. Whilst fraud may be correctly flagged, there are always going to be false positives at an early stage of the analytic process. Despite the need for speed of identification, a series of touchpoints must be hit before a flag is raised. Pausing a claim on the basis of one flag that may not be significant can be detrimental and costly, so sufficiently sophisticated and accurate prediction is of the utmost importance.
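As a rough illustration of the "series of touchpoints" idea, the sketch below only pauses a claim once at least two distinct indicators have fired and their combined weight reaches a threshold. The indicator names, weights and the threshold of 1.0 are my own assumptions for the example, not figures from any real insurer.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimReview:
    """Collects weighted fraud indicators for a single claim."""

    threshold: float = 1.0  # assumed cut-off; in practice tuned on real data
    hits: dict = field(default_factory=dict)

    def record(self, indicator, weight):
        """Record an indicator once; a repeat keeps the highest weight seen."""
        self.hits[indicator] = max(weight, self.hits.get(indicator, 0.0))

    @property
    def score(self):
        """Combined weight of all distinct indicators recorded so far."""
        return sum(self.hits.values())

    def should_pause(self):
        """Pause only if two or more distinct indicators reach the threshold."""
        return len(self.hits) >= 2 and self.score >= self.threshold


review = ClaimReview()
review.record("policy_taken_out_days_before_claim", 0.6)
paused_after_one = review.should_pause()
review.record("third_party_named_in_earlier_claim", 0.5)
paused_after_two = review.should_pause()
```

Here `paused_after_one` is `False` and `paused_after_two` is `True`: a single indicator, however it is repeated, never stops the claim on its own, which keeps the customer experience intact while the evidence builds.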

That’s why predictive analytics is not a standalone solution; in fact, it encompasses predictive elements derived from behavioural, pattern, graph, and link analysis sources.

Behavioural Analytics

Behavioural models, comprising customer profile data gathered over time and cross-referenced against data specific to individual products, can help calculate risk, which – in turn – has an impact on fraud detection. By individual products, I mean entities such as home insurance. Behaviour related to this product would comprise data on, for example, weather patterns in the area, employment stats, and crime rates.

With machine learning data analysis, these behaviours can be drilled down to incredibly precise criteria. Glaring anomalies have been clear for human claims assessors to spot for years, but many anomalies – as these algorithms are proving – can be virtually undetectable to the human eye.
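A very simple way to picture anomaly detection is to compare each feature of a claim against a peer group of similar claims and measure how far it sits from the norm. The sketch below uses a plain z-score from Python's standard `statistics` module, which is far cruder than a trained model. The feature names, peer values and the cutoff of three standard deviations are assumptions chosen for illustration.

```python
import statistics


def anomaly_score(value, peer_values):
    """How many standard deviations `value` sits from the peer-group mean."""
    mean = statistics.mean(peer_values)
    spread = statistics.stdev(peer_values)  # needs at least two peer values
    if spread == 0:
        return 0.0
    return (value - mean) / spread


def unusual_features(claim, peers, cutoff=3.0):
    """Return the features of a claim that sit beyond `cutoff` deviations."""
    flagged = {}
    for name, value in claim.items():
        z = anomaly_score(value, [peer[name] for peer in peers])
        if abs(z) >= cutoff:
            flagged[name] = round(z, 1)
    return flagged


# Invented peer group of comparable home insurance claims.
PEERS = [
    {"claim_amount": 1200, "days_since_policy_start": 400},
    {"claim_amount": 1350, "days_since_policy_start": 380},
    {"claim_amount": 1100, "days_since_policy_start": 420},
    {"claim_amount": 1280, "days_since_policy_start": 390},
    {"claim_amount": 1250, "days_since_policy_start": 410},
]

# The amount looks ordinary, but the claim arrived five days into the policy.
CLAIM = {"claim_amount": 1300, "days_since_policy_start": 5}

FLAGGED = unusual_features(CLAIM, PEERS)
```

In this toy example only `days_since_policy_start` is flagged. The claim amount is entirely unremarkable, which is exactly the kind of case an assessor glancing at the headline figure could miss.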

Behavioural analytics also draws on what is known as telematics. This is an avenue of data collection offered by sensors in, for example, policyholders’ cars (in the case of automobile insurance). These sensors gather data on the average mileage driven, average speed, and other more intricate factors contributing to the probability of an incident. I wrote about the impact of AI in motor insurance a while back, in case you’re interested in finding out more.

Pattern, Graph and Link Analysis

As I’ve mentioned, machine learning is able to detect patterns in data at a much deeper level than human analysts. Unexpected patterns found within a dataset can be an identifier of certain types of fraud.

Nonetheless, on the whole such patterns are more symptomatic than necessarily indicative of fraud having been committed. As such, further investigation of suspicious patterns, much of which can be automated, is necessary in order to ascertain fraudulent activity.

The system begins with initial pattern identification, then keeps a close, real-time watch as the claim progresses in order to identify anything that triggers certain rules and might raise a warning flag along the way. As a machine learning system gains more experience, it is able to identify those warning flags at a more sophisticated level, ascertaining not only the likelihood of fraud, but also the reasons behind its suspicions.

The relationships between different elements involved in a claim can be extremely complicated and practically invisible. Large datasets can comprise information from sources as seemingly disparate as social media connections and claims data from different insurance companies. Link analysis of, for example, Facebook friends and other social media connections can identify relationships between co-offenders, whilst links to claims made with other insurers, and on very different policies, can be established accurately and efficiently.

When dealing with large, organised fraud rings, the number of elements to be combed for relationships can be extremely high; a job only feasible through an automated system. As with the predictive analytics system, over time the ability to identify even deeper deviations grows.
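To show what link analysis can look like at its very simplest, the sketch below treats claims as nodes in a graph, links any two claims that share a detail such as a phone number or repair garage, and then collects the connected groups. The claim ids and details are made up, and a real system would work on a far richer graph with proper entity matching.

```python
from collections import defaultdict


def linked_groups(claims):
    """Group claim ids that are connected through any shared detail."""
    # Map each detail (phone, address, garage...) to the claims mentioning it.
    claims_by_detail = defaultdict(set)
    for claim_id, details in claims.items():
        for detail in details:
            claims_by_detail[detail].add(claim_id)

    # Two claims are neighbours if they share at least one detail.
    neighbours = defaultdict(set)
    for ids in claims_by_detail.values():
        for claim_id in ids:
            neighbours[claim_id] |= ids - {claim_id}

    # Walk the graph to collect its connected components.
    seen, groups = set(), []
    for start in claims:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbours[current] - group)
        seen |= group
        groups.append(group)
    return groups


# Invented example: C1 and C3 share nothing directly, only via C2.
CLAIMS = {
    "C1": {"phone:07000 000001", "garage:Acme Repairs"},
    "C2": {"phone:07000 000002", "garage:Acme Repairs"},
    "C3": {"phone:07000 000002", "address:1 Example Street"},
    "C4": {"address:9 Sample Road"},
}

GROUPS = linked_groups(CLAIMS)
```

The result puts C1, C2 and C3 in one group and leaves C4 on its own. Claims C1 and C3 have no detail in common, yet they end up connected through C2, which is the sort of indirect relationship that is practically invisible when claims are reviewed one at a time.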


As all aspects of society, and the financial world in particular, become increasingly automated, a hunger for data becomes the heart and soul of business. With more data available than ever, the ability to save the economy billions has never been greater. Insurance fraud has been a serious problem for generations, with the limits of the human mind stunting our ability to identify sophisticated criminal activity. The growth of machine learning systems that can surpass anything we could do alone ushers in a new era of security that will have positive ramifications for businesses and consumers alike.

If you would like to discuss anything from this article, please feel free to reach out to me.
