Hayley Carlotto, Marc Maier
Thursday 28 March 2019
Life insurance provides trillions of dollars of financial security for hundreds of millions of individuals and families worldwide. To price products competitively and maintain financial solvency (i.e., stay in business), life insurance companies like MassMutual must be able to accurately assess the mortality risk of applicants. The process of estimating this risk is called underwriting.
With many years of historical underwriting data and advancements in machine learning, the life underwriting process presented a great opportunity for data science to make an impact. Collaborating with actuaries, medical doctors, underwriters, and reinsurers, we developed a life score that has been deployed in an algorithmic underwriting system at MassMutual. Internally, this score is branded as the MassMutual Mortality Score (M3S), and it is sold commercially under the name LifeScore360.
Under a realistic simulation of MassMutual's historical book of business, we show that the life score outperforms traditional underwriting as measured by claims. In production, it's saving millions of dollars in operational efficiency, reducing the time it takes to issue policies, and driving the decisions behind tens of billions of dollars of benefits.
Underwriting is the process of collecting and analyzing data to estimate risk. Typically, when you decide you need life insurance, you meet with a financial advisor who connects you with a life insurance carrier and assists you with the application process. Most life insurance applications consist of an extensive questionnaire and a visit with a paramedical examiner who draws blood and collects a urine sample to be tested by a lab. Once the application is submitted and the lab tests are completed, an underwriter takes all this data and consults lengthy but well-established medical and life underwriting guidelines. These guidelines generally look at each data point independently and feed into a point-based system that ultimately assigns one of several risk classes. The risk class that the applicant is assigned to dictates the price, or premium payments, of the policy.
This process has worked very well in the past (after all, MassMutual is over 168 years old), but it has not been optimized to take full advantage of the richness of the data. Leveraging this rich historical data and cutting-edge data science methods, we had the opportunity to develop an algorithmic solution that complements the traditional underwriting process.
The life score model is trained on nearly one million applications, spanning a 15-year period, using a survival variant of the random forest algorithm [1]. Given an individual applicant, the model computes a cumulative hazard function, which is then mapped to a more interpretable life score.
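For readers new to survival analysis, it may help to see what a cumulative hazard function looks like on a handful of records. In the random survival forest method cited above, each tree's terminal node estimates cumulative hazard with the Nelson-Aalen estimator, and the forest averages those estimates across trees. The sketch below shows only that single-node estimator on made-up data; the function name and the toy values are assumptions for illustration, not the production model.

```python
from collections import Counter


def nelson_aalen(durations, events):
    """Return [(time, cumulative_hazard)] at each distinct observed death time.

    durations: follow-up time for each individual
    events:    1 if a death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # deaths and censorings both leave the risk set
    at_risk = len(durations)
    cumulative = 0.0
    curve = []
    for t in sorted(exits):
        if deaths[t]:
            # Hazard increment: deaths at t divided by the number still at risk.
            cumulative += deaths[t] / at_risk
            curve.append((t, cumulative))
        at_risk -= exits[t]
    return curve


# Toy data (made up for illustration): years of follow-up and death indicators.
durations = [2, 3, 3, 5, 8, 8, 10]
events = [1, 0, 1, 1, 0, 1, 0]
curve = nelson_aalen(durations, events)
```

The curve only steps up at times where a death was observed; censored applicants shrink the risk set without adding hazard.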
The life score ranges from 0 to 100 and represents the mortality risk of a given individual relative to similar individuals. Life insurers price their products with respect to age, sex, and smoking status, and the life score is interpreted with respect to these same factors. For example, if Carlos is a 55-year-old non-smoking male with a life score of 87, he can be compared directly against Barry, another 55-year-old non-smoking male with a score of 53, and Carlos has the lower mortality risk. However, if Amy is a 35-year-old non-smoking female with a score of 87, she does not necessarily present the same mortality risk as Carlos.
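One way to picture a cohort-relative score is as a rank within a peer group. The sketch below is an assumption for illustration only: the article does not specify how cumulative hazard is mapped to the life score, so the percentile-style formula, the field names, and the hazard values here are all hypothetical.

```python
from collections import defaultdict


def cohort_scores(applicants):
    """Map each applicant to a 0-100 score relative to their own cohort.

    Higher score means lower estimated hazard than more of the cohort.
    This percentile-style mapping is a hypothetical stand-in.
    """
    by_cohort = defaultdict(list)
    for a in applicants:
        by_cohort[a["cohort"]].append(a)

    scores = {}
    for members in by_cohort.values():
        hazards = [m["hazard"] for m in members]
        for m in members:
            # Share of the cohort with strictly higher hazard (riskier peers).
            riskier = sum(h > m["hazard"] for h in hazards)
            scores[m["id"]] = round(100 * riskier / len(hazards))
    return scores


# Made-up hazards; a cohort is (age, sex, smoking status).
applicants = [
    {"id": "carlos", "cohort": (55, "M", "non-smoker"), "hazard": 0.10},
    {"id": "barry", "cohort": (55, "M", "non-smoker"), "hazard": 0.30},
    {"id": "peer_1", "cohort": (55, "M", "non-smoker"), "hazard": 0.20},
    {"id": "peer_2", "cohort": (55, "M", "non-smoker"), "hazard": 0.40},
    {"id": "amy", "cohort": (35, "F", "non-smoker"), "hazard": 0.02},
    {"id": "peer_3", "cohort": (35, "F", "non-smoker"), "hazard": 0.05},
]
scores = cohort_scores(applicants)
```

Because each score is computed inside its own cohort, Carlos and Barry are directly comparable, while Amy's score says nothing about her hazard relative to Carlos.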
A number on its own is valuable, but understanding why it's high or low is an important question for insurance applicants, underwriters, and advisors. In addition to the life score, the mortality model provides contributing factors, i.e., which health attributes increased and decreased the score. Life score contributions are calculated using Shapley values, an intuitive, model-agnostic approach to interpreting model decisions derived from coalitional game theory [2]. Here's an example of how the contributions of a life score can be visualized:
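Separately from the visualization, the idea behind Shapley values can be shown with a tiny exact computation: a feature's contribution is its marginal effect on the output, averaged over every order in which features could be added. The toy scoring function and feature names below are hypothetical, and this brute-force version is only practical for a handful of features; it is a sketch of the concept, not the method used in production.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(known):
    """Hypothetical score as a function of which attributes are revealed."""
    score = 50.0
    if "cholesterol" in known:
        score += 10
    if "blood_pressure" in known:
        score -= 6
    if "cholesterol" in known and "bmi" in known:
        score += 4  # interaction, shared between the two features
    return score


features = ["cholesterol", "blood_pressure", "bmi"]
contributions = shapley_values(toy_score, features)
```

The contributions sum to the difference between the full score and the baseline score, which is the property that makes them readable as "what raised and what lowered the score".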
To evaluate the life score against traditional underwriting, the short answer is that you create a robust simulation that shows what would happen if a different underwriting regime were in place. In collaboration with actuaries, we designed an algorithm that generates a synthetic, model-assigned book of business to compare against historical underwriting risk class offers. On a pool of 600k applications, we use the life score to sort individuals by mortality risk and reassign risk classes while maintaining identical age, sex, application year, and risk class distributions. With two sets of decisions in hand, we can compare the mortality between the historical and model risk class assignments.
Underwriters are experts at risk selection, yet the results show that after a 15-year duration, the life score model would have formed an offer pool with 6% fewer deaths.
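The reassignment step of the simulation can be sketched as follows. Within each stratum, the historical risk classes are kept as a fixed multiset and handed back out in order of life score, so the class distribution is unchanged while the membership of each class may differ. The class names, field names, and stratum keys are hypothetical, and the real algorithm designed with actuaries is likely more involved than this minimal version.

```python
from collections import defaultdict

# Hypothetical risk classes, ordered best to worst.
CLASS_ORDER = ["best", "standard", "substandard"]


def reassign(applicants):
    """Reassign risk classes by life score, preserving per-stratum class counts."""
    strata = defaultdict(list)
    for a in applicants:
        strata[a["stratum"]].append(a)

    assigned = {}
    for members in strata.values():
        # Same multiset of classes as the historical decisions, best first.
        classes = sorted(
            (m["historical_class"] for m in members), key=CLASS_ORDER.index
        )
        # Highest life score (lowest relative risk) gets the best class.
        ranked = sorted(members, key=lambda m: m["life_score"], reverse=True)
        for m, risk_class in zip(ranked, classes):
            assigned[m["id"]] = risk_class
    return assigned


# Made-up pool; a stratum stands in for (age, sex, application year).
pool = [
    {"id": "a1", "stratum": "A", "life_score": 40, "historical_class": "best"},
    {"id": "a2", "stratum": "A", "life_score": 90, "historical_class": "standard"},
    {"id": "a3", "stratum": "A", "life_score": 70, "historical_class": "substandard"},
    {"id": "b1", "stratum": "B", "life_score": 55, "historical_class": "standard"},
    {"id": "b2", "stratum": "B", "life_score": 60, "historical_class": "standard"},
]
model_book = reassign(pool)
```

Comparing observed deaths per class between `model_book` and the historical assignments is then an apples-to-apples comparison, since both books have the same number of people in each class within each stratum.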
In production at MassMutual, thresholds are calibrated such that the life score yields the same set of discrete risk classes used by underwriters. Below is a visualization of how this risk class recommendation is generated.
The model-driven risk class recommendation is then used in conjunction with the underwriting process (we'll get into that soon) to land on a final risk class decision.
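The threshold step described above amounts to a lookup against a sorted list of cutoffs. In the sketch below, the cutoff values and class names are placeholders chosen for illustration; the calibrated production thresholds are not given in this article.

```python
from bisect import bisect_right

# Hypothetical cutoffs on the 0-100 life score and hypothetical class names.
CUTOFFS = [25, 50, 75]
CLASSES = ["class_4", "class_3", "class_2", "class_1"]  # worst to best


def recommend_class(life_score):
    """Return the risk class whose score band contains life_score."""
    return CLASSES[bisect_right(CUTOFFS, life_score)]


carlos_class = recommend_class(87)
barry_class = recommend_class(53)
```

A score exactly on a cutoff falls into the better of the two adjacent classes here; that tie-breaking choice is also an assumption.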
While the large historical simulation illustrates how the life score performs on its own, in production we leverage (and are required to abide by) the large established set of underwriting guidelines. These guidelines have been codified into thousands of rules that cap applicants to risk classes based on particular health attributes of their application. When a rule is triggered (say, someone checks off that they have a heart condition), underwriters can focus on pertinent details of the application and use domain expertise to override the rule, decide if additional information is required, or confirm the rule and proceed with the recommended risk class.
Ultimately, the life score is the key driver of the final offer, but rules may hold the application to a worse risk class. Applications with no rules triggered can be issued with light review. Below is a schematic of how the mortality model interacts with the larger algorithmic underwriting system.
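The interaction between the model recommendation and the rule caps can be summarized in a few lines of code. This is a simplified sketch with hypothetical class names: it takes the worst class among the model recommendation and any triggered caps, and flags whether an underwriter needs to look at triggered rules. It does not model the underwriter's option to override a rule or request more information.

```python
# Hypothetical risk classes, ordered best to worst.
CLASS_ORDER = ["class_1", "class_2", "class_3", "class_4"]


def final_class(model_class, triggered_caps):
    """Return the worse of the model recommendation and any rule caps."""
    candidates = [model_class, *triggered_caps]
    return max(candidates, key=CLASS_ORDER.index)


def needs_underwriter_review(triggered_caps):
    """Applications with no triggered rules can proceed with light review."""
    return bool(triggered_caps)


clean_offer = final_class("class_1", [])
capped_offer = final_class("class_1", ["class_3", "class_2"])
```

Note that a cap can only make the offer worse, never better: an applicant whose model recommendation is already below the cap keeps the model's class.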
Since its deployment in 2016, the life score model has scored more than 175k applications. It's saving millions of dollars in operational costs, reducing the time it takes to issue policies, and increasing the rate at which applicants take their policy offer. In the long run, MassMutual will also observe better mortality rates in its healthiest risk classes.
Read our paper, recently published at the IAAI Conference on Innovative Applications of Artificial Intelligence!
Get a quick and easy estimate of your life score here!
[1] Ishwaran, Hemant, et al. "Random survival forests." The Annals of Applied Statistics 2.3 (2008): 841-860.
[2] Lundberg, Scott M., and Su-In Lee. "A unified approach to interpreting model predictions." Advances in Neural Information Processing Systems. 2017.