Nonlife Insurance Risk Classification Using Categorical Embedding

Peng Shi - University of Wisconsin-Madison

21 May 2021

Introduction

Motivation

Goal

We present several actuarial applications of categorical embedding in the context of nonlife insurance risk classification.

Based on paper:

P. Shi and K. Shi, 2021, Nonlife Insurance Risk Classification Using Categorical Embedding. Available at SSRN.

Categorical Embedding

What is Categorical Embedding?

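For a categorical variable \(x\) with \(K\) levels \(c_1,\ldots,c_K\), the embedding function into a \(d\)-dimensional embedding space is given by: \[\begin{align} e: x \mapsto \bf{\Gamma} \times \bf{\delta}, \end{align}\] where \(\bf{\Gamma}\) is a \(d \times K\) matrix of embedding weights and \(\bf{\delta}\) is the \(K\)-vector of Kronecker deltas, whose \(k\)th entry \(\delta_{x,c_k}\) equals 1 if \(x=c_k\) and 0 otherwise.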

The \(k\)th category is represented by the \(k\)th column of \(\bf{\Gamma}\). To see this, for the \(i\)th data point with \(x_i=c_k\), we note: \[\begin{align} e(x_i) = \left( \begin{array}{ccc} \gamma_{11} & \cdots & \gamma_{1K} \\ \vdots & \ddots & \vdots \\ \gamma_{d1} & \cdots & \gamma_{dK} \\ \end{array} \right) \times \left( \begin{array}{c} \delta_{x_i,c_1} \\ \vdots \\ \delta_{x_i,c_K} \\ \end{array} \right) = \left( \begin{array}{c} \gamma_{1k} \\ \vdots \\ \gamma_{dk} \\ \end{array} \right). \end{align}\]
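The column-selection property can be checked numerically. The following is a minimal sketch in plain Python; the level names, the matrix entries, and the helper names `one_hot` and `embed` are hypothetical and chosen only for illustration, with \(K=3\) and \(d=2\).

```python
def one_hot(level, levels):
    """Return the K-vector of Kronecker deltas delta_{x, c_k}."""
    return [1.0 if level == c else 0.0 for c in levels]


def embed(level, levels, gamma):
    """Compute Gamma x delta, where gamma is a d x K matrix (list of rows)."""
    delta = one_hot(level, levels)
    return [sum(g * d for g, d in zip(row, delta)) for row in gamma]


# Hypothetical categories c_1, c_2, c_3 and a 2 x 3 embedding matrix.
levels = ["A", "B", "C"]
gamma = [
    [0.1, 0.5, -0.3],
    [0.7, -0.2, 0.4],
]

e_b = embed("B", levels, gamma)

# The embedding of the 2nd category is the 2nd column of gamma.
second_column = [row[1] for row in gamma]
print(e_b, second_column)
```

In practice the matrix product is never formed explicitly: the embedding is a table lookup of the relevant column, which is why embedding layers stay cheap even when \(K\) is large.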

Categorical Embedding and Neural Networks

The embeddings can be automatically learned by a neural network in the supervised training process.
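To illustrate what learning the embeddings in supervised training can look like, here is a minimal sketch under assumptions that are not taken from the paper: a single categorical variable, a binary claim indicator as the response, one logistic output unit on top of the embedding, and plain stochastic gradient descent on the log-loss. The function names, hyperparameters, and toy data are all hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def train_embedding(x, y, n_levels, dim=2, lr=0.05, epochs=300, seed=0):
    """Fit p = sigmoid(w . e(x) + b), learning the embeddings jointly with w and b."""
    rng = random.Random(seed)
    # gamma[k] holds the embedding vector of level k (the k-th column of Gamma).
    gamma = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_levels)]
    w = [rng.gauss(0.0, 0.1) for _ in range(dim)]
    b = 0.0
    order = list(range(len(x)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            e = gamma[x[i]]
            p = sigmoid(sum(wj * ej for wj, ej in zip(w, e)) + b)
            g = p - y[i]  # derivative of the log-loss with respect to the logit
            for j in range(dim):
                grad_w = g * e[j]
                grad_e = g * w[j]
                w[j] -= lr * grad_w
                e[j] -= lr * grad_e  # only the observed level's embedding moves
            b -= lr * g
    return gamma, w, b


def predict(level, gamma, w, b):
    return sigmoid(sum(wj * ej for wj, ej in zip(w, gamma[level])) + b)


# Hypothetical data: three levels with observed claim rates 0.1, 0.5 and 0.9.
x = [0] * 10 + [1] * 10 + [2] * 10
y = [1] + [0] * 9 + [1] * 5 + [0] * 5 + [1] * 9 + [0]

gamma_hat, w_hat, b_hat = train_embedding(x, y, n_levels=3)
probs = [predict(k, gamma_hat, w_hat, b_hat) for k in range(3)]
print(probs)
```

Each gradient step updates only the embedding vector of the category that was observed, so levels with similar outcomes tend to end up with similar embeddings. A real application would use a deep learning library and feed the embedding into further layers alongside the other rating variables.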

Actuarial Literature

We emphasize that categorical embedding is especially useful in two scenarios:

Actuarial Applications

Data

The insurance claims dataset is obtained from the Local Government Property Insurance Fund of Wisconsin.

Description of rating variables

We consider a binary outcome that measures the claim frequency by peril

Claim frequency outcomes are dependent:

Rating Classes for A Single Risk

In this case, we consider the context where there is a single insurance risk:


We fit neural networks:

Some results on prediction:

We could also use the embeddings to create risk classes:

Portfolio Management for Dependent Risks

\[\begin{align} \rho(z_1,z_2,z_3) = \frac{{\rm Pr}(Z_1=z_1,Z_2=z_2,Z_3=z_3)}{{\rm Pr}(Z_1=z_1){\rm Pr}(Z_2=z_2){\rm Pr}(Z_3=z_3)} \end{align}\]
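The ratio \(\rho\) compares the joint probability of an outcome combination with the product of the marginal probabilities, so it equals 1 when the three outcomes are independent at that combination. A minimal sketch of an empirical version follows, assuming that \(Z_1, Z_2, Z_3\) are binary claim indicators and that probabilities are estimated by sample proportions. The function name `dependence_ratio` and the toy data are hypothetical.

```python
def dependence_ratio(outcomes, z):
    """Empirical rho(z1, z2, z3): joint proportion over product of marginal proportions."""
    n = len(outcomes)
    z = tuple(z)
    joint = sum(1 for row in outcomes if tuple(row) == z) / n
    marginal_product = 1.0
    for j, zj in enumerate(z):
        marginal_product *= sum(1 for row in outcomes if row[j] == zj) / n
    if marginal_product == 0.0:
        return float("nan")  # the ratio is undefined if some marginal is never observed
    return joint / marginal_product


# Hypothetical sample where claims tend to occur together.
clustered = [(0, 0, 0)] * 5 + [(1, 1, 1)] * 2 + [(1, 0, 0)]
rho_clustered = dependence_ratio(clustered, (1, 1, 1))

# Hypothetical sample where every combination occurs once (empirical independence).
balanced = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
rho_balanced = dependence_ratio(balanced, (1, 1, 1))

print(rho_clustered, rho_balanced)
```

In the clustered sample the joint proportion is 2/8 while the marginals are 3/8, 2/8 and 2/8, giving a ratio of 32/3, far above 1. In the balanced sample the ratio is exactly 1.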

We consider two types of insurance coverage: stop-loss insurance and excess-of-loss insurance. The insurer’s retained loss can be represented as: \[\begin{align*} {\rm Stop ~loss}:& ~R_1 = \min\{S,d_1\}\\ {\rm Excess~ of~ loss}:& ~R_2 = \max\{S-d_2,0\} \end{align*}\]
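The two retained-loss formulas can be written as small functions. This is a minimal sketch that treats \(S\) as a nonnegative loss amount and \(d_1\), \(d_2\) as fixed thresholds. The function names and the loss values are hypothetical.

```python
def stop_loss_retained(s, d1):
    """R_1 = min{S, d_1}."""
    return min(s, d1)


def excess_of_loss_retained(s, d2):
    """R_2 = max{S - d_2, 0}."""
    return max(s - d2, 0.0)


# Hypothetical loss amounts and thresholds.
losses = [0.0, 50.0, 120.0, 300.0]
d1 = 100.0
d2 = 100.0

r1 = [stop_loss_retained(s, d1) for s in losses]
r2 = [excess_of_loss_retained(s, d2) for s in losses]
print(r1)
print(r2)
```

When \(d_1 = d_2\), the two quantities split the loss exactly: \(\min\{S,d\} + \max\{S-d,0\} = S\) for every \(S\).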

Pricing New Risks
