Prepare data for further usage
prepare_data.RdThe function filters large claims and adds columns that are needed for generating pools and for sampling.
It summarises seven subfunctions, see details for detailed description of parameters and subfunctions.
Usage
prepare_data(
claims_data,
indices,
threshold,
first_orig_year,
last_orig_year,
expected_year_of_growing_large = 3,
reserve_classes,
pool_of_annuities = NULL
)Arguments
- claims_data
dataframe, see details.
- indices
dataframe, see details.
- threshold
numeric value, see details.
- first_orig_year
integer or numeric value, see details.
- last_orig_year
integer or numeric value, see details.
- expected_year_of_growing_large
integer or numeric value, see details.
- reserve_classes
numeric vector, see details.
- pool_of_annuities
dataframe, see details.
Details
Data preparation before applying sicr is done in seven steps. The subfunctions for each step can be used separately.
This allows users to skip steps in order to use own implementations instead.
Description of needed input data
first_orig_year
Integer value for the first origin year for which a full claim history is available.
Information from older claims for which history is only available fromfirst_orig_yearto today may though be considered for the pools, seeclaims_data.last_orig_year
Integer value for the last origin year which will also be treated as the last calendar year, usually the calendar year just ended.claims_data
For each claim that could exceed the threshold after indexation this dataframe must contain one row for every calendar year in which at least one ofCl_reserve,An_reserve,Cl_payment_calorAn_payment_calis not equal to 0.
To consider older claims for which history is only available fromfirst_orig_yearuntil now, add one row for this claim for calendar yearfirst_orig_year - 1and fillCl_payment_calandAn_payment_calwith the cumulated payments up to that year andCl_reserveandAn_reservewith the reserve at the end offirst_orig_year - 1. This allows for checking if a claim belongs to large or small claims, but due to the missing first part of the history the Dev_year_of_growing_large (see step 5) can't be determined for these claims. Dev_year_of_growing_large has to be estimated instead (see parameter expected_year_of_growing_large).
claims_data must consist of the following columns:Claim_id
type: character. Each claim must be assigned a unique claim id. These must correspond to the claim id's in the dataframepool_of_annuities.Origin_year
type: integer or numeric. Origin year of the claim.Calendar_year
type: integer or numeric. Calendar year of payments and reserves of the claim.Cl_payment_cal
type: numeric. Claim payments made in this calendar year without annuity payments.Cl_reserve
type: numeric. Claim reserve at the end of this calendar year without annuity reserves.An_payment_cal
type: numeric. Annuity payments made in this calendar year.An_reserve
type: numeric. Annuity reserve at the end of this calendar year.
pool_of_annuities
This dataframe contains one row for each annuity that has ever been agreed on regardless of whether the annuity is still active or not.pool_of_annuitiesmust consist of the following columns:Claim_id
type: character. Each claim must be assigned a unique claim id. These must correspond to the claim id's in the dataframeclaims_data.Annuity_id
type: character. Annuity ID for the case of multiple annuities in one claim.Origin_year
type: integer or numeric. Origin year of the claim.Calendar_year
type: integer or numeric. Calendar year of information.Entering_year
type: integer or numeric. Year in which insurer and recipient have agreed on the annuity.Annuity_start
type: integer or numeric. Year in which payment starts, may be a past or a future year.Annuity_end
type: integer or numeric. Year in which payment ends, may be a past or a future year.Birth_year
type: integer or numeric. Birth year of recipient for assigning survival probability from mortality tables.Gender
type: character "m" or "f". Gender of recipient for assigning mortality table.Annual_payment
type: numeric. Agreed annual payment.Dynamic
type: numeric. If a dynamic increase of the annual payment is part of the agreement, it can be specified here, e.g. 0.02 for 2% annual payment increasement.
indices
This dataframe contains one row for each historic calendar year and 250 future calendar years.
Although the indices for the future calendar years (as well as the columnIndex_re) are not needed here but in the subsequent simulation, it is required to create all rows and all columns of this dataframe just once with the helper functions.expand_historic_indices()andadd_transition_factor().indicesmust consist of the following columns:Calendar_year
type: integer or numeric.Index_gross
type: numeric. Claim payment development from one year to another, e.g. 0.02 for 2% increaseIndex_re
type: numeric. Contractually fixed claim payment development that is to be used in special index clauses that are a common part of longtail xl resinsurance programs.Transition_factor
type: numeric. The transition_factor for year j indexes a payment of year j to the niveau of payments in the fixed index_year.
threshold
Numeric value to separate small from large claims.Thresholdhas to be stated at the niveau of index_year.reserve_classes
numeric vector to specify reserve classes at the niveau of index_year, for examplec(1, 1001)for the three reserve classes(-Inf, 1),[1, 1001)and[1001, Inf).expected_year_of_growing_large
Integer value for the estimated development year of becoming large for claims without full history (see description above). Default: 3.
Step 2 - Add missing rows
claims_data is requested to only contain rows in which one of Cl_reserve, An_reserve, Cl_payment_cal or An_payment_cal is not equal to 0.
This step adds rows so that claims_data contains one row per calendar year between the origin year and the last origin year for each claim.
Step 3 - Add columns
This step adds derived columns to claims_data that are needed for further operations:
Development_yearSum of claim and annuity payments and reserve
Payment_calas the sum ofCl_payment_calandAn_payment_calReserveas the sum ofCl_reserveandAn_reserve
cumulated columns
Cl_payment_cumas the cumulated claim paymentsAn_payment_cumas the cumulated annuity paymentsPayment_cumas the sum of the two latter
incurred columns
Cl_incurredas the sum ofCl_payment_cumandCl_reserveAn_incurredas the sum ofAn_payment_cumandAn_reserveIncurredas the sum of the two latter
Entry reserve columns
Entry_cl_reserveas theCl_reserveof the beginning of the yearEntry_an_reserveas theAn_reserveof the beginning of the yearEntry_reserveas the sum of the two latter
Step 4 - Add indexed columns
This step uses the dataframe indices to index every payment and every reserve to the indexation year.
The following columns are calculated:
indexed payments
Ind_cl_payment_calInd_an_payment_calInd_payment_calas the sum of the two latter
indexed reserves
Ind_cl_reserveInd_an_reserveInd_reserveas the sum of the two latter
indexed entry reserves
Ind_entry_cl_reserveInd_entry_an_reserveInd_entry_reserveas the sum of the two latter
cumulated columns
Ind_cl_payment_cumInd_an_payment_cumInd_payment_cumas the sum of the two latterInd_cl_incurredInd_an_incurredInd_incurredas the sum of the two latter
Step 5 - Filtering large claims
The indexed columns allow for the exact calculation of which claim has become large and when.
Claims that have not become large yet are eliminated.
The following new columns are derived from this information:
Large_sinceas the calendar year in which the claim exceeded the threshold for the first timeDev_year_of_growing_largeas the development year in which the claim exceeded the threshold for the first timeDev_year_since_largeas the "new" development year where counting starts in yearLarge_sinceinstead of the origin year
Step 6 - Add reserve classes
The columns Entry_reserve_class and Exit_reserve_class can now be derived from Ind_entry_cl_reserve and Ind_cl_reserve.
Step 7 - Attach future annuities
The columns New_annuity_1 to New_annuity_5 are added to the dataframe. If new annuities have been agreed on
in the calendar year and the claim of the row, these columns contain the row number of these annuities in the corresponding
dataframe pool_of_annuities. These columns are needed for the pool as the sampling shall consider claim payments and
new reserve classes as well as new annuities.
The function always adds five columns for new annuities which is sufficient for the rare case that five new annuities are agreed on for one single claim in one calendar year. Each further new annuity in one year will be ignored, which could lead to a slight underestimation of the best estimate. To avoid this, data may be manually adjusted, for example by moving further annuities to another calendar year or by aggregating annuities.
Examples
# this example uses data provided with this package
new_data <- prepare_data(claims_data = minimal_claims_data_xmpl,
indices = indices_xmpl,
threshold = 400000,
first_orig_year = 1989,
last_orig_year = 2023,
expected_year_of_growing_large = 3,
reserve_classes = c(1, 200001, 400001, 700001, 1400001),
pool_of_annuities = minimal_pool_of_annuities_xmpl)
head(new_data)
#> Claim_id Origin_year Calendar_year Cl_payment_cal Cl_reserve An_payment_cal
#> 1 Claim#1 1976 1976 0 0 0
#> 2 Claim#1 1976 1977 0 0 0
#> 3 Claim#1 1976 1978 0 0 0
#> 4 Claim#1 1976 1979 0 0 0
#> 5 Claim#1 1976 1980 0 0 0
#> 6 Claim#1 1976 1981 0 0 0
#> An_reserve Development_year Payment_cal Reserve Cl_payment_cum An_payment_cum
#> 1 0 1 0 0 0 0
#> 2 0 2 0 0 0 0
#> 3 0 3 0 0 0 0
#> 4 0 4 0 0 0 0
#> 5 0 5 0 0 0 0
#> 6 0 6 0 0 0 0
#> Payment_cum Cl_incurred An_incurred Incurred Entry_cl_reserve
#> 1 0 0 0 0 0
#> 2 0 0 0 0 0
#> 3 0 0 0 0 0
#> 4 0 0 0 0 0
#> 5 0 0 0 0 0
#> 6 0 0 0 0 0
#> Entry_an_reserve Entry_reserve Ind_cl_payment_cal Ind_an_payment_cal
#> 1 0 0 0 0
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
#> 6 0 0 0 0
#> Ind_payment_cal Ind_cl_reserve Ind_an_reserve Ind_entry_cl_reserve
#> 1 0 0 0 0
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
#> 6 0 0 0 0
#> Ind_entry_an_reserve Ind_entry_reserve Ind_reserve Ind_cl_payment_cum
#> 1 0 0 0 0
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
#> 6 0 0 0 0
#> Ind_an_payment_cum Ind_payment_cum Ind_cl_incurred Ind_an_incurred
#> 1 0 0 0 0
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
#> 6 0 0 0 0
#> Ind_incurred Large_since Dev_year_of_growing_large Dev_year_since_large
#> 1 0 1978 3 -1
#> 2 0 1978 3 0
#> 3 0 1978 3 1
#> 4 0 1978 3 2
#> 5 0 1978 3 3
#> 6 0 1978 3 4
#> Entry_reserve_class Exit_reserve_class New_annuity_1 New_annuity_2
#> 1 0 0 0 0
#> 2 0 0 0 0
#> 3 0 0 0 0
#> 4 0 0 0 0
#> 5 0 0 0 0
#> 6 0 0 0 0
#> New_annuity_3 New_annuity_4 New_annuity_5
#> 1 0 0 0
#> 2 0 0 0
#> 3 0 0 0
#> 4 0 0 0
#> 5 0 0 0
#> 6 0 0 0