This documentation is in rudimentary form for release 0.1.1, which is meant to gauge how much interest (not the financial kind) this package generates.
The following vignettes are available.
On https://github.com/vanzanden/ggsolvencyii/tree/master/vignettes less rudimentary versions might be available between releases.
It will be very helpful to have seen a few examples of what ggsolvencyii can do before going through this vignette.
A typical spreadsheet might show an ORSA (Own Risk and Solvency Assessment) in the shape represented by the following data.frame:
id | time | ratio | SCR | BSCR | operational | life | market | l_expenses | l_CAT | m_equity | and so on |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2017 | 230 | 100 | 80 | 25 | 33 | 50 | .. | .. | .. | .. |
2 | 2018 | 225 | 103 | 85 | 25 | 33 | 57 | .. | .. | .. | .. |
3 | 2019 | 227 | 107 | 90 | 23 | 37 | 60 | .. | .. | .. | .. |
.. |
One can discern several parts. The first columns hold the id of each SCR composition and its ‘meta’ attributes (time, ratio). The remaining columns describe the components of each SCR. The value of each item sits at the intersection of its column and row.
`ggplot2`, the foundation on which the plotting part of this package is built, expects data in a tidyverse format. Each row in the data describes only one data point, i.e. the value of one SCR item for one specific ‘id’.
The following code is used to transfer data (for example 2, a single SCR plot) from a spreadsheet in the same “human format” as above to tidyverse format (the numbers differ, though!).
data <- readxl::read_xlsx(path = "path/filename.xlsx", sheet = "ex2_data")
# reshape from wide to long: one row per combination of id and description
data <- tidyr::gather(data,
                      key = description,
                      value = value,
                      -id, -time, -ratio)
sii_z_ex2_data <- data.frame(time = as.numeric(data$time),
                             ratio = as.numeric(data$ratio),
                             description = factor(data$description), # it has to be a factor !!
                             value = as.numeric(data$value),
                             id = data$id)
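As a minimal sketch of the same wide-to-long step without a spreadsheet, the block below rebuilds the first columns of the example table above as a data.frame and reshapes it with base R only. The names `wide`, `long`, `meta_cols` and `value_cols` are illustrative and not part of the package.

```r
# Wide "human format" table; the values are those of the example table above
wide <- data.frame(
  id = 1:3,
  time = c(2017, 2018, 2019),
  ratio = c(230, 225, 227),
  SCR = c(100, 103, 107),
  BSCR = c(80, 85, 90),
  operational = c(25, 25, 23),
  life = c(33, 33, 37),
  market = c(50, 57, 60)
)

meta_cols <- c("id", "time", "ratio")          # attributes of the SCR composition
value_cols <- setdiff(names(wide), meta_cols)  # one column per SCR item

# Stack one block of rows per SCR item: one row per (id, description) pair
long <- do.call(rbind, lapply(value_cols, function(col) {
  data.frame(wide[meta_cols],
             description = col,
             value = wide[[col]],
             stringsAsFactors = FALSE)
}))
long$description <- factor(long$description, levels = value_cols)
rownames(long) <- NULL

head(long)
```

Every row of `long` now describes exactly one value, which is the shape `ggplot2` expects.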
When the above data is passed to the package, even with a very basic plotting call, a lot happens under the hood. Broadly speaking, the following steps are taken for `geom_sii_risksurface` and `geom_sii_riskoutline`:
1. When `geom_sii_riskoutline` is used for comparison of ids, risk values are moved between data rows.
2. The structure of the SCR composition is expanded with grouping information.
3. The expanded structure is integrated with the data.
4. Actual grouping is performed by adding rows.
5. For all elements to be plotted, the corner coordinates of the circle segments are calculated.
6. When applicable, rotation and/or "squarification" is applied by changing the corner coordinates.
7. Corner coordinates are transformed into a series of points for polygons.
`geom_sii_riskoutline` plots (some of) the outlines of the circle segments and as such can be used for an unobtrusive plot, or for an overlay of the composition of one SCR over another (see its use in the showcase vignette). To avoid having to work with two separate datasets, `geom_sii_riskoutline` offers the optional aesthetic `comparewithid`. It is best explained with an example. Compare the data of `sii_z_ex1_data` with the expanded structures without and with use of the `comparewithid` aesthetic. It shows that the structure of id = 1 is no longer plotted at its own location (2016, 230) but three times in 2017: the value 23 for SCR is now present three times in the data. This transformation is applied to all (sub)risks.
## the original data
sii_z_ex1_data[sii_z_ex1_data$description == "SCR", ]
#> time ratio description value id comparewithid
#> 1 2016 230 SCR 23.00000 1 NA
#> 2 2017 233 SCR 23.14993 2 1
#> 3 2018 238 SCR 19.99461 3 2
#> 4 2019 243 SCR 15.61773 4 3
#> 5 2017 231 SCR 19.60600 5 1
#> 6 2018 232 SCR 25.74336 6 5
#> 7 2019 232 SCR 21.91342 7 6
#> 8 2017 227 SCR 25.08169 8 1
#> 9 2018 225 SCR 22.43068 9 8
#> 10 2019 226 SCR 21.91607 10 9
#> without passing the aesthetic 'comparewithid': 10 lines of data
#> description id x y value
#> 35 SCR 1 2016 230 23.00000
#> 34 SCR 2 2017 233 23.14993
#> 33 SCR 3 2018 238 19.99461
#> 31 SCR 4 2019 243 15.61773
#> 39 SCR 5 2017 231 19.60600
#> 38 SCR 6 2018 232 25.74336
#> 32 SCR 7 2019 232 21.91342
#> 36 SCR 8 2017 227 25.08169
#> 37 SCR 9 2018 225 22.43068
#> 40 SCR 10 2019 226 21.91607
#> and with passing the aesthetic 'comparewithid': 9 lines of data
#> description id x y value
#> 28 SCR 2 2017 233 23.00000
#> 31 SCR 3 2018 238 23.14993
#> 32 SCR 4 2019 243 19.99461
#> 29 SCR 5 2017 231 23.00000
#> 33 SCR 6 2018 232 19.60600
#> 34 SCR 7 2019 232 25.74336
#> 35 SCR 8 2017 227 23.00000
#> 30 SCR 9 2018 225 25.08169
#> 36 SCR 10 2019 226 22.43068
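The move of values between rows can be reproduced with a few lines of base R. This is only a sketch of the idea using the SCR rows printed above, not the package's internal code; `scr` and `compared` are illustrative names.

```r
# SCR rows of sii_z_ex1_data as printed above
scr <- data.frame(
  id = 1:10,
  time = c(2016, 2017, 2018, 2019, 2017, 2018, 2019, 2017, 2018, 2019),
  ratio = c(230, 233, 238, 243, 231, 232, 232, 227, 225, 226),
  value = c(23.00000, 23.14993, 19.99461, 15.61773, 19.60600,
            25.74336, 21.91342, 25.08169, 22.43068, 21.91607),
  comparewithid = c(NA, 1, 2, 3, 1, 5, 6, 1, 8, 9)
)

# Rows without a comparewithid (here id = 1) have nothing to be compared with
compared <- scr[!is.na(scr$comparewithid), ]

# Each remaining row keeps its own location (time, ratio) but takes the value
# of the row it is compared with
compared$value <- scr$value[match(compared$comparewithid, scr$id)]

compared[, c("id", "time", "ratio", "value")]
```

The nine resulting rows carry the same values as the second listing above: the value of id = 1 appears at the locations of ids 2, 5 and 8.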
The foundation of the package is the structure: a representation of how the SCR is built up from its risks and subrisks. This structure is a data.frame that is passed as a parameter to the geoms `geom_sii_risksurface` and `geom_sii_riskoutline`. The default data.frame is `sii_structure_sf16_eng`, where ‘sf16’ stands for the standard formula as of 2016, and ‘eng’ for English descriptions.
head(sii_structure_sf16_eng, 15)
#> # A tibble: 15 x 3
#> description level childlevel
#> <chr> <chr> <chr>
#> 1 SCR 1 2
#> 2 BSCR 2 3
#> 3 operational 2 <NA>
#> 4 Adjustment-LACDT 2d <NA>
#> 5 BSCR_div 3d <NA>
#> 6 market 3 4.01
#> 7 life 3 4.02
#> 8 non-life 3 4.03
#> 9 health 3 4.04
#> 10 cp-default 3 <NA>
#> 11 intangibles 3 <NA>
#> 12 market_div 4.01d <NA>
#> 13 m_interestrate 4.01 <NA>
#> 14 m_equity 4.01 <NA>
#> 15 m_property 4.01 <NA>
A Dutch version, `sii_structure_sf16_nld`, is present in the package.
The hierarchy of the elements in `description` is determined by `level` and their components (`childlevel`). SCR has the mandatory level (character value) “1”. Rows whose level has the suffix ‘d’ indicate a diversification item.
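To illustrate how `level` and `childlevel` link the rows, the sketch below looks up the components of an item in the first 15 rows of the structure shown above. The helper `children_of` is hypothetical and not part of the package. Treating a ‘d’ row as belonging to the same parent as the level it suffixes is my reading of the table, not a documented rule.

```r
# First 15 rows of sii_structure_sf16_eng as printed above
structure_head <- data.frame(
  description = c("SCR", "BSCR", "operational", "Adjustment-LACDT", "BSCR_div",
                  "market", "life", "non-life", "health", "cp-default",
                  "intangibles", "market_div", "m_interestrate", "m_equity",
                  "m_property"),
  level = c("1", "2", "2", "2d", "3d", "3", "3", "3", "3", "3", "3",
            "4.01d", "4.01", "4.01", "4.01"),
  childlevel = c("2", "3", NA, NA, NA, "4.01", "4.02", "4.03", "4.04",
                 NA, NA, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: descriptions of the components of one item
children_of <- function(structure_df, item) {
  child <- structure_df$childlevel[structure_df$description == item]
  if (length(child) != 1 || is.na(child)) {
    return(character(0))  # unknown item, or an item without components
  }
  # assumption: the diversification row uses the child level plus a "d" suffix
  wanted <- c(child, paste0(child, "d"))
  structure_df$description[structure_df$level %in% wanted]
}

children_of(structure_head, "SCR")
children_of(structure_head, "market")
```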
For other localizations or for use with internal models, another structure can be passed to the geom. See my interpretation of the internal model of the Dutch insurer “Nationale Nederlanden” in `sii_z_ex6_structure`. Changing the level numbering or the descriptions of items may make it necessary to change other parameter tables as well (i.e. levelmax, plotdetails, coloring sets).
When reporting the SCR composition of a large insurance company, many risks will be present. This can lead to a very cluttered plot in which all information is present but which is difficult to interpret. The package provides the means to restrict the number of items to ‘k’ (in general or for each level separately) through the parameter `levelmax`. This can be an integer, which is applied to all levels, or a data.frame with a value per level. The default value is 99, so by default grouping only takes place for a risk with roughly a hundred sub-risks or more.
Parameter `levelmax = sii_levelmax_sf16_995` shows all higher levels (lower level numbers) but restricts the lower levels (higher numbers) to 4 individual risks and 1 grouping of the smallest risks in that level.
sii_levelmax_sf16_995
#> # A tibble: 8 x 2
#> level levelmax
#> <chr> <dbl>
#> 1 1 99
#> 2 2 99
#> 3 3 99
#> 4 4.01 5
#> 5 4.02 5
#> 6 4.03 5
#> 7 4.04 5
#> 8 5 5
Combining the structure and the levelmax-information leads to an expanded structure of which the lines for levels 3 and 4.01 are shown here:
#> # A tibble: 15 x 4
#> description level childlevel levelmax
#> <chr> <chr> <chr> <dbl>
#> 1 market 3 4.01 99
#> 2 life 3 4.02 99
#> 3 non-life 3 4.03 99
#> 4 health 3 4.04 99
#> 5 cp-default 3 <NA> 99
#> 6 intangibles 3 <NA> 99
#> 7 market_div 4.01d <NA> 99
#> 8 m_interestrate 4.01 <NA> 5
#> 9 m_equity 4.01 <NA> 5
#> 10 m_property 4.01 <NA> 5
#> 11 m_spread 4.01 <NA> 5
#> 12 m_currency 4.01 <NA> 5
#> 13 m_concentration 4.01 <NA> 5
#> 14 m_illiquidity 4.01 <NA> 5
#> 15 market_other 4.01o <NA> 99
The row with level `4.01o` is the added row. Its description is derived from the row where `childlevel` = 4.01 and the value of the parameter `aggregatesuffix` (default value is “other”).
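The expansion can be sketched in base R as follows. The block uses a reduced structure (levels 3 and 4.01 only) and a reduced levelmax table. Two points are assumptions on my part, not documented behaviour: an ‘o’ row is added only when a level holds more rows than its levelmax allows, and levels missing from the levelmax table get 99. The names `struct`, `levelmax_tbl` and `expanded` are illustrative.

```r
struct <- data.frame(
  description = c("market", "life", "market_div", "m_interestrate", "m_equity",
                  "m_property", "m_spread", "m_currency", "m_concentration",
                  "m_illiquidity"),
  level = c("3", "3", "4.01d", rep("4.01", 7)),
  childlevel = c("4.01", "4.02", rep(NA_character_, 8)),
  stringsAsFactors = FALSE
)
levelmax_tbl <- data.frame(level = c("3", "4.01", "4.02"),
                           levelmax = c(99, 5, 5),
                           stringsAsFactors = FALSE)
aggregatesuffix <- "other"

# Step 1: attach levelmax to every structure row; unlisted levels get 99 (assumption)
expanded <- struct
expanded$levelmax <- levelmax_tbl$levelmax[match(expanded$level, levelmax_tbl$level)]
expanded$levelmax[is.na(expanded$levelmax)] <- 99

# Step 2: add one grouping row per level that can hold more items than allowed
for (lvl in levelmax_tbl$level) {
  n_items <- sum(struct$level == lvl)
  limit <- levelmax_tbl$levelmax[levelmax_tbl$level == lvl]
  if (n_items > limit) {
    parent <- struct$description[which(struct$childlevel == lvl)]
    expanded <- rbind(expanded, data.frame(
      description = paste(parent, aggregatesuffix, sep = "_"),
      level = paste0(lvl, "o"),
      childlevel = NA_character_,
      levelmax = 99,
      stringsAsFactors = FALSE
    ))
  }
}

expanded
```

With seven market sub-risks and a levelmax of 5, one row `market_other` with level `4.01o` is appended, as in the listing above.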
The data (in tidyverse format!) is combined with the expanded structure by means of a left join on the side of the data. Because the data is not expected to contain `o`-lines, these will not be present in the merged table. When a possible grouping line is present in the expanded structure, a check is made whether the data contains so many risks for that level that actual grouping is needed. (The dataset can contain fewer risks than the structure that is used; e.g. a pure life insurance company can use the standard `sii_structure_sf16_eng` without any problems.)
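A minimal sketch of the left join and the grouping check, in base R. The market sub-risk values are made up for illustration, and `group_smallest` is a hypothetical helper. Adding up the values of the grouped risks is an assumption; the package may aggregate differently.

```r
# Made-up values for six market sub-risks of a single SCR composition
market_data <- data.frame(
  description = c("m_interestrate", "m_equity", "m_property", "m_spread",
                  "m_currency", "m_concentration"),
  value = c(10, 25, 8, 15, 4, 2),
  stringsAsFactors = FALSE
)

# Level 4.01 rows of the expanded structure, including the added "o" row
expanded_structure <- data.frame(
  description = c("m_interestrate", "m_equity", "m_property", "m_spread",
                  "m_currency", "m_concentration", "m_illiquidity",
                  "market_other"),
  level = c(rep("4.01", 7), "4.01o"),
  levelmax = c(rep(5, 7), 99),
  stringsAsFactors = FALSE
)

# Left join on the side of the data: rows that exist only in the structure
# (m_illiquidity, market_other) do not appear in the result
merged <- merge(market_data, expanded_structure, by = "description", all.x = TRUE)

# Hypothetical helper: keep the (levelmax - 1) largest items, group the rest
group_smallest <- function(items, levelmax, other_label) {
  if (nrow(items) <= levelmax) {
    return(items)  # few enough risks: no grouping needed
  }
  items <- items[order(items$value, decreasing = TRUE), ]
  n_keep <- levelmax - 1
  keep <- head(items, n_keep)
  rest <- tail(items, nrow(items) - n_keep)
  rbind(keep, data.frame(description = other_label,
                         value = sum(rest$value),  # assumption: plain sum
                         stringsAsFactors = FALSE))
}

limit <- unique(merged$levelmax)
grouped <- group_smallest(merged[c("description", "value")], limit, "market_other")
grouped
```

Six risks against a levelmax of 5 leaves the four largest risks untouched and puts the two smallest into one `market_other` row.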
Now that it is known which lines of the combined structure/data data.frame should be plotted, it is time to convert the data into circle segments. The data row with the largest SCR value is drawn as a full circle with radius = 1, whatever the values of x and y. When several calls to `geom_sii_risksurface` and/or `geom_sii_riskoutline` are combined, the parameter `maxscrvalue` overrides this extracted value. The surface of every plot element is scaled to the value of the item. Additional manual horizontal and vertical scaling is possible, depending on the range of the x and y values of the axes, to retain the round shape.
For other levels the circle segments are defined by an inner and outer radius and the (compass) degrees of the first and last radial line (clockwise). The inner radius is the outer radius of the next higher level. The number of compass degrees is determined by the share of each item's value in the total of its (equal-level) ‘peers’. The value, i.e. the surface, dictates the outer radius.
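The geometry can be made concrete with a small worked sketch. It assumes that the surface of an item equals π · value / maxscrvalue (so that the largest SCR is a circle with radius 1) and uses the area of a ring segment, (θ / 2) · (r_outer² − r_inner²). It handles positive values only and leaves out diversification items. The function `segment_geometry` is illustrative and not the package's implementation.

```r
# Illustrative helper: angles and radii for the peers on one level
segment_geometry <- function(values, r_inner, maxscrvalue) {
  stopifnot(all(values > 0))
  labels <- names(values)
  values <- unname(values)
  share <- values / sum(values)          # share of each item among its peers
  end_deg <- cumsum(share) * 360         # compass degrees, clockwise
  start_deg <- c(0, head(end_deg, -1))
  theta <- share * 2 * pi                # opening angle in radians
  area <- pi * values / maxscrvalue      # surface scaled to the value
  # solve area = theta / 2 * (r_outer^2 - r_inner^2) for r_outer
  r_outer <- sqrt(r_inner^2 + 2 * area / theta)
  data.frame(description = labels, start_deg = start_deg, end_deg = end_deg,
             r_inner = r_inner, r_outer = r_outer, area = area,
             stringsAsFactors = FALSE)
}

maxscrvalue <- 100                       # largest SCR value in the data
scr_value <- 100
r_scr <- sqrt(scr_value / maxscrvalue)   # radius of the SCR circle

level2 <- segment_geometry(c(BSCR = 80, operational = 25),
                           r_inner = r_scr, maxscrvalue = maxscrvalue)
level2
```

Under these assumptions all peers on a level end up with the same outer radius, because both the angle and the surface are proportional to the value.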
When applicable, a rotation is performed: a rotation such that the first radial line of a specific (sub)risk points to 12 o’clock, and/or an added fixed rotation.
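Both kinds of rotation amount to adding an offset to the compass degrees of every segment. The sketch below shows this on two made-up segments. `rotate_segments` and its arguments `to_top` and `fixed_deg` are hypothetical names, not parameters of the package.

```r
# Two made-up segments that together form a full circle
segments <- data.frame(
  description = c("A", "B"),
  start_deg = c(0, 90),
  end_deg = c(90, 360),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rotate all segments by a common offset
rotate_segments <- function(segments, to_top = NULL, fixed_deg = 0) {
  offset <- fixed_deg
  if (!is.null(to_top)) {
    stopifnot(sum(segments$description == to_top) == 1)
    # move the first radial line of the chosen item to 0 degrees (12 o'clock)
    offset <- offset - segments$start_deg[segments$description == to_top]
  }
  segments$start_deg <- segments$start_deg + offset
  segments$end_deg <- segments$end_deg + offset
  segments
}

rotate_segments(segments, to_top = "B")
rotate_segments(segments, fixed_deg = 45)
```

Degrees are deliberately not wrapped to the 0 to 360 range, so that each segment keeps `start_deg` smaller than `end_deg`.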
A final transformation to a square form is possible. To keep the surfaces correct, the ‘radial’ lines are adjusted. This might lead to unpredictable results in combination with a description-based rotation or a rotation that is not a multiple of 45 degrees.
The (transformed/rotated) corner points are translated into polygon points (for `geom_sii_risksurface`) or line segments (for `geom_sii_riskoutline`).
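As a sketch of this last conversion, the function below turns the four corner values of one circle segment (inner and outer radius, first and last compass degree) into a closed series of polygon points around a centre (x0, y0). The function name, its arguments and the number of points per arc are illustrative choices, not the package's internals.

```r
# Illustrative helper: polygon points for one circle segment
segment_polygon <- function(x0, y0, r_inner, r_outer, start_deg, end_deg, n = 25) {
  # compass degrees: 0 points to 12 o'clock and angles increase clockwise,
  # hence x uses sin() and y uses cos()
  rad <- seq(start_deg, end_deg, length.out = n) * pi / 180
  outer <- cbind(x = x0 + r_outer * sin(rad),
                 y = y0 + r_outer * cos(rad))
  # walk back along the inner arc so the points form one closed outline
  inner <- cbind(x = x0 + r_inner * sin(rev(rad)),
                 y = y0 + r_inner * cos(rev(rad)))
  as.data.frame(rbind(outer, inner))
}

# Quarter ring between 12 and 3 o'clock around the origin
quarter <- segment_polygon(x0 = 0, y0 = 0, r_inner = 1, r_outer = 2,
                           start_deg = 0, end_deg = 90)
head(quarter, 3)
```

A data.frame like `quarter` can be drawn with `ggplot2::geom_polygon()`; for an outline only selected edges of the same point series would be drawn.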
The final step is to define which of these polygons or line segments will actually be plotted. By default everything is plotted, but a data.frame passed to the parameter `plotdetails` can control this per `level` or per `description`.
In the showcase two data.frames are used, differing only in the column `surface` but equal for outline1 to outline13. One of them is shown here.
sii_z_ex1_plotdetails
#> levelordescription surface outline1 outline2 outline3 outline4
#> 1 1 TRUE NA TRUE NA NA
#> 2 2 TRUE TRUE NA TRUE NA
#> 3 2d TRUE NA NA NA NA
#> 4 3 TRUE TRUE TRUE TRUE NA
#> 5 3d TRUE NA NA NA NA
#> 6 4.01 FALSE NA TRUE NA NA
#> 7 4.01d FALSE NA NA NA NA
#> 8 4.01o FALSE NA TRUE NA NA
#> 9 4.02 FALSE NA TRUE NA NA
#> 10 4.02d FALSE NA NA NA NA
#> 11 4.02o FALSE NA TRUE NA NA
#> 12 operational NA TRUE TRUE TRUE NA
#> 13 cp-default NA TRUE TRUE TRUE NA
#> outline11 outline13
#> 1 TRUE TRUE
#> 2 NA NA
#> 3 NA NA
#> 4 NA NA
#> 5 NA NA
#> 6 TRUE TRUE
#> 7 NA NA
#> 8 TRUE TRUE
#> 9 TRUE TRUE
#> 10 NA NA
#> 11 TRUE TRUE
#> 12 NA NA
#> 13 NA NA
`surface` is used by `geom_sii_risksurface`, the other columns by `geom_sii_riskoutline`. The table can best be read as follows: for each risk the line of the corresponding `level` is used, possibly overruled by the line with the matching `description` when that line holds an explicit `TRUE` or `FALSE`.
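This reading rule can be written down as a small lookup. The sketch uses a few rows and two columns of the table above; `lookup_detail` is a hypothetical helper. How the package treats a resulting `NA` is not covered here.

```r
# A few rows and two columns of sii_z_ex1_plotdetails as printed above
plotdetails <- data.frame(
  levelordescription = c("1", "2", "2d", "3", "3d", "4.01",
                         "operational", "cp-default"),
  surface  = c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, NA, NA),
  outline1 = c(NA, TRUE, NA, TRUE, NA, NA, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: level line first, overruled by an explicit description line
lookup_detail <- function(plotdetails, level, description, column) {
  by_level <- plotdetails[[column]][match(level, plotdetails$levelordescription)]
  by_description <- plotdetails[[column]][match(description, plotdetails$levelordescription)]
  if (!is.na(by_description)) by_description else by_level
}

lookup_detail(plotdetails, "2", "operational", "surface")   # falls back to level 2
lookup_detail(plotdetails, "3", "cp-default", "outline1")   # explicit description line
lookup_detail(plotdetails, "4.01", "m_equity", "surface")   # level 4.01 line
```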