Interval Consensus Model
Estimation of Weighted Consensus Intervals for Interval Ratings
Source:vignettes/01-Interval-Consensus-Model.Rmd
We want to fit the Interval Consensus Model (ICM) to the verbal quantifiers dataset. First, we load the required packages and the data:
packages <- c("dplyr", "kableExtra")

# load packages and install if not available
for (pkg in packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}
data(quantifiers)
quantifiers <- quantifiers |>
  # exclude control items
  dplyr::filter(!name_en %in% c("always", "never", "fifty-fifty chance")) |>
  # sample 30 respondents
  dplyr::filter(id_person %in% sample(
    unique(quantifiers$id_person),
    size = 30,
    replace = FALSE
  )) |>
  # exclude missing values
  dplyr::filter(!is.na(x_L) & !is.na(x_U)) |>
  # recompute IDs
  dplyr::mutate(
    id_person = factor(id_person) |> as.numeric(),
    id_item = factor(id_item) |> as.numeric()
  )
head(quantifiers) |>
kable(digits = 2) |>
kable_styling()
id_person | id_item | name_ger | name_en | truth | scale_min | scale_max | width_min | x_L | x_U |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | ab und zu | now and then | NA | 0 | 100 | 0 | 30 | 61 |
1 | 2 | eventuell | possibly | NA | 0 | 100 | 0 | 39 | 61 |
1 | 3 | fast immer | almost always | NA | 0 | 100 | 0 | 85 | 94 |
1 | 4 | fast nie | almost never | NA | 0 | 100 | 0 | 3 | 10 |
1 | 5 | gelegentlich | occasionally | NA | 0 | 100 | 0 | 17 | 45 |
1 | 6 | haeufig | frequently | NA | 0 | 100 | 0 | 56 | 100 |
What does the dataset look like? We can visualize the interval responses using the `plot_intervals_cumulative` function:
plot_intervals_cumulative(
lower = quantifiers$x_L,
upper = quantifiers$x_U,
min = quantifiers$scale_min,
max = quantifiers$scale_max,
cluster_id = quantifiers$name_en,
weighted = TRUE
)
#> Warning: Removed 388000 rows containing missing values or values outside the scale range
#> (`geom_vline()`).
Next, we convert the interval responses to the simplex format, in which each interval is represented by three proportions that sum to one: the distance from the scale minimum to the lower bound, the interval width, and the distance from the upper bound to the scale maximum.
quantifiers <- cbind(
  quantifiers,
  itvl_to_splx(
    quantifiers[, c("x_L", "x_U")],
    min = quantifiers$scale_min,
    max = quantifiers$scale_max
  )
)
head(quantifiers[,9:13]) |>
round(2) |>
kable() |>
kable_styling()
x_L | x_U | x_1 | x_2 | x_3 |
---|---|---|---|---|
30 | 61 | 0.30 | 0.31 | 0.39 |
39 | 61 | 0.39 | 0.22 | 0.39 |
85 | 94 | 0.85 | 0.09 | 0.06 |
3 | 10 | 0.03 | 0.07 | 0.90 |
17 | 45 | 0.17 | 0.28 | 0.55 |
56 | 100 | 0.56 | 0.44 | 0.00 |
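For a 0-to-100 scale, the three components can be computed by hand. The helper below is purely illustrative (it is not part of the package; `itvl_to_splx` is the function to use in practice) and sketches the transformation:

```r
# illustrative helper (not from the package): map an interval [L, U]
# on a scale [min, max] to a three-part composition summing to 1
interval_to_simplex <- function(L, U, min = 0, max = 100) {
  range <- max - min
  c(
    x_1 = (L - min) / range, # distance from scale minimum to lower bound
    x_2 = (U - L) / range,   # interval width
    x_3 = (max - U) / range  # distance from upper bound to scale maximum
  )
}

interval_to_simplex(30, 61) # first row above: 0.30, 0.31, 0.39
```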
Let’s check if we can apply the Isometric Log-Ratio transformation:
try(ilr(quantifiers[,c("x_1","x_2","x_3")]))
#> Error in check_simplex(simplex[i, ]) :
#> Error: None of the elements in the (row-)vector must be exactly 0! Please apply padding first!
Some components of our simplex data are exactly zero, which the ILR transformation cannot handle. We therefore remove the zeros by adding a small padding constant:
quantifiers[, c("x_1", "x_2", "x_3")] <-
remove_zeros(quantifiers[, c("x_1", "x_2", "x_3")], padding = 0.01)
head(quantifiers[,9:13]) |>
round(2) |>
kable() |>
kable_styling()
x_L | x_U | x_1 | x_2 | x_3 |
---|---|---|---|---|
30 | 61 | 0.30 | 0.31 | 0.39 |
39 | 61 | 0.39 | 0.22 | 0.39 |
85 | 94 | 0.83 | 0.10 | 0.07 |
3 | 10 | 0.04 | 0.08 | 0.88 |
17 | 45 | 0.17 | 0.28 | 0.54 |
56 | 100 | 0.55 | 0.44 | 0.01 |
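Conceptually, the padding adds a small constant to every component and renormalizes each row so it still sums to one. The sketch below illustrates this idea; it is an assumption about how `remove_zeros` behaves, reconstructed from the tables above, and the package's exact implementation may differ:

```r
# illustrative padding (an assumption about remove_zeros):
# add a constant to each component, then renormalize to sum to 1
pad_simplex <- function(x, padding = 0.01) {
  (x + padding) / sum(x + padding)
}

round(pad_simplex(c(0.56, 0.44, 0.00)), 2) # last row above: 0.55, 0.44, 0.01
```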
Now we can fit the ICM to the simplex data:

fit <-
  fit_icm(
    df_simplex = quantifiers[, c("x_1", "x_2", "x_3")],
    id_person = quantifiers$id_person,
    id_item = quantifiers$id_item,
    item_labels = quantifiers |>
      dplyr::distinct(id_item, name_en) |>
      dplyr::pull(name_en),
    link = "ilr",
    padding = .01,
    n_chains = 4,
    n_cores = 4,
    iter_sampling = 500,
    iter_warmup = 300,
    adapt_delta = .9,
    verbose = TRUE
  )
#>
#> CHECKING DATA AND PREPROCESSING FOR MODEL 'icm_ilr' NOW.
#>
#> COMPILING MODEL 'icm_ilr' NOW.
#>
#> STARTING SAMPLER FOR MODEL 'icm_ilr' NOW.
#>
#> SAMPLING FOR MODEL 'icm_ilr' NOW (CHAIN 1).
#>
#> SAMPLING FOR MODEL 'icm_ilr' NOW (CHAIN 2).
#>
#> SAMPLING FOR MODEL 'icm_ilr' NOW (CHAIN 3).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.002641 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 26.41 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'icm_ilr' NOW (CHAIN 4).
#> Chain 2:
#> Chain 2: Gradient evaluation took 0.00274 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 27.4 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 1: Iteration: 1 / 800 [ 0%] (Warmup)
#> Chain 3:
#> Chain 3: Gradient evaluation took 0.002665 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 26.65 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3:
#> Chain 3:
#> Chain 4:
#> Chain 4: Gradient evaluation took 0.002655 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 26.55 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4:
#> Chain 4:
#> Chain 2: Iteration: 1 / 800 [ 0%] (Warmup)
#> Chain 3: Iteration: 1 / 800 [ 0%] (Warmup)
#> Chain 4: Iteration: 1 / 800 [ 0%] (Warmup)
#> Chain 2: Iteration: 80 / 800 [ 10%] (Warmup)
#> Chain 3: Iteration: 80 / 800 [ 10%] (Warmup)
#> Chain 4: Iteration: 80 / 800 [ 10%] (Warmup)
#> Chain 1: Iteration: 80 / 800 [ 10%] (Warmup)
#> Chain 2: Iteration: 160 / 800 [ 20%] (Warmup)
#> Chain 3: Iteration: 160 / 800 [ 20%] (Warmup)
#> Chain 4: Iteration: 160 / 800 [ 20%] (Warmup)
#> Chain 1: Iteration: 160 / 800 [ 20%] (Warmup)
#> Chain 2: Iteration: 240 / 800 [ 30%] (Warmup)
#> Chain 3: Iteration: 240 / 800 [ 30%] (Warmup)
#> Chain 4: Iteration: 240 / 800 [ 30%] (Warmup)
#> Chain 1: Iteration: 240 / 800 [ 30%] (Warmup)
#> Chain 2: Iteration: 301 / 800 [ 37%] (Sampling)
#> Chain 3: Iteration: 301 / 800 [ 37%] (Sampling)
#> Chain 4: Iteration: 301 / 800 [ 37%] (Sampling)
#> Chain 1: Iteration: 301 / 800 [ 37%] (Sampling)
#> Chain 2: Iteration: 380 / 800 [ 47%] (Sampling)
#> Chain 3: Iteration: 380 / 800 [ 47%] (Sampling)
#> Chain 4: Iteration: 380 / 800 [ 47%] (Sampling)
#> Chain 1: Iteration: 380 / 800 [ 47%] (Sampling)
#> Chain 2: Iteration: 460 / 800 [ 57%] (Sampling)
#> Chain 3: Iteration: 460 / 800 [ 57%] (Sampling)
#> Chain 4: Iteration: 460 / 800 [ 57%] (Sampling)
#> Chain 1: Iteration: 460 / 800 [ 57%] (Sampling)
#> Chain 2: Iteration: 540 / 800 [ 67%] (Sampling)
#> Chain 3: Iteration: 540 / 800 [ 67%] (Sampling)
#> Chain 4: Iteration: 540 / 800 [ 67%] (Sampling)
#> Chain 1: Iteration: 540 / 800 [ 67%] (Sampling)
#> Chain 2: Iteration: 620 / 800 [ 77%] (Sampling)
#> Chain 3: Iteration: 620 / 800 [ 77%] (Sampling)
#> Chain 4: Iteration: 620 / 800 [ 77%] (Sampling)
#> Chain 1: Iteration: 620 / 800 [ 77%] (Sampling)
#> Chain 2: Iteration: 700 / 800 [ 87%] (Sampling)
#> Chain 3: Iteration: 700 / 800 [ 87%] (Sampling)
#> Chain 4: Iteration: 700 / 800 [ 87%] (Sampling)
#> Chain 1: Iteration: 700 / 800 [ 87%] (Sampling)
#> Chain 2: Iteration: 780 / 800 [ 97%] (Sampling)
#> Chain 3: Iteration: 780 / 800 [ 97%] (Sampling)
#> Chain 2: Iteration: 800 / 800 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 21.73 seconds (Warm-up)
#> Chain 2: 17.686 seconds (Sampling)
#> Chain 2: 39.416 seconds (Total)
#> Chain 2:
#> Chain 3: Iteration: 800 / 800 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 22.435 seconds (Warm-up)
#> Chain 3: 17.276 seconds (Sampling)
#> Chain 3: 39.711 seconds (Total)
#> Chain 3:
#> Chain 4: Iteration: 780 / 800 [ 97%] (Sampling)
#> Chain 4: Iteration: 800 / 800 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 23.42 seconds (Warm-up)
#> Chain 4: 16.87 seconds (Sampling)
#> Chain 4: 40.29 seconds (Total)
#> Chain 4:
#> Chain 1: Iteration: 780 / 800 [ 97%] (Sampling)
#> Chain 1: Iteration: 800 / 800 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 23.746 seconds (Warm-up)
#> Chain 1: 17.072 seconds (Sampling)
#> Chain 1: 40.818 seconds (Total)
#> Chain 1:
Now we can extract the estimated consensus intervals from the fit object. The function returns a list containing the posterior samples and a summary table of the consensus intervals based on the posterior medians.
consensus <- extract_consensus(fit, print_summary = FALSE)
attributes(consensus)
#> $names
#> [1] "df_rvar" "summary"
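The two list elements can be accessed directly (a usage sketch; it assumes `consensus$summary` is a data frame with the columns shown in the table below):

```r
# inspect the posterior rvars and the summary of median bounds
str(consensus, max.level = 1)

consensus$summary |>
  kable(digits = 2) |>
  kable_styling()
```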
If we want a summary of the consensus intervals, we can use the `summary` function, which is a wrapper around `extract_consensus`:
 | T_L_median | T_L_CI_025 | T_L_CI_975 | T_U_median | T_U_CI_025 | T_U_CI_975 |
---|---|---|---|---|---|---|
now and then | 0.22 | 0.17 | 0.28 | 0.43 | 0.36 | 0.51 |
possibly | 0.23 | 0.15 | 0.32 | 0.55 | 0.45 | 0.66 |
almost always | 0.88 | 0.84 | 0.91 | 0.98 | 0.97 | 0.99 |
almost never | 0.03 | 0.02 | 0.05 | 0.11 | 0.08 | 0.16 |
occasionally | 0.18 | 0.13 | 0.23 | 0.40 | 0.32 | 0.47 |
frequently | 0.66 | 0.60 | 0.71 | 0.91 | 0.87 | 0.93 |
hardly | 0.04 | 0.03 | 0.06 | 0.17 | 0.14 | 0.21 |
sometimes | 0.21 | 0.15 | 0.27 | 0.45 | 0.37 | 0.52 |
most of the time | 0.69 | 0.63 | 0.74 | 0.91 | 0.87 | 0.94 |
often | 0.68 | 0.62 | 0.73 | 0.91 | 0.87 | 0.93 |
potentially | 0.24 | 0.16 | 0.35 | 0.66 | 0.53 | 0.78 |
rarely | 0.05 | 0.03 | 0.07 | 0.19 | 0.15 | 0.24 |
constantly | 0.78 | 0.72 | 0.83 | 0.97 | 0.95 | 0.99 |
We can also plot the estimated consensus intervals. The generic function `plot` calls `plot_consensus` with the default method `median_bounds`.
plot(fit, method = "median_bounds")
We can call `plot_consensus` directly to use the alternative plotting method `draws_distribution`, which plots the consensus intervals based on the posterior draws. For every posterior draw, a sample is drawn from a uniform distribution with the interval bounds of that draw as minimum and maximum. The result is an rvar containing values from the respective consensus interval, which is visualized in the plot. The argument `CI` specifies the credible interval for this distribution, and `CI_gradient` specifies the credible interval outside which the uncertainty is visualized by a gradient.
plot_consensus(fit, method = "draws_distribution", CI = .95)
plot_consensus(fit, method = "draws_distribution", CI = c(.5, .95))