Chapter 5 Visualization

5.1 Frequency maps

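A process map of an event log can be created with process_map(). A process map is a directly-follows graph: each distinct activity is represented by a node, and each directly-follows relation between two activities is shown as a directed edge, i.e., an arrow between the nodes.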
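To make the directly-follows idea concrete, the following base R sketch counts nodes and edges for a tiny, made-up event log. The data frame `log` and its columns are hypothetical, and the sketch leaves out the artificial start and end nodes that process_map() draws.

```r
# Toy event log (hypothetical data): one row per activity instance,
# already sorted by time within each case.
log <- data.frame(
  case     = c(1, 1, 1, 2, 2, 2, 2),
  activity = c("Register", "Triage", "Discharge",
               "Register", "Triage", "Triage", "Discharge"),
  stringsAsFactors = FALSE
)

# For every case, pair each activity with the one that directly follows it.
pairs <- do.call(rbind, lapply(split(log$activity, log$case), function(acts) {
  data.frame(from = head(acts, -1), to = tail(acts, -1),
             stringsAsFactors = FALSE)
}))

# Nodes: how often each activity occurs. Edges: how often each pair occurs.
node_counts <- table(log$activity)
edge_counts <- as.data.frame(table(from = pairs$from, to = pairs$to),
                             stringsAsFactors = FALSE)
edge_counts <- edge_counts[edge_counts$Freq > 0, ]

print(node_counts)
print(edge_counts)
```

Each row of `edge_counts` corresponds to one arrow in the map, and each entry of `node_counts` to one node.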

There are six types of frequency map:

  1. absolute frequency
  2. absolute-case frequency
  3. relative frequency
  4. relative-case frequency
  5. relative-antecedent frequency
  6. relative-consequent frequency

5.1.1 Absolute

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0     ✔ purrr   1.0.2
## ✔ tibble  3.2.1     ✔ dplyr   1.1.3
## ✔ tidyr   1.3.0     ✔ stringr 1.5.0
## ✔ readr   2.1.3     ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(bupaverse)
## ── Attaching packages ─────────────────────────────────────── bupaverse 0.1.0 ──
## ✔ bupaR         0.5.3     ✔ processcheckR 0.1.4
## ✔ edeaR         0.9.1     ✔ processmapR   0.5.2
## ✔ eventdataR    0.3.1
## ── Conflicts ────────────────────────────────────────── bupaverse_conflicts() ──
## ✖ processcheckR::contains() masks dplyr::contains(), tidyr::contains()
## ✖ bupaR::filter()           masks dplyr::filter(), stats::filter()
## ✖ processmapR::frequency()  masks stats::frequency()
## ✖ edeaR::setdiff()          masks dplyr::setdiff(), base::setdiff()
## ✖ bupaR::timestamp()        masks utils::timestamp()
## ✖ processcheckR::xor()      masks base::xor()
patients %>%
    process_map(frequency("absolute"))

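The absolute map shows how many times each node (activity) and each arc (directly-follows relation) occurs in the log.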

5.1.2 Absolute case

library(bupaverse)
library(tidyverse)
patients %>%
    process_map(frequency("absolute-case"))

The absolute-case map counts from the perspective of cases: each node (activity) and each arc shows the number of cases in which it occurs.
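The difference between the two counts only shows up when an activity repeats within a case. The base R sketch below uses a hypothetical toy log (not the patients data) to illustrate this for nodes.

```r
# Toy event log (hypothetical data): "Triage" occurs twice in case 2.
log <- data.frame(
  case     = c(1, 1, 2, 2, 2),
  activity = c("Register", "Triage", "Register", "Triage", "Triage"),
  stringsAsFactors = FALSE
)

# Absolute frequency: count every activity instance.
absolute <- table(log$activity)

# Absolute-case frequency: count each activity at most once per case.
absolute_case <- table(unique(log[, c("case", "activity")])$activity)

print(absolute)       # Triage is counted 3 times
print(absolute_case)  # Triage is counted in 2 cases
```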

5.1.3 Relative

patients %>%
    process_map(frequency("relative"))

5.1.4 Relative case

patients %>%
    process_map(frequency("relative-case"))
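As a rough illustration of the two relative variants for nodes, the sketch below assumes that "relative" expresses each activity as a share of all activity instances and "relative-case" as a share of all cases. This interpretation and the toy data are assumptions, not taken from the package source.

```r
# Toy event log (hypothetical data).
log <- data.frame(
  case     = c(1, 1, 2, 2, 2),
  activity = c("Register", "Triage", "Register", "Triage", "Triage"),
  stringsAsFactors = FALSE
)

# Relative: share of all activity instances (assumed interpretation).
relative <- prop.table(table(log$activity))

# Relative-case: share of cases that contain the activity at least once
# (assumed interpretation).
n_cases <- length(unique(log$case))
relative_case <- table(unique(log[, c("case", "activity")])$activity) / n_cases

print(relative)
print(relative_case)
```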

5.2 Performance maps

A process map can also be used to show the performance of the process.

patients %>%
    process_map(performance())

5.2.1 Aggregation function

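By default the mean is shown. Other statistics can be shown by passing a different aggregation function through the FUN argument.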

patients %>%
    process_map(performance(FUN = max))
## Warning: There was 1 warning in `summarize()`.
## ℹ In argument: `label = do.call(...)`.
## ℹ In group 10: `ACTIVITY_CLASSIFIER_ = NA`, `from_id = NA`.
## Caused by warning in `type()`:
## ! no non-missing arguments to max; returning -Inf
## Warning: There were 2 warnings in `summarize()`.
## The first warning was:
## ℹ In argument: `value = do.call(...)`.
## ℹ In group 1: `ACTIVITY_CLASSIFIER_ = "ARTIFICIAL_END"`, `next_act = NA`,
##   `from_id = 1`, `to_id = NA`.
## Caused by warning in `type()`:
## ! no non-missing arguments to max; returning -Inf
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
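The warnings above appear to come from groups that have no values to aggregate, such as the artificial end node, which has no next activity. The base R sketch below reproduces the underlying behavior of max() on an empty vector. The helper `safe_max` is hypothetical and not part of bupaR; whether it can be passed as FUN without side effects has not been verified here.

```r
# max() on an empty vector warns and returns -Inf, which matches the
# "no non-missing arguments to max" message in the warnings.
empty <- numeric(0)
result <- suppressWarnings(max(empty))
print(result)

# Hypothetical helper (not part of bupaR): return NA instead of -Inf
# when there is nothing to aggregate.
safe_max <- function(x, ...) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else max(x, ...)
}

print(safe_max(empty))
print(safe_max(c(2, 5, NA)))
```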

5.3 Advanced maps

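Nodes and arcs can be configured separately, for example to show frequencies on the nodes and performance on the edges.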

patients %>%
    process_map(type_nodes = frequency("relative-case"),
                type_edges = performance(mean))

5.4 Animated maps

library(bupaverse)
library(processanimateR)

animate_process(patients)

The appearance of the tokens can also be changed, for example their size, shape, and color.

animate_process(patients,
                mapping = token_aes(size = token_scale(12),
                                    shape = "rect",
                                    color = token_scale("red")))

5.5 Process Matrix

A process matrix is a two-dimensional matrix that shows the flows between activities.

5.5.1 Frequency process matrix

traffic_fines %>%
    process_matrix(frequency("absolute")) 
## # A tibble: 47 × 3
##    antecedent      consequent                                n
##    <fct>           <fct>                                 <dbl>
##  1 Add penalty     Insert Date Appeal to Prefecture         41
##  2 Add penalty     Notify Result Appeal to Offender          3
##  3 Add penalty     Payment                                1117
##  4 Add penalty     Receive Result Appeal from Prefecture    15
##  5 Add penalty     Send Appeal to Prefecture               171
##  6 Add penalty     Send for Credit Collection             3288
##  7 Appeal to Judge Add penalty                              13
##  8 Appeal to Judge End                                       5
##  9 Appeal to Judge Insert Date Appeal to Prefecture          1
## 10 Create Fine     Payment                                3443
## # ℹ 37 more rows
traffic_fines %>%
    process_matrix(frequency("absolute")) %>%
    plot()

5.5.2 Performance process matrix

A performance process matrix shows the performance (flow time) between two activities.

traffic_fines %>%
    process_matrix(performance(FUN = mean, units = "weeks"))
## # A tibble: 47 × 4
##    antecedent      consequent                                n flow_time
##    <fct>           <fct>                                 <dbl>     <dbl>
##  1 Add penalty     Insert Date Appeal to Prefecture         41     9.43 
##  2 Add penalty     Notify Result Appeal to Offender          3    11.1  
##  3 Add penalty     Payment                                1117    25.1  
##  4 Add penalty     Receive Result Appeal from Prefecture    15     6.96 
##  5 Add penalty     Send Appeal to Prefecture               171    36.4  
##  6 Add penalty     Send for Credit Collection             3288    69.7  
##  7 Appeal to Judge Add penalty                              13     4.51 
##  8 Appeal to Judge End                                       5     0    
##  9 Appeal to Judge Insert Date Appeal to Prefecture          1     0.286
## 10 Create Fine     Payment                                3443     1.33 
## # ℹ 37 more rows
traffic_fines %>%
    process_matrix(performance(FUN = mean, units = "weeks"))  %>%
    plot()
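To make the flow_time column more tangible, the base R sketch below computes a mean flow time in weeks for a hypothetical two-case log. It assumes one timestamp per event and measures the time from one event to the next within a case; the package may define flow time differently when activities have separate start and end times.

```r
# Toy event log (hypothetical data) with one timestamp per event.
log <- data.frame(
  case      = c(1, 1, 2, 2),
  activity  = c("Create Fine", "Payment", "Create Fine", "Payment"),
  timestamp = as.POSIXct(c("2024-01-01", "2024-01-08",
                           "2024-01-01", "2024-01-22"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Pair every event with the next event of the same case.
flows <- do.call(rbind, lapply(split(log, log$case), function(d) {
  d <- d[order(d$timestamp), ]
  data.frame(
    antecedent = head(d$activity, -1),
    consequent = tail(d$activity, -1),
    weeks = as.numeric(difftime(tail(d$timestamp, -1),
                                head(d$timestamp, -1), units = "weeks")),
    stringsAsFactors = FALSE
  )
}))

# Aggregate per antecedent/consequent pair: count and mean flow time.
n_flows   <- aggregate(weeks ~ antecedent + consequent, data = flows, FUN = length)
mean_flow <- aggregate(weeks ~ antecedent + consequent, data = flows, FUN = mean)

print(n_flows)
print(mean_flow)
```

Swapping `mean` for another function such as `median` or `max` mirrors the role of the FUN argument in performance().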

5.6 Dotted chart

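A dotted chart is a plot in which each activity instance is shown as a dot. The x-axis represents time and the y-axis represents cases. The dotted_chart() function has three main arguments:

  1. x: absolute, relative, relative_day, or relative_week
  2. sort: the order of the cases along the y-axis. Options include start, end, duration, start_day, and start_week
  3. color: the attribute used to color the dots

sepsis %>%
    dotted_chart(x = "absolute") %>% plotly::ggplotly()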

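How to read a dotted chart:

  1. The x-axis is time.
  2. The y-axis lists the cases, one row per case; each case contains a sequence of activities.
  3. Cases can be ordered in many ways; ordering by start time is the most common.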
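The reading guide above can be sketched in base R. The example below uses hypothetical data to compute a relative x-axis (hours since the start of each case) and a y-axis position that orders cases by start time. It is an illustration of the idea, not how dotted_chart() is implemented.

```r
# Toy event log (hypothetical data).
log <- data.frame(
  case      = c("A", "A", "B", "B"),
  activity  = c("Admit", "Release", "Admit", "Release"),
  timestamp = as.POSIXct(c("2024-01-03 08:00", "2024-01-03 20:00",
                           "2024-01-01 09:00", "2024-01-02 09:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Start time of each case, repeated for every event of that case.
case_start <- ave(as.numeric(log$timestamp), log$case, FUN = min)

# Relative x-axis: hours elapsed since the start of the case.
log$rel_hours <- (as.numeric(log$timestamp) - case_start) / 3600

# Order cases along the y-axis by their start time (earliest at the bottom).
starts <- sort(tapply(as.numeric(log$timestamp), log$case, min))
log$y <- match(log$case, names(starts))

# One dot per activity instance.
plot(log$rel_hours, log$y, pch = 19, yaxt = "n",
     xlab = "Hours since case start", ylab = "Case")
axis(2, at = seq_along(starts), labels = names(starts))
```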

5.7 Trace explorer

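The different activity sequences (traces) in the log can be inspected with trace_explorer(). It can be used to explore both frequent and infrequent traces.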

patients %>%
    trace_explorer()
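What counts as a trace, and how often each one occurs, can be illustrated with a few lines of base R. The toy log below is hypothetical, and the sketch only mimics the counting step, not the plot that trace_explorer() produces.

```r
# Toy event log (hypothetical data), sorted by time within each case.
log <- data.frame(
  case     = c(1, 1, 2, 2, 3, 3, 3),
  activity = c("Register", "Discharge",
               "Register", "Discharge",
               "Register", "Triage", "Discharge"),
  stringsAsFactors = FALSE
)

# A trace is the ordered sequence of activities of one case.
traces <- tapply(log$activity, log$case, paste, collapse = " > ")

# Count how many cases follow each distinct trace, most frequent first.
trace_freq <- sort(table(traces), decreasing = TRUE)
print(trace_freq)
```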