
Forecast Models

There are two types of forecast models: models without regression and models with regression. Both types use time and space discretization. Time is discretized into windows of 30 minutes, and the user can choose among three space discretizations: Rect 10 (a regular discretization into 10×10 = 100 rectangles), Uber 7 (provided by Uber's discretization library with scale parameter 7), and District (the districts of Rio de Janeiro). The results are presented as a line plot of the number of emergency calls, a heatmap of the mean number of emergency calls, and empirical distributions of the number of calls (histograms).

Rectangular Discretization (10×10)
Figure 1: Space discretization of a rectangle containing the city of Rio de Janeiro into 10×10 = 100 rectangles, 76 of which intersect the city and are shown in the figure.

Hexagonal Discretization (Parameter=7)
Figure 3: Space discretization of a rectangle containing the city of Rio de Janeiro into hexagons, using Uber's hexagonal space discretization library with scale parameter equal to 7.

Discretization by District
Figure 5: Space discretization of the city of Rio de Janeiro into its 144 administrative districts.
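As a rough illustration of the time and Rect 10 discretizations described above, the following sketch maps a call's timestamp to a 30-minute window of the week and its coordinates to one of the 10×10 rectangles. The function names, the row-major numbering of rectangles, and the bounding box in the usage lines are assumptions made for this example; they are not taken from the application.

```python
from datetime import datetime

def time_window_index(ts: datetime, minutes: int = 30) -> int:
    """Index of the window of the week containing ts (0 = Monday 00:00)."""
    windows_per_day = 24 * 60 // minutes
    return ts.weekday() * windows_per_day + (ts.hour * 60 + ts.minute) // minutes

def rect_index(lon, lat, bbox, n=10):
    """Row-major index of the rectangle containing (lon, lat), or None if outside.

    bbox = (lon_min, lat_min, lon_max, lat_max) is the enclosing rectangle,
    split into n x n equal cells (an assumed numbering scheme).
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
        return None
    # Points on the upper edges are assigned to the last row or column.
    col = min(int((lon - lon_min) / (lon_max - lon_min) * n), n - 1)
    row = min(int((lat - lat_min) / (lat_max - lat_min) * n), n - 1)
    return row * n + col

# Hypothetical bounding box and call location, for illustration only.
BBOX = (-43.8, -23.1, -43.1, -22.7)
call_time = datetime(2024, 1, 1, 8, 15)  # a Monday
print(time_window_index(call_time), rect_index(-43.2, -22.95, BBOX))
```

Only the rectangles that intersect the city would be kept as zones; that filtering step needs the city boundary and is not shown here.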

Methodology for forecast

MODEL WITHOUT REGRESSION

The region under study is partitioned into a set $\mathcal{I}$ of zones. Different zones may have different shapes, sizes, and other attributes. For the basic model described here, it is assumed that arrival rates vary over time according to the time of the week. Time during the week is partitioned into a set $\mathcal{T}$ of time intervals. The model may impose constraints on the differences between arrival rates in different time intervals. For example, it may be required that the arrival rates during [8:00,8:30) on all weekdays be the same, or be close to each other. To facilitate such constraints, partition $\mathcal{T}$ into a collection $\mathcal{G}$ of subsets. For example, $G \in \mathcal{G}$ may be $G = \big\{\text{Monday } [8{:}00,8{:}30);\ \text{Tuesday } [8{:}00,8{:}30);\ \text{Wednesday } [8{:}00,8{:}30);\ \text{Thursday } [8{:}00,8{:}30);\ \text{Friday } [8{:}00,8{:}30)\big\}$.

There are a variety of classification systems for emergency calls, including ICD-10, ICD-11, MPDS, and APCO. Let $\mathcal{C}$ denote the set of arrival classes.
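To make the sets of time intervals and groups concrete, here is a minimal sketch that builds the 30-minute intervals of the week and one possible partition into groups. Grouping the five weekdays for each half-hour slot follows the example in the text; treating each Saturday and Sunday interval as its own group is an assumption made only for this illustration, as are all names in the code.

```python
SLOTS_PER_DAY = 48  # 30-minute windows per day
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# T: every (day, slot) pair of the week, with day 0 = Monday.
T = [(d, s) for d in range(7) for s in range(SLOTS_PER_DAY)]

def build_groups():
    """Partition T: one group per weekday slot, singleton groups on weekends (assumed)."""
    groups = []
    for s in range(SLOTS_PER_DAY):
        groups.append([(d, s) for d in range(5)])  # Monday to Friday share a group
        groups.append([(5, s)])                    # Saturday on its own
        groups.append([(6, s)])                    # Sunday on its own
    return groups

def slot_label(s):
    """Human-readable half-open interval for slot s, e.g. slot 16 is [8:00,8:30)."""
    start, end = 30 * s, 30 * (s + 1)
    return f"[{start // 60}:{start % 60:02d},{end // 60}:{end % 60:02d})"

G = build_groups()
example = G[3 * 16]  # the weekday group for slot 16
print([f"{DAYS[d]} {slot_label(s)}" for d, s in example])
```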

For each $c \in \mathcal{C}$, $i \in \mathcal{I}$, and $t \in \mathcal{T}$, let $N_{c,i,t}$ denote the number of observations for arrival class $c$ in zone $i$ during time interval $t$, and let these observations be indexed $n = 1,\ldots,N_{c,i,t}$.

For each $c \in \mathcal{C}$, $i \in \mathcal{I}$, $t \in \mathcal{T}$, and $n \in \{1,\ldots,N_{c,i,t}\}$, let $M_{c,i,t,n}$ denote the number of arrivals for observation $n$ of arrival class $c$, zone $i$, and time interval $t$, and let $M_{c,i,t} := \sum_{n=1}^{N_{c,i,t}} M_{c,i,t,n}$ denote the total number of arrivals over all observations for arrival class $c$, zone $i$, and time interval $t$.

Assume that $\big\{M_{c,i,t,n} : c \in \mathcal{C},\ i \in \mathcal{I},\ t \in \mathcal{T},\ n = 1,\ldots,N_{c,i,t}\big\}$ are independent Poisson-distributed random variables. Let $\lambda_{c,i,t}$ denote the mean number of calls per time unit (say, an hour) for arrival class $c$, zone $i$, and time interval $t$, and let $\mathcal{D}_t$ denote the duration (in time units) of time interval $t$. Then $M_{c,i,t,n}$ is Poisson with parameter (mean) $\lambda_{c,i,t} \mathcal{D}_t$. We calibrate the intensities $(\lambda_{c,i,t})$ by solving $\displaystyle \min_{\lambda \geq 0}\ \ell(\lambda)$, with objective $\ell(\lambda)$ given by

$$
\begin{aligned}
\ell(\lambda) \;=\; & \sum_{c \in \mathcal{C}} \sum_{i \in \mathcal{I}} \sum_{G \in \mathcal{G}} \sum_{t \in G} \left[ N_{c,i,t} \lambda_{c,i,t} \mathcal{D}_t - M_{c,i,t} \log\left(\lambda_{c,i,t} \mathcal{D}_t\right) + \frac{W_{G}}{2} \sum_{t' \in G} \left(\lambda_{c,i,t} - \lambda_{c,i,t'}\right)^2 \right] \\
& + \sum_{c \in \mathcal{C}} \sum_{i \in \mathcal{I}} \sum_{G \in \mathcal{G}} \sum_{t \in G} \sum_{j \in \mathcal{I}} \frac{w_{i,j}}{2} \left(\lambda_{c,i,t} - \lambda_{c,j,t}\right)^2
\end{aligned}
$$

for some weights $W_G, w_{i,j} \geq 0$.
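The objective above can be evaluated directly from the counts. The sketch below is one way to do so in plain Python, term by term: the Poisson negative log-likelihood, the penalty between time intervals of the same group, and the penalty between zones. The dictionary layout, the function names, and the toy data are assumptions for illustration. When all weights are zero, setting the derivative of each term to zero gives the closed-form minimizer $\lambda_{c,i,t} = M_{c,i,t} / (N_{c,i,t} \mathcal{D}_t)$ (for $N_{c,i,t} > 0$), which the helper `unpenalized_rates` computes.

```python
import math

def objective(lam, N, M, D, groups, W, w):
    """Penalized negative log-likelihood.

    lam, N, M: dicts keyed by (c, i, t); D: dict keyed by t;
    groups: dict mapping a group name to its list of time intervals;
    W: dict of group weights; w: dict of zone-pair weights keyed by (i, j),
    with missing pairs treated as weight 0.
    """
    classes = sorted({c for c, _, _ in lam})
    zones = sorted({i for _, i, _ in lam})
    total = 0.0
    for c in classes:
        for i in zones:
            for g, times in groups.items():
                for t in times:
                    rate = lam[c, i, t]
                    total += N[c, i, t] * rate * D[t]
                    if M[c, i, t] > 0:  # convention: 0 * log(0) = 0
                        total -= M[c, i, t] * math.log(rate * D[t])
                    # Penalty between time intervals of the same group.
                    total += 0.5 * W[g] * sum((rate - lam[c, i, t2]) ** 2 for t2 in times)
                    # Penalty between zones for the same time interval.
                    total += sum(0.5 * w.get((i, j), 0.0) * (rate - lam[c, j, t]) ** 2
                                 for j in zones)
    return total

def unpenalized_rates(N, M, D):
    """Minimizer when all weights are zero: M / (N * D), assuming N > 0."""
    return {(c, i, t): M[c, i, t] / (N[c, i, t] * D[t]) for (c, i, t) in N}

# Toy data (hypothetical): one class, two zones, one group of two 30-minute intervals.
times = ["Mon 8:00", "Tue 8:00"]
groups = {"weekday 8:00": times}
D = {t: 0.5 for t in times}  # durations in hours
keys = [("c", i, t) for i in (0, 1) for t in times]
N = {k: 4 for k in keys}
M = dict(zip(keys, [6, 10, 2, 4]))

lam_hat = unpenalized_rates(N, M, D)  # rates 3, 5, 1, 2 calls per hour
print(objective(lam_hat, N, M, D, groups, {"weekday 8:00": 0.0}, {}))
print(objective(lam_hat, N, M, D, groups, {"weekday 8:00": 1.0}, {(0, 1): 1.0, (1, 0): 1.0}))
```

With positive weights the minimizer no longer has this closed form and a numerical solver over $\lambda \geq 0$ is needed; the choice of solver is not specified here.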

MODEL WITH REGRESSION

Each arrival class $c$, zone $i$, and time interval $t$ may have covariates, and $\lambda_{c,i,t} \mathcal{D}_t$ can be modeled as a function of these covariates. For each $c \in \mathcal{C}$, $i \in \mathcal{I}$, and $t \in \mathcal{T}$, let $x_{c,i,t} := (x_{c,i,t,1},\ldots,x_{c,i,t,K})$ denote the covariate values of class $c$, zone $i$, and time interval $t$. Then consider the model

$$
\lambda(x_{c,i,t})\, \mathcal{D}_t = \beta^{\top} x_{c,i,t}
$$

where $\beta = (\beta_{1},\ldots,\beta_{K})$ are the model parameters. To facilitate such a model, let $N$ denote the total number of observations, and let these observations be indexed $n = 1,\ldots,N$. For each observation $n \in \{1,\ldots,N\}$, let $M_{n}$ denote the number of arrivals for observation $n$ and let $x^{n} := (x^{n}_{1},\ldots,x^{n}_{K})$ denote the covariate values of observation $n$. Typically $x^{n}$ will indicate the class of emergency, the zone, the time interval, and other covariates used to explain the number of arrivals $M_{n}$. Then the negative log-likelihood function is given by

$$
\mathscr{L}(\beta) = \sum_{n=1}^{N} \left[\beta^{\top} x^{n} - M_{n} \log\left(\beta^{\top} x^{n}\right)\right].
$$
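The negative log-likelihood above and its gradient, $\sum_{n} x^{n} \left(1 - M_{n} / \beta^{\top} x^{n}\right)$, are simple to compute. The sketch below evaluates both and minimizes with a basic gradient descent with backtracking. This solver is a stand-in chosen for brevity, not necessarily the method used in the application, and the covariates in the usage lines (an intercept and a holiday indicator) are hypothetical values. Since the model requires $\beta^{\top} x^{n} > 0$ for the logarithm to be defined, the code returns infinity for parameter values that violate this.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def neg_log_likelihood(beta, X, M):
    """Sum over observations of beta.x - M * log(beta.x); inf if some mean is <= 0."""
    total = 0.0
    for x, m in zip(X, M):
        mean = dot(beta, x)
        if mean <= 0:
            return math.inf
        total += mean - m * math.log(mean)
    return total

def gradient(beta, X, M):
    """Gradient of the negative log-likelihood: sum of x * (1 - M / beta.x)."""
    grad = [0.0] * len(beta)
    for x, m in zip(X, M):
        mean = dot(beta, x)
        for k, xk in enumerate(x):
            grad[k] += xk * (1.0 - m / mean)
    return grad

def fit(X, M, beta0, steps=500, step0=1.0):
    """Gradient descent with backtracking (illustrative solver, an assumption)."""
    beta = list(beta0)
    value = neg_log_likelihood(beta, X, M)
    for _ in range(steps):
        g = gradient(beta, X, M)
        step = step0
        while step > 1e-12:
            cand = [b - step * gk for b, gk in zip(beta, g)]
            cand_value = neg_log_likelihood(cand, X, M)
            if cand_value < value:  # accept only strict improvement
                beta, value = cand, cand_value
                break
            step *= 0.5
        else:
            break  # no improving step found: stop
    return beta

# Hypothetical observations: covariates are (intercept, holiday indicator).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
M = [1, 4, 2]
beta = fit(X, M, beta0=[1.0, 1.0])
print(beta, neg_log_likelihood(beta, X, M))
```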

We calibrate $\beta$ by maximizing the likelihood, that is, by minimizing $\mathscr{L}(\beta)$. The regressors are the areas of the different land types, the population, and holidays.