4 Cluster-based Network Modeling (CNM)
In this subsection, the workflow of CNMCluster-based Network Modeling (Fernex, Noack, and Semaan 2021) will be elaborated, as well as the previous attempt to expand the algorithm to accommodate a range of model parameter values \(\vec{\beta}\). CNMCluster-based Network Modeling (Fernex, Noack, and Semaan 2021) is the basis on which CNMccontrol-oriented Cluster-based Network Modeling is built or rather CNMccontrol-oriented Cluster-based Network Modeling invokes CNMCluster-based Network Modeling multiple times for one of its preprocessing steps. CNM can be split up into 4 main tasks, which are data collection, clustering, calculating transition properties and propagation. The first step is to collect the data, which can be provided from any dynamic system or numerical simulations. In this study, only dynamical systems are investigated. Once the data for the dynamical system is passed to CNMCluster-based Network Modeling, the data is clustered, e.g., with k-means++ algorithm (Arthur and Vassilvitskii 2006). A detailed elaboration about this step is given in section 8. CNMCluster-based Network Modeling exploits graph theory for approximating the trajectory as a movement on nodes. These nodes are equivalent to the centroids, which are acquired through clustering. Next, the motion, i.e., movement from one centroid to another, shall be clarified.
In order to fully describe the motion on the centroids, the time at which one centroid is visited is exited, and also the order of movement must be known. Note, when saying the motion is on the centroids, that means the centroids or characteristic nodes do not move at all. The entire approximated motion of the original trajectory on the nodes is described with the transition property matrices \(\boldsymbol Q\) and \(\boldsymbol T\). The matrices \(\boldsymbol Q\) and \(\boldsymbol T\) are the transition probability and transition time matrices, respectively. \(\boldsymbol Q\) is used to apply probability theory for predicting the next following most likely centroid. In other words, if the current location is at any node \(c_i\), \(\boldsymbol Q\) will provide all possible successor centroids with their corresponding transition probabilities. Thus, the motion on the centroids through \(\boldsymbol Q\) is probability-based. In more detail, the propagation of the motion on the centroids can be described as equation 4.1 . The variables are denoted as the propagated \(\vec{x}(t)\) trajectory, time \(t\), centroid positions \(\vec{c}_k,\, \vec{c}_j\), the time \(t_j\) where centroid \(\vec{c}_j\) is left and the transition time \(T_{k,j}\) from \(\vec{c}_j\) to \(\vec{c}_k\) (Fernex, Noack, and Semaan 2021). Furthermore, for the sake of a smooth trajectory, the motion between the centroids is interpolated through a spline interpolation. \[ \begin{equation} \vec{x}(t) = \alpha_{kj} (t) \, \vec{c}_k + [\, 1 - \alpha_{kj} (t)\,] \, \vec{c}_j, \quad \alpha_{kj} (t) = \frac{t-t_j}{T_{k,j}} \label{eq_34} \end{equation} \tag{4.1}\]
The \(\boldsymbol Q\) matrix only contains non-trivial transitions, i.e., if after a transition the centroid remains on the same centroid, the transition is not considered to be a real transition in CNMCluster-based Network Modeling. This idea is an advancement to the original work of Kaiser et al. (Kaiser et al. 2014). In Kaiser et al. (Kaiser et al. 2014) the transition is modeled as a Markov model. Markov models enable non-trivial transitions. Consequently, the diagonals of the resulting non-direct transition matrix \(\boldsymbol{Q_n}\)
exhibits the highest values. The diagonal elements stand for non-trivial transitions which lead to idling on the same centroid many times. Such behavior is encountered and described by Kaiser et al. (Kaiser et al. 2014).
There are 3 more important aspects that come along when adhering to Markov models. First, the propagation of motion is done by matrix-vector multiplication. In the case of the existence of a stationary state, the solution will converge to the stationary state, with an increasing number of iterations, where no change with time happens. A dynamical system can only survive as long as change with time exists. In cases where no change with respect to time is encountered, equilibrium or fixed points are found. Now, if a stationary state or fixed point exists in the considered dynamical system, the propagation will tend to converge to this fixed point. However, the nature of Markov models must not necessarily be valid for general dynamical systems. Another way to see that is by applying some linear algebra. The long-term behavior of the Markov transition matrix can be obtained with equation 4.2 . Here, \(l\) is the number of iterations to get from one stage to another. Kaiser et al. (Kaiser et al. 2014) depict in a figure, how the values of
\(\boldsymbol{Q_n}\) evolves after \(1 \mathrm{e}{+3}\) steps. \(\boldsymbol{Q_n}\) has become more uniform. \[
\begin{equation}
\label{eq_3_Infinite}
\lim\limits_{l \to \infty} \boldsymbol{Q_n}^l
\end{equation}
\tag{4.2}\]
If the number of steps is increased even further and all the rows would have the same probability value, \(\boldsymbol{Q_n}\) would converge to a stationary point. What also can be concluded from rows being equal is that it does not matter from where the dynamical system was started or what its initial conditions were. The probability to end at one specific state or centroid is constant as the number of steps approaches infinity. Following that, it would violate the sensitive dependency on initial conditions, which often is considered to be mandatory for modeling chaotic systems. Moreover, chaotic systems amplify any perturbation exponentially, whether at time \(t = 0\) or at time \(t>>0\).
Thus, a stationary transition matrix \(\boldsymbol{Q_n}\) is prohibited by chaos at any time step. This can be found to be one of the main reasons, why the Cluster Markov based Modeling (CMMCluster Markov-based Modeling) often fails to predict the trajectory. Li et al. (Li et al. 2021) summarize this observation compactly as after some time the initial condition would be forgotten and the asymptotic distribution would be reached. Further, they stated, that due to this fact, CMMCluster Markov-based Modeling would not be suited for modeling dynamical systems. The second problem which is involved, when deploying regular Markov modeling is that the future only depends on the current state. However, (Fernex, Noack, and Semaan 2021) has shown with the latest CNMCluster-based Network Modeling version that incorporating also past centroid positions for predicting the next centroid position increases the prediction quality. The latter effect is especially true when systems are complex.
However, for multiple consecutive time steps the trajectories position still could be assigned to the same centroid position (trivial transitions). Thus, past centroids are those centroids that are found when going back in time through only non-trivial transitions. The number of incorporated past centroids is given as equation 4.3, where \(L\) is denoted as the model order number. It represents the number of all considered centroids, where the current and all the past centroids are included, with which the prediction of the successor centroid is made. \[ \begin{equation} B_{past} = L -1 \label{eq_5_B_Past} \end{equation} \tag{4.3}\]
Furthermore, in (Fernex, Noack, and Semaan 2021) it is not simply believed that an increasing model order \(L\) would increase the outcome quality in every case. Therefore, a study on the number of \(L\) and the clusters \(K\) was conducted. The results proved that the choice of \(L\) and \(K\) depend on the considered dynamical system.
The third problem encountered when Markov models are used is that the time step must be provided. This time step is used to define when a transition is expected. In case the time step is too small, some amount of iterations is required to transit to the next centroid. Thus, non-trivial transitions would occur. In case the time step is too high, the intermediate centroids would be missed. Such behavior would be a coarse approximation of the real dynamics. Visually this can be thought of as jumping from one centroid to another while having skipped one or multiple centroids. The reconstructed trajectory could lead to an entirely wrong representation of the state-space. CNM generates the transition time matrix \(\boldsymbol T\) from data and therefore no input from the user is required.
A brief review of how the \(\boldsymbol Q\) is built shall be provided. Since the concept of model order, \(L\) has been explained, it can be clarified that it is not always right to call \(\boldsymbol Q\) and \(\boldsymbol T\) matrices. The latter is only correct, if \(L = 1\), otherwise it must be denoted as a tensor. \(\boldsymbol Q\) and \(\boldsymbol T\) can always be referred to as tensors since a tensor incorporates matrices, i.e., a matrix is a tensor of rank 2. In order to generate \(\boldsymbol Q\), \(L\) must be defined, such that the shape of \(\boldsymbol Q\) is known. The next step is to gather all sequences of clusters \(c_i\). To understand that, we imagine the following scenario, \(L = 3\), which means 2 centroids from the past and the current one are incorporated to predict the next centroid. Furthermore, imagining that two cluster sequence scenarios were found, \(c_0 \rightarrow c_1 \rightarrow c_2\) and \(c_5 \rightarrow c_1 \rightarrow c_2\). These cluster sequences tell us that the current centroid is \(c_2\) and the remaining centroids belong to the past. In order to complete the sequence for \(L = 3\), the successor cluster also needs to be added, \(c_0 \rightarrow c_1 \rightarrow c_2 \rightarrow c_5\) and \(c_5 \rightarrow c_1 \rightarrow c_2 \rightarrow c_4\). The following step is to calculate the likelihood of a transition to a specific successor cluster. This is done with equation 4.4, where \(n_{k, \boldsymbol{j}}\) is the amount of complete sequences, where also the successor is found. The index \(j\) is written as a vector in order to generalize the equation for \(L \ge 1\). It then contains all incorporated centroids from the past and the current centroid. The index \(k\) represents the successor centroid (\(\boldsymbol{j} \rightarrow k\)). Finally, \(n_{\boldsymbol{j}}\) counts all the matching incomplete sequences. \[ \begin{equation} \label{eq_4_Poss} P_{k, \boldsymbol j} = \frac{n_{k,\boldsymbol{j}}}{n_{\boldsymbol{j}}} \end{equation} \tag{4.4}\]
After having collected all the possible complete cluster sequences with their corresponding probabilities \(\boldsymbol Q\), the transition time tensors \(\boldsymbol T\) can be inferred from the data. With that, the residence time on each cluster is known and can be used for computing the transition times for every single transition. At this stage, it shall be highlighted again, CNM approximates its data fully with only two matrices or when \(L \ge 2\) tensors, \(\boldsymbol Q\) and \(\boldsymbol T\). The final step is the prorogation following equation 4.1 . For smoothing the propagation between two centroids the B-spline interpolation is applied.
4.0.1 First version of CNMc
Apart from this thesis, there already has been an attempt to build control-oriented Cluster-based Network Modeling (CNMc). The procedure, progress and results of the most recent effort are described in (Pierzyna 2021). Also, in the latter, the main idea was to predict the trajectories for dynamical systems with a control term or a model parameter value \(\beta\). In this subsection, a review of (Pierzyna 2021) shall be given with pointing out which parts need to be improved. In addition, some distinctions between the previous version of CNMccontrol-oriented Cluster-based Network Modeling and the most recent version are named. Further applied modifications are provided in chapter 5.
To avoid confusion between the CNMccontrol-oriented Cluster-based Network Modeling version described in this thesis and the prior CNMccontrol-oriented Cluster-based Network Modeling version, the old version will be referred to as first CNMc. First CNMc starts by defining a range of model parameter values \(\vec{\beta}\). It was specifically designed to only be able to make predictions for the Lorenz attractor (Lorenz 1963), which is described with the set of equations 7.1 given in section 7. An illustrative trajectory is of the Lorenz system (Lorenz 1963) with \(\beta = 28\) is depicted in figure 4.1 .
Having chosen a range of model parameter values \(\vec{\beta}\), the Lorenz system was solved numerically and its solution was supplied to CNMCluster-based Network Modeling in order to run k-means++ on all received trajectories. The centroid label allocation by the k-means+ algorithm is conducted randomly. Thus, linking or matching centroid labels from one model parameter value \(\beta_i\) to another model parameter value \(\beta_j\), where \(i \neq j\), is performed in 3 steps. The first two steps are ordering the \(\vec{\beta}\) in ascending order and transforming the Cartesian coordinate system into a spherical coordinate system. With the now available azimuth angle, each centroid is labeled in increasing order of the azimuth angle. The third step is to match the centroids across \(\vec{\beta}\), i.e., \(\beta_i\) with \(\beta_j\). For this purpose, the centroid label from the prior model parameter value is used as a reference to match its corresponding nearest centroid in the next model parameter value. As a result, one label can be assigned to one centroid across the available \(\vec{\beta}\).
Firstly, (Pierzyna 2021) showed that ambiguous regions can occur. Here the matching of the centroids across the \(\vec{\beta}\) can not be trusted anymore. Secondly, the deployed coordinate transformation is assumed to only work properly in 3 dimensions. There is the possibility to set one or two variables to zero in order to use it in two or one dimension, respectively. However, it is not known, whether such an artificially decrease of dimensions yields a successful outcome for lower-dimensional (2- and 1-dimensional) dynamical systems. In the event of a 4-dimensional or even higher dimensional case, the proposed coordinate transformation cannot be used anymore. In conclusion, the transformation is only secure to be utilized in 3 dimensions. Thirdly, which is also acknowledged by (Pierzyna 2021) is that the coordinate transformation forces the dynamical system to have a circular-like trajectory, e.g., as the in figure 4.1 depicted Lorenz system does. Since not every dynamical system is forced to have a circular-like trajectory, it is one of the major parts which needs to be improved, when first CNMc is meant to be leveraged for all kinds of dynamical systems. Neither the number of dimensions nor the shape of the trajectory should matter for a generalized CNMccontrol-oriented Cluster-based Network Modeling.
Once the centroids are matched across all the available \(\vec{\beta}\) pySINDy (Brunton, Proctor, and Kutz 2016; Silva et al. 2020; Kaptanoglu et al. 2022) is used to build a regression model. This regression model serves the purpose of capturing all centroid positions of the calculated model parameter values \(\vec{\beta }\) and making predictions for unseen \(\vec{\beta}_{unseen}\). Next, a preprocessing step is performed on the transition property tensors \(\boldsymbol Q\) and \(\boldsymbol T\). Both are scaled, such that the risk of a bias is assumed to be reduced. Then, on both Non-negative Matrix Factorization (NMF) (Lee and Seung 1999) is applied. Following equation 4.5 NMFNon-negative Matrix Factorization (Lee and Seung 1999) returns two matrices, i.e., \(\boldsymbol W\) and \(\boldsymbol H\). The matrices exhibit a physically relevant meaning. \(\boldsymbol W\) corresponds to a mode collection and \(\boldsymbol H\) contains the weighting factor for each corresponding mode. \[ \begin{equation} \label{eq_5_NMF} \boldsymbol {A_{i \mu}} \approx \boldsymbol A^{\prime}_{i \mu} = (\boldsymbol W \boldsymbol H)_{i \mu} = \sum_{a = 1}^{r} \boldsymbol W_{ia} \boldsymbol H_{a \mu} \end{equation} \tag{4.5}\]
The number of modes \(r\) depends on the underlying dynamical system. Firstly, the NMFNon-negative Matrix Factorization is utilized by deploying optimization. The goal is to satisfy the condition that, the deviation between the original matrix and the approximated matrix shall be below a chosen threshold. For this purpose, the number of required optimization iterations easily can be in the order of \(\mathcal{O} (1 \mathrm{e}+7)\). The major drawback here is that such a high number of iterations is computationally very expensive. Secondly, for first CNMc the number of modes \(r\) must be known beforehand. Since in most cases this demand cannot be fulfilled two issues arise. On the one hand, running NMFNon-negative Matrix Factorization on a single known \(r\) can already be considered to be computationally expensive. On the other hand, conducting a study to find the appropriate \(r\) involves even more computational effort. Pierzyna (Pierzyna 2021) acknowledges this issue and defined it to be one of the major limitations.
The next step is to generate a regression model with Random Forest (RF). Some introductory words about RFRandom Forest are given in subsection 10.0.2. As illustrated in (Pierzyna 2021), RFRandom Forest was able to reproduce the training data reasonably well. However, it faced difficulties to approximate spike-like curves. Once the centroid positions and the two transitions property tensors \(\boldsymbol Q\) and \(\boldsymbol T\) are known, they are passed to CNMCluster-based Network Modeling to calculate the predicted trajectories. For assessing the prediction quality two methods are used, i.e., the autocorrelation and the Cluster Probability Distribution (CPD). CPDCluster Probability Distribution outlines the probability of being on one of the \(K\) clusters. The autocorrelation given in equation 4.6 allows comparing two trajectories with a phase-mismatch (Protas, Noack, and Östh 2015) and it measures how well a point in trajectory correlates with a point that is some time steps ahead. The variables in equation 4.6 are denoted as time lag \(\tau\), state space vector \(\boldsymbol x\), time \(t\) and the inner product \((\boldsymbol x, \boldsymbol y) = \boldsymbol x \cdot \boldsymbol{y}^T\). \[ \begin{equation} R(\tau) = \frac{1}{T - \tau} \int\limits_{0}^{T-\tau}\, (\boldsymbol{x} (t), \boldsymbol{x}(t+ \tau)) dt, \quad \tau \in [\, 0, \, T\,] \label{eq_35} \end{equation} \tag{4.6}\]
First CNMc proved to work well for the Lorenz system only for the number of centroids up to \(K=10\) and small \(\beta\). Among the points which need to be improved is the method to match the centroids across the chosen \(\vec{\beta}\). Because of this, two of the major problems occur, i.e., the limitation to 3 dimensions and the behavior of the trajectory must be circular, similar to the Lorenz system (Lorenz 1963). These demands are the main obstacles to the application of first CNMc to all kinds of dynamical systems. The modal decomposition with NMFNon-negative Matrix Factorization is the most computationally intensive part and should be replaced by a faster alternative.