1 Background

In the field of control, it has been widely accepted since 1960 that controllability and observability are the two most fundamental properties of control systems in state-space form [1]. A control system (usually a physical system, represented by a differential equation) is controllable if for every two states x and \(x'\), there is an input sequence that drives x to \(x'\) starting at some initial time. Usually, the state of a control system is only partially observed, which is represented by an output function. A control system is observable if its initial state can be determined from an input sequence and the corresponding output sequence.

In his seminal 1963 paper [2], Kalman proved that a linear control system is an irreducible realization of an impulse–response matrix if and only if the system is completely controllable and completely observable. In addition, he gave a canonical form of a linear control system in terms of four mutually exclusive parts: (A) completely controllable but unobservable, (B) completely controllable and completely observable, (C) uncontrollable and unobservable, (D) uncontrollable but completely observable. This form was later called the Kalman decomposition. Following this seminal paper, the two notions have been extended to many other kinds of control systems, using or developing new mathematical tools. The control systems covered include linear systems [2, 3], nonlinear systems [4,5,6], switched systems [7], networked systems [8, 9], etc.

It might not be well known to the control community that controllability and observability are also two fundamental properties in computer science, where they appear under different names and, in fact, appeared even earlier. On [2, pp. 169], Kalman compared his irreducible realization with E. F. Moore’s minimization of machines (in Moore’s seminal 1956 paper [10]): “The class of all machines which are indistinguishable from a given strongly connected machine S by any single experiment has a unique (up to isomorphism) member with a minimal number of states. This unique machine, called the reduced form of S, is strongly connected and has the property that any two of its states are distinguishable”. Moore’s seminal paper stimulated many branches of computer science which are currently called model-based testing; see the very early book [11] and the recent monograph [12], as well as the references therein, where [11] seems to be the first book on topics that received attention in both control science and computer science. The sequential machines studied in [10] were later called Moore machines; they consist of finitely many states, inputs, and outputs. The state update/transition function of a Moore machine maps every pair of a state and an input to a state; the output function maps a state to an output. Moore machines are among the earliest models of computer science. They are deterministic, in almost the same form as control systems, but discrete in both time and space. Recast in a Moore machine, controllability becomes strong connectedness, because the following two conditions are equivalent: (1) for every two states, there is an input sequence that drives the one to the other; (2) the state-transition graph of the Moore machine is strongly connected. Hence, the terminology “strongly connected” in [10] means “completely controllable” in [2] (see [2, pp. 170]).
That is, this notion corresponds completely between linear control systems and Moore machines. On the other hand, indistinguishability of two states of a Moore machine means that under each input sequence, the output sequences generated from the two states are the same. Hence, the “existence of two indistinguishable states” seems to correspond to unobservability of linear control systems. However, this correspondence fails, because Moore machines are essentially nonlinear. The failure is directly reflected by the fact that there are nonequivalent notions of observability for Moore machines, whereas “all plausible definitions of observability in linear time-invariant control systems turn out to be equivalent” [3]: the difference of two solutions of a linear time-invariant control system is again a solution, so the different definitions of observability do not depend on inputs. This closedness under differences of solutions also implies that controllability is dual to observability in this type of systems: a linear time-invariant control system is controllable if and only if its dual system is observable.

To sum up, there is an essential difference between linear control systems and Moore machines due to the various notions of observability, while there is an extreme similarity between irreducible realization of linear control systems and minimization of Moore machines. Besides this similarity, there is another one: the verification of multiple-experiment observability (see Definition 12) of Moore machines [10, Theorem 6] and unobservable-subspace computation in linear time-invariant control systems [3]. This similarity is due to the finiteness of the numbers of inputs and states of Moore machines on the one hand, and the linearity and finite dimension of linear systems on the other. It plays a central role in the solvability of several control problems in Boolean control networks (BCNs), which were proposed in [14, 15]. In hindsight, the mathematical essence of the multiple-experiment observability decomposition [16] in BCNs, of the disturbance decoupling problem of Boolean networks (BNs) [17], of the invariant dual subspaces (actually unobservable subspaces) of BNs [18], and of the observability verification method [19] for BNs turns out to coincide with Moore’s partition-based verification method for multiple-experiment observability of Moore machines [10, Theorem 6]. See Sect. 3 for details. In [20] and [13, Remark 4.1], Moore’s partition-based method was briefly restated. The observability verification method used in [20] is, however, essentially different from Moore’s method: it is an extension of the weighted-pair-graph method (actually an inverse method, used for verifying the negation of observability) proposed in [21]. Unlike Moore’s method, the weighted-pair-graph method applies to more notions of observability and can be extended to nondeterministic finite-transition systems [20].
The weighted-pair-graph method (with slight variants) has since been widely used in observability verification, e.g., in BCNs [22,23,24,25,26], in probabilistic BNs [27, 28], in singular BCNs [29], and in observability disturbance analysis of BNs [30]. See Sect. 4 for details.

Now we can introduce the model studied in the current paper: Boolean control networks (BCNs). What are the relations between BCNs and the two models discussed above, linear control systems and Moore machines? BCNs are special Moore machines for which the numbers of states, inputs, and outputs are all powers of 2. BNs are BCNs whose inputs are constant. Given that BCNs are special Moore machines, does it suffice to study Moore machines, without studying BCNs separately? The answer is no, because the size of a Moore machine is measured by the numbers of states, inputs, outputs, and transitions (between states, driven by inputs), while the size of a BCN is measured by the numbers of state nodes, input nodes, and output nodes, together with the lengths of its updating functions and output functions, which are all Boolean functions. In other words, the focus is mainly on states, inputs, and outputs in Moore machines, but mainly on state nodes, input nodes, and output nodes in BCNs. For example, if an algorithm for verifying observability of a BCN runs in time exponential in the size of the BCN, then it runs in time polynomial in the size of the Moore machine represented by the BCN. As a matter of fact, not every Moore machine can be represented by a BCN. Hereinafter, when mentioning an algorithm for a Moore machine, its time and space complexity are measured in the size of the Moore machine; when mentioning an algorithm for a BCN, they are measured in the size of the BCN. Another difference between BCNs and Moore machines is that each node of a BCN can be regarded as a variable, so it makes sense to endow a BCN with a coordinate, which does not apply to a Moore machine. Hence, it makes sense to study coordinate-based problems in BCNs, such as the disturbance decoupling problem [17, 31] and the observability decomposition problem [16].
It is worth mentioning that the disturbance decoupling problem can also be defined in a coordinate-independent form, which was called the “original disturbance decoupling problem” in [16]. The semitensor product of matrices, first proposed by Cheng [32] in 2001, is a convenient and appropriate tool for representing coordinates and transformations between different coordinates in BCNs [33, 34]. We also want to point out that the area of control-theoretic problems of BCNs and the area of model-based testing in computer science are quite far from each other, and many results obtained in the latter area are not known to most researchers in the former. Of course, the latter area is much more developed. When introducing specific topics later, we will point out the relevant overlaps.

As mentioned above, controllability of a Moore machine is the same as strong connectedness of its state-transition graph. Hence, it is not worth devoting much space to controllability of Moore machines, because it can be verified in linear time by Tarjan’s algorithm. Consequently, controllability of a BCN can be verified in exponential time, also by Tarjan’s algorithm. On the other hand, verifying controllability of BCNs is \(\textsf{NP}\)-hard [35]. Hence, the interesting topics on controllability of BCNs lie in developing fast algorithms for special classes of BCNs [36]. Also as mentioned before, there exist nonequivalent notions of observability for Moore machines, and the same holds for BCNs [37], even for controllable BCNs [38]. Therefore, it is interesting to discuss the relations between the different notions of observability of BCNs.

In the following, we present the main parts of the survey. Unlike many other surveys, whose principal line is based on problems/definitions, we choose a principal line based on methods. Our choice is more fundamental because, in this area, there are quite a lot of problems/definitions but only a very small number of methods (up to equivalence).

2 Preliminaries

2.1 Notation

  • \(\emptyset \): empty set

  • \(\subset \) and \(\subsetneq \): subset and strict subset relations, respectively

  • \(2^X\): power set of set X

  • \(\mathbb {Z}\) (\(\mathbb {Z}_+\)): set of (positive) integers

  • \(\mathbb {N}\): set of natural numbers (including 0)

  • \(\mathbb {R}^n\): set of n-length real column vectors

  • \(\mathbb {R}^{m\times n}\): set of \(m\times n\) real matrices

  • \(\mathcal {D}\): set \(\{0,1\}\)

  • \(\wedge ,\vee ,\lnot \): logical connectives conjunction, disjunction, negation

  • \(\delta _n^i\): i-th column of the identity matrix \(I_n\)

  • \(\textbf{1}_k\): \(\sum _{i=1}^k{\delta _k^i}\)

  • \(\varDelta _n\): set \(\{\delta _n^1,\dots ,\delta _n^n\}\) (\(\varDelta _2=:\varDelta \))

  • \(\llbracket m,n\rrbracket \): \(\{m,m+1,\dots ,n\}\), where \(m,n\in \mathbb {N}\) and \(m\le n\)

  • \(\delta _n[i_1,\dots ,i_s]\): logical matrix \([\delta _n^{i_1},\dots ,\delta _n^{i_s}]\), where \(i_1,\dots , i_s \in \llbracket 1,n\rrbracket \)

  • \(\mathcal {L}_{n\times s}\): set of \(n\times s\) logical matrices

  • \({\text {Col}}_i(A)\): i-th column of matrix A

  • \({\text {Col}}(A)\): set of columns of matrix A

  • \(A^{{{\,\textrm{Tr}\,}}}\): transpose of matrix A

  • \(\vert X\vert \): cardinality of set X

  • \(A_1\oplus A_2\oplus \cdots \oplus A_n\): \(\begin{bmatrix} A_1 & 0 & \cdots & 0\\ 0 & A_2 & \cdots & 0\\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & A_n \end{bmatrix}\)

2.2 Equivalence relations and partitions

Let X be a set. A partition of X is a set of pairwise disjoint nonempty subsets of X whose union is equal to X, where these subsets are called the parts (or cells) of the partition. A partition of X is finite if it consists of finitely many subsets of X, and is discrete if all its cells are singletons. For two partitions \(\alpha ,\beta \) of X, we say \(\alpha \) refines \(\beta \), or \(\alpha \) is finer than \(\beta \), or \(\beta \) is coarser than \(\alpha \), or \(\alpha \) is a refinement of \(\beta \), denoted by \(\alpha \succeq \beta \) or \(\beta \preceq \alpha \), if for every \(M\in \alpha \), there exists \(N\in \beta \) such that \(M\subset N\). When \(\alpha \) strictly refines \(\beta \), we write \(\alpha \succ \beta \) or \(\beta \prec \alpha \).

Given a set X, a relation \(\sim \) on X is a subset of \(X\times X\). We say two elements \(x,x'\in X\) have relation \(\sim \) (or x has relation \(\sim \) with \(x'\)) if \((x,x')\in \sim \), which is denoted by \(x\sim x'\). A relation \(\sim \subset X\times X\) is an equivalence relation if it is reflexive, symmetric, and transitive. That is, for all \(x,y,z\in X\),

  • \(x\sim x\) (reflexivity),

  • \(x\sim y\) if and only if \(y\sim x\) (symmetry), and

  • if \(x\sim y\) and \(y\sim z\) then \(x\sim z\) (transitivity).

Let the relation \(\sim \subset X\times X\) be symmetric and transitive. Define \([x]_{\sim }:=\{x\}\cup \{y\in X\vert y\ne x\wedge y\sim x\}\). Note that \(\{y\in X\vert y\ne x\wedge y\sim x\}\) may be equal to \(\emptyset \). One easily sees that for all \(x,y\in X\), either \([x]_{\sim }=[y]_{\sim }\) or \([x]_{\sim }\cap [y]_{\sim }=\emptyset \). That is, \(\{[x]_{\sim }\vert x\in X\}\) forms a partition of X, which is called the partition of X induced by \(\sim \). Moreover, for all \(x,y\in X\), \((x,x)\notin \sim \implies [x]_{\sim }=\{x\}\), \(x\sim y\implies [x]_{\sim }=[y]_{\sim }\), and \(x\ne y\wedge [x]_{\sim }=[y]_{\sim }\implies x\sim y\); note that if \(x=y\), then \([x]_{\sim }=[y]_{\sim }\) holds trivially, whether or not \(x\sim y\). Finally, \(\sim \) is reflexive if and only if \(x\sim x\) for every x with \(\vert [x]_{\sim }\vert =1\).

Let \(\sim \subset X\times X\) be an equivalence relation. For \(x\in X\), \([x]_{\sim }\) (\(=\{y\in X\vert y\sim x\}\)) is called the equivalence class generated by x, i.e., the set of all elements of X that have the relation \(\sim \) with x. For all \(x,y\in X\), \([x]_{\sim }=[y]_{\sim }\) if and only if \(x\sim y\).

For two equivalence relations \(\alpha ,\beta \) on a set X, we say \(\alpha \) refines \(\beta \) if the partition of X induced by \(\alpha \) refines the partition of X induced by \(\beta \); equivalently, for all \(x,y\in X\), if \(x\alpha y\) then \(x\beta y\).

Let \(f:X\rightarrow Y\) be a function. f defines an equivalence relation on X: for all \(x,x'\in X\), x and \(x'\) have this relation if and only if \(f(x)=f(x')\). The partition induced by this relation is called the partition induced by f and is denoted by \(\mathcal {P}(f)\). Conversely, given a set X and a partition \(\alpha \) of X, \(\alpha \) induces (at least) one function \(f_{\alpha }\) on X: for all \(x,x'\in X\), \(x\alpha x'\) if and only if \(f_{\alpha }(x)= f_{\alpha }(x')\). Clearly, \(f_{\alpha }\) induces \(\alpha \).
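The construction of \(\mathcal {P}(f)\) is easy to carry out mechanically; the following minimal Python sketch (the helper name `partition_induced_by` is ours, not from the literature) groups the elements of a finite set into the cells on which f is constant.

```python
from collections import defaultdict

def partition_induced_by(f, X):
    """Return the partition P(f) of X induced by f, as a set of frozensets.

    Two elements land in the same cell if and only if f maps them to the
    same value, i.e., they are equivalent under the relation defined by f.
    """
    cells = defaultdict(set)
    for x in X:
        cells[f(x)].add(x)
    return {frozenset(cell) for cell in cells.values()}

# The parity function on {0,...,5} induces a two-cell partition.
parity = partition_induced_by(lambda x: x % 2, range(6))
# parity == {frozenset({0, 2, 4}), frozenset({1, 3, 5})}
```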

2.3 The notion of Moore machine

Definition 1

A Moore machine is a quintuple

$$S=(X,U,f,Y,h),$$

where

  • X is a finite set of states,

  • U is a finite set of inputs,

  • \(f:X\times U\rightarrow X\) is the update/transition function,

  • Y is a finite set of outputs, and

  • \(h:X \rightarrow Y\) is the output function.

Another widely studied class of systems, Mealy machines, generalizes Moore machines in the sense that h also depends on U, i.e., h is a function from \(X\times U\) to Y [39].
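Definition 1 translates directly into code. The sketch below (class and method names are our own) runs a Moore machine from a given state on an input sequence and returns the generated output sequence, including the output of the initial state; two states are then distinguishable exactly when some input sequence makes their runs differ.

```python
class MooreMachine:
    """A Moore machine S = (X, U, f, Y, h), with X, U, Y implicit in f and h."""

    def __init__(self, f, h):
        self.f = f  # transition function: (state, input) -> state
        self.h = h  # output function: state -> output

    def run(self, x0, inputs):
        """Output sequence generated from state x0 under the input sequence."""
        outputs = [self.h(x0)]
        x = x0
        for u in inputs:
            x = self.f(x, u)
            outputs.append(self.h(x))
        return outputs

# A two-state machine: input 1 toggles the state, and the output is the state.
S = MooreMachine(f=lambda x, u: x ^ u, h=lambda x: x)
print(S.run(0, [1, 1, 0, 1]))  # [0, 1, 0, 0, 1]
```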

2.4 The notion of Boolean (control) network

Definition 2

A Boolean network (BN) is formulated as the following logical form:

$$\begin{aligned} x_1 (t + 1)&= f_1 (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(1a)
$$\begin{aligned} x_2 (t + 1)&= f_2 (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(1b)
$$\begin{aligned} \qquad \vdots \nonumber \\ x_n (t + 1)&= f_n (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(1c)
$$\begin{aligned} y_1 (t )&= h_1 (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(1d)
$$\begin{aligned} y_2 (t )&= h_2 (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(1e)
$$\begin{aligned} \qquad \vdots \nonumber \\ y_q (t )&= h_q (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(1f)

where \(t\in \mathbb {N}\) denotes any discrete time step, \(x_i(t),y_k(t)\in \mathcal {D}\) denote the values of state node \(x_i\) and output node \(y_k\) at time step t, respectively, and \(f_i,h_k:\mathcal {D}^n\rightarrow \mathcal {D}\) are Boolean functions, \(i\in \llbracket 1,n\rrbracket \), \(k\in \llbracket 1,q\rrbracket \). In this survey, we mainly consider observability, so the output functions are included in the formulation (1).

A BN (1) can be briefly denoted as the compact form

$$\begin{aligned} x(t+1)&= f(x(t)),\end{aligned}$$
(2a)
$$\begin{aligned} y(t)&= h(x(t)), \end{aligned}$$
(2b)

where \(t\in \mathbb {N}\); \(x(t)\in \mathcal {D}^{n}\) and \(y(t)\in \mathcal {D}^q\) stand for the state and the output at time step t, respectively; \(f:\mathcal {D}^n\rightarrow \mathcal {D}^n\) and \(h:\mathcal {D}^n\rightarrow \mathcal {D}^q\) are logical mappings.

The dependency graph of a BN (2) is a directed graph \((\mathcal {V},\mathcal {E})\), where the vertex set \(\mathcal {V}\) is \(\{x_1,\dots ,x_n\}\), the edge set \(\mathcal {E}\subset \mathcal {V}\times \mathcal {V}\) is such that there is an edge from \(x_i\) to \(x_j\) if and only if the value \(x_i(t)\) affects the value \(x_j(t+1)\), where \(i,j\in \llbracket 1,n \rrbracket \).

The state-transition graph of a BN (2) is a directed graph \((\mathcal {V},\mathcal {E})\), where \(\mathcal {V}=\mathcal {D}^n\); \(\mathcal {E}\subset \mathcal {V}\times \mathcal {V}\) is as follows: for all \(x,x'\in \mathcal {D}^n\), \((x,x')\in \mathcal {E}\) if and only if \(x'=f(x)\).

For a BN, the size of its state-transition graph is exponential in the size of its dependency graph.

Example 1

Consider the following simple BN:

$$\begin{aligned} A(t+1)&= B(t)\wedge C(t),\end{aligned}$$
(3a)
$$\begin{aligned} B(t+1)&= \lnot A(t),\end{aligned}$$
(3b)
$$\begin{aligned} C(t+1)&= B(t)\vee C(t), \end{aligned}$$
(3c)

where \(t\in \mathbb {N}\); \(A(t),B(t),C(t)\in \mathcal {D}\). Its dependency graph and state-transition graph are shown in Fig. 1a and 1b, respectively.

Fig. 1  Graphs of BN (3). a Dependency graph of BN (3). b State-transition graph of BN (3)
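The state-transition graph in Fig. 1b can be reproduced by enumerating all \(2^3\) states of BN (3); the code below is a direct transcription of (3a)–(3c), encoding a state as a 0-1 tuple (A, B, C).

```python
from itertools import product

def step(state):
    """One synchronous update of BN (3)."""
    A, B, C = state
    return (B & C, 1 - A, B | C)

# Each state has exactly one successor, so the graph is a dict state -> state.
edges = {s: step(s) for s in product((0, 1), repeat=3)}
print(edges[(1, 1, 1)])  # (1, 0, 1)
```

Since a BN has no inputs, every vertex of the state-transition graph has out-degree exactly 1, which is visible in the dictionary built above.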

Definition 3

A Boolean control network (BCN) is defined by the following logical form:

$$\begin{aligned} x_1 (t + 1)&= f_1 (u_1 (t), \dots ,u_m (t),x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(4a)
$$\begin{aligned} x_2 (t + 1)&= f_2 (u_1 (t), \dots ,u_m (t),x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(4b)
$$\begin{aligned} \qquad \vdots \nonumber \\ x_n (t + 1)&= f_n (u_1 (t), \dots ,u_m (t),x_1 (t), \dots ,x_n (t)),\end{aligned}$$
(4c)
$$\begin{aligned} y_1 (t )&= h_1 (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(4d)
$$\begin{aligned} y_2 (t )&= h_2 (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(4e)
$$\begin{aligned} \qquad \vdots \nonumber \\ y_q (t )&= h_q (x_1 (t), \dots ,x_n (t)), \end{aligned}$$
(4f)

where \(t\in \mathbb {N}\) denotes any discrete time step; \(x_i(t),u_j(t)\), and \(y_k(t)\in \mathcal {D}\) denote the values of state node \(x_i\), input node \(u_j\), and output node \(y_k\) at time step t, respectively, \(f_i:\mathcal {D}^{m+n}\rightarrow \mathcal {D}\) and \(h_k:\mathcal {D}^{n}\rightarrow \mathcal {D}\) are Boolean functions, \(i\in \llbracket 1,n\rrbracket \), \(j\in \llbracket 1,m\rrbracket \), \(k\in \llbracket 1,q\rrbracket \).

A BCN (4) is represented in the compact form

$$\begin{aligned} x(t+1)&= f(u(t),x(t)),\end{aligned}$$
(5a)
$$\begin{aligned} y(t)&= h(x(t)), \end{aligned}$$
(5b)

where \(t\in \mathbb {N}\); \(x(t)\in \mathcal {D}^{n}\), \(u(t)\in \mathcal {D}^{m}\), and \(y(t)\in \mathcal {D}^{q}\) stand for the state, input, and output of (4) at time step t, respectively; \(f:\mathcal {D}^{m+n}\rightarrow \mathcal {D}^{n}\) and \(h:\mathcal {D}^{n}\rightarrow \mathcal {D}^{q}\) are logical mappings.

By definition, a BCN (5) is a Moore machine, but not vice versa.

The dependency graph of a BCN (4) is a directed graph \((\mathcal {V},\mathcal {E})\), where the vertex set \(\mathcal {V}\) is equal to \(\{x_1,\dots ,x_n,u_1, \dots ,u_m,y_1,\dots ,y_q\}\), and the edge set \(\mathcal {E}\subset \mathcal {V}\times \mathcal {V}\) is as follows: there is an edge from \(x_i\) (resp., \(u_j\)) to \(x_k\) if and only if the value \(x_i(t)\) (resp., \(u_j(t)\)) affects the value \(x_k(t+1)\), and there is an edge from \(x_i\) to \(y_l\) if and only if the value \(x_i(t)\) affects the value \(y_l(t)\), where \(i,k\in \llbracket 1,n \rrbracket \), \(j\in \llbracket 1,m \rrbracket \), \(l\in \llbracket 1, q\rrbracket \).

The state-transition graph of a BCN (5) is a weighted directed graph \((\mathcal {V},\mathcal {E},\mathcal {W})\), where \(\mathcal {V}=\mathcal {D}^n\); \(\mathcal {E}\subset \mathcal {V}\times \mathcal {V}\) is as follows: for all \(x,x'\in \mathcal {D}^n\), \((x,x')\in \mathcal {E}\) if and only if there exists \(u\in \mathcal {D}^m\) such that \(x'=f(u,x)\); and for every edge \(e=(x,x')\in \mathcal {E}\), \(\mathcal {W}(e)= \{u\in \mathcal {D}^m\vert x'=f(u,x)\}\).

By definition, for a BCN, the size of its state-transition graph is exponential in the size of its dependency graph. In addition, more than one BCN may share the same dependency graph.

Example 2

Consider the following simple BCN:

$$\begin{aligned} A(t+1)&=B(t)\wedge u(t),\end{aligned}$$
(6a)
$$\begin{aligned} B(t+1)&=\lnot A(t),\end{aligned}$$
(6b)
$$\begin{aligned} y(t)&=A(t), \end{aligned}$$
(6c)

where \(t\in \mathbb {N}\), \(A(t),B(t),u(t),y(t)\in \mathcal {D}\). Its dependency graph and state-transition graph are shown in Fig. 2a and 2b, respectively.

Fig. 2  Graphs of BCN (6). a Dependency graph of BCN (6). b State-transition graph of BCN (6)
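The weighted state-transition graph of BCN (6) in Fig. 2b can likewise be computed by enumerating states and inputs; following the definition above, each edge is labeled with the set of inputs that realize it.

```python
from itertools import product

def f(u, state):
    """Transition function of BCN (6); a state is a 0-1 tuple (A, B)."""
    A, B = state
    return (B & u, 1 - A)

# Build the weight map W: edge (x, x') -> set of inputs u with x' = f(u, x).
W = {}
for x in product((0, 1), repeat=2):
    for u in (0, 1):
        W.setdefault((x, f(u, x)), set()).add(u)

# From (0, 1), input u = 1 leads to (1, 1) while u = 0 keeps the state at (0, 1).
print(W[((0, 1), (1, 1))])  # {1}
```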

2.5 The notion of finite automaton

An alphabet \(\varSigma \) is a nonempty finite set such that each finite sequence of elements of \(\varSigma \) has a unique decomposition into elements of \(\varSigma \). Elements of \(\varSigma \) are called letters, and finite sequences of letters are called words. \(\varSigma ^*\) denotes the set of all words over \(\varSigma \), including the empty word \(\epsilon \); \(\varSigma ^+:=\varSigma ^*\setminus \{\epsilon \}\); \(\varSigma ^{\omega }\) denotes the set of infinite sequences over \(\varSigma \). Deterministic finite automata will be used to verify four definitions of observability of BCNs (see Sect. 4.2), as well as observability of nondeterministic finite-transition systems (NFTSs, see Sect. 4.5). Nondeterministic finite automata will be used to verify observability of NFTSs as well.

Definition 4

A deterministic finite automaton (DFA) is a quintuple \((Q,\varSigma ,\delta ,q_0,F)\), where

  • Q is a finite set of states,

  • \(\varSigma \) is an alphabet,

  • \(\delta :Q\times \varSigma \rightarrow Q\) is the transition function,

  • \(q_0\in Q\) is the initial state (aka start state), and

  • \(F\subset Q\) is the set of final states (aka accept states).

The transition function \(\delta \) is recursively extended to \(Q\times \varSigma ^*\rightarrow Q\) as usual: \(\delta (q,\epsilon )=q\) for all \(q\in Q\), and \(\delta (q,sa)=\delta (\delta (q,s),a)\) for all \(q\in Q\), \(s\in \varSigma ^*\), and \(a\in \varSigma \).

Definition 5

Let \(\varSigma \) be an alphabet and \(A=(Q,\varSigma ,\delta ,q_0,F)\) be a DFA. A word \(w\in \varSigma ^*\) is accepted by A if \(\delta (q_0,w)\in F\). The formal language recognized by A is the set of all words accepted by A, i.e., \(\{w\in \varSigma ^*\vert \delta (q_0,w)\in F\}\). A formal language recognized by some DFA is called regular.

Example 3

[40]   Consider the DFA \(A_1\) as in Fig. 3.

In \(A_1\), the only transitions ending in the unique final state \(q_2\) are \(q_1\xrightarrow []{a}q_2\) and \(q_2\xrightarrow []{a}q_2\), and the unique transition ending in \(q_1\) is \(q_0\xrightarrow []{a}q_1\). Hence, the only transition sequences of length 2 ending in \(q_2\) are \(q_0\xrightarrow []{a}q_1\xrightarrow []{a}q_2\), \(q_1\xrightarrow []{a}q_2\xrightarrow []{a}q_2\), and \(q_2\xrightarrow []{a}q_2\xrightarrow []{a}q_2\), so every word accepted by \(A_1\) ends with aa. On the other hand, for every word waa with \(w\in \varSigma ^*\), whether \(\delta (q_0,w)\) equals \(q_0\), \(q_1\), or \(q_2\), \(A_1\) accepts waa. Hence, \(A_1\) accepts exactly the words ending with aa.

Fig. 3  DFA \(A_1\), where state \(q_0\) is initial (marked with an incoming arrow from nowhere), \(q_2\) is final (drawn with double circles), and symbols beside arrows denote letters. Transition \(q_0\xrightarrow []{a} q_1\) means that when \(A_1\) is in state \(q_0\) and reads letter a, it transitions to state \(q_1\). The other transitions are interpreted similarly
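Example 3 can be checked mechanically. The a-transitions of \(A_1\) are listed in the example; we assume, consistently with that analysis (no other target is compatible with the transitions listed), that every b-transition returns to \(q_0\). The sketch below encodes \(A_1\) and evaluates the extended transition function.

```python
# Transition table of A_1 over the alphabet {a, b}.
# The b-transitions to q0 are our assumption, forced by Example 3's analysis.
delta = {
    ('q0', 'a'): 'q1', ('q1', 'a'): 'q2', ('q2', 'a'): 'q2',
    ('q0', 'b'): 'q0', ('q1', 'b'): 'q0', ('q2', 'b'): 'q0',
}

def accepts(word):
    """Run A_1 on `word` via the extended transition function delta(q0, word)."""
    q = 'q0'
    for a in word:
        q = delta[(q, a)]
    return q == 'q2'  # q2 is the unique final state

# A_1 accepts exactly the words ending with aa.
print([w for w in ('aa', 'baa', 'aba', 'aab') if accepts(w)])  # ['aa', 'baa']
```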

Definition 6

A nondeterministic finite automaton (NFA) is a quintuple \((Q,\varSigma ,\delta ,q_0,F)\), where

  • Q is a finite set of states,

  • \(\varSigma \) is an alphabet,

  • \(\delta :Q\times \varSigma \rightarrow 2^Q\) is the transition function,

  • \(q_0\in Q\) is the initial state, and

  • \(F\subset Q\) is the set of final states.

There exists a transition \(q\xrightarrow []{a}q'\) if and only if \(q'\in \delta (q,a)\). When an NFA A is in state q and reads a letter a, it can transition to any state of \(\delta (q,a)\). The transition function \(\delta \) is also recursively extended to \(Q\times \varSigma ^*\rightarrow 2^Q\) as usual.

Definition 7

Let \(\varSigma \) be an alphabet and \(A=(Q,\varSigma ,\delta ,q_0,F)\) be an NFA. A word \(w\in \varSigma ^*\) is accepted by A if \(\delta (q_0,w)\cap F\ne \emptyset \). The formal language recognized by A is the set of all words accepted by A, i.e., \(\{w\in \varSigma ^*\vert \delta (q_0,w)\cap F\ne \emptyset \}\).

In other words, for an NFA \(A=(Q,\varSigma ,\delta ,q_0,F)\) over an alphabet \(\varSigma \) and a word \(w=w_1\dots w_n\in \varSigma ^*\), where \(w_1,\dots ,w_n\in \varSigma \), A accepts w if and only if there exist states \(q_1,\dots ,q_n\) such that \(q_i\in \delta (q_{i-1},w_i)\) for all \(1\le i\le n\) and \(q_n\in F\).
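The extended transition function of an NFA can be evaluated by tracking the set of possible current states; acceptance then amounts to checking whether this set meets F. A minimal sketch (the example NFA below is ours, for illustration only):

```python
def nfa_accepts(delta, q0, F, word):
    """delta maps (state, letter) to a set of states (empty set if undefined)."""
    current = {q0}
    for a in word:
        current = {q2 for q in current for q2 in delta.get((q, a), set())}
    return bool(current & F)  # accepted iff delta(q0, word) intersects F

# An NFA that "guesses" the last letter: it accepts the words ending with a.
delta = {('s', 'a'): {'s', 't'}, ('s', 'b'): {'s'}}
print(nfa_accepts(delta, 's', {'t'}, 'ba'))  # True
print(nfa_accepts(delta, 's', {'t'}, 'ab'))  # False
```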

From now on, for an automaton A (either deterministic or nondeterministic), we use \({{\,\textrm{Acc}\,}}(A)\) to denote its accessible part, i.e., the part of A that is obtained by removing all states that are not reachable from the initial state and their ingoing and outgoing transitions.

2.6 The algebraic form of Boolean control networks based on the semitensor product of matrices

Definition 8

[34]   Let \(A\in \mathbb {R}^{m\times n}\), \(B\in \mathbb {R}^{p\times q}\), and \(\alpha =\text{ lcm } (n,p)\) be the least common multiple of n and p. The semitensor product (STP) of A and B is defined as

$$\begin{aligned} A\ltimes B = \left( A\otimes I_{\frac{\alpha }{n}}\right) \left( B\otimes I_{\frac{\alpha }{p}}\right) , \end{aligned}$$

where \(\otimes \) denotes the Kronecker product.

From this definition, it is easy to see that the conventional matrix product is a particular case of the STP: if \(n=p\), then \(A\ltimes B=AB\). Since the STP keeps most properties of the conventional product, e.g., the associative law [13], the distributive law, the inverse-order laws (e.g., \((A\ltimes B)^{{{\,\textrm{Tr}\,}}}=B^{{{\,\textrm{Tr}\,}}}\ltimes A^{{{\,\textrm{Tr}\,}}}\)), etc. [34], we usually omit the symbol “\(\ltimes \)” hereinafter.
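Definition 8 can be implemented directly from the Kronecker-product formula. The sketch below uses plain nested lists (no external dependencies); the helper names are ours. As expected, the STP reduces to the ordinary matrix product when the inner dimensions match.

```python
from math import lcm

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def eye(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def stp(A, B):
    """Semitensor product A ⋉ B = (A ⊗ I_{α/n})(B ⊗ I_{α/p}), α = lcm(n, p)."""
    n, p = len(A[0]), len(B)
    a = lcm(n, p)
    return matmul(kron(A, eye(a // n)), kron(B, eye(a // p)))

# Conventional product is recovered when the inner dimensions match:
print(stp([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
# STP of two column vectors: delta_2^1 ⋉ delta_2^1 = delta_4^1.
print(stp([[1], [0]], [[1], [0]]))  # [[1], [0], [0], [0]]
```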

It is very convenient to represent coordinates and coordinate transformations in BCNs using the STP. When verifying properties of BCNs, however, the STP is not necessary. Usually, STP-based verification is less efficient than verification based on state-transition graphs, because the former enumerates all states of a BCN, while the latter only needs to visit the necessary states.

Definition 9

Let \(A\in \mathbb {R}^{m\times n},B\in \mathbb {R}^{p\times n}\). The Khatri-Rao product of A and B is defined by

$$\begin{aligned} A*B =&\, [{\text {Col}}_1(A)\otimes {\text {Col}}_1(B),\dots ,{\text {Col}}_n(A)\otimes {\text {Col}}_n(B)]\\ =&\, [{\text {Col}}_1(A)\ltimes {\text {Col}}_1(B),\dots ,{\text {Col}}_n(A)\ltimes {\text {Col}}_n(B)]. \end{aligned}$$

Proposition 1

[34]    For a Boolean function \(f:\mathcal {D}^n\rightarrow \mathcal {D}\), there exists a unique logical matrix \(F\in \mathcal {L}_{2\times 2^n}\) such that

$$f(x_1,\dots ,x_n)=F\tilde{x}_1\ltimes \cdots \ltimes \tilde{x}_n,$$

where \(x_i\in \mathcal {D}\), \(\tilde{x}_i\in \varDelta \), and \(x_i\sim \tilde{x}_i\) under the identification \(1\sim \delta _2^1\), \(0\sim \delta _2^2\), \(i\in \llbracket 1,n\rrbracket \).

Proposition 2

[34]    For two logical functions \(FA_1\ltimes \cdots \ltimes A_n\) and \(GA_1\ltimes \cdots \ltimes A_n\), where \(A_i\in {\varDelta }\), \(i\in \llbracket 1,n\rrbracket \), \(F\in {\mathcal {L}}_{2^{m_1}\times 2^n}\), \(G\in {\mathcal {L}}_{2^{m_2}\times 2^n}\), one has

$$\begin{aligned} (FA_1\cdots A_n)\ltimes (GA_1\cdots A_n)=HA_1\cdots A_n, \end{aligned}$$

where \(H=F*G\). Conversely, one also has

$$\begin{aligned} F&= (I_{2^{m_1}}\otimes {\textbf {1}}_{2^{m_2}}^{{{\,\textrm{Tr}\,}}}) H,\\ G&= ({\textbf {1}}_{2^{m_1}}^{{{\,\textrm{Tr}\,}}}\otimes I_{2^{m_2}}) H. \end{aligned}$$

Proposition 3

[34](Pseudocommutative law) Let \(A\in {\mathbb R}^{m\times n}\) and \(z\in {\mathbb R}^t\). Then

$$\begin{aligned}&A\ltimes z^{{{\,\textrm{Tr}\,}}}=z^{{{\,\textrm{Tr}\,}}}\ltimes (I_t\otimes A), \\&z\ltimes A=(I_t\otimes A)\ltimes z. \end{aligned}$$

Definition 10

[34] The matrix \(M_{k_r}={\delta }_k^1\oplus \cdots \oplus {\delta }_k^k\) is called the power-reducing matrix. In particular, we denote \(M_{2_r}=:M_r\).

Proposition 4

[34] For power-reducing matrix \(M_{k_r}\), we have

$$P^2=M_{k_r}P$$

for each \(P\in \varDelta _k\).

A widely used property on the Kronecker product is as follows.

Proposition 5

Let \(A\in \mathbb {R}^{m_1\times n_1}\), \(B\in \mathbb {R}^{n_1\times p_1}\), \(C\in \mathbb {R}^{m_2\times n_2}\), \(D\in \mathbb {R}^{n_2\times p_2}\). Then

$$\begin{aligned} (AB)\otimes (CD)=(A\otimes C)(B\otimes D). \end{aligned}$$

Identifying \(1\sim \delta _2^1\) and \(0\sim \delta _2^2\) and using the STP of matrices, by Proposition 1, a BN (2) can be equivalently represented in the following algebraic form:

$$\begin{aligned} x_1 (t + 1)&= L_1 \ltimes _{i=1}^n x_i(t), \end{aligned}$$
(7a)
$$\begin{aligned} x_2 (t + 1)&= L_2 \ltimes _{i=1}^n x_i(t), \end{aligned}$$
(7b)
$$\begin{aligned} \qquad \vdots \nonumber \\ x_n (t + 1)&= L_n \ltimes _{i=1}^n x_i(t), \end{aligned}$$
(7c)
$$\begin{aligned} y_1 (t )&= H_1 \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(7d)
$$\begin{aligned} y_2 (t )&= H_2 \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(7e)
$$\begin{aligned} \qquad \vdots \nonumber \\ y_q (t )&= H_q \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(7f)

where \(t\in \mathbb {N}\), \(x_i(t),y_k(t)\in \varDelta \), \(L_i,H_k\in \mathcal {L}_{2\times 2^n}\), \(i\in \llbracket 1,n \rrbracket \), \(k\in \llbracket 1,q \rrbracket \).

Then by Definition 9 and Proposition 2, (7) can be equivalently represented as the algebraic form

$$\begin{aligned} x(t + 1)&= Lx(t), \end{aligned}$$
(8a)
$$\begin{aligned} y(t)&= Hx(t), \end{aligned}$$
(8b)

where \(t\in \mathbb {N}\), \(x(t)=\ltimes _{i=1}^{n}x_i(t)\in \varDelta _{2^n}\), \(y(t)=\ltimes _{k=1}^{q}y_k(t)\in \varDelta _{2^q}\), \(L=L_1*\cdots *L_n\in \mathcal {L}_{2^n\times 2^{n}}\) and \(H=H_1*\cdots *H_q\in \mathcal {L}_{2^q\times 2^{n}}\) are its structure matrices.
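As a sanity check, the structure matrix L of a BN can also be obtained by brute-force enumeration: under the identification \(1\sim \delta _2^1\), \(0\sim \delta _2^2\), column j of L is the canonical vector indexing the successor of the state encoded by \(\delta _{2^n}^j\). The Python sketch below (helper names are ours) does this for BN (3), building L as a 0-1 list of lists.

```python
from itertools import product

def step(state):                      # BN (3): a state is a 0-1 tuple (A, B, C)
    A, B, C = state
    return (B & C, 1 - A, B | C)

def idx(state):
    """1-based j such that the STP of the bits' delta-vectors is delta_8^j."""
    j = 0
    for b in state:
        j = 2 * j + (1 - b)           # bit 1 -> delta_2^1, bit 0 -> delta_2^2
    return j + 1

# Enumerate states so that states[j-1] is the state with index j.
states = list(product((1, 0), repeat=3))
# Column j of L is delta_8^{idx(step(states[j-1]))}, so L x(t) is one BN step.
L = [[int(idx(step(x)) == i + 1) for x in states] for i in range(8)]
```

Each column of L is a canonical vector, so \(L\in \mathcal {L}_{2^3\times 2^3}\) as required by (8a).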

Similarly, by Proposition 1, a BCN (3) can be equivalently represented as

$$\begin{aligned} x_1 (t + 1)&= L_1 \ltimes _{j=1}^m u_j(t) \ltimes \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(9a)
$$\begin{aligned} x_2 (t + 1)&= L_2 \ltimes _{j=1}^m u_j(t) \ltimes \ltimes _{i=1} ^n x_i (t) , \end{aligned}$$
(9b)
$$\begin{aligned} \qquad \vdots \nonumber \\ x_n (t + 1)&= L_n \ltimes _{j=1}^m u_j(t) \ltimes \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(9c)
$$\begin{aligned} y_1 (t )&= H_1 \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(9d)
$$\begin{aligned} y_2 (t )&= H_2 \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(9e)
$$\begin{aligned} \qquad \vdots \nonumber \\ y_q (t )&= H_q \ltimes _{i=1} ^n x_i (t), \end{aligned}$$
(9f)

where \(t\in \mathbb {N}\), \(x_i(t)\in \varDelta \), \(u_j(t)\in \varDelta \), \(y_k(t)\in \varDelta \), \(L_i\in \mathcal {L}_{2\times 2^{n+m}}\), \(H_k\in \mathcal {L}_{2\times 2^n}\), \(i\in \llbracket 1,n\rrbracket \), \(j\in \llbracket 1,m\rrbracket \), \(k\in \llbracket 1,q\rrbracket \).

Then also by Definition 9 and Proposition 2, (9) can be equivalently represented as the algebraic form

$$\begin{aligned} x(t + 1)&= Lu(t)x(t), \end{aligned}$$
(10a)
$$\begin{aligned} \qquad \qquad ~&=:[L_1,\dots ,L_{2^m}]u(t)x(t), \end{aligned}$$
(10b)
$$\begin{aligned} y(t)&= Hx(t), \end{aligned}$$
(10c)

where \(t\in \mathbb {N}\), \(x(t)=\ltimes _{i=1}^nx_i(t)\in \varDelta _{2^n}\), \(u(t)=\ltimes _{j=1}^mu_j(t)\in \varDelta _{2^m}\), and \(y(t)=\ltimes _{k=1}^qy_k(t)\in \varDelta _{2^q}\), \(L\in \mathcal {L}_{2^n\times 2^{n+m}}\) and \(H\in \mathcal {L}_{2^q\times 2^n}\) are called its structure matrices, \(L_j\in \mathcal {L}_{2^n\times 2^n}\), \(j\in \llbracket 1,2^m\rrbracket \). For brevity, denote \(N:=2^n\), \(M:=2^m\), \(Q:=2^q\).
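The block structure of L can be made concrete with a small numpy sketch (our own illustration with a randomly generated logical matrix): L splits column-wise into M blocks \(L_1,\dots ,L_M\), one per input value, and for column vectors the STP \(L\ltimes u(t)\ltimes x(t)\) reduces to the ordinary product \(L(u(t)\otimes x(t))\).

```python
import numpy as np

def delta(k, i):
    """Canonical vector delta_k^i as a k x 1 column."""
    v = np.zeros((k, 1)); v[i - 1, 0] = 1.0
    return v

# Our own illustration: L in L_{N x MN} splits column-wise into M blocks
# L_1, ..., L_M, and one step of x(t+1) = L u(t) x(t) is L (u kron x).
N, M = 4, 2
rng = np.random.default_rng(1)
rows = rng.integers(0, N, size=M * N)
L = np.zeros((N, M * N)); L[rows, np.arange(M * N)] = 1.0  # random logical matrix

blocks = [L[:, j * N:(j + 1) * N] for j in range(M)]  # L_j = L delta_M^j

u, x = delta(M, 2), delta(N, 3)
x_next = L @ np.kron(u, x)                     # one step: x(t+1) = L u(t) x(t)
assert np.array_equal(x_next, blocks[1] @ x)   # = L_j x(t) when u(t) = delta_M^j
```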

The algorithms for transforming the algebraic form of a BN/BCN back to their logical forms can be found in [33, 34].

2.7 Different definitions of observability of Boolean control networks

Although in this survey we do not consider controllability, we still state its definition for ease of statement.

Definition 11

A BCN (3) is called controllable if for all states \(x_0,x_d\in \mathcal {D}^n\), there exist \(p\in \mathbb {Z}_{+}\) and inputs \(u(0),u(1),\dots ,u(p-1)\in \mathcal {D}^m\) that drive the BCN from \(x(0)=x_0\) to \(x(p)=x_d\).

Clearly, a BCN (3) is controllable if and only if its state-transition graph is strongly connected.

As usual, a verification algorithm for a property is a necessary and sufficient condition for the property that is verifiable in finitely many steps.

Next we state the four definitions of observability of BCNs considered in this survey. The terms “***-experiment observability” were chosen from [13], which were rephrased from the term “Gedanken experiment” used in [10]. An experiment is a process in which an input sequence is fed into a BCN to generate an output sequence, in order to distinguish the real initial state from the candidate initial states. According to the number of experiments needed, the notions of observability are called “single-experiment” or “multiple-experiment”. In this survey, we only consider preset experiments, in which all input sequences are completely determined in advance. In addition, there is another kind of experiment, called adaptive experiments, in which every input of an input sequence is determined by the (past) generated output sequence. See [11, 39] for details. For example, the existence of adaptive distinguishing sequences (that can determine the initial state) was verified in exponential time in [11], and in polynomial time in [39], for Mealy machines. In [26], the existence of adaptive distinguishing sequences was studied for BCNs, but the necessary and sufficient condition for the existence given in [26] is not verifiable.

Definition 12

[10, 41] A BCN (3) is called multiple-experiment observable if for every two different initial states \(x(0),x(0)'\in \mathcal {D}^n\), there is an input sequence such that the output sequences corresponding to x(0) and \(x(0)'\) are different. Such an input sequence is called a distinguishing input sequence of x(0) and \(x(0)'\).

If a BCN is multiple-experiment observable, then every two different initial states have a distinguishing input sequence, so the initial state can be determined by doing a finite number of experiments. In the first experiment, assume two different initial states as candidates, and feed one of their distinguishing input sequences into the BCN; then at least one of the obtained output sequences differs from the real output sequence (generated by the real initial state and the input sequence), hence at least one of the two candidates is not the real initial state; mark each candidate that is shown not to be real. In the second experiment, repeat the first experiment with two different candidates chosen from the unmarked initial states. Repeat until only one unmarked initial state is left; that state is the real initial state.

In [10], the notion of observability was not explicitly mentioned, but a polynomial-time algorithm was given to verify multiple-experiment observability of Moore machines (see the proof of Theorem 6 therein), i.e., the algorithm runs in exponential time for BCNs. The algorithm is based on an equivalence relation \( \sim _k\): two states x and \(x'\) have this relation if and only if x and \(x'\) have no distinguishing input sequence of length k. See Sect. 3 for details. The notion of “indistinguishability” was formally defined and verified in [10] as well. In [41], a sufficient but not necessary condition was given for multiple-experiment observability of BCNs.

Definition 13

[33, 39]    A BCN (3) is called strongly multiple-experiment observable if for every initial state \(x(0)\in \mathcal {D}^n\), there exists an input sequence such that for each initial state \(x(0)'\in \mathcal {D}^n\) different from x(0), the output sequences corresponding to x(0) and \(x(0)'\) are different. Such an input sequence is called a distinguishing input sequence of x(0).

If a BCN is strongly multiple-experiment observable, then every initial state has a distinguishing input sequence, and then the initial state can also be determined by doing a finite number of experiments (sometimes only one experiment is enough). In the first experiment, assume one initial state \(x_0'\) as a candidate, and feed one of its distinguishing input sequences U into the BCN; then \(x_0'\) is the real initial state \(x_0\) if and only if the output sequence generated by \(x_0'\) and U is the same as the real output sequence (generated by \(x_0\) and U). With luck, the real initial state is found after the first experiment; otherwise, after repeating this experiment finitely many times with different candidates, the real initial state will be determined. Definition 13 is strictly stronger than Definition 12 [37, 38].

In [39], the distinguishing input sequence of x(0) was called a unique input output sequence of x(0), and its existence problem was proven to be \(\textsf{PSPACE}\)-complete for Mealy machines. Hence, the algorithm runs in doubly exponential time for BCNs. The verification algorithm for the existence of a distinguishing input sequence of x(0) is as follows: apply an input sequence to the pair \((x(0),\{x\in \mathcal {D}^n \vert x\ne x(0), h(x)=h(x(0))\})\) and see whether some pair whose right component is empty can be reached, where in each reachable pair, the right component must be maximal and the states in the right component must produce the same output as the left component. The method proposed in [39] is not equivalent to the partition-based method proposed in [10] (see Sect. 3) or the inverse method proposed in [21] (see Sect. 4). In [33], a necessary and sufficient condition for strong multiple-experiment observability of controllable BCNs was given, which is sufficient but not necessary for general BCNs.

Definition 14

[39, 42]    A BCN (3) is called single-experiment observable if there exists an input sequence such that for every two different initial states \(x(0),x(0)'\in \mathcal {D}^n\), the output sequences corresponding to x(0) and \(x(0)'\) are different. Such an input sequence is called a distinguishing input sequence of (3).

If a BCN is single-experiment observable, then the initial state can be determined by every distinguishing input sequence. Hence, in order to determine the initial state, one only needs to do one experiment. Definition 14 is strictly stronger than Definition 13 [37, 38].

In [39], the single-experiment observability verification problem was proven to be \(\textsf{PSPACE}\)-complete for Mealy machines, where the distinguishing input sequence was called preset distinguishing sequence. Hence, the verification algorithm runs in doubly exponential time for BCNs. The verification algorithm is based on the notion of initial state uncertainty: the initial state uncertainty of an input sequence \(u(0)\dots u(n)\) is the partition induced by the equivalence relation for which two states x(0) and \(x(0)'\) have this relation if and only if \(u(0)\dots u(n)\) is not a distinguishing input sequence of x(0) and \(x(0)'\). Clearly, \(u(0)\dots u(n)\) is a distinguishing input sequence of BCN (3) if and only if the partition is discrete. The method proposed in [39] is not equivalent to the partition-based method proposed in [10] or the inverse method proposed in [21]. In [42], a necessary and sufficient condition for single-experiment observability of controllable BCNs was given, which is sufficient but not necessary for general BCNs.

Definition 15

[43]    A BCN (3) is called arbitrary-experiment observable if for all different initial states \(x(0),x'(0)\in \mathcal {D}^n\), for each input sequence \(u(0)u(1)\dots \), the corresponding output sequences \(y(0)y(1)\dots \) and \(y'(0)y'(1)\dots \) are different.

If a BCN is arbitrary-experiment observable, then the initial state can be determined by every sufficiently long input sequence. Hence, to determine the initial state, one also only needs to do one experiment. Definition 15 is strictly stronger than Definition 14 [37, 38].

In [43], an exponential-time algorithm for arbitrary-experiment observability of BCNs was given based on the equivalence relation \(\sim _k\) defined in [10] (see Sect. 3): a BCN is arbitrary-experiment observable if and only if for every pair of different periodic (state, input)-trajectories of the same minimal period k and the same input trajectory, the corresponding output trajectories are also different and periodic of minimal period k. It is not very convenient to apply this verification algorithm, because one needs first to find an upper bound on k and then check all sequences of length k. Apparently, the method is not equivalent to the inverse method proposed in [21], which is easy to use.

3 Moore’s partition-based method

3.1 The method

The partition-based method given by Moore [10] in 1956 is based on a sequence of equivalence relations. We restate the relations in terms of BCNs instead of Moore machines because BCNs are the model considered in this survey. Consider a BCN (3). For each \(k\in \mathbb {N}\), define an equivalence relation \(\sim _k\) on its state set \(\mathcal {D}^n\): for every two states x and \(x'\) in \(\mathcal {D}^n\), \(x\sim _k x'\) if and only if x and \(x'\) have no k-length distinguishing input sequence. The partition induced by \(\sim _k\) is denoted by \(\mathcal {P}(\sim _k)\). The result in [10, Theorem 6] is as follows.

Theorem 6

[10]    A BCN (3) is multiple-experiment observable if and only if there exists \(k\in \mathbb {N}\) with \(k\le 2^n-1\) such that the partition \(\mathcal {P}(\sim _k)\) is discrete (i.e., all its cells are singletons).

Note that the term “observability” did not explicitly appear in [10]. For two states x and \(x'\), if they have no k-length distinguishing input sequence, then they have no distinguishing input sequence of length \(k-1\). Then for all \(k\in \mathbb {N}\), \(\mathcal {P}(\sim _k)\preceq \mathcal {P}(\sim _{k+1})\).

Moreover, one observes that

$$\begin{aligned} \begin{aligned}&\hbox {for all } k\in \mathbb {N},\hbox { if }\mathcal {P}(\sim _k)=\mathcal {P}(\sim _{k+1})\hbox { then}\\&\quad \mathcal {P}(\sim _{k+1})=\mathcal {P}(\sim _{k+2}). \end{aligned} \end{aligned}$$
(A)

Assume \(\mathcal {P}(\sim _k)=\mathcal {P}(\sim _{k+1})\). Let \(x,x'\in \mathcal {D}^n\) be such that \(x\sim _{k+1}x'\). For an arbitrary \(u\in \mathcal {D}^m\), one has \(f(u,x)\sim _{k} f(u,x')\), hence \(f(u,x)\sim _{k+1}f(u,x')\) by the assumption, and therefore \(x\sim _{k+2}x'\). Hence the chain

$$\begin{aligned} \mathcal {P}(\sim _0)\prec \cdots \prec \mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})=\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}+1})=\cdots \end{aligned}$$
(11)

satisfies that the partitions first strictly refine, and then remain the same, where \({{\,\textrm{ind}\,}}=\min \{l\in \mathbb {N}\vert \mathcal {P}(\sim _l)=\mathcal {P}(\sim _{l+1})\}\). Because the cardinality of \(\mathcal {D}^n\) is \(2^n\), one has \({{\,\textrm{ind}\,}}\le 2^n-1\) and \(\mathcal {P}(\sim _{2^n-1})=\mathcal {P}(\sim _{k})\) for all \(k\ge 2^n\). That is, Theorem 6 holds.

Given a partition \(\mathcal {P}(\sim _0)\), one can compute \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) based on the observation (A) as follows. (1) \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}:=\mathcal {P}(\sim _0)\). (2) Arbitrarily choose a cell C of \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}\) and partition C as follows: two different states \(x,x'\) of C remain in the same cell if and only if for all u, \(f(u,x)\) and \(f(u,x')\) belong to the same cell of \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}\). (3) Repeat (2) until no cell can be divided; the result is \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\). Note that the final result \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) is unique, independent of the order in which cells are divided.
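Steps (1)–(3) above admit a short implementation. The sketch below uses our own helper names and a hypothetical 3-state machine as a demo; the procedure applies verbatim to a BCN with \(2^n\) states.

```python
def moore_partition(states, inputs, f, h):
    """Compute P(~_ind) by iterated refinement: start from the output partition
    P(~_0) and split a cell whenever two of its states move, under some input,
    into different cells of the current partition."""
    cell_of = {x: h(x) for x in states}  # P(~_0): group states by output
    while True:
        # Signature of x: (current cell, cells reached under each input).
        sig = {x: (cell_of[x], tuple(cell_of[f(u, x)] for u in inputs))
               for x in states}
        new_ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new_cell_of = {x: new_ids[sig[x]] for x in states}
        if len(set(new_cell_of.values())) == len(set(cell_of.values())):
            return new_cell_of  # no cell was divided: fixed point reached
        cell_of = new_cell_of

# Hypothetical 3-state demo: states a, b, c; inputs 0, 1.
f_tab = {('0', 'a'): 'c', ('1', 'a'): 'a',
         ('0', 'b'): 'a', ('1', 'b'): 'b',
         ('0', 'c'): 'c', ('1', 'c'): 'c'}
h_tab = {'a': 0, 'b': 0, 'c': 1}
part = moore_partition(['a', 'b', 'c'], ['0', '1'],
                       lambda u, x: f_tab[(u, x)], lambda x: h_tab[x])
assert len(set(part.values())) == 3  # the final partition is discrete here
```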

Example 4

Consider BCN

$$\begin{aligned} \begin{aligned} x(t+1)&= f(u(t),x(t)),\\ y(t)&= h(x(t)), \end{aligned} \end{aligned}$$
(12)

where \(t\in \mathbb {N}\), \(x(t)\in \mathcal {D}^2\), \(u(t),y(t)\in \mathcal {D}\), f and h are as follows:

(a) f:

x        00    01    10    11
\(u=0\)    11    00    10    11
\(u=1\)    11    10    10    11

(b) h:

x    00    01    10    11
y    0     0     0     1

After a series of direct calculations, one has

$$\begin{aligned} \mathcal {P}(\sim _0)&=\{\{00,01,10\},\{11\}\},\\ \mathcal {P}(\sim _1)&=\{\{00\},\{01,10\},\{11\}\},\\ \mathcal {P}(\sim _2)&=\mathcal {P}(\sim _k)=\{\{00\},\{01\},\{10\},\{11\}\} \end{aligned}$$

for all \(k>2\). In \(\mathcal {P}(\sim _0)\), 00, 01, 10 form a cell because \(h(00)=h(01)=h(10)=0\), 11 forms the other cell because \(h(11)=1\ne h(00)\). By applying inputs 0 and 1 on states 01 and 10, they still stay in the same cell of \(\mathcal {P}(\sim _0)\), then 01 and 10 are in the same cell of \(\mathcal {P}(\sim _1)\). By applying input 0 on states 00 and 01, they transition into different cells of \(\mathcal {P}(\sim _0)\), then they belong to different cells of \(\mathcal {P}(\sim _1)\). The same argument indicates that states 00 and 10 belong to different cells of \(\mathcal {P}(\sim _1)\). By applying input 0 on states 01 and 10, they transition into different cells of \(\mathcal {P}(\sim _1)\), then they belong to different cells of \(\mathcal {P}(\sim _2)\). Therefore, \(\mathcal {P}(\sim _2)\) is discrete. By Theorem 6, BCN (12) is multiple-experiment observable. By the process of computing \(\mathcal {P}(\sim _0)\), \(\mathcal {P}(\sim _1)\), and \(\mathcal {P}(\sim _2)\), one can find distinguishing input sequences of every pair of states. The empty word \(\epsilon \) is a distinguishing input sequence of 00 and 11, 01 and 11, and 10 and 11. Input 0 is a distinguishing input sequence of 00 and 01, 00 and 10. Input sequence 00 is a distinguishing input sequence of 01 and 10.
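The distinguishing sequences claimed above can be checked by brute force. The sketch below (helper names are ours) replays the tables of f and h from Example 4.

```python
# Brute-force check of the distinguishing sequences claimed in Example 4.
f = {('0', '00'): '11', ('0', '01'): '00', ('0', '10'): '10', ('0', '11'): '11',
     ('1', '00'): '11', ('1', '01'): '10', ('1', '10'): '10', ('1', '11'): '11'}
h = {'00': '0', '01': '0', '10': '0', '11': '1'}

def outputs(x, word):
    """Output sequence h(x(0)) h(x(1)) ... produced from state x under input word."""
    ys = [h[x]]
    for u in word:
        x = f[(u, x)]
        ys.append(h[x])
    return ''.join(ys)

def distinguishes(word, x, xp):
    return outputs(x, word) != outputs(xp, word)

assert distinguishes('', '00', '11')       # empty word: outputs already differ
assert distinguishes('0', '00', '01')      # input 0 separates 00 from 01
assert distinguishes('0', '00', '10')
assert distinguishes('00', '01', '10')     # input sequence 00 separates 01 and 10
assert not distinguishes('0', '01', '10')  # ...but no length-1 sequence does
assert not distinguishes('1', '01', '10')
```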

3.2 An equivalent semitensor-product representation for Theorem 6

Hereinafter, regard a logical matrix \(L\in \mathcal {L}_{R\times T}\) as a function from \(\varDelta _T\) to \(\varDelta _R\) in the form of \(y=Lx\), where \(x\in \varDelta _T\).

Consider the algebraic form (10) of BCN (3). By definition, the corresponding partition \(\mathcal {P}(\sim _0)\) as defined in (11) is induced by \(H=:G_0\). Furthermore, the corresponding partition \(\mathcal {P}(\sim _1)\) is induced by \(G_0**_{i=1}^{2^m}HL\delta _{2^m}^i=:G_1\in \mathcal {L}_{2^q(1+2^m)\times 2^n}\). Then by induction, it is not difficult to see that for every \(k\in \mathbb {Z}_+\), the partition \(\mathcal {P}(\sim _k)\) is induced by \(G_{k-1}**_{i_1=1}^{2^m}\cdots *_{i_k=1}^{2^m} HL\delta _{2^m}^{i_1}\cdots L\delta _{2^m}^{i_k}=:G_k\in \mathcal {L}_{2^q(1+2^m+\cdots +2^{mk})\times 2^n}\). That is, two states \(x,x'\in \varDelta _{2^n}\) have a distinguishing input sequence of length k if and only if \(G_kx\ne G_kx'\). Hence Theorem 6 can be equivalently represented by the following result.

Proposition 7

A BCN (10) is multiple-experiment observable if and only if there exists \(k\in \mathbb {N}\) with \(k\le 2^n-1\) such that \({\text {rank}}\left( G_k\right) =2^n\).
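Proposition 7 can be checked by brute force on small instances. The sketch below (our own helper names and state ordering) stacks the output matrices \(H L_{i_k}\cdots L_{i_1}\) over all input words of length up to \(2^n-1\); since \(G_k\) is a logical matrix, \({\text {rank}}(G_k)=2^n\) is equivalent to all its columns being distinct, which is what the code tests.

```python
import numpy as np

def multi_experiment_observable(L_blocks, H):
    """Brute-force version of Proposition 7: the BCN is multiple-experiment
    observable iff, for some k <= 2^n - 1, all columns of the stacked output
    matrices over words of length <= k are distinct (rank(G_k) = 2^n)."""
    N = H.shape[1]
    rows = [H]
    prods = [np.eye(N)]
    for _ in range(N - 1):
        prods = [Lj @ P for P in prods for Lj in L_blocks]
        rows += [H @ P for P in prods]
        G = np.vstack(rows)
        if len({tuple(G[:, j]) for j in range(N)}) == N:  # all columns distinct
            return True
    return False

# The BCN of Example 4, with states ordered 00, 01, 10, 11 for illustration.
states = ['00', '01', '10', '11']
f = {('0', '00'): '11', ('0', '01'): '00', ('0', '10'): '10', ('0', '11'): '11',
     ('1', '00'): '11', ('1', '01'): '10', ('1', '10'): '10', ('1', '11'): '11'}
def block(u):
    B = np.zeros((4, 4))
    for j, x in enumerate(states):
        B[states.index(f[(u, x)]), j] = 1.0
    return B
H = np.array([[0, 0, 0, 1],   # y = 1 (encoded as delta_2^1) only on state 11
              [1, 1, 1, 0]])
assert multi_experiment_observable([block('0'), block('1')], H)
```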

In the special case that \(m=0\) (i.e., when the input is constant), Proposition 7 degenerates to the main result obtained in [19].

3.3 Relations with invariant dual subspaces and observability of Boolean networks

In [18], a notion of smallest invariant subspace of a BN (8a) containing a set of Boolean functions was studied and an algorithm was designed to compute such a subspace.

Consider Boolean functions \(z_i=f_i(x_1,\dots ,x_n)\), where \(x_1,\dots ,x_n,z_i\in \mathcal {D}\), \(i\in \llbracket 1,r \rrbracket \). The subspace generated by \(z_1,\dots ,z_r\) is defined by \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\} =\{f(z_1(x_1,\dots ,x_n),\dots ,z_r(x_1,\dots ,x_n))\vert f:\mathcal {D}^r\rightarrow \mathcal {D}\}\), i.e., the set of all Boolean functions of \(z_1,\dots ,z_r\). Let the structure matrix of \(z_i=f_i(x_1,\dots ,x_n)\) be \(F_i\in \mathcal {L}_{2\times 2^n}\), \(i\in \llbracket 1,r \rrbracket \); then \(F=F_1*\cdots *F_r\in \mathcal {L}_{2^r\times 2^n}\) is called the structure matrix of \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\}\). Next we give a partition-based description of \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\}\).

It is not difficult to prove that for all Boolean functions \(z_1,\dots ,z_r:\mathcal {D}^n\rightarrow \mathcal {D}\),

$$\begin{aligned} \mathcal {P}(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\})=\mathcal {P}(\{z_1,\dots ,z_r\}). \end{aligned}$$

Then the following proposition follows.

Proposition 8

Consider Boolean functions \(z_1^1,\dots ,z_1^r\), \(z_2^1,\dots ,z_2^s:\mathcal {D}^n\rightarrow \mathcal {D}\). One has

  • \(\mathcal {P}(\mathcal {F}_{\ell }\{z_1^1,\dots ,z_1^r\})=\mathcal {P}(\mathcal {F}_{\ell }\{z_2^1,\dots ,z_2^s\})\) if and only if

    $$\begin{aligned} \mathcal {P}(\{z_1^1,\dots ,z_1^r\})=\mathcal {P}(\{z_2^1,\dots ,z_2^s\}); \end{aligned}$$
  • furthermore,

    $$\begin{aligned} \mathcal {F}_{\ell }\{z_1^1,\dots ,z_1^r\}=\mathcal {F}_{\ell }\{z_2^1,\dots ,z_2^s\} \end{aligned}$$

    if and only if

    $$\begin{aligned} \mathcal {P}(\{z_1^1,\dots ,z_1^r\})=\mathcal {P}(\{z_2^1,\dots ,z_2^s\}). \end{aligned}$$

That is, the subspace generated by a set of Boolean functions is determined by the partition induced by these Boolean functions. Hence, different sets of Boolean functions may generate the same subspace. A similar feature also appears in linear algebra: different bases may generate the same linear subspace. In addition, the above \(\mathcal {P}\) defines a bijection from \(\mathcal {X}^*\) to \(\varPi \), where \(\mathcal {X}^*\) denotes the set of all subspaces generated by sets of Boolean functions on \(\mathcal {D}^n\), \(\varPi \) denotes the set of all partitions of \(\mathcal {D}^n\).

A subspace \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\}\) is called L-invariant [18] if there exists \(G\in \mathcal {L}_{2^r\times 2^r}\) such that \(FL=GF\), where F is the structure matrix of the subspace, and the BN \(z(t+1)=Gz(t)\) is called the dual BN of (8a) with respect to \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\}\). Since \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\}\) is generated by a set of functions, in the current paper we call it a dual subspace, following the traditional terminology of functional analysis; accordingly, we call an L-invariant subspace (the term used in [18]) an L-invariant dual subspace. A number of properties of L-invariant dual subspaces of (8a) containing a set of Boolean functions were characterized in [18].

We give a new explanation for L-invariant dual subspaces from a partition point of view. A dual subspace \(\mathcal {F}_{\ell }\{z_1,\dots ,z_r\}\) is L-invariant if and only if \(\mathcal {P}(FL)\preceq \mathcal {P}(F)\), i.e., the partition induced by FL is coarser than the partition induced by F, i.e., (from a dynamical-system point of view) the BN (8a) coarsens the partition induced by F, i.e., for every two states \(x,x'\in \varDelta _{2^n}\), if \(Fx=Fx'\) then \(FLx=FLx'\).

The structure matrix of the smallest L-invariant dual subspace of (8a) containing \(\mathcal {F}_{\ell }\{z_1, \dots ,z_r\}\) is computed as \(F*FL*\cdots *FL^{{{\,\textrm{ind}\,}}}=:O_{{{\,\textrm{ind}\,}}}\in \mathcal {L}_{(1+{{\,\textrm{ind}\,}})2^r\times 2^n}\) [18], where \({{\,\textrm{ind}\,}}=\min \{k\in \mathbb {N}\vert {\text {rank}}(O_k)={\text {rank}}(O_{k+1})\}\) and \(O_i=F*FL*\cdots *FL^i\). If we consider \(z(t)=Fx(t)\) as an output function of BN (8a), then, corresponding to the BN \((L,F)\), the partition induced by \(O_{{{\,\textrm{ind}\,}}}\) is exactly Moore’s partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) as defined in (11). This is an essential relation between Moore’s partition and the smallest L-invariant dual subspace of (8a) containing \(\mathcal {F}_{\ell }\{z_1, \dots ,z_r\}\) defined in [18]. By Theorem 6, this smallest L-invariant dual subspace containing F determines the observability of the BN \((L,F)\).
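A numpy sketch of this computation on toy data of our own, assuming (as the stated dimension \((1+{{\,\textrm{ind}\,}})2^r\times 2^n\) indicates) that the product stacks \(F, FL, \dots \) row-wise:

```python
import numpy as np

def smallest_invariant_dual_subspace(F, L):
    """O_ind = stack of F, FL, ..., FL^ind, where ind is the first k with
    rank(O_k) = rank(O_{k+1}) (assuming row-stacking, as the stated
    dimension (1 + ind) 2^r x 2^n indicates)."""
    O = F
    FLi = F
    while True:
        FLi = FLi @ L
        O_next = np.vstack([O, FLi])
        if np.linalg.matrix_rank(O_next) == np.linalg.matrix_rank(O):
            return O
        O = O_next

# Toy data (our own): L swaps states inside each output cell of F, so F itself
# already spans an L-invariant dual subspace and O_ind = F.
L = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
F = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])
O = smallest_invariant_dual_subspace(F, L)
assert np.array_equal(O, F)
# Partition invariance: columns equal under O stay equal under O L.
for j in range(4):
    for k in range(4):
        if np.array_equal(O[:, j], O[:, k]):
            assert np.array_equal((O @ L)[:, j], (O @ L)[:, k])
```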

3.4 Relations with unobservable subspaces of linear time-invariant control systems

Consider the following linear control system:

$$\begin{aligned} \dot{x}(t)&= Ax(t) + v(t),~~ t\ge 0, \end{aligned}$$
(13a)
$$\begin{aligned} y(t)&= Cx(t), \end{aligned}$$
(13b)

where \(A\in \mathbb {R}^{n\times n}\), \(C\in \mathbb {R}^{q\times n}\).

Define \(\ker (CA^{i})=:\mathcal {O}_i\), \(i\in \mathbb {N}\). The unobservable subspace of (CA) is defined as \(\bigcap _{i=0}^{n-1}\mathcal {O}_i\) [3]. It is well known that (13) is observable if and only if \(\bigcap _{i=0}^{n-1}\mathcal {O}_i=\{{\textbf {0}}\}\).
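This kernel test is immediate to compute. The sketch below (a toy pair of our own) obtains the dimension of the unobservable subspace as the rank deficiency of the stacked observability matrix \([C;\,CA;\,\dots ;\,CA^{n-1}]\).

```python
import numpy as np

def unobservable_subspace_dim(A, C):
    """Dimension of the unobservable subspace of (C, A): the kernel of the
    observability matrix [C; CA; ...; CA^{n-1}]."""
    n = A.shape[0]
    O = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
    return n - np.linalg.matrix_rank(O)

# Toy examples (our own): a double integrator with two output choices.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
C_obs = np.array([[1.0, 0.0]])    # position measured: velocity recoverable
C_unobs = np.array([[0.0, 1.0]])  # velocity measured: position invisible
assert unobservable_subspace_dim(A, C_obs) == 0    # observable
assert unobservable_subspace_dim(A, C_unobs) == 1  # one unobservable direction
```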

One directly sees that BCNs (or Moore machines) are discrete in both time and space, whereas linear control systems are continuous in both time and space. Therefore, they look totally different. Is there any similarity between these two types of systems? It is hard to give an affirmative answer at first sight, but there does exist one, and it comes from observability. One can see that the chain

$$\begin{aligned} \mathcal {O}_0\supsetneq \cdots \supsetneq \mathcal {O}_{l} = \mathcal {O}_{l+1} = \cdots \end{aligned}$$
(14)

satisfies that the subspaces first strictly shrink, and then remain the same.

Comparing (14) with (11), one sees that the two chains are quite similar. In (11), the partitions strictly refine and then remain the same; in (14), the subspaces strictly shrink and then remain the same. The former means that the pairs of indistinguishable initial states become fewer and fewer; the latter means that the unobservable initial states become fewer and fewer. Neglecting their mathematical meanings, the two descending chains represent the same phenomenon. This similarity can also be regarded as one between discrete mathematics and continuous mathematics.

3.5 An application to disturbance decoupling of Boolean control networks

In this subsection, we study the disturbance decoupling problem of BCNs with disturbance. After finding solutions to the problem, one will see that the solutions are exactly of Moore’s partition style! This part contains relatively intricate technical details. The reader who is not interested in the technical details can skip them and directly read the theorems and their explanations. The material of this part was chosen from the lecture notes of the course “Finite-State Dynamical Systems” given by the author in the winter semester of 2021/22 at Technical University of Berlin, Germany.

The coordinate-independent disturbance decoupling problem for BNs with disturbance has been completely solved in [17], and was called the original disturbance decoupling problem therein. The method of finding a solution used in this subsection starts from the definition of the problem and is technically different from the method used in [17]. However, when the input is constant, the solution found in this subsection is consistent with the one found in [17] (by comparing Theorem 10 with [17, Theorem 2]). The technique (based on Proposition 11) in the current paper used for checking the conditions in Theorem 10 is technically easier than the one used in [17].

We now formulate the disturbance decoupling problem for a BCN with disturbance:

$$\begin{aligned} \begin{aligned} x(t+1)&= Lx(t)u(t)\xi (t),\\ y(t)&= Hx(t), \end{aligned} \end{aligned}$$
(15)

where \(t=0,1,2\dots \), \(x(t)\in \varDelta _{2^n}\) denotes the state, \(u(t)\in \varDelta _{2^m}\) the input, \(\xi (t)\in \varDelta _{2^p}\) the disturbance, \(y(t)\in \varDelta _{2^q}\) the output at time step t, respectively, \(L\in \mathcal {L}_{2^n\times 2^{n+m+p}}\), \(H\in \mathcal {L}_{2^q\times 2^n}\).

The original disturbance decoupling problem asks whether some state-feedback controller and some coordinate transformation can make the output of (15) unaffected by its disturbance. A state-feedback controller is \(u(t)=Gx(t),\) where \(G\in \mathcal {L}_{2^m\times 2^n}\). A coordinate transformation is a bijection from \(\varDelta _{2^n}\) to \(\varDelta _{2^n}\), equivalently a mapping \(z=Tx,\) where \(z,x\in \varDelta _{2^n}\) and \(T\in \mathcal {L}_{2^n\times 2^n}\) is a nonsingular matrix. Since T is then also a permutation matrix, the inverse transformation is \(x=T^{-1}z=T^{{{\,\textrm{Tr}\,}}}z\). For brevity, denote \(N:=2^n\), \(M:=2^m\), \(P:=2^p\), \(Q:=2^q\).

Problem 1

Consider a BCN (15) with disturbance \(\xi \). Its original disturbance decoupling problem is to check whether there exists a state-feedback controller

$$\begin{aligned} u(t)=Gx(t) \end{aligned}$$
(16)

and a coordinate transformation

$$\begin{aligned} z(t)=Tx(t), \end{aligned}$$
(17)

where \(G\in \mathcal {L}_{M\times N}\), \(T\in \mathcal {L}_{N\times N}\), such that after feeding them into (15), the obtained closed-loop BCN

$$\begin{aligned} z(t+1)&= TL(I_N\otimes G)M_{N_r}T^{{{\,\textrm{Tr}\,}}}z(t)\xi (t), \end{aligned}$$
(18a)
$$\begin{aligned} y(t)&= HT^{{{\,\textrm{Tr}\,}}}z(t) \end{aligned}$$
(18b)

satisfies that \(\xi \) does not affect y.

We next formulate a special case of the original disturbance decoupling problem which is coordinate-dependent.

Problem 2

[31]    Consider a BCN (15) with disturbance \(\xi \). Its coordinate-dependent disturbance decoupling problem is to check whether there exists a state-feedback controller \(u(t)=Gx(t)\) (16) and a coordinate transformation \(z(t)=Tx(t)\) (17), where \(G\in \mathcal {L}_{M\times N}\), \(T\in \mathcal {L}_{N\times N}\), such that after feeding them into (15), one obtains the following decomposed BCN (in which \(\xi \) does not affect y):

$$\begin{aligned} z^1(t+1)&= \bar{L}_1z(t)\xi (t), \end{aligned}$$
(19a)
$$\begin{aligned} z^2(t+1)&= \bar{L}_2z^2(t),\end{aligned}$$
(19b)
$$\begin{aligned} y(t)&= \bar{H}z^2(t), \end{aligned}$$
(19c)

where \(t=0,1,2,\dots \), \(z^1(t)\in \varDelta _S\), \(S=2^s\) for some \(s\in \llbracket 1,n \rrbracket \), \(z^2(t)\in \varDelta _{\bar{S}}\), \(\bar{S}=N/S=2^{\bar{s}}\) with \(\bar{s}=n-s\), \(\bar{L}_1\in \mathcal {L}_{S\times NP}\), \(\bar{L}_2\in \mathcal {L}_{\bar{S}\times \bar{S}}\), \(\bar{H}\in \mathcal {L}_{Q\times \bar{S}}\).

Sometimes, we write “a state-feedback controller \(G\in \mathcal {L}_{M\times N}\)” for short, and write “a coordinate transformation \(T\in \mathcal {L}_{N\times N}\)” for short, where T is nonsingular.

The procedure of finding solutions to Problem 1 is almost the same as that of finding solutions to Problem 2. The latter is a little more complicated, because the decomposed form (19) is more fragmented. We next show how to find solutions to Problem 2.

To solve Problem 2, we first put a state-feedback controller candidate (16) and a coordinate transformation candidate (17) into (15), and obtain the following:

$$\begin{aligned} z(t+1)&= TLx(t)Gx(t)\xi (t) \end{aligned}$$
(20a)
$$\begin{aligned}&= TL(I_N\otimes G) M_{N_r}x(t)\xi (t) \end{aligned}$$
(20b)
$$\begin{aligned}&\quad \ \text {(by Propositions~3 and 4)}\nonumber \\&= TL(I_N\otimes G) M_{N_r}T^{{{\,\textrm{Tr}\,}}}z(t)\xi (t) \end{aligned}$$
(20c)
$$\begin{aligned}&= T[L_1,\dots ,L_N] \begin{bmatrix} G &{} &{} \\ &{} \ddots &{}\\ &{} &{} G \end{bmatrix} \begin{bmatrix} \delta _N^1 &{} &{} \\ &{} \ddots &{}\\ &{} &{} \delta _N^N \end{bmatrix}\nonumber \\&\quad \ltimes T^{{{\,\textrm{Tr}\,}}}z(t)\xi (t) \end{aligned}$$
(20d)
$$\begin{aligned}&= T[L_1,\dots ,L_N] \begin{bmatrix} G_1 &{} &{} \\ &{} \ddots &{}\\ &{} &{} G_N \end{bmatrix} T^{{{\,\textrm{Tr}\,}}}z(t)\xi (t) \end{aligned}$$
(20e)
$$\begin{aligned}&= T[L_1(G_1\otimes I_P),\dots ,L_N(G_N\otimes I_P)]\nonumber \\&\quad \ltimes (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)z(t)\xi (t), \end{aligned}$$
(20f)
$$\begin{aligned} y(t)&= HT^{{{\,\textrm{Tr}\,}}}z(t), \end{aligned}$$
(20g)

where \(L_i\in \mathcal {L}_{N\times MP}\), \(G_i={\text {Col}}_i(G)\in \mathcal {L}_{M\times 1}\), \(i\in \llbracket 1,N \rrbracket \).

For Problem 2 to be solvable, one needs to assume that (20) coincides with (19); one then has

$$\begin{aligned}&\begin{aligned} \bar{L}_1=&(I_S\otimes {\textbf {1}}_{\bar{S}}^{{{\,\textrm{Tr}\,}}})T[L_1(G_1\otimes I_P),\dots ,\\&L_N(G_N\otimes I_P)] (T^{{{\,\textrm{Tr}\,}}}\otimes I_P), \end{aligned} \end{aligned}$$
(21a)
$$\begin{aligned}&\begin{aligned} {\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}} =&({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[L_1(G_1\otimes I_P),\dots ,\\&L_N(G_N\otimes I_P)] (T^{{{\,\textrm{Tr}\,}}}\otimes I_P), \end{aligned} \end{aligned}$$
(21b)
$$\begin{aligned}&{\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes \bar{H} = HT^{{{\,\textrm{Tr}\,}}}, \end{aligned}$$
(21c)

where (21a) and (21b) hold by Proposition 2 (the right-hand side of (21a) is the structure matrix of the first s Boolean functions of (20f), the right-hand side of (21b) is the structure matrix of the last \(n-s\) Boolean functions of (20f)), and because \({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}\) is obtained from (19b) as follows:

$$\begin{aligned} \bar{L}_2 z^2(t)&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}z^1(t))\otimes (\bar{L}_2 z^2(t)) \otimes ({\textbf {1}}_P^{{{\,\textrm{Tr}\,}}} \xi (t))\\&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}) (z^1(t)\otimes z^2(t)\otimes \xi (t))\\&\ \quad \text {(by Proposition~5)}\\&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}) (z^1(t)\ltimes z^2(t)\ltimes \xi (t))\\&=({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}) z(t)\xi (t). \end{aligned}$$

Moreover, one has

$$\begin{aligned}&{\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}} = \underbrace{[\bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}},\dots ,\bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}]}_{S}, \end{aligned}$$
(22a)
$$\begin{aligned}&{\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes \bar{H}= \underbrace{[\bar{H},\dots ,\bar{H}]}_{S}, \end{aligned}$$
(22b)
$$\begin{aligned}&[L_1(G_1\otimes I_P),\dots ,L_N(G_N\otimes I_P)](T^{{{\,\textrm{Tr}\,}}}\otimes I_P) \end{aligned}$$
(23a)
$$\begin{aligned}&= [{{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1),\dots ,{{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N)](T^{{{\,\textrm{Tr}\,}}}\otimes I_P) \end{aligned}$$
(23b)
$$\begin{aligned}&= [\tilde{L}_1,\dots ,\tilde{L}_N]=:\tilde{L}, \end{aligned}$$
(23c)

where \(G_i=\delta _M^{i_{G_i}}\), \(L_i=[{{\,\textrm{Blk}\,}}_1(L_i),\dots ,{{\,\textrm{Blk}\,}}_M(L_i)]\), \({{\,\textrm{Blk}\,}}_j(L_i) \in \mathcal {L}_{N\times P}\), \({{\,\textrm{Blk}\,}}_{i_{G_i}}(L_i) = \tilde{L}_k\) with \(\delta _N^i={\text {Col}}_{k}(T^{{{\,\textrm{Tr}\,}}})\), \(i,k\in \llbracket 1,N \rrbracket \), \(j\in \llbracket 1,M \rrbracket \). (Recall that \(T\) is nonsingular.)

By (21b), (22a), and (23),

$$\begin{aligned}&[({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_1,\dots ,({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_N]\\&=\underbrace{[\bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}},\dots ,\bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}]}_{S}, \end{aligned}$$

where \(({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_i\in \mathcal {L}_{\bar{S}\times P}\), \(i\in \llbracket 1,N \rrbracket .\) Then

$$\begin{aligned}&[({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_{(i-1)\bar{S}+1},\dots ,({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T \tilde{L}_{(i-1)\bar{S}+\bar{S}}] \\&= \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}, \end{aligned}$$

where \(i\in \llbracket 1,S \rrbracket \). That is, for every \(i\in \llbracket 1,\bar{S} \rrbracket \), \(({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_{i}= ({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_{\bar{S}+i}=\cdots = ({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_{(S-1)\bar{S}+i}\), and all columns of \(({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T\tilde{L}_{i}\) are the same.

To sum up, the following result holds.

Theorem 9

Problem 2 has a solution if and only if there exist a state-feedback controller \(G\in \mathcal {L}_{M\times N}\) and a coordinate transformation \(T\in \mathcal {L}_{N\times N}\) such that

$$\begin{aligned}&HT^{{{\,\textrm{Tr}\,}}}\nonumber \\&= {\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes \bar{H}, \end{aligned}$$
(26a)
$$\begin{aligned}&({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[{{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1),\dots ,{{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N)](T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\nonumber \\&= {\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}, \end{aligned}$$
(26b)

where \(S=2^s\) for some \(s\in \llbracket 1,n \rrbracket \), \(\bar{S}=N/S\), \(\bar{H}\in \mathcal {L}_{Q\times \bar{S}}\), \(G=[\delta _M^{1_{G_1}}, \dots ,\delta _M^{N_{G_N}}]\), \(L=[L_1,\dots ,L_N]\) with \(L_i\in \mathcal {L}_{N\times MP}\), \(L_i=[{{\,\textrm{Blk}\,}}_1(L_i),\dots ,{{\,\textrm{Blk}\,}}_M(L_i)]\) with \({{\,\textrm{Blk}\,}}_j(L_i)\in \mathcal {L}_{N\times P}\), \(i\in \llbracket 1,N \rrbracket \), \(j\in \llbracket 1,M \rrbracket \), \(\bar{L}_2\in \mathcal {L}_{ \bar{S}\times \bar{S}}\).

At this point, the relationship between the solutions to Problem 2 and Moore’s partition is not yet apparent, because no partition appears explicitly in (26).

Because there are finitely many state-feedback controllers and finitely many coordinate transformations, one can verify whether Problem 2 has a solution using Theorem 9.

Next, based on Theorem 9, we reveal the relationship between the solutions to Problem 2 and Moore’s partition. The closed-loop BCN in Theorem 9 is

$$\begin{aligned}&\begin{aligned} z(t+1)=&T[{{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1),\dots ,{{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N)]\\&\ltimes (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)z(t)\xi (t), \end{aligned} \end{aligned}$$
(27a)
$$\begin{aligned}&y(t)= HT^{{{\,\textrm{Tr}\,}}} z(t), \end{aligned}$$
(27b)

where \(t=0,1,2,\dots \).

By (26), one sees that for each \(i\in \llbracket 1,N \rrbracket \), all columns of \(({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T{{\,\textrm{Blk}\,}}_{i_{G_i}}(L_i)\) are the same; for all \(i,i'\in \llbracket 1,S \rrbracket \), \(j\in \llbracket 1,\bar{S} \rrbracket \), and \(k,k'\in \llbracket 1,P \rrbracket \), it holds that \(HT^{{{\,\textrm{Tr}\,}}}\delta _S^i\delta _{\bar{S}}^j = HT^{{{\,\textrm{Tr}\,}}}\delta _S^{i'}\delta _{\bar{S}}^j\) and

$$\begin{aligned}&({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[{{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1),\dots ,{{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N)]\nonumber \\&\ltimes (T^{{{\,\textrm{Tr}\,}}}\otimes I_P) \delta _S^i\delta _{\bar{S}}^j\delta _P^k \end{aligned}$$
(28a)
$$\begin{aligned}&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[{{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1),\dots ,{{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N)]\nonumber \\&\quad \ltimes (T^{{{\,\textrm{Tr}\,}}}\otimes I_P) \delta _S^{i'}\delta _{\bar{S}}^j\delta _P^{k'}. \end{aligned}$$
(28b)

Then one can construct a partition of the state space \(\varDelta _N\) as \(\{\{\delta _S^i\delta _{\bar{S}}^j\vert i\in \llbracket 1,S \rrbracket \}\vert j\in \llbracket 1,\bar{S}\rrbracket \}\) with \(\bar{S}\) cells. Denoting each cell \(\{\delta _S^i\delta _{\bar{S}}^j\vert i\in \llbracket 1,S \rrbracket \}\) by \([\delta _{\bar{S}}^j]_{\sim }\), one can construct the following quotient BCN:

$$\begin{aligned}&\begin{aligned} {[z^2(t+1)]}_{\sim }&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[{\text {Col}}_1({{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1)),\dots ,\\&{\text {Col}}_1({{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N))]T^{{{\,\textrm{Tr}\,}}}[z^2(t)]_{\sim },\\ \end{aligned} \end{aligned}$$
(29a)
$$\begin{aligned}&\qquad \qquad y(t) = HT^{{{\,\textrm{Tr}\,}}} [z^2(t)]_{\sim }, \end{aligned}$$
(29b)

where \(t=0,1,2,\dots \), \(z^2(t)\in \varDelta _{\bar{S}}\),

$$\begin{aligned}&({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[{\text {Col}}_1({{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1)),\dots ,\nonumber \\&{\text {Col}}_1({{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N))]T^{{{\,\textrm{Tr}\,}}}[z^2(t)]_{\sim }\\&:=[({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})T[{{\,\textrm{Blk}\,}}_{1_{G_1}}(L_1),\dots ,{{\,\textrm{Blk}\,}}_{N_{G_N}}(L_N)]\nonumber \\&\quad \ \ltimes (T^{{{\,\textrm{Tr}\,}}}\otimes I_P) \delta _S^iz^2(t)\delta _P^k]_{\sim },\\&HT^{{{\,\textrm{Tr}\,}}}[z^2(t)]_{\sim }:=[HT^{{{\,\textrm{Tr}\,}}}\delta _S^iz^2(t)]_{\sim }, \end{aligned}$$

for any \(i\in \llbracket 1,S \rrbracket \) and \(k\in \llbracket 1,P \rrbracket \).

Because a coordinate transformation \(T\) merely renames the vertices in the state-transition graph of a BCN, and the quotient BCN (29) is equivalently obtained from a BCN (15) whose disturbance decoupling problem has a solution, we next give a new necessary and sufficient condition for Problem 2 to have a solution that does not involve \(T\). Substituting a state-feedback controller (16) into (15), we obtain the following closed-loop BCN

$$\begin{aligned} x(t+1)&= [L_1(G_1\otimes I_P),\dots , \end{aligned}$$
(30a)
$$\begin{aligned}&\quad \ L_N(G_N\otimes I_P)] x(t)\xi (t), \end{aligned}$$
(30b)
$$\begin{aligned}&=: \hat{L}x(t)\xi (t), \end{aligned}$$
(30c)
$$\begin{aligned} y(t)&= Hx(t), \end{aligned}$$
(30d)

where \(t=0,1,2,\dots \), \(G_i\) and \(L_i\) are the same as those in (20), \(L_i(G_i\otimes I_P)={{\,\textrm{Blk}\,}}_{i_{G_i}} (L_i)\) as in (27a), \(i\in \llbracket 1,N \rrbracket \).

Theorem 10

Problem 2 has a solution if and only if there exist a state-feedback controller \(G\in \mathcal {L}_{M\times N}\) and a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) of the state space \(\varDelta _N\) such that

(i) all the cells of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) have the same cardinality \(S=2^{s}\) for some \(s\in \llbracket 1, n\rrbracket \);

(ii) for every two states \(x,x'\) of (30) in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\), it holds that \(Hx=Hx'\);

(iii) for every two states \(x,x'\) of (30) in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) and every \(k,k'\in \llbracket 1,P \rrbracket \), \(\hat{L} x\delta _P^k\) and \(\hat{L} x'\delta _P^{k'}\) are also in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\).

Proof

“if”: By (i) and (ii), write the partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) as

$$\begin{aligned} \{\{\delta _N^{1_1},\dots ,\delta _N^{1_S}\},\dots ,\{\delta _N^{\bar{S}_1},\dots ,\delta _N^{\bar{S}_S}\}\}, \end{aligned}$$

where \(\bigcup _{i=1}^{\bar{S}}\{\delta _N^{i_1},\dots ,\delta _N^{i_S}\}=\varDelta _N\) and, for all \(i\in \llbracket 1,\bar{S} \rrbracket \) and all \(j,j'\in \llbracket 1,S \rrbracket \), \(H\delta _N^{i_j}=H\delta _N^{i_{j'}}\). Define the coordinate transformation

$$\begin{aligned} z(t) = T x(t), \end{aligned}$$
(31)

with \(T\in \mathcal {L}_{N\times N}\) as follows:

$$\begin{aligned}&{\text {Col}}_{i_j}(T)= \delta _N^{(j-1)\bar{S}+i}, \end{aligned}$$
(32)
$$\begin{aligned}&{\text {Col}}_{(j-1)\bar{S}+i}(T^{{{\,\textrm{Tr}\,}}})= \delta _N^{i_j}, \end{aligned}$$
(33)

where \(i\in \llbracket 1,\bar{S} \rrbracket \), \(j\in \llbracket 1,S \rrbracket \). Then by (ii), for all \(j,j'\in \llbracket 1,S \rrbracket \) and \(i\in \llbracket 1,\bar{S} \rrbracket \),

$$\begin{aligned} HT^{{{\,\textrm{Tr}\,}}}\delta _S^j\delta _{\bar{S}}^i=H\delta _N^{i_j}=H\delta _N^{i_{j'}}=HT^{{{\,\textrm{Tr}\,}}}\delta _S^{j'}\delta _{\bar{S}}^{i}. \end{aligned}$$

That is,

$$\begin{aligned} HT^{{{\,\textrm{Tr}\,}}} =: {\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes \bar{H}, \end{aligned}$$
(34)

i.e., (26a) is satisfied. By (iii), for all \(i\in \llbracket 1,\bar{S} \rrbracket \), \(j,j'\in \llbracket 1,S \rrbracket \), and \(k,k'\in \llbracket 1,P \rrbracket \),

$$\begin{aligned}&\hat{L} \delta _N^{i_j}\delta _P^k \end{aligned}$$
(35a)
$$\begin{aligned}&= \hat{L} T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j-1)\bar{S}+i}\delta _P^k \end{aligned}$$
(35b)
$$\begin{aligned}&=\hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^j\delta _{\bar{S}}^i\delta _P^k \end{aligned}$$
(35c)

and

$$\begin{aligned}&\hat{L} \delta _N^{i_{j'}}\delta _P^{k'} \end{aligned}$$
(36a)
$$\begin{aligned}&= \hat{L} T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j'-1)\bar{S}+i}\delta _P^{k'} \end{aligned}$$
(36b)
$$\begin{aligned}&= \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^{j'}\delta _{\bar{S}}^i\delta _P^{k'} \end{aligned}$$
(36c)

are in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\). Then we can denote the right-hand side of (35c) and the right-hand side of (36c) by \(\delta _N^{l_t}\) and \(\delta _N^{l_{t'}}\), respectively, where \(l\in \llbracket 1,\bar{S} \rrbracket \), \(t,t'\in \llbracket 1,S \rrbracket \). Then

$$\begin{aligned}&T\delta _N^{l_t}= \delta _N^{(t-1)\bar{S}+l}=\delta _S^t\delta _{\bar{S}}^l,\\&T\delta _N^{l_{t'}}= \delta _N^{(t'-1)\bar{S}+l}=\delta _S^{t'}\delta _{\bar{S}}^l,\\&\delta _S^t\delta _{\bar{S}}^l= T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^j\delta _{\bar{S}}^i\delta _P^k,\\&\delta _S^{t'}\delta _{\bar{S}}^l= T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^{j'}\delta _{\bar{S}}^i\delta _P^{k'}. \end{aligned}$$

Furthermore,

$$\begin{aligned} \delta _{\bar{S}}^l&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) \delta _S^t\delta _{\bar{S}}^l \\&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^j\delta _{\bar{S}}^i\delta _P^k,\\ \delta _{\bar{S}}^l&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) \delta _S^{t'}\delta _{\bar{S}}^l \\&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^{j'}\delta _{\bar{S}}^i\delta _P^{k'}. \end{aligned}$$

That is, the value of \(({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^j\delta _{\bar{S}}^i\delta _P^k\) does not depend on \(j\) or \(k\). Hence

$$\begin{aligned} ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)= {\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes \bar{L}_2 \otimes {\textbf {1}}_{P}^{{{\,\textrm{Tr}\,}}} \end{aligned}$$

for some \(\bar{L}_2\in \mathcal {L}_{\bar{S}\times \bar{S}}\), i.e., (26b) is satisfied.

By Theorem 9, Problem 2 has a solution, namely the state-feedback controller \(G\) together with the coordinate transformation (31).

“only if”: By Theorem 9, we construct the partition

$$\begin{aligned}&\mathcal {P}_{{{\,\textrm{ddp}\,}}} \nonumber \\&= \big \{\big \{T^{{{\,\textrm{Tr}\,}}}\delta _N^i,T^{{{\,\textrm{Tr}\,}}}\delta _N^{\bar{S}+i},\dots ,T^{{{\,\textrm{Tr}\,}}}\delta _N^{(S-1)\bar{S}+i}\big \}\big \vert i\in \llbracket 1, \bar{S}\rrbracket \big \} \end{aligned}$$
(37)

of \(\varDelta _N\) which satisfies (i). By (26a), for all \(i\in \llbracket 1,\bar{S}\rrbracket \) and \(j,j'\in \llbracket 1,S\rrbracket \),

$$\begin{aligned}&HT^{{{\,\textrm{Tr}\,}}}\delta _N^{(j-1)\bar{S}+i}\\&= HT^{{{\,\textrm{Tr}\,}}}\delta _S^{j}\delta _{\bar{S}}^i = \bar{H}\delta _{\bar{S}}^i = HT^{{{\,\textrm{Tr}\,}}}\delta _S^{j'}\delta _{\bar{S}}^i = HT^{{{\,\textrm{Tr}\,}}}\delta _N^{(j'-1)\bar{S}+i}, \end{aligned}$$

which satisfies (ii). By (26b),

$$\begin{aligned} ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P) = {\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}}, \end{aligned}$$

where \(\bar{L}_2\in \mathcal {L}_{\bar{S}\times \bar{S}}\). For all \(i\in \llbracket 1, \bar{S}\rrbracket \), \(j,j'\in \llbracket 1, S\rrbracket \), and \(k,k'\in \llbracket 1,P \rrbracket \), the states \(T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j-1)\bar{S}+i}\) and \(T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j'-1)\bar{S}+i}\) lie in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) in (37), and

$$\begin{aligned}&({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j-1)\bar{S}+i}\delta _P^k\\&=({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^j\delta _{\bar{S}}^i\delta _P^k\\&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}})\delta _S^j\delta _{\bar{S}}^i\delta _P^k\\&= \bar{L}_2\delta _{\bar{S}}^i,\\&({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j'-1)\bar{S}+i}\delta _P^{k'}\\&=({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}}) T \hat{L} (T^{{{\,\textrm{Tr}\,}}}\otimes I_P)\delta _S^{j'}\delta _{\bar{S}}^i\delta _P^{k'}\\&= ({\textbf {1}}_S^{{{\,\textrm{Tr}\,}}} \otimes \bar{L}_2\otimes {\textbf {1}}_P^{{{\,\textrm{Tr}\,}}})\delta _S^{j'}\delta _{\bar{S}}^i\delta _P^{k'}\\&= \bar{L}_2\delta _{\bar{S}}^i, \end{aligned}$$

hence \( \hat{L} T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j-1)\bar{S}+i}\delta _P^k\) and \( \hat{L} T^{{{\,\textrm{Tr}\,}}}\delta _N^{(j'-1)\bar{S}+i}\delta _P^{k'}\) are in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) as in (37), which satisfies (iii). \(\square \)
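Conditions (i), (ii), and (iii) of Theorem 10 can be checked mechanically for any candidate partition. The following Python sketch is purely illustrative (not part of the original development): it assumes the closed-loop transition (30) is encoded as a hypothetical table `succ[(x, k)]` (successor of state `x` under disturbance value `k`) and the output function as `out[x]`, with states, disturbances, and outputs represented by 1-indexed integers.

```python
from math import log2

def check_ddp_partition(cells, succ, out, n):
    """Check conditions (i), (ii), (iii) of Theorem 10 for a candidate
    partition of a state space of size N = 2**n.

    cells: list of sets of states (the candidate partition P_ddp);
    succ[(x, k)]: closed-loop successor of x under disturbance k;
    out[x]: output of state x.  All encodings are illustrative.
    """
    cell_of = {x: i for i, c in enumerate(cells) for x in c}
    # (i): every cell has the same cardinality S = 2**s, 1 <= s <= n
    sizes = {len(c) for c in cells}
    if len(sizes) != 1:
        return False
    s = log2(sizes.pop())
    if s != int(s) or not 1 <= s <= n:
        return False
    ks = sorted({k for (_, k) in succ})
    for c in cells:
        # (ii): all states of a cell produce the same output
        if len({out[x] for x in c}) != 1:
            return False
        # (iii): all disturbance successors of a cell lie in one cell
        if len({cell_of[succ[(x, k)]] for x in c for k in ks}) != 1:
            return False
    return True
```

The finiteness of the state space is what makes such a direct check possible, in contrast with the geometric conditions used for continuous-state systems.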

Remark 1

The main result obtained in [17] is as follows: for a BCN (15) with constant input, i.e., with \(m=0\), Problem 2 has a solution if and only if the state-transition graph of the reduced (15) has an equal, concolorous, perfect partition of its vertex set, where “equal”, “concolorous”, and “perfect” coincide with (i), (ii), and (iii) of Theorem 10, respectively. The method for checking the existence of such a partition designed in [17] is technically more involved than the method developed below based on Proposition 11. The interested reader may compare their efficiency.

By Theorem 10, to check whether Problem 2 has a solution, one needs to check whether there is a state-feedback controller \(G\in \mathcal {L}_{M\times N}\) such that for the closed-loop BCN (30) obtained by feeding G into BCN (15), there is a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) of state space \(\varDelta _N\) satisfying (i), (ii), and (iii). One can see that \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) is very similar to and finer thanFootnote 4 Moore’s partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) as in (11) if one regards \(\xi \) as input; hence \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) can be regarded as a Moore-style partition. Comparing \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) with \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\), one sees only minor differences between them: (a) in \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\), the cardinalities of all cells need not be the same, (b) in \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\), each input must drive the states of one cell into one cell, but different inputs may drive the states of one cell into different cells, (c) \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) is unique with respect to BCN (3), but \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) may not be unique with respect to (30).

We define a relation \(\sim _{{{\,\textrm{ddp}\,}}}\subseteq \varDelta _N\times \varDelta _N\) with respect to the closed-loop BCN (30) as follows: two states \(x,x'\) have relation \(\sim _{{{\,\textrm{ddp}\,}}}\) if and only if for every two disturbance sequences \(\xi _1\dots \xi _r\) and \(\xi _1'\dots \xi _r'\), where \(r\in \mathbb {N}\) (\(r=0\) means \(\xi _1\dots \xi _r=\xi _1'\dots \xi _r'=\epsilon \)), the output sequence generated by \(x\) and \(\xi _1\dots \xi _r\) and the one generated by \(x'\) and \(\xi _1'\dots \xi _r'\) are the same. The relation \(\sim _{{{\,\textrm{ddp}\,}}}\) is symmetric and transitive, but not necessarily reflexive. As introduced in Sect. 2.2, \(\sim _{{{\,\textrm{ddp}\,}}}\) induces a partition \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) of \(\varDelta _N\). By definition, the equivalence relation that induces a given partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) satisfying (ii) and (iii) is finer than \(\sim _{{{\,\textrm{ddp}\,}}}\). Furthermore, we have the following result.

Proposition 11

A closed-loop BCN (30) admits a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) satisfying (ii) and (iii) of Theorem 10 if and only if the relation \(\sim _{{{\,\textrm{ddp}\,}}}\) corresponding to (30) is reflexive.

Proof

“if”: Assume the relation \(\sim _{{{\,\textrm{ddp}\,}}}\) corresponding to (30) is reflexive; then \(\sim _{{{\,\textrm{ddp}\,}}}\) is an equivalence relation. For every two states \(x,x'\) having relation \(\sim _{{{\,\textrm{ddp}\,}}}\), choosing the empty disturbance sequence \(\epsilon \), by definition we have \(Hx=Hx'\); that is, \(\sim _{{{\,\textrm{ddp}\,}}}\) satisfies (ii). Let \(x,x'\) be two states having relation \(\sim _{{{\,\textrm{ddp}\,}}}\) and let \(\xi \xi _1\dots \xi _r\) and \(\xi '\xi _1'\dots \xi _r'\) be two disturbance sequences; by definition the output sequence generated by \(x\) and \(\xi \xi _1\dots \xi _r\) is the same as the output sequence generated by \(x'\) and \(\xi '\xi _1'\dots \xi _r'\). Then the output sequence generated by \(\hat{L}x\xi \) and \(\xi _1\dots \xi _r\) is the same as the output sequence generated by \(\hat{L}x'\xi '\) and \(\xi _1'\dots \xi _r'\). Since \(\xi _1\dots \xi _r\) and \(\xi _1'\dots \xi _r'\) are arbitrary, \(\hat{L}x\xi \) and \(\hat{L}x'\xi '\) have relation \(\sim _{{{\,\textrm{ddp}\,}}}\); that is, \(\sim _{{{\,\textrm{ddp}\,}}}\) satisfies (iii). Note that a nonreflexive relation \(\sim _{{{\,\textrm{ddp}\,}}}\) need not satisfy (ii) and (iii), because the states \(x\) and \(x'\) above are not necessarily different.

“only if”: Assume (30) admits a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) satisfying (ii) and (iii). By definition, every two states \(x,x'\) in the same cell of \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) satisfy that for all disturbance sequences \(\xi _1\dots \xi _r\) and \(\xi _1'\dots \xi _r'\), the output sequences generated by \(x\) and \(\xi _1\dots \xi _r\) and by \(x'\) and \(\xi _1'\dots \xi _r'\) are the same. Hence the equivalence relation inducing \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) is finer than \(\sim _{{{\,\textrm{ddp}\,}}}\). Compress \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) by repeating the following action until no two such cells exist: merge two cells \(C_1,C_2\) such that the states of \(C_1\) and \(C_2\) produce the same output and transition into the same cell. The resulting partition coincides with \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\); in particular, every state lies in a cell of \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\), so \(\sim _{{{\,\textrm{ddp}\,}}}\) is reflexive. \(\square \)

Corollary 1

If the relation \(\sim _{{{\,\textrm{ddp}\,}}}\) corresponding to a closed-loop BCN (30) is reflexive, then the induced partition \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) is the coarsest among the partitions \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) satisfying (ii) and (iii) of Theorem 10.

By Proposition 11, the existence of a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) satisfying (ii) and (iii) of Theorem 10 can be checked by computing the relation \(\sim _{{{\,\textrm{ddp}\,}}}\) as follows (since \(\sim _{{{\,\textrm{ddp}\,}}}\) is symmetric and transitive, it induces a partition \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) of \(\varDelta _N\) as introduced in Sect. 2.2, so the following procedure works). (a) \(\mathcal {P}_{{{\,\textrm{tmp}\,}}} :=\mathcal {P}(\sim _0)\). (b) Arbitrarily choose a cell \(C\) of \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}\) and partition \(C\) as follows: two different states \(x,x'\) of \(C\) remain in the same cell if and only if for all disturbances \(\xi ,\xi '\), \(\hat{L}x\xi \) and \(\hat{L} x'\xi '\) are in the same cell of the current \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}\). (c) Repeat (b) until no cell of \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}\) can be divided; the final \(\mathcal {P}_{{{\,\textrm{tmp}\,}}}\) is \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\). As introduced in Sect. 2.2, \(\sim _{{{\,\textrm{ddp}\,}}}\) is reflexive if and only if for every singleton cell \(\{x\}\) of \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\), \(x\) has relation \(\sim _{{{\,\textrm{ddp}\,}}}\) with itself. Moreover, for such an \(x\), \(x\sim _{{{\,\textrm{ddp}\,}}}x\) if and only if for every two disturbances \(\xi ,\xi '\), \(\hat{L}x\xi \) and \(\hat{L} x\xi '\) are in the same cell of \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\).
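Steps (a), (b), and (c) form a standard partition-refinement computation. A minimal Python sketch is given below; it assumes the closed-loop transition of (30) is encoded as a table `succ[(x, k)]` (successor of state `x` under disturbance value `k`) and the output function as `out[x]`, with 1-indexed integer states. These encodings and all function names are illustrative, not from the original text.

```python
def compute_P_ddp(states, disturbances, succ, out):
    """Refinement procedure (a)-(c): start from the output partition
    P(~0) and split cells until no cell can be divided; the result is
    the partition P(~ddp)."""
    # (a) P_tmp := P(~0): group the states by their output
    by_out = {}
    for x in states:
        by_out.setdefault(out[x], set()).add(x)
    partition = list(by_out.values())

    changed = True
    while changed:                                  # (c) until stable
        changed = False
        cell_of = {x: i for i, c in enumerate(partition) for x in c}
        refined = []
        for c in partition:                         # (b) split a cell
            groups, loners = {}, []
            for x in c:
                # cells reachable from x under the disturbances
                reach = {cell_of[succ[(x, k)]] for k in disturbances}
                if len(reach) == 1:
                    groups.setdefault(reach.pop(), set()).add(x)
                else:
                    # x cannot share a cell with any other state
                    loners.append({x})
            pieces = list(groups.values()) + loners
            if len(pieces) > 1:
                changed = True
            refined.extend(pieces)
        partition = refined
    return partition

def ddp_reflexive(partition, disturbances, succ):
    """~ddp is reflexive iff every state sends all its disturbance
    successors into a single cell of P(~ddp)."""
    cell_of = {x: i for i, c in enumerate(partition) for x in c}
    return all(len({cell_of[succ[(x, k)]] for k in disturbances}) == 1
               for c in partition for x in c)
```

A state whose one-step successors fall into more than one cell cannot remain with any other state, so it is split off as a singleton; after stabilization, only singleton cells can witness non-reflexivity, which matches the remark on singleton cells above.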

Now we know how to check whether a closed-loop BCN (30) admits a partition satisfying (ii) and (iii) of Theorem 10. If the obtained \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) has all cells with the same cardinality, then \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) also satisfies (i); otherwise, one needs to check if there is a refinement of \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) that satisfies (i), (ii), and (iii) simultaneously.

Remark 2

Problem 1 is coordinate-independent, so when looking for a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) as in Theorem 10, (i) plays no role. We thus obtain a reformulation of Theorem 10: Problem 1 has a solution if and only if there exist a state-feedback controller \(G\in \mathcal {L}_{M\times N}\) and a partition \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) of the state space \(\varDelta _N\) such that (ii) and (iii) of Theorem 10 are satisfied. The existence of such a \(\mathcal {P}_{{{\,\textrm{ddp}\,}}}\) can be checked by computing \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) and then checking whether \(\sim _{{{\,\textrm{ddp}\,}}}\) is reflexive by Proposition 11.

Example 5

Consider the following BCN:

$$\begin{aligned}&\begin{aligned} x(t+1) = \delta _8[&2,8,1,2,2,8,1,2,\\&1,7,1,2,3,5,1,2,\\&1,1,1,2,3,3,1,2,\\&2,2,1,2,2,2,1,2]x(t)u(t)\xi (t), \end{aligned} \end{aligned}$$
(38a)
$$\begin{aligned}&y(t) = \delta _2[1,2,2,1,2,1,1,2]x(t), \end{aligned}$$
(38b)

where \(t=0,1,2,\dots \), \(x(t)\in \varDelta _8\), \(u(t)\in \varDelta \), \(\xi (t)\in \varDelta \), and \(y(t)\in \varDelta \) denote the state, input, disturbance, and output at time t, respectively.

We next check whether the coordinate-dependent disturbance decoupling problem of BCN (38) is solvable and, if so, find a solution to the problem.

Denote

$$\begin{aligned} L = \delta _8[&2,8,1,2,2,8,1,2,\\&1,7,1,2,3,5,1,2,\\&1,1,1,2,3,3,1,2,\\&2,2,1,2,2,2,1,2],\\ H = \delta _2[&1,2,2,1,2,1,1,2]. \end{aligned}$$

Substituting a state-feedback controller candidate

$$\begin{aligned} u(t) = Gx(t), \end{aligned}$$
(39)

where \(G\in \mathcal {L}_{2\times 8}\), into (38), by (30) we obtain

$$\begin{aligned} x(t+1) = [L_1{\text {Col}}_1(G),\dots ,L_8{\text {Col}}_8(G)] x(t)\xi (t), \end{aligned}$$
(40)

where \(L_i\in \mathcal {L}_{8\times 4}\), \({\text {Col}}_i(G)\in \mathcal {L}_{2\times 1}\), \(i\in \llbracket 1, 8 \rrbracket \). When \({\text {Col}}_i(G)=\delta _2^1\) (resp., \(\delta _2^2\)), \(L_i{\text {Col}}_i(G)\) is the first (resp., last) two columns of \(L_i\).

We may as well first choose \(G=\delta _2[1,1,1,1,1,1,1,1]\); then (40) is specified as

$$\begin{aligned}&x(t+1)\nonumber \\&= \delta _8[2,8,2,8,1,7,3,5,1,1,3,3,2,2,2,2] x(t)\xi (t) \end{aligned}$$
(41a)
$$\begin{aligned}&=: \bar{L}x(t)\xi (t). \end{aligned}$$
(41b)
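The specialization from (40) to (41) amounts to a column selection: with the ordering \(x(t)u(t)\xi (t)\) in (38a), the column of the closed-loop matrix for the pair \((x,\xi )\) is column \((x-1)MP+(G_x-1)P+\xi \) of \(L\), where \(G_x\) is the index with \({\text {Col}}_x(G)=\delta _2^{G_x}\). A Python sketch of this selection (1-indexed; the names are illustrative):

```python
# Columns of L in (38a), ordered by (x, u, xi); for each state x the
# four columns correspond to (u, xi) = (1,1), (1,2), (2,1), (2,2).
L_cols = [2, 8, 1, 2,   2, 8, 1, 2,   1, 7, 1, 2,   3, 5, 1, 2,
          1, 1, 1, 2,   3, 3, 1, 2,   2, 2, 1, 2,   2, 2, 1, 2]
M, P = 2, 2          # numbers of input and disturbance values
G = [1] * 8          # Col_x(G) = delta_2^1 for every state x

def closed_loop_columns(L_cols, G, M, P):
    """Select, for each state x and disturbance xi, column
    (x-1)*M*P + (G[x-1]-1)*P + xi of L (all indices 1-based)."""
    N = len(L_cols) // (M * P)
    return [L_cols[(x - 1) * M * P + (G[x - 1] - 1) * P + (xi - 1)]
            for x in range(1, N + 1) for xi in range(1, P + 1)]

# reproduces the columns of the closed-loop matrix in (41)
print(closed_loop_columns(L_cols, G, M, P))
# [2, 8, 2, 8, 1, 7, 3, 5, 1, 1, 3, 3, 2, 2, 2, 2]
```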

For all \(\xi ',\xi '' \in \varDelta \),

$$\begin{aligned} \bar{L}\delta _8^1\xi '&= \delta _8^2\text {~~or~~}\delta _8^8, ~~ \bar{L}\delta _8^7\xi '' = \delta _8^2,\\ \bar{L}\delta _8^4\xi '&= \delta _8^3\text {~~or~~}\delta _8^5, ~~ \bar{L}\delta _8^6\xi '' = \delta _8^3,\\ \bar{L}\delta _8^2\xi '&= \delta _8^2\text {~~or~~}\delta _8^8, ~~ \bar{L}\delta _8^8\xi '' = \delta _8^2,\\ \bar{L}\delta _8^3\xi '&= \delta _8^1\text {~~or~~}\delta _8^7, ~~ \bar{L}\delta _8^5\xi '' = \delta _8^1. \end{aligned}$$


By the output function (38b), we obtain the partition \(\mathcal {P}(\sim _0)\) \(=\{C_1,C_2\}\) of the state space \(\varDelta _8\), where \(C_1=\{\delta _8^1,\delta _8^4,\delta _8^6,\delta _8^7\}\) and \(C_2=\{\delta _8^2,\delta _8^3,\delta _8^5,\delta _8^8\}\), such that the states in the same cell (i.e., \(C_1\) or \(C_2\)) produce the same output. Choosing \(C_2\): under disturbances, \(\delta _8^2,\delta _8^8\) remain in \(C_2\), but \(\delta _8^3,\delta _8^5\) transition into \(C_1\), so we partition \(C_2\) into \(C_{21}=\{\delta _8^2,\delta _8^8\}\) and \(C_{22}=\{\delta _8^3,\delta _8^5\}\) and obtain \(\mathcal {P}(\sim _1)\). Analogously, we partition \(C_1\) into \(C_{11}=\{\delta _8^1,\delta _8^7\}\) and \(C_{12}=\{\delta _8^4, \delta _8^6\}\) and obtain \(\mathcal {P}(\sim _2)\). \(\mathcal {P}(\sim _2)\) cannot be divided any more and contains no singleton cell, so \(\mathcal {P}(\sim _2)=\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) and \(\sim _{{{\,\textrm{ddp}\,}}}\) is reflexive. Moreover, all cells of \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) have the same cardinality, so \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) satisfies (i), (ii), and (iii) of Theorem 10. Hence the disturbance decoupling problem of (38) has a solution (starting from either state of each of the four cells, \(\xi \) does not affect \(y\)).

To rename

$$\begin{aligned} \delta _8^1,\delta _8^7,\delta _8^4,\delta _8^6,\delta _8^2,\delta _8^8,\delta _8^3,\delta _8^5 \end{aligned}$$

to

$$\begin{aligned} \delta _8^1,\delta _8^5,\delta _8^4,\delta _8^8,\delta _8^2,\delta _8^6,\delta _8^3,\delta _8^7, \end{aligned}$$

respectively, we construct the coordinate transformation

$$\begin{aligned} z(t) = Tx(t), \end{aligned}$$
(42)

where \(T=\delta _8[1,2,3,4,7,8,5,6]\). Substituting \(x(t) = T^{{{\,\textrm{Tr}\,}}}\ltimes z(t)\) into (41) and the output function (38b), we obtain

$$\begin{aligned} z(t+1)&= T\bar{L} T^{{{\,\textrm{Tr}\,}}} z(t)\xi (t) \end{aligned}$$
(43a)
$$\begin{aligned}&= \delta _8[1,2,3,4,7,8,5,6]\nonumber \\&\quad \ \ltimes \delta _8[2,8,2,8,1,7,3,5,1,1,3,3,2,2,2,2]\nonumber \\&\quad \ \ltimes \delta _8[1,2,3,4,7,8,5,6] z(t)\xi (t) \end{aligned}$$
(43b)
$$\begin{aligned}&= \delta _8[2,6,2,6,1,5,3,7,1,1,3,3,2,2,2,2]\nonumber \\&\quad \ \ltimes \delta _8[1,2,3,4,7,8,5,6] z(t)\xi (t) \end{aligned}$$
(43c)
$$\begin{aligned}&= \delta _8[2,6,2,6,1,5,3,7,2,2,2,2,1,1,3,3]\nonumber \\&\quad \ \ltimes z(t)\xi (t), \end{aligned}$$
(43d)
$$\begin{aligned} y(t)&= \delta _2[1,2,2,1,2,1,1,2] T^{{{\,\textrm{Tr}\,}}}z(t)\end{aligned}$$
(43e)
$$\begin{aligned}&= \delta _2[1,2,2,1,2,1,1,2] \delta _8[1,2,3,4,7,8,5,6] z(t)\end{aligned}$$
(43f)
$$\begin{aligned}&= \delta _2[1,2,2,1,1,2,2,1] z(t)\end{aligned}$$
(43g)
$$\begin{aligned}&= ([1,1]\otimes \delta _2[1,2,2,1]) (z^1(t)z^2(t))\end{aligned}$$
(43h)
$$\begin{aligned}&= ([1,1]\otimes \delta _2[1,2,2,1]) (z^1(t)\otimes z^2(t))\end{aligned}$$
(43i)
$$\begin{aligned}&= ([1,1]z^1(t)) \otimes (\delta _2[1,2,2,1]z^2(t))\end{aligned}$$
(43j)
$$\begin{aligned}&= \delta _2[1,2,2,1]z^2(t), \end{aligned}$$
(43k)

where \(z(t)=z^1(t)z^2(t)\), \(z^1(t)\in \varDelta \), \(z^2(t)\in \varDelta _4\).

Furthermore, we compute the structure matrix of \(z^2(t+1)\) in (43) as follows:

$$\begin{aligned}&([1,1]\otimes I_4)\delta _8[2,6,2,6,1,5,3,7,2,2,2,2,1,1,3,3]\end{aligned}$$
(44a)
$$\begin{aligned}&= \delta _4[1,2,3,4,1,2,3,4]\nonumber \\&\quad \ltimes \delta _8[2,6,2,6,1,5,3,7,2,2,2,2,1,1,3,3]\end{aligned}$$
(44b)
$$\begin{aligned}&= \delta _4[2,2,2,2,1,1,3,3,2,2,2,2,1,1,3,3]\end{aligned}$$
(44c)
$$\begin{aligned}&= \delta _4[2,2,1,3,2,2,1,3] \otimes [1,1]\end{aligned}$$
(44d)
$$\begin{aligned}&= [1,1]\otimes \delta _4[2,2,1,3] \otimes [1,1]. \end{aligned}$$
(44e)

By (43) and (44), after substituting coordinate transformation (42) into (41) and the output function (38b), we obtain

$$\begin{aligned}&z^1(t+1) \\&= (I_2\otimes [1,1,1,1])\nonumber \\&\quad \ \ltimes \delta _8 [2,6,2,6,1,5,3,7,2,2,2,2,1,1,3,3]z(t)\xi (t)\\&= \delta _2[1,1,1,1,2,2,2,2]\nonumber \\&\quad \ \ltimes \delta _8 [2,6,2,6,1,5,3,7,2,2,2,2,1,1,3,3]z(t)\xi (t)\\&= \delta _2[1,2,1,2,1,2,1,2,1,1,1,1,1,1,1,1]z(t)\xi (t),\\&z^2(t+1) \\&= ([1,1]\otimes \delta _4[2,2,1,3] \otimes [1,1])(z^1(t)z^2(t)\xi (t))\\&= ([1,1]z^1(t))\otimes (\delta _4[2,2,1,3]z^2(t)) \otimes ([1,1]\xi (t))\\&= \delta _4[2,2,1,3]z^2(t), \end{aligned}$$
$$\begin{aligned} y(t) = \delta _2[1,2,2,1] z^2(t), \end{aligned}$$

where \(z^1(t)\in \varDelta \), \(z^2(t)\in \varDelta _4\), and \(\xi \) does not affect \(y\). Hence, the pair of the state-feedback controller \(u(t)=\delta _2[1,1,1,1,1,1,1,1]x(t)\) and the coordinate transformation (42) is a solution to the coordinate-dependent disturbance decoupling problem of BCN (38).
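As an independent sanity check (an illustrative script, not part of the original solution), one can simulate the closed-loop BCN (41) with the output function (38b) and confirm that the output sequence depends only on the initial state, never on the disturbance sequence:

```python
from itertools import product

# Closed-loop transition (41) and output (38b); states, disturbances,
# and outputs are all 1-indexed integers.
succ_cols = [2, 8, 2, 8, 1, 7, 3, 5, 1, 1, 3, 3, 2, 2, 2, 2]
out = [1, 2, 2, 1, 2, 1, 1, 2]

def output_sequence(x0, xis):
    """Outputs y(0), ..., y(r) from initial state x0 under the
    disturbance sequence xis (each value in {1, 2})."""
    x, ys = x0, [out[x0 - 1]]
    for xi in xis:
        x = succ_cols[(x - 1) * 2 + (xi - 1)]
        ys.append(out[x - 1])
    return ys

# the disturbance never affects the output, for any initial state
for x0 in range(1, 9):
    ref = output_sequence(x0, (1,) * 4)
    assert all(output_sequence(x0, xis) == ref
               for xis in product((1, 2), repeat=4))
```

This agrees with the conclusion of Example 5: the output trajectory is determined by the cell of \(\mathcal {P}(\sim _{{{\,\textrm{ddp}\,}}})\) containing the initial state alone.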

3.6 An application to observability decomposition of Boolean control networks

In this subsection, we introduce the observability decomposition problem of BCNs with respect to multiple-experiment observability (Definition 12). As mentioned in Theorem 6 (Sect. 3.1), Moore’s partition directly provides an algorithm for verifying multiple-experiment observability of BCNs. In this subsection, we show that Moore’s partition also directly provides an algorithm for checking the existence of an observability decomposition of BCNs with respect to this type of observability, as well as for computing the decomposition if one exists. When characterizing the observability decomposition, one will see that its form is quite similar to that of the disturbance decoupling problem studied in Sect. 3.5. That is to say, Moore’s partition closely relates the observability decomposition problem and the disturbance decoupling problem in BCNs.

Before formally introducing the observability decomposition problem, we show a motivating example.

Example 6

Consider the BCN

$$\begin{aligned} x(t+1)&= \delta _8[1,2,2,1,5,6,6,5,\nonumber \\&\qquad \qquad 1,1,2,2,5,5,6,6]u(t)x(t), \end{aligned}$$
(46a)
$$\begin{aligned} y(t)&= \delta _2[1,2,2,2,1,2,2,2]x(t), \end{aligned}$$
(46b)

where \(t=0,1,2,\dots \), \(x(t)\in \varDelta _8\), \(u(t),y(t)\in \varDelta \). The corresponding partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) defined in (11) is

$$\begin{aligned} \{\{\delta _8^1,\delta _8^5\},\{\delta _8^2,\delta _8^6\},\{\delta _8^3,\delta _8^7\},\{\delta _8^4,\delta _8^8\}\}. \end{aligned}$$
(47)

By Theorem 6, (46) is not multiple-experiment observable, because partition (47) is not discrete. However, one can see that all cells of (47) have the same cardinality 2. Moreover, if one regards each cell as a state and considers the quotient BCN obtained by identifying all states in the same cell, then the quotient BCN is multiple-experiment observable. In detail, rewrite (47) as

$$\begin{aligned} \{\{\delta _2^1\delta _4^1,\delta _2^2\delta _4^1\},\{\delta _2^1\delta _4^2,\delta _2^2\delta _4^2\},\{\delta _2^1\delta _4^3,\delta _2^2\delta _4^3\},\{\delta _2^1\delta _4^4,\delta _2^2\delta _4^4\}\}, \end{aligned}$$
(48)

so that the cells are \(\{\delta _2^1\delta _4^i,\delta _2^2\delta _4^i\}\), \(i\in \llbracket 1,4\rrbracket \). Also rewrite \(x(t)=z^1(t)\ltimes z^2(t)\), where \(z^1(t)\in \varDelta \) and \(z^2(t)\in \varDelta _4\); then (46) can be rewritten as

$$\begin{aligned} z^1(t+1) =&\, (I_2\otimes \textbf{1}_4^{{{\,\textrm{Tr}\,}}}) \delta _8[1,2,2,1,5,6,6,5,\nonumber \\&\qquad 1,1,2,2,5,5,6,6]u(t)x(t)\nonumber \\ =&\, \delta _2[1,1,1,1,2,2,2,2,\nonumber \\&\quad 1,1,1,1,2,2,2,2]u(t)z^1(t)z^2(t)\nonumber \\ =&\, (\textbf{1}_2^{{{\,\textrm{Tr}\,}}}\otimes I_2\otimes \textbf{1}_4^{{{\,\textrm{Tr}\,}}})(u(t)z^1(t)z^2(t))\nonumber \\ =&\, (\textbf{1}_2^{{{\,\textrm{Tr}\,}}}u(t))\otimes (I_2z^1(t))\otimes (\textbf{1}_4^{{{\,\textrm{Tr}\,}}}z^2(t))\nonumber \\ =&\, z^1(t),\nonumber \\ z^2(t+1) =&\, (\textbf{1}_2^{{{\,\textrm{Tr}\,}}}\otimes I_4) \delta _8[1,2,2,1,5,6,6,5,\nonumber \\&\qquad 1,1,2,2,5,5,6,6]u(t)x(t)\nonumber \\ =&\, \delta _4[1,2,2,1,1,2,2,1,\nonumber \\&\quad 1,1,2,2,1,1,2,2]u(t)z^1(t)z^2(t)\nonumber \\ =&\, \delta _4[1,2,2,1,1,1,2,2]u(t)z^2(t),\end{aligned}$$
(49a)
$$\begin{aligned} y(t) =&\, \delta _2[1,2,2,2,1,2,2,2]z^1(t)z^2(t)\nonumber \\ =&\, \delta _2[1,2,2,2]z^2(t), \end{aligned}$$
(49b)

where the pair ((49a), (49b)) can be regarded as the above quotient BCN, and it is multiple-experiment observable, because the corresponding partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) defined in (11) is \(\{\{\delta _4^1\},\{\delta _4^2\},\{\delta _4^3\},\{\delta _4^4\}\}\).

From the above example, one can see that the decomposed form (49) can be regarded as a form of observability decomposition of BCN (46). The decomposition exists if and only if the cells of the corresponding Moore’s partition all have the same cardinality \(2^s\) for some \(s\ge 1\); it is easy to see that this condition is both necessary and sufficient. Therefore, the observability decomposition problem and the condition for the existence of an observability decomposition can be formulated as follows.

Problem 3

Consider a BCN (10). Its observability decomposition problem is to check whether there exists a coordinate transformation \(z(t)=Tx(t)\) as in (17) (i.e., \(T\in \mathcal {L}_{N\times N}\) is nonsingular) such that, after applying the transformation to (10), one obtains the following decomposed BCN:

$$\begin{aligned} z^1(t+1)&= \bar{L}_1u(t)z(t)\end{aligned}$$
(50a)
$$\begin{aligned}&=: [\bar{L}_{11},\dots ,\bar{L}_{1M}]u(t)z(t),\nonumber \\ z^2(t+1)&= \bar{L}_2u(t)z^2(t),\end{aligned}$$
(50b)
$$\begin{aligned}&=: [\bar{L}_{21},\dots ,\bar{L}_{2M}]u(t)z^2(t),\nonumber \\ y(t)&= \bar{H}z^2(t), \end{aligned}$$
(50c)

where \(t=0,1,2,\dots \), \(z^1(t)\in \varDelta _S\), \(S=2^s\) for some \(s\in \llbracket 1,n \rrbracket \), \(z^2(t)\in \varDelta _{\bar{S}}\), \(\bar{S}=N/S=2^{\bar{s}}\) with \(\bar{s}=n-s\), \(\bar{L}_1\in \mathcal {L}_{S\times MN}\), \(\bar{L}_2\in \mathcal {L}_{\bar{S}\times M\bar{S}}\), \(\bar{H}\in \mathcal {L}_{Q\times \bar{S}}\), \(\bar{L}_{1j}\in \mathcal {L}_{S\times N}\), \(\bar{L}_{2j}\in \mathcal {L}_{\bar{S}\times \bar{S}}\), \(j\in \llbracket 1,M\rrbracket \), and the pair ((50b), (50c)) is multiple-experiment observable.

Using an argument similar to that used in Sect. 3.5, one can see that the existence of a coordinate transformation as in Problem 3 is equivalent to the condition that the cells of the corresponding Moore’s partition all have the same cardinality \(2^s\) for some \(s\ge 1\).

Theorem 12

Problem 3 has a solution if and only if the cells of the corresponding Moore’s partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) defined in (11) have the same cardinality equal to \(2^s\) for some \(s\in \llbracket 1,n\rrbracket \).
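To make Theorem 12 concrete, the following is a small self-contained Python sketch (our own, not taken from the paper): it computes Moore's partition by a standard partition-refinement iteration and then applies the equal-cardinality test of Theorem 12. The table names `F` and `H`, and the use of plain integers 1..8 and 1..2 in place of the canonical vectors \(\delta_8^i\) and \(\delta_2^j\), are our own transcription of BCN (46) from Example 6.

```python
# A sketch of Theorem 12, assuming the BCN is given by explicit lookup tables.
# F[u][x-1] is the next state and H[x-1] the output of BCN (46) (Example 6).
F = {1: [1, 2, 2, 1, 5, 6, 6, 5],   # u = delta_2^1
     2: [1, 1, 2, 2, 5, 5, 6, 6]}   # u = delta_2^2
H = [1, 2, 2, 2, 1, 2, 2, 2]

def moore_partition(states, inputs, f, h):
    """Coarsest partition whose cells have equal outputs and are mapped
    into cells by every input (standard partition refinement)."""
    blocks = {}
    for x in states:                       # initial partition: equal outputs
        blocks.setdefault(h[x - 1], []).append(x)
    partition = list(blocks.values())
    while True:
        block_of = {x: i for i, cell in enumerate(partition) for x in cell}
        refined = {}
        for x in states:                   # split by successor-cell signature
            sig = (block_of[x],) + tuple(block_of[f[u][x - 1]] for u in inputs)
            refined.setdefault(sig, []).append(x)
        new_partition = list(refined.values())
        if len(new_partition) == len(partition):
            return [sorted(cell) for cell in partition]
        partition = new_partition

def decomposition_exists(partition):
    """Theorem 12: all cells share the same cardinality 2^s for some s >= 1."""
    sizes = {len(cell) for cell in partition}
    if len(sizes) != 1:
        return False
    (size,) = sizes
    return size >= 2 and size & (size - 1) == 0   # a power of two, at least 2

P = moore_partition(range(1, 9), (1, 2), F, H)
```

Running this on BCN (46) reproduces partition (47), i.e. {{1,5},{2,6},{3,7},{4,8}}; all cells have cardinality 2 = 2^1, so the decomposition of Example 6 exists with s = 1.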

Remark 3

In [3], it was shown that for a linear time-invariant control system (13), its quotient system over the quotient space \(\mathbb {R}^n/\bigcap _{i=0}^{n-1}\mathcal {O}_i\) is observable, where \(\bigcap _{i=0}^{n-1}\mathcal {O}_i\) is the unobservable subspace of (13). Comparing this quotient system with the pair ((50b), (50c)), one can regard the latter as a counterpart of the former for a BCN (10). In addition, with respect to the disturbance decoupling problem, the quotient system of a BCN (15) with disturbance was derived in (29). An essential difference lies in that the quotient system of (13) with respect to observability always exists, whereas the quotient BCN ((50b), (50c)) of BCN (10) with respect to multiple-experiment observability and the quotient BCN (29) of BCN (15) with disturbance do not always exist; this reveals an essential feature of the nonlinearity of BCNs.

Next we restate a related result obtained in [44], which is consistent with Theorem 12; there, however, \(s\) was given a priori, and the result was represented via the sub-BCNs \(\mathcal {B}_j\) obtained from BCN (10) by setting \(u(t)\equiv \delta _M^j\), \(j\in \llbracket 1,M\rrbracket \). By comparing the two results, one can see that it is easier to use Theorem 12 to check the existence of an observability decomposition of a BCN (10).

One can equivalently represent (50) as follows, based on an appropriate coordinate transformation \(T\in \mathcal {L}_{N\times N}\) (if such a \(T\) exists):

$$\begin{aligned}&(I_S\otimes {\textbf {1}}_{\bar{S}}^{{{\,\textrm{Tr}\,}}})TL_jT^{{{\,\textrm{Tr}\,}}} = \bar{L}_{1j}, \end{aligned}$$
(51a)
$$\begin{aligned}&({\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes I_{\bar{S}})TL_jT^{{{\,\textrm{Tr}\,}}} = {\textbf {1}}_{\bar{S}}^{{{\,\textrm{Tr}\,}}}\otimes \bar{L}_{2j}, \end{aligned}$$
(51b)
$$\begin{aligned}&HT^{{{\,\textrm{Tr}\,}}} = {\textbf {1}}_{S}^{{{\,\textrm{Tr}\,}}}\otimes \bar{H}, \end{aligned}$$
(51c)

where \(j\in \llbracket 1,M\rrbracket \). Then the following result was obtained in [44].

Theorem 13

Let s be in \(\llbracket 1,n\rrbracket \) and \(\bar{s}= n-s\). Then Problem 3 has a solution if and only if there is a partition \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\) of \(\varDelta _N\) such that

  1. (i)

    all the cells of \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\) have the same cardinality \(S=2^{s}\),

  2. (ii)

    for every two states \(x,x'\) in the same cell of \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\), it holds that \(Hx=Hx'\),

  3. (iii)

    for all \(j\in \llbracket 1, M\rrbracket \), for every two states \(x,x'\) in the same cell of \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\), \(L_jx\) and \(L_jx'\) are also in the same cell of \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\),

  4. (iv)

    \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\) is the coarsest among the partitions of \(\varDelta _N\) that satisfy (ii) and (iii).

The conditions (i), (ii), and (iii) together imply (51). The condition (iv) then implies that the pair ((50b), (50c)) is multiple-experiment observable. The latter was guaranteed by a proposition proven in [44]: if \(\mathcal {P}\) is a partition of \(\varDelta _N\) that satisfies (ii) and (iii), then \(\mathcal {P}\) is finer than the corresponding Moore’s partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\). Hence, if (iv) holds, then the Moore partition \(\mathcal {P}(\sim _{{{\,\textrm{ind}\,}}})\) corresponding to the quotient BCN obtained by identifying all states in the same cell of \(\mathcal {P}_{{{\,\textrm{obsd}\,}}}\) is discrete.

A short conclusion   In this section, we introduced several control problems of BCNs, including the computation of the smallest invariant dual subspace of a BN containing a given set of Boolean functions, the multiple-experiment observability verification (and decomposition) problem, and the disturbance decoupling problem. It is surprising that Moore’s partition plays a central role in all these fundamental problems and hence closely relates them. However, Moore’s partition has its own drawbacks: when it is used to verify arbitrary-experiment observability (Definition 15) of BCNs [43], the procedure and the obtained verification algorithm are very complex; Moore’s partition does not apply to the other notions of observability of BCNs; in addition, it cannot be extended to the nondeterministic finite-transition systems studied in [20] (see [13, Remark 4.1]). In the next section, we introduce a different method that applies to all four definitions of observability of BCNs introduced in Sect. 2.7. Since this method was found by the writer when trying to verify the negations of these definitions of observability [21, 37], it is called an inverse method in the current paper.

4 An inverse method based on the notion of observability graph

4.1 Introductory subsection

During the early stages of the study of observability problems of BCNs (between 2008 and 2016), the model-based testing area (including [10, 39, 12] and references therein) in computer science seems not to have been known to researchers in the control area of BCNs. The earliest works on the verification of different types of observability of BCNs rely on a controllability assumption (see a necessary and sufficient condition for strong multiple-experiment observability [33] and a necessary and sufficient condition for single-experiment observability [42], both based on the controllability assumption). Although an STP representation of Moore’s partition was rediscovered in [33, 42], Moore’s partition does not apply to these two types of observability (later we will show the reason); hence necessary and sufficient conditions for these types of observability, free of the controllability assumption, were not obtained in [33, 42]. On the other hand, although Moore’s partition applies to multiple-experiment observability (as shown in Theorem 6), it was not used to verify this type of observability in [41]; only a sufficient but not necessary condition was given there. The first necessary and sufficient condition for observability of BCNs was given in [43] based on Moore’s partition, where only arbitrary-experiment observability was studied. The procedure of deriving that condition is rather complex, and so is the condition itself: a BCN (10) is arbitrary-experiment observable if and only if for every pair of different periodic (state, input)-trajectories with the same minimal period k and the same input trajectory, the corresponding output trajectories are also different and periodic with minimal period k. The first necessary and sufficient condition for single-experiment observability of BCNs was given in [45].
The method used there is to give an upper bound on the length of input sequences such that if no input sequence of that length is a distinguishing input sequence of (10), then no distinguishing input sequence exists.

Hence until 2013, no necessary and sufficient condition for multiple-experiment observability or strong multiple-experiment observability of BCNs had been found in the control area of BCNs. The methods used in [43, 45] cannot be used to deal with the other types of observability.

From 2012 to 2013, the writer was visiting the Department of Mathematics and Statistics of the University of Turku, Finland, as a joint Ph.D. candidate. During that period, he was mainly studying cellular automata (discrete-time dynamical systems over the Cantor space, a natural generalization of deterministic Turing machines), and his main research interests gradually moved from control theory to theoretical computer science. Upon leaving, he discussed the decidability problem of one type of observability of BCNs with his colleagues there, including Jarkko Kari, Charalampos Zinoviadis, Ville Salo, and Ilkka Törmä. The idea of using finite automata and formal languages to verify observability of BCNs was born during that discussion but did not take shape during that period.

The idea took shape after the writer left Finland and arrived in Singapore, around October 2013. This time he considered the negation of observability of BCNs, and proposed the concept of weighted pair graph [21, 37] (see Definition 16; later renamed observability graph in [13, 23]), which aggregates every pair of trajectories of a BCN that produce the same output sequence. Note that this way of thinking is essentially different from the ideas used in the previous papers on observability of BCNs, including [33, 41,42,43, 45], because in those papers the authors all considered directly verifying the different definitions of observability themselves instead of their negations. By considering the negations of the various definitions of observability, the writer proposed a unified method to verify the four definitions of observability shown in Sect. 2.7 (see [21, 37]). The overall idea is as follows: first, given a BCN (3), compute its weighted pair graph (i.e., observability graph); second, compute deterministic finite automata (DFAs) from the graph, adapted to the four definitions of observability; third, use each DFA to verify the corresponding type of observability. See Fig. 4 for a sketch. Next, we show the implementation of the sketch.

Fig. 4  Sketch of using the notions of observability graph and deterministic finite automaton to verify observability of BCNs [21, 37]

4.2 Observability verification

Now we introduce the notion of weighted pair graph (i.e., observability graph).

Definition 16

[21, 37]    Consider a BCN (3). A triple \(\mathcal {G}_o=(\mathcal {V},\mathcal {E},\mathcal {W})\) is called its observability graph if the vertex set \(\mathcal {V}\) is equal to \(\{\{x,x'\}\vert x,x'\in \mathcal {D}^n,h(x)=h(x')\}\)Footnote 5, the edge set \(\mathcal {E}\) is equal to \(\{(\{x_1,x_1'\},\{x_2,x_2'\})\in \mathcal {V}\times \mathcal {V}\vert \) there exists \(u\in \mathcal {D}^m\) such that \(f(u,x_1)=x_2\) and \(f(u,x_1')=x_2'\), or, \(f(u,x_1)=x_2'\) and \(f(u,x_1')=x_2\}\subset \mathcal {V}\times \mathcal {V}\), and the weight function \(\mathcal {W}:\mathcal {E}\rightarrow 2^{\mathcal {D}^m}\) assigns to each edge \((\{x_1,x_1'\},\{x_2,x_2'\})\in \mathcal {E}\) the set \(\{u\in \mathcal {D}^m\vert f(u,x_1)=x_2\) and \(f(u,x_1')=x_2'\), or, \(f(u,x_1)=x_2'\) and \(f(u,x_1')=x_2\}\) of inputs. A vertex \(\{x,x'\}\) is called diagonal if \(x=x'\), and non-diagonal otherwise. For a vertex \(v\in \mathcal {V}\), its outdegree is \({\text {outdeg}}(v):=\vert \bigcup _{(v,v')\in \mathcal {E}}\mathcal {W}((v,v'))\vert \), i.e., the number of inputs appearing on the edges starting from v. The diagonal subgraph of an observability graph is formed by all diagonal vertices and all edges between them, and is denoted by \(\diamond \). Similarly, the non-diagonal subgraph is formed by all non-diagonal vertices and all edges between them.

For a BCN (3), the size of its observability graph \(\mathcal {G}_o\) is \(O(2^{2n+m-1})=O(2^{2n+m})\). Because each diagonal vertex of \(\mathcal {G}_o\) has exactly \(2^m\) outgoing edges, all of which go to diagonal vertices, we can consider the diagonal subgraph as a single vertex. Therefore, the number of vertices can equivalently be bounded from above by \(1+2^n(2^n-1)/2\). When drawing \(\mathcal {G}_o\), the diagonal subgraph will be drawn as a single vertex \(\diamond \).
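As a concrete illustration of Definition 16, the following Python sketch (our own, not from [21, 37]) builds the vertex set and weight function of \(\mathcal {G}_o\) for a BCN given by lookup tables. The tables `F` and `H` transcribe BCN (46) from Example 6, with states written as plain integers 1..8 rather than canonical vectors, so the diagonal/non-diagonal counts can be checked by hand.

```python
# Observability graph G_o = (V, E, W) of Definition 16, sketched for a BCN
# given by lookup tables; F and H transcribe BCN (46) of Example 6.
from itertools import combinations_with_replacement

F = {1: [1, 2, 2, 1, 5, 6, 6, 5],
     2: [1, 1, 2, 2, 5, 5, 6, 6]}
H = [1, 2, 2, 2, 1, 2, 2, 2]

def observability_graph(states, inputs, f, h):
    # vertices: unordered pairs {x, x'} (x = x' allowed) with equal outputs;
    # a diagonal pair {x, x} is stored as the singleton frozenset({x})
    V = {frozenset({x, xp})
         for x, xp in combinations_with_replacement(states, 2)
         if h[x - 1] == h[xp - 1]}
    W = {}   # W[(v, v')] = set of inputs labelling the edge; E = set(W)
    for v in V:
        x, xp = min(v), max(v)
        for u in inputs:
            t = frozenset({f[u][x - 1], f[u][xp - 1]})
            if t in V:   # otherwise the successor pair has differing outputs
                W.setdefault((v, t), set()).add(u)
    return V, W

V, W = observability_graph(range(1, 9), (1, 2), F, H)
diagonal = {v for v in V if len(v) == 1}
```

For BCN (46) this yields 8 diagonal and 16 non-diagonal vertices, and the non-diagonal vertex {1, 5} carries a self-loop weighted by both inputs, matching the cell \(\{\delta _8^1,\delta _8^5\}\) of Moore's partition (47).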

Example 7

The observability graph of BCN (12) is shown in Fig. 5.

Fig. 5  Observability graph of BCN (12), where \(\diamond \) denotes the diagonal subgraph

Recall a DFA \(A=(Q,\varSigma ,\delta ,q_0,F)\) as in Definition 4. If \(F=Q\), then A recognizes the formal language \(\varSigma ^*\). Note that \(\delta \) is defined at each pair \((q,a)\) in \(Q\times \varSigma \). Such standard DFAs are called complete in the current paper. To conveniently verify observability of BCNs, incomplete DFAs are also considered, in which \(\delta \) is undefined at at least one pair in \(Q\times \varSigma \). The words w accepted by an incomplete DFA A are also defined as those satisfying \(\delta (q_0,w)\in F\), and the language recognized by an incomplete DFA A is also defined as the set of all words accepted by A. The following direct proposition will be useful in verifying observability of BCNs.

Proposition 14

[21]    Consider a (complete or incomplete) DFA \(A=(Q,\varSigma ,\delta ,q_0,Q)\) (all states are final). Assume each state is reachable from \(q_0\). Then A recognizes language \(\varSigma ^*\) if and only if A is complete.

Since we verify the negations of different definitions of observability of BCNs, we characterize their negations in the following four propositions. These propositions can be directly obtained by definition.

Proposition 15

A BCN (3) is not multiple-experiment observable (Definition 12) if and only if there exist two different initial states \(x_0,x_0'\in \mathcal {D}^n\) such that for each \(p\in \mathbb {N}\) and each input sequence \(U\in (\mathcal {D}^m)^p\), the corresponding output sequences are the same.

Proposition 16

A BCN (3) is not strongly multiple-experiment observable (Definition 13) if and only if there exists an initial state \(x_0\in \mathcal {D}^n\) such that for each \(p\in \mathbb {N}\) and each input sequence \(U\in (\mathcal {D}^m)^p\), there exists an initial state \(x_0'\in \mathcal {D}^n\) different from \(x_0\) such that the corresponding output sequences are the same.

Proposition 17

A BCN (3) is not single-experiment observable (Definition 14) if and only if for each \(p\in \mathbb {N}\) and each input sequence \(U\in (\mathcal {D}^m)^p\), there exist two different initial states \(x_0,x_0' \in \mathcal {D}^n\) such that the corresponding output sequences are the same.

Proposition 18

A BCN (3) is not arbitrary-experiment observable (Definition 15) if and only if there exist two different initial states \(x_0,x_0'\in \mathcal {D}^n\) and an input sequence \(U\in (\mathcal {D}^m)^{p}\) with p sufficiently large such that the corresponding output sequences are the same.

4.2.1 Verifying Definition 12

Consider a BCN (3) and its observability graph \(\mathcal {G}_o=(\mathcal {V},\mathcal {E},\mathcal {W})\). Consider a non-diagonal vertex \(v=\{x,x'\}\in \mathcal {V}\) and the (possibly incomplete) DFA

$$\begin{aligned} \mathcal {G}_{oD}^{v}=(\mathcal {V},\mathcal {D}^m,\delta ,v,\mathcal {V}), \end{aligned}$$
(52)

where for all \(v'\in \mathcal {V}\) and \(u\in \mathcal {D}^m\), \(\delta (v',u)=v''\) if \((v',v'')\in \mathcal {E}\) and \(u\in \mathcal {W}((v',v''))\). Recall that \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^v)\) is the accessible part of \(\mathcal {G}_{oD}^v\). The words that are not accepted by \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^v)\) are exactly the distinguishing input sequences of x and \(x'\). Hence, by Proposition 14, x and \(x'\) have no distinguishing input sequence if and only if the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^v)\) is complete.

Theorem 19

[21, 37]    A BCN (3) is not multiple-experiment observable if and only if there exist two different states \(x,x'\) such that \(h(x)=h(x')\) and the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{x,x'\}})\) (52) recognizes language \((\mathcal {D}^m)^{*}\), i.e., \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{x,x'\}})\) is complete by Proposition 14.
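Theorem 19 can be sketched in code as a reachability check: a pair has no distinguishing input sequence iff every pair reachable from it in the pair dynamics admits every input, i.e. \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{x,x'\}})\) is complete. The following Python sketch is our own reformulation (it traverses the pair graph on the fly rather than constructing each DFA explicitly); `F` and `H` again transcribe BCN (46) of Example 6, with integer-labelled states.

```python
# Theorem 19 as an on-the-fly reachability check; F and H transcribe BCN (46).
from itertools import combinations

F = {1: [1, 2, 2, 1, 5, 6, 6, 5],
     2: [1, 1, 2, 2, 5, 5, 6, 6]}
H = [1, 2, 2, 2, 1, 2, 2, 2]

def indistinguishable(x, xp, f, h, inputs):
    """True iff no input sequence separates x and x'
    (i.e. Acc(G_oD^{x,x'}) is complete)."""
    if h[x - 1] != h[xp - 1]:
        return False                     # the empty input sequence separates them
    seen = {frozenset({x, xp})}
    stack = [frozenset({x, xp})]
    while stack:
        v = stack.pop()
        a, b = min(v), max(v)
        for u in inputs:
            t = frozenset({f[u][a - 1], f[u][b - 1]})
            if len({h[s - 1] for s in t}) > 1:
                return False             # u completes a distinguishing sequence
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return True                          # the accessible part is complete

multi_obs = not any(indistinguishable(x, xp, F, H, (1, 2))
                    for x, xp in combinations(range(1, 9), 2))
```

On BCN (46) this reports the pairs {1,5}, {2,6}, {3,7}, {4,8} as indistinguishable, so `multi_obs` is False, consistent with Example 6.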

Example 8

Reconsider BCN (12) in Example 4. The non-diagonal vertices of its observability graph (see Fig. 5) are \(\{00,10\}\), \(\{01,10\}\), and \(\{01,00\}\). The DFAs \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{00,10\}})\), \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{01,10\}})\), and \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{01,00\}})\) are shown in Fig. 6.

Fig. 6  DFAs \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{00,10\}})\), \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{01,10\}})\), and \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{01,00\}})\) corresponding to Fig. 5

The three DFAs are all incomplete. Hence, by Theorem 19, (12) is multiple-experiment observable.

One sees that input sequences 0 and 1 both distinguish initial states 01 and 00, that input sequences 0 and 1 both distinguish initial states 00 and 10, and that input sequence 01 distinguishes initial states 01 and 10. For example, an illustration of using input sequence 01 to distinguish states 01 and 10 is shown in Table 1.

Table 1 Using input sequence 01 to distinguish initial states 01 and 10

Remark 4

When using Theorem 19 to verify multiple-experiment observability of a BCN (3), one needs to compute the DFA \(\mathcal {G}_{oD}^{v}\) at most \(2^n(2^n-1)/2\) times. Hence the overall time complexity of verifying this type of observability is \(O(2^{4n+m-2})=O(2^{4n+m})\).

4.2.2 Verifying Definition 13

Verifying strong multiple-experiment observability (Definition 13) is more involved than verifying multiple-experiment observability. By Proposition 16, to verify Definition 13 for a BCN (3), one needs a powerset construction from the observability graph \(\mathcal {G}_o\) of (3). From now on, we call an automaton that is deterministic but may have more than one initial state a deterministic finite-state automaton. Such an automaton is denoted by \((Q,\varSigma ,\delta ,Q_0,F)\), where \(Q_0\subset Q\) denotes the set of initial states.

For every initial state \(x_0\in \mathcal {D}^n\) of (3), we consider the observability graph \(\mathcal {G}_o=(\mathcal {V},\mathcal {E},\mathcal {W})\) of (3) as a deterministic finite-state automaton

$$\begin{aligned} \mathcal {G}_o^{x_0}=(\mathcal {V},\mathcal {D}^m,\delta ,\mathcal {V}_{x_0},\mathcal {V}), \end{aligned}$$
(53)

where \(\mathcal {V}_{x_0}=\{ \{x_0,x_0'\}\vert h(x_0)=h(x_0'),x_0\ne x_0'\}\subset \mathcal {V}\) is the set of initial states, and \(\delta \) is a partial function \(\mathcal {V}\times \mathcal {D}^m\rightarrow \mathcal {V}\) satisfying \(\delta (v,u)=v'\) if \((v,v')\in \mathcal {E}\) and \(u\in \mathcal {W}((v,v'))\). Over \(\mathcal {G}_o^{x_0}\), we construct a (complete or incomplete) DFA

$$\begin{aligned} \mathcal {G}_{oD}^{x_0}=(2^{\mathcal {V}}\setminus \{\emptyset \},\mathcal {D}^m,\delta ',\mathcal {V}_{x_0},2^{\mathcal {V}}\setminus \{\emptyset \}), \end{aligned}$$
(54)

where \(2^{\mathcal {V}}\setminus \{\emptyset \}\) is the state set, \(\mathcal {V}_{x_0}\) is the initial state, and for all \(\emptyset \ne Z\in 2^{\mathcal {V}}\) and \(u\in \mathcal {D}^m\), \(\delta '(Z,u)=\{\delta (z,u)\vert z\in Z,\delta \text { is defined at }(z,u)\}\) if \(\delta \) is defined at at least one pair in \(Z\times \{u\}\). The size of \(\mathcal {G}_{oD}^{x_0}\) is \(O(2^{2^{2n-1}-2^{n-1}+1+m})=O(2^{2^{2n-1}+m})\).

Recall that \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{x_0})\) is the accessible part of \(\mathcal {G}_{oD}^{x_0}\). Then all words that are not accepted by \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{x_0})\) are distinguishing input sequences of \(x_0\).

Theorem 20

[21, 37]    A BCN (3) is not strongly multiple-experiment observable if and only if there exists an initial state \(x_0\in \mathcal {D}^n\) such that the corresponding DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{x_0})\) (54) recognizes language \((\mathcal {D}^m)^{*}\), i.e., \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{x_0})\) is complete by Proposition 14.
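Theorem 20 can be sketched with an on-the-fly powerset construction (our own reformulation, not the construction of [21, 37] verbatim): \(x_0\) admits a distinguishing input sequence iff some reachable set of surviving candidate pairs can be emptied by one more input, i.e. \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{x_0})\) is incomplete. `F` and `H` again transcribe BCN (46), with integer-labelled states.

```python
# Theorem 20 via an on-the-fly subset construction; F and H transcribe BCN (46).
F = {1: [1, 2, 2, 1, 5, 6, 6, 5],
     2: [1, 1, 2, 2, 5, 5, 6, 6]}
H = [1, 2, 2, 2, 1, 2, 2, 2]

def strongly_distinguishable(x0, states, f, h, inputs):
    """True iff some input sequence separates x0 from every other
    output-compatible initial state (Acc(G_oD^{x0}) incomplete)."""
    start = frozenset(frozenset({x0, xp}) for xp in states
                      if xp != x0 and h[xp - 1] == h[x0 - 1])
    if not start:
        return True                      # no other state is even output-compatible
    seen, stack = {start}, [start]
    while stack:
        Z = stack.pop()
        for u in inputs:
            succ = set()
            for v in Z:
                a, b = min(v), max(v)
                t = frozenset({f[u][a - 1], f[u][b - 1]})
                if len({h[s - 1] for s in t}) == 1:
                    succ.add(t)          # this candidate pair survives input u
            if not succ:
                return True              # delta' undefined at (Z, u): incomplete
            Zn = frozenset(succ)
            if Zn not in seen:
                seen.add(Zn)
                stack.append(Zn)
    return False

strong_obs = all(strongly_distinguishable(x0, range(1, 9), F, H, (1, 2))
                 for x0 in range(1, 9))
```

Note that a pair collapsing to a diagonal vertex survives every subsequent input, reflecting the fact that two initial states whose trajectories merge while agreeing on outputs can never be separated. For BCN (46), every state lies in a nontrivial cell of Moore's partition (47), so `strong_obs` is False.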

Example 9

Reconsider BCN (12) in Example 4. The corresponding DFAs \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{10})\), \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{01})\), and \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{00})\) are shown in Fig. 7 from left to right.

Fig. 7  DFAs \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{10})\), \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{01})\), and \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{00})\) corresponding to BCN (12)

From the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{10})\) (resp., \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{01})\)), one sees that for each positive-length input sequence U, the input sequence 0U distinguishes initial state 10 (resp., 01) from any other initial state, but 1U cannot.

Taking \(U=1\) as an example, one has the results shown in Table 2.

Table 2 Using input sequence 01 to distinguish initial state 10 from any other initial state

From DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{00})\), one sees that each positive-length input sequence distinguishes initial state 00 from any other initial state.

Hence BCN (12) is strongly multiple-experiment observable by Theorem 20.

Remark 5

When using Theorem 20 to verify strong multiple-experiment observability of a BCN (3), one needs to compute the DFA \(\mathcal {G}_{oD}^{x_0}\) at most \(2^n\) times (once for each \(x_0\in \mathcal {D}^n\)). Hence the overall time complexity of verifying this type of observability is \(O(2^{2^{2n-1}-2^{n-1}+1+m+n})=O(2^{2^{2n-1}+m})\).

4.2.3 Verifying Definition 14

To verify single-experiment observability (Definition 14) of a BCN (3), by Proposition 17, one also needs a powerset construction from the observability graph \(\mathcal {G}_o\) of (3), but this construction is different from \(\mathcal {G}_{oD}^{x_0}\) (54) used for verifying strong multiple-experiment observability.

We also consider \(\mathcal {G}_o=(\mathcal {V},\mathcal {E},\mathcal {W})\) as a deterministic finite-state automaton

$$\begin{aligned} \mathcal {G}_o^{\mathcal {V}_{nd}}=(\mathcal {V},\mathcal {D}^m,\delta ,\mathcal {V}_{nd},\mathcal {V}), \end{aligned}$$
(55)

where \(\mathcal {V}_{nd}=\{\{x,x'\}\vert h(x)=h(x'),x\ne x'\}\subset \mathcal {V}\) is the set of initial states, i.e., the initial states are exactly the non-diagonal vertices of \(\mathcal {G}_o\), and \(\delta \) is a partial function \(\mathcal {V}\times \mathcal {D}^m\rightarrow \mathcal {V}\) satisfying \(\delta (v,u)=v'\) if \((v,v')\in \mathcal {E}\) and \(u\in \mathcal {W}((v,v'))\). Over \(\mathcal {G}_o^{\mathcal {V}_{nd}}\), we construct a DFA

$$\begin{aligned} \mathcal {G}_{oD}^{\mathcal {V}_{nd}}=(2^{\mathcal {V}}\setminus \{\emptyset \},\mathcal {D}^m,\delta ',\mathcal {V}_{nd},2^{\mathcal {V}}\setminus \{\emptyset \}), \end{aligned}$$
(56)

where \(2^{\mathcal {V}}\setminus \{\emptyset \}\) is the state set, \(\mathcal {V}_{nd}\) is the initial state, and \(\delta '\) is defined in the same way as in \(\mathcal {G}_{oD}^{x_0}\) (54). The size of \(\mathcal {G}_{oD}^{\mathcal {V}_{nd}}\) is \(O(2^{2^{2n-1}-2^{n-1}+1+m})=O(2^{2^{2n-1}+m})\). All words that are not accepted by \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) are distinguishing input sequences of (3).

Theorem 21

[21, 37]    A BCN (3) is not single-experiment observable if and only if the corresponding DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) (56) recognizes language \((\mathcal {D}^m)^{*}\), i.e., \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) is complete by Proposition 14.
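In code, the only change with respect to verifying Definition 13 is the initial state of the subset DFA: the subset construction is started from all non-diagonal vertices at once. The following is again our own on-the-fly sketch, with `F` and `H` transcribing BCN (46) and integer-labelled states.

```python
# Theorem 21 via a subset construction started from all non-diagonal vertices;
# F and H transcribe BCN (46) of Example 6.
from itertools import combinations

F = {1: [1, 2, 2, 1, 5, 6, 6, 5],
     2: [1, 1, 2, 2, 5, 5, 6, 6]}
H = [1, 2, 2, 2, 1, 2, 2, 2]

def single_experiment_observable(states, f, h, inputs):
    """True iff a single input sequence distinguishes every pair of different
    initial states (Acc(G_oD^{V_nd}) incomplete)."""
    start = frozenset(frozenset({x, xp}) for x, xp in combinations(states, 2)
                      if h[x - 1] == h[xp - 1])
    seen, stack = {start}, [start]
    while stack:
        Z = stack.pop()
        for u in inputs:
            succ = set()
            for v in Z:
                a, b = min(v), max(v)
                t = frozenset({f[u][a - 1], f[u][b - 1]})
                if len({h[s - 1] for s in t}) == 1:
                    succ.add(t)          # candidate pair survives input u
            if not succ:
                return True              # the labels to Z followed by u distinguish
            Zn = frozenset(succ)
            if Zn not in seen:
                seen.add(Zn)
                stack.append(Zn)
    return False
```

For BCN (46) the pair {1,5} survives every input, so the function returns False, consistent with (46) not even being multiple-experiment observable.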

Example 10

Recall BCN (12) in Example 4. The corresponding DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) is shown in Fig. 8. This DFA is incomplete, so by Theorem 21, (12) is single-experiment observable; moreover, for every positive-length input sequence U, the input sequence 0U distinguishes every pair of different initial states.

Fig. 8  DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) corresponding to BCN (12)

Taking \(U=1\) as an example, one again obtains the results shown in Table 2.

Remark 6

When using Theorem 21 to verify single-experiment observability of a BCN (3), one needs to compute DFA \(\mathcal {G}_{oD}^{\mathcal {V}_{nd}}\) once. Hence the overall time complexity of verifying this type of observability is \(O(2^{2^{2n-1}-2^{n-1}+1+m})=O(2^{2^{2n-1}+m})\).

4.2.4 Verifying Definition 15

Unlike for the previous three definitions of observability, one can directly use the observability graph \(\mathcal {G}_o\) of a BCN (3) to verify its arbitrary-experiment observability (Definition 15). By Proposition 18, the following result holds.

Theorem 22

[37]    A BCN (3) is not arbitrary-experiment observable if and only if in its observability graph \(\mathcal {G}_o\), there exist a non-diagonal vertex v, a cycle C, and a path from v to C.

Due to the finiteness of the state set of (3) and the pigeonhole principle, any sufficiently long input sequence that is not a distinguishing input sequence of two different initial states \(x_0,x_0'\) leads the vertex \(\{x_0,x_0'\}\) to a cycle in \(\mathcal {G}_o\). The converse is also true. Hence Theorem 22 holds.
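The cycle criterion of Theorem 22 can be sketched as follows (our own reformulation): compute the part of the pair graph reachable from the non-diagonal vertices, then repeatedly strip vertices with no remaining successors; a cycle is reachable iff the fixed point is nonempty. `F` and `H` transcribe BCN (46), with integer-labelled states.

```python
# Theorem 22 as reachability to a cycle; F and H transcribe BCN (46).
from itertools import combinations

F = {1: [1, 2, 2, 1, 5, 6, 6, 5],
     2: [1, 1, 2, 2, 5, 5, 6, 6]}
H = [1, 2, 2, 2, 1, 2, 2, 2]

def arbitrary_experiment_observable(states, f, h, inputs):
    """True iff no cycle of the pair graph is reachable from a
    non-diagonal vertex."""
    def successors(v):
        a, b = min(v), max(v)
        out = set()
        for u in inputs:
            t = frozenset({f[u][a - 1], f[u][b - 1]})
            if len({h[s - 1] for s in t}) == 1:   # pair survives input u
                out.add(t)
        return out
    # part of the pair graph reachable from the non-diagonal vertices
    start = [frozenset({x, xp}) for x, xp in combinations(states, 2)
             if h[x - 1] == h[xp - 1]]
    reach, stack = set(start), list(start)
    while stack:
        v = stack.pop()
        for t in successors(v):
            if t not in reach:
                reach.add(t)
                stack.append(t)
    # strip dead ends until a fixed point; a cycle remains iff nonempty
    remaining = set(reach)
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if not (successors(v) & remaining):
                remaining.remove(v)
                changed = True
    return not remaining
```

For BCN (46) the non-diagonal vertex {1,5} carries a self-loop, so the function returns False.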

Example 11

Reconsider BCN (12) in Example 4. In its observability graph (Fig. 5), there is a path from the non-diagonal vertex \(\{01,10\}\) to the diagonal subgraph; since every vertex of the diagonal subgraph has outgoing edges inside the subgraph and hence leads to a cycle, there is a path from \(\{01,10\}\) to some cycle in the diagonal subgraph. Hence (12) is not arbitrary-experiment observable by Theorem 22.

Remark 7

The time complexity of using Theorem 22 to verify arbitrary-experiment observability of a BCN (3) is \(O(2^{2n+m-1})=O(2^{2n+m})\).

4.2.5 Observer design based on arbitrary-experiment observability

Consider a BCN (3) that is arbitrary-experiment observable. One can design an observer as a partial function \({{\,\textrm{obs}\,}}:(\mathcal {D}^m)^{2^{2n-1}-2^{n-1}}\times (\mathcal {D}^q)^{2^{2n-1}-2^{n-1}+1} \rightarrow \mathcal {D}^n\), because in the observability graph \(\mathcal {G}_o\) of (3), there are at most \(2^{2n-1}-2^{n-1}\) non-diagonal vertices, and all input sequences of length \(2^{2n-1}-2^{n-1}\) are distinguishing input sequences of every pair of different initial states. The construction of \({{\,\textrm{obs}\,}}\) is as follows. Choose an initial state \(x_0\in \mathcal {D}^n\) and an input sequence \(U\in (\mathcal {D}^m)^{2^{2n-1}-2^{n-1}}\), and feed them into (3) to obtain the output sequence \(Y\in (\mathcal {D}^q)^{2^{2n-1}-2^{n-1}+1}\), then \({{\,\textrm{obs}\,}}(U,Y)=x_0\), because when the input sequence is U and the corresponding output sequence is Y, the initial state can only be \(x_0\).

4.3 Reconstructibility verification

Reconstructibility (also called detectability) is the property of whether one can use an input sequence and the corresponding output sequence to determine the current (i.e., final) state. This property is weaker than observability. When the initial state is not crucial but one wants to carry out quantitative analysis of a BCN (3) after some time has elapsed, reconstructibility is the relevant property.

We characterize the counterparts, for reconstructibility, of single-experiment observability (Definition 14) and arbitrary-experiment observability (Definition 15) of a BCN (3); they are called single-experiment reconstructibility and arbitrary-experiment reconstructibility. A homing input sequence \(u_1\dots u_l\) of (3), where \(l\in \mathbb {N}\), is such that for every two initial states \(x_0,x_0'\), if the corresponding current states \(x_l\) and \(x_l'\) are different, then the output sequence generated by \(x_0\) and \(u_1\dots u_l\) and the one generated by \(x_0'\) and \(u_1\dots u_l\) are different. Hence a homing input sequence determines the current state.

Definition 17

[12, 46]    A BCN (3) is called single-experiment reconstructible if it has a homing input sequence.

Definition 18

[43]    A BCN (3) is called arbitrary-experiment reconstructible if all input sequences of length greater than some number in \(\mathbb {N}\) are homing.

Definition 18 is strictly stronger than Definition 17 [46].

To verify reconstructibility, we define a notion of reconstructibility graph.

Definition 19

[46]    Consider a BCN (3) and its observability graph \(\mathcal {G}_o\) (Definition 16). Its reconstructibility graph \(\mathcal {G}_r\) is defined by the non-diagonal subgraph of \(\mathcal {G}_o\).

The size of a reconstructibility graph is \(O(2^{2n+m-1})=O(2^{2n+m})\).

Example 12

Reconsider the BCN (12) in Example 4. Its reconstructibility graph is shown in Fig. 9. This figure is the non-diagonal subgraph of Fig. 5 (the observability graph of (12)).

Fig. 9 Reconstructibility graph of BCN (12)

4.3.1 Verifying Definition 18

We first verify Definition 18 and then Definition 17, because it is much easier to verify Definition 18. We directly use the reconstructibility graph of a BCN (3) to verify its arbitrary-experiment reconstructibility. The following result is rather intuitive.

Theorem 23

[46]    Consider a BCN (3) and its reconstructibility graph \(\mathcal {G}_r\). The BCN is not arbitrary-experiment reconstructible if and only if \(\mathcal {G}_r\) contains a cycle.
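The cycle test in Theorem 23 is a standard graph computation. A sketch, assuming \(\mathcal {G}_r\) is encoded as an adjacency dict `adj` mapping every vertex to its successor set (the input labels on edges are irrelevant for this check):

```python
def has_cycle(adj):
    """Iterative three-color DFS; every vertex must appear as a key of `adj`."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in adj}
    for root in adj:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(adj[root]))]
        while stack:
            v, it = stack[-1]
            w = next(it, None)
            if w is None:
                color[v] = BLACK
                stack.pop()
            elif color[w] == GRAY:      # back edge: a cycle exists
                return True
            elif color[w] == WHITE:
                color[w] = GRAY
                stack.append((w, iter(adj[w])))
    return False
```

By Theorem 23, the BCN is arbitrary-experiment reconstructible exactly when this test returns False on its reconstructibility graph.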

Remark 8

Compared with the necessary and sufficient condition for arbitrary-experiment reconstructibility of (3) given in [43], the condition in Theorem 23 is much simpler and much easier to check. Recall that the condition given in [43] was derived from Moore’s partition (defined in (11)).

Example 13

Reconsider the BCN (12) in Example 4. Since its reconstructibility graph (Fig. 9) contains no cycle, by Theorem 23, (12) is arbitrary-experiment reconstructible. By Fig. 9 one can see that all input sequences starting with 0 of length at least 2 and all input sequences starting with 1 are homing. The procedure of using the homing input sequence 01 to determine the current state is shown in Table 2.

Remark 9

The time complexity of using Theorem 23 to verify arbitrary-experiment reconstructibility of a BCN (3) is \(O(2^{2n+m-1})=O(2^{2n+m})\).

4.3.2 Verifying Definition 17

Similar to verifying single-experiment observability (Sect. 4.2.3), to verify single-experiment reconstructibility (Definition 17) of a BCN (3), one also needs a powerset construction from its reconstructibility graph \(\mathcal {G}_r\).

We also consider \(\mathcal {G}_r=(\mathcal {V},\mathcal {E},\mathcal {W})\) as a deterministic finite-state automaton

$$\begin{aligned} \mathcal {G}_r=(\mathcal {V},\mathcal {D}^m,\delta ,\mathcal {V},\mathcal {V}), \end{aligned}$$
(57)

where all states are initial, \(\delta \) is a partial function \(\mathcal {V}\times \mathcal {D}^m\rightarrow \mathcal {V}\) satisfying \(\delta (v,u)=v'\) if \((v,v')\in \mathcal {E}\) and \(u\in \mathcal {W}((v,v'))\). Over \(\mathcal {G}_r\), we construct a DFA

$$\begin{aligned} \mathcal {G}_{rD}=(2^{\mathcal {V}}\setminus \{\emptyset \},\mathcal {D}^m,\delta ',\mathcal {V},2^{\mathcal {V}}\setminus \{\emptyset \}), \end{aligned}$$
(58)

where \(2^{\mathcal {V}}\setminus \{\emptyset \}\) is the state set, \(\mathcal {V}\) is the initial state, and for all \(Z\in 2^{\mathcal {V}} \setminus \{\emptyset \}\) and \(u\in \mathcal {D}^m\), \(\delta '(Z,u)=\{\delta (z,u)\vert z\in Z,\delta \text { is defined at }(z,u)\}\) if \(\delta \) is defined at \((z,u)\) for at least one \(z\in Z\) (otherwise \(\delta '\) is undefined at (Z, u)).
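The accessible part of (58) can be built by a breadth-first subset construction. A sketch, assuming \(\delta \) of \(\mathcal {G}_r\) is encoded as nested dicts `delta[v][u]` (the vertex names below are hypothetical):

```python
from collections import deque

def accessible_powerset_dfa(vertices, inputs, delta):
    """Accessible part of the subset construction (58): `delta` is the partial
    transition function of G_r. The initial state is the full vertex set; a
    subset Z has no u-successor when delta is undefined on all of Z x {u}."""
    start = frozenset(vertices)
    seen, trans = {start}, {}
    queue = deque([start])
    while queue:
        Z = queue.popleft()
        for u in inputs:
            Z2 = frozenset(delta[z][u] for z in Z if u in delta.get(z, {}))
            if not Z2:          # delta undefined on all of Z x {u}
                continue
            trans[(Z, u)] = Z2
            if Z2 not in seen:
                seen.add(Z2)
                queue.append(Z2)
    return seen, trans

states, trans = accessible_powerset_dfa({'p', 'q'}, (0, 1),
                                        {'p': {0: 'q'}, 'q': {}})
```

By Theorem 24, the BCN is single-experiment reconstructible exactly when this DFA is not complete, i.e., some reachable pair (Z, u) has no successor.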

The words not accepted by the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) are exactly the homing input sequences of (3). Then the following result holds.

Theorem 24

[46]    A BCN (3) is not single-experiment reconstructible if and only if the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) (58) recognizes language \((\mathcal {D}^m)^{*}\), i.e., \(\mathcal {G}_{rD}\) is complete by Proposition 14.

Example 14

Reconsider the BCN (12) in Example 4. The corresponding DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) is shown in Fig. 10. This DFA is not complete, so by Theorem 24, (12) is single-experiment reconstructible.

Fig. 10 DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) corresponding to BCN (12)

Remark 10

Given a BCN (3), the size of DFA \(\mathcal {G}_{rD}\) (58) is \(O(2^{2^{2n-1}-2^{n-1}+m})=O(2^{2^{2n-1}+m})\). Then the time complexity of using Theorem 24 to verify single-experiment reconstructibility is \(O(2^{2^{2n-1}+m})\).

Further discussion on verification of single-experiment reconstructibility   It has been shown that the time complexity of using Theorem 24 to verify single-experiment reconstructibility of a BCN (3) is doubly exponential. Is it possible to reduce this complexity? The size of the DFA \(\mathcal {G}_{rD}\) (58) is exponential in the size of the reconstructibility graph \(\mathcal {G}_r\) of (3), so one possible way of reducing the time complexity is to check whether \(\mathcal {G}_r\) can be used directly to verify this type of reconstructibility. Luckily, this reduction works, because the condition in Theorem 24 can be equivalently expressed as a kind of “limit behavior” of \(\mathcal {G}_r\). In detail, if one regards \(\mathcal {G}_r\) as a dynamical system in which all vertices are initial states, then completeness of the DFA \(\mathcal {G}_{rD}\) implies that, after a sufficiently long time, all reachable vertices of \(\mathcal {G}_r\) have outdegree \(2^m\); this holds because \(\mathcal {G}_r\) is deterministic. The converse implication also holds: the limit set of \(\mathcal {G}_r\) (i.e., the set of all vertices that can be reached at arbitrarily large times) generates a complete subgraph, in which every vertex has outdegree \(2^m\). The conclusion is formulated as follows:

Proposition 25

[46]    Consider a BCN (3), its reconstructibility graph \(\mathcal {G}_r=(\mathcal {V},\mathcal {E},\mathcal {W})\), and the DFA \(\mathcal {G}_{rD}\) defined in (58). DFA \(\mathcal {G}_{rD}\) is complete if and only if \(\mathcal {G}_r\) has a complete subgraph.

By Theorem 24 and Proposition 25, the following result holds.

Theorem 26

[46]    A BCN (3) is not single-experiment reconstructible if and only if its reconstructibility graph \(\mathcal {G}_r\) has a complete subgraph.

To check the condition in Theorem 26, one can remove the set \(\mathcal {V}_{<2^m}\) of all vertices of \(\mathcal {G}_r\) with outdegree less than \(2^m\), together with all vertices from which \(\mathcal {V}_{<2^{m}}\) is reachable. The rest of \(\mathcal {G}_r\), if nonempty, is a complete subgraph. Hence this check can be done in time linear in the size of \(\mathcal {G}_r\).
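The removal procedure just described can be sketched with a reverse reachability sweep, assuming the deterministic reconstructibility graph is encoded as nested dicts `delta[v][u]` whose values are vertices (the vertex names below are hypothetical stand-ins for the pair-vertices):

```python
def survives_complete_subgraph(delta, num_inputs):
    """Check for a complete subgraph of G_r (Theorem 26): delete every vertex
    whose outdegree is below num_inputs, together with every vertex that can
    reach one, via reverse reachability; every vertex is a key of `delta` and
    delta[v][u] is the unique u-successor of v when defined."""
    rev = {v: set() for v in delta}         # reverse adjacency
    for v, succs in delta.items():
        for w in succs.values():
            rev[w].add(v)
    removed = {v for v in delta if len(delta[v]) < num_inputs}
    stack = list(removed)
    while stack:                            # sweep backwards from bad vertices
        w = stack.pop()
        for v in rev[w]:
            if v not in removed:
                removed.add(v)
                stack.append(v)
    return len(removed) < len(delta)        # survivors form a complete subgraph
```

By Theorem 26, the BCN is single-experiment reconstructible exactly when this returns False.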

Example 15

Consider BCN

$$\begin{aligned} \begin{aligned} x(t+1)&= f(u(t),x(t)),\\ y(t)&= h(x(t)), \end{aligned} \end{aligned}$$
(59)

where \(t\in \mathbb {N}\), \(x(t)\in \mathcal {D}^2\), \(u(t),y(t)\in \mathcal {D}\), f and h are as follows:

(a) f.

x      00  01  10  11
u=0    01  01  10  11
u=1    01  01  10  11

(b) h.

x   00  01  10  11
y    0   0   0   1

The reconstructibility graph of (59) is shown in Fig. 11.

Fig. 11 Reconstructibility graph of BCN (59)

In Fig. 11, there is a complete subgraph (the self-loop on vertex \(\{01,10\}\)), so by Theorem 26, (59) is not single-experiment reconstructible. The corresponding DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) (defined in (58)) is shown in Fig. 12. This DFA is complete, which matches Proposition 25.

Fig. 12 DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) corresponding to BCN (59)

Remark 11

The time complexity of using Theorem 26 to verify single-experiment reconstructibility of a BCN (3) is \(O(2^{2n+m})\). When the classical verification algorithm for the existence of homing sequences of Mealy machines [12, Chapter 1] (based on a notion of current-state uncertainty, similar to the initial-state uncertainty on pp. 10) is applied to verify single-experiment reconstructibility of (3), it runs in time \(O(2^{3n}+2^{2n+m})\), and homing input sequences can also be computed within this time.

4.4 Further applications of the observability graph to observability of Boolean control networks

As shown in Sect. 4.3.2, we first used the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{rD})\) (58) to verify single-experiment reconstructibility of a BCN (3) (Theorem 24), and then directly used the reconstructibility graph \(\mathcal {G}_r\) of (3) to verify this type of reconstructibility (Theorem 26). This modification leads to a tremendous decrease in time complexity (from doubly exponential to exponential). One may then ask whether such a decrease could also be obtained in verifying observability. For arbitrary-experiment observability, we had already directly used the observability graph \(\mathcal {G}_o\) of (3) to do the verification (Theorem 22). For strong multiple-experiment observability (Theorem 20) and single-experiment observability (Theorem 21), it is unlikely that \(\mathcal {G}_o\) can be used directly, because the powerset construction from \(\mathcal {G}_o\) is unlikely to be avoidable (due to the diagonal subgraph). However, for multiple-experiment observability (Theorem 19), \(\mathcal {G}_o\) can be used directly. The method is quite similar to the one used for verifying single-experiment reconstructibility in Theorem 26.

Proposition 27

[22, 23]    Consider a BCN (3). Two different initial states \(x_0,x_0'\in \mathcal {D}^n\) with \(h(x_0)\ne h(x_0')\) have \(\epsilon \) as one of their distinguishing input sequences. Two different initial states \(x_0,x_0'\in \mathcal {D}^n\) with \(h(x_0)=h(x_0')\) have a distinguishing input sequence if and only if in the observability graph \(\mathcal {G}_o\) of (3), either \({\text {outdeg}}(\{x_0,x_0'\})<\vert \mathcal {D}^m\vert =2^m\) or there is a vertex v reachable from \(\{x_0,x_0'\}\) with \({\text {outdeg}}(v)<2^m\).

By Proposition 27, a necessary and sufficient condition for multiple-experiment observability of (3) can be obtained.

Theorem 28

[22, 23]    A BCN (3) is multiple-experiment observable if and only if in its observability graph \(\mathcal {G}_o\), for every non-diagonal vertex v with outdegree equal to \(2^m\), there is a path from v to some non-diagonal vertex with outdegree less than \(2^m\).

Theorem 28 was expressed via the adjacency matrix of \(\mathcal {G}_o\) in [22], where the matrix was called the observability matrix; the set of diagonal vertices and the set of non-diagonal vertices of \(\mathcal {G}_o\) (see Definition 16) are exactly the set D and the set \(\Xi \) on [22, pp. 78]. Theorem 28 was expressed via \(\mathcal {G}_o\) in [23], where \(\mathcal {G}_o\) was renamed the observability graph. The condition in Theorem 28 can be verified in time linear in the size of the observability graph \(\mathcal {G}_o=(\mathcal {V},\mathcal {E},\mathcal {W})\) as follows (similarly to checking the condition in Theorem 26): first, remove the set \(\mathcal {V}_{<2^m}\) of all (non-diagonal) vertices of \(\mathcal {G}_o\) with outdegree less than \(2^m\), together with all (non-diagonal) vertices from which \(\mathcal {V}_{<2^{m}}\) is reachable; then, the rest of \(\mathcal {G}_o\) contains a non-diagonal vertex if and only if the condition in Theorem 28 does not hold, if and only if (3) is not multiple-experiment observable.
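The linear-time procedure above admits a direct sketch, again using a reverse reachability sweep; `delta` encodes the deterministic observability graph as nested dicts `delta[v][u]` and `diagonal` is the set of diagonal vertices (all vertex names below are hypothetical):

```python
def multiple_experiment_observable(delta, diagonal, num_inputs):
    """Theorem 28 check on the observability graph: peel off V_{<2^m} together
    with every vertex that can reach it; the BCN is multiple-experiment
    observable iff no non-diagonal vertex survives the peeling."""
    rev = {v: set() for v in delta}             # reverse adjacency
    for v, succs in delta.items():
        for w in succs.values():
            rev[w].add(v)
    removed = {v for v in delta if len(delta[v]) < num_inputs}
    stack = list(removed)
    while stack:                                # backward sweep from V_{<2^m}
        w = stack.pop()
        for v in rev[w]:
            if v not in removed:
                removed.add(v)
                stack.append(v)
    return all(v in diagonal or v in removed for v in delta)
```

Diagonal vertices always have full outdegree (f is total) and never reach non-diagonal ones, so only non-diagonal survivors matter.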

Remark 12

The time complexity of using Theorem 28 to verify multiple-experiment observability of a BCN (3) is \(O(2^{2n+m})\), which is lower than that of using Theorem 19 with complexity \(O(2^{4n+m})\) (see Remark 4).

Example 16

Reconsider BCN (12) in Example 4 and its observability graph \(\mathcal {G}_o\) in Fig. 5. One sees that \(\mathcal {V}_{<2}=\{\{00,10\},\{01,10\},\{01,00\}\}\). The rest of \(\mathcal {G}_o\) contains no non-diagonal vertex, so by Theorem 28, (12) is multiple-experiment observable. That is, the three DFAs \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{00,10\}})\), \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{01,10\}})\), and \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\{01,00\}})\) computed in Example 8 are no longer needed.

Example 17

Consider BCN

$$\begin{aligned} \begin{aligned} x(t+1)&= f(u(t),x(t)),\\ y(t)&= h(x(t)), \end{aligned} \end{aligned}$$
(60)

where \(t\in \mathbb {N}\), \(x(t)\in \mathcal {D}^2\), \(u(t),y(t)\in \mathcal {D}\), f and h are as follows:

(a) f.

x      00  01  10  11
u=0    11  10  10  11
u=1    11  10  10  11

(b) h.

x   00  01  10  11
y    0   0   0   1

The observability graph \(\mathcal {G}_o\) of (60) is shown in Fig. 13.

To check the condition in Theorem 28, we compute \(\mathcal {V}_{<2}=\{\{00,10\},\{01,00\}\}\). The rest of \(\mathcal {G}_o\) contains the vertex \(\{01,10\}\), whose outdegree is equal to 2. Therefore, by Theorem 28, (60) is not multiple-experiment observable. By Fig. 13, one easily sees that initial states 01 and 10 have no distinguishing input sequence.

Fig. 13 Observability graph of BCN (60)

4.5 Extensions of the observability graph to nondeterministic finite-transition systems

After writing the papers [21, 37, 46] and establishing the basic framework of using the observability graph to verify the four kinds of observability of BCNs, the writer tried to extend the notion to nondeterministic finite-transition systems (NFTSs). Recall that Moore machines and Mealy machines are deterministic finite-transition systems. Luckily, the notion extends naturally, and hence the previously studied notions of observability can be verified in NFTSs using the extended observability graph [20]. However, in NFTSs, Moore’s equivalence relation (see (11)) is not necessarily transitive [20], hence Moore’s partition-based method does not apply to NFTSs.

An NFTS is a sextuple

$$S=(X,X_0,U,f,Y,h),$$

where

  • X is a finite set of states,

  • \(X_0\subset X\) a set of initial states,

  • U a (finite) alphabet of inputs,

  • \(f: X\times U \rightarrow 2^X\setminus \{\emptyset \}\) the transition function,

  • Y an alphabet of outputs, and

  • \(h:X\rightarrow Y\) the output map.

In [20], the NFTSs considered are a bit more general in the sense that \(f:X\times U\rightarrow 2^X\), \(h:X\times U\rightarrow Y\).

4.5.1 Different definitions of observability of nondeterministic finite-transition systems

Definition 20

[20]    An NFTS \(S=(X,X_0,U,f,Y,h)\) is called multiple-experiment observable if for every two different initial states \(x_0,x_0'\in X_0\), there is an input sequence \(\alpha \in U^{*}\) such that every output sequence \(\beta \) of length \(\vert \alpha \vert +1\) generated by \(x_0\) and \(\alpha \) is different from every output sequence \(\beta '\) of length \(\vert \alpha \vert +1\) generated by \(x_0'\) and \(\alpha \). Such an \(\alpha \) is called a distinguishing input sequence of \(x_0\) and \(x_0'\).

Definition 21

[20]    An NFTS \(S=(X,X_0,U,f,Y,h)\) is called single-experiment observable if there is an input sequence \(\alpha \in U^{*}\) such that for every two different initial states \(x_0,x_0'\in X_0\), every output sequence \(\beta \) of length \(\vert \alpha \vert +1\) generated by \(x_0\) and \(\alpha \), and every output sequence \(\beta '\) of length \(\vert \alpha \vert +1\) generated by \(x_0'\) and \(\alpha \), it holds that \(\beta \ne \beta '\). Such an \(\alpha \) is called a distinguishing input sequence of S.

Definition 22

[20]    An NFTS \(S=(X,X_0,U,f,Y,h)\) is called arbitrary-experiment observable if for every two different initial states \(x_0,x_0'\in X_0\), for every input sequence \(\alpha \in U^{\omega }\), every output sequence \(\beta \in Y^{\omega }\) generated by \(x_0\) and \(\alpha \) is different from every output sequence \(\beta '\in Y^{\omega }\) generated by \(x_0'\) and \(\alpha \).

4.5.2 Observability graphs of nondeterministic finite-transition systems

In [20], the observability graph \(\mathcal {G}_o\) of a BCN (3) (Definition 16) was extended to an NFTS S as follows (in the form of its adjacency matrix, see (5) of [20]).

Definition 23

[20]    Consider an NFTS S. A triple \(\mathcal {G}_o=(\mathcal {V},\mathcal {E},\mathcal {W})\) is called its observability graph if the vertex set \(\mathcal {V}\) is equal to \( \{\{x,x'\}\vert x,x'\in X,h(x)=h(x')\}\), the edge set \(\mathcal {E}\) is equal to \(\{(\{x_1,x_1'\},\{x_2,x_2'\})\in \mathcal {V}\times \mathcal {V}\vert \)there exists \(u\in U\) such that \(x_2\in f(x_1,u)\) and \(x_2'\in f(x_1',u)\), or, \(x_2'\in f(x_1,u)\) and \(x_2\in f(x_1',u)\}\subset \mathcal {V}\times \mathcal {V}\), and the weight function \(\mathcal {W}:\mathcal {E}\rightarrow 2^{U}\) assigns to each edge \((\{x_1,x_1'\},\{x_2,x_2'\})\in \mathcal {E}\) a set \( \{u\in U\vert x_2\in f(x_1,u)\) and \(x_2'\in f(x_1',u)\), or, \(x_2'\in f(x_1,u)\) and \(x_2\in f(x_1',u)\}\) of inputs. Diagonal vertices and non-diagonal vertices are also defined as the vertices \(\{x,x'\}\) satisfying \(x=x'\) and \(x\ne x'\), respectively.

The number of vertices of \(\mathcal {G}_o\) of an NFTS S is bounded by \(\vert X\vert +\vert X\vert (\vert X\vert -1)/2=(\vert X\vert ^2+\vert X\vert )/2\). Then the size of \(\mathcal {G}_o\) is \(O(\vert X\vert ^4\vert U\vert )\).
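Definition 23 can be turned into code directly. A sketch, assuming the NFTS is given by dicts `f[x][u]` (a set of successors) and `h[x]`, with a small hypothetical three-state example appended; since vertices are unordered `frozenset` pairs, the "either order" clause of the definition is covered automatically:

```python
from itertools import combinations_with_replacement

def observability_graph(X, U, f, h):
    """Observability graph of an NFTS (Definition 23): vertices are unordered
    pairs {x, x'} with h(x) == h(x'); the weight of an edge collects every
    input that can drive the pair to the successor pair."""
    V = {frozenset((x, xp))
         for x, xp in combinations_with_replacement(sorted(X), 2)
         if h[x] == h[xp]}
    W = {}
    for v in V:
        pair = sorted(v)
        x, xp = pair[0], pair[-1]       # x == xp on diagonal vertices
        for u in U:
            for s in f[x][u]:
                for sp in f[xp][u]:
                    v2 = frozenset((s, sp))
                    if v2 in V:         # keep only pairs with equal outputs
                        W.setdefault((v, v2), set()).add(u)
    return V, W

# Hypothetical three-state NFTS with one input and binary output.
V, W = observability_graph(
    ('a', 'b', 'c'), (0,),
    {'a': {0: {'b'}}, 'b': {0: {'b', 'c'}}, 'c': {0: {'c'}}},
    {'a': 0, 'b': 0, 'c': 1})
```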

With the notion of observability graph of an NFTS S, the procedures shown in Sect. 4.2.1, Sect. 4.2.3, and Sect. 4.2.4 to verify observability of BCNs were extended to NFTSs (i.e., the DFAs (52) and (56) were extended to S), and verification algorithms for the three types of observability were derived for NFTSs. As with arbitrary-experiment observability of a BCN (3), this type of observability of S was also verified directly using the observability graph \(\mathcal {G}_o\) of S.

4.5.3 Verifying Definition 20

To verify multiple-experiment observability of S, we consider a non-diagonal vertex \(v:=\{x,x'\}\in \mathcal {V}\) of its observability graph \(\mathcal {G}_o\), where \(x,x'\in X_0\), and regard \(\mathcal {G}_o\) as a nondeterministic finite automaton (NFA)

$$\begin{aligned} \mathcal {G}_o^v=(\mathcal {V},U,\delta ,v,\mathcal {V}), \end{aligned}$$
(61)

where \(\mathcal {V}\) is the state set, v is the initial state, \(\delta :\mathcal {V}\times U\rightarrow 2^{\mathcal {V}}\) is such that for all \(v'\in \mathcal {V}\) and \(u\in U\), \(\delta (v',u)=\{v''\in \mathcal {V}\vert (v',v'')\in \mathcal {E},u\in \mathcal {W}((v',v''))\}\). Over \(\mathcal {G}_o^v\) we construct the following DFA (the counterpart of DFA \(\mathcal {G}_{oD}^{v}\) (52) of BCN (3))

$$\begin{aligned} \mathcal {G}_{oD}^{v}=(2^{\mathcal {V}}\setminus \{\emptyset \},U,\delta ',\{v\},2^{\mathcal {V}}\setminus \{\emptyset \}), \end{aligned}$$
(62)

where \(2^{\mathcal {V}}\setminus \{\emptyset \}\) is the state set, \(\{v\}\) is the initial state, and \(\delta '\) is defined in the same way as the one in \(\mathcal {G}_{oD}^{x_0}\) (54). The size of \(\mathcal {G}_{oD}^{v}\) is \(O\big (2^{\frac{\vert X\vert ^2+\vert X\vert }{2}}\vert U\vert \big )\). Similarly to \(\mathcal {G}_{oD}^v\) (52), the words not accepted by \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^v)\) (62) are exactly the distinguishing input sequences of x and \(x'\). Then by Proposition 14, x and \(x'\) have no distinguishing input sequence if and only if DFA (62) is complete. Similar to Theorem 19, the following result holds.

Theorem 29

[20]    An NFTS S is not multiple-experiment observable if and only if there exist two different initial states \(x_0,x_0'\in X_0\) such that \(h(x_0)=h(x_0')\) and the DFA \({{\,\textrm{Acc}\,}}\big (\mathcal {G}_{oD}^{\{x_0,x_0'\}}\big )\) (62) is complete.

Remark 13

When using Theorem 29 to verify multiple-experiment observability of an NFTS S, one needs to compute the DFA \(\mathcal {G}_{oD}^v\) (62) at most \(\vert X_0\vert (\vert X_0\vert -1)/2\) times, so the verification takes time \(O\big (\vert X_0\vert ^2\vert U\vert 2^{\frac{\vert X\vert ^2+\vert X\vert }{2}}\big )\).

4.5.4 Verifying Definition 21

To verify single-experiment observability of an NFTS S, we consider the observability graph \(\mathcal {G}_o\) of S as a nondeterministic finite-state automaton

$$\begin{aligned} (\mathcal {V},U,\delta ,\mathcal {V}_{nd},\mathcal {V}), \end{aligned}$$
(63)

where \(\mathcal {V}_{nd}=\{\{x_0,x_0'\}\vert x_0,x_0'\in X_0,x_0\ne x_0',h(x_0)=h(x_0')\}\) is the initial state set, \(\delta :\mathcal {V}\times U\rightarrow 2^{\mathcal {V}}\) is the same as the one defined in (61). Furthermore, we compute the following DFA from (63):

$$\begin{aligned} \mathcal {G}_{oD}^{\mathcal {V}_{nd}}=(2^{\mathcal {V}}\setminus \{\emptyset \},U,\delta ',\mathcal {V}_{nd},2^{\mathcal {V}}\setminus \{\emptyset \}), \end{aligned}$$
(64)

where \(2^{\mathcal {V}}\setminus \{\emptyset \}\) is the state set, \(\mathcal {V}_{nd}\) is the initial state, and \(\delta '\) is defined in the same way as the one in (62). The size of (64) is \(O\big (2^{\frac{\vert X\vert ^2+\vert X\vert }{2}}\vert U\vert \big )\). Similarly to \(\mathcal {G}_{oD}^{\mathcal {V}_{nd}}\) (56), the words not accepted by \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) (64) are exactly the distinguishing input sequences of S.

Theorem 30

[20]    An NFTS S is not single-experiment observable if and only if the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{\mathcal {V}_{nd}})\) (64) is complete.

Remark 14

The time complexity of using Theorem 30 to verify single-experiment observability of an NFTS S is \(O\big (2^{\frac{\vert X\vert ^2+\vert X\vert }{2}}\vert U\vert \big )\).

4.5.5 Verifying Definition 22

As mentioned above, one can directly use the observability graph \(\mathcal {G}_o\) of an NFTS S to verify its arbitrary-experiment observability.

Theorem 31

[20]    An NFTS S is not arbitrary-experiment observable if and only if in its observability graph \(\mathcal {G}_o\), there exists a non-diagonal vertex v, a cycle C, and a path from v to C.

Remark 15

The time complexity of using Theorem 31 to verify arbitrary-experiment observability of an NFTS S is \(O(\vert X\vert ^4\vert U\vert )\).

4.5.6 Observer design based on arbitrary-experiment observability

Similar to the observer design for a BCN (3) based on arbitrary-experiment observability shown in Sect. 4.2.5, we next design an observer for an arbitrary-experiment observable NFTS S. For an NFTS S that is arbitrary-experiment observable with \(X_0=X\), one can design an observer as a partial function \({{\,\textrm{obs}\,}}:U^{\vert X\vert (\vert X\vert -1)/2}\times Y^{\vert X\vert (\vert X\vert -1)/2+1} \rightarrow X_0\), because in the observability graph \(\mathcal {G}_o\) of S, there are at most \(\vert X\vert (\vert X\vert -1)/2\) non-diagonal vertices, and all input sequences of length \(\vert X\vert (\vert X\vert -1)/2\) are distinguishing input sequences of every pair of different initial states. The construction of \({{\,\textrm{obs}\,}}\) is as follows. Choose an initial state \(x_0\in X_0\) and an input sequence \(\alpha \in U^{\vert X\vert (\vert X\vert -1)/2}\), feed them into S, and choose an arbitrary generated output sequence \(\beta \in Y^{\vert X\vert (\vert X\vert -1)/2+1}\); then \({{\,\textrm{obs}\,}}(\alpha ,\beta )=x_0\), because when the input sequence is \(\alpha \) and the corresponding output sequence is \(\beta \), the initial state can only be \(x_0\).

Recall that S is time-invariant, so the observer \({{\,\textrm{obs}\,}}\) works starting at any time step. For example, write \(N=\vert X\vert (\vert X\vert -1)/2\), consider an input sequence \(u(0)u(1)\dots \), feed it into an S that is arbitrary-experiment observable, and let S produce an output sequence \(y(0)y(1)\dots \). One uses \(u(0)\dots u(N-1)\) and \(y(0)\dots y(N)\) to determine x(0), then uses \(u(1)\dots u(N)\) and \(y(1)\dots y(N+1)\) to determine x(1) no matter whether x(0) is known, then uses \(u(2)\dots u(N+1)\) and \(y(2)\dots y(N+2)\) to determine x(2) no matter whether x(0) and x(1) are known, and so on.

The writer wants to point out that when writing the papers [21, 37, 46], he had not known of the research area of model-based testing, including [10, 12, 39]. When writing the paper [20], however, he found this theoretical computer-science area and became aware that several notions of observability had been studied in it, although the notions were not called observability. In [10, 39], Moore machines and Mealy machines were studied; multiple-experiment observability was characterized in [10] (already introduced in Sect. 3.1), while single-experiment observability and strong multiple-experiment observability were studied in [39], where their verification problems were proven to be \(\textsf{PSPACE}\)-complete in Mealy machines. However, none of the verification methods found in [10, 39] is equivalent to the observability-graph method (already shown on Page 9 in Sect. 2.7). Later on, when the writer started to look for observability results on NFTSs in the literature, he found the paper [47], in which a structure equivalent to the DFA \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^{v})\) (62) was directly used to verify multiple-experiment observability of NFTSs. Note that in [47], the counterpart of \({{\,\textrm{Acc}\,}}(\mathcal {G}_{oD}^v)\) was used only to verify multiple-experiment observability. Note also that it is far from trivial to recover the observability graph \(\mathcal {G}_o\) of S from this counterpart, because the size of the counterpart is exponential in the size of \(\mathcal {G}_o\).

4.6 Applications of the observability graph to observability of singular Boolean control networks and probabilistic Boolean networks

After the papers [21, 37, 46] were published, the basic idea of the observability graph has been widely used in the control area of BCNs, not only in [22, 23]. For example, in [25, 26], the problems of set controllability/reachability of BCNs were studied; moreover, after computing the product (called the parallel extension in [26], which is actually the adjacency matrix of the observability graph proposed in [21, 37] when applied to verify observability) of two copies of a BCN (3), multiple-experiment observability of (3) was verified by verifying set controllability for the product. Note that the technical details of verifying multiple-experiment observability of (3) via set controllability of the parallel extension of (3) are exactly the same as those in [21, 37], but in an equivalent matrix representation (which only looks different from the graph and automaton representation used in [21, 37]). In [26], two other types of observability of (3) were also verified over the parallel extension of (3) (see Table 3).

The observability graph has also been used to study observability perturbation for multivalued logical networks (a deterministic extension of BNs) in [30]. Note that [30, Theorem 1] is a special case of [37, Theorem 3.15] (i.e., Theorem 22 in the current survey).

A variant of the reconstructibility graph (Definition 19) was used to study reconstructibility of singular BCNs in [29], where such networks are a subclass of NFTSs studied in [20]. A summary can be found in Table 3.

Table 3 Complexity upper bounds for verifying four definitions of observability in BCNs, where the algorithms corresponding to bounds with the same type of underline are derived from mathematically equivalent methods, and n and m denote the numbers of state nodes and input nodes. The method in [48] is function-based; all the other methods are state-based. Hence the method in [48] shows remarkably different efficiencies when applied to BCNs with different structures, whereas the other methods show similar efficiencies

Observability results have also been extended to probabilistic Boolean networks (PBNs) [27, 28, 49, 50], where in [27, 28], the observability graph was used. In the PBNs studied in these papers, the stochastic switching signals are independent and identically distributed processes, hence the PBNs are actually discrete-time finite-state time-homogeneous Markov chains. Moreover, if all probabilities in such a PBN are removed then it becomes an NFTS in which the input is constant. That is, the systems considered in [20] are more general than the systems considered in [27, 28, 49, 50].

In [27, 28, 49, 50], four definitions of observability for PBNs were studied: observability with probability one on \(\llbracket 0,\theta \rrbracket \) with \(\theta \in \mathbb {N}\), observability in probability on \(\llbracket 0,\theta \rrbracket \) with \(\theta \in \mathbb {N}\), finite-time observability in probability, and asymptotic observability in distribution.

The PBNs studied in [27, 28, 49, 50] are as follows:

$$\begin{aligned} \begin{aligned} x(t+1)&= L_{\sigma (t)} x(t),\\ y(t)&= Hx(t), \end{aligned} \end{aligned}$$
(65)

where \(\sigma :\mathbb {N}\rightarrow \llbracket 1,s \rrbracket \) with \(s\in \mathbb {Z}_+\) is an independent and identically distributed process and at each time \(t\in \mathbb {N}\), \({\text {Prob}}\{\sigma (t)=i\}=p_i\), \(i\in \llbracket 1,s \rrbracket \), with \(p_i>0\) and \(\sum _{i=1}^{s} p_i=1\), \([p_1,\dots ,p_s]=:\varvec{p}\) is the probability distribution of \(\sigma \); \(L_1,\dots ,L_s\in \mathcal {L}_{2^n\times 2^n}\) are the system matrices; \(H\in \mathcal {L}_{2^q\times 2^n}\) is the output matrix; \(x(t)\in \varDelta _{2^n}\); \(y(t)\in \varDelta _{2^q}\).

Because \(\sigma \) is independent and identically distributed, (65) can be reformulated as the following BCN:

$$\begin{aligned} \begin{aligned} x(t+1)&= [L_1,\dots ,L_s] u(t) x(t),\\ y(t)&= Hx(t), \end{aligned} \end{aligned}$$
(66)

where \(u(t)\in \varDelta _s\), the other variables are the same as above.

Let \(\varvec{x}(\theta ;\sigma ,x_0)\) (resp., \(\varvec{y}(\theta ;\sigma ,x_0)\)) be any of the admissible state (resp., output) sequences of (65) starting from initial state \(x_0\in \varDelta _{2^n}\) on the discrete time set \(\llbracket 0,\theta \rrbracket \). The sequences \(\varvec{y}(\theta ;\sigma ,x_0)\) and \(\varvec{y}(\theta ;\sigma ,x_0')\) differ only in their initial states; at each time step, the switching signal \(\sigma \) takes the same value in both. Hence \({\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0)\}=0\) for any \(x_0\in \varDelta _{2^n}\). Note that \({\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\}\) is non-decreasing in \(\theta \) and its limit need not be reached in finitely many steps.

Definition 24

[27, 50]    Let \(\theta \) be in \(\mathbb {N}\). A PBN (65) is called observable with probability one on \(\llbracket 0,\theta \rrbracket \) if for every two different initial states \(x_0,x_0'\in \varDelta _{2^n}\),

$$\begin{aligned} {\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\}=1. \end{aligned}$$
(67)

It is trivial to see that a PBN (65) is observable with probability one on \(\llbracket 0,\theta \rrbracket \) if and only if the corresponding BCN (66) is arbitrary-experiment observable (Definition 15) and all input sequences of length \(\theta \) are distinguishing input sequences of (66). Hence, observability with probability one on \(\llbracket 0,\theta \rrbracket \) can be regarded as a slightly stronger version of arbitrary-experiment observability.

Definition 25

[27, 49]    Let \(\theta \) be in \(\mathbb {N}\). A PBN (65) is called observable in probability on \(\llbracket 0,\theta \rrbracket \) if for every two different initial states \(x_0,x_0'\in \varDelta _{2^n}\),

$$\begin{aligned} {\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\}>0. \end{aligned}$$
(68)

Definition 26

[27]    A PBN (65) is called finite-time observable in probability if there is \(\theta \in \mathbb {N}\) such that (65) is observable in probability on \(\llbracket 0,\theta \rrbracket \).

By definition, a PBN (65) is finite-time observable in probability if and only if the corresponding BCN (66) is multiple-experiment observable (Definition 12). Furthermore, (65) is observable in probability on \(\llbracket 0,\theta \rrbracket \) if and only if the corresponding (66) is multiple-experiment observable and all pairs of different initial states \(x_0,x_0'\) have a distinguishing input sequence of fixed length \(\theta \). Hence observability in probability on \(\llbracket 0,\theta \rrbracket \) and finite-time observability in probability can be regarded as slight variants of multiple-experiment observability.

Definition 27

[27, 28]    A PBN (65) is called asymptotically observable in distribution if for every two different initial states \(x_0,x_0'\in \varDelta _{2^n}\), it holds that

$$\begin{aligned} \lim _{\theta \rightarrow \infty }{\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\}=1. \end{aligned}$$
(69)

If \(x_0\) and \(x_0'\) have no distinguishing input sequence in (66), then \({\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\}=0\) for every \(\theta \in \mathbb {N}\). Hence, for (65) to be asymptotically observable in distribution, every pair of different states must have at least one distinguishing input sequence in (66), i.e., (66) must satisfy Definition 12. In addition, in the observability graph of (66) there must be no path from any non-diagonal vertex to any diagonal vertex: if \(v=\{x_0,x_0'\}\) is a non-diagonal vertex from which there is a path to some diagonal vertex, then there exists \(\beta <1\) such that for every \(\theta \) greater than the length of that path, \({\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\} \le \beta \), and therefore \(\lim _{\theta \rightarrow \infty }{\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma ,x_0')\} \le \beta <1\).

Conversely, assume that (i) (66) satisfies Definition 12 and (ii) in the observability graph \(\mathcal {G}_o:=(\mathcal {V},\mathcal {E},\mathcal {W})\) of (66), there is no path from any non-diagonal vertex to any diagonal vertex. We endow the edges of \(\mathcal {G}_o\) with probabilities according to the probability distribution \(\varvec{p}=[p_1,\dots ,p_s]\) as in (65): for all \(v,v'\in \mathcal {V}\) with \((v',v)\notin \mathcal {E}\), \(p_{v,v'}=0\); for all \(v,v'\in \mathcal {V}\) with \((v',v)\in \mathcal {E}\), \(p_{v,v'}=\sum _{i\in \llbracket 1,s \rrbracket , \delta _s^i\in \mathcal {W}((v',v))} p_i\). Denote the set of diagonal vertices and the set of non-diagonal vertices of \(\mathcal {G}_o\) by \(\mathcal {V}_d\) and \(\mathcal {V}_{nd}\), respectively. Then by (ii), for all \(v\in \mathcal {V}_{d}\) and \(v'\in \mathcal {V}_{nd}\), \(p_{v,v'}=0\). Denote the adjacency matrix of the non-diagonal subgraph of \(\mathcal {G}_o\) by \(M_{\mathcal {V}_{nd}}=(p_{v,v'})_{v,v'\in \mathcal {V}_{nd}}\). Then, also by (ii), the sum of the v-th column of \((M_{\mathcal {V}_{nd}})^{\theta }\) is equal to \({\text {Prob}}\{ \varvec{y}(\theta ;\sigma ,x_0) = \varvec{y}(\theta ;\sigma ,x_0')\}\), where \(\{x_0,x_0'\}=v\). By (i) (i.e., for every \(\{x_0,x_0'\}\) in \(\mathcal {V}_{nd}\), \(x_0\) and \(x_0'\) have a distinguishing input sequence in \(\mathcal {G}_o\)), there exists \(\bar{\theta }\in \mathbb {Z}_+\) such that in \((M_{\mathcal {V}_{nd}})^{\bar{\theta }}\), the sum of each column is less than 1. Then the spectral radius of \((M_{\mathcal {V}_{nd}})^{\bar{\theta }}\) is less than 1, and hence so is that of \(M_{\mathcal {V}_{nd}}\). Hence \(\lim _{\theta \rightarrow \infty }(M_{\mathcal {V}_{nd}})^{\theta }\) has all entries equal to 0.
That is, for all \(x_0,x_0'\) with \(\{x_0,x_0'\}\in \mathcal {V}_{nd}\), \(\lim _{\theta \rightarrow \infty }{\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) = \varvec{y}(\theta ;\sigma , x_0')\}=0\), i.e., \(\lim _{\theta \rightarrow \infty }{\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma , x_0')\}=1\). For all \(x_0,x_0'\) with \(Hx_0\ne Hx_0'\), \({\text {Prob}}\{\varvec{y}(\theta ;\sigma ,x_0) \ne \varvec{y}(\theta ;\sigma , x_0')\}=1\) for every \(\theta \in \mathbb {N}\). Hence (65) is asymptotically observable in distribution.
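The convergence step of this argument is easy to verify numerically. Below, a hypothetical \(2\times 2\) matrix with invented entries plays the role of \(M_{\mathcal {V}_{nd}}\); each of its columns sums to less than 1 (here \(\bar{\theta }=1\)), so its spectral radius is below 1 and its powers, whose column sums are the indistinguishability probabilities, vanish.

```python
import numpy as np

# Hypothetical adjacency matrix of the non-diagonal subgraph, with
# invented edge probabilities; entry (v, v') is p_{v, v'}.
M = np.array([[0.3, 0.5],
              [0.4, 0.2]])

# Each column sums to 0.7 < 1, so the spectral radius of M is below 1 ...
rho = max(abs(np.linalg.eigvals(M)))

# ... and the column sums of M^theta, i.e. the probabilities
# Prob{y(theta; sigma, x0) = y(theta; sigma, x0')}, tend to 0.
indist = np.linalg.matrix_power(M, 50).sum(axis=0)
```

For this matrix the eigenvalues are 0.7 and \(-0.2\), so the indistinguishability probabilities decay like \(0.7^{\theta }\).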

Based on the above discussion, a PBN (65) is asymptotically observable in distribution if and only if the corresponding BCN (66) satisfies Definition 12 and in the observability graph of (66), there is no path from any non-diagonal vertex to any diagonal vertex, which is exactly [28, Proposition 1]. Hence asymptotic observability in distribution of PBNs can be regarded as a slightly stronger version of multiple-experiment observability of BCNs (Definition 12).

Remark 16

The above three conclusions show that the three definitions of observability in probability on \(\llbracket 0,\theta \rrbracket \) with \(\theta \in \mathbb {N}\), finite-time observability in probability, and asymptotic observability in distribution are rather close to each other, and all can be regarded as slight variants of multiple-experiment observability of BCNs (the first and the third are slightly stronger than multiple-experiment observability, while the second is exactly multiple-experiment observability). In addition, the three definitions do not depend on the probability distribution of the stochastic switching signal \(\sigma \). Formally, given two probability distributions \(\varvec{p}^1,\varvec{p}^2\) for a PBN (65), the PBN with \(\varvec{p}=\varvec{p}^1\) is observable if and only if the PBN with \(\varvec{p}=\varvec{p}^2\) is observable, with respect to any one of the above three definitions.

[28, Lemma 3] is exactly [10, Theorem 6]. Note that this result was restated earlier in [20] and [13, Remark 4.1]. In [28], observability of switched BNs was also mentioned. As one can easily see, observability of switched BCNs can be equivalently transformed to observability of BCNs whenever the inputs and the switching signals of the switched BCNs carry the same quantifier (either \(\exists \) or \(\forall \)): for a given switched BCN, one regards the Cartesian product of the set of inputs and the codomain of the switching signal as the new set of inputs, obtaining an equivalent BCN. The results in [51] showed that Definition 14, extended to switched BCNs whose switching signals carry the \(\exists \) quantifier, is actually Definition 14 of BCNs. As a consequence, controllability of switched BCNs studied in [52] (in which inputs and switching signals both carry the \(\exists \) quantifier) is actually controllability of BCNs.
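The product-input construction mentioned above is mechanical. The sketch below builds it for a hypothetical two-state switched machine with two modes (all maps invented): the new input alphabet is the Cartesian product of the original input set and the set of switching modes.

```python
from itertools import product

# Hypothetical switched machine: two modes 'f' and 'g', each a
# (state, input) -> state map; all data invented for illustration.
modes = {
    'f': {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1},
    'g': {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0},
}
U = [0, 1]

# Equivalent ordinary machine: inputs are pairs (u, mode), and the
# transition simply dispatches to the selected mode's map.
new_inputs = list(product(U, modes))
def new_step(x, uv):
    u, mode = uv
    return modes[mode][(x, u)]
```

Controllability or observability questions in which both the input and the switching signal carry the \(\exists \) quantifier then reduce to the same questions for this ordinary machine over the enlarged input set.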

5 Concluding remarks

This paper provided a thorough survey of the results related to observability of general Boolean control networks (BCNs). The contents of the paper were guided by two methods—Edward F. Moore’s partition and our observability graph. First, Moore’s partition closely relates the following problems: computation of the smallest invariant dual subspaces of Boolean networks (BNs) containing a given set of Boolean functions, multiple-experiment observability verification/decomposition in BCNs, and disturbance decoupling in BCNs. However, this method does not apply to other types of observability or to nondeterministic systems. Second, based on our observability graph, four different types of observability were verified in BCNs, and the verification results were also extended to probabilistic BNs, singular BCNs, and nondeterministic finite-transition systems (NFTSs).

All the methods introduced in the paper are state-based, i.e., they show similar efficiency on BCNs whose dependency graphs have different degrees of sparsity. Hence, such methods generally cannot be used to verify observability of large-scale BCNs (i.e., those with more than approximately 30 nodes) in a reasonable amount of time. It is known that verifying observability of BNs is \(\textsf{NP}\)-hard [53]; hence it is also \(\textsf{NP}\)-hard to verify each of the different types of observability of BCNs, because the different types of observability coincide with each other when the inputs are constant (see [53] and [13, Section 4.4]). Hence, it is very difficult to efficiently verify observability for large-scale BCNs. One way of overcoming the high complexity of verifying a large-scale BCN is to first partition the BCN into small-scale subnetworks (with no more than 20 nodes), and then verify these small subnetworks. Two classes of such partitions (called node aggregations) were given in [24]. In the first class, the subnetworks are BCNs and the aggregated networks are acyclic; in the second class, the subnetworks are NFTSs and the aggregated networks are not necessarily acyclic. The partition method was successfully applied to analyze arbitrary-experiment observability of a Boolean T-cell receptor kinetics model built in [54] with 37 state nodes and 3 input nodes. The result on the T-cell model obtained in [24] is the unique minimal set of 16 state nodes that need to be directly measured to make the overall model observable, as well as the 5 of these 16 state nodes that need to be directly measured to make the model arbitrary-experiment reconstructible.

Efficient verification can be done for several special types of BCNs; e.g., a polynomial-time observability verification algorithm for conjunctive BNs (which only contain the logical operator \(\wedge \)) was given in [55].

All the results collected in the paper are state-based; as mentioned above, they show similar efficiency for BCNs of different degrees of sparsity. In [48], function-based necessary and sufficient conditions for controllability and multiple-experiment observability of BCNs were given. The method is based on an algebraic form of a BCN, i.e., regarding each node as a variable in the finite field \(\mathbb {F}_2\) and regarding Boolean functions as polynomial functions on the vector space \(\mathbb {F}_2^n\). Controllability and observability were then verified by computing two affine varieties and checking their equality by comparing their Gröbner bases (which are computable). The verification algorithms depend on the lengths of the Boolean functions of a BCN. Hence, when a BCN is sparse, the lengths of its Boolean functions are relatively small, and the algorithms are efficient. Based on these algorithms, in [48], controllability and multiple-experiment observability were verified directly over the above-mentioned Boolean T-cell model built in [54]. Different from the results obtained in [24], the unique minimal set of state nodes that need to be directly measured to make the overall model multiple-experiment observable consists of 14 nodes.
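The ideal-equality check underlying this approach can be illustrated with a generic example (not the actual varieties from [48]): over \(\mathbb {F}_2\), two generating sets define the same ideal exactly when their reduced Gröbner bases coincide, since the reduced basis is canonical for a fixed monomial order. A sketch using sympy, with invented generating sets:

```python
from sympy import symbols, groebner

x, y = symbols('x y')

# Two invented generating sets over F_2 (modulus=2); the extra
# generators in the second set lie in the ideal of the first.
G1 = groebner([x + y, y**2 + y], x, y, modulus=2, order='lex')
G2 = groebner([x + y, x**2 + x, x*y + y], x, y, modulus=2, order='lex')

# Reduced Groebner bases are canonical, so ideal (hence variety)
# equality becomes a syntactic comparison of the bases.
same_ideal = list(G1.exprs) == list(G2.exprs)
```

Here both generating sets reduce to the same canonical basis, so the corresponding affine varieties over \(\mathbb {F}_2\) are equal.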

Recently, the study of the observability synthesis problem in BCNs based on state feedback was initiated in [56], in which only arbitrary-experiment observability was studied. The results were obtained by combining the notion of observability graph and the semitensor product of matrices. The synthesis algorithm designed therein is preliminary; much work could be done to improve its efficiency and performance.