Many advanced mathematical models of biochemical, biophysical and other processes in systems biology can be described by parametrized systems of nonlinear differential equations. Due to the complexity of these models, the problem of their simplification has become very important. In particular, the rather challenging problem of estimating the parameters of such models may require simplifications of this kind. The paper offers a practical way of constructing approximations of nonlinearly parametrized functions by linearly parametrized ones. As the idea of such approximations goes back to Principal Component Analysis, we call the corresponding transformation the Principal Component Transform. We show that this transform possesses the best individual fit property, in the sense that the resulting approximations preserve most of the information (in a sense made precise below) about the original function. It is also demonstrated how one can estimate the error between the given function and its approximations. In addition, we apply the theory of tensor products of compact operators in Hilbert spaces to justify our method for products of parametrized functions. Finally, we provide several examples, which are of relevance for systems biology.
This study is closely related to applications in the so-called “metamodeling” of differential equations, where a “proper” model of, e.g., a complex biological process is replaced by an approximation which contains “most of the information” about the model, but which is simpler. In particular, the true parameters of the model are replaced by “latent parameters”, which makes the model linear with respect to the latter and hence enables the use of (if necessary, partial) least-squares regression. This explains why this idea proved to be efficient in parameter estimation (see e.g. [
Let x = x ( u , ω ) be a function, where u ∈ U ⊂ ℝ N and ω ∈ Ω ⊂ ℝ M , Ω being a space of parameters, and let k ∈ ℕ be a given number. The kth Principal Component Transform (PCT) is a specially constructed parametrized function PCT ( x , k ) ≡ x k of the form x k = ∑ i = 1 k p i ( u ) t i ( ω ) . The image x k is constructed to yield the minimum distance (in a suitable sense) between x and all possible approximations of x of the form ∑ i = 1 k z i ( u ) y i ( ω ) . The distance is chosen to ensure an efficient way to estimate the deviation of x k from x .
Geometrically, the parametrized function x may be regarded as a curve ω ↦ x ( ⋅ , ω ) in a separable Hilbert space. Then x k = PCT ( x , k ) can be interpreted as a projection of this curve onto a k -dimensional subspace, which is chosen in such a way that the image x k gives the best possible individual fit to x among all k -dimensional subspaces. As we will see in Subsection 3.1, this necessarily leads to nonlinearity of the mapping PCT.
As we will see in Subsection 3.3, discretizing the function x ( u , ω ) and its PCT yields matrices and the projections onto their first k principal components, respectively. This explains our terminology: PCT can be regarded as a functional analog of the principal component analysis (PCA) of matrices. This terminology was suggested by Prof. E. Voit in a private talk with the second author during his seminar lecture in Oslo in 2014.
All the papers cited above concentrate on the efficiency of the metamodeling approach and disregard the mathematical properties of PCT and their justification, which are, for instance, quite important for understanding the limitations of the method and describing the exact conditions under which it is applicable. In particular, the convergence of the sequence of metamodels to the original model has not been studied in the available literature. In this paper we try to fill this gap by suggesting a rigorous mathematical approach to PCT and an analysis of its basic properties. More precisely, we demonstrate how the theory of compact operators in separable Hilbert spaces can be used to provide such an analysis.
The paper is organized as follows. In Section 2 we introduce the distance in the space of parametrized functions, formulate the theorem on the best individual fit in terms of PCT of functions (Subsection 2.1) and provide some examples relevant for systems biology (Subsection 2.2). In Section 3 we study mathematical properties of PCT: nonlinearity (Subsection 3.1), continuity (Subsection 3.2) and the relation of PCT to PCA via discretization of functions (Subsections 3.3 and 3.4). In Section 4 we study PCT of products of parametrized functions, which are interpreted as elements of the tensor product of two or several Hilbert spaces (Subsection 4.1). We also show that PCT preserves tensor products and therefore products of parametrized functions (Subsection 4.2) and give some examples (Subsection 4.3). In Appendix 5 we offer short proofs of some auxiliary results used in the paper: Allahverdiev’s theorem (Subsection 5.1) and some propositions related to tensor products of linear compact operators in Hilbert spaces (Subsection 5.2).
In this section we define the distance in the space of parametrized functions and describe how best individual fits PCT ( x , k ) ( k ∈ ℕ ) to a given function x can be obtained using the theory of compact operators in Hilbert spaces. We also prove nonlinearity and continuity of PCT and give some specific examples.
Let U be a compact subset of ℝ N and Ω be a compact subset of ℝ M . We consider the separable Hilbert spaces L 2 ( U ) and L 2 ( Ω ) with the standard scalar products ( ⋅ , ⋅ ) and the norms ‖ ⋅ ‖ .
Suppose we are given a measurable, square integrable function x : U × Ω → ℝ , i.e.
∫ U ∫ Ω | x ( u , ω ) | 2 d u d ω < ∞ (1)
The aim is to find a best possible approximation of x in the class L k of all functions of the form x k ( u , ω ) = ∑ i = 1 k z i ( u ) y i ( ω ) , where z i ∈ L 2 ( U ) and y i ∈ L 2 ( Ω ) .
To better explain the nature of the topology we use, let us first have a look at finite-dimensional Hilbert, i.e. Euclidean, spaces. Let X = [ x i j ] be an m × n -matrix, for instance, a discretized function x ( u , ω ) with x i j = x ( u i , ω j ) . In this case, the best approximation X k to X in the class of m × n -matrices of rank not greater than k is given by the first k terms of the singular value decomposition of X :
X k = ∑ i = 1 k t i p i * , (2)
where t i = X p i and p i are the normalized eigenvectors of the matrix X * X and A * is the conjugate (transpose) of a matrix A . In other words,
min ‖ X − Y ‖ = ‖ X − X k ‖ , where rank Y ≤ k (3)
The matrix norm is defined as ‖ Z ‖ = sup ‖ α ‖ ≤ 1 ‖ Z α ‖ , where ‖ α ‖ is the Euclidean norm in ℝ n .
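The finite-dimensional statement (2)-(3) is the Eckart-Young-Mirsky theorem in the spectral norm, and it is easy to verify numerically. A minimal sketch in Python/NumPy, with an arbitrary random matrix standing in for a discretized x ( u i , ω j ):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5))      # stand-in for a discretized x(u_i, omega_j)

# Singular value decomposition X = U S P^*
U, s, Pt = np.linalg.svd(X, full_matrices=False)

k = 2
# First k terms of the SVD: X_k = sum_{i<=k} t_i p_i^*, with t_i = X p_i
Xk = U[:, :k] * s[:k] @ Pt[:k, :]

# Spectral-norm error of the best rank-k approximation equals sigma_{k+1}
err = np.linalg.norm(X - Xk, ord=2)
```

Up to floating-point rounding, `err` coincides with `s[k]`, i.e. with σ k + 1 in the notation of (3).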
Now we will look at arbitrary real separable Hilbert spaces which are denoted by H and K and which are equipped with the scalar products ( ⋅ , ⋅ ) H and ( ⋅ , ⋅ ) K and the corresponding norms ‖ ⋅ ‖ H and ‖ ⋅ ‖ K , respectively. Assume that
X : H → K is a linear compact operator. Its norm is again defined as ‖ X ‖ = sup ‖ α ‖ H ≤ 1 ‖ X α ‖ K .
Put
L k ( H , K ) = { Y : Y is a linear bounded operator from H to K such that dim ( Im Y ) ≤ k } (4)
We want to find an operator X k ∈ L k ( H , K ) for which ‖ X − X k ‖ → min . The construction of X k is very close to the singular value decomposition of matrices.
Let X * : K → H be the adjoint of X . Then the linear compact operators X * X : H → H , X X * : K → K are self-adjoint and positive-definite.
Let σ 1 2 ≥ σ 2 2 ≥ ⋯ ≥ σ i 2 ≥ ⋯ → 0 , σ i > 0 , ( i = 1 , 2 , ⋯ ) be all positive eigenvalues of the operator X * X , the associated normalized eigenvectors being p 1 , p 2 , p 3 , ⋯ ∈ H , respectively:
X * X p i = σ i 2 p i , ‖ p i ‖ H = 1 , i ∈ ℕ (5)
It is well-known that p i can always be chosen to be orthogonal: p i ⊥ p j , i ≠ j and for any α ∈ H there is a unique set c i ∈ ℝ , i ∈ ℕ and a unique p 0 ∈ Null ( X * X ) for which α = p 0 + ∑ i = 1 ∞ c i p i and, moreover, ‖ α ‖ H 2 = ‖ p 0 ‖ H 2 + ∑ i = 1 ∞ c i 2 . Now, the operator X can be represented as
X α = ∑ i = 1 ∞ ( α , p i ) H t i , (6)
where t i = X p i and the convergence is understood in the sense of the norm in the space K . The truncated version X k ∈ L k ( H , K ) of this representation is defined by
X k α = ∑ i = 1 k ( α , p i ) H t i (7)
The following result, a short proof of which is offered in Appendix 5.1, is known as Allahverdiev’s theorem, see e.g. [8, Chapter II, p. 28]:
Theorem 1. For any linear compact operator X : H → K
min Y ∈ L k ( H , K ) ‖ X − Y ‖ = ‖ X − X k ‖ = σ k + 1 (8)
In numerical calculations, functions are usually replaced by their discretizations, which in the case of parametrized functions gives matrices. That is why the distance in the space of parametrized functions x ( u , ω ) should be consistent with the distance in the space of matrices, so that we can retain all the advantages of the finite-dimensional singular value decomposition as well as Allahverdiev’s theorem. To define the distance in the space of matrices we have to interpret matrices as linear operators between two Euclidean spaces. Analogously, we have to interpret parametrized functions as operators between suitable Hilbert spaces, and define the distance accordingly.
Let us therefore go back to the spaces L 2 ( U ) , L 2 ( Ω ) , where U , as before, is a compact subset of ℝ N and Ω is a compact subset of ℝ M . We denote the norm in both spaces as ‖ ⋅ ‖ L 2 . Consider the integral operator
( X α ) ( ω ) = ∫ U x ( u , ω ) α ( u ) d u (9)
Under the assumption of square integrability of the kernel x ( u , ω ) , the operator X is compact and linear from the space L 2 ( U ) to the space L 2 ( Ω ) (see e.g. [
The distance between two square integrable parametrized functions x and x ′ can be now defined in the following way:
dist ( x , x ′ ) = ‖ X − X ′ ‖ , (10)
where X is defined in (9) and ( X ′ α ) ( ω ) = ∫ U x ′ ( u , ω ) α ( u ) d u . The norm of the linear operators acting from L 2 ( U ) to L 2 ( Ω ) is defined in the standard way.
Remark 1. Evidently,
‖ X ‖ ≤ ( ∫ U ∫ Ω | x ( u , ω ) | 2 d u d ω ) 1 2 (11)
(the right-hand side being the Hilbert-Schmidt norm of X ). Therefore, L 2 -convergence of the sequence { x ( n ) } implies convergence in the sense of the distance dist.
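In the discretized setting, the estimate (11) becomes the familiar inequality between the spectral norm and the Frobenius (Hilbert-Schmidt) norm of a matrix. A quick illustrative check with a random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 7))          # illustrative discretized kernel

op_norm = np.linalg.norm(X, ord=2)       # analog of ||X|| (operator norm)
hs_norm = np.linalg.norm(X, ord='fro')   # analog of (integral of |x|^2)^{1/2}
```

The operator norm never exceeds the Hilbert-Schmidt norm, which is exactly what makes L 2 -closeness of kernels imply closeness in dist.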
Let X * : L 2 ( Ω ) → L 2 ( U ) be the adjoint of X , so that
( X * β ) ( u ) = ∫ Ω x ( u , ω ) β ( ω ) d ω (12)
Now, the self-adjoint and positive-definite integral operators
X * X : L 2 ( U ) → L 2 ( U ) and X X * : L 2 ( Ω ) → L 2 ( Ω ) (13)
can be written as follows:
( X * X α ) ( u ) = ∫ U γ ( u , v ) α ( v ) d v , where γ ( u , v ) = ∫ Ω x ( u , ω ) x ( v , ω ) d ω (14)
and
( X X * β ) ( ω ) = ∫ Ω δ ( ω , ξ ) β ( ξ ) d ξ , where δ ( ω , ξ ) = ∫ U x ( u , ω ) x ( u , ξ ) d u , (15)
respectively. Let, as before,
σ 1 2 ≥ σ 2 2 ≥ ⋯ ≥ σ i 2 ≥ ⋯ → 0 ( i = 1 , 2 , ⋯ ) (16)
be all positive eigenvalues of the integral operator (14) associated with its normalized and mutually orthogonal eigenfunctions p i ∈ L 2 ( U ) , i.e.
( Γ p i ) ( u ) = ∫ U γ ( u , v ) p i ( v ) d v = σ i 2 p i ( u ) , ∫ U p i ( u ) p j ( u ) d u = { 0 ( i ≠ j ) 1 ( i = j ) (17)
From Theorem 1 we immediately obtain the Best Individual Fit Theorem.
Theorem 2. For a given function x : U × Ω → ℝ satisfying (1) the best approximation of x in the class L k of all functions of the form ∑ i = 1 k z i ( u ) y i ( ω ) , where z i ∈ L 2 ( U ) and y i ∈ L 2 ( Ω ) , is given by
x k ( u , ω ) = ∑ i = 1 k p i ( u ) t i ( ω ) , (18)
where p i are the normalized, mutually orthogonal eigenfunctions of the operator (14) and t i ( ω ) = ( X p i ) ( ω ) = ∫ U x ( u , ω ) p i ( u ) d u . Moreover, dist ( x , x k ) = σ k + 1 for all natural k .
In other words,
dist ( x , y ) ≥ dist ( x , x k ) = σ k + 1 forall y ∈ L k (19)
Remark 2. The functions t i have the following properties (which we do not use in this paper):
• t i ⊥ t j for all i ≠ j ;
• ‖ t i ‖ = σ i for all i ;
• X X * t i = σ i 2 t i for all i .
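These properties are easy to confirm on a discretized operator, i.e. on a matrix; a short sketch with a random matrix as a stand-in for X :

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 5))          # illustrative discretized operator

# p_i: normalized eigenvectors of X^T X, eigenvalues sigma_i^2 in descending order
w, P = np.linalg.eigh(X.T @ X)           # eigh returns ascending eigenvalues
order = np.argsort(w)[::-1]
w, P = w[order], P[:, order]
sigma = np.sqrt(w)

T = X @ P                                # columns are t_i = X p_i
G = T.T @ T                              # Gram matrix of the t_i
```

The Gram matrix G is diagonal (the t i are mutually orthogonal) with diagonal entries σ i 2 (so ‖ t i ‖ = σ i ), and each t i is an eigenvector of X X * with eigenvalue σ i 2 .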
Definition 1.
• The kth Principal Component Transform (PCT) of the function x ∈ L 2 ( U × Ω ) is defined as
PCT ( x , k ) ( u , ω ) = x k ( u , ω ) = ∑ i = 1 k p i ( u ) t i ( ω ) (20)
• The Full Principal Component Transform of the function x ∈ L 2 ( U × Ω ) is given by
PCT ( x , ∞ ) ( u , ω ) = ∑ i = 1 ∞ p i ( u ) t i ( ω ) (21)
We will also write PCT ( x , ∞ ) ≡ PCT ( x ) .
We remark that none of these transforms is uniquely defined: even if the σ i are all different, we always have a choice between the two normalized eigenfunctions ± p i . However, the distance between x and any version of x k is independent of the choice. On the other hand, this means that the properties of PCT should be formulated with care.
In this subsection we consider three examples which are of importance in systems biology.
Example 1. Let
x ( u , ω ) = u ω (22)
Assume that u ∈ [ a , b ] , a , b ∈ ℝ , a > 0 , ω ∈ [ 0 , 1 ] . Then, using Formulas (14) and (15), we obtain the following representations of the kernels γ and δ
γ ( u , v ) = ∫ 0 1 u ω v ω d ω = ∫ 0 1 ( u v ) ω d ω = u v − 1 ln ( u v ) , (23)
δ ( ω , ξ ) = ∫ a b u ω u ξ d u = ∫ a b u ω + ξ d u = b ω + ξ + 1 − a ω + ξ + 1 ω + ξ + 1 (24)
Therefore the normalized eigenfunctions p i ( u ) can be obtained from the equation
∫ a b ( u v − 1 ln ( u v ) ) p i ( v ) d v = σ i 2 p i ( u ) (25)
The functions t i ( ω ) = ∫ a b u ω p i ( u ) d u can alternatively be found from the equations
∫ 0 1 ( b ω + ξ + 1 − a ω + ξ + 1 ω + ξ + 1 ) t i ( ξ ) d ξ = σ i 2 t i ( ω ) (26)
The parametrized power function u ω is of crucial importance in biochemical systems theory, where u represents the concentration of a metabolite, while ω stands for the kinetic order. In the case of several metabolites, one gets products of such power functions, which, in turn, are included into the right-hand side of the so-called “synergetic system”, see e.g. [
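The kernel representation (23) can also be checked numerically: on a quadrature grid, the eigenvalues of the discretized operator (14) must agree with the squared singular values of the discretized function u ω . A sketch with illustrative grid parameters (the interval [ 1.1 , 2 ] is our choice, taken so that u v > 1 and the expression ( u v − 1 ) / ln ( u v ) is nonsingular):

```python
import numpy as np

# Midpoint grids on U = [1.1, 2] and Omega = [0, 1] (illustrative values)
a, b, n, m = 1.1, 2.0, 300, 300
u  = a + (b - a) * (np.arange(n) + 0.5) / n; du  = (b - a) / n
om = (np.arange(m) + 0.5) / m;               dom = 1.0 / m

# Discretized x(u, omega) = u^omega, scaled so that its singular values
# approximate the sigma_i of the integral operator (9)
A = np.sqrt(du * dom) * u[:, None] ** om[None, :]
s = np.linalg.svd(A, compute_uv=False)

# Analytic kernel gamma(u, v) = (uv - 1)/ln(uv), cf. (23)
uv = np.outer(u, u)
gamma = (uv - 1.0) / np.log(uv)
lam = np.linalg.eigvalsh(du * gamma)[::-1]   # eigenvalues, descending
```

The Riemann sum A A^T reproduces the analytic kernel, and the leading eigenvalue of the discretized operator (14) matches σ 1 2 , as Theorem 2 predicts.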
Example 2. Consider the function
x ( u , ω ) = e − ω | u | (27)
Assume that u ∈ [ − c , c ] , c ∈ ℝ , c > 0 , ω ∈ [ a , b ] , a , b ∈ ℝ , a > 0. Then, using Formulas (14) and (15), we obtain the following representations of the kernels γ and δ
γ ( u , v ) = ∫ a b e − ω | u | e − ω | v | d ω = ∫ a b e − ω ( | u | + | v | ) d ω = 1 | u | + | v | ( e − a ( | u | + | v | ) − e − b ( | u | + | v | ) ) , (28)
δ ( ω , ξ ) = ∫ − c c e − ω | u | e − ξ | u | d u = ∫ − c c e − | u | ( ω + ξ ) d u (29)
We denote for simplicity
F ( s , ω , ξ ) = ∫ 0 s e − | u | ( ω + ξ ) d u = { 1 ω + ξ ( e s ( ω + ξ ) − 1 ) for s < 0 1 ω + ξ ( 1 − e − s ( ω + ξ ) ) for s ≥ 0 (30)
and get
δ ( ω , ξ ) = F ( c , ω , ξ ) − F ( − c , ω , ξ ) = 2 ω + ξ ( 1 − e − c ( ω + ξ ) ) (31)
Therefore the normalized eigenfunctions p i ( u ) can be obtained from the equation
∫ − c c ( 1 | u | + | v | ) ( e − a ( | u | + | v | ) − e − b ( | u | + | v | ) ) p i ( v ) d v = σ i 2 p i ( u ) (32)
The functions t i ( ω ) = ∫ − c c e − ω | u | p i ( u ) d u can also be obtained from the equations
∫ a b ( F ( c , ω , ξ ) − F ( − c , ω , ξ ) ) t i ( ξ ) d ξ = σ i 2 t i ( ω ) (33)
The function e − ω | u | is often used in neural field models, where it serves as the simplest example of the so-called “connectivity functions” describing the interactions between neurons, see e.g. [
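The closed form δ ( ω , ξ ) = ( 2 / ( ω + ξ ) ) ( 1 − e − c ( ω + ξ ) ) obtained by direct integration of (29) is easy to confirm against a quadrature approximation (the parameter values below are illustrative):

```python
import numpy as np

# delta(omega, xi) = integral over [-c, c] of e^{-|u|(omega + xi)} du
c, omega, xi = 1.3, 0.7, 1.1

# Midpoint-rule approximation of the integral
n = 200_000
u = -c + 2 * c * (np.arange(n) + 0.5) / n
du = 2 * c / n
numeric = np.sum(np.exp(-np.abs(u) * (omega + xi))) * du

# Closed form from direct integration
closed = 2.0 / (omega + xi) * (1.0 - np.exp(-c * (omega + xi)))
```

The two values agree to high accuracy, which validates the kernel used in the eigenvalue problem (33).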
Example 3. Consider the Hill function
x ( u , ω ) = u q u q + θ q (34)
Assume that u ∈ [ a , b ] , a , b ∈ ℝ , a > 0 , q ∈ [ q 0 , q m ] , q 0 , q m ∈ ℝ , q 0 > 0 , θ ∈ [ θ 0 , θ m ] , θ 0 , θ m ∈ ℝ , θ 0 > 0. Putting ω = ( q , θ ) and ξ = ( q ′ , θ ′ ) we obtain
γ ( u , v ) = ∫ q 0 q m ∫ θ 0 θ m u q u q + θ q v q v q + θ q d q d θ (35)
and
δ ( ω , ξ ) = ∫ a b u q u q + θ q u q ′ u q ′ + θ ′ q ′ d u (36)
The Hill function plays a central role in the theory of gene regulatory networks, where it serves as the gene activation function, u being the gene concentration and θ being the activation threshold, see e.g. [
The Principal Component Transform PCT ( x , k ) is not uniquely defined. That is why we will use a special notation when comparing PCTs of different functions, namely, we will write PCT ( x , k ) = ˙ PCT ( y , k ) if there exist coinciding versions of the PCTs of x and y .
Theorem 3.
1. PCT ( c x , k ) = ˙ c PCT ( x , k ) for any c ∈ ℝ and k ∈ ℕ .
2. In general, PCT ( x ( 1 ) + x ( 2 ) , k ) is different from PCT ( x ( 1 ) , k ) + PCT ( x ( 2 ) , k ) .
Proof.
1. The case c = 0 is trivial. We assume therefore that c ≠ 0 . Let ( X α ) ( ω ) = ∫ U x ( u , ω ) α ( u ) d u and PCT ( x ) ( u , ω ) = ∑ i = 1 ∞ p i ( u ) t i ( ω ) , see (21). By definition, p i are normalized, mutually orthogonal eigenfunctions of the operator X * X and t i = X p i . Let X c α ≡ X ( c α ) . Then
X c * X c p i = c 2 X * X p i = c 2 σ i 2 p i , (37)
so that p i are the same for X c and X . On the other hand, X c ( p i ) = X ( c p i ) = c X ( p i ) = c t i and
PCT ( c x , k ) ( u , ω ) = ∑ i = 1 k p i ( u ) c t i ( ω ) = ˙ c PCT ( x , k ) ( u , ω ) (38)
2. Before constructing an example illustrating the nonlinearity of PCT, we remark that this statement, in its more precise formulation, says that there are no versions of PCT ( x ( 1 ) + x ( 2 ) , k ) , PCT ( x ( 1 ) , k ) , PCT ( x ( 2 ) , k ) for which PCT ( x ( 1 ) + x ( 2 ) , k ) = PCT ( x ( 1 ) , k ) + PCT ( x ( 2 ) , k ) .
Let U = Ω = [ 0 , 1 ] and the functions r τ : [ 0 , 1 ] → ℝ ( τ = 1 , 2 ) satisfy
∫ 0 1 r τ 2 ( u ) d u = 1 and ∫ 0 1 r 1 ( u ) r 2 ( u ) d u = 0 (39)
We put
( X ( 1 ) α ) ( ω ) = 2 r 1 ( ω ) ∫ 0 1 r 1 ( u ) α ( u ) d u + r 2 ( ω ) ∫ 0 1 r 2 ( u ) α ( u ) d u , ( X ( 2 ) α ) ( ω ) = r 1 ( ω ) ∫ 0 1 ( 2 r 1 ( u ) + r 2 ( u ) ) α ( u ) d u + r 2 ( ω ) ∫ 0 1 ( r 1 ( u ) + r 2 ( u ) ) α ( u ) d u . (40)
To calculate PCT we observe that both operators have a 2-dimensional image in L 2 ( Ω ) . Using the representation α ( u ) = c 1 r 1 ( u ) + c 2 r 2 ( u ) + α ^ ( u ) where α ^ ⊥ r τ ( τ = 1 , 2 ) we reduce the operators X ( 1 ) and X ( 2 ) to the matrices
A = [ 2 0 0 1 ] and B = [ 2 1 1 2 ] , respectively ,
so that
X ( 1 ) α = ( r 1 r 2 ) A ( c 1 c 2 ) * and X ( 2 ) α = ( r 1 r 2 ) B ( c 1 c 2 ) * , (41)
where ( a , b ) and ( a , b ) * are row and column vectors, respectively.
Matrices A and B are symmetric. Then A * A = A 2 and B * B = B 2 . The first eigenpairs of A 2 and B 2 are 4, ( 1 0 ) * and 9, ( 1 1 ) * , respectively. Therefore the best rank 1 approximations of A and B are
A 1 = [ 2 0 0 0 ] and B 1 = [ 1.5 1.5 1.5 1.5 ] , respectively ,
so that PCT ( X ( 1 ) , 1 ) ( u , ω ) = 2 r 1 ( u ) r 1 ( ω ) and PCT ( X ( 2 ) , 1 ) ( u , ω ) = 1.5 ( r 1 ( u ) + r 2 ( u ) ) ( r 1 ( ω ) + r 2 ( ω ) ) , both of which correspond to operators with a 1-dimensional image. However, their sum
3.5 r 1 ( u ) r 1 ( ω ) + 1.5 r 1 ( u ) r 2 ( ω ) + 1.5 r 2 ( u ) r 1 ( ω ) + 1.5 r 2 ( u ) r 2 ( ω ) (42)
has a 2-dimensional image, as its representation in the basis { r 1 , r 2 } is given by the non-singular matrix [ 3.5 1.5 1.5 1.5 ] . Therefore PCT ( X ( 1 ) , 1 ) + PCT ( X ( 2 ) , 1 ) cannot coincide with any version of PCT ( X ( 1 ) + X ( 2 ) , 1 ) .
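The matrix computations in this counterexample can be replayed directly; the sketch below recovers A 1 and B 1 from the SVD and confirms that their sum has rank 2, so it cannot be a best rank-1 approximation of A + B :

```python
import numpy as np

def best_rank1(M):
    """First term of the SVD of M (its best rank-1 approximation)."""
    U, s, Vt = np.linalg.svd(M)
    return s[0] * np.outer(U[:, 0], Vt[0, :])

A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[2.0, 1.0], [1.0, 2.0]])

A1 = best_rank1(A)        # gives [[2, 0], [0, 0]]
B1 = best_rank1(B)        # gives [[1.5, 1.5], [1.5, 1.5]]
S1 = best_rank1(A + B)    # best rank-1 approximation of the sum
```

The sum A1 + B1 = [[3.5, 1.5], [1.5, 1.5]] is non-singular (rank 2) and therefore differs from S1, exactly as the argument above shows.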
Let us consider a sequence of parametrized, square integrable functions x ( n ) : U × Ω → ℝ .
Theorem 4. Let k ∈ ℕ and dist ( x ( n ) , x ) → 0 ( n → ∞ ) for some parametrized, square integrable functions x ( n ) , x : U × Ω → ℝ . Then for any version x k = PCT ( x , k ) there are versions x k ( n ) = PCT ( x ( n ) , k ) such that
dist ( x k ( n ) , x k ) → 0, n → ∞ (43)
Proof. Let H = L 2 ( U ) , K = L 2 ( Ω ) . We define the compact linear integral operators X ( n ) , X : H → K using the kernels x ( n ) , respectively. By the definition of the dist we immediately get that ‖ X ( n ) − X ‖ → 0 , n → ∞ .
Let p i , i = 1 , ⋯ , k be the normalized, mutually orthogonal eigenfunctions of the operator X * X corresponding to its first k eigenvalues σ 1 2 ≥ σ 2 2 ≥ ⋯ ≥ σ k 2 . Since X ( n ) converges to the operator X in norm, we can always choose a sequence of the eigenfunctions p i ( n ) such that
‖ p i ( n ) − p i ‖ H → 0 , n → ∞ , i = 1 , ⋯ , k (44)
In this case
t i ( n ) = X ( n ) p i ( n ) → t i = X p i , n → ∞ , i = 1 , ⋯ , k (45)
Therefore ‖ X k ( n ) − X k ‖ → 0, n → ∞ , which implies
dist ( x k ( n ) , x k ) → 0, n → ∞ (46)
The above theorem can be reformulated in terms of robustness of PCT.
Corollary 1. Let k ∈ ℕ and x : U × Ω → ℝ be a parametrized, square integrable function. Then given an ε > 0 there is a δ > 0 such that for every parametrized, square integrable function x ′ : U × Ω → ℝ the following holds true:
dist ( x ′ , x ) < δ ⇒ dist ( PCT ( x ′ , k ) , PCT ( x , k ) ) < ε (47)
for some suitable versions of PCT.
In the papers [
In this subsection we suppose that all functions are continuous, which is sufficient for most applications. The general case is, however, unproblematic as well if we slightly adjust the approximation procedure.
Let x be a continuous function on a compact set D ⊂ ℝ N + M , D = U × Ω , where s = ( u , ω ) .
For all n ∈ ℕ , D is divided into n measurable subsets D i ( n ) :
D = ∪ i = 1 n D i ( n ) (48)
We define the sequence of functions x ( n ) ( s ) as follows:
x ( n ) ( s ) = x ( s i ( n ) ) , s ∈ D i ( n ) , (49)
where s i ( n ) is an arbitrary point in D i ( n ) .
Lemma 1. Let x be a continuous function on D . Then
dist ( x ( n ) , x ) → 0, n → ∞ (50)
provided that max 1 ≤ i ≤ n diam D i ( n ) → 0 as n → ∞ .
Proof. The function x is continuous on the compact set D , therefore x ( s ) is uniformly continuous on D . Then for all ε > 0 there is δ > 0 such that
| s − s ′ | < δ ⇒ | x ( s ) − x ( s ′ ) | < ε (51)
On the other hand, there is a number N for which max 1 ≤ i ≤ n diam D i ( n ) < δ as long as n > N . Let s be an arbitrary point of D . Then for any n there is D i ( n ) such that s ∈ D i ( n ) . Taking now an arbitrary n > N we obtain
| x ( n ) ( s ) − x ( s ) | = | x ( s i ( n ) ) − x ( s ) | < ε , (52)
so that dist ( x ( n ) , x ) ≤ C ε , where C 2 is the Lebesgue measure of the set D .
Hence dist ( x ( n ) , x ) → 0, n → ∞ .
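The convergence asserted by Lemma 1 can be illustrated by a piecewise-constant discretization of a concrete continuous function (the choice x ( u , ω ) = sin ( u + ω ) on [ 0 , 1 ] 2 is ours, for illustration only):

```python
import numpy as np

# Piecewise-constant approximation of x(u, omega) = sin(u + omega) on [0,1]^2,
# sampled at cell centers, with the L2 error estimated on a fine grid.
def l2_error(n, n_fine=1024):
    t = (np.arange(n_fine) + 0.5) / n_fine            # fine evaluation grid
    x = np.sin(t[:, None] + t[None, :])               # exact values
    idx = (t * n).astype(int)                         # cell containing each point
    centers = (idx + 0.5) / n
    xn = np.sin(centers[:, None] + centers[None, :])  # x^{(n)} on each cell
    return np.sqrt(np.mean((x - xn) ** 2))            # L2(U x Omega) error

errs = [l2_error(n) for n in (4, 8, 16, 32)]
```

As the diameters of the cells shrink, the L 2 error (and hence, by Remark 1, the distance dist) decreases toward zero.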
Corollary 2. Let k ∈ ℕ and x : U × Ω → ℝ be a parametrized, continuous function, and let { x ( n ) } be a sequence of discrete approximations satisfying the assumptions of Lemma 1. Then for any version x k = PCT ( x , k ) there are versions x k ( n ) = PCT ( x ( n ) , k ) such that dist ( x k ( n ) , x k ) → 0, n → ∞ .
Finally, we observe that if D i ( n ) are defined as U j ( n ) × Ω l ( n ) , where for any n { U j ( n ) } and { Ω l ( n ) } are measurable partitions of U and Ω , respectively, and
i = ( j , l ) , then the PCT of the discrete functions x ( n ) coincides with the k -truncated SVD of the matrix [ x ( n ) ( s ( j , l ) ) ] . In the next subsection we provide an example of such an approximation stemming from biochemical systems theory.
In this subsection we study the parametrized power function x ( u , ω ) = u ω defined on the interval [ u 1 , u n ] , u 1 , u n ∈ ℝ , u 1 > 0 with the parameter values ω ∈ [ ω 1 , ω m ] . To approximate this function we construct a matrix X ˜ as follows: we divide [ u 1 , u n ] into n − 1 parts: u 1 < u 2 < ⋯ < u n . Similarly, we divide the interval [ ω 1 , ω m ] into m − 1 parts: ω 1 < ω 2 < ⋯ < ω m . Every entry of the matrix X ˜ is given by the values u i ω j ( 1 ≤ i ≤ n , 1 ≤ j ≤ m ) :
X ˜ = [ u 1 ω 1 u 2 ω 1 ... u n ω 1 u 1 ω 2 u 2 ω 2 ... u n ω 2 ... ... ... ... u 1 ω m u 2 ω m ... u n ω m ] (53)
The corresponding discretization of PCT ( x , k ) will be then given by the matrix
∑ i = 1 k t ˜ i p ˜ i * , t ˜ i ∈ ℝ m , p ˜ i ∈ ℝ n (54)
The vectors p ˜ i and t ˜ i can be obtained from the singular value decomposition of the matrix X ˜
X ˜ = U m × m S m × n P n × n * , (55)
where the columns of the scores matrix T = U S are the vectors t ˜ i and the columns of the loadings matrix P are the vectors p ˜ i . As an example, let us consider the case k = 4 , [ u 1 , u n ] = [ 0.5 , 1.5 ] , [ ω 1 , ω m ] = [ − 1 , 2 ] , n = m = 50 . Then
X ˜ = [ u 1 ω 1 u 2 ω 1 ... u 50 ω 1 u 1 ω 2 u 2 ω 2 ... u 50 ω 2 ... ... ... ... u 1 ω 50 u 2 ω 50 ... u 50 ω 50 ] , T = [ t 11 t 12 t 13 t 14 t 21 t 22 t 23 t 24 ... ... ... ... t m 1 t m 2 t m 3 t m 4 ] , P = [ p 11 p 12 ... p 1 n p 21 p 22 ... p 2 n p 31 p 32 ... p 3 n p 41 p 42 ... p 4 n ] , (56)
so that the Expression (54) becomes
t 1 p 1 * + t 2 p 2 * + t 3 p 3 * + t 4 p 4 * (57)
Assume now that ω = 0.5 . This value corresponds to row s in the matrix T . We find the number s as follows:
s ≈ m ω − ω 1 ω m − ω 1 = 50 ⋅ 0.5 − ( − 1 ) 2 − ( − 1 ) = 25 (58)
This yields
t 1 = t s 1 = − 7.0579 , t 2 = t s 2 = − 0.0089 , t 3 = t s 3 = 0.2400 , t 4 = t s 4 = 0.0016
and hence
u 0.5 ≈ − 7.0579 p 1 * ( u ) − 0.0089 p 2 * ( u ) + 0.2400 p 3 * ( u ) + 0.0016 p 4 * ( u ) (59)
where p i * ∈ ℝ 50 , i = 1 , 2 , 3 , 4 are the columns in the loadings matrix P , see (56).
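The construction above is easy to reproduce; the sketch below builds the 50 × 50 discretization of u ω , truncates its SVD at k = 4 and checks that the row closest to ω = 0.5 is recovered accurately. (A uniform grid is our assumption; the exact score values quoted above depend on the discretization and on sign choices in the SVD, so we check the reconstruction error instead.)

```python
import numpy as np

# 50 x 50 discretization of x(u, omega) = u^omega on [0.5, 1.5] x [-1, 2]
n = m = 50
u  = np.linspace(0.5, 1.5, n)
om = np.linspace(-1.0, 2.0, m)
X = u[None, :] ** om[:, None]          # row j holds u_i^{omega_j}, cf. (53)

# Truncated SVD: scores T = U S, loadings P (columns p_i)
U_, s, Pt = np.linalg.svd(X, full_matrices=False)
k = 4
T, P = U_[:, :k] * s[:k], Pt[:k, :].T

row = np.argmin(np.abs(om - 0.5))      # the row closest to omega = 0.5
approx = T[row] @ P.T                  # rank-4 reconstruction of that row
err = np.max(np.abs(approx - u ** om[row]))
```

The maximal pointwise error of the rank-4 reconstruction is tiny, which is why four principal components suffice in this example.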
To calculate the PCT of products of parametrized functions we need to apply the theory of tensor products of Hilbert spaces and compact operators. Appendix 5.2 includes all the necessary details used in this section.
Below we use the following notation (where τ = 1 , 2 ):
• U τ ⊂ ℝ N , Ω τ ⊂ ℝ M are compact sets;
• U = U 1 × U 2 , Ω = Ω 1 × Ω 2 ;
• H τ = L 2 ( U τ ) , K τ = L 2 ( Ω τ ) , H = L 2 ( U ) , K = L 2 ( Ω ) ;
• x ( τ ) ( u τ , ω τ ) , u τ ∈ U τ , ω τ ∈ Ω τ are square integrable functions and x ( u , ω ) = x ( 1 ) ( u 1 , ω 1 ) x ( 2 ) ( u 2 , ω 2 ) ;
• ( X ( τ ) h τ ) ( ω τ ) = ∫ U τ x ( τ ) ( u τ , ω τ ) h τ ( u τ ) d u τ so that X ( τ ) : H τ → K τ ;
• ( X h ) ( ω ) = ∫ U x ( u , ω ) h ( u ) d u so that X : H → K .
Theorem 5. In the above notation:
• H = H 1 ⊗ H 2 , K = K 1 ⊗ K 2
• X = X ( 1 ) ⊗ X ( 2 )
Proof. We use the definition of the tensor product from Appendix 5.2.
Let H τ = L 2 ( U τ ) have an orthonormal basis { e 1 ( τ ) , e 2 ( τ ) , ⋯ , e i ( τ ) , ⋯ } , so that any h τ ∈ H τ can be represented as
h τ = ∑ i = 1 ∞ c i ( τ ) e i ( τ ) ( τ = 1 , 2 ) , (60)
where ∑ i = 1 ∞ | c i ( τ ) | 2 < ∞ .
We prove now that the set E ≡ { e i ( 1 ) e j ( 2 ) , i , j ∈ ℕ } is an orthonormal basis in the space H = L 2 ( U ) . Its orthonormality follows directly from its definition. It remains therefore to check that the set of all linear combinations of the elements from E is dense in H . Indeed, the set of continuous functions, and hence the set P of polynomials P ( u ) , on U is dense in H . On the other hand, the set P ^ of polynomials of the form P ( 1 ) ( u 1 ) P ( 2 ) ( u 2 ) spans the set P and, finally, the set E spans the set P ^ . Thus, E spans H and we have proved that any h ∈ H can be represented as the L 2 -convergent series
h = ∑ i , j = 1 ∞ c i j e i ( 1 ) e j ( 2 ) (61)
for some set c i j satisfying
∑ i , j = 1 ∞ c i j 2 < ∞ (62)
Defining
( e i ( 1 ) ⊗ e j ( 2 ) ) ( u ) ≡ e i ( 1 ) ( u 1 ) e j ( 2 ) ( u 2 ) (63)
and comparing the Representation (61) with the Formula (94) proves the equality H = H 1 ⊗ H 2 . The equality K = K 1 ⊗ K 2 can be checked similarly.
Let us now prove the last formula of the theorem. First of all, we remark that the Definition (63) implies
g 1 ( ω 1 ) g 2 ( ω 2 ) = ( g 1 ⊗ g 2 ) ( ω ) (64)
for any g τ ∈ H τ , τ = 1 , 2 .
By the assumptions on the kernels, the operators in this equality are linear and bounded. Therefore, it is sufficient to check the equality for h = h 1 ⊗ h 2 (see Appendix 5.2).
( X h ) ( ω ) = ∫ U x ( u , ω ) h ( u ) d u = ∫ U 1 × U 2 x ( 1 ) ( u 1 , ω 1 ) x ( 2 ) ( u 2 , ω 2 ) h 1 ( u 1 ) h 2 ( u 2 ) d u 1 d u 2 = ∫ U 1 x ( 1 ) ( u 1 , ω 1 ) h 1 ( u 1 ) d u 1 ∫ U 2 x ( 2 ) ( u 2 , ω 2 ) h 2 ( u 2 ) d u 2 = ( X ( 1 ) h 1 ) ( ω 1 ) ( X ( 2 ) h 2 ) ( ω 2 ) = ( ( X ( 1 ) h 1 ) ⊗ ( X ( 2 ) h 2 ) ) ( ω ) (65)
due to (64). Hence X h = X ( h 1 ⊗ h 2 ) = ( X ( 1 ) h 1 ) ⊗ ( X ( 2 ) h 2 ) . Comparing this for- mula with the Definition (100) completes the proof of the theorem.
The main theoretical result of this subsection is the following theorem:
Theorem 6.
PCT ( X ( 1 ) ⊗ X ( 2 ) ) = ˙ PCT ( X ( 1 ) ) ⊗ PCT ( X ( 2 ) ) (66)
Proof. For τ = 1 , 2 we have by definition
PCT ( X ( τ ) ) α = ∑ i = 1 ∞ ( α , p i ( τ ) ) t i ( τ ) , (67)
where p i ( τ ) are normalized, mutually orthogonal eigenvectors of the operator ( X ( τ ) ) * X ( τ ) corresponding to the eigenvalues ( σ i ( τ ) ) 2 and t i ( τ ) = X ( τ ) p i ( τ ) .
Put X = X ( 1 ) ⊗ X ( 2 ) and p i j = p i ( 1 ) ⊗ p j ( 2 ) . Using the properties of the tensor product listed in Appendix 5.2 we obtain
( X * X ) p i j = ( X ( 1 ) ⊗ X ( 2 ) ) * ( X ( 1 ) ⊗ X ( 2 ) ) ( p i ( 1 ) ⊗ p j ( 2 ) ) = ( ( ( X ( 1 ) ) * X ( 1 ) ) ⊗ ( ( X ( 2 ) ) * X ( 2 ) ) ) ( p i ( 1 ) ⊗ p j ( 2 ) ) = ( ( ( X ( 1 ) ) * X ( 1 ) ) p i ( 1 ) ) ⊗ ( ( ( X ( 2 ) ) * X ( 2 ) ) p j ( 2 ) ) = ( ( σ i ( 1 ) ) 2 p i ( 1 ) ) ⊗ ( ( σ j ( 2 ) ) 2 p j ( 2 ) ) = ( σ i ( 1 ) σ j ( 2 ) ) 2 p i j , (68)
where
( p i j , p l m ) = ( ( p i ( 1 ) ⊗ p j ( 2 ) ) , ( p l ( 1 ) ⊗ p m ( 2 ) ) ) = ( p i ( 1 ) , p l ( 1 ) ) ( p j ( 2 ) , p m ( 2 ) ) = 1 if i = l , j = m and = 0 otherwise (69)
This proves that p i j are normalized, mutually orthogonal eigenvectors of the operator X * X corresponding to the eigenvalues ( σ i ( 1 ) σ j ( 2 ) ) 2 .
On the other hand,
X p i j = ( X ( 1 ) ⊗ X ( 2 ) ) ( p i ( 1 ) ⊗ p j ( 2 ) ) = ( X ( 1 ) p i ( 1 ) ) ⊗ ( X ( 2 ) p j ( 2 ) ) = t i ( 1 ) ⊗ t j ( 2 ) ≡ t i j (70)
Therefore,
( PCT ( X ( 1 ) ⊗ X ( 2 ) ) ) ( α 1 ⊗ α 2 ) = ˙ ∑ i = 1 ∞ ∑ j = 1 ∞ ( α 1 ⊗ α 2 , p i j ) t i j = ( ∑ i = 1 ∞ ( α 1 , p i ( 1 ) ) t i ( 1 ) ) ⊗ ( ∑ j = 1 ∞ ( α 2 , p j ( 2 ) ) t j ( 2 ) ) = ˙ ( PCT ( X ( 1 ) ) ⊗ PCT ( X ( 2 ) ) ) ( α 1 ⊗ α 2 ) , (71)
which proves the theorem. □
Remark 3. Theorem 6 is only valid for the full PCT. The analogous identity for the truncated transforms PCT ( ⋅ , k ) fails in general, as the ordering of the singular values σ i j = σ i ( 1 ) σ j ( 2 ) depends on the magnitudes of the singular values σ i ( 1 ) and σ j ( 2 ) .
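Theorem 6 and Remark 3 have a transparent matrix analog: the singular values of a Kronecker (tensor) product are precisely the pairwise products of the factors' singular values, but they arrive in mixed order. A sketch with random matrices standing in for discretized X ( 1 ) and X ( 2 ) :

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))        # illustrative discretized X^(1)
B = rng.standard_normal((5, 4))        # illustrative discretized X^(2)

sa = np.linalg.svd(A, compute_uv=False)
sb = np.linalg.svd(B, compute_uv=False)
sk = np.linalg.svd(np.kron(A, B), compute_uv=False)

# Singular values of the tensor product: all pairwise products
# sigma_i^(1) * sigma_j^(2), sorted in descending order
products = np.sort(np.outer(sa, sb).ravel())[::-1]
```

Because the sorted products interleave contributions from both factors, a k-truncation of the product generally differs from the product of k-truncations, which is exactly the content of Remark 3.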
In this subsection we describe the kernels of the integral operators related to products of the parametrized functions from Subsection 2.2. These examples are of importance in systems biology.
Example 1. Consider the following function
x ( u 1 , u 2 , ω 1 , ω 2 ) = u 1 ω 1 u 2 ω 2 , u 1 , u 2 ∈ U , ω 1 , ω 2 ∈ Ω (72)
Assume that U = [ a , b ] , a , b ∈ ℝ , a > 0 , Ω = [ 0 , 1 ] . Then, using Formulas (14) and (15), we obtain the following representations of the kernels γ and δ
γ ( u 1 , u 2 , v 1 , v 2 ) = ∬ Ω u 1 ω 1 u 2 ω 2 v 1 ω 1 v 2 ω 2 d ω 1 d ω 2 = ∫ 0 1 ∫ 0 1 ( u 1 v 1 ) ω 1 ( u 2 v 2 ) ω 2 d ω 1 d ω 2 = ∫ 0 1 ( u 2 v 2 ) ω 2 d ω 2 ∫ 0 1 ( u 1 v 1 ) ω 1 d ω 1 = u 1 v 1 − 1 ln ( u 1 v 1 ) ⋅ u 2 v 2 − 1 ln ( u 2 v 2 ) ,
δ ( ω 1 , ω 2 , ξ 1 , ξ 2 ) = ∬ U u 1 ω 1 u 2 ω 2 u 1 ξ 1 u 2 ξ 2 d u 1 d u 2 = ∬ U u 1 ω 1 + ξ 1 u 2 ω 2 + ξ 2 d u 1 d u 2 = ∫ a b u 1 ω 1 + ξ 1 d u 1 ∫ a b u 2 ω 2 + ξ 2 d u 2 = b ω 1 + ξ 1 + 1 − a ω 1 + ξ 1 + 1 ω 1 + ξ 1 + 1 ⋅ b ω 2 + ξ 2 + 1 − a ω 2 + ξ 2 + 1 ω 2 + ξ 2 + 1
Example 2. Consider the function
x ( u 1 , u 2 , ω 1 , ω 2 ) = e − ω 1 | u 1 | ⋅ e − ω 2 | u 2 | , u 1 , u 2 ∈ U , ω 1 , ω 2 ∈ Ω (73)
Assume that U = [ − c , c ] , c ∈ ℝ , c > 0 , Ω = [ a , b ] , a , b ∈ ℝ , a > 0. Then, using Formulas (14) and (15), we obtain the following representations of the kernels γ and δ
γ ( u 1 , u 2 , v 1 , v 2 ) = ∬ Ω e − ω 1 | u 1 | e − ω 2 | u 2 | e − ω 1 | v 1 | e − ω 2 | v 2 | d ω 1 d ω 2 = ∬ Ω e − ω 1 ( | u 1 | + | v 1 | ) e − ω 2 ( | u 2 | + | v 2 | ) d ω 1 d ω 2 = ∫ a b e − ω 1 ( | u 1 | + | v 1 | ) d ω 1 ∫ a b e − ω 2 ( | u 2 | + | v 2 | ) d ω 2 = 1 | u 1 | + | v 1 | ( e − a ( | u 1 | + | v 1 | ) − e − b ( | u 1 | + | v 1 | ) ) ⋅ 1 | u 2 | + | v 2 | ( e − a ( | u 2 | + | v 2 | ) − e − b ( | u 2 | + | v 2 | ) ) ,
δ ( ω 1 , ω 2 , ξ 1 , ξ 2 ) = ∬ U e − ω 1 | u 1 | e − ω 2 | u 2 | e − ξ 1 | u 1 | e − ξ 2 | u 2 | d u 1 d u 2 = ∬ U e − | u 1 | ( ω 1 + ξ 1 ) e − | u 2 | ( ω 2 + ξ 2 ) d u 1 d u 2 = ∫ − c c e − | u 1 | ( ω 1 + ξ 1 ) d u 1 ∫ − c c e − | u 2 | ( ω 2 + ξ 2 ) d u 2 = 2 ω 1 + ξ 1 ( 1 − e − c ( ω 1 + ξ 1 ) ) ⋅ 2 ω 2 + ξ 2 ( 1 − e − c ( ω 2 + ξ 2 ) )
Example 3. For the Hill function we obtain
x ( u 1 , u 2 , ω 1 , ω 2 ) = u 1 q 1 u 1 q 1 + θ 1 q 1 u 2 q 2 u 2 q 2 + θ 2 q 2 (74)
Assume that
u i ∈ U , U = [ a , b ] , a , b ∈ ℝ , a > 0 ,
ω i = ( q i , θ i ) , q i ∈ [ q 0 , q m ] , q 0 , q m ∈ ℝ , q 0 > 0 ,
θ i ∈ [ θ 0 , θ m ] , θ 0 , θ m ∈ ℝ , θ 0 > 0 , i = 1 , 2.
We put Ω = [ q 0 , q m ] × [ θ 0 , θ m ] and ξ i = ( q ′ i , θ ′ i ) , i = 1 , 2. Then, using Formulas (14) and (15), we obtain the following representations of the kernels γ and δ
γ ( u 1 , u 2 , v 1 , v 2 ) = ∫ Ω ∫ Ω u 1 q 1 u 1 q 1 + θ 1 q 1 u 2 q 2 u 2 q 2 + θ 2 q 2 v 1 q 1 v 1 q 1 + θ 1 q 1 v 2 q 2 v 2 q 2 + θ 2 q 2 d ω 1 d ω 2 , (75)
δ ( ω 1 , ω 2 , ξ 1 , ξ 2 ) = ∬ U u 1 q 1 u 1 q 1 + θ 1 q 1 u 2 q 2 u 2 q 2 + θ 2 q 2 u 1 q ′ 1 u 1 q ′ 1 + θ ′ 1 q ′ 1 u 2 q ′ 2 u 2 q ′ 2 + θ ′ 2 q ′ 2 d u 1 d u 2 (76)
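For the Hill kernels (75) and (76) no elementary closed form is available, but the integrand in (76) still factorizes into a function of u_1 times a function of u_2, which keeps a numerical evaluation cheap. A sketch (all parameter values are arbitrary test choices):

```python
import numpy as np

def hill(u, q, theta):
    # Hill function u^q / (u^q + theta^q)
    return u**q / (u**q + theta**q)

def delta_hill(omega1, omega2, xi1, xi2, a=0.5, b=2.0, n=4001):
    # omega_i = (q_i, theta_i), xi_i = (q'_i, theta'_i); the double integral (76)
    # factorizes into two one-dimensional integrals over [a, b]
    u = np.linspace(a, b, n)
    def factor(om, xi):
        y = hill(u, *om) * hill(u, *xi)
        return float(np.sum((y[1:] + y[:-1]) * np.diff(u)) / 2.0)
    return factor(omega1, xi1) * factor(omega2, xi2)

d = delta_hill((2.0, 1.0), (3.0, 1.5), (2.5, 1.2), (1.5, 0.8))
assert 0.0 < d < (2.0 - 0.5)**2   # each Hill factor lies in (0, 1)
```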
Remark 4. The eigenfunctions of the integral operators with the kernels that are products of parametrized functions are, according to Subsection 5.2, also products of the respective eigenfunctions of the factors.
The main results of the paper can be summarized as follows. We defined a distance in the space of parametrized functions. We defined the k-th Principal Component Transform (PCT) and the Full Principal Component Transform of functions x ∈ L^2 ( U × Ω ) . The k-th PCT is the best approximation of the given function, i.e. it minimizes dist ( ⋅ , ⋅ ) . We proved that if a sequence of functions x^(n) ( s ) converges to a continuous function x ( s ) , then the sequence of PCTs of x^(n) ( s ) converges to the PCT of x ( s ) . Some further properties of the PCT were considered. These results can also serve as a theoretical background for the design of metamodels. Using the theory of tensor products of Hilbert spaces and compact operators, we calculated the PCT of products of functions. Finally, we provided several examples of discrete approximations and of products of parametrized functions.
We emphasize that our study is motivated by systems biology. In future work we aim to investigate the problem of “sloppiness” in nonlinear models [
The work of the second author has been partially supported by the Norwegian Research Council, grant 239070.
Zabrodskii, I. and Ponosov, A. (2017) The Principal Component Transform of Parametrized Functions. Applied Mathematics, 8, 453-475. https://doi.org/10.4236/am.2017.84037
1. Allahverdiev’s theorem
Let H and K be two real separable Hilbert spaces, equipped with the scalar products ( ⋅ , ⋅ ) H and ( ⋅ , ⋅ ) K and the corresponding norms ‖ ⋅ ‖ H and ‖ ⋅ ‖ K , respectively. Assume that X : H → K is a linear compact operator. Its norm is
defined as ‖ X ‖ = sup ‖ α ‖ H ≤ 1 ‖ X α ‖ K .
Put
L_k ( H , K ) = { Y : Y is a linear bounded operator from H to K such that dim ( Im Y ) ≤ k } .
We want to find an operator X k ∈ L k ( H , K ) for which ‖ X − X k ‖ → min . This construction is very close to the finite dimensional singular value decomposition.
Let X* : K → H be the adjoint of X . Then the linear compact operators X*X : H → H , XX* : K → K are self-adjoint and positive semi-definite. Let σ_1^2 ≥ σ_2^2 ≥ σ_3^2 ≥ ⋯ → 0 , σ_i > 0 be all positive eigenvalues of the operator X*X , the associated normalized eigenvectors being p_1 , p_2 , p_3 , ⋯ ∈ H , respectively:
X * X p i = σ i 2 p i , ‖ p i ‖ = 1 , i ∈ ℕ . (77)
It is well-known that p i can always be chosen to be orthogonal: p i ⊥ p j , i ≠ j . By the Hilbert-Schmidt theorem, for any α ∈ H there is a
unique set c i ∈ ℝ , i ∈ ℕ and a unique p 0 ∈ Null ( X * X ) for which α = p 0 + ∑ i = 1 ∞ c i p i and, moreover, ‖ α ‖ H 2 = ‖ p 0 ‖ H 2 + ∑ i = 1 ∞ c i 2 . Thus, the operator X can be represented as
X α = ∑ i = 1 ∞ ( α , p i ) H t i = ∑ i = 1 ∞ c i t i , (78)
where t i = X p i , and the convergence is understood in the sense of the norm in the space K . We define the linear bounded operators X k ∈ L k ( H , K ) by
X k α = ∑ i = 1 k ( α , p i ) H t i = ∑ i = 1 k c i t i (79)
The following result is known as Allahverdiev’s theorem, see e.g. [
Proposition 7. For any linear compact operator X : H → K
min Y ∈ L k ( H , K ) ‖ X − Y ‖ = ‖ X − X k ‖ = σ k + 1 (80)
Proof. First of all, we prove that ‖ X − X k ‖ = σ k + 1 . By definition,
‖ X − X k ‖ 2 = sup ‖ α ‖ H ≤ 1 ‖ ( X − X k ) α ‖ K 2 (81)
From (79) and (78) we get
( X − X_k ) α = X α − X_k α = ∑_{i=1}^∞ c_i t_i − ∑_{i=1}^k c_i t_i = ∑_{i=k+1}^∞ c_i t_i (82)
We calculate the norm of X − X k using (81), (82):
‖ X − X_k ‖^2 = sup_{‖α‖_H ≤ 1} ‖ ∑_{i=k+1}^∞ c_i t_i ‖_K^2 = sup_{‖α‖_H ≤ 1} ∑_{i=k+1}^∞ c_i^2 ‖ t_i ‖_K^2 = sup_{‖α‖_H ≤ 1} ∑_{i=k+1}^∞ c_i^2 σ_i^2 , (83)
because
‖ t_i ‖_K^2 = ( t_i , t_i )_K = ( X p_i , X p_i )_K = ( X*X p_i , p_i )_H = ( σ_i^2 p_i , p_i )_H = σ_i^2 ( p_i , p_i )_H = σ_i^2 ‖ p_i ‖_H^2 = σ_i^2 (84)
and
( t_i , t_j )_K = ( X p_i , X p_j ) = ( X*X p_i , p_j ) = ( σ_i^2 p_i , p_j ) = 0 if i ≠ j (85)
As α = p_0 + ∑_{i=1}^∞ c_i p_i , p_0 ⊥ p_i ( i ∈ ℕ ) and ‖ α ‖_H^2 = ‖ p_0 ‖^2 + ∑_{i=1}^∞ c_i^2 ≤ 1 , we obtain ∑_{i=1}^∞ c_i^2 ≤ 1 . As σ_{k+1} ≥ σ_i for all i ≥ k + 1 ,
sup ∑_{i=k+1}^∞ c_i^2 σ_i^2 = σ_{k+1}^2 , (86)
the supremum being attained at c_{k+1} = 1 , c_{k+2} = c_{k+3} = ⋯ = 0 and p_0 = 0 .
Hence,
‖ X − X k ‖ = σ k + 1 (87)
Secondly, we prove that
‖ X − Y ‖ ≥ ‖ X − X_k ‖ for all Y ∈ L_k ( H , K ) (88)
Let y 1 , ⋯ , y k be a basis in Im Y . Then there exist some z 1 , ⋯ , z k from H such that
Y α = ∑ i = 1 k ( α , z i ) H y i (89)
We want to prove that
span { z 1 , ⋯ , z k } ⊥ ∩ span { p 1 , ⋯ , p k + 1 } ≠ { 0 } (90)
If α ∈ span { z 1 , ⋯ , z k } ⊥ , then Y α = 0.
If α ∈ span { p 1 , ⋯ , p k + 1 } , then α = α 1 p 1 + ⋯ + α k + 1 p k + 1 , α i ∈ ℝ , 1 ≤ i ≤ k + 1.
Therefore
span { z_1 , ⋯ , z_k }^⊥ ∩ span { p_1 , ⋯ , p_{k+1} } ≠ { 0 } if and only if the homogeneous system
α_1 ( p_1 , z_i ) + ⋯ + α_{k+1} ( p_{k+1} , z_i ) = 0 , 1 ≤ i ≤ k , has non-trivial solutions. (91)
This homogeneous system has k + 1 unknowns and k equations, so that there is α = ∑ i = 1 k + 1 c i p i such that ∑ i = 1 k + 1 c i 2 = 1 and Y α = 0 . Therefore
‖ X − Y ‖ 2 ≥ ‖ ( X − Y ) α ‖ K 2 = ‖ ∑ i = 1 k + 1 c i t i ‖ K 2 = ∑ i = 1 k + 1 c i 2 ‖ t i ‖ K 2 ≥ σ k + 1 2 , (92)
as ‖ t_i ‖_K = σ_i ≥ σ_{k+1} = ‖ t_{k+1} ‖_K for i ≤ k + 1.
□
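In the finite-dimensional case Proposition 7 reduces to the Eckart–Young theorem for the spectral norm, which can be verified with an SVD; a minimal sketch (the matrix and the rank k below are arbitrary test choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 6))      # a linear operator between finite-dimensional spaces
U, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 3
# X_k = sum_{i <= k} sigma_i u_i v_i^T, the best rank-k approximation
Xk = U[:, :k] * s[:k] @ Vt[:k, :]

# the operator (spectral) norm of the error equals the (k+1)-st singular value
assert np.isclose(np.linalg.norm(X - Xk, 2), s[k])
```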
2. Tensor product of operators in Hilbert spaces
Let H 1 , H 2 and K 1 , K 2 be real separable Hilbert spaces, where
• H 1 has an orthonormal basis { e 1 ( 1 ) , e 2 ( 1 ) , ⋯ , e i ( 1 ) , ⋯ } .
• H 2 has an orthonormal basis { e 1 ( 2 ) , e 2 ( 2 ) , ⋯ , e j ( 2 ) , ⋯ } .
• K 1 has an orthonormal basis { e ^ 1 ( 1 ) , e ^ 2 ( 1 ) , ⋯ , e ^ i ( 1 ) , ⋯ } .
• K 2 has an orthonormal basis { e ^ 1 ( 2 ) , e ^ 2 ( 2 ) , ⋯ , e ^ j ( 2 ) , ⋯ } .
Let
h τ = ∑ i = 1 ∞ c i ( τ ) e i ( τ ) , c i ( τ ) ∈ ℝ , τ = 1 , 2 (93)
Now, we define the tensor product H = H 1 ⊗ H 2 of the spaces H 1 and H 2 as the real separable Hilbert space, which has the basis e i j consisting of all ordered pairs ( e i ( 1 ) , e j ( 2 ) ) , and we put e i j ≡ e i ( 1 ) ⊗ e j ( 2 ) . By definition, any h ∈ H can be uniquely represented as
h = ∑ i , j = 1 ∞ c i j e i j , ∑ i , j = 1 ∞ c i j 2 < ∞ (94)
Definition 2. The scalar product ( ⋅ , ⋅ ) in H is defined as
( g , h ) = ∑ i , j = 1 ∞ c i j d i j , (95)
where g = ∑ i , j = 1 ∞ c i j e i j ∈ H , h = ∑ i , j = 1 ∞ d i j e i j ∈ H .
Evidently, the set { e_i^(1) ⊗ e_j^(2) } is an orthonormal basis of the space H 1 ⊗ H 2 and therefore
‖ h ‖ 2 = ∑ i , j = 1 ∞ | ( h , e i j ) | 2 = ∑ i , j = 1 ∞ | c i j | 2 (96)
is the norm on H . The series
∑ i , j = 1 ∞ c i j e i j , c i j ∈ ℝ
converges in this norm. It is also straightforward to check that
‖ h 1 ⊗ h 2 ‖ = ‖ h 1 ‖ H 1 ‖ h 2 ‖ H 2 (97)
for all h 1 ∈ H 1 , h 2 ∈ H 2 .
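For finite-dimensional coordinate vectors, the coefficients c_ij of h_1 ⊗ h_2 are exactly the Kronecker product of the two coefficient vectors, so the cross-norm property (97) can be checked directly (the vectors below are arbitrary):

```python
import numpy as np

h1 = np.array([1.0, -2.0, 3.0])   # coefficients of h1 in the basis of H1
h2 = np.array([0.5, 4.0])         # coefficients of h2 in the basis of H2

# c_ij = c_i^(1) * c_j^(2): the coefficients of h1 ⊗ h2 in the basis e_ij
h = np.kron(h1, h2)

# property (97): ||h1 ⊗ h2|| = ||h1|| * ||h2||
assert np.isclose(np.linalg.norm(h), np.linalg.norm(h1) * np.linalg.norm(h2))
```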
Let us consider two compact linear operators
X ( 1 ) : H 1 → K 1 , X ( 2 ) : H 2 → K 2 (98)
For all h 1 ∈ H 1 , h 2 ∈ H 2 we have
X ( 1 ) h 1 = ∑ i = 1 ∞ c i ( 1 ) X ( 1 ) e i ( 1 ) , X ( 2 ) h 2 = ∑ i = 1 ∞ c i ( 2 ) X ( 2 ) e i ( 2 ) (99)
We define the tensor product X ( 1 ) ⊗ X ( 2 ) : H 1 ⊗ H 2 → K 1 ⊗ K 2 of X ( 1 ) and X ( 2 ) as
X h = ∑ i , j = 1 ∞ c i j ( X ( 1 ) e i ( 1 ) ⊗ X ( 2 ) e j ( 2 ) ) , (100)
where h ∈ H is given by (94).
Proposition 8. If X ( 1 ) : H 1 → K 1 , X ( 2 ) : H 2 → K 2 are linear compact ope- rators, then so is the operator X ( 1 ) ⊗ X ( 2 ) : H 1 ⊗ H 2 → K 1 ⊗ K 2 .
Proof. Linearity of X ≡ X ( 1 ) ⊗ X ( 2 ) follows directly from the definition. Taking an arbitrary h ∈ H satisfying (94) we obtain
‖ X h ‖ 2 = ‖ ∑ i , j = 1 ∞ ( c i j ( X ( 1 ) e i ( 1 ) ⊗ X ( 2 ) e j ( 2 ) ) ) ‖ 2 ≤ ∑ i , j = 1 ∞ | c i j | 2 ‖ X ( 1 ) e i ( 1 ) ⊗ X ( 2 ) e j ( 2 ) ‖ 2 ≤ ∑ i , j = 1 ∞ | c i j | 2 ‖ X ( 1 ) ‖ 2 ‖ X ( 2 ) ‖ 2 = ‖ X ( 1 ) ‖ 2 ‖ X ( 2 ) ‖ 2 ‖ h ‖ 2 (101)
Therefore X is bounded, and in particular,
‖ X ‖ ≤ ‖ X ( 1 ) ‖ ‖ X ( 2 ) ‖ (102)
To prove compactness, we choose an arbitrary ε > 0 and linear bounded finite-dimensional operators Y^(τ) : H_τ → K_τ for which ‖ X^(τ) − Y^(τ) ‖ < ε ( τ = 1 , 2 ) .
Evidently,
X ( 1 ) ⊗ X ( 2 ) − Y ( 1 ) ⊗ Y ( 2 ) = ( X ( 1 ) − Y ( 1 ) ) ⊗ ( X ( 2 ) − Y ( 2 ) ) + ( X ( 1 ) − Y ( 1 ) ) ⊗ Y ( 2 ) + Y ( 1 ) ⊗ ( X ( 2 ) − Y ( 2 ) ) (103)
Using (102) we obtain
‖ X ( 1 ) ⊗ X ( 2 ) − Y ( 1 ) ⊗ Y ( 2 ) ‖ ≤ ‖ X ( 1 ) − Y ( 1 ) ‖ ‖ X ( 2 ) − Y ( 2 ) ‖ + ‖ X ( 1 ) − Y ( 1 ) ‖ ‖ Y ( 2 ) ‖ + ‖ Y ( 1 ) ‖ ‖ X ( 2 ) − Y ( 2 ) ‖ < ε 2 + ε ( ‖ X ( 1 ) ‖ + ε ) + ε ( ‖ X ( 2 ) ‖ + ε ) (104)
Therefore, the operator X ( 1 ) ⊗ X ( 2 ) can be approximated in norm by finite dimensional operators of the form Y ( 1 ) ⊗ Y ( 2 ) with an arbitrary precision. Thus, X ( 1 ) ⊗ X ( 2 ) is compact.
□
Proposition 9. For all linear compact operators X ( 1 ) : H 1 → K 1 and X ( 2 ) : H 2 → K 2 we have
( X ( 1 ) ⊗ X ( 2 ) ) * = ( X ( 1 ) ) * ⊗ ( X ( 2 ) ) * (105)
Proof. The set of finite linear combinations of elements of the form h 1 ⊗ h 2 is dense in H 1 ⊗ H 2 , i.e. for every h ∈ H 1 ⊗ H 2 there is a sequence of such linear combinations which converges to h in the norm. As the operators X ( 1 ) and X ( 2 ) are linear and bounded, it is sufficient to prove the equality of the proposition for the special case of h = h 1 ⊗ h 2 ∈ H 1 ⊗ H 2 , where we by definition have the formula
( X ( 1 ) ⊗ X ( 2 ) ) ( h 1 ⊗ h 2 ) = ( X ( 1 ) h 1 ) ⊗ ( X ( 2 ) h 2 ) (106)
Let α = α 1 ⊗ α 2 and β = β 1 ⊗ β 2 , where α 1 ∈ H 1 , α 2 ∈ H 2 , β 1 ∈ K 1 and β 2 ∈ K 2 . Then
( ( X ( 1 ) ⊗ X ( 2 ) ) α , β ) = ( X ( 1 ) α 1 ⊗ X ( 2 ) α 2 , β 1 ⊗ β 2 ) = ( X ( 1 ) α 1 , β 1 ) ( X ( 2 ) α 2 , β 2 ) = ( α 1 , ( X ( 1 ) ) * β 1 ) ( α 2 , ( X ( 2 ) ) * β 2 ) = ( α 1 ⊗ α 2 , ( X ( 1 ) ) * β 1 ⊗ ( X ( 2 ) ) * β 2 ) = ( α , ( X ( 1 ) ) * ⊗ ( X ( 2 ) ) * β ) (107)
Hence ( X ( 1 ) ⊗ X ( 2 ) ) * = ( X ( 1 ) ) * ⊗ ( X ( 2 ) ) * .
□
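In matrix form, where the adjoint is the transpose and the tensor product of operators is the Kronecker product, Proposition 9 reads kron(A, B)ᵀ = kron(Aᵀ, Bᵀ); a quick check with arbitrary matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))   # matrix of X^(1) : H1 -> K1
B = rng.standard_normal((2, 5))   # matrix of X^(2) : H2 -> K2

# (X^(1) ⊗ X^(2))* = (X^(1))* ⊗ (X^(2))* becomes a transpose identity
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
```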
Proposition 10. If ( λ^(τ) , q^(τ) ) is an eigenpair of the operator X^(τ) : H_τ → K_τ ( τ = 1 , 2 ), then ( λ^(1) λ^(2) , q^(1) ⊗ q^(2) ) is an eigenpair of the operator X ( 1 ) ⊗ X ( 2 ) .
Proof.
( X ( 1 ) ⊗ X ( 2 ) ) ( q ( 1 ) ⊗ q ( 2 ) ) = ( X ( 1 ) q ( 1 ) ) ⊗ ( X ( 2 ) q ( 2 ) ) = ( λ ( 1 ) q ( 1 ) ) ⊗ ( λ ( 2 ) q ( 2 ) ) = λ ( 1 ) λ ( 2 ) q ( 1 ) ⊗ q ( 2 ) (108)
□
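Proposition 10 is easy to verify numerically for symmetric matrices (the matrices below are arbitrary test choices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # symmetric, eigenvalues 1 and 3
B = np.array([[4.0, 0.0],
              [0.0, 9.0]])        # diagonal, eigenvalues 4 and 9

lam_A, P = np.linalg.eigh(A)      # columns of P are orthonormal eigenvectors of A
lam_B, Q = np.linalg.eigh(B)

v = np.kron(P[:, 1], Q[:, 1])     # q^(1) ⊗ q^(2)

# (A ⊗ B)(q^(1) ⊗ q^(2)) = (lam^(1) * lam^(2)) (q^(1) ⊗ q^(2))
assert np.allclose(np.kron(A, B) @ v, lam_A[1] * lam_B[1] * v)
```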