Inner product space
In linear algebra, an inner product space is a vector space with an additional structure called an inner product. This additional structure associates each pair of vectors in the space with a scalar quantity known as the inner product of the vectors. Inner products allow the rigorous introduction of intuitive geometrical notions such as the length of a vector or the angle between two vectors. They also provide the means of defining orthogonality between vectors (zero inner product). Inner product spaces generalize Euclidean spaces (in which the inner product is the dot product, also known as the scalar product) to vector spaces of any (possibly infinite) dimension, and are studied in functional analysis. The first usage of the concept of a vector space with an inner product is due to Peano, in 1898.^{[1]}
An inner product naturally induces an associated norm, thus an inner product space is also a normed vector space. A complete space with an inner product is called a Hilbert space. An (incomplete) space with an inner product is called a pre-Hilbert space, since its completion with respect to the norm induced by the inner product is a Hilbert space. Inner product spaces over the field of complex numbers are sometimes referred to as unitary spaces.
Definition
In this article, the field of scalars denoted F is either the field of real numbers R or the field of complex numbers C.
Formally, an inner product space is a vector space V over the field F together with an inner product, i.e., with a map
$$\langle \cdot, \cdot \rangle : V \times V \to F$$
that satisfies the following three axioms for all vectors x, y, z ∈ V and all scalars a ∈ F:^{[2]}^{[3]}
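- Conjugate symmetry:^{[Note 1]}
  $$\langle x, y\rangle = \overline{\langle y, x\rangle}$$
- Linearity in the first argument:
  $$\langle ax, y\rangle = a\langle x, y\rangle, \qquad \langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle$$
- Positive-definiteness:
  $$\langle x, x\rangle \ge 0, \qquad \langle x, x\rangle = 0 \Rightarrow x = 0$$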
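The axioms can be spot-checked numerically. The following is a minimal sketch, assuming the standard inner product on C^{3} (the sum of x_{i} times the conjugate of y_{i}); the helper names `inner` and `random_vector` are illustrative and not part of any library.

```python
import random

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def random_vector(n, rng):
    """Hypothetical helper: a random vector in C^n with entries in the unit square."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

rng = random.Random(0)
x, y, z = (random_vector(3, rng) for _ in range(3))
a = complex(0.5, -2.0)

# Conjugate symmetry: <x, y> equals the conjugate of <y, x>.
conjugate_symmetric = abs(inner(x, y) - inner(y, x).conjugate()) < 1e-12

# Linearity in the first argument: <a x + y, z> = a <x, z> + <y, z>.
ax_plus_y = [a * xi + yi for xi, yi in zip(x, y)]
linear_first = abs(inner(ax_plus_y, z) - (a * inner(x, z) + inner(y, z))) < 1e-12

# Positive-definiteness: <x, x> is real and positive for a non-zero x.
positive = abs(inner(x, x).imag) < 1e-12 and inner(x, x).real > 0

print(conjugate_symmetric, linear_first, positive)
```

A check on random vectors is only evidence, not a proof; the axioms hold for this example by direct algebra.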
Alternative definitions, notations and remarks
Some authors, especially in physics and matrix algebra, prefer to define the inner product and the sesquilinear form with linearity in the second argument rather than the first. Then the first argument becomes conjugate linear, rather than the second. In those disciplines we would write the product ⟨x,y⟩ as ⟨y|x⟩ (the bra–ket notation of quantum mechanics), respectively y^{†}x (dot product as a case of the convention of forming the matrix product AB as the dot products of rows of A with columns of B). Here the kets and columns are identified with the vectors of V and the bras and rows with the linear functionals (covectors) of the dual space V^{∗}, with conjugacy associated with duality. This reverse order is now occasionally followed in the more abstract literature,^{[4]} taking ⟨x,y⟩ to be conjugate linear in x rather than y. A few instead find a middle ground by recognizing both ⟨·,·⟩ and ⟨·|·⟩ as distinct notations differing only in which argument is conjugate linear.
There are various technical reasons why it is necessary to restrict the basefield to R and C in the definition. Briefly, the basefield has to contain an ordered subfield in order for non-negativity to make sense,^{[5]} and therefore has to have characteristic equal to 0 (since any ordered field has to have such characteristic). This immediately excludes finite fields. The basefield has to have additional structure, such as a distinguished automorphism. More generally any quadratically closed subfield of R or C will suffice for this purpose, e.g., the algebraic numbers or the constructible numbers. However in these cases when it is a proper subfield (i.e., neither R nor C) even finite-dimensional inner product spaces will fail to be metrically complete. In contrast all finite-dimensional inner product spaces over R or C, such as those used in quantum computation, are automatically metrically complete and hence Hilbert spaces.
In some cases we need to consider non-negative semi-definite sesquilinear forms. This means that ⟨x,x⟩ is only required to be non-negative. We show how to treat these below.
Elementary properties
When F = R, conjugate symmetry reduces to symmetry. That is, ⟨x,y⟩ = ⟨y,x⟩ for F = R, while for F = C, ⟨x,y⟩ is equal to the complex conjugate of ⟨y,x⟩.
Notice that conjugate symmetry implies that ⟨x,x⟩ is real for all x, since we have:
$$\langle x, x\rangle = \overline{\langle x, x\rangle}.$$
Moreover, sesquilinearity (see below) implies that
$$\langle x, -x\rangle = -\langle x, x\rangle = \langle -x, x\rangle.$$
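Conjugate symmetry and linearity in the first variable give
$$\langle x, ay\rangle = \overline{\langle ay, x\rangle} = \overline{a}\,\overline{\langle y, x\rangle} = \overline{a}\langle x, y\rangle,$$
$$\langle x, y + z\rangle = \overline{\langle y + z, x\rangle} = \overline{\langle y, x\rangle} + \overline{\langle z, x\rangle} = \langle x, y\rangle + \langle x, z\rangle,$$
so an inner product is a sesquilinear form. Conjugate symmetry is also called Hermitian symmetry, and a conjugate symmetric sesquilinear form is called a Hermitian form. While the above axioms are more mathematically economical, a compact verbal definition of an inner product is a positive-definite Hermitian form.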
In the case of F = R, conjugate-symmetry reduces to symmetry, and sesquilinear reduces to bilinear. So, an inner product on a real vector space is a positive-definite symmetric bilinear form.
From the linearity property it follows that x = 0 implies ⟨x,x⟩ = 0, while from the positive-definiteness axiom we obtain the converse: ⟨x,x⟩ = 0 implies x = 0. Combining these two, we have the property that ⟨x,x⟩ = 0 if and only if x = 0.
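Combining the linearity of the inner product in its first argument and the conjugate symmetry gives the following important generalization of the familiar square expansion:
$$\langle x + y, x + y\rangle = \langle x, x\rangle + \langle x, y\rangle + \langle y, x\rangle + \langle y, y\rangle.$$
Assuming the underlying field to be R, the inner product becomes symmetric, and we obtain
$$\langle x + y, x + y\rangle = \langle x, x\rangle + 2\langle x, y\rangle + \langle y, y\rangle.$$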
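The property of an inner product space V that
$$\langle x + y, z\rangle = \langle x, z\rangle + \langle y, z\rangle \quad\text{and}\quad \langle x, y + z\rangle = \langle x, y\rangle + \langle x, z\rangle$$
is also known as additivity.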
Examples
Real numbers
A simple example is the real numbers with the standard multiplication as the inner product:
$$\langle x, y\rangle := xy.$$
Euclidean space
More generally, the real n-space R^{n} with the dot product is an inner product space, an example of a Euclidean n-space:
$$\langle x, y\rangle := x^{T} y = \sum_{i=1}^{n} x_i y_i = x_1 y_1 + \cdots + x_n y_n,$$
where x^{T} is the transpose of x.
Complex coordinate space
The general form of an inner product on C^{n} is known as the Hermitian form and is given by
$$\langle x, y\rangle := y^{\dagger} M x = \overline{x^{\dagger} M y},$$
where M is any Hermitian positive-definite matrix and y^{†} is the conjugate transpose of y. For the real case this corresponds to the dot product of the results of directionally different scaling of the two vectors, with positive scale factors and orthogonal directions of scaling. Up to an orthogonal transformation it is a weighted-sum version of the dot product, with positive weights.
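As a minimal sketch of such a Hermitian form, the snippet below evaluates y^{†}Mx with plain nested lists. The 2 × 2 matrix `M` is a hypothetical example chosen to be Hermitian with eigenvalues 1 and 3 (hence positive-definite), and the function name `hermitian_form` is illustrative.

```python
def hermitian_form(x, y, M):
    """Return y^dagger M x for vectors x, y and a square matrix M (nested lists)."""
    n = len(x)
    Mx = [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(y[i].conjugate() * Mx[i] for i in range(n))

# Hypothetical Hermitian positive-definite matrix (eigenvalues 1 and 3).
M = [[2, 1j], [-1j, 2]]
x = [1 + 1j, 2]
y = [0.5j, 1 - 1j]

# Conjugate symmetry of the form: <x, y> = conj(<y, x>).
symmetric = abs(hermitian_form(x, y, M) - hermitian_form(y, x, M).conjugate()) < 1e-12

# <x, x> is real and positive for this non-zero x.
self_product = hermitian_form(x, x, M)

print(symmetric, self_product)
```

Taking M to be the identity matrix recovers the standard inner product on C^{n}.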
Hilbert space
The article on Hilbert space has several examples of inner product spaces wherein the metric induced by the inner product yields a complete metric space. An example of an inner product which induces an incomplete metric occurs with the space C([a,b]) of continuous complex valued functions on the interval [a,b]. The inner product is
$$\langle f, g\rangle := \int_a^b f(t)\,\overline{g(t)}\,dt.$$
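This space is not complete; consider for example, for the interval [−1,1], the sequence of continuous "step" functions, { f_{k}}_{k}, defined by:
$$f_k(t) = \begin{cases} 0 & t \in [-1, 0] \\ 1 & t \in \left[\tfrac{1}{k}, 1\right] \\ kt & t \in \left(0, \tfrac{1}{k}\right) \end{cases}$$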
This sequence is a Cauchy sequence for the norm induced by the preceding inner product, which does not converge to a continuous function.
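The Cauchy behaviour can be illustrated numerically. This is a rough sketch that assumes a concrete form for the "step" functions: f_{k} is 0 on [−1,0], rises linearly as kt on (0,1/k), and equals 1 on [1/k,1]. The midpoint-rule integrator and the names `f` and `l2_distance` are illustrative choices.

```python
import math

def f(k, t):
    """Continuous ramp: 0 on [-1, 0], k*t on (0, 1/k), 1 on [1/k, 1]."""
    return min(1.0, max(0.0, k * t))

def l2_distance(k, m, n=100_000):
    """Midpoint-rule approximation of the L^2 norm of f_k - f_m on [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * h
        total += (f(k, t) - f(m, t)) ** 2
    return math.sqrt(total * h)

# The distance between f_k and f_2k shrinks as k grows.
for k in (10, 100, 1000):
    print(k, l2_distance(k, 2 * k))
```

For k < m, a direct calculation with these ramps gives a squared distance of (m − k)^{2}/(3km^{2}), which is at most 1/(3k), so the distances tend to zero. The pointwise limit is 0 on [−1,0] and 1 on (0,1], which is discontinuous at 0.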
Random variables
For real random variables X and Y, the expected value of their product
$$\langle X, Y\rangle := \operatorname{E}(XY)$$
is an inner product.^{[6]}^{[7]}^{[8]} In this case, ⟨X,X⟩ = 0 if and only if Pr(X = 0) = 1 (i.e., X = 0 almost surely). This definition of expectation as inner product can be extended to random vectors as well.
Real matrices
For real matrices of the same size, ⟨A,B⟩ := tr(AB^{T}) with transpose as conjugation
is an inner product.
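A small sketch of this matrix inner product, using nested lists for real matrices: it computes tr(AB^{T}) literally and compares it with the sum of entrywise products, to which it reduces. The function names and the sample matrices are illustrative assumptions.

```python
def trace_inner(A, B):
    """Compute tr(A B^T) by forming the product A B^T and summing its diagonal."""
    rows, cols = len(A), len(A[0])
    product = [[sum(A[i][k] * B[j][k] for k in range(cols)) for j in range(rows)]
               for i in range(rows)]
    return sum(product[i][i] for i in range(rows))

def entrywise_inner(A, B):
    """Sum of A[i][j] * B[i][j]; equals tr(A B^T) for equally sized real matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

A = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]
B = [[2.0, 0.0, 1.0], [1.0, 4.0, -2.0]]

print(trace_inner(A, B), entrywise_inner(A, B))
print(trace_inner(A, A))
```

The entrywise form shows that this is the dot product of the two matrices read as vectors of length mn.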
Vector spaces with forms
On an inner product space, or more generally a vector space with a nondegenerate form (so an isomorphism V → V^{∗}) vectors can be sent to covectors (in coordinates, via transpose), so one can take the inner product and outer product of two vectors, not simply of a vector and a covector.
Norms on inner product spaces
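A linear space with a norm such as:
$$\|x\|_p = \left(\sum_{i=1}^{\infty} |\xi_i|^p\right)^{1/p}, \qquad x = \{\xi_i\} \in l^p,\ p \ne 2,$$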
is a normed space but not an inner product space, because this norm does not satisfy the parallelogram equality required of a norm to have an inner product associated with it.^{[9]}^{[10]}
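The failure of the parallelogram equality can be seen on a small case. The sketch below assumes the finite-dimensional p-norm on R^{2} as a stand-in and compares p = 1 with p = 2 on the two standard basis vectors; `p_norm` and `parallelogram_gap` are illustrative names.

```python
def p_norm(v, p):
    """The p-norm of a finite real vector: (sum |v_i|^p)^(1/p)."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def parallelogram_gap(x, y, p):
    """Return ||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2 for the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2 * p_norm(x, p) ** 2 - 2 * p_norm(y, p) ** 2)

x, y = [1.0, 0.0], [0.0, 1.0]

print(parallelogram_gap(x, y, 1))  # non-zero: the 1-norm violates the equality
print(parallelogram_gap(x, y, 2))  # zero up to rounding: the 2-norm satisfies it
```

For p = 1 the left-hand side is 4 + 4 = 8 while the right-hand side is 2 + 2 = 4, so no inner product can induce this norm.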
However, inner product spaces have a naturally defined norm based upon the inner product of the space itself that does satisfy the parallelogram equality:
$$\|x\| = \sqrt{\langle x, x\rangle}.$$
This is well defined by the nonnegativity axiom of the definition of inner product space. The norm is thought of as the length of the vector x. Directly from the axioms, we can prove the following:
- Cauchy–Schwarz inequality: for x, y elements of V
  $$|\langle x, y\rangle| \le \|x\| \cdot \|y\|$$
- with equality if and only if x and y are linearly dependent. This is one of the most important inequalities in mathematics. It is also known in the Russian mathematical literature as the Cauchy–Bunyakovsky–Schwarz inequality.
- Orthogonality: The geometric interpretation of the inner product in terms of angle and length motivates much of the geometric terminology we use in regard to these spaces. Indeed, an immediate consequence of the Cauchy–Schwarz inequality is that it justifies defining the angle between two non-zero vectors x and y (denoted ∠) in the case F = R by the identity
  $$\angle(x, y) = \arccos \frac{\langle x, y\rangle}{\|x\| \cdot \|y\|}.$$
- We assume the value of the angle is chosen to be in the interval [0, π]. This is in analogy to the situation in two-dimensional Euclidean space.
- In the case F = C, the angle in the interval [0, π/2] is typically defined by
  $$\angle(x, y) = \arccos \frac{|\langle x, y\rangle|}{\|x\| \cdot \|y\|}.$$
- Correspondingly, we will say that non-zero vectors x and y of V are orthogonal if and only if their inner product is zero.
- Homogeneity: for x an element of V and r a scalar
  $$\|rx\| = |r| \cdot \|x\|.$$
- The homogeneity property is completely trivial to prove.
- Triangle inequality: for x, y elements of V
  $$\|x + y\| \le \|x\| + \|y\|.$$
- The last two properties show the function defined is indeed a norm.
- Because of the triangle inequality and the positive-definiteness axiom, we see that ||·|| is a norm which turns V into a normed vector space and hence also into a metric space. The most important inner product spaces are the ones which are complete with respect to this metric; they are called Hilbert spaces. Every inner product space V is a dense subspace of some Hilbert space. This Hilbert space is essentially uniquely determined by V and is constructed by completing V.
- Pythagorean theorem: Whenever x, y are in V and ⟨x,y⟩ = 0, then
  $$\|x\|^2 + \|y\|^2 = \|x + y\|^2.$$
- The proof of the identity requires only expressing the definition of norm in terms of the inner product and multiplying out, using the property of additivity of each component.
- The name Pythagorean theorem arises from the geometric interpretation of this result as an analogue of the theorem in synthetic geometry. Note that the proof of the Pythagorean theorem in synthetic geometry is considerably more elaborate because of the paucity of underlying structure. In this sense, the synthetic Pythagorean theorem, if correctly demonstrated, is deeper than the version given above.
- An induction on the Pythagorean theorem yields:
- If x_{1}, ..., x_{n} are orthogonal vectors, that is, ⟨x_{j},x_{k}⟩ = 0 for distinct indices j, k, then
  $$\sum_{i=1}^{n} \|x_i\|^2 = \left\|\sum_{i=1}^{n} x_i\right\|^2.$$
- In view of the Cauchy-Schwarz inequality, we also note that ⟨·,·⟩ is continuous from V × V to F. This allows us to extend Pythagoras' theorem to infinitely many summands:
- Parseval's identity: Suppose V is a complete inner product space. If {x_{k}} are mutually orthogonal vectors in V then
  $$\sum_{k=1}^{\infty} \|x_k\|^2 = \left\|\sum_{k=1}^{\infty} x_k\right\|^2$$
- provided the infinite series on the left is convergent. Completeness of the space is needed to ensure that the sequence of partial sums
  $$S_n = \sum_{k=1}^{n} x_k,$$
- which is easily shown to be a Cauchy sequence, is convergent.
- Parallelogram law: for x, y elements of V,
  $$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$
- The parallelogram law is, in fact, a necessary and sufficient condition for the existence of a scalar product corresponding to a given norm. If it holds, the scalar product is defined by the polarization identity:
  $$\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\operatorname{Re}\langle x, y\rangle,$$
- which is a form of the law of cosines.
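Several of the properties listed above can be checked numerically for the induced norm. The following sketch assumes the standard inner product on C^{4} and two pseudo-random vectors; all helper names are illustrative.

```python
import math
import random

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product: sqrt(<x, x>)."""
    return math.sqrt(inner(x, x).real)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

rng = random.Random(1)
x = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4)]
y = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4)]

# Cauchy-Schwarz and triangle inequalities.
cauchy_schwarz_holds = abs(inner(x, y)) <= norm(x) * norm(y)
triangle_holds = norm(add(x, y)) <= norm(x) + norm(y)

# Parallelogram law: the gap should vanish up to rounding.
parallelogram_gap = (norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2
                     - 2 * norm(x) ** 2 - 2 * norm(y) ** 2)

# Polarization identity with the real part of <x, y>.
polarization_gap = (norm(add(x, y)) ** 2
                    - (norm(x) ** 2 + norm(y) ** 2 + 2 * inner(x, y).real))

print(cauchy_schwarz_holds, triangle_holds, parallelogram_gap, polarization_gap)
```

When ⟨x,y⟩ = 0 the polarization identity reduces to the Pythagorean theorem.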
Orthonormal sequences
Let V be a finite dimensional inner product space of dimension n. Recall that every basis of V consists of exactly n linearly independent vectors. Using the Gram–Schmidt process we may start with an arbitrary basis and transform it into an orthonormal basis. That is, into a basis in which all the elements are orthogonal and have unit norm. In symbols, a basis {e_{1}, ..., e_{n}} is orthonormal if ⟨e_{i},e_{j}⟩ = 0 for every i ≠ j and ⟨e_{i},e_{i}⟩ = ||e_{i}|| = 1 for each i.
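The Gram–Schmidt process mentioned above can be sketched as follows. This is one common variant (the "modified" form, which subtracts projections one at a time), written for the standard inner product on C^{n}; the starting basis is an arbitrary illustrative choice.

```python
import math

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def gram_schmidt(basis):
    """Orthonormalize a list of linearly independent vectors (lists of numbers)."""
    orthonormal = []
    for v in basis:
        w = [complex(c) for c in v]
        for e in orthonormal:
            coeff = inner(w, e)  # component of w along the unit vector e
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        length = math.sqrt(inner(w, w).real)
        if length < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        orthonormal.append([wi / length for wi in w])
    return orthonormal

# Hypothetical basis of C^3.
basis = [[1, 1, 0], [1, 0, 1j], [0, 1, 1]]
E = gram_schmidt(basis)

# Gram matrix of the result; it should be the identity up to rounding.
gram = [[inner(ei, ej) for ej in E] for ei in E]
for row in gram:
    print([round(abs(entry), 12) for entry in row])
```

Because the inner product is linear in its first argument, the coefficient of w along e is ⟨w,e⟩ rather than ⟨e,w⟩.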
This definition of orthonormal basis generalizes to the case of infinite-dimensional inner product spaces in the following way. Let V be any inner product space. Then a collection
$$E = \{e_\alpha\}_{\alpha \in A}$$
is a basis for V if the subspace of V generated by finite linear combinations of elements of E is dense in V (in the norm induced by the inner product). We say that E is an orthonormal basis for V if it is a basis and
$$\langle e_\alpha, e_\beta\rangle = 0$$
if α ≠ β and ⟨e_{α},e_{α}⟩ = ||e_{α}|| = 1 for all α, β ∈ A.
Using an infinite-dimensional analog of the Gram-Schmidt process one may show:
Theorem. Any separable inner product space V has an orthonormal basis.
Using the Hausdorff maximal principle and the fact that in a complete inner product space orthogonal projection onto linear subspaces is well-defined, one may also show that
Theorem. Any complete inner product space V has an orthonormal basis.
The two previous theorems raise the question of whether all inner product spaces have an orthonormal basis. The answer, it turns out, is negative. This is a non-trivial result, and is proved below. The following proof is taken from Halmos's A Hilbert Space Problem Book (see the references).^{[citation needed]}
Proof Recall that the dimension of an inner product space is the cardinality of a maximal orthonormal system that it contains (by Zorn's lemma it contains at least one, and any two have the same cardinality). An orthonormal basis is certainly a maximal orthonormal system, but as we shall see, the converse need not hold. Observe that if G is a dense subspace of an inner product space H, then any orthonormal basis for G is automatically an orthonormal basis for H. Thus, it suffices to construct an inner product space H with a dense subspace G whose dimension is strictly smaller than that of H. Let K be a Hilbert space of dimension ℵ_{0} (for instance, K = l^{2}(N)). Let E be an orthonormal basis of K, so |E| = ℵ_{0}. Extend E to a Hamel basis E ∪ F for K, where E ∩ F = ∅. Since it is known that the Hamel dimension of K is c, the cardinality of the continuum, it must be that |F| = c.
Let L be a Hilbert space of dimension c (for instance, L = l^{2}(R)). Let B be an orthonormal basis for L, and let φ : F → B be a bijection. Then there is a linear transformation T : K → L such that Tf = φ( f ) for f ∈ F, and Te = 0 for e ∈ E.
Let H = K ⊕ L and let G = {(k,Tk) : k ∈ K} be the graph of T. Let Ḡ be the closure of G in H; we will show Ḡ = H. Since for any e ∈ E we have (e,0) ∈ G, it follows that K ⊕ 0 ⊂ Ḡ.
Next, if b ∈ B, then b = Tf for some f ∈ F ⊂ K, so ( f,b) ∈ G ⊂ Ḡ; since ( f,0) ∈ Ḡ as well, we also have (0,b) ∈ Ḡ. It follows that 0 ⊕ L ⊂ Ḡ, so Ḡ = H, and G is dense in H.
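Finally, {(e,0) : e ∈ E} is a maximal orthonormal set in G; if
$$0 = \langle (e, 0), (k, Tk)\rangle = \langle e, k\rangle + \langle 0, Tk\rangle = \langle e, k\rangle$$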
for all e ∈ E then certainly k = 0, so (k,Tk) = (0,0) is the zero vector in G. Hence the dimension of G is |E| = ℵ_{0}, whereas it is clear that the dimension of H is c. This completes the proof.
Parseval's identity leads immediately to the following theorem:
Theorem. Let V be a separable inner product space and {e_{k}}_{k} an orthonormal basis of V. Then the map
$$x \mapsto \bigl(\langle x, e_k\rangle\bigr)_{k \in \mathbb{N}}$$
is an isometric linear map V → l^{2} with a dense image.
This theorem can be regarded as an abstract form of Fourier series, in which an arbitrary orthonormal basis plays the role of the sequence of trigonometric polynomials. Note that the underlying index set can be taken to be any countable set (and in fact any set whatsoever, provided l^{2} is defined appropriately, as is explained in the article Hilbert space). In particular, we obtain the following result in the theory of Fourier series:
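Theorem. Let V be the inner product space C[−π,π]. Then the sequence (indexed on the set of all integers) of continuous functions
$$e_k(t) = \frac{e^{ikt}}{\sqrt{2\pi}}$$
is an orthonormal basis of the space C[−π,π] with the L^{2} inner product. The mapping
$$f \mapsto \frac{1}{\sqrt{2\pi}} \left\{\int_{-\pi}^{\pi} f(t)\, e^{-ikt}\, dt\right\}_{k \in \mathbb{Z}}$$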
is an isometric linear map with dense image.
Orthogonality of the sequence {e_{k}}_{k} follows immediately from the fact that if k ≠ j, then
$$\int_{-\pi}^{\pi} e^{-i(j - k)t}\, dt = 0.$$
Normality of the sequence is by design, that is, the coefficients are so chosen so that the norm comes out to 1. Finally the fact that the sequence has a dense algebraic span, in the inner product norm, follows from the fact that the sequence has a dense algebraic span, this time in the space of continuous periodic functions on [−π,π] with the uniform norm. This is the content of the Weierstrass theorem on the uniform density of trigonometric polynomials.
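Orthonormality of the exponentials can be confirmed numerically. The sketch below assumes the usual normalized functions e_{k}(t) = e^{ikt}/√(2π) and approximates the L^{2} inner product on [−π,π] with a midpoint rule; the names `e`, `l2_inner` and `gram_entry` are illustrative.

```python
import cmath
import math

def e(k, t):
    """Normalized complex exponential e^{ikt} / sqrt(2*pi)."""
    return cmath.exp(1j * k * t) / math.sqrt(2 * math.pi)

def l2_inner(f, g, n=2000):
    """Midpoint-rule approximation of the integral of f(t) * conj(g(t)) over [-pi, pi]."""
    h = 2 * math.pi / n
    total = 0j
    for m in range(n):
        t = -math.pi + (m + 0.5) * h
        total += f(t) * g(t).conjugate()
    return total * h

def gram_entry(j, k):
    """Approximate <e_j, e_k> in the L^2 inner product."""
    return l2_inner(lambda t: e(j, t), lambda t: e(k, t))

print(abs(gram_entry(2, 2)))   # close to 1
print(abs(gram_entry(2, 5)))   # close to 0
print(abs(gram_entry(-3, 4)))  # close to 0
```

For periodic integrands like these, an equally spaced rule is accurate to rounding error as long as |j − k| is smaller than the number of sample points.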
Operators on inner product spaces
Several types of linear maps A from an inner product space V to an inner product space W are of relevance:
- Continuous linear maps, i.e., A is linear and continuous with respect to the metric defined above, or equivalently, A is linear and the set of non-negative reals {||Ax||}, where x ranges over the closed unit ball of V, is bounded.
- Symmetric linear operators, i.e., A is linear and ⟨Ax,y⟩ = ⟨x,Ay⟩ for all x, y in V.
- Isometries, i.e., A is linear and ⟨Ax,Ay⟩ = ⟨x,y⟩ for all x, y in V, or equivalently, A is linear and ||Ax|| = ||x|| for all x in V. All isometries are injective. Isometries are morphisms between inner product spaces, and morphisms of real inner product spaces are orthogonal transformations (compare with orthogonal matrix).
- Isometrical isomorphisms, i.e., A is an isometry which is surjective (and hence bijective). Isometrical isomorphisms are also known as unitary operators (compare with unitary matrix).
From the point of view of inner product space theory, there is no need to distinguish between two spaces which are isometrically isomorphic. The spectral theorem provides a canonical form for symmetric, unitary and more generally normal operators on finite dimensional inner product spaces. A generalization of the spectral theorem holds for continuous normal operators in Hilbert spaces.
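As a small illustration of the isometry condition, the sketch below uses a plane rotation on R^{2} with the dot product and contrasts it with a scaling map, which is linear and continuous but not an isometry. The angle and vectors are arbitrary illustrative choices.

```python
import math

def dot(x, y):
    """Dot product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def rotate(theta, x):
    """Rotate a vector of R^2 by the angle theta (an orthogonal transformation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1]]

def scale(factor, x):
    """Multiply a vector by a scalar: linear, but not an isometry unless |factor| = 1."""
    return [factor * a for a in x]

x, y = [3.0, 1.0], [-2.0, 4.0]
theta = 0.7

# An isometry preserves the inner product (and hence the norm).
print(dot(x, y), dot(rotate(theta, x), rotate(theta, y)))

# Scaling by 2 multiplies every inner product by 4.
print(dot(scale(2.0, x), scale(2.0, y)))
```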
Generalizations
Any of the axioms of an inner product may be weakened, yielding generalized notions. The generalizations that are closest to inner products occur where bilinearity and conjugate symmetry are retained, but positive-definiteness is weakened.
Degenerate inner products
If V is a vector space and ⟨·,·⟩ a semi-definite sesquilinear form, then the function:
$$\|x\| = \sqrt{\langle x, x\rangle}$$
makes sense and satisfies all the properties of norm except that ||x|| = 0 does not imply x = 0 (such a functional is then called a semi-norm). We can produce an inner product space by considering the quotient W = V/{x : ||x|| = 0}. The sesquilinear form ⟨·,·⟩ factors through W.
This construction is used in numerous contexts. The Gelfand–Naimark–Segal construction is a particularly important example of the use of this technique. Another example is the representation of semi-definite kernels on arbitrary sets.
Nondegenerate conjugate symmetric forms
Alternatively, one may require that the pairing be a nondegenerate form, meaning that for all non-zero x there exists some y such that ⟨x,y⟩ ≠ 0, though y need not equal x; in other words, the induced map to the dual space V → V^{∗} is injective. This generalization is important in differential geometry: a manifold whose tangent spaces have an inner product is a Riemannian manifold, while if the tangent spaces instead carry a nondegenerate conjugate symmetric form, the manifold is a pseudo-Riemannian manifold. By Sylvester's law of inertia, just as every inner product is similar to the dot product with positive weights on a set of vectors, every nondegenerate conjugate symmetric form is similar to the dot product with nonzero weights on a set of vectors, and the numbers of positive and negative weights are called the positive index and negative index, respectively. The product of vectors in Minkowski space is an example of an indefinite inner product, although, technically speaking, it is not an inner product according to the standard definition above. Minkowski space has four dimensions and indices 3 and 1 (the assignment of "+" and "−" to them differs depending on conventions).
Purely algebraic statements (ones that do not use positivity) usually only rely on the nondegeneracy (the injective homomorphism V → V^{∗}) and thus hold more generally.
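The Minkowski product mentioned above gives a concrete nondegenerate but indefinite form. The sketch below assumes the sign convention (−, +, +, +), which is one of the two conventions in use; the vector names are illustrative.

```python
def minkowski(x, y):
    """Symmetric bilinear form on R^4 with signature (-, +, +, +) (assumed convention)."""
    return -x[0] * y[0] + x[1] * y[1] + x[2] * y[2] + x[3] * y[3]

timelike = [1.0, 0.0, 0.0, 0.0]
spacelike = [0.0, 1.0, 0.0, 0.0]
lightlike = [1.0, 1.0, 0.0, 0.0]

# The form takes negative, positive and zero values on non-zero vectors,
# so it is not positive-definite.
print(minkowski(timelike, timelike))
print(minkowski(spacelike, spacelike))
print(minkowski(lightlike, lightlike))

# Nondegeneracy: the lightlike vector still pairs non-trivially with some vector.
print(minkowski(lightlike, timelike))
```

The non-zero lightlike vector has ⟨x,x⟩ = 0, which is exactly what positive-definiteness forbids, yet it is not orthogonal to every vector.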
Related products
The term "inner product" contrasts with "outer product", which is a slightly more general counterpart. Simply, in coordinates, the inner product is the product of a 1 × n covector with an n × 1 vector, yielding a 1 × 1 matrix (a scalar), while the outer product is the product of an m × 1 vector with a 1 × n covector, yielding an m × n matrix. Note that the outer product is defined for different dimensions, while the inner product requires the same dimension. If the dimensions are the same, then the inner product is the trace of the outer product (trace only being properly defined for square matrices). In a quip: "inner is horizontal times vertical and shrinks down, outer is vertical times horizontal and expands out".
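The coordinate description can be made concrete with real vectors stored as lists. This is a minimal sketch; the function names and sample vectors are illustrative.

```python
def inner(x, y):
    """Row times column: a 1 x n covector applied to an n x 1 vector gives a scalar."""
    return sum(a * b for a, b in zip(x, y))

def outer(x, y):
    """Column times row: an m x 1 vector times a 1 x n covector gives an m x n matrix."""
    return [[a * b for b in y] for a in x]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

x = [1.0, 2.0, 3.0]
y = [4.0, -1.0, 0.5]

# For vectors of equal dimension, the inner product is the trace of the outer product.
print(inner(x, y), trace(outer(x, y)))

# The outer product is also defined for different dimensions.
rectangular = outer([1.0, 2.0], [1.0, 0.0, 1.0])
print(len(rectangular), len(rectangular[0]))
```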
More abstractly, the outer product is the bilinear map W × V^{∗} → Hom(V,W) sending a vector and a covector to a rank 1 linear transformation (simple tensor of type (1,1)), while the inner product is the bilinear evaluation map V^{∗} × V → F given by evaluating a covector on a vector; the order of the domain vector spaces here reflects the covector/vector distinction.
The inner product and outer product should not be confused with the interior product and exterior product, which are instead operations on vector fields and differential forms, or more generally on the exterior algebra.
As a further complication, in geometric algebra the inner product and the exterior (Grassmann) product are combined in the geometric product (the Clifford product in a Clifford algebra) – the inner product sends two vectors (1-vectors) to a scalar (a 0-vector), while the exterior product sends two vectors to a bivector (2-vector) – and in this context the exterior product is usually called the "outer (alternatively, wedge) product". The inner product is more correctly called a scalar product in this context, as the nondegenerate quadratic form in question need not be positive definite (need not be an inner product).
See also
- Space (mathematics)
- Normed vector space
- Energetic space
- Dual space
- Biorthogonal system
- Bilinear form
Notes
- ^ A bar over an expression denotes complex conjugation.
References
- ^ Moore, Gregory H. (1995). "The axiomatization of linear algebra: 1875-1940". Historia Mathematica. 22 (3): 262–303. doi:10.1006/hmat.1995.1025.
- ^ Jain, P. K.; Ahmad, Khalil (1995). "5.1 Definitions and basic properties of inner product spaces and Hilbert spaces". Functional Analysis (2nd ed.). New Age International. p. 203. ISBN 81-224-0801-X.
- ^ Prugovec̆ki, Eduard (1981). "Definition 2.1". Quantum Mechanics in Hilbert Space (2nd ed.). Academic Press. pp. 18ff. ISBN 0-12-566060-X.
- ^ Emch, Gerard G. (1972). Algebraic Methods in Statistical Mechanics and Quantum Field Theory. New York: Wiley-Interscience. ISBN 978-0-471-23900-0.
- ^ Finkbeiner, Daniel T. (2013), Introduction to Matrices and Linear Transformations, Dover Books on Mathematics (3rd ed.), Courier Dover Publications, p. 242, ISBN 9780486279664.
- ^ Ouwehand, Peter (November 2010). "Spaces of Random Variables" (PDF). AIMS. Retrieved 2017-09-05.
- ^ Siegrist, Kyle (1997). "Vector Spaces of Random Variables". Random: Probability, Mathematical Statistics, Stochastic Processes. Retrieved 2017-09-05.
- ^ Bigoni, Daniele (2015). "Appendix B: Probability theory and functional spaces". Uncertainty Quantification with Applications to Engineering Problems (PDF) (PhD). Technical University of Denmark. Retrieved 2017-09-05.
- ^ Jain, P. K.; Ahmad, Khalil (1995). "Example 5". Functional Analysis (2nd ed.). New Age International. p. 209. ISBN 81-224-0801-X.
- ^ Saxe, Karen (2002). Beginning Functional Analysis. Springer. p. 7. ISBN 0-387-95224-1.
Sources
- Axler, Sheldon (1997). Linear Algebra Done Right (2nd ed.). Berlin, New York: Springer-Verlag. ISBN 978-0-387-98258-8.
- Emch, Gerard G. (1972). Algebraic Methods in Statistical Mechanics and Quantum Field Theory. Wiley-Interscience. ISBN 978-0-471-23900-0.
- Young, Nicholas (1988). An Introduction to Hilbert Space. Cambridge University Press. ISBN 978-0-521-33717-5.