
orthogonal matrix

main
Tom Krüger 1 year ago
parent
commit
177f6602ab
3 changed files with 31 additions and 1 deletion
  1. +29 -1 content.tex
  2. BIN main.pdf
  3. +2 -0 main.tex

content.tex

@@ -231,7 +231,17 @@ The final component that still needs to be expressed in the framework of linear a
\Cref{sec:stocastic_matrix_model} formulates mathematical tools to algebraically describe an abstract model of probabilistic computations defined in \cref{sec:probabilistic_model}. This section takes a reverse approach. The tools developed in \cref{sec:stocastic_matrix_model} are based on stochastic matrices, which are an obvious choice to model probabilistic state transitions. Unfortunately, this model has some shortcomings. This section first highlights these inconveniences and then fixes them. By doing so, the model will gain computational power, as demonstrated by an implementation of Deutsch's algorithm. Finally, it will be shown that this extended model is indeed physically realizable.
\subsection{Cleaning Up}
The straightforward and rather simplistic choice of using probability coefficients in \cref{def:nbitRegister,def:state_space_probabilistic_computation} results in quite unwieldy state objects, especially in the linear algebra representation. Of course, the probability mass of a complete sample space must always sum up to 1, demanding the normalization of state vectors by the $\norm{.}_1$ norm. The state space $\mathbf{B}^n$ defined in this way is the set of affine combinations of its basis vectors. For a 1-bit system this corresponds to the line segment from $\mathbf{0}$ to $\mathbf{1}$ (see \cref{fig:affine_comb}). As \cref{thm:state_space_unit_sphere_surface_isomorphism} already suggests, randomized computations can be viewed as rotating a ray around the origin. If computations essentially are rotations, then angles between state vectors seem somewhat important. Of course, with $\mathbf{a}, \mathbf{b} \in \mathbf{B}^n$ it would be possible to calculate the angle between both states by rescaling their dot product by their lengths: $\mathbf{a}^t \mathbf{b} \parens{\abs{\mathbf{a}} \abs{\mathbf{b}}}^{-1}$. State vectors with unit length would greatly simplify angle calculations; then the dot product alone would suffice. Fortunately, \cref{thm:state_space_unit_sphere_surface_isomorphism} states that $\mathbf{B}^n$ is isomorphic to a subset of the surface of the unit sphere. Therefore, it should also be possible to represent the state space by vectors of unit length. To distinguish between both representations, we will write state vectors with coordinates on the unit sphere as $\ket{b}$. This is the standard notation for quantum states. For now, $\bra{b}$ denotes the transpose of $\ket{b}$, and $\braket{b_1}{b_2} = \bra{b_1}\,\ket{b_2}$ is the dot product of $\ket{b_1}$ and $\ket{b_2}$. By definition, the length of $\ket{b} = \sum_{i=1}^N \alpha_i \ket{b_i}$ is 1.
The linear coefficients $\alpha_i$ are not probabilities but so-called probability amplitudes, and the Pythagorean theorem yields $1 = \sum_{i=1}^N \alpha_i^2$. This means that squaring the amplitudes, or taking the square root of the probabilities, maps between affine combinations of basis vectors and points on the unit sphere in the state space. As it turns out, negative amplitudes must be allowed; thus this mapping is ambiguous and \emph{not} an isomorphism. Each point on the unit sphere describes exactly one possible state of a probabilistic computation. This means moving on the $2^n$-dimensional unit sphere can be viewed as a kind of computation on a state space $\mathcal{B}_{\R}^n$.
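The ambiguity of the square-root map can be made concrete with a minimal numerical sketch in plain Python (illustrative only, not part of the thesis source): two distinct 1-bit amplitude vectors square to the same probability distribution.

```python
import math

# Two different 1-bit amplitude vectors (coefficients of the basis
# states |0> and |1>); the second has a negative amplitude.
plus  = (1 / math.sqrt(2),  1 / math.sqrt(2))
minus = (1 / math.sqrt(2), -1 / math.sqrt(2))

def probabilities(ket):
    """Squaring the amplitudes yields the probability coefficients."""
    return tuple(a * a for a in ket)

# Both states square to the distribution (1/2, 1/2), so the
# probability picture cannot tell them apart ...
assert probabilities(plus) == probabilities(minus)

# ... even though they are distinct (in fact orthogonal) unit vectors.
dot = sum(a * b for a, b in zip(plus, minus))
assert abs(dot) < 1e-12
```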
\begin{definition}[Real State Space]
Given a mapping $\pi : \mathscr{B} \to \parens{0,1}^n$ that assigns to each element of an orthonormal basis $\mathscr{B}$ of $\R^N$, with $N = 2^n$, a configuration of an $n$-bit register, the points on the $N$-dimensional unit sphere form a computational state space:
\begin{equation*}
\mathcal{B}_{\R}^n \coloneqq \parensc*{\ket{b} = \sum_{i=1}^N a_i \mathbf{b}_i \in \R^N \:\middle|\: \norm{\ket{b}} = 1,\: \mathbf{b}_i \in \mathscr{B}}
\end{equation*}
with
\begin{equation*}
a_i^2 = P(\pi(\mathbf{b}_i))
\end{equation*}
\end{definition}
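The definition can be instantiated numerically; the following plain-Python sketch (with hypothetical probabilities chosen for illustration) builds a state vector for a biased 1-bit register and checks both conditions of the definition.

```python
import math

# Hypothetical probabilities of a biased 1-bit register:
# P(0) = 0.36, P(1) = 0.64.
p = (0.36, 0.64)

# Taking square roots gives amplitudes a_i with a_i^2 = P(pi(b_i)).
ket = tuple(math.sqrt(x) for x in p)  # (0.6, 0.8)

# The state vector lies on the unit sphere: its 2-norm is 1.
norm = math.sqrt(sum(a * a for a in ket))
assert abs(norm - 1.0) < 1e-12

# Squaring the amplitudes recovers the original probabilities.
assert all(abs(a * a - x) < 1e-12 for a, x in zip(ket, p))
```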
\begin{figure}
\centering
@@ -248,5 +258,23 @@ The straightforward and rather simplistic choice of using probability coefficie
\end{subfigure}
\end{figure}
\subsection{Orthogonal Operators}
The set of operations mapping $\mathcal{B}_{\R}^n$ to $\mathcal{B}_{\R}^n$ is exactly the set of all rotations and rotoreflections, which together form the orthogonal group. Every element of the orthogonal group can be represented by an orthogonal matrix.
\begin{definition}
A matrix $A \in \R^{(n,n)}$ is orthogonal iff
\begin{equation*}
A^{-1} = A^t \Leftrightarrow AA^t = A^tA = \idmat
\end{equation*}
\end{definition}
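The defining identity $AA^t = A^tA = \idmat$ is easy to verify numerically. A minimal sketch in plain Python, using a $2 \times 2$ rotation matrix as the example:

```python
import math

def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A))]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A rotation by an arbitrary angle is orthogonal.
t = 0.3
R = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

# Check A A^t = A^t A = identity (up to floating-point error).
for P in (matmul(R, transpose(R)), matmul(transpose(R), R)):
    assert all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
               for i in range(2) for j in range(2))
```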
\begin{remark}
It is important that orthogonal matrices form a group, because this means that the composition of two orthogonal operators is again orthogonal. Orthogonal computations can therefore be composed from, and decomposed into, other orthogonal computations. This is extremely useful for describing and developing algorithms in this model.
\end{remark}
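The closure property stated in the remark can be checked directly; the following sketch (plain Python, illustrative only) composes a rotation with a reflection and confirms that the product is again orthogonal.

```python
import math

def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A))]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_orthogonal(A):
    P = matmul(A, transpose(A))
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
               for i in range(len(A)) for j in range(len(A)))

t = 0.3
rotation = [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]
reflection = [[1.0,  0.0],   # reflection across the first axis
              [0.0, -1.0]]

composed = matmul(rotation, reflection)
assert is_orthogonal(rotation) and is_orthogonal(reflection)
assert is_orthogonal(composed)  # closure: the composition is orthogonal again
```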
\begin{definition}[Orthogonal Computations]
A computation on the state space $\mathcal{B}_{\R}^n$ is defined by an orthogonal matrix $A \in \R^{(N,N)}$ with $N = 2^n$.
\end{definition}
What does it mean for a matrix to be orthogonal? Let $A = (a_{ij}) = (\mathbf{a}_1,\dots,\mathbf{a}_n) \in \R^{(n,n)}$ be orthogonal. Then it follows directly from $A^tA = (b_{ij}) = \idmat$ that $b_{ij} = \mathbf{a}_i^t \mathbf{a}_j = \delta_{ij}$. Hence, the columns (and rows) of $A$ form an orthonormal basis of $\R^n$. It is also easy to check that $A$ preserves the dot product, making it angle- and length-preserving. Another direct consequence of $AA^t = \idmat$ is the reversibility of orthogonal computations.
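Both consequences named above, dot-product preservation and reversibility via the transpose, can be illustrated with a short plain-Python sketch (example vectors chosen arbitrarily):

```python
import math

def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A))]

def apply(A, v):
    return tuple(sum(A[i][j] * v[j] for j in range(len(v)))
                 for i in range(len(A)))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

t = 0.7
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

x, y = (0.6, 0.8), (1.0, 0.0)

# Orthogonal A preserves the dot product, hence angles and lengths.
assert abs(dot(apply(A, x), apply(A, y)) - dot(x, y)) < 1e-12

# Reversibility: applying A^t undoes A.
restored = apply(transpose(A), apply(A, x))
assert all(abs(r - v) < 1e-12 for r, v in zip(restored, x))
```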
\subsection{Measurements}
\contentsketch{amplitudes not probabilities}
\contentsketch[caption={length preserving transition matrix}]{Transition matrix is not length preserving. Length preserving matrix: orthogonal matrix -> negative coefficients -> interference -> Deutsch's algorithm (new computational power)}
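The interference effect the sketch above alludes to can already be previewed numerically. Assuming negative amplitudes are allowed, the orthogonal Hadamard-type matrix $H = \frac{1}{\sqrt{2}}\left(\begin{smallmatrix}1 & 1\\ 1 & -1\end{smallmatrix}\right)$ maps a certain state to the uniform distribution and, applied again, cancels one branch via a negative amplitude; no stochastic matrix can undo a uniform distribution this way.

```python
import math

def apply(A, v):
    return tuple(sum(A[i][j] * v[j] for j in range(len(v)))
                 for i in range(len(A)))

s = 1 / math.sqrt(2)
H = [[s,  s],
     [s, -s]]        # orthogonal, but with a negative entry

zero = (1.0, 0.0)            # register certainly in configuration 0
uniform = apply(H, zero)     # amplitudes (s, s): uniform distribution
back = apply(H, uniform)     # negative amplitude cancels the |1> branch

assert all(abs(u * u - 0.5) < 1e-12 for u in uniform)
assert abs(back[0] - 1.0) < 1e-12 and abs(back[1]) < 1e-12
```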

main.pdf (BIN)


main.tex

@@ -5,6 +5,7 @@
\usepackage{lmodern}
\usepackage{amsmath, amsthm, amssymb, amsfonts}
\usepackage{dsfont}
\usepackage[mathscr]{euscript}
\usepackage{mathtools}
\usepackage{physics}
@@ -29,6 +30,7 @@
\DeclareMathOperator{\R}{\mathbb{R}}
\DeclareMathOperator{\spanspace}{\text{span}}
\DeclareMathOperator{\Hom}{\text{hom}}
\DeclareMathOperator{\idmat}{\mathds{1}}
\theoremstyle{plain}
\newtheorem{theorem}{Theorem}

