The key observation can already be made in \cref{eq:hadamard_on_superposition}: probability amplitudes can destructively interfere with each other. In \cref{eq:hadamard_on_superposition} this can be seen in the term $(a_0- a_1)$, and in \cref{eq:2_hadamards_on_superposition} the amplitudes cancel each other out exactly, restoring the original input state. It cannot be stressed enough: probability amplitudes are not probabilities. Destructive interference is not possible with the stochastic matrices from \cref{sec:stocastic_matrix_model}, because all their entries are non-negative and therefore can never cancel. The next section shows how interference effects can be utilized to outperform any probabilistic computation.
\subsubsection{Deutsch's Algorithm}
Given a function $f : \parensc{0,1}\to\parensc{0,1}$, the problem at hand is to determine whether $f(0)\stackrel{?}{=} f(1)$. Deterministic and even probabilistic computations need to evaluate $f$ twice, once for each input, in order to answer this question with certainty. Surprisingly, orthogonal computations need only one call to $f$. But how is that possible?
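First, function evaluation needs to be addressed in the context of orthogonal computations. Orthogonality requires all computations to be reversible. But what if $f$ is not injective, e.g. $f(0)= f(1)$? A simple trick solves this dilemma: instead of evaluating $f$ directly, $f$ is wrapped by an orthogonal operator $O_f$. Note that the $\XOR$ operator, written in the basis $\ket{00},\ket{01},\ket{10},\ket{11}$,
\begin{equation*}
\XOR\ket{x,y}\coloneqq\ket{x, y \oplus x},
\qquad
\XOR =
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 1 & 0
\end{pmatrix}
\end{equation*}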
is reversible as $\XOR\ket{x, y \oplus x}=\ket{x, y \oplus x \oplus x}=\ket{x,y}$. From the matrix form it is apparent that $\XOR$ is even orthogonal. Similarly, it follows that $O_f \ket{x,y}\coloneqq\ket{x, y \oplus f(x)}$ is reversible. A closer look reveals that $O_f$ just permutes the basis states depending on $f$; as a permutation matrix it is orthogonal, and it is easily verified that $O_f O_f =\idmat$, so $O_f$ is its own inverse.
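The claim $O_f O_f =\idmat$ can be checked directly on a basis state, and a concrete matrix makes the permutation structure visible. As an illustration only, the constant function $f(x)=1$ is chosen here and the basis is assumed to be ordered as $\ket{00},\ket{01},\ket{10},\ket{11}$:
\begin{align*}
O_f O_f \ket{x,y} &= O_f \ket{x, y \oplus f(x)} = \ket{x, y \oplus f(x) \oplus f(x)} = \ket{x,y},\\
O_f &=
\begin{pmatrix}
0 & 1 & 0 & 0\\
1 & 0 & 0 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 1 & 0
\end{pmatrix}
\qquad\text{for } f(x)=1.
\end{align*}
Every row and every column contains exactly one entry $1$, so the matrix is a permutation matrix, and since it is symmetric its transpose is also its inverse.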
So it is indeed possible to evaluate a function $f : \parensc{0, 1}\to\parensc{0, 1}$ in the orthogonal computational model. If the second qubit $\ket{y}$ is initialized as $\ket{0}$, measuring that qubit in the computational basis after the application of $O_f$ returns the value of $f(x)$. The trick that makes it possible to answer $f(0)\stackrel{?}{=} f(1)$ with one call to $O_f$ is to initialize both qubits in a perfect superposition of $\ket{0}$ and $\ket{1}$. Then the values $f(0)$ and $f(1)$ interfere in $\ket{y}$.
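The effect of $O_f$ on such an input can be sketched as follows. The sketch rests on an assumption about the initial states: the first qubit is taken to be $\sfrac{1}{\sqrt{2}}\parens{\ket{0}+\ket{1}}$ and the second qubit $\sfrac{1}{\sqrt{2}}\parens{\ket{0}-\ket{1}}$, abbreviated here by the shorthand $\ket{-}$. For $f(x)=0$ the operator leaves $\ket{x}\otimes\ket{-}$ unchanged, while for $f(x)=1$ the two terms of $\ket{-}$ swap, which produces a sign:
\begin{align*}
O_f\parens{\ket{x}\otimes\ket{-}} &= \sfrac{1}{\sqrt{2}}\parens{\ket{x, f(x)}-\ket{x, 1\oplus f(x)}} = (-1)^{f(x)}\,\ket{x}\otimes\ket{-},\\
O_f\parens{\sfrac{1}{\sqrt{2}}\parens{\ket{0}+\ket{1}}\otimes\ket{-}} &= \sfrac{1}{\sqrt{2}}\parens{(-1)^{f(0)}\ket{0}+(-1)^{f(1)}\ket{1}}\otimes\ket{-}.
\end{align*}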
That is a lot to digest, but what happened is this: both qubits interfered with each other, and although $f$ was applied to the second qubit, the effect of this operation moved through the amplitudes to the first qubit. The second qubit is no longer important, so only the first one will be considered from now on. The first qubit, viewed as an isolated system, is in state $\sfrac{1}{\sqrt{2}}\parens{(-1)^{f(0)}\ket{0}+(-1)^{f(1)}\ket{1}}$. Applying a Hadamard gate results in