Black Mold


After facing much difficulty in the derivations from this post, and doing some cursory digging into the "advanced" techniques for solving the troublesome recurrence relations enclosed within, I uncovered some information about the Umbral Calculus, developed by one Rev. John Blissard1 in an 1860s series of articles on symbolic "Generic Equations,"2,3 which is altogether more esoteric and fascinating.

The idea is symbolic in nature, and aims to (amongst other things) find a relationship between discrete and continuous calculus. What it lacks in rigor, it more than makes up for with clever and seemingly magical symbolic derivations which may have foreshadowed Heiner's unified Algebra.4

This post is rather unstructured, and more or less serves as a meandering random walk through some applications of the Umbral Calculus which I found interesting and/or useful.

Operators

We'll begin by introducing some notation which simplifies some of the algebra to come. If we consider the fundamental theorem of calculus:

$$\frac{d}{dx} \int f(x)\, dx = f(x)$$

we'll rewrite the derivative and integral operators as $\mathcal{\color{#00B4CE}D\color{black}}$ and $\mathcal{\color{#006EB8}I\color{black}}$:

$$\mathcal{\color{#00B4CE}D\color{black}} \mathcal{\color{#006EB8}I\color{black}} f(x) = f(x)$$

Theory of Discrete Calculus

Similarly, we can rewrite the discrete analogs of differentiation and integration (differences and summations):

$$\mathcal{\color{Orange}\Delta\color{black}} \mathcal{\color{#FDBC42}\Sigma\color{black}} f(x) = f(x)$$

Our goal is to find a relationship to bridge the gap between discrete and classical calculus via some additional operator we'll denote $\color{#613F99}\phi\color{black}$ such that:

$$\begin{aligned} \color{#613F99}\phi\color{black} \mathcal{\color{#00B4CE}D\color{black}} &= \mathcal{\color{Orange}\Delta\color{black}} \color{#613F99}\phi\color{black} \\ \color{#613F99}\phi\color{black} \mathcal{\color{#006EB8}I\color{black}} &= \mathcal{\color{#FDBC42}\Sigma\color{black}} \color{#613F99}\phi\color{black} \end{aligned}$$

From pre-calc, we know that the derivative is the continuous rate of change of a curve, given by the slope of the line tangent to each point. For some function such as $f(x) = x^2$, the derivative with respect to $x$ is $2x$:

In Discrete Calculus, we're concerned with the discrete rate of change, which we measure with $\mathcal{\color{Orange}\Delta\color{black}}$:

| $x$ | $f(x)$ | $\mathcal{\color{Orange}\Delta\color{black}}f(x)$ |
| --- | --- | --- |
| $\cdots$ | $\cdots$ | $\cdots$ |
| -1 | 1 | -1 |
| 0 | 0 | +1 |
| 1 | 1 | +3 |
| 2 | 4 | +5 |
| 3 | 9 | +7 |
| $\cdots$ | $\cdots$ | $\cdots$ |

If we know the value of $\mathcal{\color{Orange}\Delta\color{black}}f(x)$ for a specific value of $x$, we can also compute the value of $f(x+1)$:

$$\begin{aligned} f(x) &= x^2 \\ f(x+1) &= (x+1)^2 \\ &= x^2 + \mathcal{\color{Orange}\Delta x^2\color{black}} \\ \implies \mathcal{\color{Orange}\Delta x^2\color{black}} &= (x+1)^2 - x^2 \\ &= x^2 + 2x + 1 - x^2 \\ &= 2x + 1 \end{aligned}$$

so we arrive at a value of $2x + 1$ for the discrete rate of change, where $\mathcal{\color{Orange}\Delta\color{black}}$ is known as the forward difference operator, the general form of which is:

$$\mathcal{\color{Orange}\Delta f(x)\color{black}} = f(x+1) - f(x)$$
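The definition above translates directly into code. A minimal sketch, where the higher-order helper `delta` is just an illustrative name:

```python
# Forward difference as a higher-order function: delta(f) returns Δf.
def delta(f):
    """Return Δf, where (Δf)(x) = f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

square = lambda x: x ** 2
d_square = delta(square)  # behaves like 2x + 1
```

Evaluating `d_square` at the integers reproduces the third column of the table above.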

Linear Operators

An important property of $\mathcal{\color{Orange}\Delta\color{black}}$ is its linearity – that is, it preserves addition and scalar multiplication such that:

  1. $\mathcal{\color{Orange}\Delta\color{black}}f(x) + \mathcal{\color{Orange}\Delta\color{black}}g(x) = f(x+1) + g(x+1) - f(x) - g(x)$
  2. $\begin{aligned} \mathcal{\color{Orange}\Delta\color{black}}af(x) &= af(x+1) - af(x) \\ &= a\big[f(x+1) - f(x)\big] \\ &= a\mathcal{\color{Orange}\Delta\color{black}}f(x) \end{aligned}$

And, just like in classical, continuous calculus where differentiation is inversely related to integration, forward differencing is the inverse of summation:

$$\begin{aligned} \mathcal{\color{#00B4CE}D\color{black}} \mathcal{\color{#006EB8}I\color{black}} f(x) &= f(x) \\ \mathcal{\color{Orange}\Delta\color{black}} \mathcal{\color{#FDBC42}\Sigma\color{black}} f(x) &= f(x) \end{aligned}$$

This should be evident from our plot of $f(x) = x^2$ above, where:

$$\begin{aligned} 9 &= 3^2 \\ &= 1 + 3 + 5 \\ &= \sum_{n=0}^{2} \mathcal{\color{Orange}\Delta n^2\color{black}} \\ &= \sum_{n=0}^{x-1} \mathcal{\color{Orange}\Delta n^2\color{black}} \\ &= \sum_{n=0}^{x-1} (2n+1) \end{aligned}$$

We can prove that this generalization holds by expanding successive terms of a summation involving $\mathcal{\color{Orange}\Delta\color{black}}$:

$$\begin{aligned} \mathcal{\color{#FDBC42}\sum_{n=a}^{x-1}\color{black}} \mathcal{\color{Orange}\Delta\color{black}} f(n) &= \mathcal{\color{Orange}\Delta\color{black}}f(a) + \mathcal{\color{Orange}\Delta\color{black}}f(a+1) + \cdots + \mathcal{\color{Orange}\Delta\color{black}}f(x-2) + \mathcal{\color{Orange}\Delta\color{black}}f(x-1) \\ &= \big[f(a+1) - f(a)\big] + \big[f(a+2) - f(a+1)\big] + \cdots \\ &\quad + \big[f(x-1) - f(x-2)\big] + \big[f(x) - f(x-1)\big] \\ &= f(x) - f(a) \end{aligned}$$

which looks suspiciously similar to the Fundamental Theorem of Calculus... This is useful because it means that finding a summation formula for e.g. $n^2$ is equivalent to finding a function whose $\mathcal{\color{Orange}\Delta\color{black}}$ is $x^2$, like so:

$$\begin{aligned} \mathcal{\color{#FDBC42}\sum_{n=0}^{x-1}\color{black}} n^2 &= f(x) \iff \mathcal{\color{Orange}\Delta\color{black}}f(x) = x^2 \\ \mathcal{\color{Orange}\Delta x\color{black}} &= (x+1) - x = 1 \\ \mathcal{\color{Orange}\Delta x^2\color{black}} &= (x+1)^2 - x^2 = 2x+1 \\ \mathcal{\color{Orange}\Delta x^3\color{black}} &= (x+1)^3 - x^3 = 3x^2 + 3x + 1 \\ &\vdots \end{aligned}$$

We can rearrange and substitute some of these equations to get a general expression for $x^2$ in terms of the known expressions of $\mathcal{\color{Orange}\Delta x\color{black}}$ and $\mathcal{\color{Orange}\Delta x^2\color{black}}$:

$$\begin{aligned} x &= \frac{1}{2}(\mathcal{\color{Orange}\Delta x^2\color{black}} - 1) \\ \mathcal{\color{Orange}\Delta x^3\color{black}} &= 3x^2 + \frac{3}{2}(\mathcal{\color{Orange}\Delta x^2\color{black}} - 1) + 1 \\ &= 3x^2 + \frac{3}{2}\mathcal{\color{Orange}\Delta x^2\color{black}} - \frac{3}{2} + 1 \\ &= 3x^2 + \frac{3}{2}\mathcal{\color{Orange}\Delta x^2\color{black}} - \frac{1}{2} \end{aligned}$$

And, since $\mathcal{\color{Orange}\Delta x\color{black}} = 1$, we can slap it on the end, which will let us do some additional factoring:

$$\begin{aligned} \mathcal{\color{Orange}\Delta x^3\color{black}} &= 3x^2 + \frac{3}{2}\mathcal{\color{Orange}\Delta x^2\color{black}} - \frac{1}{2}\mathcal{\color{Orange}\Delta x\color{black}} \\ 3x^2 &= \mathcal{\color{Orange}\Delta x^3\color{black}} - \frac{3}{2}\mathcal{\color{Orange}\Delta x^2\color{black}} + \frac{1}{2}\mathcal{\color{Orange}\Delta x\color{black}} \\ x^2 &= \frac{1}{3}\mathcal{\color{Orange}\Delta x^3\color{black}} - \frac{1}{2}\mathcal{\color{Orange}\Delta x^2\color{black}} + \frac{1}{6}\mathcal{\color{Orange}\Delta x\color{black}} \end{aligned}$$

an equation for $x^2$ in terms of $\mathcal{\color{Orange}\Delta\color{black}}$. By linearity, we can factor out the $\mathcal{\color{Orange}\Delta\color{black}}$:

$$x^2 = \mathcal{\color{Orange}\Delta\color{black}}\Big(\frac{1}{3}x^3 - \frac{1}{2}x^2 + \frac{1}{6}x\Big)$$

which is (drum roll) a function whose delta is $x^2$.
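We can sanity-check this result exactly. A quick sketch using rational arithmetic (the names `g` and `delta_g` are just for illustration):

```python
from fractions import Fraction

# Check that g(x) = x^3/3 - x^2/2 + x/6 really has forward difference x^2,
# using exact rationals to avoid float error.
def g(x):
    x = Fraction(x)
    return x ** 3 / 3 - x ** 2 / 2 + x / 6

def delta_g(x):
    return g(x + 1) - g(x)  # should equal x^2
```

And since the sum of `delta_g` telescopes, `g(x) - g(0)` is the sum of the first `x` squares.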

If Frank Lloyd Wright tried his hand at exponentiation

As alluded to before, the inverse of the forward difference is given by indefinite summation:

$$\mathcal{\color{Orange}\Delta\color{black}} \mathcal{\color{#FDBC42}\Sigma\color{black}} f(x) = f(x)$$

which is unique up to a constant term, like the indefinite integral. Again, recall that our overarching goal is to devise some method of "interpolating" between discrete and continuous calculus. The crux of this translation $\color{#613F99}\phi\color{black}$ will be falling powers, which are like a parameterized factorial function:

$$\begin{aligned} x_{\text{\textcircled{n-1}}} &= x(x-1)(x-2)\cdots(x-n+3)(x-n+2) \\ x_{\text{\textcircled{n}}} &= \overbrace{x(x-1)(x-2)\cdots(x-n+2)(x-n+1)}^{n \text{ terms}} \\ (x+1)_{\text{\textcircled{n}}} &= (x+1)x(x-1)(x-2)\cdots(x-n+3)(x-n+2) \end{aligned}$$

Since $x_{\text{\textcircled{n-1}}}$ appears in both of the other expressions, we can sub it in:

$$\begin{aligned} x_{\text{\textcircled{n-1}}} &= x(x-1)(x-2)\cdots(x-n+3)(x-n+2) \\ x_{\text{\textcircled{n}}} &= x_{\text{\textcircled{n-1}}}(x-n+1) \\ (x+1)_{\text{\textcircled{n}}} &= (x+1)x_{\text{\textcircled{n-1}}} \end{aligned}$$

Now let's consider the $\mathcal{\color{Orange}\Delta\color{black}}$ of $x$ falling $n$:

$$\begin{aligned} \mathcal{\color{Orange}\Delta x_{\text{\textcircled{n}}}\color{black}} &= (x+1)_{\text{\textcircled{n}}} - x_{\text{\textcircled{n}}} \\ &= (x+1)x_{\text{\textcircled{n-1}}} - x_{\text{\textcircled{n-1}}}(x-n+1) \\ &= (x+1 - x + n - 1)x_{\text{\textcircled{n-1}}} \\ &= nx_{\text{\textcircled{n-1}}} \end{aligned}$$

which is shaped the same as the power rule for differentiation, just with the exponent and subscript switched...

$$\begin{aligned} \mathcal{\color{#00B4CE}D x^n\color{black}} &= nx^{n-1} \\ \mathcal{\color{Orange}\Delta x_{\text{\textcircled{n}}}\color{black}} &= nx_{\text{\textcircled{n-1}}} \end{aligned}$$
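The discrete power rule is easy to test numerically; a sketch with illustrative helper names:

```python
# Falling power x_(n) = x(x-1)···(x-n+1), and a check of the discrete
# power rule Δ x_(n) = n · x_(n-1).
def falling(x, n):
    out = 1
    for k in range(n):
        out *= x - k
    return out

def delta_falling(x, n):
    return falling(x + 1, n) - falling(x, n)
```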

And therein lies the behavior of the umbral operator:

$$\begin{aligned} \mathcal{\color{#613F99}\phi\color{black}} : \mathbb{R} &\rightarrow \mathbb{R} \\ x^n &\xrightarrow{\mathcal{\color{#613F99}\phi\color{black}}} x_{\text{\textcircled{n}}} \end{aligned}$$

We can observe how $\mathcal{\color{#613F99}\phi\color{black}}$ composes with our difference/differentiation operators by tracing the diagram. E.g. from top left to bottom right, first taking the derivative of some function and then applying $\mathcal{\color{#613F99}\phi\color{black}}$ to it, we get:

$$\mathcal{\color{#613F99}\phi\color{black}}\mathcal{\color{#00B4CE}D\color{black}}x^n = \mathcal{\color{Orange}\Delta\color{black}}\mathcal{\color{#613F99}\phi\color{black}}x^n$$

And, by linearity, this holds for all polynomials, as well as a bunch of other transcendental functions that can be expressed via polynomial series expansions, so for concision we might just write:

$$\mathcal{\color{#613F99}\phi\color{black}}\mathcal{\color{#00B4CE}D\color{black}} = \mathcal{\color{Orange}\Delta\color{black}}\mathcal{\color{#613F99}\phi\color{black}}$$

which kind of begs the question: can't we just divide through by $\mathcal{\color{#613F99}\phi\color{black}}$? No! Like matrix multiplication, operator composition is not commutative. We can, however, multiply through by (apply, composably) its inverse to yield expressions like the following:

$$\begin{aligned} \mathcal{\color{Orange}\Delta\color{black}} &= \mathcal{\color{#613F99}\phi\color{black}}\mathcal{\color{#00B4CE}D\color{black}}\mathcal{\color{#613F99}\phi^{-1}\color{black}} \\ \mathcal{\color{#FDBC42}\Sigma\color{black}} &= \mathcal{\color{#613F99}\phi\color{black}}\mathcal{\color{#006EB8}I\color{black}}\mathcal{\color{#613F99}\phi^{-1}\color{black}} \end{aligned}$$

(and of course, this also applies to summation and integrals by reading the diagram from bottom-right to top-left). This Umbral technique is roughly analogous to Galois Theory5,6 insofar as we're taking a hard problem like integration, translating it into the comparatively easier setting of discrete summation, solving the problem there, and then exfiltrating our findings back out into the "harder" context via this $\mathcal{\color{#613F99}\phi\color{black}}$ operator.

As one last trivial example, we could concretely derive the formula for $\sum n^2$ as:

$$\sum_{n=0}^{x-1} n^2 = \mathcal{\color{#613F99}\phi\color{black}} \int_0^x \mathcal{\color{#613F99}\phi^{-1}\color{black}}n^2\, dn$$
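Carrying this recipe out concretely: $\phi^{-1}n^2 = n^2 + n$ integrates to $x^3/3 + x^2/2$, and $\phi$ then swaps each ordinary power for a falling power. A sketch with exact rationals (`falling` and `sum_of_squares` are illustrative names):

```python
from fractions import Fraction

# φ ∫_0^x φ^{-1} n^2 dn, evaluated via falling powers.
def falling(x, n):
    out = Fraction(1)
    for k in range(n):
        out *= x - k
    return out

def sum_of_squares(x):
    # φ(x^3/3 + x^2/2) = x_(3)/3 + x_(2)/2
    return falling(x, 3) / 3 + falling(x, 2) / 2
```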

Nontrivial $\phi$

None of these techniques are useful unless we know $\mathcal{\color{#613F99}\phi\color{black}}$ and $\mathcal{\color{#613F99}\phi^{-1}\color{black}}$. For the polynomials discussed so far, it's relatively straightforward to derive the values:

$$\begin{aligned} \mathcal{\color{#613F99}\phi x\color{black}} &= x_{\text{\textcircled{1}}} = x \\ \mathcal{\color{#613F99}\phi x^2\color{black}} &= x_{\text{\textcircled{2}}} = x(x-1) \\ \mathcal{\color{#613F99}\phi x^3\color{black}} &= x_{\text{\textcircled{3}}} = x(x-1)(x-2) \\ &\vdots \end{aligned}$$

Using the same substitution strategy used for computing the forward difference, we can attain the inverse:

$$\begin{aligned} \mathcal{\color{#613F99}\phi x\color{black}} = x_{\text{\textcircled{1}}} &= x \\ \mathcal{\color{#613F99}\phi x^2\color{black}} = x_{\text{\textcircled{2}}} &= x(x-1) \\ &= x^2 - x \\ &= x^2 - \mathcal{\color{#613F99}\phi x\color{black}} \\ \implies x^2 &= \mathcal{\color{#613F99}\phi x^2\color{black}} + \mathcal{\color{#613F99}\phi x\color{black}} \\ \mathcal{\color{#613F99}\phi^{-1}x^2\color{black}} &= x^2 + x \end{aligned}$$

Stirling Numbers

And using this method, we can find $\mathcal{\color{#613F99}\phi\color{black}}$ and $\mathcal{\color{#613F99}\phi^{-1}\color{black}}$ for any polynomial:

$$\begin{aligned} \mathcal{\color{#613F99}\phi x\color{black}} &= x \\ \mathcal{\color{#613F99}\phi x^2\color{black}} &= x^2 - x \\ \mathcal{\color{#613F99}\phi x^3\color{black}} &= x^3 - 3x^2 + 2x \\ \mathcal{\color{#613F99}\phi x^4\color{black}} &= x^4 - 6x^3 + 11x^2 - 6x \end{aligned}$$

$$\begin{aligned} \mathcal{\color{#613F99}\phi^{-1} x\color{black}} &= x \\ \mathcal{\color{#613F99}\phi^{-1} x^2\color{black}} &= x^2 + x \\ \mathcal{\color{#613F99}\phi^{-1} x^3\color{black}} &= x^3 + 3x^2 + x \\ \mathcal{\color{#613F99}\phi^{-1} x^4\color{black}} &= x^4 + 6x^3 + 7x^2 + x \end{aligned}$$

The coefficients appearing here are known as the signed Stirling numbers of the first kind and the Stirling numbers of the second kind, respectively.

They have some nice properties. We can compute a coefficient in the next row by multiplying the corresponding coefficient in the current row by the negated row index and adding the coefficient to its right. For example, the $x^3$ term of the fifth row would be:

$$\begin{aligned} \mathcal{\color{#613F99}\phi x^4\color{black}} &= x^4 - 6x^3 + 11x^2 - 6x \\ \mathcal{\color{#613F99}\phi x^5\color{black}} &= ?x^5 + ?x^4 + \color{red}?\color{black}x^3 + ?x^2 + ?x \\ \implies & (-4)(-6) + 11 = 35 \\ \mathcal{\color{#613F99}\phi x^5\color{black}} &= ?x^5 + ?x^4 + \color{red}35\color{black}x^3 + ?x^2 + ?x \end{aligned}$$

For the Stirling numbers of the second kind, we compute the next row's coefficients by multiplying the corresponding coefficient from the previous row by its exponent and adding the coefficient to its right:

$$\begin{aligned} \mathcal{\color{#613F99}\phi^{-1} x^4\color{black}} &= x^4 + 6x^3 + 7x^2 + x \\ \mathcal{\color{#613F99}\phi^{-1} x^5\color{black}} &= x^5 + ?x^4 + \color{red}?\color{black}x^3 + ?x^2 + ?x \\ \implies & (6)(3) + 7 = 25 \\ \mathcal{\color{#613F99}\phi^{-1} x^5\color{black}} &= x^5 + ?x^4 + \color{red}25\color{black}x^3 + ?x^2 + ?x \end{aligned}$$

In order to further underscore the apparent relationship to Pascal's triangle and binomial coefficients, we express these coefficients in brackets and braces:

$$\begin{aligned} \mathcal{\color{#613F99}\phi x^n_k\color{black}} &= \begin{bmatrix} n + 1 \\ k \end{bmatrix} = -n \begin{bmatrix} n \\ k \end{bmatrix} + \begin{bmatrix} n \\ k - 1 \end{bmatrix} \\ \mathcal{\color{#613F99}\phi^{-1} x^n_k\color{black}} &= \begin{Bmatrix} n + 1 \\ k \end{Bmatrix} = k \begin{Bmatrix} n \\ k \end{Bmatrix} + \begin{Bmatrix} n \\ k - 1 \end{Bmatrix} \end{aligned}$$
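These recurrences can be run mechanically to regenerate the tables above. A sketch (row layout and helper names are mine), where a row maps each exponent $k$ to its coefficient and the starting row is $n = 1$, i.e. $\phi x = x$ and $\phi^{-1}x = x$:

```python
# Build rows of both Stirling triangles from their recurrences.
def next_first_kind(row, n):
    # signed first kind: [n+1, k] = -n·[n, k] + [n, k-1]
    return {k: -n * row.get(k, 0) + row.get(k - 1, 0) for k in range(1, n + 2)}

def next_second_kind(row, n):
    # second kind: {n+1, k} = k·{n, k} + {n, k-1}
    return {k: k * row.get(k, 0) + row.get(k - 1, 0) for k in range(1, n + 2)}

first, second = {1: 1}, {1: 1}
for n in range(1, 5):
    first = next_first_kind(first, n)
    second = next_second_kind(second, n)
# first/second now hold the coefficients of φx^5 and φ^{-1}x^5
```

Running this reproduces the 35 and 25 computed in the worked examples (as the $x^3$ coefficients of the fifth rows).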

These sequences lie at the heart of Umbral calculus, and have natural applications in combinatorics as well.

$${n \choose k} = \frac{n_{\text{\textcircled{k}}}}{k!}$$

that is, $n$ choose $k$ can be thought of as the number of ordered ways to take $k$ objects from $n$ objects, divided by the number of permutations of $k$ objects; the quotient is the number of ways to choose $k$ objects from $n$ objects, ignoring their orderings. We can use this identity to compute $\mathcal{\color{#613F99}\phi\color{black}}$ for some other functions like:

$$\begin{aligned} e^{ax} &= \sum^{\infty}_{n=0} \frac{a^n x^n}{n!} \\ \mathcal{\color{#613F99}\phi e^{ax}\color{black}} &= \sum^{\infty}_{n=0} \frac{a^n x_{\text{\textcircled{n}}}}{n!} \\ &= \sum^{\infty}_{n=0} a^n {x \choose n} \end{aligned}$$

which looks an awful lot like the Binomial Theorem for $b=1$:

$$\begin{aligned} (a+b)^x &= \sum^{\infty}_{n=0} {x \choose n} a^n b^{x-n} \\ (a+1)^x &= \sum^{\infty}_{n=0} {x \choose n} a^n \cdot 1 \\ \implies \mathcal{\color{#613F99}\phi e^{ax}\color{black}} &= (a+1)^x \end{aligned}$$

And, knowing $\mathcal{\color{#613F99}\phi e^{ax}\color{black}}$, we can also tap into some trigonometric functions to derive equations for their discrete $\leftrightarrow$ continuous mappings:

$$\begin{aligned} \sin x &= -\frac{i}{2}(e^{ix} - e^{-ix}) \\ \mathcal{\color{#613F99}\phi \sin x\color{black}} &= -\frac{i}{2}(\mathcal{\color{#613F99}\phi e^{ix}\color{black}} - \mathcal{\color{#613F99}\phi e^{-ix}\color{black}}) \\ &= -\frac{i}{2}\big((1+i)^x - (1-i)^x\big) \\ &= -\frac{i}{2}\big((\sqrt{2}e^{i\pi/4})^x - (\sqrt{2}e^{-i\pi/4})^x\big) \\ &= -\sqrt{2^x}\,\frac{i}{2}\big(e^{ix\pi/4} - e^{-ix\pi/4}\big) \\ &= \sqrt{2^x}\sin\frac{x\pi}{4} \end{aligned}$$

And similarly, we get a formula for $\mathcal{\color{#613F99}\phi \cos x\color{black}}$:

$$\mathcal{\color{#613F99}\phi \cos x\color{black}} = \sqrt{2^x}\cos\frac{x\pi}{4}$$

which in turn gives rise to other classical results:

$$\begin{aligned} \mathcal{\color{#00B4CE}D\color{black}}e^x = e^x &\iff \mathcal{\color{Orange}\Delta\color{black}}2^x = 2^x \\ \mathcal{\color{#00B4CE}D\color{black}}\sin x = \cos x &\iff \mathcal{\color{Orange}\Delta\color{black}}\mathcal{\color{#613F99}\phi \sin x\color{black}} = \mathcal{\color{#613F99}\phi \cos x\color{black}} \end{aligned}$$
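These analogies are easy to spot-check numerically. A sketch assuming $\phi e^x = 2^x$, $\phi \sin x = \sqrt{2^x}\sin(x\pi/4)$, and $\phi \cos x = \sqrt{2^x}\cos(x\pi/4)$ as derived above:

```python
import math

# Numeric spot-checks of Δ2^x = 2^x and Δφsin = φcos.
def delta(f):
    """Forward difference: (Δf)(x) = f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

def phi_exp(x):
    return 2.0 ** x

def phi_sin(x):
    return math.sqrt(2.0 ** x) * math.sin(x * math.pi / 4)

def phi_cos(x):
    return math.sqrt(2.0 ** x) * math.cos(x * math.pi / 4)
```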

We can take the Umbral approach a step further, upwards in the direction of Category Theory, by generalizing what we've observed about calculus operations to number theory and arithmetic.

Operational Calculus

Briefly revisiting the behavior of Umbral operations in conjunction with linear operators, we'll motivate discussion of their applicability to differential equations, recurrence relations, exponential shifting, and more.

Consider the number line of reals, together with arithmetic multiplication and addition (i.e. the field $\langle\mathbb{F}, +, \times\rangle$). Instead of thinking of a number as a point on the line, let's shift our representational understanding of numbers to be operations themselves, and of addition and multiplication to be shifts and stretches applied to these numeral operators. E.g., the point $2$, together with other operators, can be thought of as:

$$\begin{aligned} (\color{red}2\color{black}, 5) &\xmapsto{\color{red}\times\color{black}} 10 \\ (-1, \color{red}2\color{black}) &\xmapsto{\color{red}+\color{black}} 1 \end{aligned}$$

The notion of addition can then be thought of as the translation which maps $0 \mapsto 2$ (shifting the whole number line accordingly), same with multiplication:

  • two points combined via the extraneous notion of $\color{red}\times\color{black}$:
    $$(2,3) \xmapsto{\color{red}\times\color{black}} 6$$
  • two, as a point, on which we can apply the transformation $\color{red}\times 3\color{black}$:
    $$2 \xmapsto{\color{red}\times 3\color{black}} 6$$
  • The converse: three, as a point, on which we can apply the transformation $\color{red}\times 2\color{black}$:
    $$3 \xmapsto{\color{red}\times 2\color{black}} 6$$
  • and finally, the composition(s) of those transformations acting on $1$:
    $$1 \xmapsto{\color{red}\times 2\color{black}} \xmapsto{\color{red}\times 3\color{black}} 6$$

Considering numbers themselves as transformations acting on functions, we can also write:

$$\color{red}(2)(3)\color{black}f(x) = \color{red}6\color{black}f(x)$$

Numbers as operators (distinct from the implicit shorthand multiplication where we place numbers next to one another sans $\times, \cdot$) are themselves linear operators since they maintain:

  1. additivity: $\color{red}a(b+c) = ab + ac\color{black}$
  2. homogeneity: they commute with scalar multiplication

So, the above expression is more explicitly written:

$$\color{red}2\color{black} \circ \color{red}3\color{black} \circ f(x) = \color{red}6\color{black} \circ f(x)$$

And, similarly, this applies to addition as well:

$$\begin{aligned} \color{red}2\color{black} + \color{red}3\color{black} &= \color{red}5\color{black} \\ \color{red}2\color{black} \circ f(x) + \color{red}3\color{black} \circ f(x) &= \color{red}5\color{black} \circ f(x) \end{aligned}$$

Examining a non-trivial linear operator which purportedly ought to follow similar rules: can we make sense of something like

$$\mathcal{\color{#00B4CE}D\color{black}} + 1$$

In the context of composing operators, it's useful to divorce ourselves from the specifics of their application – only bearing in mind7 the category/domain of our contextual applicability, which for us so far is just $\mathbb{R}$:

Visually, we don't have to know or care about the values of our unlabeled points, just the transformations connecting them. Let's zoom in on the derivative operator $\mathcal{\color{red}D\color{black}}$ (also colored red to emphasize the representational equivalence to numeral operators). As established before, it's also a linear operator, preserving scalar addition and distributing & commuting under scalar multiplication:

$$\begin{aligned} \mathcal{\color{red}D\color{black}}a &= a\mathcal{\color{red}D\color{black}} \\ \mathcal{\color{red}D\color{black}}(af(x) + bf(x)) &= \mathcal{\color{red}D\color{black}}af(x) + \mathcal{\color{red}D\color{black}}bf(x) \end{aligned}$$

Note here that $a, b$ don't ever have to be scalars. They could also be any other linear operator. So, given an expression like:

$$(\mathcal{\color{red}D + 2\color{black}})(\mathcal{\color{red}D + 3\color{black}})$$

we can apply intuitive rules of algebra to derive:

$$\begin{aligned} (\mathcal{\color{red}D + 2\color{black}})(\mathcal{\color{red}D + 3\color{black}}) &= (\mathcal{\color{red}D + 2\color{black}})\mathcal{\color{red}3\color{black}} + (\mathcal{\color{red}D + 2\color{black}})\mathcal{\color{red}D\color{black}} \\ &= \mathcal{\color{red}D^2 + 2D + 3D + 6\color{black}} \\ &= \mathcal{\color{red}D^2 + 5D + 6\color{black}} \end{aligned}$$

Which we can apply to a second order differential equation like so:

$$\mathcal{\color{red}D^2\color{black}}f(x) + \mathcal{\color{red}5D\color{black}}f(x) + \mathcal{\color{red}6\color{black}}f(x) = 0$$

We can just factor it into two maybe-much-simpler equations:

$$\begin{aligned} (\mathcal{\color{red}D + 2\color{black}})(\mathcal{\color{red}D + 3\color{black}})f(x) &= 0 \\ (\mathcal{\color{red}D + 2\color{black}})f(x) = 0, &\quad (\mathcal{\color{red}D + 3\color{black}})f(x) = 0 \\ \mathcal{\color{red}D\color{black}}f(x) = \color{red}-2\color{black}f(x), &\quad \mathcal{\color{red}D\color{black}}f(x) = \color{red}-3\color{black}f(x) \end{aligned}$$

by the principle of superposition, which we can combine to yield a general solution to the equation:

$$f(x) = c_1 e^{-2x} + c_2 e^{-3x}$$

So we can discuss and manipulate differential equations of any order using purely algebraic mechanics.
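As a numeric sanity check of that factored solution, a sketch with arbitrary constants $c_1, c_2$ and the derivatives written out analytically (to avoid finite-difference error):

```python
import math

# f(x) = c1·e^{-2x} + c2·e^{-3x} should satisfy f'' + 5f' + 6f = 0.
c1, c2 = 1.7, -0.4  # arbitrary constants

def f(x):
    return c1 * math.exp(-2 * x) + c2 * math.exp(-3 * x)

def df(x):
    return -2 * c1 * math.exp(-2 * x) - 3 * c2 * math.exp(-3 * x)

def ddf(x):
    return 4 * c1 * math.exp(-2 * x) + 9 * c2 * math.exp(-3 * x)

def residual(x):
    return ddf(x) + 5 * df(x) + 6 * f(x)  # should vanish everywhere
```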

Unit Shifting

Let's consider another "outside-the-box" linear operator called the unit shift, denoted $\mathcal{\color{red}T\color{black}}$, defined by:

$$\begin{aligned} \mathcal{\color{red}T\color{black}}f(x) &= f(x+1) \\ \mathcal{\color{red}T^{-1}\color{black}}f(x) &= f(x-1) \\ \mathcal{\color{red}T^{n}\color{black}}f(x) &= f(x+n) \end{aligned}$$

We can then express the forward difference operator $\mathcal{\color{Orange}\Delta\color{black}}$ in terms of the unit shift $\mathcal{\color{red}T\color{black}}$:

$$\begin{aligned} \mathcal{\color{Orange}\Delta\color{black}}f(x) &= \mathcal{\color{red}T\color{black}}f(x) - \color{red}1\color{black}f(x) \\ \mathcal{\color{Orange}\Delta\color{black}} &= \mathcal{\color{red}T\color{black}} - \color{red}1\color{black} \end{aligned}$$

Incidentally, just as the terms of $\color{#613F99}\phi\color{black}$ gave us Stirling numbers, $\mathcal{\color{red}T\color{black}}$ gives us binomial coefficients:

$$\begin{aligned} \mathcal{\color{red}T\color{black}}x &= x + 1 \\ \mathcal{\color{red}T\color{black}}x^2 &= x^2 + 2x + 1 \\ \mathcal{\color{red}T\color{black}}x^3 &= x^3 + 3x^2 + 3x + 1 \\ \mathcal{\color{red}T\color{black}}x^4 &= x^4 + 4x^3 + 6x^2 + 4x + 1 \\ &\vdots \end{aligned}$$

And, just as we used $\mathcal{\color{#00B4CE}D\color{black}}$ to solve differential equations, we can use $\mathcal{\color{Orange}\Delta\color{black}}$ to solve discrete difference equations:

$$\begin{aligned} \mathcal{\color{Orange}\Delta^2\color{black}}f(x) + \mathcal{\color{Orange}5\Delta\color{black}}f(x) + \mathcal{\color{Orange}6\color{black}}f(x) &= 0 \\ \mathcal{\color{Orange}(\Delta + 2)(\Delta + 3)\color{black}}f(x) &= 0 \\ \implies f(x) &= c_1\color{#613F99}\phi\color{black}e^{-2x} + c_2\color{#613F99}\phi\color{black}e^{-3x} \end{aligned}$$

Similarly, we can use $\mathcal{\color{red}T\color{black}}$ to solve recurrence relations:

$$\begin{aligned} \mathcal{\color{red}(T^2 + 5T + 6)\color{black}}f(x) &= 0 \\ \mathcal{\color{red}(T + 2)(T + 3)\color{black}}f(x) &= 0 \\ \implies f(x) &= c_1(-2)^x + c_2(-3)^x \end{aligned}$$

which, in turn, means we can solve recurrence relations just like we would solve differential equations:

$$\begin{aligned} f(x+1) &= \mathcal{\color{red}T\color{black}}f(x) \\ f(0) &= k \\ f(1) &= \mathcal{\color{red}T\color{black}}k \\ f(2) &= \mathcal{\color{red}T^2\color{black}}k \\ &\vdots \\ \therefore f(x) &= \mathcal{\color{red}T^x\color{black}}k \end{aligned}$$

Concretely, for e.g. the Fibonacci sequence where:

$$\begin{aligned} f(x+2) &= f(x+1) + f(x) \\ f(0) &= 0 \\ f(1) &= 1 \end{aligned}$$

We would express this in terms of unit shifts like so:

$$\begin{aligned} \mathcal{\color{red}T^2\color{black}}f &= \mathcal{\color{red}T\color{black}}f + \mathcal{\color{red}1\color{black}}f \\ \mathcal{\color{red}(T^2 - T - 1)\color{black}}f &= 0 \\ \mathcal{\color{red}(T - \frac{1 + \sqrt{5}}{2})(T - \frac{1 - \sqrt{5}}{2})\color{black}}f &= 0 \\ \mathcal{\color{red}(T - \varphi)(T - (-\varphi)^{-1})\color{black}}f &= 0 \end{aligned}$$

(Note that the $\varphi$ used here is the golden ratio and its conjugate, not to be confused with our transformation $\color{#613F99}\phi\color{black}$). Using the same superposition principle, we get a general solution:

$$f(x) = \varphi^x k_1 + (-\varphi)^{-x} k_2$$

And the constants are given by the initial conditions:

$$\begin{aligned} f(0) = 0 &= \varphi^0 k_1 + (-\varphi)^0 k_2 \\ 0 &= k_1 + k_2 \\ \implies k_1 &= -k_2 \\ \implies f(x) &= \big(\varphi^x - (-\varphi)^{-x}\big)k_1 \\ f(1) &= \big(\varphi^1 - (-\varphi)^{-1}\big)k_1 = 1 \\ \implies k_1 &= \frac{1}{\varphi + \varphi^{-1}} \\ &= \frac{1}{\frac{1 + \sqrt{5}}{2} + \frac{\sqrt{5} - 1}{2}} \\ &= \frac{1}{\sqrt{5}} \\ \implies k_2 &= -\frac{1}{\sqrt{5}} \end{aligned}$$

And so the closed form for the Fibonacci sequence, via the same machinery we used for differential equations, is:

$$f(x) = \frac{\varphi^x - (-\varphi)^{-x}}{\sqrt{5}}$$
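A quick check of this closed form against the plain recurrence (the `round` absorbs floating-point error, which stays well below 0.5 for small $x$):

```python
import math

# Binet-style closed form vs. the iterative recurrence.
PHI = (1 + math.sqrt(5)) / 2

def fib_closed(x):
    return round((PHI ** x - (-PHI) ** (-x)) / math.sqrt(5))

def fib_iter(x):
    a, b = 0, 1
    for _ in range(x):
        a, b = b, a + b
    return a
```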

What about a less-trivial example, i.e. one where the right hand side of the equation isn't zero:

$$\begin{aligned} y - \frac{dy}{dx} &= x^3 \\ \color{#00B4CE}1\color{black}y - \mathcal{\color{#00B4CE}D\color{black}}y &= x^3 \\ (\mathcal{\color{#00B4CE}1 - D\color{black}})y &= x^3 \\ y &= (\mathcal{\color{#00B4CE}1 - D\color{black}})^{-1}x^3 \end{aligned}$$

So how do we find the inverse of something like this? The trick is to notice that the product rule for differentiation can give us something shaped like:

$$\color{#00B4CE}1\color{black}y - \mathcal{\color{#00B4CE}D\color{black}}y = x^3$$

such as:

$$\begin{aligned} \mathcal{\color{#00B4CE}D\color{black}}(fy) &= f\mathcal{\color{#00B4CE}D\color{black}}y + y\mathcal{\color{#00B4CE}D\color{black}}f \\ \mathcal{\color{#00B4CE}D\color{black}}(e^{-x}y) &= e^{-x}\mathcal{\color{#00B4CE}D\color{black}}y + y\mathcal{\color{#00B4CE}D\color{black}}e^{-x} \\ &= e^{-x}\mathcal{\color{#00B4CE}D\color{black}}y - e^{-x}y \\ e^{-x}\mathcal{\color{#00B4CE}D\color{black}}y &= \mathcal{\color{#00B4CE}D\color{black}}(e^{-x}y) + e^{-x}y \end{aligned}$$

which has the same form as our target equation, just multiplied by $e^{-x}$:

$$\begin{aligned} \color{#00B4CE}1\color{black}y - \mathcal{\color{#00B4CE}D\color{black}}y &= x^3 \\ e^{-x}y - \mathcal{\color{#00B4CE}D\color{black}}(e^{-x}y) - e^{-x}y &= e^{-x}x^3 \\ \mathcal{\color{#00B4CE}D\color{black}}(e^{-x}y) &= -e^{-x}x^3 \end{aligned}$$

and we do know how to compute the inverse of the right-hand side, albeit tediously, through integration by parts (not once, not twice, ...):

$$\begin{aligned} e^{-x}y &= -\int e^{-x}x^3\, dx \\ &= e^{-x}x^3 - 3\int e^{-x}x^2\, dx \\ &= e^{-x}x^3 + 3e^{-x}x^2 - 6\int e^{-x}x\, dx \\ &= e^{-x}x^3 + 3e^{-x}x^2 + 6e^{-x}x - 6\int e^{-x}\, dx \\ &= e^{-x}x^3 + 3e^{-x}x^2 + 6e^{-x}x + 6e^{-x} + c \\ y &= x^3 + 3x^2 + 6x + 6 + ce^{x} \end{aligned}$$

Which means that we can shift a differential equation to make it easier to solve by multiplying through by an exponential. This is the exponential shift theorem.

$$e^{ax}\mathcal{\color{red}(D + a)\color{black}}y = \mathcal{\color{red}D\color{black}}(e^{ax}y)$$
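We can verify the worked example above numerically: $y = x^3 + 3x^2 + 6x + 6 + ce^x$ should satisfy $y - y' = x^3$ for any constant of integration $c$. A sketch with the derivative written out analytically:

```python
import math

# Check the particular + homogeneous solution of y - y' = x^3.
c = 2.5  # arbitrary constant of integration

def y(x):
    return x ** 3 + 3 * x ** 2 + 6 * x + 6 + c * math.exp(x)

def dy(x):
    return 3 * x ** 2 + 6 * x + 6 + c * math.exp(x)
```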

Series Expansion

Alternatively, we could've just expanded $(\mathcal{\color{#00B4CE}1 - D\color{black}})^{-1}$ as a geometric series:

$$\begin{aligned} y &= \mathcal{\color{red}(1 + D + D^2 + D^3 + \cdots)\color{black}}x^3 + \mathcal{\color{red}(1 - D)^{-1}\color{black}}0 \\ &= x^3 + 3x^2 + 6x + 6 + ce^x \end{aligned}$$

We can use this approach to express any polynomial as a Taylor Series centered at an arbitrary point $a$ like so:

$$f(x + a) = \sum_{k=0}^\infty \frac{x^k \mathcal{\color{red}D^k\color{black}}f(a)}{k!}$$

Here, $\mathcal{\color{red}D\color{black}}$ is still differentiation with respect to $x$, applied to $f$ evaluated at $a$. Note that the left hand side of this equation is a unit shift of $x$ units applied to $a$:

$$\mathcal{\color{red}T^x\color{black}}f(a) = \sum_{k=0}^\infty \frac{x^k \mathcal{\color{red}D^k\color{black}}f(a)}{k!}$$

and the sum on the right is just the sum we get if we expand $e^{x\mathcal{\color{red}D\color{black}}}$:

$$\mathcal{\color{red}T^x\color{black}} = e^{x\mathcal{\color{red}D\color{black}}}$$
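Expanding $e^{h\mathcal{D}}$ term by term on a polynomial really does shift its argument. A sketch over coefficient lists (the helper names here are illustrative, not from the post):

```python
# (e^{hD} p)(x) should equal p(x + h) for any polynomial p.
def poly_eval(coeffs, x):
    # coeffs[i] is the coefficient of x^i
    return sum(c * x ** i for i, c in enumerate(coeffs))

def poly_deriv(coeffs):
    return [i * c for i, c in enumerate(coeffs)][1:]

def exp_hD(coeffs, h, x):
    """Evaluate sum_k h^k (D^k p)(x) / k! -- the series for e^{hD}."""
    total, k, fact = 0.0, 0, 1
    while coeffs:
        total += h ** k * poly_eval(coeffs, x) / fact
        coeffs = poly_deriv(coeffs)  # repeated differentiation terminates
        k += 1
        fact *= k
    return total

p = [6, 0, -2, 1]  # p(x) = x^3 - 2x^2 + 6
```

The series terminates because differentiating a polynomial eventually yields zero, so no truncation is needed.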

So, just like exponentiating a complex number of the form $a + bi$ corresponds to the composition of a stretch action and an orthogonal rotation action, so too does exponentiation of the derivative operator produce a scalar stretch and orthogonal shift:

So, if we exponentiate $\mathcal{\color{red}a + bD\color{black}}$, we get the composition of scalar stretch and orthogonal shift:

which finds application in describing iterated functions of the form:

$$e^{a\beta(x)\mathcal{\color{red}D\color{black}}} f(x) = f(h^{-1}(h(x) + a))$$

Footnotes


  1. A peer of Cayley, whose work was instrumental to the techniques discussed in Librarians, Luhn, and Lizard Brain

  2. Blissard, John. "Theory of generic equations." Quarterly Journal of Pure and Applied Mathematics 4, 279-305, 1861.

  3. Blissard, John. "Theory of generic equations." Quarterly Journal of Pure and Applied Mathematics 5, 58-75 and 185-208, 1862.

    And it's actually despicable that I can't find a PDF of these, nor the de facto biography on him and his methods (The history of Blissard’s symbolic method with a sketch of its inventor’s life by E.T. Bell) anywhere. Aaron Swartz wept.

  4. Thynges that are thynges, that ARE other thynges

  5. Galois Theory 1: Prelude to the Nonexistence of the General Quintic

  6. Galois Theory 2: The General Quintic is Dead and Évariste Killed Him

  7. Andy and I, Uh...