How Does a Compression Curve Work?
A few months ago, a friend asked me about algorithms for interpreting the force on a digital keyboard. He wanted to program a virtual MIDI device to change how hard he seemed to play notes.
Some keyboards feel kind of dull to play, like you really have to hit the keys hard to go from quiet to loud. He didn't want that -- soft notes should be really soft, and loud notes should be really loud, but in-between notes should also be very responsive to small changes in force. And to top it off, he needed the intensity of this effect to be tunable with a dial!
The effect he was trying to code is called expansion: it's a function that keeps small numbers small, keeps big numbers big, and makes intermediate numbers pick a lane. Its opposite, called compression, is more common. A compression function keeps the smallest numbers small and the biggest numbers big, but everything in between gets smushed towards a blah middle value. Here I present the writeup I sent my friend, unaltered from its original form.
Purpose
$\mu$ compression is an algorithm for taking a number between $0$ and $1$, and "compressing" it towards 1 while still allowing extreme values near $0$ to stay extreme. We also want to control how intense the compression is using a gain parameter, called $\mu$ (pronounced "mew," like the Pokémon).
Compressing with an arbitrary function
$\mu$ compression uses logarithms, but we could in theory use any old function, as long as it goes through the point $(0,0)$. Let's just make up a function called $f$, and the only things we'll know about it are that $f(0) = 0$ and that $f$ is strictly increasing: bigger inputs will always give us bigger outputs.
Now, $f(1)$ is some number, and that number is greater than $0$, because $f$ is strictly increasing and $f(0) = 0$. It's a constant -- no matter the value of $x$ or anything else, $f(1)$ is just some specific number that depends only on what our function $f$ really is. But we have no idea what $f(1)$ is, because we haven't decided on $f$ yet. In fact, we don't care what $f(1)$ is, because we are about to do some trickery to force it where we want it. We are going to make a new function $g$ defined as follows:
$$g(x) = \frac{f(x)}{f(1)}$$
What did we do there? We've proportionally scaled the whole function vertically. Every output of $g$ is the corresponding output of $f$, divided by the constant $f(1)$. In particular, we've scaled things so that $g(1)$ has to equal $1$ no matter what $f$ was. Check it out: when $x$ is $1$, we get $g(1) = \frac{f(1)}{f(1)} = 1$. This means that $g$ will still send an input of $1$ to an output of $1$, even if it messes with numbers in between.
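To make this concrete, here's a minimal Python sketch. The choice of $f(x) = e^x - 1$ is mine, just a stand-in for "any old function" that is strictly increasing with $f(0) = 0$ but whose value at $1$ isn't conveniently $1$:

```python
import math

def f(x):
    # A stand-in for "any old function": strictly increasing, with f(0) = 0.
    # Here f(x) = e^x - 1, so f(1) = e - 1, about 1.718 -- not 1.
    return math.expm1(x)

def g(x):
    # Divide by the constant f(1) to pin the curve to g(1) = 1.
    return f(x) / f(1)

print(g(0))    # 0.0    -- still passes through the origin
print(g(0.5))  # ~0.378 -- in-between values get reshaped
print(g(1))    # 1.0    -- forced to 1, no matter what f(1) was
```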
Adding a parameter
Our function $g$ gives a nice way of applying the shape of any increasing function $f$ to the range between $0$ and $1$. But what if we want a dial to change the shape of the function? The trick with $\mu$ compression is that we aren't just going to stretch $f$ vertically, but also horizontally. We will need to introduce our parameter $\mu$ like this:
$$h(x) = \frac{f(\mu x)}{f(\mu)}$$
By multiplying $x$ by $\mu$ before we run it through our function $f$, we are having $x$ "pretend" that it is $\mu$ times bigger. For example, if $\mu = 100$ and $x = 0.5$, our new function will treat $0.5$ like $50$, meaning the function $h$ is 100 times skinnier than $f$ was. Then we use the same trick as before to make sure that $h(1) = 1$, because $h(1) = \frac{f(\mu \cdot 1)}{f(\mu)} = 1$.
The ultimate effect of this is that bigger values of $\mu$ let us shrink down more distant parts of $f$ to be usable for the small numbers we care about.
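Here's a sketch of the dial in action, reusing the same stand-in $f$ from before. (One honest caveat: with this particular convex $f$, bigger $\mu$ pushes middle values down toward $0$ -- that's expansion. It takes a concave $f$, like the logarithm coming up, to push them up toward $1$.)

```python
import math

def f(x):
    # Same stand-in as before: strictly increasing, f(0) = 0.
    return math.expm1(x)

def h(x, mu):
    # Stretch horizontally by mu before applying f, then renormalize
    # so that h(1, mu) == 1 for every setting of the dial.
    return f(mu * x) / f(mu)

for mu in (1, 10, 100):
    print(mu, h(0.5, mu))  # 0.3775..., 0.00669..., ~1.9e-22
```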
Extending to negatives
In practice, we also want our function to work for $x$ values all the way down to $-1$. We can use a quick cheat to copy our function, but upside down. Instead of putting in $x$, we're going to put in the absolute value $|x|$, so negative numbers will give the same answers as their positive counterparts. Then we just flip any negative number's output upside down. That is, we'll multiply by the "sign" of $x$: outputs for positive inputs get multiplied by a useless $1$, while outputs for negative inputs get multiplied by $-1$. This gives a final general formula:
$$y(x) = \text{sgn}(x) \frac{f(\mu |x|)}{f(\mu)}$$
From here on in, we don't have to worry about the negatives. This formula gives them to us for free.
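In code, the mirroring is one line. A sketch, still with the stand-in $f$ (Python has no built-in sign function, so I compute one inline):

```python
import math

def f(x):
    # Still the stand-in: strictly increasing, f(0) = 0.
    return math.expm1(x)

def y(x, mu):
    # sgn(x) * f(mu|x|) / f(mu): odd symmetry handles negatives for free.
    sgn = (x > 0) - (x < 0)
    return sgn * f(mu * abs(x)) / f(mu)

print(y(0.5, 10), y(-0.5, 10))  # same magnitude, opposite signs
print(y(0, 10))                 # 0 maps to 0
```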
Why logarithms?
With this, all that's really left is to pick the right function as $f$. For compression in the real world, we ultimately want a curve that looks sigmoid. That's going to require something that starts steep near $0$ but then gets shallower for big numbers. There are lots of functions out there we could choose, but the logarithm is a nice one for several reasons. First, it has the general shape we want. Second, it gets very, very shallow surprisingly quickly, meaning it will make for nice, strong compression if we use a big enough chunk of it (big $\mu$). We do need to make sure that $f$ goes through $(0,0)$, and log functions don't actually do that, so we have to scoot the function left a bit to make it work. We choose $f(x) = \log(x+1)$. This gives the specific formula we'll use for $\mu$ compression:
$$y(x) = \text{sgn}(x) \frac{\log(\mu |x| + 1)}{\log(\mu + 1)}$$
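Putting it all together, here's a minimal Python sketch of the final formula. The function names are my own, and I've added the exact inverse as well, since expansion -- the effect my friend actually wanted -- is just this curve run backwards. The default $\mu = 255$ is the value classic telephone $\mu$-law uses; a dial would sweep this parameter:

```python
import math

def mu_compress(x, mu=255):
    # sgn(x) * log(mu|x| + 1) / log(mu + 1):
    # maps [-1, 1] onto [-1, 1], pushing mid-size magnitudes toward 1.
    sgn = (x > 0) - (x < 0)
    return sgn * math.log1p(mu * abs(x)) / math.log1p(mu)

def mu_expand(x, mu=255):
    # Exact inverse of mu_compress: the "pick a lane" expansion effect.
    sgn = (x > 0) - (x < 0)
    return sgn * math.expm1(abs(x) * math.log1p(mu)) / mu

for x in (0.01, 0.25, 1.0):
    y = mu_compress(x)
    print(x, y, mu_expand(y))  # mu_expand(mu_compress(x)) recovers x
```

Feeding the compressed output back through `mu_expand` recovers the input exactly, which is what makes the pair usable as a tunable dial: one direction smushes, the other makes values pick a lane.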