noisEE, Part 1: Software
You wanna make some noise? Maybe brush off those rusty electrical engineering skills, flex those signal analysis muscles, and make up a menagerie of sound?
I thought that would be cool, so that’s what noisEE[1] is about.
Specifically, noisEE was a project to create different “colors” of noise, meant to ease me back into hardware development. As these things go, however, it turned out to be more involved than the simple project I originally intended.
In this first post I’ll be talking about the supporting signal analysis work I did in order to figure out how to generate the different colors of noise. The next post will cover the rendition of the filter in hardware, posted here.
Colors of Noise
What do we mean by “colors” of noise?
Maybe listening to some examples will clear things up:
White:
Pink:
Red/brown:
They’re all clearly noise: there’s that signature fuzz in all of the examples. However, the differences between them should stand out: white has a harsh edge, and mellows out as we go to pink until we get to the dull roar of red[2].
Surprisingly, they’re all derived from the same white noise sample, but with some modulation on the signal. Specifically, we’re adjusting the frequency spectrum of the original signal: if you think back to the bad old Winamp days, there was usually a frequency spectrum display that looked a lot like this:
(lifted from Wikimedia)
going from low frequencies on the left to higher frequencies on the right. The height of the bars represent how loud those frequencies are, and as the music changes, the bars follow. But this is a low-resolution frequency analysis, and we can get a higher resolution sample, like this fraction of a second sample from a random song[3]:
Modulating the signal means changing the bars at a certain time, so the frequency spectrum looks different. A simple modulation is turning down the volume: the Winamp signal looks similar, but now all the bars are charted in a lower/softer decibel range:
Another simple modulation is to boost the bass, bringing out the drums and other low frequency instruments.
(Sure, the effect looks subtle, but check out the decibel scale: boosting parts of your signal by 12db is boosting the power by a factor of 16. On listening it’s super obvious the bass has been boosted, just under the point of distortion.)
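If you want to double-check that arithmetic, the decibel conversions are one-liners; a quick sketch (the function names are mine, not from the noisEE scripts):

```python
# Decibel-to-ratio conversions: power uses 10 in the exponent,
# amplitude uses 20, so +12db is ~16x the power but only ~4x the amplitude.
def db_to_power_ratio(db: float) -> float:
    return 10 ** (db / 10)   # db_to_power_ratio(12) ~= 15.8

def db_to_amplitude_ratio(db: float) -> float:
    return 10 ** (db / 20)   # db_to_amplitude_ratio(12) ~= 4.0
```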
With the visualized frequency spectrum in our pocket, we can define how the different sorts of noise should look.
White noise represents a purely random signal. Interestingly, that means it has a flat spectrum: all frequencies are about equally present in a white noise signal.
It’s a little weird; what does a random signal have to do with having a flat frequency spectrum? Why isn’t the frequency spectrum also random? Let’s say that our frequency spectrum is otherwise flat, but it has a spike: we plugged our randomness generator into a wall outlet, and now the 60Hz power signal is leaking into the signal, which shows up as a spike around 60Hz in our frequency spectrum. Well, now the signal isn’t quite random anymore. In fact, if the line noise is so large it dominates the rest of the signal, it might be easy to guess what the future signal looks like (something like a 60Hz wave), so the signal is no longer random. By extension, any spike in the frequency spectrum means there’s some repetition, and repetition means the output isn’t random; so, by contradiction, a random signal’s spectrum should be flat[4].
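If you want to see the flatness for yourself, here’s a minimal sketch (mine, not from the noisEE scripts) that generates white noise and checks that the power is spread evenly across frequencies:

```python
import numpy as np

rng = np.random.default_rng(0)
white = rng.standard_normal(1 << 16)      # 65536 random samples

power = np.abs(np.fft.rfft(white)) ** 2   # power per frequency bin

# Average the bins in 64 coarse chunks; a flat spectrum means every
# chunk holds roughly the same power, up to random fluctuation.
chunks = power[1:].reshape(64, -1).mean(axis=1)
print(chunks.max() / chunks.min())        # close to 1, not orders apart
```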
What if we do allow some leakage from the past to the future? In other words, if we know the current signal value, then we have information about where the signal will be in a short time. This sounds similar to one of Einstein’s lesser contributions to physics, his work on Brownian motion, the behavior of a particle suspended in fluid moving around as the fluid jostles it. At any given time the particle might be jostled one direction or another, but it only takes small steps before being jostled in another direction: the particle is taking a random walk.
In other words, the position of the particle doesn’t change much before it gets jostled again in a random direction. If you’re familiar with calculus, this is the same as integrating a white noise signal. This means any low frequency signals are strong relative to the high frequency signals, which cancel in short order. As an example, consider a suspended particle in the ocean: it’ll get jostled around randomly, but if it’s near the shore there will be a noticeable slow periodic motion.
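In code, that integration is just a cumulative sum; a sketch, again with names of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
white = rng.standard_normal(44_100)   # one second of white noise at 44.1kHz
brown = np.cumsum(white)              # each sample = previous + a random step
brown /= np.abs(brown).max()          # normalize into the [-1, 1] audio range
```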
Interestingly, when we look at the frequency spectrum of this signal, it’s also a straight line, but now it’s sloped. We’ll get into exactly how sloped this line is in the next section.
Because this noise is based on Brownian motion, we call it brown noise. We also call it red noise, by analogy with red light being made up of low-frequency light and brown/red noise emphasizing the lower frequencies.
Pink noise is halfway between white and red noise: it also has a sloped line frequency spectrum, but at a shallower slope than red noise; in fact, it’s exactly halfway between (hence mixing white and red to get pink).
It turns out that it’s super easy to generate white noise and red noise, but not so easy to generate pink noise. To understand why, we’ll have to delve into signal theory.
Signals and Systems, SOS
Given the fact that brown noise is a random walk of white noise, it’s easy to make a digital filter which can make brown noise from a given white noise signal: take the previous brown noise sample, and add in some amount of the current white noise sample (take a step in the random walk), and that’s your current brown noise sample.
There’s some choice in how much brown noise you keep, and how much white noise you add in. If you add only a little bit of white noise, then there’s less of a filtering effect, and more of the lower frequencies get through. Note, though, that there’s a cutoff point: up until that cutoff frequency, almost nothing is filtered (the passband), but at higher frequencies the filtering becomes more and more severe (the roll off). So I lied earlier: brown noise attenuates higher frequencies more severely, but that doesn’t apply to the passband. It’s not a simple line, it’s more like a segmented flat-then-sloped line.
(Note that this is an idealized visualization of the transfer function. Since we’re filtering flat spectrum white noise, we expect the real-world Audacity frequency spectrum to bear a passing resemblance to these ideal representations.)
What’s interesting is that the sloped part of the line never gets steeper, no matter how much brown noise you keep and how little white noise you add: changing how much brown noise you keep affects the cutoff frequency (and how loud the final signal is), and how much white noise you add affects how loud the parts of the signal being passed through are. All filters built in this way will have the same slope, with the signal getting 4 times as quiet each time the frequency doubles: in EE jargon, this means the slope is always roughly -6db/octave[5].
What’s really interesting is that we can also create a low-pass filter with a resistor and capacitor, which is one of the first things an electrical engineering course will teach you to make. This will be somewhat helpful in the next post, when we’re creating the filters in hardware.
By choosing the right filter weights, we can position the sloped part of the filter so it covers the entire range of human hearing, from 20Hz to 20kHz[6]. For example, using 0.99765 · previous + 0.0990460 · white at 44.1kHz[7] will put the cutoff frequency right below 20Hz, giving us a good rendition of brown noise across all of human hearing.
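As a concrete sketch (variable names mine), that filter is just a two-term recurrence run over the white noise samples:

```python
import numpy as np

def red_noise(white, keep=0.99765, add=0.0990460):
    """One-pole low-pass filter at 44.1kHz: a leaky random walk."""
    out = np.empty_like(white)
    prev = 0.0
    for i, w in enumerate(white):
        prev = keep * prev + add * w   # keep most of the walk, mix in a step
        out[i] = prev
    return out

rng = np.random.default_rng(0)
red = red_noise(rng.standard_normal(44_100))
```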
So what does this mean for making pink noise?
If pink noise has half the slope of brown noise, then it should have a slope of -3db/octave. Unfortunately, there aren’t any simple filters out there that have a roll off with this slope: in fact, most filters have steeper slopes[8]. However, we might be able to kludge together something that approximates the sort of frequency response we’re looking for, and we’ll even build it on top of the low-pass filter we already know and love.
The central insight comes from Robin Whittle’s in-depth page on pink noise generation[9]: simple low pass filters usually have a slope of 0db/octave or -6db/octave, but there’s one spot where the slope is different, right around the cutoff frequency. There, the frequency response curves into a knee as it transitions from 0 to -6db/octave, and at points along that knee the slope is around -3db/octave.
So we can chain a bunch of low-pass filters together, with each filter situated so that the knee of one filter ends right as another starts[10]. We can use the pinking parameters given by Paul Kellet from Robin’s page to get the following spectrum:
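In code, the chained version looks something like this sketch of Paul Kellet’s 3-subfilter formulation, with the coefficients as listed on Robin’s page:

```python
import numpy as np

def pink_noise(white):
    """Three one-pole low-pass filters plus a direct white term, summed."""
    b0 = b1 = b2 = 0.0
    out = np.empty_like(white)
    for i, w in enumerate(white):
        b0 = 0.99765 * b0 + w * 0.0990460   # knee near 16.5Hz
        b1 = 0.96300 * b1 + w * 0.2965164   # knee near 270Hz
        b2 = 0.57000 * b2 + w * 1.0526913   # knee near 5300Hz
        out[i] = b0 + b1 + b2 + w * 0.1848  # direct term: no roll off
    return out
```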
With more filters, we can smooth out the roll off and get progressively closer to a -3db/octave line. In reality, you don’t have infinite materials to make filters, so you stop at some point and call it good enough. And looking at it, the 3-filter formulation is probably fine for most purposes.
500 Shades of Red
So we have white noise (assume we’re given it), red noise (a simple low pass filter), and pink noise (multiple low pass filters added together), with linear frequency response slopes of 0db/octave, -6db/octave, and -3db/octave respectively.
What if we wanted all the colors of the (red) rainbow?
In other words, could we find a parameterized way to generate any linear slope of falloff, like -1.8db/octave or -5.623db/octave? There’s no obvious reason we couldn’t use the same adding trick that we used for pink noise to produce other outcomes.
To make things easier to calculate, we’ll re-use the cutoff frequencies from the pink noise filters (roughly 16.5, 270, 5300, and infinity Hz) and then change the overall volume of the filter (passband attenuation). This has the nice effect that when we use a digital IIR filter formulation (basically the current+previous form I was using earlier), we only have to change the amount of white noise we add into the signal in order to change the attenuation.
Since we’re re-using the pink noise cutoff frequencies, pure white/red/pink noise are easy to make. White noise turns down all the filters except the one with a cutoff at infinity hertz (no roll off). Similarly, red noise turns down all filters except the one with a cutoff frequency of 16.5Hz. Pink uses Paul’s parameters like before.
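Here’s a sketch of that parameterized filter bank: the keep coefficients stay fixed at the pink cutoffs, and only the per-filter white noise gains change. The gain tuples below are illustrative placeholders, not the fitted values:

```python
import numpy as np

KEEP = (0.99765, 0.96300, 0.57000)  # fixed poles: ~16.5, 270, 5300 Hz

def colored_noise(white, gains):
    """gains = (g0, g1, g2, g_direct): white noise mix per sub-filter."""
    state = [0.0, 0.0, 0.0]
    out = np.empty_like(white)
    for i, w in enumerate(white):
        for j, keep in enumerate(KEEP):
            state[j] = keep * state[j] + gains[j] * w
        out[i] = state[0] + state[1] + state[2] + gains[3] * w
    return out

rng = np.random.default_rng(0)
# Pure white: turn down everything but the infinite-cutoff direct term.
white_again = colored_noise(rng.standard_normal(44_100), (0, 0, 0, 1))
```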
To get any falloff, we can just interpolate the passband attenuation between white and pink, and pink and red[11]. But how to do the interpolation isn’t clear. After experimentation, it’s clear the filter weights don’t have a linear relationship with the amplitude of the signal, and that’s before taking into account the fact that we really care about the logarithmic power scale. We could try to come up with a clever mathematical way to overcome each of these problems, like throwing a square into the interpolation in order to bypass the amplitude vs power problem[12].
However, Paul Kellet from Robin’s page tuned the pink noise parameters by hand for just one slope, and ain’t nobody got time for tuning a pile of parameterized polynomial functions that might not even fit our problem. Let’s just give up and use computational brute force!
What? White Noise is evolving!
The basic idea is that we can just use a greedy hill-climbing algorithm to evolve our way to the best weights. We’ll do this for some fine-grained number of slopes between 0 and -6db/octave; we don’t expect anything to be discontinuous[13], so it should be easy to just string together the points and use those to find a nicely parameterizable function with a polynomial regression, or even just use those points directly: 1000 numbers will take up a ridiculously small amount of disk space on modern computers, and you can likely get away with fewer points for most practical purposes[14].
For each slope we want to hill-climb, we start with some reasonable filter parameters, and see what the frequency response looks like. We’re looking for a linear response with a certain slope, so we see how closely the actual frequency response corresponds with what we want, and define an error quantity to tell us how far off reality is from the dream. By crawling towards parameters with lower errors, we can get reality pretty close to what we’re looking for.
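As a sketch (with invented step sizes and iteration counts), the greedy loop might look like this, assuming an error() function along the lines described next:

```python
import numpy as np

def hill_climb(params, target_slope, error, step=0.01, iters=10_000):
    """Greedy hill-climb: keep any random nudge that lowers the error."""
    rng = np.random.default_rng(0)
    best = error(params, target_slope)
    for _ in range(iters):
        candidate = params.copy()
        j = rng.integers(len(params))            # nudge one random parameter
        candidate[j] += step * rng.standard_normal()
        e = error(candidate, target_slope)
        if e < best:                             # keep only improvements
            params, best = candidate, e
    return params
```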
There are 2 important things that we feed into our error value: the difference between the actual and expected slope, and the R² correlation of the data[15]. The first is pretty obvious: if we have a data set A with slope -2.5db/octave, and data set B with slope -2.2db/octave, and we want a slope of -2.1db/octave, then B is better than A.
The second falls out of doing the linear regression to get a best-fit line from our possibly-not-linear frequency response (think back to the somewhat wavy pink filter), and corresponds to how linear the data looks. R² = 1 means the data is a perfect line, while R² = 0 means the data doesn’t look anything like a line at all; wavy lines like we might get with our pink filter are somewhere between. We obviously want a perfect line, so lower R² values should raise the error.
There’s a question of how much we should weigh each source of error; there isn’t an obvious answer, because it’s a question of priorities. We would rather have a perfectly straight line at exactly the right slope, but we don’t live in that perfect world, so we need to make trade offs. What do we take, a -0.1db/octave inaccuracy or a few points of inaccuracy in R²[16]? There’s nothing that tells us what the right mix is, so we’ll have to define our own objective function.
In our case, I chose to weight the slope error higher than the error in the linear regression; I definitely wanted to get into the vicinity of the right slope, and get that right: getting the response to be smoothly linear was less important, because we know that we’re going to have a wavy response anyways, and we don’t want to overly reward our evolving parameters for making something that looks smooth while ending up with some slope far from our target[17].
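Put together, the objective might look like the sketch below; in the real loop, the octaves and response values come from evaluating the filter’s transfer function at the candidate parameters, and the 10-to-1 weighting here is a placeholder of mine, not the ratio used in the actual scripts:

```python
from scipy import stats

def error(octaves, response_db, target_slope):
    """Fit a line to the (octave, db) response; penalize slope error
    more heavily than waviness (1 - R²)."""
    fit = stats.linregress(octaves, response_db)
    return 10.0 * abs(fit.slope - target_slope) + 1.0 * (1 - fit.rvalue ** 2)
```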
So now we can let this process loose on a computer, and just let it crunch numbers until victory!
Except we haven’t nailed down an important answer: what is a reasonable starting point?
Originally, I was using a starting point that sounds reasonable: for each slope target, start from the fitted parameters of the previous target, a small increment away. Unfortunately, something went wrong with this: after weeks of number crunching, the process would get stuck somewhere between white and pink noise. My hypothesis is that it would get stuck on a local maximum, and then have a hard time moving away[18]. This isn’t really satisfying, because we didn’t expect any local maxima. Instead, it may have been because I was using real noise samples to fit the data with, instead of more idealized transfer function models, and the process would get stuck looking for an ideal-enough solution that didn’t exist with real-world data.
I’m not a good scientist (but we already knew that), so I changed both potential sources of problems: I changed the starting point to a linear interpolation of each of the parameters between our defined white/pink/red endpoints, and started using easier-to-calculate ideal transfer functions for the filters. After this, I got actual numbers within a few days at a reasonable density of data.
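For a one-pole filter, the ideal transfer function is cheap to evaluate analytically, which is what makes this swap attractive; a sketch:

```python
import numpy as np

def one_pole_response_db(f, keep, add, fs=44_100):
    """Analytic magnitude response of y[n] = keep*y[n-1] + add*x[n],
    in decibels: no FFT or actual noise sample needed."""
    w = 2 * np.pi * f / fs
    mag = add / np.sqrt(1 - 2 * keep * np.cos(w) + keep ** 2)
    return 20 * np.log10(mag)

# e.g. the red noise filter from earlier, sampled over the audio band
freqs = np.logspace(np.log10(20), np.log10(20_000), 200)
response = one_pole_response_db(freqs, 0.99765, 0.0990460)
```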
Graphs n’ Data
In the digital filter formulation, we only change one parameter of the weighting function. Jumping up a layer of abstraction[19], we can plot each of these parameters against the target slope:
Each plot is made up of 60 different points, tied together with simple lines. This is fine on the desktop, where 60 64-bit floats[20] would only have seemed like a lot of data to Bill Gates in the 1980s, but it is a lot of data for something like a micro-controller with 8k of memory, which we might be using in the next post. So, we would rather have a much smaller number of points to provide a good but rough approximation, and have some room on the micro-controller for actual code[21].
Like I hinted at before, we could try to polyfit each of the parameter curves, but that doesn’t look promising, and the break in the middle of the curve might even introduce problems with jumps in values across the middle point. Plus, evaluating the polynomial on a microcontroller might take a long time depending on how complicated it has to be.
Instead, we could just find a piecewise linear approximation: we’ll fit a series of line segments to the function, minimizing something like squared error and keeping the lines continuous. The process mostly works like our earlier greedy algorithm, starting with some initial data (equidistant points on the x-line) and then finding progressively better fitting lines. Fortunately, R has a package to do this for us[22], segmented.
Balancing size and goodness of fit, I chose to fit to 8 points. You can see it’s an alright fit:
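Once fitted, using the table at runtime is just a piecewise-linear lookup. A sketch with made-up breakpoints (the real ones are in the 8-point CSV under Downloads):

```python
import numpy as np

# Hypothetical 8-point table for one filter's gain vs target slope.
slopes = np.array([0.0, -1.0, -2.0, -2.5, -3.0, -4.0, -5.0, -6.0])
gain0  = np.array([0.00, 0.01, 0.03, 0.06, 0.10, 0.30, 0.70, 1.00])

def lookup(target_slope):
    # np.interp needs increasing x, so negate the slope axis.
    return np.interp(-target_slope, -slopes, gain0)

print(lookup(-1.8))  # interpolates between the -1 and -2db/octave points
```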
If you want to see how the frequency response changes as we sweep across slopes, here’s the animated spectrum, with some wonkiness[23]:
Finally, if we use this parameter set to produce a smooth sweep from white to red noise, we get something like this:
So we’ve rendered this algorithm in software, but the idea was always to render it in hardware, like a real lumberjack. If you’re interested in that development process, check out Part 2: Hardware.
Downloads
CSV for the 60 point set.
CSV for the minimal 8 point set.
Github: Generation scripts used for finding the parameters, and for generating the audio samples and visualizations used in this blog post.
[1] ↑ Etymology: noise + a large EE (Electrical Engineering) component.
[2] ↑ There are more noises than just white, pink, and red: if you’re interested, you should look at the Wikipedia page for the colors of noise.
[3] ↑ Using the low-level-ish audio tool Audacity to do Fourier transforms, the standard way to break down a signal into frequency components.
[4] ↑ This does not mean that if a signal’s frequency spectrum is flat, then it is random: in order to show that, you should look into applying the Diehard randomness tests. Furthermore, this does not mean that any given random signal is suitable for use in cryptography; if you need that, look into cryptographically secure pseudo-random number generators, or getting a secure hardware noise generator.
[5] ↑ An octave is a doubling of frequency, and around 6 decibels is 4 times the power in a signal. This is equivalent to 20db/decade, where a decade is a tenfold increase in frequency.
[6] ↑ You can’t roll off forever, though, and eventually the signal becomes indistinguishable from pervasive quiet electrical noise, so there’s effectively a floor out there.
[7] ↑ These parameters are part of the Paul Kellet 3-subfilter formulation that we’ll use later.
[8] ↑ The goal for most filter design is to create filters that have the sharpest cutoff between the passband and stopband, trying to minimize the space where we could usefully talk about “rolloff” (while dealing with other effects, like whether the passband starts to look like a windy ocean or the filter requires million dollar parts). For example, if we were trying to cut out 60Hz noise coming from wall power, we would want to drop only the 60Hz, and as little of the adjacent signals like 40Hz or 80Hz as possible. We explicitly don’t want a steeper roll off, so we’re sticking with the 1st order filters and are throwing away a fair amount of signal theory we don’t need.
[9] ↑ Robin was a god-send: without their page, I would not have had a way to kick this project off the ground.
[10] ↑ I’m glossing over the fact that we’re adding the signals in the time domain, which allows one signal to dominate in the logarithmic power domain. We would have to tune the parameters differently if we had to add the frequency spectra together directly.
[11] ↑ Interpolating directly from white to red wouldn’t work, since that would leave out the filters at higher frequencies and produce more of a disjoint line frequency response.
[12] ↑ The power is the amplitude squared, and using decibels makes things logarithmic.
[13] ↑ The part around -3db/octave will be non-differentiable, but that’s expected, since there’s no reason for both the white→pink and red→pink evolution processes to seamlessly mesh into each other. More importantly, since we’re just approximating everything, we don’t want to impose differentiability around that point (as nice as it would look), because it doesn’t help us like it would if we were using a single differentiable function over the entire range. Continuity we do expect.
[14] ↑ Now, if we were trying to approximate the Riemann zeta function, then we would have some problems with such a simple approach.
[15] ↑ I’m skipping over the “y-intercept” error (which is actually the intercept at some small frequency, since in a logarithmic world there is no such thing as a y-intercept), which we need to hold to some value, or we might end up with wildly fluctuating filter loudnesses from one step to the next. We don’t just keep the intercept constant, though; instead, since we’re attenuating the signal at high frequencies, we can compensate for the lost power by letting more of the lower frequencies through, raising the target y-intercept as we go from white to red.
[16] ↑ Maybe you’ll also notice that the units are all wrong; we can’t really add a quantity of db/octave to a quantity of… whatever units R² has. If you want everything to fit together for dimensional analysis, you can think of the weights we’ll be multiplying these quantities by as having the inverted units.
[17] ↑ A concern: what if we’ve defined our objective function so that we have some false peaks, so we climb to some non-optimal peak and end up stuck? We can combat this by looking at the combined error; we know that the pink parameters are already about as good as they’re going to get, so if the errors are not too far off, then we’re likely okay. Plus, the problem is pretty simple, so it would be surprising if there were sub-optimal peaks.
[18] ↑ I tried tricks like a kludged-together analog of simulated annealing, which didn’t help.
[19] ↑ If you’re confused by what I’m doing, Worrydream’s page on the Ladder of Abstraction is a fantastic introduction to thinking on different layers of abstractions.
[20] ↑ Let’s assume 8-byte (64 bit) floats. With 60 points for each of 4 filters, that adds up to 1.9k bytes.
[21] ↑ Plus, it would be much easier to search through the data if we have less of it.
[22] ↑ No, R hasn’t gotten better since the last time I ranted about it.
[23] ↑ Note that the spectrum is a bit messy, especially between pink and red, where the higher frequencies bend upwards instead of holding at -6db/octave. I’m guessing that I either need way more filters to get a good approximation, or also need to move the cutoff frequencies of the filters to get really good results. The next part will go into why changing the cutoff frequency wasn’t really viable.
It doesn’t change the fact, though, that the behavior of the final state at -6db/octave is weird: why isn’t the infinite filter lower? I think it’s because of an earlier decision to cap the floor of the filter gain: you can see all the filters going to some small value. Unfortunately, it wasn’t low enough for the infinite filter: not sure it’s worth redoing the calculations, though.