Residents

Gnossienne No. 1

Sorry for the break in posts, I’ve been absorbed in a bunch of books and haven’t had anything complete to share until today.

5 days ago I got a digital piano off craigslist, and it’s the best thing ever. After abandoning the guitar a few years ago I’ve had nothing to really get my hands into. Holding down low C on a synthesizer and twisting the FM knob is not the most involving physical performance.

The above recording is the product of the last 4 days of practice. Erik Satie makes for a soft and gratifying takeoff. Although I prefer this interpretation over my relatively straightforward playthrough, I’m not ready for that kind of rhythmic anarchy. Hopefully whatever dreaminess I was able to pull off is distinguishable from the mistakes :D

Also, for the first time in my life, I’m actually interested in theory. Since my first Sonic Youth show in 1999 I’ve actively scorned and unlearned every rule I was aware of. The unlearning was wildly successful – I really feel like a blank slate. The downside, more apparent by the day, is that outside of improvised noise music my toolbox is painfully empty. To my advantage I know exactly what I want, so maybe it won’t take years of study and accidental epiphanies to get to where I want to be.

The polished RBM drum brain will be coming shortly, as well as the later Gnossiennes. Cheers.

DX Orphan

Found this on an old cassette tape, along with some other cool stuff. It’s from the last time I went on a generative music kick. Some kind of process that would chew up things played live, if I’m remembering correctly. I’ll probably be posting some of the other recordings I found – there are a bunch of ‘solo acoustic guitar’ tracks that might be strange enough to be worth sharing. We’ll see.

Enjoy.

Minor Success

It turns out NumPy is awesome and making things with it is a breeze. I’ve got a restricted Boltzmann machine up and running, able to stably reproduce whatever binary grid I throw at it. All thanks to Geoffrey Hinton’s RBM cookbook.
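For anyone who wants something concrete to poke at, here is a minimal sketch of a binary RBM trained with one step of contrastive divergence (CD-1). It is not the code from this project: the function names, layer sizes, learning rate, and toy patterns are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample(prob):
    """Turn probabilities into stochastic binary states."""
    return (rng.rand(*prob.shape) < prob).astype(float)


def cd1_update(v0, W, b_vis, b_hid, rate=0.1):
    """One CD-1 step on a batch of binary rows; updates parameters in place."""
    h0_prob = sigmoid(np.dot(v0, W) + b_hid)
    h0 = sample(h0_prob)
    v1_prob = sigmoid(np.dot(h0, W.T) + b_vis)    # reconstruction of the input
    h1_prob = sigmoid(np.dot(v1_prob, W) + b_hid)

    n = float(v0.shape[0])
    W += rate * (np.dot(v0.T, h0_prob) - np.dot(v1_prob.T, h1_prob)) / n
    b_vis += rate * (v0 - v1_prob).mean(axis=0)
    b_hid += rate * (h0_prob - h1_prob).mean(axis=0)
    return np.mean((v0 - v1_prob) ** 2)           # reconstruction error


# Two toy 4-step patterns standing in for a "binary grid" (made up).
patterns = np.array([[1, 0, 1, 0],
                     [1, 1, 0, 0]], dtype=float)

W = rng.normal(0.0, 0.01, size=(4, 3))            # 4 visible, 3 hidden units
b_vis = np.zeros(4)
b_hid = np.zeros(3)

errors = [cd1_update(patterns, W, b_vis, b_hid) for _ in range(2000)]
```

The reconstruction error in `errors` should fall as training goes on, which is the "reproduce whatever I throw at it" behaviour in miniature.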

I think I’m going to try this out as a way of handling the time problem. It takes both the current input and a complete copy of the previous input (as in, the previous step in the pattern) and uses the previous input as a way to adjust the bias of the current. By doing this it can respond to changes over time, trajectories associated with different inputs. It can be extended out to multiple time frames, but is reported as working very well with just one. I was worried about the computation time, but at the end of that paper they say they were able to generate real-time video with one (for motion capture data, really really cool stuff), so I might as well give it a shot.
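A minimal sketch of that idea, assuming the simplest form of the conditional model: the previous frame is multiplied by two extra weight matrices (named `A` and `B` here, which are my labels) and the results are added to the ordinary visible and hidden biases. The sizes and random weights are placeholders, not a trained model.

```python
import numpy as np

rng = np.random.RandomState(0)

n_visible, n_hidden = 8, 4

# Ordinary RBM parameters (random placeholders, not trained values).
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
b_visible = np.zeros(n_visible)
b_hidden = np.zeros(n_hidden)

# Extra conditional weights: previous frame -> biases of the current frame.
A = rng.normal(0.0, 0.1, size=(n_visible, n_visible))  # previous -> visible bias
B = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))   # previous -> hidden bias


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def dynamic_biases(previous):
    """Shift the static biases by a projection of the previous frame."""
    return b_visible + np.dot(previous, A), b_hidden + np.dot(previous, B)


def next_frame(previous, current_guess, gibbs_steps=1):
    """Sample a binary frame conditioned on the previous one."""
    bv, bh = dynamic_biases(previous)
    v = current_guess.astype(float)
    for _ in range(gibbs_steps):
        h = (rng.rand(n_hidden) < sigmoid(np.dot(v, W) + bh)).astype(float)
        v = (rng.rand(n_visible) < sigmoid(np.dot(h, W.T) + bv)).astype(float)
    return v


previous = np.array([1, 0, 0, 0, 1, 0, 0, 0], dtype=float)
frame = next_frame(previous, current_guess=previous)
```

Extending this to more time frames would mean one pair of extra matrices per remembered frame, which is where the computation cost starts to grow.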

The idea I mention at the end of the last post was similar, but would end up having a very different effect. Rotating weights of a bias signal, a unique weight for each step in a loop, would end up encoding step-specific variations. The CRBM (Conditional RBM, the multiple-time-frame model) would only have the most recent states known to it. I don’t think the rotating weights will be able to work on their own – there is nothing to ensure entropy of event triggers over time, reliably. But, I do think this could be a useful trick to employ with the CRBM.
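As a rough sketch of the rotating-weight idea, assuming it amounts to keeping one bias row per step of the loop and swapping in the row for the current step: the array names, the 16-step loop, and the random values below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.RandomState(1)

n_steps, n_hidden = 16, 6

# One bias row per step of the loop (random placeholders, not trained).
step_bias = rng.normal(0.0, 0.5, size=(n_steps, n_hidden))
static_bias = np.zeros(n_hidden)


def hidden_bias_at(step):
    """Rotate in the bias row that belongs to this position in the loop."""
    return static_bias + step_bias[step % n_steps]


def hidden_probabilities(visible, W, step):
    """Hidden activation probabilities with the step-specific bias applied."""
    return 1.0 / (1.0 + np.exp(-(np.dot(visible, W) + hidden_bias_at(step))))


W = rng.normal(0.0, 0.1, size=(4, n_hidden))
visible = np.array([1.0, 0.0, 0.0, 1.0])

p_step0 = hidden_probabilities(visible, W, step=0)
p_step1 = hidden_probabilities(visible, W, step=1)
p_step16 = hidden_probabilities(visible, W, step=16)  # wraps back to step 0
```

The same input gives different activations on different steps, and identical ones every time the loop comes back around, which is the step-specific encoding described above and nothing more.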

I’ve got some things to try out with the current machine and some tools to make to get it talking through a MIDI cable. Update soon.

Neural Networks

So it looks like I missed my deadline by a full week, but NBD. I’m pretty convinced that getting an artificial neural network working in the context of generative music is worth whatever time it takes. It feels like it’s been longer than 3 weeks. Once I found a good handful of sources, digesting the information just became a daily task.

Here’s what I’ve got so far: nn.zip

Entry level stuff to play around with in Python. I use Python 2.7.2, but I doubt the version matters. The NumPy library is required.

The main file is nn_core.py. It contains Neuron, Layer and Network classes, as well as 2 training algorithms and a couple of tools. Test scripts for a few different things are included. You’ll probably have to add the directory of the files to your %PYTHONPATH% environment variable. On Windows, just go Start -> Control Panel -> System -> Advanced Settings -> Environment Variables, then create a PYTHONPATH variable if it isn’t already there. Add the directory name to the list of values.

There are 4 test scripts that show how the modules are used. There is also a tricky analog to digital converter layer. Turns out that a layer of neurons that can all inhibit each other looks suspiciously like a series of comparator/gain stages that ADCs are built out of. Each neuron adds 1 bit to the resolution. A DAC could easily be rigged up in the same way.
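To make the comparison concrete, here is a sketch of the textbook successive-approximation scheme, where each "neuron" is a comparator and a firing neuron inhibits the ones after it by subtracting its weight. This is not the layer from nn.zip; `adc` and `dac` are hypothetical names, and the inhibition here only runs from higher bits to lower ones.

```python
def adc(value, bits=4):
    """Quantize a value in [0, 1) into `bits` binary digits, MSB first."""
    out = []
    remainder = value
    for i in range(1, bits + 1):
        weight = 2.0 ** -i
        fired = 1 if remainder >= weight else 0   # comparator stage
        remainder -= fired * weight               # inhibits the later stages
        out.append(fired)
    return out


def dac(bits):
    """Rebuild the value by summing the weight of every bit that fired."""
    return sum(bit * 2.0 ** -(i + 1) for i, bit in enumerate(bits))


code = adc(0.625)
value = dac(code)
```

Each extra stage halves the smallest step that can be resolved, which is the "1 bit per neuron" observation.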

To create a net and train it to mimic a XOR gate right out of the box, just start the python interpreter:

1) >>> from nn_core import *
2) >>> net = Network(2, [(3, None, None), (1, None, None)])
3) >>> import sets_binary
4) >>> learning = sets_binary.XOR_training
5) >>> learning = format_to_numpy(learning)
6) >>> randomize_weight(net)
7) >>> backprop_trainer(net, learning)
Trained Successfully in  6  iterations.

2) creates an instance of Network(# of Inputs, [(# of neurons in this layer, summing type, activation type), ...]). ‘None’ lets the network use default activation and summing functions. Each neuron in the last layer is also an output. The very last thing I did was switch from using regular python lists to NumPy arrays. This ended up breaking some stuff that really isn’t worth fixing (like the original training sets), hence the format_to_numpy() function.

>>> import numpy as np
>>> input = np.array([0.5, -0.89])
>>> net.compute(input)
array([ 0.56548238])

The training scripts show a lot more than what makes sense to go over here.

The two primary educational resources that I’ve been gratuitously lifting from are here and here.

The Plan

Moving over to NumPy and seeing the calculation times actually slow down made me realize I need to build everything around the use of numpy.array. At the same time, I’ve played around with my current tools enough to know that back propagation learning isn’t going to work for my desired input. Time to make something new.

At this point I really need to define the input. Lucky for me, MIDI already serves as a highly reduced feature set. In fact, MIDI is already a bit deficient. The discriminating features of a drum hit are impossible to discern from a lone MIDI trigger. Many of those features seem necessary to me in the prediction of following events in any given musical pattern, which is really what a neural net should be good for. This gives me an opportunity to both spice up some dull MIDI signals, and at the same time fit the added features to real-life examples of successful pattern recognition/generation.

How I represent time is going to be the biggest hurdle. There are a few options, and none of them are particularly pretty. My options seem to be:

1) One input for each step of a pattern. If you use a lot of xOx sequencers, this might seem like the go-to choice. Easy to translate to a user interface, easy to interpret the big picture of the pattern right off the output. It would mean that the network will be generating entire patterns at once, probably to be imported into Pure Data and looped through. I don’t think this is where I want to go, just from a performance standpoint. I admit it has the benefit of allowing for really slow computation times relative to an on-the-fly timing structure, though that may get overtaken by the increased load caused by having to map 16, 32, or more dimensions instead of just one. This could be either binary (step on/off) or continuous (steps with intensity). From the limited testing I did, as the step count increases the training time skyrockets and the error rate goes crazy. Of course, I might have no idea what I’m doing, haha. I’ll keep testing this as I go on; no reason not to.

2) One high res input for the timing stream. Scan through the loop from -1.0 to 1.0. Now I really need to define what an event is. MIDI signals don’t tend to change much over time. MIDI CC can and does, but it’s not usually a way to represent relationships between triggers. Events need to have push and pull. I think at its most basic and 1 dimensional, it is an intensity level that varies over time. In the case of a single drum, it could mimic the envelope. Better yet, because the actual sound decay of the drum need not tie into this, the envelope could represent the amount of space in the pattern that the drum is taking up. If you have a limited amount of space, and each event eats up a share of that, it can be that one event influences another. With the inclusion of a decay dimension (and others like inertia, maybe even attraction and repulsion), and the connection to time (back to that in a moment), I think there are some opportunities for (some level of) pattern recognition (or at the least a more interesting static).

A restricted Boltzmann machine is a particular kind of neural net architecture. It’s stochastic (random appeals to me), uses associative learning techniques (which make intuitive sense for my data), and can be used for generation. It trains by attempting to recreate the input, which is exactly what I need. It makes as good a choice as any for experiment #2.

RBMs use binary signals, which will heavily affect how time is allowed to be used. While I can imagine a large number of thresholds scattered about on the timeline, boxing in chunks of time and associating them with event data, that really just ends up looking like a messy version of the xOx style input. I just need to lock specific times to increases or decreases in the factors that make up event triggering. Thus, I should have each step of time rotate in a unique weight connected to a bias! At the very least, I’d get some crude time-specific event sequencing. I haven’t seen any examples of this being used, so there are probably serious downsides that I don’t know about. We’ll see!

That would also throw any biological analogy out the window, I think. Drat.

3) Another way to represent time, that would actually allow for associations to form between timeframes, would be to use Markov Chains. I’ve seen examples of an even more realistic method that takes previous sets of inputs along with the current, all interconnected. Weeks to train, though, so nope. Markov Chains are fun and I’ve used them before. Definitely planning on having spicy MC inputs, if all else goes well.
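For reference, the Markov chain option can be sketched in a few lines: count which event follows which in a training pattern, then walk those counts to generate a new one. This is a first-order chain over single symbols; the one-letter drum notation and the training pattern are made up for the example.

```python
import random
from collections import defaultdict


def build_transitions(sequence):
    """Record which event follows which in a training sequence."""
    table = defaultdict(list)
    for current, following in zip(sequence, sequence[1:]):
        table[current].append(following)
    return table


def generate(table, start, length, rng):
    """Walk the chain, picking each next event in proportion to its count."""
    event = start
    out = [event]
    for _ in range(length - 1):
        choices = table.get(event)
        if not choices:          # dead end: restart from the seed event
            event = start
        else:
            event = rng.choice(choices)
        out.append(event)
    return out


# Hypothetical drum pattern: k = kick, s = snare, h = hat, - = rest.
training = list("k-h-s-h-k-hhs-h-")
table = build_transitions(training)
pattern = generate(table, start="k", length=16, rng=random.Random(0))
```

Because each choice only looks one event back, the output keeps the local feel of the training pattern but has no sense of where it is in the bar, which is the limitation the other options try to address.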

 

2) seems like my best bet, and 1) will be easy enough to keep checking. I read about a way to convert binary RBM input/output into semi-smooth values. Like for Event Decay or whatever. What we can do is make it run 128 tests per step, and due to innate randomness you can count the number of times it was 1, add them up (say it’s on half the time, 64 ‘on’s) and get a ratio from that. In fact, if you can get away with 128 cycles per step, you can just take that # as a MIDI value. Slick. This requires your neurons to hover around a very specific energy level, but that seems like the point anyways. I wouldn’t use that value as a note, because a blur of randomness plugged right into a keyboard sounds exactly like you think it does, but for abstract Events with interpreters that take randomness into account I don’t see a problem.
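The counting trick can be sketched like this, assuming a single stochastic unit with a known "on" probability. `unit_to_midi` is a hypothetical helper. One wrinkle: 128 cycles can produce a count of 128, while MIDI data values stop at 127, so the sketch clamps the result.

```python
import numpy as np

rng = np.random.RandomState(2)


def unit_to_midi(probability, cycles=128):
    """Sample one stochastic binary unit `cycles` times and count the 1s."""
    ons = int(np.sum(rng.rand(cycles) < probability))
    return min(ons, 127)  # MIDI data bytes top out at 127


always_off = unit_to_midi(0.0)   # never fires
always_on = unit_to_midi(1.0)    # fires every cycle, clamped to 127
half = unit_to_midi(0.5)         # lands somewhere around 64
```

The value jitters from step to step even when the probability is fixed, which is why it suits abstract event parameters better than note numbers.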

Getting it to run fast enough will be a challenge. I’m learning more about how little I know about programming than I am about programming itself. RBMs have only 2 layers and a single weight array between them, meaning that vectorizing everything and really getting NumPy to work for me will be a bit easier. I lack training data. I need a lot of MIDI patterns, soon. Even if this thing somehow ends up working, it’s not going to jump out of the box ripping guitar solos. It needs good examples.

Damn, it’s late.

Noise Gate Abuse

Hey hey,

One of the best things about the Nord G2 is its infinite supply of noise gates and envelope followers. Check this out:

Noise gate multitap by Atonal Microshores

Simple drum pattern being fed into a stereo pair of noise gates and an 8 tap delay. Audio from the noise gates is the source for the delays, and the envelope lines from the gates control which tap is being heard. All attack/release/delay times are set to 62.5ms – the exact length of a half of a step in the pattern.
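If you want to match the times to a different tempo, the arithmetic is short. The tempo and step resolution below are assumptions: 120 BPM with 16th-note steps is one combination that yields the 62.5ms figure, but the actual settings of the patch are not stated here.

```python
def step_ms(bpm, steps_per_beat=4):
    """Length of one sequencer step in milliseconds."""
    return 60000.0 / bpm / steps_per_beat


# Assumed: 120 BPM, 4 steps per beat (16th notes).
half_step = step_ms(120) / 2.0
```

With those assumed values `half_step` comes out to 62.5, so every attack, release, and delay time lines up with half a step.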

Gates’ thresholds are dialed in to trigger repeatedly during the pattern. Halfway through I start messing with a HPF on their input, causing the filter frequency to be controlled by the gates as well.

I’ve been messing with variations of this sound for a good week and it’s endlessly entertaining. The challenge of crafting the ‘control shape’ is really fun – there are a lot of different factors and the sweet spots are fragile. Try it out – the same sort of thing could be easily recreated in a VST environment or Pure Data.

Playbox Walkthrough – 6 – Endcap

I’m going to take this time to scatter some thoughts about.

All in all, I’m pretty happy about how the Playbox project turned out. My past ventures into PD have always been littered with the byproducts of bad planning and lack of direction. Building an entirely new set of instruments for each musical idea makes it very hard to get to the good parts. Not that building these things isn’t fun – it is – but there are smarter and faster ways to do it. Generalization of certain tasks combined with well-reasoned forethought is without a doubt the way to go. I shudder to think how much time I’ve wasted in the past.

There are still a few things I could have done better. Some of the variable names break convention, $0-event-type and $0-event-selection should have been combined into a single value (this one is pretty bad), and my whole buffer section is totally off-base. ClaudiusMaximus noticed that the entire thing could be replaced by a simple [textfile] (without reading or writing actual text files) after just a moment’s glance.

That said, the flaws are minor. The thing works and I can move to the next step with a bit of confidence. I’ve taken care of a necessary utility, but now I need to turn my attention to music.

My usual style of pattern creation has consistently made use of randomization. It’s a good way to inject movement and unpredictability into things that would otherwise remain stationary, but it’s not without a downside. You can filter and limit it until nothing ever sounds out of place or wrong, but it lacks a certain feel. How do I explain? It’s like comparing a black and white photograph to a silhouette cast on TV static. The photograph can be moving, or it can be trash. The silhouette may always work, but it lacks the photograph’s subtlety, risk, and humanity. I’m in need of an instrument to perform with.

Building such a thing is hard enough to imagine for just a standalone drum machine. Thinking about doing it for an entire orchestra of electronic whatevers is almost paralyzing. Breaking the problem down to its component pieces, it seems that the core issue is pattern analysis. I need to be able to take seeds or stems of musical patterns and extrapolate upon them in a fluid, creative way.

While grinding my gears to dust on this, I remembered a talk I saw a couple years ago. Check this out [the timecode link is bugged, skip to 21:30]. Especially the 2nd half of the demonstration, the part concerning pattern generation. It’s only like 5 minutes long, so do it. Machine imagination. Stunning.

So I think I need to hack together some serious neural net action. The level of quality shown in that talk is far above my pay grade, but the basics are surprisingly accessible. The standard types of patterns used in music can be boiled down to stuff far more simple than the handwriting examples in that video. With the current par of “slightly better than random noise” that I’ve set it’s hard to get discouraged! My Python’s been rusting for far too long anyways.

At the same time I can’t afford to get lost down any rabbit holes. If I can’t get encouraging results within the next 2 weeks I’ll have to move on to more conventional solutions. Anything worth sharing (including failures, if wild enough!) will result in more posts, of course.

With regard to the Playbox I’ll be taking care of some of the unfinished functionality and eventually posting the finished patches. There are a handful of extra tools (the looper, GUI selection stuff, event quantizer, etc) that will be included as soon as they are polished up.

More to come. Cheers.

Late

There’s something worth digging out of this, but it’s too late to keep going tonight. TAMING SPACE ELEPHANTS ISN’T NEARLY AS SIMPLE AS IT MAY SEEM

 

not enough time by Atonal Microshores

Playbox Walkthrough – 5 – Saving & Buffers

SNOW DAY IN SEATTLE!

As any PD user knows, the act of saving a patch does not save the values of any GUI elements (the position of a slider, etc), nor does it save the contents of any [pack], [f], or similar object. Methods of saving transient values must be done manually. There are a few tools floating around to solve this problem, notably Memento, Pool, & SSSAD. I can’t speak to their effectiveness, but they seem to work for others.

My method of saving is simple and effective, though I wouldn’t be surprised if more experienced users would think it a bit of a kludge.

 

[prepend set] reigns supreme.

The example on the left shows the most simple implementation. The msg box is constantly updated whenever the slider value changes, and a [loadbang] sends the last saved value to the slider when the patch is opened. The example on the right shows a cleaner version where the saved value is only updated when instructed. Contents of the msg boxes are saved along with the patch.

This style is easy to expand upon by [pack]ing and [unpack]ing values into message strings.

There is a downside – the msg box containing the saved values (I’ll call them saveboxes from here on out) cannot be stored within an abstraction. If you have 10 instances of an abstraction open, all with different values, they’ll all be updating the same savebox. Next time you load your patch, all 10 abstractions will have the same settings. We’re forced to store the values outside of the abstraction. This has led me to always build a subpatch container for each abstraction – a [pd looper] to house [looper-core] & its savebox.

I admit that having to copy & paste subpatches when using certain abstractions is annoying, but it works, looks clean, and is easy to implement.

An additional difficulty that comes with using data structures is that the dynamic nature of their contents requires a dynamic save system. For this, I use [textfile]. [textfile] is an object that lets you read and write to text files (duh), and the way it works parallels nicely with our data structure controls. Both are limited to messages resetting them to the beginning of their memory (“traverse” or “rewind”) and jumping one step forward (“next” or “bang”).

[pd playbox]

Here is our container subpatch once again.

Green Box: Names entered into the symbol GUI element are run through a [prepend set]/savebox pair. A “bang” sent to param_init resets the name to its saved value.

Blue Box: These are the msgs that are used by saving & loading code. “load”, “save”, name, and $0-patch-name. $0-patch-name is a variable I’m starting to use in all of my patches that involve text file saving. With a patch name of “Song10” and a Playbox name of “Drums”, the saved text file will have the file name of “Song10_Drums.playbox”, making it easy to keep track of what’s what.

Purple Box: Subpatch containing the [textfile] stuff.

[pd savebox]

Any incoming msgs that do not start with “load” are sent directly to the [textfile]. These messages will already have “add” at the beginning, causing them to be stored in [textfile]’s memory.

Msgs starting with “load” will always contain the real name of the textfile to be loaded. First, send a “load clear” to the outlet (this gets routed to $0-allmem-clear once inside the [playbox-core] abstraction). Second, send the file name to [textfile]. Third, send a “rewind”. Fourth, the first “bang”, causing [textfile] to output its first line.

Once that first line is sent, the “bang” in [t b a] is used to cause the next line to be sent – a simple loop. Each line is prepended with “load data” which is used within [playbox-core] to route it to the correct place.

[playbox-core]

Another quick look at [playbox-core]. The two name values are formatted into [symbol]s and separate subpatches are used for the save and load machinery.

[pd save]

An incoming “save” msg first sends “clear” directly to the outlet. This gets fed into the [textfile], wiping its memory.

Green Box: The 0 at the head of the chain is the beginning slot number. The rest is a basic [pointer] loop, except for the addition of a counter that gets triggered when the current memory slot has been completely parsed through. This counter adds 1 to the current slot # and starts the process anew. Only once the next memory slot would be #5 (which does not exist!) is the completion trigger sent to the purple box.

Blue Box: The 7 values of each event are collected into a [pack], with the addition of the current slot # added onto the front. This msg string has “add” prepended onto it and is sent out to the [textfile].

Red Box: Once all 5 slots have been counted through and sent to the [textfile], we then add the current slot # and the current loop length to memory.

Purple Box: This stuff formats the file name. “PatchName_PlayboxName.playbox”

[pd load]

Msgs streaming in from the [textfile], as well as the command “load” are sent here.

Red Box: “load” is sent to the green box. “clear” clears all memory. “data *” is sent to the 2nd [route]. “data looplength” and “data slot” are sent to their respective sends, while event data is sent to the blue box.

Green Box: The text file name is constructed and sent to [textfile], directing it to load that file and start the reading loop.

Blue Box: Event msgs, with the slot # at the front, are added to memory. [list split 1] is used to cut off the slot #.
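For readers who think better in text than in patch cords, the save and load flow can be sketched roughly as follows. This is a Python stand-in, not a description of the patch: the exact line layout of the .playbox file is an assumption (one line per event with the slot # first, plus a “slot” line and a “looplength” line), and the function names are hypothetical.

```python
import io


def playbox_filename(patch_name, playbox_name):
    """Build the 'PatchName_PlayboxName.playbox' name used for saving."""
    return "{}_{}.playbox".format(patch_name, playbox_name)


def save(slots, current_slot, loop_length, stream):
    """Write one line per event, slot number first, then the two settings."""
    for slot_number, events in enumerate(slots):
        for event in events:
            values = [slot_number] + list(event)
            stream.write(" ".join(str(v) for v in values) + "\n")
    stream.write("slot {}\n".format(current_slot))
    stream.write("looplength {}\n".format(loop_length))


def load(stream, slot_count=5):
    """Start from empty memory, then route each line by its first word."""
    slots = [[] for _ in range(slot_count)]
    settings = {}
    for line in stream:
        words = line.split()
        if not words:
            continue
        if words[0] in ("slot", "looplength"):
            settings[words[0]] = int(words[1])
        else:
            slot_number = int(words[0])      # same job as [list split 1]
            slots[slot_number].append(tuple(float(w) for w in words[1:]))
    return slots, settings


# Round trip through an in-memory file: one event (7 values) in slot 0.
slots = [[(0, 4, 60, 100, 1, 1, 7)], [], [], [], []]
buffer = io.StringIO()
save(slots, current_slot=0, loop_length=16, stream=buffer)
buffer.seek(0)
loaded, settings = load(buffer)
```

The structure mirrors the patch: saving walks the slots in order and appends the settings last, while loading clears everything first and then sorts each incoming line by its leading word.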

 

I think that covers the saving and loading. As always, the mechanics take up way more space than the concepts behind them.

Output/Export Buffers

I’m going to go over these because they use a neat little trick which others may find useful. Unfortunately, they will most likely be removed from this project. There is just too much kludgy list-spam and I noticed a delay in the time between triggering an export and actually seeing it printed as soon as I plugged ‘em in.

Blue Box: Items coming from $0-export are sent through an initially blank [prepend]. That item is sent into a [list] object, which feeds into the right inlet of the [prepend]. This has the result of compiling a single, long msg containing every item that is sent to $0-export.

An example: Let’s say the msgs being sent are “a a a”, “b b b”, “c c c”, etc. After each of these is sent, the [list] object will contain “a a a”, then “a a a b b b”, then “a a a b b b c c c”, and so on.

Green Box: When $0-dump-export-buffer is triggered, the storage [list] is sent into a [list split 8]. Each event being exported is 8 items long (“event stype sval note vel track out id”), so the first 8 elements are sent through [list trim] and exported. The rest of the msg is fed back through the splitter in a loop. After this is done, a blank “list” msg is sent to [prepend] to clear it.
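The same accumulate-then-split idea, sketched in Python for comparison. The class and method names are hypothetical, and the event values in the usage lines are made up.

```python
EVENT_LENGTH = 8  # "event stype sval note vel track out id"


class ExportBuffer(object):
    """Collect exported items into one flat list, then dump them in chunks."""

    def __init__(self):
        self.items = []

    def add(self, message):
        # Like [prepend] feeding [list]: every message extends one long list.
        self.items.extend(message)

    def dump(self):
        # Like the [list split 8] loop: peel off 8 items at a time.
        events = [self.items[i:i + EVENT_LENGTH]
                  for i in range(0, len(self.items), EVENT_LENGTH)]
        self.items = []  # the blank "list" message that clears the buffer
        return events


buf = ExportBuffer()
buf.add(["event", 0, 0, 60, 100, 1, 1, 1])
buf.add(["event", 0, 4, 62, 90, 1, 1, 2])
events = buf.dump()
```

Nothing leaves the buffer until `dump()` is called, so every event on a step has been collected before any of them is acted on.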

Why would I want this?

Consider a situation where an event is used to trigger some memory operation on the same Playbox that the event is being played from. Perhaps a copy/paste or note adjustment. Now, because the memory is read out of order, that event could be read before the rest of the events contained on that same step. Problems could occur. A way around that is to read through the memory, store the events, and only send them out once the memory has been completely parsed through.

I’ve never had to do much list manipulation and there very well may be a simpler, cleaner solution than this. If so, let me know!

Plan B is to have $0-dump-output-buffer and $0-dump-export-buffer send some sort of “step complete” or “export complete” message that can be used for any timing-sensitive actions. I’ll be doing some stress tests to see what the real impact of these choices is.

 

I’ll be wrapping up this walkthrough with my next post.

Playbox Walkthrough – 4 – Memory Operations 2

The last of the memory operations that we’ll go over are the copy, paste, & edit functions. Play & export are in line with what we’ve already covered and don’t really warrant a detailed description. As always, any questions posted in the comments will be answered.

[pd copy]

The copy function isn’t that much different from delete. The copy buffer is cleared and events within the selection range pass from [get] to [append] through the standard filters. The only thing worth noting is the new $0-copy-location value being used in the green box. Whenever a selection is copied, the current selection range is stored. If the selection type is all or id, this information is not used and the events will be pasted without any scheduling adjustment. If the selection type is range though, these values will be of use in the paste function, which we will go over next.

[pd paste]

Nothing special here. The important stuff is in [pd scheduling]:

The goal here is to make it so that events can be copied from one location in a pattern to another. This is the code that adjusts the scheduling of the events being pasted.

Blue Box: If the selection type at time of copy is range as well as at time of paste, send “2” to the gates controlling both inlets. If one or the other is all or id, skip adjustment and route event’s stype and sval directly to the output.

Green Box: If adjustment is to be made, send [pd stype-convert] the current selection range as well as the current event’s scheduling pair.

This is [pd stype-convert]. It’s not as tricky as it may look.

Green Box: This compares both paste pos stype and copy pos stype to the event’s stype. The goal is to convert the paste pos and copy pos to the same stype as the event. If they are different, the stype values are sent out the [select]’s right outlet. If the same, the paste or copy sval is passed on through to the purple box.

Blue Box: If the paste pos stype does not match the event stype, it is [pack]ed and fed through a [route]. If stype is 0, convert sval to % type, and vice versa.

Red Box: Same as blue box, but for the copy pos stype.

Purple Box: Finds the difference between the adjusted copy pos sval and the adjusted paste pos sval. The result of this is the amount that the event’s sval needs to be shifted.

Brown Box: Adjusts the event sval to its new location.
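The arithmetic behind those boxes can be sketched like this. The meanings of the stype values are assumptions made for the example (0 as a step position, 1 as a percentage of the loop), as is the use of the loop length for converting between them; wrapping past the end of the loop is not handled.

```python
STEP, PERCENT = 0, 1  # assumed meanings of the two stype values


def convert(stype, sval, target_stype, loop_length):
    """Express a position in the target stype (steps <-> percent of loop)."""
    if stype == target_stype:
        return sval
    if target_stype == PERCENT:
        return 100.0 * sval / loop_length
    return sval * loop_length / 100.0


def paste_shift(event, copy_pos, paste_pos, loop_length):
    """Move one event by the distance between copy and paste positions."""
    event_stype, event_sval = event
    copy_sval = convert(copy_pos[0], copy_pos[1], event_stype, loop_length)
    paste_sval = convert(paste_pos[0], paste_pos[1], event_stype, loop_length)
    return event_stype, event_sval + (paste_sval - copy_sval)


# Event on step 2, copied from step 0, pasted at 50% of a 16-step loop.
shifted = paste_shift((STEP, 2), (STEP, 0), (PERCENT, 50.0), loop_length=16)
```

Here the paste position converts to step 8, so the event moves from step 2 to step 10 while keeping its own stype.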

[pd edit]

Let’s take one more look at the [pd commands-edit] subpatch from earlier:

When an edit message is received, first $0-edit-type is sent, then the new value is routed to its type-specific send. This is important to the flow of [pd edit].

This is [pd edit]. It mainly consists of a [get]/[set] pair, each with their own [pointer]. Notice at the top of the patch, that the entire process is initiated by a “bang” sent to $0-do-edit.

Green Boxes: If range is all, the $0-allow-edit spigot is set to be permanently open. If range is id or range, $0-allow-edit will be controlled by the appropriate filter.

Blue Box: [get] & [set] [pointer] loops are tied together.

Red Box: A subpatch for each value in an event. These subpatches receive the copy message and act as triggers for $0-do-edit, filters for selection range, and do the actual editing. We’ll look at a couple of these below.

This is the note editing subpatch. First, a [route] switch sets the [gate] depending on the type msg. If “note”, then set [gate] to trigger the [f]. The [f] is then primed with the new note value from $0-edit-note, and then $0-do-edit is triggered.

This is the scheduling edit subpatch.

Blue Box: If selection type is range, route incoming events to the green box.

Green Box: Check if event is within selection range. If so, send 1 to $0-allow-edit.

The rest works the same as the note editor, with the addition of [pd adjust-sval] which does just what it sounds like.

 

Quick and dirty. I think there are only 2 more posts left in this series. Next, I’ll go over the save/load methods and output buffers.

Cheers.