Sunday, March 15, 2015

Strange Circles

As described in this previous post, the text below is a draft of one of several "interludes" to be included in a book that I am writing about music and artificial neural networks.

Figure I-1. Three different types of circles of pitch classes.
In the novel Us Conductors (Michaels, 2014), protagonist Lev Termen reflects that “it was not that I was careless in my calculations; it was that I was seeking the wrong sum” (p. 161).  A parallel situation arose when I first began to train artificial neural networks on musical problems.  My goal was to create a clean example of how to interpret a trained network to include in a book on connectionist modeling (Dawson, 2004).  I felt that it should be quite straightforward to train a network on a well-defined music problem, and then be able to pull the theory that defined the problem right out of the network’s internal structure.  I was wrong.

The task that I investigated was chord classification.  The input units of the network represented the keys of a small piano.  These inputs were used to present tetrachords (chords built from four notes) to the network.  The network learned to classify each chord as belonging to one of four types: major, minor, dominant, or diminished.  Traditional music theory defines each of these tetrachord types in terms of the presence of particular musical intervals within the chord.  To make the problem more challenging, chords were presented in different musical keys and in different inversions.
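To make the input encoding concrete, here is a minimal sketch in Python.  It is an illustration under stated assumptions, not the setup used for the original network: the keyboard size (24 keys, with key 0 taken to be a C), the interval patterns chosen for the four chord types (the usual seventh-chord spellings), and the helper names chord_keys and encode are all hypothetical.

```python
# Semitones above the root for each chord type.
# ASSUMPTION: the usual seventh-chord spellings; the post does not list them.
CHORD_TYPES = {
    "major": (0, 4, 7, 11),
    "minor": (0, 3, 7, 10),
    "dominant": (0, 4, 7, 10),
    "diminished": (0, 3, 6, 9),
}

N_KEYS = 24  # ASSUMPTION: a two-octave keyboard whose lowest key (index 0) is C


def chord_keys(root, chord_type, inversion=0):
    """Return the four key indices of a chord built on `root` (0-11)."""
    keys = [root + step for step in CHORD_TYPES[chord_type]]
    # Each inversion moves the lowest note up one octave.
    for _ in range(inversion):
        keys = keys[1:] + [keys[0] + 12]
    # Shift down by whole octaves so the lowest note sits in the bottom octave.
    shift = 12 * (keys[0] // 12)
    return [k - shift for k in keys]


def encode(keys, n_keys=N_KEYS):
    """Turn a list of key indices into a 0/1 input vector, one unit per key."""
    vector = [0] * n_keys
    for k in keys:
        vector[k] = 1
    return vector


# Enumerate every combination of chord type, root, and inversion.
patterns = [
    (encode(chord_keys(root, name, inv)), name)
    for name in CHORD_TYPES
    for root in range(12)
    for inv in range(4)
]
```

Each pattern pairs a 24-element input vector (four units on, the rest off) with the chord type that the network would be trained to report.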

In order to learn this problem, the network required four hidden units.  After learning was completed, the task was to make sense of the network’s internal structure.  In particular, the question of interest concerned what musical features were being detected by each hidden unit.

The first attempt at answering this question involved examining hidden unit activity in response to the various input chords.  It was expected (from traditional music theory) that each hidden unit would be sensitive to a musical interval involved in defining tetrachords of a particular type, and that the network would solve the problem by combining these different intervals.  However, there was no evidence to support this expectation.

After a number of false starts, I simply looked at the weights of the connections from each input unit (i.e. each ‘piano key’) to each hidden unit.  This approach revealed two very interesting regularities.  First, there was a regular mapping between weights and note names.  For instance, different input units that represented the note A at different octaves had exactly the same connection weight feeding into the same hidden unit.  In other words, if one views a connection weight as the network’s name for a note, then these names encoded pitch-class rather than pitch.

Second, more than one pitch-class was given the same name – the same connection weight – by the network.  For instance, one hidden unit assigned the same connection weight to the pitch-classes C, D, E, F#, G#, and A#.  Another hidden unit assigned the same connection weight to the pitch-classes C, E, and G#.

In considering the various pitch-classes that were given the same connection weights in the network, it became evident that the network had discovered a new way of solving the chord classification task: using what I call strange circles.

Traditional music theory employs circular representations of pitch-classes.  For example, most music students are exposed to the circle of perfect fifths that is illustrated on the left of Figure I-1.  In this circle one can find each of the twelve possible pitch-classes; nearest neighbors in the circle are a perfect fifth (seven semitones) apart.  This circle is not strange; students use it (for example) to determine how many sharps or flats are required in a written key signature.

The tetrachord classification network discovered that it needed circles of pitch-classes, but based them on musical intervals other than the perfect fifth.  For instance, one hidden unit named pitch-classes using the two circles of major seconds that are illustrated in the middle of Figure I-1.  In either of these circles, nearest neighbors are a major second (two semitones) apart.  If a pitch-class belonged to one circle, it was given one connection weight.  If it belonged to the other circle, it was given a different connection weight.  All pitch-classes that belonged to the same circle of major seconds were given the same connection weight.

Similarly, another hidden unit named pitch-classes using the four circles of major thirds that are illustrated on the right of Figure I-1. In each of these circles, nearest neighbors are a major third (four semitones) apart.  When these circles were used, pitch-classes that fell into the same circle of major thirds were given identical connection weights; pitch-classes that belonged to different circles of major thirds were assigned different connection weights.
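The circles described above can be generated by starting on a pitch-class and repeatedly stepping up by a fixed interval until the starting point comes around again.  The following Python sketch does this for any interval; the function name circles and the sharp-based spelling of note names are my own choices for illustration.

```python
from math import gcd

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def circles(interval):
    """Split the twelve pitch-classes into circles whose nearest
    neighbors are `interval` semitones apart."""
    result = []
    seen = set()
    for start in range(12):
        if start in seen:
            continue  # already placed in an earlier circle
        circle = []
        pc = start
        while pc not in seen:
            seen.add(pc)
            circle.append(NOTE_NAMES[pc])
            pc = (pc + interval) % 12  # step up, wrapping at the octave
        result.append(circle)
    return result


print(circles(7))  # one circle of perfect fifths
print(circles(2))  # two circles of major seconds
print(circles(4))  # four circles of major thirds

# The number of circles equals the greatest common divisor of the
# interval and 12.
assert all(len(circles(i)) == gcd(i, 12) for i in range(1, 12))
```

An interval of seven semitones visits all twelve pitch-classes before returning to its start, which is why there is only one circle of perfect fifths.  Intervals of two and four semitones share a factor with twelve, so they close early and leave the remaining pitch-classes to form further circles.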

One way in which circles of major seconds and circles of major thirds are strange circles for this network is that they define weird equivalence classes for pitch.  Traditional Western tonal music recognizes twelve distinct pitch-classes.  However, a hidden unit that organizes inputs using circles of major seconds only recognizes two distinct pitch-classes (one for each circle).  A hidden unit that organizes input using circles of major thirds only recognizes four distinct pitch-classes (one for each circle).
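The idea of a connection weight as a "name" can be sketched in a few lines of Python.  Everything specific here is hypothetical: the weight values are made up for illustration (they are not taken from the trained network), the keyboard is assumed to have 24 keys with key 0 being a C, and weight_name is an invented helper.

```python
from math import gcd


def weight_name(key, interval, weights):
    """Weight a hidden unit would give to `key` if it assigned one
    weight per circle of `interval` semitones."""
    # Keys in the same circle share the same remainder.
    circle = key % gcd(interval, 12)
    return weights[circle]


# HYPOTHETICAL weights, one per circle (not values from the real network).
second_weights = [0.5, -0.5]            # two circles of major seconds
third_weights = [0.9, -0.3, 0.3, -0.9]  # four circles of major thirds

keys = range(24)  # ASSUMPTION: two-octave keyboard, key 0 = C

# The same note in different octaves gets the same name (A is key 9 and key 21).
assert weight_name(9, 2, second_weights) == weight_name(21, 2, second_weights)

# Count how many distinct names each unit uses across the whole keyboard.
names_by_seconds = {weight_name(k, 2, second_weights) for k in keys}
names_by_thirds = {weight_name(k, 4, third_weights) for k in keys}
print(len(names_by_seconds), len(names_by_thirds))
```

Under these assumptions the first unit distinguishes only two kinds of key and the second only four, however many keys the keyboard has.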

A second way in which these circles are strange is that they are not typical topics of music theory.  Students learn the circle of perfect fifths because of its utility; students do not learn strange circles, because their utility is much less evident.  This is not to say that these strange circles are removed from traditional theory, because this theory defines the musical intervals which in turn define each circle.

This leads to the third way in which these circles are strange.  How is it possible to use circles of major seconds and circles of major thirds to represent the differences between major, minor, dominant, and diminished tetrachords?  We will postpone the answer to this question until Chapter 7, which examines chord classification networks in detail.

For now, the key point is that even when traditional music theory defines the input/output mapping for a task, this theory is not the only means for mapping inputs into outputs.  Artificial neural networks are capable of finding alternative music theories.  However, to find such surprises one must examine the internal structure of trained networks.

In short, training a network on a standard musical problem with supervised learning might lead a researcher to expect that the network will solve the problem in a particular way.  However, networks can easily defy such expectations.  It is not that a researcher’s expectations are careless.  It is just that a network can compute different sums.

