Daniel Iglesia
Comp/Aesth of Computer Music
Final Project - 12/5/04

Analysis, Resynthesis, and Scoring of a Very Short Sound

In brief: I have built a Max patch that attempts to stretch out a very short sound, analyze it for prominent tones, and score it for piano. Neither the intermediate stages nor the final product sounds at all like the original, but I still found it an interesting way to generate material.

Given the noisiness of the source sounds, the difficulty of analyzing short sounds accurately, and the strict semitone structure of the piano, the result of this experiment sounds nothing like the original. Yet I found it conceptually interesting as an application of a chance- and chaos-based system to spectral music practices.

All work was done in Max/MSP. A pfft~ object was used to take an FFT of the source sound; given the short durations of the sounds (152 and 491 milliseconds), the FFT size was the single greatest factor in determining what the output would sound like. Too large a window would yield too few frames, while too short a window would lose low frequencies and give poor frequency resolution. After some experimentation, a size of 512 seemed to work best (though other sizes yield interesting results).
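The trade-off between frame count and frequency resolution can be sketched in a few lines of Python. This is only an illustration, not part of the Max patch: it assumes a 44.1 kHz sample rate and non-overlapping frames, neither of which is stated above (pfft~ is normally run with overlapping windows, so the patch's real frame counts may differ). The function name `fft_tradeoff` is hypothetical.

```python
SAMPLE_RATE = 44100  # assumed; the report does not state the sample rate


def fft_tradeoff(duration_ms, fft_size, sample_rate=SAMPLE_RATE):
    """Return (whole non-overlapping frames, width of one FFT bin in Hz)."""
    total_samples = int(duration_ms * sample_rate / 1000)
    frames = total_samples // fft_size      # longer window -> fewer frames
    bin_width_hz = sample_rate / fft_size   # longer window -> finer bins
    return frames, bin_width_hz


# Compare a few FFT sizes for the shorter (152 ms) source sound.
for size in (256, 512, 1024, 2048):
    frames, bin_hz = fft_tradeoff(152, size)
    print(f"FFT size {size}: {frames} frames, {bin_hz:.1f} Hz per bin")
```

Under these assumptions a 152 ms sound at size 512 gives 13 whole frames with bins about 86 Hz wide. Doubling the size halves the bin width but also roughly halves the number of frames, and so the number of chords.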

After receiving the sound, the subpatch plays back each FFT frame slowly (over a period of 10 seconds). This is fed to a fiddle~ object, which is set to output the top five sinusoidal components of the current frame (and their relative volumes). These are quantized to the nearest semitone and recorded as MIDI data, fed to a sequencer and sampler, and rendered as piano chords. Each resulting chord is therefore a representation of one FFT window.
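The selection and quantization step can be approximated outside Max as follows. This is a minimal Python sketch, not the patch itself: it assumes standard twelve-tone equal temperament with A4 = 440 Hz = MIDI note 69, uses amplitude only to rank components, and merges components that land on the same key, which is a choice made here for illustration. The names `freq_to_midi`, `midi_to_name`, and `frame_to_chord` are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def freq_to_midi(freq_hz):
    """Quantize a frequency to the nearest equal-tempered MIDI note number."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def midi_to_name(note):
    """Name a MIDI note using the convention that note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


def frame_to_chord(components, voices=5):
    """Turn one frame's (freq_hz, amplitude) pairs into a sorted chord.

    Keeps the `voices` loudest components, quantizes each to a semitone,
    and merges any that fall on the same key (an assumption of this sketch).
    """
    loudest = sorted(components, key=lambda c: c[1], reverse=True)[:voices]
    return sorted({freq_to_midi(freq) for freq, _ in loudest})


# Made-up components for illustration only; not data from the patch.
frame = [(440.0, 0.9), (452.0, 0.8), (1000.0, 0.5),
         (2500.0, 0.3), (3100.0, 0.2), (5000.0, 0.05)]
chord = frame_to_chord(frame)
print([midi_to_name(n) for n in chord])
```

In this made-up frame, 440 Hz and 452 Hz both round to the same key, which shows how much detail the semitone grid discards.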

I chose piano because I am currently working on a piano piece. Whether any material generated with this system makes it into that piece remains to be seen.

Example 1: coin drop
(Unfortunately, because the sounds are so short, good spectrograms of them were not available.)

Figure 1: original sound (1.wav)

Figure 2: the 13 resultant FFT frames, resynthesized, at a window size of 512. (2.wav)

Here, we can see what will turn into the chords.

Figure 3: the top 5 sinusoidal components of each FFT frame, resynthesized. (3.wav)

Notice some very nice amplitude modulation/beating.

Figure 4: the above frequencies and amplitudes, quantized to MIDI data and sent to a piano sampler. (4.wav)

It's a stereo file, with the two channels overlaid here for convenience.

Figure 5: score (no dynamic markings)

(Accidentals carry through the bar.) I am not sure why this has so many white keys.

Example 2: whip crack

Figure 6: source sound (491 milliseconds) (6.wav)

Figure 7: individual FFT frames (7.wav)

Figure 8: Top 5 sinusoidal components of each FFT frame (8.wav)

Figure 9: transcribed for piano (9.wav)

For comparison's sake, I repeated the process with an FFT size of 1024, which yields fewer frames but more data in each. The result is totally different.

Figure 10: redone on piano from FFT size 1024 (10.wav)

In my abstract, I mention that I am mining a sound for "prominent tones". Obviously, there is no such thing, since it is impossible to describe a short and noisy sound in five voices. Yet the result is not at all random, either. I feel it has an interesting mix of order and disorder: some pitches (most likely the resonances of the surfaces) carry through, or become a sort of pedal tone. One could almost say that each transcription has its own abstract tonality (though determined as much by the FFT size and other chaotic factors as by any physical properties), with emphases on certain tones. Whether this is real or merely an illusion produced by such a violent quantization to MIDI pitches, I do not know.