An unplanned process

We like to think that the idea behind Pipa was born out of a holy mix of interest, curiosity, a great portion of passion, dedication and some insanity. But the other day we got a question from a beta user – do we use a structured creative process when creating products like Pipa? The short answer was that we don’t. Or hmm, maybe... We realized that we didn’t really know, so we wrote down the different steps we took when creating Pipa. Here are our 4 steps - from nothing, to Pipa - our expressive vocal synthesizer.

1. Experiments with time and pitch

A few years back our development team started to research different techniques in the time stretching and pitch shifting realm. At the time the purpose of this wasn't really to make a new instrument or effect, it was more about experimentation and gaining knowledge about how sound would react to different kinds of processing. In short - we were curious and wanted to learn.

The experiments covered various kinds of both known and unknown techniques. One of the conclusions made was that a properly tweaked granulator did a fairly good job for both time stretching and pitch shifting, but it messed up the sound a little too much when it came to phase.

This led to the conceptual idea that if a granulator was aware of each grain's phase, it would be possible to take control over both time and pitch with far fewer artifacts. If each grain was more like a wavetable, and an actual sound could be divided into a set of wavetables, it should be possible to play all these wavetables the same way a granulator plays grains, with the difference that everything would always be phase aligned. The big issue was how to take an audio recording and convert it into thousands of wavetables… We are used to editing audio, but cutting up an audio file like this seemed almost impossible.
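To make the phase problem concrete, here is a minimal sketch (an illustration only, not Pipa's actual code). It mixes two single-cycle snippets at the midpoint of a crossfade, once phase aligned and once half a cycle apart. The helper names `sine_cycle`, `mix_equal` and `peak` are made up for this example.

```python
import math

def sine_cycle(length, phase_offset=0.0):
    """One cycle of a sine wave; phase_offset is given in cycles (0.5 = half a cycle)."""
    return [math.sin(2.0 * math.pi * (i / length + phase_offset)) for i in range(length)]

def mix_equal(a, b):
    """Midpoint of a linear crossfade: each snippet contributes 50 %."""
    return [0.5 * (x + y) for x, y in zip(a, b)]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

# Two snippets that start at the same phase keep their full level when mixed.
aligned = mix_equal(sine_cycle(64), sine_cycle(64))

# Two snippets half a cycle apart cancel each other out almost completely.
misaligned = mix_equal(sine_cycle(64), sine_cycle(64, phase_offset=0.5))

print(round(peak(aligned), 3))     # 1.0
print(round(peak(misaligned), 3))  # 0.0
```

Real grains rarely land exactly half a cycle apart, so a classic granulator gets partial cancellation that varies from grain to grain. Snippets that all start at the same point in the cycle avoid that.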

2. Limitations lead to new ideas

We started off by recording different instruments and investigating the audio in all possible ways. Quite soon we realized that the wavetable idea would not work on any percussive instruments. There needed to be a clear fundamental present in each sample, since the cycle of the fundamental frequency is what sets the start and end of each wavetable. It really didn't matter how many harmonics there were or how strong they were; as long as the sound contained one distinct pitch it could work. To be able to determine every grain, the sound also needed to be recorded in a very dry environment, since any acoustic reflections would interfere with nearby cycles.

If the above conditions were met...

A) a dry recording
B) containing one fundamental (a monophonic sound) with natural harmonics
C) without percussive elements

...it would be possible to create a tool that splits an audio file into thousands of wavetables where each table has a unique length corresponding to the fundamental's frequency.
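As a rough sketch of what "one table per cycle" means, the example below cuts a synthetic monophonic tone at every upward zero crossing. This is a deliberately naive stand-in: the actual analyzer is not described here, and zero-crossing detection is an assumption chosen only because it is easy to follow. The function names and the test signal are hypothetical.

```python
import math

def upward_zero_crossings(samples):
    """Indices where the signal passes from negative to non-negative."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < 0.0 <= samples[i]]

def split_into_cycles(samples):
    """Cut the signal into snippets that each span one cycle of the fundamental."""
    marks = upward_zero_crossings(samples)
    return [samples[a:b] for a, b in zip(marks, marks[1:])]

# Synthetic "dry, monophonic" test tone: the fundamental glides from 100 Hz
# to 200 Hz over one second and carries a weaker second harmonic.
SAMPLE_RATE = 8000
phase = 0.0
signal = []
for n in range(SAMPLE_RATE):
    freq = 100.0 + 100.0 * n / SAMPLE_RATE
    phase += 2.0 * math.pi * freq / SAMPLE_RATE
    signal.append(math.sin(phase) + 0.3 * math.sin(2.0 * phase))

cycles = split_into_cycles(signal)
lengths = [len(c) for c in cycles]

# Each cycle gets its own length: about 80 samples at the start (100 Hz)
# and about 40 samples at the end (200 Hz).
print(len(cycles), lengths[0], lengths[-1])
```

The sketch only works because the test tone satisfies conditions A to C. Strong harmonics, reverb tails or percussive transients add extra zero crossings, which is one reason a real analyzer needs to be much more careful than this.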

Zoom view of one cycle in a waveform

A lot of development went into inventing this audiofile-to-wavetable-analyzer-and-converter algorithm, but after some tweaking we were able to extract a set of wavetables with a defined pitch and length from an audio recording automatically. Putting these tables into our updated granulator (ok, we ended up rewriting the whole granulator code base we initially built) we had a really nice sounding time and pitch machine! The only issues were that it didn't work for most sounds, and the pitching surely didn't work in real time because the analysing part consumed way too much CPU.
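The sketch below shows one way such tables could be played back so that time and pitch are controlled separately. It is an assumption-laden illustration, not the granulator described above: each cycle is resampled to a fixed table size with linear interpolation, a phase accumulator sets the pitch, and the position in the output decides which table is read. All names (`resample_cycle`, `render`, `count_cycles`) are hypothetical.

```python
import math

def resample_cycle(cycle, table_size):
    """Linearly resample one variable-length cycle to table_size samples."""
    n = len(cycle)
    table = []
    for i in range(table_size):
        pos = i * n / table_size
        k = int(pos)
        frac = pos - k
        # The cycle is treated as periodic, so the last sample wraps to the first.
        table.append((1.0 - frac) * cycle[k] + frac * cycle[(k + 1) % n])
    return table

def render(tables, freq_hz, duration_s, sample_rate=8000):
    """Step through `tables` over duration_s while oscillating at freq_hz."""
    size = len(tables[0])
    total = int(duration_s * sample_rate)
    phase = 0.0  # position inside the current cycle, 0.0 <= phase < 1.0
    out = []
    for n in range(total):
        # Time axis: how far we are through the output picks the table.
        table = tables[min(int(n / total * len(tables)), len(tables) - 1)]
        # Pitch axis: the phase accumulator picks the position inside the table.
        pos = phase * size
        k = int(pos) % size
        frac = pos - int(pos)
        out.append((1.0 - frac) * table[k] + frac * table[(k + 1) % size])
        phase = (phase + freq_hz / sample_rate) % 1.0
    return out

def count_cycles(samples):
    """Number of upward zero crossings, a rough cycle count."""
    return sum(1 for i in range(1, len(samples)) if samples[i - 1] < 0.0 <= samples[i])

# Two source cycles of different length (80 and 40 samples), as an analyzer might deliver.
long_cycle = [math.sin(2.0 * math.pi * i / 80) for i in range(80)]
short_cycle = [math.sin(2.0 * math.pi * i / 40) for i in range(40)]
tables = [resample_cycle(c, 64) for c in (long_cycle, short_cycle)]

slow_low = render(tables, freq_hz=100.0, duration_s=1.0)
fast_low = render(tables, freq_hz=100.0, duration_s=0.5)   # same pitch, half the time
slow_high = render(tables, freq_hz=200.0, duration_s=1.0)  # same time, double the pitch

print(len(slow_low), len(fast_low), len(slow_high))  # 8000 4000 8000
```

Because every table starts at the same point in its cycle, switching from one table to the next does not introduce a phase jump, whatever duration or pitch is requested.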

3. Exploring practical uses

As some of you may know, Klevgrand created Jussi many years back - a purely synthesized voice instrument that became quite popular in the music production industry. Jussi (the virtual instrument) with its characteristic sound has been heard in all kinds of genres and contexts, everything from live performances at the London Royal Concert Hall to pop music productions that reached the Billboard charts. Users have asked for a female version of Jussi, and we've certainly tried to make that happen, but it just didn't sound good enough to be released. Jussi is Jussi and doesn't seem to want to be anything else…

Based on the experience with Jussi and how much we actually still like playing around with it, we decided to make our new wavetable-granulator-thing into some kind of vocal synth. Our priority was to make it sound organic and musical (rather than like an actual real voice), and to make it playable on a keyboard with lots of possibilities to control formants and dynamics seamlessly. We knew Jussi was a hit and wanted to create an instrument in the same sort of sound space.

How Pipa morphs between the wavetables in 3 dimensions

We put our designer Sebastian - who also happens to be a professional singer - in the recording booth and made him record every note in his range across a set of vowels we had chosen (U, O, Y, A). All this in three different dynamics. After some editing we ended up with ~40 audio files that were analyzed and converted to about 35000 wavetables.

4. Evaluation

With all this content we were now able to replay wavetables just as we wanted, in three different dimensions: pitch, vowels and dynamics. Because of the concept of wavetable synthesis, we could also crossfade between the tables without messing up phase. After some proof of concept and prototyping we began to develop a brand new vocal instrument (where Sebastian’s wife Linnea, who also happens to be a professional singer, appears as the female voice) that we decided to call PIPA, which is Swedish slang for throat. A lot of work remained, but we were confident that this idea would fly!
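Crossfading in three dimensions can be sketched as a chain of linear crossfades between the eight nearest tables. How Pipa actually interpolates is not stated here, so the trilinear scheme, the grid layout `grid[pitch][vowel][dynamics]` and the function names below are assumptions made for illustration.

```python
import math

def lerp_tables(a, b, t):
    """Linear crossfade between two equally long tables; t=0 gives a, t=1 gives b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def morph(grid, pitch, vowel, dynamics):
    """Blend the 8 corner tables in grid[p][v][d]; each coordinate is in 0.0..1.0."""
    def along_dynamics(p, v):
        return lerp_tables(grid[p][v][0], grid[p][v][1], dynamics)

    def along_vowel(p):
        return lerp_tables(along_dynamics(p, 0), along_dynamics(p, 1), vowel)

    return lerp_tables(along_vowel(0), along_vowel(1), pitch)

def sine_table(size, amplitude):
    """Stand-in for a recorded cycle: one sine cycle with the given amplitude."""
    return [amplitude * math.sin(2.0 * math.pi * i / size) for i in range(size)]

# Eight phase-aligned corner tables that differ only in level,
# so the result of the blend is easy to check by ear or by eye.
grid = [[[sine_table(64, 1 + p + 2 * v + 4 * d) for d in (0, 1)]
         for v in (0, 1)]
        for p in (0, 1)]

corner = morph(grid, 0.0, 0.0, 0.0)
middle = morph(grid, 0.5, 0.5, 0.5)

print(round(max(corner), 3))  # 1.0
print(round(max(middle), 3))  # 4.5
```

Since all corner tables start at the same phase, the blend is a clean weighted average at every sample, which is why this kind of crossfade does not suffer from the cancellation a classic granulator can produce.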

The finished vocal synth - Pipa