Impress: a machine learning approach to audio affect prediction


Soundscape composition is the artistic processing and combination of sound recordings to trigger associations and memories in a listener. Impress was designed to predict the affect of soundscapes at interactive speeds.
Impress was motivated by the lack of a measurement tool to help soundscape performers elicit intended responses from listeners.

Impress facilitates the acquisition of audio training data from real-world environments. It was also designed to give performers an automatic indication of affect during live performance along two axes, pleasantness and eventfulness:

  • Pleasantness is the association of pleasure with a stimulus, also known as valence.
  • Eventfulness relates to the arousal response.

System Description

Impress logs user responses together with features extracted from the audio signal. The extracted audio features are:

  • Mel-frequency cepstral coefficients;
  • Perceptual spread;
  • Perceptual sharpness; and
  • Total loudness.

Impress is used in two modes:

  1. Training mode. An n-second audio buffer acquires samples from the microphone FIFO. Data is logged when the user enters a response to the current audio environment on the affect grid. The response is registered when the user's touch leaves the grid. The mean and standard deviation of each audio feature are calculated from the audio buffer and stored in a database with the response values.
  2. Prediction mode. A multiple linear regression (MLR) model is built for each axis of the affect grid, with the model parameters estimated by ordinary least squares. An audio buffer is updated using the same process as in training mode. The buffer is repeatedly copied and its audio features extracted. The means and standard deviations of the audio features are then used to make the affect prediction, which is displayed on the affect grid.
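
The two modes above can be sketched roughly as follows. This is only an illustrative sketch in Python with NumPy, not the Impress implementation: the function names (summarise_buffer, fit_ols, predict), the use of population standard deviation, the feature count, and the synthetic data standing in for the response database are all assumptions made for the example.

```python
import numpy as np


def summarise_buffer(frame_features):
    """Collapse per-frame features (n_frames x n_features) into one vector:
    the feature means followed by the feature standard deviations."""
    frames = np.asarray(frame_features, dtype=float)
    # Population standard deviation (ddof=0) is an assumption here.
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])


def fit_ols(X, y):
    """Ordinary least squares fit; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef


def predict(coef, x):
    """Evaluate a fitted linear model on one summary vector."""
    return float(coef[0] + np.dot(coef[1:], x))


# Synthetic stand-in for the training database (not real Impress data).
rng = np.random.default_rng(0)
n_responses, n_frames, n_features = 40, 20, 3
X = np.array([
    summarise_buffer(rng.normal(size=(n_frames, n_features)))
    for _ in range(n_responses)
])

# Made-up linear "responses" so the fit can be checked exactly.
w_pleasant = np.array([0.4, -0.2, 0.1, 0.3, 0.0, -0.1])
w_eventful = np.array([-0.3, 0.5, 0.2, 0.0, 0.1, 0.2])
y_pleasant = 0.5 + X @ w_pleasant
y_eventful = -0.2 + X @ w_eventful

# One model per axis of the affect grid.
pleasant_model = fit_ols(X, y_pleasant)
eventful_model = fit_ols(X, y_eventful)

# Prediction mode: summarise the current buffer, then evaluate both models.
current = summarise_buffer(rng.normal(size=(n_frames, n_features)))
point = (predict(pleasant_model, current), predict(eventful_model, current))
print(point)
```

Fitting the two axes as separate models mirrors the description above. In a live setting the last three lines would run each time the buffer is copied, with real extracted features in place of the random frames.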

Thorogood, M., Pasquier, P. (2013) "Impress: A Machine Learning Approach to Soundscape Affect Classification in a Music Performance Environment", New Interfaces for Musical Expression (NIME), Seoul, Korea.
