Technology Review describes some recent research from the University of Hong Kong. Students there set out to use a neural network to classify music across ten genres. Given the number of variables in a musical piece, this is considered one of the harder problems in AI. The project achieved a considerable success rate of around 87%. As the article explains, this high rate can be attributed to the kind of network used and, in particular, its depth.
Neural networks, as I understand them, are typically constructed in layers. The first group of artificial neurons accepts input and feeds its outputs into the next group, which then feeds into a successive group, and so on until the final layer. During training, the weights on the connections between layers are strengthened or weakened; when the trained network is applied, the same layering refines the results as information flows through it. The students used a network with a particular wiring scheme, a convolutional network, which is usually used in visual recognition. While their network had only three layers, according to the article this is unusually deep, which helped drive its classification accuracy.
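To make the layering concrete, here is a minimal sketch in Python of information flowing through three layers. It is not the students' network: it uses plain fully connected layers rather than a convolutional wiring scheme, and the function names, weights, and input features are all made up for illustration.

```python
import math

def layer_forward(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of every input, then a sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs

def network_forward(inputs, layers):
    """Feed the outputs of each layer into the next until the final layer."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical weights and biases; training would adjust these values.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # 3 inputs -> 2 neurons
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),             # 2 neurons -> 2 neurons
    ([[0.7, -0.3]], [0.2]),                              # 2 neurons -> 1 output
]

features = [0.2, 0.4, 0.6]  # made-up stand-ins for audio features
scores = network_forward(features, layers)
print(scores)
```

Training amounts to nudging the numbers in `layers` up or down until the final output matches the known genre more often; adding depth means adding more entries to that list.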
Unfortunately, the high success rate was limited to the initial training library. When the students introduced a wider selection of music from outside the lab, the network didn’t fare so well. Their assumption that more training would help is reasonable, as settling into a local optimum can be a problem with supervised learning systems. The article doesn’t mention the training speed, but if the matching speed it cites is any indication, it may not be long before the students’ hypothesis about more general classification is tested.
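The gap between the training library and outside music can be illustrated with a toy sketch. This one swaps in a nearest-neighbour classifier rather than a convolutional network, purely because it fits in a few lines; the two-number "features", the genre labels, and every sample below are invented for illustration and are not the students' data.

```python
def nearest_neighbour_label(sample, library):
    """Return the label of the library entry closest to the sample."""
    _, best_label = min(
        library,
        key=lambda entry: sum((a - b) ** 2 for a, b in zip(entry[0], sample)),
    )
    return best_label

def accuracy(samples, library):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(
        1 for features, label in samples
        if nearest_neighbour_label(features, library) == label
    )
    return correct / len(samples)

# Invented toy data: (features, genre) pairs.
training_library = [
    ((0.1, 0.9), "classical"),
    ((0.2, 0.8), "classical"),
    ((0.9, 0.1), "metal"),
    ((0.8, 0.2), "metal"),
]
outside_samples = [
    ((0.15, 0.85), "classical"),
    ((0.6, 0.5), "classical"),  # atypical piece, sits nearer the metal examples
    ((0.85, 0.15), "metal"),
    ((0.4, 0.55), "metal"),     # atypical piece, sits nearer the classical examples
]

print("training library:", accuracy(training_library, training_library))
print("outside samples: ", accuracy(outside_samples, training_library))
```

The classifier scores perfectly on the library it was built from and noticeably worse on the outside samples, which is the same pattern the article reports. Measuring both numbers separately is the only way to see it.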
I am most interested to see further application of this particular type of network for archival purposes. Volunteering on a digitization project gives me plenty of opportunity to consider the costs of identifying and adequately tagging works once they are converted. I’d be willing to bet a success rate in the high eighties is pretty close to what human volunteers achieve on average. A successfully deployed neural network could act as a force multiplier on top of the efforts of volunteers, speeding their ability to make the vast body of pre-digital works that much more available.