feeds | grep links > Internet Kill Switch, Fair Use before DRM in Brazil, and More

feeds | grep links > Self Replicating MakerBot, AI Predicting Manhole Explosions, Mousing without the Mouse, and More

  • Self replicating MakerBot
    Via Nat’s Four Short Links on O’Reilly Radar. As he notes, highly appropriate, since MakerBot started as a modified RepRap, a project that was all about machines being able to reproduce themselves.
  • AI used to predict manhole explosions in NYC
    I had no idea this problem was of a scale worth tackling with machine learning, but according to Slashdot, apparently it is. It sounds to me like a pretty big multivariate analysis depending on laboriously collected data and observations from the field. Quite apart from the risk of a heavy iron manhole cover being ejected in a gout of flame and gas, the idea of using an AI to help stay on top of the mammoth maintenance challenge for a city as old as New York greatly appeals to me.
  • NetApp threatens sellers of appliances running ZFS
    What the Slashdot summary glosses over, but the linked articles make a bit more clear, is that there is a history to these complaints that goes back a ways. The same company apparently threatened Sun repeatedly for much the same reason it is now threatening NAS maker Coraid. I find it hard to credit that there isn’t a less legally fraught file system with similar capabilities originating more directly from the FLOSS world.
  • Mousing without a mouse
    Priya Ganapati describes an MIT project from Pranav Mistry, the creator of Sixth Sense. It definitely seems strongly related to that earlier work, using commodity hardware to track your mousing hand as you pantomime the gestures you’ve become used to, driving your computer without actually needing a mouse. Given the rate at which scroll wheels get gummed up, I would gladly invest many times the quoted $20 figure to never have to clean any part of a mouse ever again.
  • Incremental update to OLPC XO to include multitouch screen
    Via Hacker News.
  • Skype’s encryption is partially reverse engineered
  • Fan remake of Ultima VI released
  • Blizzard backs down on requiring real names in its forums

feeds | grep links > Bill to Pressure Those Who Would Break the Internet, Historic Cipher Revealed, New Developments in Weak AI, and More

Neural Network in JavaScript

I’ve seen just about everything else implemented in the lightweight scripting language of the web, so why not a neural network? I saw this via Hacker News, and it doesn’t strike me as all that different from libraries for doing heavy graphics processing in JavaScript. I could also see some distributed applications potentially using this. Think about it: modern browsers increasingly provide excellent client-side storage, useful for hanging onto locally produced results, and much smarter means for sending and receiving data, just what you would need to distribute tasks and collect results. I think a folding@home-style project that runs entirely in the browser makes a great deal of sense.

The code is licensed under an MIT license and is available on GitHub. Together those make it pretty simple to grab it and experiment quickly, whether pursuing my idea of distributed neural networks or any other application that could use such a tool in a lightweight execution environment.
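I haven’t dug into the library’s actual API, so rather than guess at it, here is a from-scratch sketch of the core idea, a tiny feedforward network learning XOR by gradient descent, written in TypeScript to stay close to the web’s own language. The layer sizes, learning rate, and epoch count are arbitrary choices of mine.

```typescript
// A from-scratch sketch: a 2-3-1 feedforward network learning XOR with
// plain stochastic gradient descent. Nothing here reflects the linked
// library; it only illustrates how small the core idea is.

const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));
const rand = (): number => Math.random() * 2 - 1;

// Weights and biases: input -> hidden (2x3), hidden -> output (3).
const w1 = [[rand(), rand(), rand()], [rand(), rand(), rand()]];
const b1 = [rand(), rand(), rand()];
const w2 = [rand(), rand(), rand()];
let b2 = rand();

function forward(x: number[]): { h: number[]; y: number } {
  const h = b1.map((b, j) => sigmoid(b + x[0] * w1[0][j] + x[1] * w1[1][j]));
  const y = sigmoid(b2 + h.reduce((s, hj, j) => s + hj * w2[j], 0));
  return { h, y };
}

const data: Array<[number[], number]> = [
  [[0, 0], 0], [[0, 1], 1], [[1, 0], 1], [[1, 1], 0],
];

const lr = 0.5;
for (let epoch = 0; epoch < 20000; epoch++) {
  for (const [x, target] of data) {
    const { h, y } = forward(x);
    const dy = (y - target) * y * (1 - y); // squared-error gradient at output
    for (let j = 0; j < 3; j++) {
      const dh = dy * w2[j] * h[j] * (1 - h[j]); // backpropagated to hidden
      w2[j] -= lr * dy * h[j];
      w1[0][j] -= lr * dh * x[0];
      w1[1][j] -= lr * dh * x[1];
      b1[j] -= lr * dh;
    }
    b2 -= lr * dy;
  }
}

for (const [x] of data) console.log(x, forward(x).y.toFixed(3));
```

Nothing above needs more than plain number arrays, which is part of why the browser, with its improving client-side storage and messaging, looks like a plausible host for distributing exactly this kind of work.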

feeds | grep links > State of WikiLeaks’ Site, What to Expect in Firefox 4, and More

Album Composed with Algorithmic Swarm

This story from Make is a little different than the couple of other recent AI music stories I’ve written up. In those instances, the music was generated or processed at a much lower level, in a more integrated fashion. As near as I can tell, the work here, by Evan Merz, is more like an audio assemblage that happens to be driven by a predator-prey cellular automaton. The inspiration and borrowing from Cage is hardly surprising, as so much of his work was governed by meticulously following how systems unfold from seemingly simple rules.

Swarm Controlled Sampler – Becoming Live from Evan Merz on Vimeo.

I enjoyed the three tracks in the embedded video. They are more coherent than I would have guessed, but they do have a thrilling edge to them, arising either from the structural changes the CA wrought or just from the knowledge that this form of primal, computational complexity was harnessed in creating them.
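For the curious, a predator-prey cellular automaton of the general kind driving the piece is only a screenful of code. To be clear, the rules, grid size, and the idea of mapping population counts onto a sampler below are my own illustrative guesses, not Merz’s actual system.

```typescript
// Toy predator-prey cellular automaton on a wrapping grid. The specific
// rules here are invented for illustration, not taken from the piece.

type Cell = "empty" | "prey" | "predator";
const SIZE = 16;

let grid: Cell[][] = Array.from({ length: SIZE }, () =>
  Array.from({ length: SIZE }, (): Cell => {
    const r = Math.random();
    return r < 0.1 ? "predator" : r < 0.4 ? "prey" : "empty";
  })
);

// Count neighbors of a given kind in the 8 surrounding cells (torus wrap).
function neighbors(g: Cell[][], x: number, y: number, kind: Cell): number {
  let n = 0;
  for (let dx = -1; dx <= 1; dx++)
    for (let dy = -1; dy <= 1; dy++) {
      if (dx === 0 && dy === 0) continue;
      if (g[(y + dy + SIZE) % SIZE][(x + dx + SIZE) % SIZE] === kind) n++;
    }
  return n;
}

function step(g: Cell[][]): Cell[][] {
  return g.map((row, y) =>
    row.map((cell, x): Cell => {
      // Prey crowded by predators is eaten; starving predators die off;
      // empty cells surrounded by enough prey see reproduction.
      if (cell === "prey") return neighbors(g, x, y, "predator") >= 2 ? "predator" : "prey";
      if (cell === "predator") return neighbors(g, x, y, "prey") === 0 ? "empty" : "predator";
      return neighbors(g, x, y, "prey") >= 3 ? "prey" : "empty";
    })
  );
}

// Each generation's population could gate a sampler, e.g. prey count -> volume.
for (let t = 0; t < 4; t++) {
  grid = step(grid);
  const prey = grid.flat().filter(c => c === "prey").length;
  console.log(`generation ${t}: ${prey} prey cells`);
}
```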

Astronomers Use AI to Help Classify Galaxies

Slashdot links to a Singularity Hub article describing a project that is forehead-slappingly obvious in hindsight.

Scientists are teaching an artificial intelligence how to classify galaxies imaged by telescopes like the Hubble. Manda Banerji at the University of Cambridge along with researchers at University College London, Johns Hopkins and elsewhere, has succeeded in getting the program to agree with human analysis at an impressive rate of more than 90%.

The article goes on to explain how the team used data from Galaxy Zoo to train the AI. Galaxy Zoo is a crowd-sourced effort to aggregate small bits of highly distributed human effort to classify galaxies in astronomical imagery. It has produced some startlingly good results thanks to its efforts at cross-verification. It makes perfect sense as a training set for a directed learning program.
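To make the training-set idea concrete, here is a minimal sketch of turning crowd-sourced votes into labels for a supervised learner. The feature values, the vote data, and the simple nearest-centroid classifier are all stand-ins of mine; the team’s actual features and learning algorithm will certainly differ.

```typescript
// Sketch: Galaxy Zoo-style crowd votes become labels for a supervised
// learner. Features here are made-up numeric summaries of an image.

type Galaxy = { features: number[]; votes: string[] };

// Majority vote across volunteers gives each image a training label,
// which is roughly how cross-verified crowd effort becomes a training set.
function majorityLabel(votes: string[]): string {
  const counts = new Map<string, number>();
  for (const v of votes) counts.set(v, (counts.get(v) ?? 0) + 1);
  return [...counts.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

// Train: average the feature vectors per class (a nearest-centroid model).
function train(data: Galaxy[]): Map<string, number[]> {
  const sums = new Map<string, { sum: number[]; n: number }>();
  for (const g of data) {
    const label = majorityLabel(g.votes);
    const entry = sums.get(label) ?? { sum: g.features.map(() => 0), n: 0 };
    entry.sum = entry.sum.map((s, i) => s + g.features[i]);
    entry.n += 1;
    sums.set(label, entry);
  }
  return new Map(
    [...sums].map(([l, { sum, n }]): [string, number[]] => [l, sum.map(s => s / n)])
  );
}

// Classify a new galaxy by the nearest class centroid.
function classify(centroids: Map<string, number[]>, features: number[]): string {
  let best = "", bestDist = Infinity;
  for (const [label, c] of centroids) {
    const d = c.reduce((s, ci, i) => s + (ci - features[i]) ** 2, 0);
    if (d < bestDist) { bestDist = d; best = label; }
  }
  return best;
}

const data: Galaxy[] = [
  { features: [0.9, 0.1], votes: ["spiral", "spiral", "elliptical"] },
  { features: [0.8, 0.2], votes: ["spiral", "spiral"] },
  { features: [0.1, 0.9], votes: ["elliptical", "elliptical", "elliptical"] },
];
console.log(classify(train(data), [0.85, 0.15])); // "spiral"
```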

The AI will be used to shoulder the more trivial tasks in many coming astronomical projects so that human input can be applied to best effect, on the harder problems inherent in sifting through the reams of data.

Using Neural Networks to Classify Music

Technology Review describes some recent research from the University of Hong Kong, where students set about using a neural network to classify music spread across ten genres. Given the number of variables in a musical piece, this is considered one of the harder problems in AI. The project achieved a considerable success rate, around 87%. As the article explains, this high ratio can be attributed to the kind, and in particular the depth, of the network used.

Neural networks, as I understand them, are typically constructed in layers. The first group of artificial neurons accepts input and plugs its outputs into the next group, which then plugs into a successor group, and so on until the final layer. During training this arrangement strengthens or weakens the connection weights, and the same layering pays off when the network is applied, refining the results as information flows through. The students used a network with a particular wiring scheme, a convolutional network, of the kind usually used in visual recognition. While their network had only three layers, according to the article this is unusually deep, which helped drive the strong classification results.
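Here is a toy version of the layer-to-layer plumbing just described, using one-dimensional convolutions since the input is audio. The kernel weights, layer count, and ReLU activation are invented for illustration; a trained network would have learned its own.

```typescript
// Sketch of stacked layers: each layer's outputs become the next layer's
// inputs. The 1-D convolution stands in for the convolutional wiring the
// students reportedly used; all numbers below are made up.

const relu = (x: number): number => Math.max(0, x);

// A 1-D convolution: slide a small kernel across the input, producing one
// output per window. Shared kernel weights make a layer "convolutional".
function conv1d(input: number[], kernel: number[]): number[] {
  const out: number[] = [];
  for (let i = 0; i + kernel.length <= input.length; i++) {
    let sum = 0;
    for (let k = 0; k < kernel.length; k++) sum += input[i + k] * kernel[k];
    out.push(relu(sum));
  }
  return out;
}

// Three layers, wired output-to-input, just as the paragraph describes.
const layers = [
  [0.25, 0.5, 0.25], // invented kernel weights; training would learn these
  [-1, 2, -1],
  [0.5, 0.5],
];

let signal = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8, 0.6, 0.3]; // toy audio features
for (const kernel of layers) signal = conv1d(signal, kernel);
console.log(signal); // final-layer activations a classifier head would consume
```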

Unfortunately, the high success rate was limited to the initial training library. When the students introduced a wider selection of music from outside the lab, the network didn’t fare so well. Their assumption that more training would help is plausible, since over-fitting to the training data is a common problem with directed learning systems. The article doesn’t mention training speed, but if the quoted matching speed is any indication, it may not be long before the students’ hypothesis about more general classification is tested.

I am most interested to see further application of this particular type of network for archival purposes. Volunteering on a digitization project gives me plenty of opportunity to consider the cost of identifying and adequately tagging works once they are converted. I’d be willing to bet a success rate in the high eighties is pretty close to what human volunteers achieve on average. A successfully deployed neural network could act as a force multiplier on top of volunteers’ efforts, speeding their ability to make the vast body of pre-digital works that much more available.

Another Approach to Machine Learning

This profile by Katie Drummond at Wired of a Darpa project in AI caught my eye. In the past year or so, I’ve seen eulogies for sub-fields of artificial intelligence and announcements of the re-invigoration of the overall field. I like reading about this pair of researchers quietly getting on with it.

The problem Yann LeCun and Rob Fergus at NYU are tackling is how to get a machine to learn without the labor-intensive guidance and training that is usually required. Identifying where field ends and object begins, and vice versa, is a big problem in both computer science and philosophy.

Existing software programs rely heavily on human assistance to identify objects. A user extracts key feature sets, like edge statistics (how many edges an object has, and where they are) and then feeds the data into a running algorithm, which uses the feature sets to recognize the visual input.

“People spend huge amounts of time building these feature sets, figuring out which are better or more accurate, and then refining them,” LeCun told Danger Room. “The question we’re asking is whether we can create computers that automatically learn feature sets from data. The brain can do it, so why not machines?”

Drummond includes a decent high-level explanation of the pair’s approach, a method of layering masks that seems similar to certain aspects of neural networks. If they make progress beyond these promising beginnings, it will have implications not just for hard problems in computing but perhaps also for how our own brains tackle these challenges of object identification and self-learning.
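The quoted edge-statistics example is easy to make concrete. Below is a crude hand-built feature of the kind being described; the threshold and the choice to count only horizontal neighbors are exactly the sort of human decisions LeCun and Fergus want a machine to learn from data instead. The pixel values are made up.

```typescript
// A hand-crafted feature: count sharp horizontal transitions in a
// grayscale image. A human chose the rule and the threshold, which is
// precisely the labor the researchers aim to replace with learning.

type Image = number[][]; // rows of grayscale pixels in [0, 1]

function edgeCount(img: Image, threshold = 0.3): number {
  let edges = 0;
  for (const row of img) {
    for (let x = 1; x < row.length; x++) {
      if (Math.abs(row[x] - row[x - 1]) > threshold) edges++;
    }
  }
  return edges;
}

const img: Image = [
  [0.0, 0.0, 1.0, 1.0],
  [0.0, 0.1, 0.9, 1.0],
];
console.log(edgeCount(img)); // one hand-built number in a larger feature vector
```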

Google Announces Prediction API

Slashdot highlights one of the more interesting announcements to come out of the Google I/O developer conference. This particular API is invite-only at the moment, which may be wise as a way to throttle usage, at least initially. As big as Google’s data centers are, I cannot imagine they do much to blunt the heavy toll of computation that goes into a system like this.

The Prediction API is a simple-to-program, supervised learning system. You feed data in and guide the system to learn how to characterize that data. As a result, it is able to extrapolate properties and trends from the data, depending on how you trained it. There are some good example uses on the web site; a sketch of the general workflow follows the list.

  • Language identification
  • Customer sentiment analysis
  • Product recommendations & upsell opportunities
  • Message routing decisions
  • Diagnostics
  • Document and email classification
  • Suspicious activity identification
  • Churn analysis
  • And many more…
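Since the Prediction API itself is a hosted REST service behind an invite wall, I won’t guess at its exact calls. Instead, here is a minimal naive Bayes classifier illustrating the same train-then-predict workflow for one of the listed uses, customer sentiment. The training examples and the smoothing constant are made up.

```typescript
// Train-then-predict, the workflow the Prediction API wraps as a service,
// shown here as a tiny local naive Bayes text classifier.

type Example = { label: string; text: string };

function train(examples: Example[]) {
  const wordCounts = new Map<string, Map<string, number>>();
  const labelCounts = new Map<string, number>();
  for (const { label, text } of examples) {
    labelCounts.set(label, (labelCounts.get(label) ?? 0) + 1);
    const counts = wordCounts.get(label) ?? new Map<string, number>();
    for (const w of text.toLowerCase().split(/\s+/)) {
      counts.set(w, (counts.get(w) ?? 0) + 1);
    }
    wordCounts.set(label, counts);
  }
  return { wordCounts, labelCounts, total: examples.length };
}

function predict(model: ReturnType<typeof train>, text: string): string {
  let best = "", bestScore = -Infinity;
  for (const [label, counts] of model.wordCounts) {
    // log P(label) + sum of log P(word | label), with add-one smoothing;
    // 1000 is an arbitrary stand-in for vocabulary size.
    let score = Math.log(model.labelCounts.get(label)! / model.total);
    const size = [...counts.values()].reduce((a, b) => a + b, 0);
    for (const w of text.toLowerCase().split(/\s+/)) {
      score += Math.log(((counts.get(w) ?? 0) + 1) / (size + 1000));
    }
    if (score > bestScore) { bestScore = score; best = label; }
  }
  return best;
}

const model = train([
  { label: "positive", text: "love this great product" },
  { label: "positive", text: "great service very happy" },
  { label: "negative", text: "terrible broken waste of money" },
  { label: "negative", text: "very unhappy awful service" },
]);
console.log(predict(model, "great product very happy")); // "positive"
```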

The uses are really only constrained by your data. There are a couple of choices of machine learning techniques to tailor the analysis and prediction. I am curious how such a general service stacks up against more specialized, commercial software like targeted personalization and log analysis for security monitoring. I suspect it may be most useful at the lower end, to get some idea of what sort of value a more in-depth analysis could derive from your data.

Regardless, I enjoy the idea of being able to easily harness a simple machine learning system with a small bit of code. It hints at AI on tap, if you extrapolate out beyond the horizon.