By James MacLachlan
Amazing, isn’t it? What we can achieve in the world of science and technology. The telephones, televisions, and computers we used to plug in at home have all been enhanced and condensed to fit conveniently, and stylishly, into our pockets. We are constantly inventing sophisticated and intelligent ways to help us interact with the world around us. In our ever-changing, ever-advancing modern world, forms of technology that were once considered superior now feel as primitive as a Neanderthal’s stone tools when compared with the tools that we use every day. Smartphones and the internet have become integral to our lives, and it’s a struggle to imagine a life without such advanced technology.
More recently, technology has been developed that merges the real world with the digital. The Apple Watch made sure we were connected to our phones pretty much all the time, and made us feel like James Bond while we were doing it. Google’s somewhat short-lived Google Glass project attempted to seamlessly integrate the digital world with reality, also making us feel like spies, with cool, albeit rather embarrassing, glasses. And, most recently, Sony’s PlayStation VR manages to immerse gamers inside the game, meaning that you could become James Bond.
However, there’s one company in particular that is making enormous strides in new technology, with products and applications you probably use every single day, perhaps without even realising it. I’ve already mentioned it. I am, of course, talking about Google. In 2014, they acquired the British artificial intelligence company DeepMind for a hefty $500m, with the intention of working together on new advanced applications for AI. Their rather cryptic and mysterious goal is to ‘solve intelligence’, and with their latest invention, they may be one step closer to doing just that.
In collaboration with researchers from the University of Oxford, Google-DeepMind have created an AI technology that can lip-read human speech on television more accurately than a professional. This AI can understand words and phrases by focusing on the speaker’s lips and processing the many shapes and movements of their mouth into data. It can then automatically convert that raw data into complete sentences. After lengthy experimentation on various television programmes, including BBC News and Question Time, and collecting nearly five thousand hours of data in the process, the researchers compared the AI against a professional lip-reader. Both examined a random selection of two hundred clips from the entire collection of data and were tasked with transcribing the lot. The professional managed to record only 12.4% of the data without error, whereas the AI could understand a staggering 46.8% of it, nearly quadrupling the human’s result. The article also mentions that many of the AI’s mistakes were negligible, such as missing a letter ‘s’ at the end of a word.
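For the curious, the comparison above is easy to check with a few lines of Python. This is only a rough sketch: the two percentages are the ones quoted above, while the `exact_word_matches` helper and the example sentences are made up for illustration and are not taken from the researchers’ work.

```python
# Figures quoted above: share of the data transcribed without error.
human_pct = 12.4
ai_pct = 46.8

# How many times better the AI did than the professional.
ratio = ai_pct / human_pct
print(f"AI vs. professional: {ratio:.2f}x")


def exact_word_matches(reference: str, transcript: str) -> int:
    """Count positions where both transcripts contain the same word."""
    return sum(r == t for r, t in zip(reference.split(), transcript.split()))


# Hypothetical sentence pair: one dropped 's' costs a single word.
reference = "the ministers answer questions"
transcript = "the minister answer questions"
matches = exact_word_matches(reference, transcript)
print(f"{matches} of {len(reference.split())} words match exactly")
```

The ratio comes out a little under four, and the made-up sentence pair shows how a slip as small as a missing ‘s’ still counts as an error under strict word-for-word scoring.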
The potential applications of this technology are vast, with the researchers suggesting improvements to hearing aids and better speech recognition in noisy environments – a feature Google will certainly put into place for their own version of Siri. It could also aid in subtitling live television, concerts, and events. However, there have been some concerns about the negative implications of the technology, particularly for the safety of our data, but the researchers state that the AI will perform less effectively over greater distances, so we shouldn’t worry about that. Of course, not all technology can survive. Fax machines, pagers and floppy disks have all been sent to the recycle bin in the sky in favour of newer tech. Will AI like this stand the test of time? And just how far can technology advance?