A new computer software programme has the potential to lip-read more accurately than people and to help those with hearing loss, Oxford University researchers have found.

Watch, Attend and Spell (WAS) is a new artificial intelligence (AI) software system developed by Oxford in collaboration with the company DeepMind. The AI system uses computer vision and machine learning methods to learn how to lip-read from a dataset of more than 5,000 hours of TV footage, gathered from six different programmes including Newsnight, BBC Breakfast and Question Time. The videos contained more than 118,000 sentences in total, and a vocabulary of 17,500 words.

The research team compared the ability of the machine and a human expert to work out what was being said in silent video by focusing solely on each speaker’s lip movements. They found that the software system was more accurate than the professional: the human lip-reader correctly read 12 per cent of words, while the WAS software recognised 50 per cent of the words in the dataset without error. The machine’s mistakes were small, such as missing an “s” at the end of a word or misspelling a single letter.
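To make the comparison concrete, here is a minimal, illustrative sketch of one way word-level accuracy could be scored between a reference transcript and a lip-reader's output. This is an assumption for illustration only, not the researchers' evaluation protocol (published lip-reading benchmarks typically use an edit-distance-based word error rate over aligned transcripts); the example sentences are invented and are not drawn from the Oxford dataset.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference words reproduced exactly, compared position by position.

    A simplified stand-in metric: real evaluations usually align the two
    transcripts with edit distance before counting matches.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    correct = sum(r == h for r, h in zip(ref, hyp))
    return correct / len(ref) if ref else 0.0


# Hypothetical transcripts (invented for illustration):
reference = "the committee will meet again on tuesday"
machine = "the committee will meet again on tuesday"
human = "the company will be against tuesday moves"

print(word_accuracy(reference, machine))  # → 1.0
print(word_accuracy(reference, human))    # lower score for the noisier guess
```

Under a metric like this, a system that reproduces half the words of each sentence exactly would score around 0.5, in line with the 50 per cent figure reported for WAS.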