It is not an exaggeration to say that deep learning has taken the world of computer vision, and many other recognition tasks, by storm. Many of the most difficult recognition problems have seen gains over the past few years that are astonishing. Although we have seen large improvements in the accuracy of recognition as a result of Deep Neural Networks (DNNs), deep learning approaches have two well-known challenges: they require large amounts of labelled data for training, and they require a type of compute that is not amenable to current general purpose processor/memory architectures. Some companies have responded with architectures designed to address the particular type of massively parallel compute required for DNNs, including our own use of FPGAs, for example, but to date these approaches have primarily enhanced existing cloud computing fabrics. But I work on HoloLens, and in HoloLens, we’re in the business of making untethered mixed reality devices. We put the battery on your head, in addition to the compute, the sensors, and the display. Any compute we want to run locally for low-latency, which you need for things like hand-tracking, has to run off the same battery that powers everything else. So what do you do?”