Samsung Electronics Introduces A High-Speed, Low-Power NPU Solution for AI Deep Learning

Deep learning algorithms are a core element of artificial intelligence (AI): they are the processes by which a computer is able to think and learn in a way that resembles a human being. A Neural Processing Unit (NPU) is a processor optimized for deep learning algorithm computation, designed to efficiently process thousands of these computations simultaneously.

Samsung Electronics last month announced its goal to strengthen its leadership in the global system semiconductor industry by 2030 by expanding development of its proprietary NPU technology. The company recently delivered an update on this goal at the Conference on Computer Vision and Pattern Recognition (CVPR), one of the top academic conferences in the field of computer vision.

The update is the company’s On-Device AI lightweight algorithm, introduced at CVPR in a paper titled “Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss”. On-Device AI technologies compute and process data directly within the device itself. Over 4 times lighter and 8 times faster than existing algorithms, Samsung’s latest algorithm is a dramatic improvement over previous solutions and has been evaluated as key to solving potential issues in low-power, high-speed computation.

Streamlining the Deep Learning Process
Samsung Advanced Institute of Technology (SAIT) has announced that it has successfully developed On-Device AI lightweight technology that performs computations 8 times faster than existing server-based deep learning, which uses 32-bit data. By representing the data in groups of fewer than 4 bits while maintaining accurate data recognition, this method of deep learning algorithm processing is both much faster and much more energy efficient than existing solutions.

Samsung’s new On-Device AI processing technology determines, through ‘learning’, the intervals of the significant data that influence overall deep learning performance. This ‘Quantization Interval Learning (QIL)’ retains data accuracy while re-organizing the data to be represented in fewer bits than its existing size. SAIT ran experiments demonstrating that, when a server-based deep learning algorithm using 32-bit data was quantized to levels of fewer than 4 bits, QIL provided higher accuracy than other existing solutions.
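
To make the idea of representing data within an interval using fewer bits more concrete, here is a minimal sketch of plain uniform quantization in Python. It is not the QIL method from the paper: the interval (the hypothetical parameters `center` and `half_width`) is passed in by hand here, whereas QIL is described as learning its intervals. The function names are also illustrative assumptions.

```python
def quantize_to_interval(values, center, half_width, bits):
    """Clip values to [center - half_width, center + half_width] and
    snap each one to the nearest of 2**bits evenly spaced levels.

    Assumption: the interval is fixed by the caller; in QIL it would be
    a learned quantity.
    """
    steps = 2 ** bits - 1              # gaps between the lowest and highest level
    low = center - half_width
    high = center + half_width
    step = (high - low) / steps
    codes = []
    for v in values:
        clipped = min(max(v, low), high)   # values outside the interval saturate
        codes.append(round((clipped - low) / step))
    return codes, low, step


def dequantize(codes, low, step):
    """Map integer codes back to approximate real values."""
    return [low + c * step for c in codes]


values = [-2.0, -0.5, 0.25, 3.0]
codes, low, step = quantize_to_interval(values, center=0.0, half_width=1.0, bits=2)
print(codes)                          # -> [0, 1, 2, 3]
print(dequantize(codes, low, step))   # approximate reconstruction of the inputs
```

Each value is stored as a 2-bit code instead of a 32-bit number. Values inside the interval are reconstructed to within half a step, while values outside it saturate at the interval's edges, which is why the choice of interval matters for accuracy.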

When the data of a deep learning computation is represented in groups of fewer than 4 bits, the logical operations ‘and’ and ‘or’ can be used in addition to the arithmetic operations of addition and multiplication. This means that computations using the QIL process can achieve the same results as existing processes while using only 1/40 to 1/120 as many transistors.
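
One well-known way that logical operations can stand in for multiplication on low-bit data is a bit-plane dot product: each vector is split into single-bit planes, pairs of planes are combined with a bitwise AND, and the set bits are counted. The Python sketch below illustrates that general technique for small unsigned integers. It is an assumption chosen for illustration, not a description of Samsung's hardware or of the QIL paper, and the function names are hypothetical.

```python
def pack_bit_plane(values, bit):
    """Pack bit number `bit` of every value into one integer,
    using one bit position per vector element."""
    plane = 0
    for index, v in enumerate(values):
        plane |= ((v >> bit) & 1) << index
    return plane


def bitwise_dot(a, b, bits):
    """Dot product of two vectors of unsigned `bits`-bit integers,
    computed with AND, bit counting, and shifts instead of
    element-wise multiplication."""
    total = 0
    for i in range(bits):
        plane_a = pack_bit_plane(a, i)
        for j in range(bits):
            plane_b = pack_bit_plane(b, j)
            matches = bin(plane_a & plane_b).count("1")  # popcount of the AND
            total += matches << (i + j)                  # weight by 2**(i + j)
    return total


a = [3, 1, 2, 0]
b = [1, 2, 3, 3]
print(bitwise_dot(a, b, bits=2))           # -> 11
print(sum(x * y for x, y in zip(a, b)))    # -> 11, the ordinary dot product
```

The number of AND passes grows with the square of the bit width, so the approach only pays off when the data uses very few bits.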

Because this system requires less hardware and less electricity, it can be mounted directly in the device, at the point where data from an image or fingerprint sensor is obtained, before the processed data is transmitted to the necessary endpoints.

The Future of AI Processing and Deep Learning
This technology will help develop Samsung’s system semiconductor capacity and strengthen one of the core technologies of the AI era: On-Device AI processing. Unlike AI services that use cloud servers, On-Device AI technologies compute data directly within the device itself.
