Trying out the Intel Neural Compute Stick 2 – Movidius NCS2_weixin_30321709的博客-程序员宝宝

技术标签: 运维  嵌入式  raspberry pi  

Trying out the Intel Neural Compute Stick 2 – Movidius NCS2

Disclaimer: The opinions in this article (and on this website in general) are entirely mine and not those of my employer Dell EMC. Testing has been done in a short period of time and may not accurately reflect real-world performance.

PDF version of this post is available here

What is it?

This week I’ve tested the Intel Neural Compute Stick 2 (NCS2). A USB stick for visual computing which originally comes from Movidius – a company acquired by Intel in September 2016. Intel’s landing page for the NCS2 can be found here.

The NCS2 is equipped with 16 VPU’s or Visual Processing Units which are designed specifically for image and video processing. With this it’s possible to run Machine Learning frameworks like Caffe and Tensorflow and to leverage CNN (Convolutional Neural Networks) to do inferencing on data.

So what?

What makes this really interesting is that since it comes in USB format it can simply be plugged into devices at the edge that normally lack the processing power to run machine learning frameworks. With that it becomes possible to process IoT data where it’s generated. The NCS2 has the potential to make Edge Computing a reality instead of just a buzzword.

The compound impact could be significant if the network backhaul is taken into consideration. Imagine replacing a constant stream of video data across the network with just the metadata resulting from running inferencing on the very same video stream. Massive amounts of video vs. some text data. Network savings alone could pay for this pretty quickly.

Note that the very same type of chip is sold embedded in video cameras and drones. They’re pretty expensive though. A webcam + a Raspberry Pi and one of the NCS2 sticks could be a very low-cost way to get a security camera with real-time item recognition for very little money in comparison. 

Why not use a GPU?

Many edge devices lack the capabilities to host a GPU, either due to space, cost or thermal limitations (GPU’s generate a lot of heat and many edge devices are passively cooled). 

Hang on – isn’t that just a devkit?

Yes and no. The USB stick can be used as a devkit to develop code for embedded versions of the Movidius chip of the kind that may be destined for cameras, drones, robots, etc. However, it can also be used as a very flexible drop-in solution for any edge devices or IoT gateways with low processing power but where the capability to do inferencing is desired.

Is it actually useful?

Yes, it looks like it might actually be able to do the job. The job being processing video directly on the edge devices. In particular it shines when plugged into platforms with low-end CPUs which would never be able to run inferencing on their own.

Functionality testing

To find out if it’s powerful enough to process video real-time I used a webcam and fed the video stream to a sample application from the OpenVINO toolkit (link). This particular demo app actually does several things at the same time: Facial recognition, Age detection, Pose detection and Mood detection. All these are run stacked in one command (all details will be included hands-on post shortly). It actually performs very well, although note that the video is not in full HD. Accuracy on the age detection isn’t the best, but of course that reflects more on the algorithm / training data than the NCS2 (it thinks I’m in my 20’s which is flattering).

Facial, Age and Mood detection with Intel Neural Compute Stick 2


Performance testing

Inferencing can be done on a CPU as well. So, for the NCS2 to be useful it would have to outperform whatever CPU is already on the platform it’s plugged into. Therefore I ran a benchmark test (this one)on both the NCS2 and a number of CPUs. The CPUs it was compared to were:

When the NCS2 was being used for the benchmark I was also curious to see if the platform / computer the NCS2 was plugged into affected the benchmark results. Maybe the NCS2 performance would be affected by the host CPU, memory and storage?

The platforms where the NCS2 was tested:

  • Dell Edge Gateway 5000
  • Dell Latitude 7440
  • Dell Precision 5530
Edge Gateway 5000 and Movidius

Note that the floating point precision differs when running the benchmark on CPU vs the NCS2, so it’s not completely apples to apples. This is because the NCS2 only support half precision (FP16) whereas the CPU only support full, or normal, precision (FP32). This probably doesn’t make a huge difference when doing inferencing, which is the only thing the NCS2 is likely to be doing in a real-world application. For learning however, the floating point precision may cause the algorithm to learn either garbage or nothing at all. This article is summarizes the topic nicely for those interested: FP16 and FP32 difference for deep learning

Each platform was tested with CPU and with the NCS2. Three platforms x 2 tests = 6 results.

NOTE: I only ran these tests a few times, so please don’t consider it exhaustive. It would need to be run dozens of times for each and have the results balanced out to get more accurate readings. However, this is all I had time for and it’s at least an indicator of performance. 

The NCS2 completely outperform the Atom CPU on the Edge Gateway 5000. This is where it was tested initially. Further testing shows that it’s more or less equal to a Gen4 Intel i7 but falls behind when compared to a Gen8 Intel i7 CPU.

This is expected of course. The NCS2 isn’t a very expensive device at $87.99 USD ( at the time of writing). This is a pretty cheap way to add Machine Learning power on devices which have lower-end CPUs, like IoT edge gateways. 

There are slight differences in results when the NCS2 is running on less powerful platforms vs. newer machines. This indicates that there are more factors that play into the results than the NCS2 itself, like the type of CPU, memory and storage on the platform the NCS2 is plugged into.

From the results it’s clear that the Movidius NCS2 can’t compete with a modern i7 CPU, but of course it’s a lot cheaper and supposedly draw a lot less power. That would make it ideal for connecting to edge devices where limiting power consumption may be desired.

Practical considerations

For those who may be interested in getting one these I’d like to point out a few things.

1. USB speeds

The NCS2 changes speeds when an app uses it for execution of a neural network. More importantly, when this happens the OS believes that the original USB 2.0 device has been removed and is being replaced with a new USB 3.0 device. This is reversed when code execution finishes.

Movidius stick plugged in:

Feb 22 06:29:33 localhost kernel: [  396.100651] usb 1-1: new high-speed USB device number 11 using xhci_hcd<br>
Feb 22 06:29:33 localhost kernel: [  396.230055] usb 1-1: New USB device found, idVendor=03e7, idProduct=2485<br>
Feb 22 06:29:33 localhost kernel: [  396.230068] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3<br>
Feb 22 06:29:33 localhost kernel: [  396.230077] usb 1-1: Product: Movidius MyriadX<br>
Feb 22 06:29:33 localhost kernel: [  396.230084] usb 1-1: Manufacturer: Movidius Ltd.<br>
Feb 22 06:29:33 localhost kernel: [  396.230091] usb 1-1: SerialNumber: 03e72485

Benchmark_app run starting

Feb 22 06:30:19 localhost kernel: [  442.564640] usb 1-1: USB disconnect, device number 11<br>
Feb 22 06:30:20 localhost kernel: [  442.993334] usb 1-1: new high-speed USB device number 12 using xhci_hcd<br>
Feb 22 06:30:20 localhost kernel: [  443.122975] usb 1-1: New USB device found, idVendor=03e7, idProduct=f63b<br>
Feb 22 06:30:20 localhost kernel: [  443.122989] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3<br>
Feb 22 06:30:20 localhost kernel: [  443.122997] usb 1-1: Product: VSC Loopback Device<br>
Feb 22 06:30:20 localhost kernel: [  443.123005] usb 1-1: Manufacturer: Intel Corporation<br>
Feb 22 06:30:20 localhost kernel: [  443.123012] usb 1-1: SerialNumber: 00000000000000000

Benchmark_app run finished

Feb 22 06:31:28 localhost kernel: [  511.851893] usb 1-1: USB disconnect, device number 12<br>
Feb 22 06:31:29 localhost kernel: [  512.126213] usb 1-1: new high-speed USB device number 13 using xhci_hcd<br>
Feb 22 06:31:29 localhost kernel: [  512.255008] usb 1-1: New USB device found, idVendor=03e7, idProduct=2485<br>
Feb 22 06:31:29 localhost kernel: [  512.255020] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3<br>
Feb 22 06:31:29 localhost kernel: [  512.255027] usb 1-1: Product: Movidius MyriadX<br>
Feb 22 06:31:29 localhost kernel: [  512.255034] usb 1-1: Manufacturer: Movidius Ltd.<br>
Feb 22 06:31:29 localhost kernel: [  512.255040] usb 1-1: SerialNumber: 03e72485

2. The NCSDK and NCSDK2 can’t be used

There are two versions of the NCSDK available. Both of these are for the original NCS and won’t work with the NCS2. This is clearly stated on the Intel webpage but if you’re like me you may miss it and make the assumption that the NCSDK2 is for the NCS2 stick. I wasted a fair bit of time on this before realizing it wasn’t working by design.

Instead use the Intel distribution of OpenVINO which is available here

3. Can it be run in a container?

Yes, but there are no pre-written Dockerfiles for the NCS2 as there were for the original NCS. The NCSDK2 contains a Docker file but it won’t work since it’s for a different version of the neural compute stick (see the NCSDK note above).

However, it’s not hard to build a container with OpenVINO and run that. I have verified that this works. In fact, there’s an excellent Dockerfile by Mateo Guzman available here and I’ve forked it here since I wanted to make some modifications to it. Feel free to use either of them.

Additionally, if you’re impatient or short on time, I’ve made a pre-built docker image on Docker Hub which can be accessed here.

NOTE: The container has to be run in privileged mode. This is due to the NCS2 device ID changing and the USB device being re-enumerated the moment code is loaded onto it (see the USB speed changes note above). The NCSDK2 has a way to change the mode of the original NCS so it can be run in non-privileged mode but this won’t work on the NCS2. I don’t know of a workaround so far.

I’ll write a post on usage shortly which will contain more detail on how to run the demo apps, download and optimize the models for the sample apps as well as how to runt the sample apps themselves.

4. Can it be used on a Raspberry Pi?

Yes, it appears so although I haven’t tested it on the Pi yet. Intel has instructions for installing OpenVINO on the Pi here. I may post a Dockerfile or Docker image for the Pi when I’ve had a chance to test it.

5. What about heat generation and cooling?

The NCS2 is passively cooled and actually is its own heatsink. It’s made of metal and the “fins” on both sides of it allow enough airflow to keep it cool. It does get warm during testing but so far not extremely so.

Intel Movidius NCS2 heatsink

6. Size

It’s a bit broad, which can make it difficult to plug in at times. It also risks covering other USB ports or as in my case – the power inlet for my laptop. A USB extension cable or powered USB hub could easily mitigate this of course.


The Intel Movidius Neural Compute Stick 2 does seem like a valid option for running inferencing at the edge. While it’s not as powerful as a full-on GPU nor a modern CPU, it has the potential to excel in the niche of low-power edge devices like IoT gateways where the onboard CPU isn’t powerful enough to do inferencing on its own. 


版权声明:本文为博主原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。



Getting Started with MavenGetting Started in 5 MinutesGetting Started in 30 MinutesIntroductionsThe Build LifecycleThe POMProfilesRepositoriesStandard Directory LayoutThe Dependency Mechanis...

《剑指offer》第2章 03-15题解_剑指offer2专项突破题解_Netfishless的博客-程序员宝宝

第二章 面试需要的基础知识1. 数组剑指 Offer 03. 数组中重复的数字找出数组中重复的数字。在一个长度为 n 的数组 nums 里的所有数字都在 0~n-1 的范围内。数组中某些数字是重复的,但不知道有几个数字重复了,也不知道每个数字重复了几次。请找出数组中任意一个重复的数字。输入:[2, 3, 1, 0, 2, 5, 3]输出:2 或 3 排序后扫描复杂度O(nlogn)哈希表,空间复杂度O(n),时间复杂度O(n)原地哈希:有的元素多次重复,有的元素确实,使用原地哈希

but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC_写bug小能手的博客-程序员宝宝




flex实现多行多列-内容单行或者2行-超过2行显示省略号_flex 多行省略号_billycoder的博客-程序员宝宝

一个学生问到的问题效果代码&lt;!DOCTYPE html&gt;&lt;html lang="en"&gt; &lt;head&gt; &lt;meta charset="UTF-8" /&gt; &lt;meta http-equiv="X-UA-Compatible" content="IE=edge" /&gt; &lt;meta name="viewport" content="width=device-width, initial-scale=1.0".




android 自定义控件_山水有相逢-马哥哥的博客-程序员宝宝


473. 火柴拼正方形——回溯算法_回溯法问题的解空间火柴_errNotFoundException的博客-程序员宝宝

473. 火柴拼正方形(Java)题目链接:难度:中等1.题目输入为小女孩拥有火柴的数目,每根火柴用其长度表示。输出即为是否能用所有的火柴拼成正方形。2.示例输入: [1,1,2,2,2]输出: true解释: 能拼成一个边长为2的正方形,每边两根火柴。3.题解思路:先判断所有火柴的总长度是否为4的倍数。如果不是则直接返回false。如果是4的倍数就尝试将火柴放入四个边看能否组




电池管理系统(Battery Management System,简称BMS)是一种用于监控和管理电池组的电子系统。BMS主要应用于锂离子电池、铅酸电池、镍氢电池等可充电电池系统。它的主要目的是确保电池在安全、高效、可靠的状态下运行,以提高电池的使用寿命、性能和安全性。