Practical Embedded Object Detection with PlaidML
PlaidML allows GPU-accelerated applications to be deployed on almost any hardware. We introduce microplaid, an open source set of tools for developing accelerated object detection applications on embedded devices. We provide a parts list and outline how to use microplaid to build a mobile object detector based on the UP Squared board.
Microplaid is a suite of tools intended to make PlaidML more readily deployable on embedded devices. The initial release provides a web server and client frontend for four object detectors: FP16 and FP32 versions of a small MobileNet+SSD (width multiplier 0.25), a full MobileNet+SSD (300x300 input, width multiplier 1.0), and a small MobileNet+SSD trained on 2000 pictures of 7 different hand gestures.
We used TensorFlow’s excellent object detection pipeline to do the training, and an in-house tool to convert the TensorFlow models into ‘tile’ files, which contain the weights, the Tile code that actually does the computations, and Python code for post-processing. These tools will eventually be pushed into microplaid, but they are currently under heavy development and lack important features.
Microplaid works on any device PlaidML supports, including your desktop or laptop GPU.
Building the Smart Camera
We’ve been really impressed with the price/performance of the new Apollo Lake-based UP Squared board, specifically the Pentium N4200-based boards with Intel HD Graphics 505: for $230 USD, you get 4GB of RAM, a solid main CPU, and around 220 GFLOPS of compute power. PlaidML supports ARM devices, but development on the UP Squared is as simple as it gets.
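As a rough sanity check on the "around 220 GFLOPS" figure, peak FP32 throughput can be estimated from the GPU's execution-unit count and clock. The three inputs below (18 execution units, 16 FP32 operations per unit per clock, 750 MHz maximum clock) are assumptions based on Intel's published specifications for HD Graphics 505, not measurements, and the result is a theoretical ceiling rather than achievable throughput.

```python
# Rough peak-FP32 estimate for Intel HD Graphics 505.
# All three inputs are assumptions, not measured values.
EXECUTION_UNITS = 18          # assumed execution-unit count
FLOPS_PER_EU_PER_CLOCK = 16   # assumed: 2 SIMD-4 FPUs x 2 ops (fused multiply-add)
MAX_CLOCK_GHZ = 0.75          # assumed maximum GPU clock


def peak_gflops(eus, flops_per_eu_per_clock, clock_ghz):
    """Theoretical peak = units x ops per clock x clock rate in GHz."""
    return eus * flops_per_eu_per_clock * clock_ghz


if __name__ == "__main__":
    estimate = peak_gflops(EXECUTION_UNITS, FLOPS_PER_EU_PER_CLOCK, MAX_CLOCK_GHZ)
    print("Peak FP32: %.0f GFLOPS" % estimate)
```

Under these assumptions the estimate comes out at 216 GFLOPS, in line with the figure quoted above.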
The entire parts list is available at the bottom of this post. There’s nothing particularly complicated about this setup other than the voltage converter. The UP Squared board requires a 5V/6A power supply, which isn’t readily available in battery form. Pigtails need to be soldered onto the voltage converter, which then plugs into the battery and the UP Squared. The camera is friction mounted in a hole drilled on a drill press, which makes the camera decidedly no longer waterproof, but it could be readily modified to be so.
After installing the camera, battery, and voltage converter, you should be ready to go. The specified battery provides between 6 and 10 hours of continuous runtime.
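The runtime range can be related to average power draw with a little arithmetic. The sketch below uses the battery rating from the parts list (20 Ah at 5 V) and assumes, for simplicity, a lossless voltage converter; the `converter_efficiency` parameter is a hypothetical knob for plugging in a more realistic value.

```python
# Back-of-the-envelope battery arithmetic for the smart camera.
BATTERY_AMP_HOURS = 20.0  # from the parts list
BATTERY_VOLTS = 5.0       # rating voltage from the parts list


def battery_watt_hours(amp_hours, volts):
    """Stored energy in watt-hours."""
    return amp_hours * volts


def average_draw_watts(watt_hours, runtime_hours, converter_efficiency=1.0):
    """Average power reaching the board over the given runtime.

    converter_efficiency=1.0 ignores converter losses (an assumption).
    """
    return watt_hours * converter_efficiency / runtime_hours


if __name__ == "__main__":
    energy = battery_watt_hours(BATTERY_AMP_HOURS, BATTERY_VOLTS)
    for hours in (6.0, 10.0):
        print("%4.1f h runtime -> %.1f W average draw"
              % (hours, average_draw_watts(energy, hours)))
```

With these inputs the battery stores 100 Wh, so 6 to 10 hours of runtime corresponds to an average draw of roughly 10 to 17 W, comfortably below the 30 W (5 V x 6 A) supply rating.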
PlaidML supports Windows 10, as do microplaid and the UP Squared. We’ve tested the entire setup on Windows, but its performance lags behind Linux, so we use Linux for our smart camera. Follow the microplaid Windows instructions if you want to use Windows.
First, install Ubuntu 16.04 from a USB stick. Stock Ubuntu works well on the UP Squared, but users have two choices for graphics drivers: Intel’s proprietary drivers or the open source Beignet drivers.
We use the open source Beignet drivers because they consistently provide 30-50% better performance than Intel’s proprietary drivers in the use cases we tested. Beignet doesn’t currently support FP16, but we’re working with them to upstream a version that does. FP32 on Beignet is still generally faster than FP16 on the Intel drivers.
The following commands will install the drivers and launch a MobileNet 1.0-based object detector:
sudo apt install beignet python-opencv git
git clone https://github.com/plaidml/microplaid && cd microplaid
sudo pip install -r object_detection/requirements.txt
python object_detection/web.py
The object detector supports any camera that OpenCV supports.
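Before launching the detector, it can be useful to confirm that OpenCV can actually open the camera. The following is a minimal sketch, not part of microplaid: `probe_camera` is a hypothetical helper, and the `capture_factory` parameter exists only so the function can be exercised without real hardware. By default it uses `cv2.VideoCapture` from the python-opencv package.

```python
def probe_camera(index=0, capture_factory=None):
    """Grab one frame from camera `index`.

    Returns (width, height) of the frame, or None if the camera
    cannot be opened or does not deliver a frame.
    """
    if capture_factory is None:
        import cv2  # provided by the python-opencv package
        capture_factory = cv2.VideoCapture

    capture = capture_factory(index)
    try:
        if not capture.isOpened():
            return None
        ok, frame = capture.read()
        if not ok or frame is None:
            return None
        height, width = frame.shape[:2]  # frames are (rows, cols, channels)
        return (width, height)
    finally:
        capture.release()  # always free the device for the detector
```

Calling `probe_camera(0)` on the device should return the frame size of the first camera; a result of `None` suggests a driver or permissions problem rather than a microplaid issue.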
Different networks can be specified on the command line (in this instance, the custom hand signals network):
python object_detection/web.py --tile=handsigns_ssd_mobilenet_v1
Tile files are hardware independent. The Tile code contained in the tile file is compiled for the target platform at launch time, so there is a small delay before the web frontend becomes available. The frontend is available once the initial run is complete:
[I 180123 11:23:12 web:68] Initial run complete.
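If the server is started from a script, the script can watch for this log line before doing anything else. The sketch below is one way to do that under a few assumptions: `wait_until_ready` and `launch_and_wait` are hypothetical helpers, not part of microplaid, and because we don't rely on which output stream the log is written to, stderr is merged into stdout.

```python
import subprocess

READY_MARKER = "Initial run complete."


def wait_until_ready(lines, marker=READY_MARKER):
    """Consume lines until one contains `marker`.

    Returns the matching line (without its trailing newline),
    or None if the stream ends first.
    """
    for line in lines:
        if marker in line:
            return line.rstrip("\n")
    return None


def launch_and_wait(command):
    """Start the server and block until the ready line appears.

    Returns (process, ready_line). The caller should keep reading
    process.stdout afterwards so the pipe does not fill up.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams (assumption: log may be on either)
        universal_newlines=True,
    )
    return process, wait_until_ready(process.stdout)
```

For example, `launch_and_wait(["python", "object_detection/web.py"])` would return once compilation has finished, or return `None` as the ready line if the process exits early.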
Once the frontend is live, browse to port 5000 on the device (http://localhost:5000 if you’re running locally). The web frontend currently requires a browser with WebSocket and HTML5 Canvas support.
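When the detector runs on a remote device, its log may not be visible, so a simple alternative is to poll the port until it accepts connections. This is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and the timeout values are arbitrary defaults.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=0.5):
    """Return True once a TCP connection to (host, port) succeeds,
    or False if `timeout_s` elapses first."""
    deadline = time.time() + timeout_s
    while True:
        try:
            connection = socket.create_connection((host, port), timeout=interval_s)
            connection.close()
            return True
        except socket.error:
            # Refused or timed out: the frontend is not up yet.
            if time.time() >= deadline:
                return False
            time.sleep(interval_s)
```

A call such as `wait_for_port("localhost", 5000)` only confirms that something is listening on the port; it does not verify that the page itself loads.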
The UP Squared achieves between 5 and 12 FPS on smaller MobileNets, depending on the temperature of the GPU. Active cooling is necessary to sustain high performance. The HD Graphics 505 supports FP16, which generally provides a solid 30% gain in performance (with the caveat that FP16 support is not available in the Beignet mainline yet).
We benchmarked popular vision networks on PlaidML and on Intel’s Computer Vision SDK Beta R3, which provides Intel GPU acceleration for some vision networks. The Computer Vision SDK doesn’t currently support object detection, so we were unable to provide comparative numbers for MobileNet/SSD:
|Network|Platform|Batch size|Inference time|
|---|---|---|---|
|MobileNet 1.0|PlaidML (Beignet, FP16)|1|53.64 ms|
|MobileNet 1.0|PlaidML (Beignet, FP32)|1|70.05 ms|
|MobileNet 1.0|Intel Inference Engine (clDNN FP32)|1|80.6 ms|
|ResNet-50|PlaidML (Beignet, FP32)|1|349.2 ms|
|ResNet-50|Intel Inference Engine (clDNN FP32)|1|246.03 ms|
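The timings above can be turned into throughput and relative speedups with a few lines of arithmetic. The sketch below copies the values from the table and assumes each time is for a single inference (batch size 1). These are classification networks, so the resulting rates are not comparable to the end-to-end detector FPS quoted earlier.

```python
# (network, platform) -> milliseconds per inference, copied from the table.
LATENCY_MS = {
    ("MobileNet 1.0", "PlaidML (Beignet, FP16)"): 53.64,
    ("MobileNet 1.0", "PlaidML (Beignet, FP32)"): 70.05,
    ("MobileNet 1.0", "Intel Inference Engine (clDNN FP32)"): 80.6,
    ("ResNet-50", "PlaidML (Beignet, FP32)"): 349.2,
    ("ResNet-50", "Intel Inference Engine (clDNN FP32)"): 246.03,
}


def inferences_per_second(latency_ms):
    """Throughput at batch size 1 (assumed)."""
    return 1000.0 / latency_ms


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline (>1 is faster)."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    for (network, platform), ms in sorted(LATENCY_MS.items()):
        print("%-14s %-38s %5.1f inferences/s"
              % (network, platform, inferences_per_second(ms)))
```

From these numbers, FP16 on Beignet is about 1.31x faster than FP32 on Beignet for MobileNet 1.0, consistent with the roughly 30% FP16 gain mentioned above. PlaidML FP32 is about 1.15x faster than clDNN FP32 on MobileNet 1.0, while clDNN is about 1.42x faster than PlaidML on ResNet-50.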
Parts List
|Part|Price|Notes|
|---|---|---|
|UP Squared|$289|Package with fan, power supply, and case. 4GB RAM, 32 GB eMMC|
|TalentCell Battery|$65|20 Amp Hour (@ 5 V) Rechargeable Li-ion|
|Voltage Converter|$15|10 amp, oversized|
|Power Cables|$10|Connects to the battery and the UP Squared, solder to voltage converter.|
|ELP USB webcam|$45|Consider extra lenses|
We welcome feedback, bugs, questions, and usage reports for microplaid. Just file an issue on GitHub.