Recently we posted early results from our work to bring deep learning to more people through OpenCL support, including initial benchmarks on AMD and NVIDIA hardware. As a business, we are building on this technology to bring real-time computer vision to every device. In this post we will discuss the key issue of processing speed, open source a tool we use to measure speed on real workloads, and share our performance progress. Through careful optimization, our runtime software, code-named Plaid, is now up to 1.4x faster than TensorFlow 1.3 + cuDNN 6 for real-time vision tasks.
Earlier this week, we posted a first look at our work to bring deep learning to more people on more platforms. Today, we’re adding details on our plan to open source our software and an update on our development progress. With our support for the OpenCL open standard, people with a GPU from any manufacturer, including NVIDIA, AMD, and Intel, will soon be able to get started with real datasets in minutes. Users won’t need to sacrifice speed for that freedom: our software is as fast as TensorFlow + cuDNN in some cases, and it will continue to improve.
I’m excited to announce Vertex.AI’s work to bring deep learning to OpenCL and share a first look at our results so far. This work is intended to make deep learning accessible to more people and speed up progress across the field. Read on for the details and what’s coming next.
We’re working to bring the power of neural nets to every application, using new technology invented and built in-house to make possible applications that weren’t possible before. There’s a large gap between the capabilities neural networks show in research and the practical challenges of actually getting them to run on the platforms where most applications run. Making these algorithms work in your app requires fast enough hardware paired with precisely tuned software compatible with your platform and language. Being efficient, compatible, and portable all at once is a huge challenge, and we can help.