Hot Chips 31 Live Blogs: MLperf Benchmark

Tuesday, August 20th, 2019 - Machine Learning, Technology



05:58PM EDT – MLperf is an up-and-coming benchmark aimed at machine learning, backed by a number of industry leaders in this area.

06:08PM EDT – Over 50 partners

06:08PM EDT – ML hardware is projected to be $60B in 2025

06:09PM EDT – Benchmarking helps drive hardware development

06:09PM EDT – Benchmark design overview

06:10PM EDT – ML is different to traditional SPEC

06:10PM EDT – Training and Inference have different issues

06:11PM EDT – Training develops to a target quality

06:11PM EDT – Two divisions:

06:11PM EDT – Closed division uses a fixed model; in the open division the model is unspecified

06:11PM EDT – Create a set of benchmarks around key ML areas

06:12PM EDT – Commerce, Research, Vision, Speech, Language

06:12PM EDT – Have to choose the model in the closed division

06:13PM EDT – MLperf requires real world datasets

06:14PM EDT – Choose the metric: throughput or time to train?

06:14PM EDT – MLperf goes on time-to-train, as the least bad choice
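
To make the time-to-train idea concrete, here is a minimal sketch of how such a measurement could be structured: keep training until an evaluation score first reaches the target quality, and report the elapsed wall-clock time. The function name `time_to_train`, its parameters, and the toy model are all hypothetical and for illustration only; this is not MLperf's reference harness.

```python
import time


def time_to_train(train_one_epoch, evaluate, target_quality, max_epochs=100):
    """Return (seconds, epochs) until evaluate() first reaches target_quality.

    Hypothetical helper: train_one_epoch and evaluate are caller-supplied
    callables standing in for a real training loop and validation pass.
    """
    start = time.perf_counter()
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        if evaluate() >= target_quality:
            return time.perf_counter() - start, epoch
    raise RuntimeError("target quality not reached within max_epochs")


# Toy stand-in for a model: accuracy (in percent) rises by 20 points per epoch.
state = {"accuracy_pct": 0}


def toy_train_one_epoch():
    state["accuracy_pct"] += 20


def toy_evaluate():
    return state["accuracy_pct"]


seconds, epochs = time_to_train(toy_train_one_epoch, toy_evaluate, target_quality=75)
print(f"reached target after {epochs} epochs in {seconds:.6f} s")
```

Unlike raw throughput, a measurement shaped this way only rewards speed-ups that still get the model to the target quality.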

06:15PM EDT – Architecture, framework, implementation. Overall model must be mathematically equivalent

06:15PM EDT – Also hyperparameter tuning

06:16PM EDT – Finding good hyperparameters is expensive and not the point of the benchmark. It would make it a search contest

06:16PM EDT – Solution is hyperparameter borrowing between submissions

06:16PM EDT – Another challenge is variance

06:17PM EDT – Because convergence changes, solution is to run each benchmark multiple times and drop outliers

06:17PM EDT – Aim for 2-5% variance without 1000 runs required
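
One way to "run multiple times and drop outliers" is a trimmed mean: discard the fastest and slowest runs and average the rest. The sketch below assumes that scheme. The helper name `trimmed_mean`, the number of runs, and the timings are made up for illustration and are not taken from the talk.

```python
def trimmed_mean(times, drop_each_side=1):
    """Average run times after dropping the lowest and highest results.

    Hypothetical helper: drop_each_side is the number of runs removed
    from each end of the sorted list before averaging.
    """
    if len(times) <= 2 * drop_each_side:
        raise ValueError("not enough runs left after dropping outliers")
    kept = sorted(times)[drop_each_side:len(times) - drop_each_side]
    return sum(kept) / len(kept)


# Made-up time-to-train results (minutes) from five runs of one benchmark.
runs = [10, 12, 11, 30, 9]
score = trimmed_mean(runs)
print(score)
```

With these numbers the unlucky 30-minute run and the lucky 9-minute run are both discarded, so a single slow or fast convergence does not move the reported score.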

06:17PM EDT – Now inference benchmarks

06:17PM EDT – Same closed division and open division as training

06:18PM EDT – Inference is used differently: single stream, multiple stream, server stream, offline

06:18PM EDT – Inference benchmarks are prioritized around vision, reflecting real-world use cases

06:18PM EDT – Four different metrics

06:19PM EDT – Latency, QPS, Minimum Latency, Throughput
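
As a rough illustration of how latency-style and throughput-style metrics differ, the sketch below derives both from the same list of per-query latencies. The function names, the nearest-rank percentile method, the 90th percentile choice, and the sample numbers are all assumptions for illustration; the talk did not specify how each metric is computed.

```python
import math


def percentile_latency(latencies, pct):
    """Nearest-rank percentile of a list of latencies (hypothetical helper)."""
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


def throughput(num_samples, wall_seconds):
    """Samples processed per second over a whole run (hypothetical helper)."""
    return num_samples / wall_seconds


# Made-up per-query latencies in milliseconds.
latencies_ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
p90 = percentile_latency(latencies_ms, 90)
median = percentile_latency(latencies_ms, 50)
samples_per_second = throughput(1000, 4.0)
print(p90, median, samples_per_second)
```

A tail-latency number like `p90` matters when each query has a user waiting on it, while `samples_per_second` matters for batch-style offline work, which is why different usage scenarios get different metrics.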

06:19PM EDT – Pre-trained weights for closed division, have to use standard C++ load generator

06:20PM EDT – Inference specific: quantization and retraining

06:20PM EDT – In closed division, quantization must be principled
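
The talk did not spell out what counts as "principled" quantization. As background on what quantization does at all, here is a minimal sketch of symmetric int8 quantization, where one scale factor is derived from the largest absolute weight. The function names and sample weights are hypothetical; this is a common textbook scheme, not a statement of MLperf's rules.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with one shared scale (sketch)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


# Made-up weights for illustration.
weights = [-1.0, 0.0, 0.25, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, restored)
```

The round trip is lossy: each restored value can differ from the original by up to half a quantization step, which is why accuracy has to be re-checked after quantizing.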

06:22PM EDT – Results have the issue of scale. Number of chips, power, time etc

06:23PM EDT – Impact of benchmarking on MLperf

06:23PM EDT – Is this benchmark moving the industry forward

06:27PM EDT – Developing the framework to move forward. Aiming for agile benchmarking

06:27PM EDT – Rules have changed and parameters are being learned

06:27PM EDT – Rules challenge: optimize for something the industry needs and that works to drive the industry forward

06:28PM EDT – Making reference implementations faster and more reliable

06:28PM EDT – Launching a non-profit called MLcommons, the home of MLperf

06:28PM EDT – Mission to accelerate ML innovation

06:29PM EDT – Benchmarks, large public datasets, best practices, outreach

06:29PM EDT – Helping push a young industry forward

06:30PM EDT – In the process of getting founding members of MLcommons

06:30PM EDT – Q&A time

06:32PM EDT – Q: How do you make the benchmark useful if ML iterates every few months? A: First goal was to create a benchmark that matches reality. We’d like to lower the barrier to entry over time. Trying to make the reference implementations higher perf. Also sub-benchmarks using traces.

06:33PM EDT – Q: Is MLperf a non-standard at this point? A: We’re aiming to build a consensus between hardware that won’t do MLperf and an accept-everything approach. We believe in actual benchmarking. We’re still feeling our way.

06:33PM EDT – That’s a wrap.


