Edge AI: Systems Design and ML for IoT Data Analytics

Radu Marculescu Dept. of ECE

The University of Texas at Austin Austin, TX, USA

[email protected]

Diana Marculescu Dept. of ECE

The University of Texas at Austin Austin, TX, USA

[email protected]

Umit Y. Ogras Dept. of ECE

University of Wisconsin-Madison Madison, WI, USA [email protected]

ABSTRACT With the explosion in Big Data, it is often forgotten that much of today's data is generated at the edge. Specifically, a major source of data is users' endpoint devices like phones, smart watches, etc., that are connected to the internet, also known as the Internet-of-Things (IoT). This "edge of data" faces several new challenges related to hardware constraints, privacy-aware learning, and distributed learning (both training and inference). So what systems and machine learning algorithms can we use to generate or exploit data at the edge? Can network science help us solve machine learning (ML) problems? Can IoT devices help people who live with some form of disability, and many others, benefit from health monitoring?

In this tutorial, we introduce the network science and ML techniques relevant to edge computing, discuss systems for ML (e.g., model compression, quantization, HW/SW co-design, etc.) and ML for systems design (e.g., run-time resource optimization, power management for training and inference on edge devices), and illustrate their impact in addressing concrete IoT applications.

CCS CONCEPTS • Computing methodologies → Object recognition; Neural networks; • Computer systems organization → Embedded systems; • Theory of computation → Distributed algorithms;

KEYWORDS Deep learning, federated learning, IoT, health monitoring

ACM Reference format:

Radu Marculescu, Diana Marculescu, and Umit Ogras. 2020. Edge AI: Systems Design and ML for IoT Data Analytics. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '20), August 23-27, 2020, Virtual Event, CA, USA. ACM, New York, NY, USA, 2 pages. https://doi.org/10.1145/3394486.3406479

1 Target Audience and Prerequisites This tutorial is intended for an audience relatively new to the IoT area, with interests in edge computing and energy-efficient AI system design. The tutorial assumes some basic background in ML (e.g., classification, clustering, convolutional neural networks (CNNs)), network science (e.g., graphs, community detection), and systems design (e.g., scheduling, resource management). We will explain the more advanced concepts in ML and systems optimization in sufficient detail to make the presentation self-contained. The material discussed in this tutorial is relevant to students and researchers from industry and academia interested in work at the intersection of ML, network science, and systems optimization, with applications to IoT and edge computing.

2 Tutorial Outline

[Part 1: Algorithms] The first part will introduce relevant algorithms for IoT and edge AI, their design constraints, and new directions in distributed learning and resource management.
- 1.1 Edge AI (motivation, challenges, algorithms)
- 1.2 Federated learning
- 1.3 Deep learning model compression
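To make the federated learning topic concrete, here is a minimal sketch of federated averaging (FedAvg) on a toy linear-regression task: each client runs a few local gradient steps, and the server averages the resulting models weighted by local dataset size. All function names and the toy data are illustrative assumptions, not code from the tutorial.

```python
# Minimal FedAvg sketch on noiseless linear regression (illustrative only).
import numpy as np

def local_step(w, X, y, lr=0.1, epochs=5):
    """A few epochs of gradient descent on one client's local data."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def fed_avg(w, clients, rounds=20):
    """Each round: clients train locally; server averages by dataset size."""
    for _ in range(rounds):
        sizes = np.array([len(y) for _, y in clients], dtype=float)
        updates = [local_step(w, X, y) for X, y in clients]
        w = np.average(updates, axis=0, weights=sizes)
    return w

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
# Three clients with different amounts of local data.
clients = []
for n in (30, 50, 80):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ w_true))
w = fed_avg(np.zeros(2), clients)
print(np.round(w, 2))  # converges toward w_true
```

Weighting each client's update by its local dataset size is what distinguishes FedAvg from a plain parameter average, and it matters when clients hold very different amounts of data.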

[Part 2: Architectures] The second part will discuss new architectures that support the algorithms introduced in Part 1. We also discuss hardware/software co-design ideas targeting edge AI, with applications in image and human activity recognition.
- 2.1 System-aware modeling/optimization of ML applications
- 2.2 ML and system co-design via neural architecture search
- 2.3 Sensor-rich wearable IoT applications and data analytics
- 2.4 Energy harvesting and management for edge devices
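The model-compression and co-design topics above build on primitives such as low-precision arithmetic. As a sketch of one such primitive, here is generic per-tensor symmetric int8 post-training quantization; the scale rule and all names are a textbook-style assumption, not any specific library's API.

```python
# Per-tensor symmetric int8 quantization of a weight tensor (illustrative).
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 using one symmetric scale for the tensor."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(scale=0.1, size=(64, 64)).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = np.max(np.abs(w - w_hat))
print(q.dtype, f"max abs error = {err:.4f}")  # error bounded by scale / 2
```

The storage drops from 32 bits to 8 bits per weight, and the worst-case rounding error is half the quantization step, which is why per-tensor scaling often suffices for weights while activations may need finer-grained schemes.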

[Part 3: Applications] The last part will discuss the characteristics of a few concrete system implementations for distributed edge inference, human activity recognition, and gait analysis with IoT devices. Finally, the speakers conclude the tutorial and answer audience questions.
- 3.1 Distributed inference on edge devices: Anatomy of an IoT system for distributed image classification over a network of IoT devices
- 3.2 Human activity recognition: Anatomy of a wearable system for human activity recognition and gait analysis
- 3.3 Conclusions/Q&A
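Wearable activity-recognition systems like those in 3.2 typically follow a window/feature/classify pipeline: segment the accelerometer stream into fixed-length windows, extract simple statistics, and classify each window. The sketch below illustrates that flow on synthetic data; the threshold rule stands in for the learned classifiers discussed in the tutorial, and every name here is illustrative.

```python
# Windowed-feature pipeline for human activity recognition (illustrative).
import numpy as np

def windows(signal, size=50, step=25):
    """Yield overlapping fixed-length windows from a 1-D accelerometer trace."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]

def features(win):
    """Mean and standard deviation: two staple HAR features."""
    return np.array([win.mean(), win.std()])

def classify(feat, std_threshold=0.5):
    """Toy rule: high variance suggests walking, low variance suggests rest."""
    return "walking" if feat[1] > std_threshold else "still"

rng = np.random.default_rng(2)
still = rng.normal(1.0, 0.05, 200)  # gravity plus sensor noise
walking = 1.0 + np.sin(np.linspace(0, 40, 200)) + rng.normal(0, 0.1, 200)
trace = np.concatenate([still, walking])
labels = [classify(features(win)) for win in windows(trace)]
print(labels[0], labels[-1])
```

On a real low-power wearable, the same structure applies, but the window length, feature set, and classifier are co-designed with the energy budget of the device.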

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author. KDD ’20, August 23–27, 2020, Virtual Event, USA. © 2020 Copyright is held by the owner/author(s). ACM ISBN 978-1-4503-7998-4/20/08. DOI: https://doi.org/10.1145/3394486.3406479


3 Tutorial Presenters

Radu Marculescu is a professor and the Laura Jennings Turner Chair in Engineering in the ECE Department at The University of Texas at Austin. He received his Ph.D. from the University of Southern California in 1998 and was on the ECE faculty at Carnegie Mellon University from 2000 to 2019. His work on networks-on-chip design and optimization is widely recognized, most recently with the 2019 IEEE Computer Society Edward J. McCluskey Technical Achievement Award. His current research projects include machine learning and optimization for manycore systems design, AI approaches for HW/SW co-design, and distributed learning approaches for edge devices.

Diana Marculescu is a professor and the Motorola Regents Chair in Electrical and Computer Engineering at The University of Texas at Austin. Before joining UT Austin in December 2019 as department chair, she was at Carnegie Mellon University (2000-2019). She received her Ph.D. in computer engineering from the University of Southern California, Los Angeles, CA, in 1998. Her research interests include energy- and reliability-aware computing, hardware-aware machine learning, and computing for sustainability and natural science applications. Diana has received multiple best paper awards and national and international recognition for her research contributions. She is a Fellow of the ACM and IEEE.

Umit Y. Ogras received his Ph.D. in ECE from Carnegie Mellon University in 2007. He worked at Intel as a Research Scientist from 2008 to 2017 and at Arizona State University from 2013 to 2020. He is currently an Associate Professor at the University of Wisconsin-Madison. Dr. Ogras has received the 2018 DARPA Young Faculty Award, the 2017 NSF CAREER Award, the Intel SCL Research Award, and best paper awards at CASES 2019 and CODES+ISSS 2017, as well as from IEEE Transactions on CAD (2012) and IEEE Transactions on VLSI Systems (2011). His research interests include energy-efficient embedded systems, the wearable Internet-of-Things, flexible hybrid electronics, and multicore architectures.

ACKNOWLEDGMENTS The tutorial presenters thank the tutorial contributors, Dr. Kartikeya Bhardwaj of ARM Inc. and Dr. Dimitrios Stamoulis of Microsoft. This work has been partially funded by the Air Force Research Laboratory (AFRL) and the Defense Advanced Research Projects Agency (DARPA) under agreement number FA8650-18-2-7860, the AWS ML Research Program, and NSF CAREER award CNS-1651624.
