Cloud Intelligence / AIOps Workshop '24 Program Schedule

April 27th, 2024
Venue: Hilton La Jolla Torrey Pines, San Diego
Note: All times listed on this page are local to San Diego (PDT)

8:30 - 8:40 am

Opening

8:40 - 9:30 am

Keynote 1: Transcending Neural Processing Units

Hadi S. Esmaeilzadeh, UCSD

Dr. Esmaeilzadeh was awarded early tenure at UC San Diego, where he is the inaugural holder of the Halicioglu Chair in Computer Architecture with the rank of full professor in Computer Science and Engineering. Before UC San Diego, he was an assistant professor in the School of Computer Science at Georgia Tech, where he was the inaugural holder of the Allchin Family Early Career Professorship. Dr. Esmaeilzadeh obtained his Ph.D. in Computer Science and Engineering from the University of Washington in 2013. His Ph.D. work received the 2013 William Chan Memorial Best Dissertation Award. He is the founding director of the Alternative Computing Technologies (ACT) Lab, where his team is developing new technologies and cross-stack solutions to enable responsible immersive intelligence. His research has been recognized by several best paper awards and honorable mentions. His work on dark silicon has also been profiled in The New York Times.

Abstract: For the past decade, the IT industry has witnessed an intense arms race in the development of Neural Processing Units (NPUs) as deep learning took center stage. This focus was driven by the fact that the algorithmic advancements in deep learning coincided with the effective end of Dennard scaling, which engendered dark silicon and a lack of performance from general-purpose platforms. Interestingly, major software companies invested heavily in developing various NPUs for their internal workloads and even for their public cloud offerings. However, end-to-end applications are not just neural networks but often span multiple domains, e.g., database queries, compression, encryption, video coding, signal processing, and traditional machine learning. Focusing solely on the single domain of deep learning is suboptimal, as it misses the potential to proliferate and promote cross-domain multi-acceleration. There is a growing need to harness the power of chaining heterogeneous accelerators from the edge to the cloud. This is the prime time for cross-domain multi-acceleration, and the research community must address important challenges to resolve the economics of scale and unleash the power of accelerators beyond NPUs. Many exciting areas call for research and exploration, such as Cross-Domain Languages (CDLs), multi-target compilation, data motion appliances, and system primitives for multi-acceleration. This talk takes a journey starting from my work that coined the term “NPU” and explores our ongoing research on transcending NPUs to realize the next wave of upheaval: the move toward economically viable cross-domain multi-acceleration.

9:30 - 10:00 am

Technical Talk: Intelligent Overclocking for Improved Cloud Efficiency

Aditya Soni, Mayukh Das, Pulkit Misra, Chetan Bansal

10:00 - 10:30 am

Morning Break

10:30 - 11:00 am

Technical Talk: Anomaly Detection for Incident Response at Scale

Hanzhang Wang, Gowtham Kumar Tangirala, Gilkara Pranav Naidu, Charles Mayville, Arighna Roy, Joanne Sun, Ramesh Babu Mandava

11:00 - 11:20 am

Project Showcase: QLM: Queue Management for Large Language Model Serving

Archit Patke, Dhemath Reddy, Saurabh Jha, Christian Pinto, Haoran Qiu, Shengkun Cui, Chandra Narayanaswami, Zbigniew Kalbarczyk, Ravishankar K. Iyer

11:20 - 11:40 am

Project Showcase: Optimizing Data I/O for LLM Datasets on Remote Storage

Tianle Zhong, Jiechen Zhao, Xindi Guo, Qiang Su, Geoffrey Fox

12:00 - 1:30 pm

Lunch

1:30 - 2:20 pm

Panel Discussion: Landing AI/ML in Cloud Service for quality and efficiency: Challenges and Opportunities

Moderator: Jian Zhang, Microsoft
Panelists: Hadi S. Esmaeilzadeh, UCSD; Martin Maas, Google; Madan Musuvathi, Microsoft; Prashant Shenoy, UMass Amherst; Zhangwei Xu, Microsoft

2:20 - 3:10 pm

Keynote 2: Sustainable Cloud Operations and the Role of AI

Prashant Shenoy, University of Massachusetts Amherst

Prashant Shenoy is currently a Distinguished Professor and Associate Dean in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He received the B.Tech. degree in Computer Science and Engineering from the Indian Institute of Technology, Bombay, and the M.S. and Ph.D. degrees in Computer Science from the University of Texas, Austin. His research interests lie in distributed systems and networking, with a recent emphasis on cloud and green computing. He has received several best paper awards at leading conferences, including a SIGMETRICS Test of Time Award. He serves on the editorial boards of several journals and has served as the program chair of over a dozen ACM and IEEE conferences. He is a fellow of the ACM, the IEEE, the AAAS, and the AAIA.

Abstract: The exponential growth of cloud computing has been a defining trend of our time, fueled by rapidly growing demands from data-intensive and machine-learning workloads. Despite the end of Dennard scaling, the cloud's energy demand grew more slowly than expected over the past decade due to the aggressive implementation of energy-efficiency optimizations. Unfortunately, few significant optimization opportunities remain using traditional methods, and moving forward, the continued exponential growth of the cloud and AI will translate into rising energy demand, which, if left unchecked, will lead to increasing carbon emissions.

In this talk, I will discuss the role of AI in enabling sustainable cloud operations. I will discuss how AI workloads have contributed to the rising demand for cloud computing and the promise that AI holds for enhancing the sustainability of cloud platforms. I will then present our CarbonFirst approach, which focuses on using AI and optimization-driven approaches for carbon-aware scheduling to reduce the carbon footprint of modern cloud applications. I will end with open research challenges in the emerging field of computational decarbonization.

3:10 - 3:30 pm

Afternoon Break

3:30 - 4:00 pm

Technical Talk: UniCache: The Next 700 Caches for Serverless Computing

Jovan Stojkovic, Tianyin Xu, Hubertus Franke, Josep Torrellas

4:00 - 4:30 pm

Technical Talk: Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction

Haoran Qiu, Weichao Mao, Archit Patke, Shengkun Cui, Saurabh Jha, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Tamer Başar, Ravishankar K. Iyer

4:30 pm

Closing