Services that rely heavily on Artificial Intelligence (AI), such as speech understanding and image recognition, have been receiving an enormous amount of attention. Meanwhile, the ever-increasing scale and complexity of the cloud platform itself call for leveraging more AI to build and manage it, in order to deliver highly efficient and reliable cloud services, enable high customer satisfaction, and achieve high engineering productivity. In this talk, I will first share our vision of infusing AI into the Azure platform and the DevOps process, which we call the intelligent cloud platform and AIOps. I will then give a brief overview of our AIOps efforts. I will also use Resource Central as a case study. Resource Central (RC) is a novel machine learning and prediction-serving system for improving cloud resource management. We are placing RC right at the center of Microsoft Azure. To conclude, I will discuss some lessons from deploying such research efforts in production and how they relate to academic research.
Cloud Intelligence Keynote Slide Deck
Murali Chintalapati @ Microsoft
Software analytics focuses on using data-driven approaches to help improve the quality of software systems, the user experience of interacting with software systems, and the productivity of the software development process. Software analytics has been an important research area in the software engineering community for more than a decade. It has already made a broad impact on the software industry.
In the era of cloud computing, the entire software stack, ranging from user-facing experiences to the fundamental storage and computing platform, is often manifested as cloud services. Due to their distributed nature, great complexity, and enormous scale, cloud services pose unique challenges and opportunities for software analytics research. In particular, AIOps and AI for Software Systems are two emerging topics that both researchers and practitioners have been actively working on in recent years.
In this talk, I will first introduce the research landscape of software analytics and cloud intelligence. Then, using a couple of projects as examples, I will talk about our research and impact in software analytics, as well as our experiences working with different product teams on joint innovations across Microsoft. I will also discuss the research challenges and opportunities in cloud intelligence moving forward.
Yi Zhen (LinkedIn Corporation)*; Yung-Yu Chung (LinkedIn); Yang Yang (LinkedIn Corporation); Lei Zhang (LinkedIn Corporation); Ruoying Wang (LinkedIn); Bo Long (LinkedIn Corporation); Tie Wang (LinkedIn Corporation); Pranay Kanwar (LinkedIn Corporation); Dong Wang (LinkedIn Corporation); Mike Snow (LinkedIn Corporation); Sanket Patel (LinkedIn Corporation); Stephen Bisordi (LinkedIn Corporation); Viji Nair (LinkedIn Corporation)
Zezhong Zhang (eBay Inc)*; Keyu Nie (eBay Inc); Tao Yuan (eBay Inc)
Ruoying Wang (LinkedIn)*; Lei Zhang (LinkedIn Corporation); Yang Yang (LinkedIn Corporation); Yi Zhen (LinkedIn Corporation); Bo Long (LinkedIn Corporation); Tie Wang (LinkedIn Corporation); Vinoth Govindaraj (LinkedIn Corporation); Todd Palino (LinkedIn Corporation); Samir Tata (LinkedIn Corporation); Viji Nair (LinkedIn Corporation)
All systems and applications are composed from basic data structures and algorithms, such as index structures, priority queues, and sorting algorithms. Most of these primitives have been around since the early days of computer science (CS) and form the basis of every CS intro lecture. Yet we might be at an inflection point. A recent result by my group shows that machine learning has the potential to significantly alter the way those primitives are implemented and the performance they can provide.
In this talk, I will use index structures, such as B-Trees, Hash-Maps, and Bloom-Filters, as examples to explain the intuition behind learned data structures and algorithms, and outline the opportunities and open research challenges in using this technology in practice.
Cloud computing infrastructure is becoming ubiquitous worldwide. With the rapid growth of digitization and IoT devices, the need for large-scale Cloud infrastructure keeps increasing, which presents greater challenges to its management and operational efficiency. At Alibaba Cloud Intelligence, we focus on using data and the very best techniques that the Cloud enables, such as AI algorithms, to manage the Cloud infrastructure itself in an autonomous fashion. In this talk, we give an overview of the top issues facing Cloud infrastructure operations. Then we share some recent progress on specific topics such as Cloud resource capacity planning, fast datacenter anomaly detection, hardware failure prediction, and cluster-level self-healing.
Zhuangbin Chen (The Chinese University of Hong Kong)*; Yu Kang (MSRA); Feng Gao (Microsoft, Redmond); Li Yang (Microsoft, Redmond); Jeffery Sun (Microsoft, Redmond); Zhangwei Xu (Microsoft, Redmond); Pu Zhao (Microsoft Research); Bo Qiao (Microsoft Research); Liqun Li (Microsoft Research); Xu Zhang (Microsoft Research); Qingwei Lin (Microsoft Research); Michael Lyu (The Chinese University of Hong Kong)
Minghua Ma (Tsinghua University)*; Christopher Zheng (McGill University); Junjie Chen (Tianjin University); Yilin Li (Tsinghua University); Xiao Peng (China EverBright Bank); Gang Wang (China EverBright Bank); Yong Wu (China EverBright Bank); Fang Zhou (China EverBright Bank); Wenchi Zhang (China EverBright Bank); Kaixin Sui (Bizseer Technology); Dan Pei (Tsinghua University)
Jargalsaikhan Narantuya (Gwangju Institute of Science and Technology); Jiwon Yang (GIST); Jongwon Kim (GIST); Hyuk Lim (Gwangju Institute of Science and Technology)*
Yigong Hu (Johns Hopkins University)*; Ze Li (Microsoft); Peng Huang (Johns Hopkins University); Suhas Pinnamaneni (Microsoft); Francis David (Microsoft); Yingnong Dang (Microsoft, USA); Murali Chintalapati (Microsoft)
Si Qin (Microsoft Research)*; Yong Xu (Microsoft Research); Shandan Zhou (Microsoft, USA); Qingwei Lin (Microsoft Research); Thomas Moscibroda (Microsoft, USA); Hongyu Zhang (University of Newcastle); Saurabh Agarwal (Microsoft Azure); Karthikeyan Subramanian (Microsoft Azure); Eli Cortez (Microsoft Research); John Miller (Microsoft Azure); Chris Cowdery (Microsoft Azure); Shanti Kemburu (Microsoft Azure); Dongmei Zhang (Microsoft Research)
Salesforce is the world's #1 customer relationship management (CRM) platform. Salesforce Trusted Infrastructure is the foundation of our services. In this talk, we'll share Salesforce's infrastructure data science approach to augmenting and enhancing the efficiency of data center operations with an interpretable ML model.