Next 10x in AI - System, Silicon, Algorithms, Data
Abstract: This talk explores the critical elements driving the next 10x in AI acceleration, focusing on the interconnected pillars of System & Silicon, Algorithms, and Data. We examine efficient codesign for accelerated computing, highlighting techniques such as FlashAttention, quantization, and compression, alongside silicon and system innovations. Moving beyond dense matrix workloads, the talk examines trends such as retrieval-augmented large language models (LLMs), multimodal systems, and agents, while emphasizing the crucial role of data quality and training methods. The presentation offers a holistic view of the evolving AI landscape, giving engineers a roadmap for navigating and contributing to the future of AI acceleration and its applications.