Program

Zip file of the full program proceedings (~180 MB)

Tutorials: Sunday, August 25, 2024

Time (PDT) Title Presenters
7:45AM-8:30AM Breakfast/Registration
 
8:30AM-10:35AM Tutorial 1: AI Assisted Hardware Design - Will AI Elevate or Replace Hardware Engineers?

Chair: Bryan Chin, UCSD
 
  Introduction
Bryan Chin, UCSD
  Introduction to AI for Chip Design
Mark Ren, NVIDIA
  AI Driven Optimization
Stelios Diamantidis, Synopsys
  LLM and Chip Design
Hans Bouwmeester, PrimisAI
10:35AM-11:00AM Coffee Break (25 min)
 
11:00AM-12:30PM Tutorial 1: AI Assisted Hardware Design - Will AI Elevate or Replace Hardware Engineers?

Chair: Bryan Chin, UCSD
 
  Domain Adaptive LLM Models
Hanxian Huang, UCSD
  LLM Agents for Chip Design
Mark Ren, NVIDIA
  Future Directions & Panel Discussion
 
12:30PM-1:45PM Lunch (1 hr 15 min)
 
1:45PM-3:15PM Tutorial 2: The Cooling of Hot Chips: How thermal technology is keeping up with the AI revolution

Chair: Seshu Madhavapeddy, Frore Systems
 
  Thermal techniques for higher data center compute density
Tom Garvens, Supermicro
  Next-generation cooling for NVIDIA’s Accelerated Computing
Ali Heydari, NVIDIA
3:15PM-3:45PM Coffee Break (1/2 hr)
 
3:45PM-6:00PM Tutorial 2: The Cooling of Hot Chips: How thermal technology is keeping up with the AI revolution

Chair: Seshu Madhavapeddy, Frore Systems
 
  Thermal challenges of Edge Devices
Nader Nikfar, Qualcomm
  Solid-state active cooling helps maintain Moore’s Law
Prabhu Sathyamurthy, Frore Systems
  Applications for thermo-electric cooling
Jesse Edwards, Phononic
6:00PM-8:00PM Reception
 

Conference Day 1: Monday, August 26, 2024

Time (PDT) Title Presenters
7:45AM-9:15AM Breakfast/Registration
 
9:15AM-9:30AM Welcome
 
  General Chair Welcome
Ron Diamant, General Chair
  Program Co-Chairs Welcome
Rob Aitken & Larry Yang, PC Co-Chairs
9:30AM-11:00AM High-Performance Processors Part 1

Chair: Ian Bratt
 
  Snapdragon X Elite Qualcomm Oryon CPU: Design & Architecture Overview
Gerard Williams, Qualcomm
  Lunar Lake: Powering the Next Generation of AI PCs
Arik Gihon, Intel
  IBM Next Generation Processor and AI Accelerator
Chris Berry, IBM
11:00AM-11:30AM Coffee Break
 
11:30AM-1:00PM Specialized Processors

Chair: Renu Raman
 
  Blackhole and TT-Metalium - The Standalone AI Computer and its Programming Model
Jasmina Vasiljevic & Davor Capalija, Tenstorrent
  SK Hynix AI-Specific Computing Memory Solution: From AiM device to Heterogeneous AiMX-xPU System for Comprehensive LLM Inference
Guhyun Kim, SK Hynix
  Built for the Edge: The next generation Intel® Xeon 6 SoC
Praveen Mosur, Intel
1:00PM-2:15PM Lunch (1 hr 15 min)
 
2:15PM-3:15PM Keynote #1

Chair: Ralph Wittig
 
  Predictable Scaling and Infrastructure
Trevor Cai, OpenAI
3:15PM-4:15PM AI Processors Part 1

Chair: Pradeep Dubey
 
  NVIDIA Blackwell Platform: Advancing Generative AI and Accelerated Computing
Ajay Tirumala & Raymond Wong, NVIDIA
  SambaNova SN40L RDU: Breaking the Barrier of Trillion+ Parameter Scale Gen AI Computing
Raghu Prabhakar, SambaNova
4:15PM-4:45PM Coffee Break (1/2 hr)
 
4:45PM-6:45PM AI Processors Part 2

Chair: David Weaver
 
  Intel Gaudi 3 AI Accelerator: Architected for Gen AI Training and Inference
Roman Kaplan, Intel
  AMD Instinct™ MI300X Generative AI Accelerator and Platform Architecture
Alan Smith & Vamsi Krishna Alla, AMD
  An AI Compute ASIC with Optical Attach to Enable Next Generation Scale-up Architectures
Manish Mehta, Broadcom
  FuriosaAI RNGD: A Tensor Contraction Processor for Sustainable AI Computing
June Paik, Furiosa
6:45PM-8:30PM Reception
 

Conference Day 2: Tuesday, August 27, 2024

Time (PDT) Title Presenters
7:45AM-8:30AM Breakfast/Registration
 
8:30AM-9:00AM Poster Lightning Session
 
  Poster Lightning Talks
 
9:00AM-10:30AM AI Processors Part 3

Chair: Yasuo Ishii
 
  AMD Versal™ AI Edge Series Gen 2 for Vision and Automotive
Tomai Knopp & Jeffrey Chu, AMD
  Onyx: A Programmable Accelerator for Sparse Tensor Algebra
Kalhan Koul, Stanford
  Next Gen MTIA - Meta’s Recommendation Inference Accelerator
Mahesh Maddury & Pankaj Kansal, Meta
10:30AM-11:00AM Coffee Break
 
11:00AM-12:30PM Networking Processors

Chair: Jae W. Lee
 
  DOJO: An Exa-Scale Lossy AI Network using the Tesla Transport Protocol over Ethernet (TTPoE)
Eric Quinnell, Tesla
  ACF-S: An 8-Tbit/s SuperNIC for High-Performance Data Movement in AI & Accelerated Compute Networks
Shrijeet Mukherjee & Thomas Norrie, Enfabrica
  4 Tbit/s Optical Compute Interconnect Chiplet for XPU-to-XPU Connectivity
Saeed Fathololoumi, Intel
12:30PM-1:45PM Lunch (1 hr 15 min)
 
1:45PM-2:45PM Keynote #2

Chair: Ian Bratt
 
  The Journey to Life with AI Pervasiveness
Victor Peng, AMD
2:45PM-3:45PM High-Performance Processors Part 2

Chair: Nhon Quach
 
  Wafer-Scale AI: Enabling Unprecedented AI Compute Performance
Sean Lie, Cerebras
  XiangShan: An Open-Source Project for High-Performance RISC-V Processors Meeting Industrial-Grade Standards
Kaifan Wang, Chinese Academy of Sciences
3:45PM-4:15PM Coffee Break (1/2 hr)
 
4:15PM-6:15PM High-Performance Processors Part 3

Chair: Lingjie Xu
 
  AmpereOne: Sustainable Computing for AI & Cloud Native Workloads
Matthew Erler, Ampere Computing
  Inside MAIA 100
Sherry Xu & Chandru Ramakrishnan, Microsoft
  AMD Next Generation “Zen 5” Core
Brad Cohen & Mahesh Subramony, AMD
  MN-Core 2: Second-generation processor of MN-Core architecture for AI and general-purpose HPC applications
Jun Makino, Preferred Networks
6:15PM-6:30PM IEEE TCMM Awards
 
  IEEE TCMM Awards
Gabriel Southern, TCMM Chair
6:30PM-6:45PM Closing Remarks
 
  Closing Remarks
Jan-Willem van de Waerdt, Vice Chair

Posters

(NOTE: Not all posters have videos)

Title Authors & Affiliation
Picasso: An Area/Energy-Efficient End-to-End Diffusion Accelerator with Hyper-Precision Data Type
Sungyeob Yoo, Geonwoo Ko, Seri Ham, Seeyeon Kim, Yi Chen & Joo-Young Kim; Korea Advanced Institute of Science and Technology
NeuGPU: A Neural Graphics Processing Unit for Instant Modeling and Real-Time Rendering for Mobile AR/VR Devices
Junha Ryu, Hankyul Kwon, Wonhoon Park, Zhiyong Li, Beomseok Kwon, Donghyeon Han, Dongseok Im, Sangyeob Kim, Hyungnam Joo, Minsung Kim & Hoi-Jun Yoo; Korea Advanced Institute of Science and Technology
Space-Mate: A 303.5mW Real-Time NeRF SLAM Processor with Sparse Mixture-of-Experts-based Acceleration
Gwangtae Park, Seokchan Song, Haoyang Sang, Dongseok Im, Donghyeon Han, Sangyeob Kim, Hongseok Lee & Hoi-Jun Yoo; Korea Advanced Institute of Science and Technology
A Low-power Large-Language-Model Processor with Big-Little Network and Implicit-Weight-Generation for On-device AI
Sangyeob Kim, Sangjin Kim, Wooyoung Jo, Soyeon Kim, Seongyon Hong, Nayeong Lee & Hoi-Jun Yoo; Korea Advanced Institute of Science and Technology
A 40-nm 13.88-TOPS/W FC-DNN Engine for 16-bit Intelligent Audio Processing Featuring Weight-Sharing and Approximate Computing
Tay-Jyi Lin, Ze Li, Yun-Cheng Chen, Chien-Tung Liu & Jinn-Shyan Wang; National Chung Cheng University, Taiwan
A Trusted Execution Environment RISC-V System on Chip
Binh Kieu-Do-Nguyen, Tuan-Kiet Dang, Khai-Duy Nguyen, Cong-Kha Pham & Trong-Thuc Hoang; University of Electro-Communications
A 1.19GHz 9.52Gsamples/sec Radix-8 FFT Hardware Accelerator in 28nm
Larry Tang, Siyuan Chen, Keshav Harisrikanth, Guanglin Xu, Franz Franchetti & Ken Mai; Carnegie Mellon University
PACE: A Scalable and Energy Efficient CGRA in a RISC-V SoC for Edge Computing Applications
Vishnu Nambiar, Yi Sheng Chong, Thilini Bandara, Dhananjaya Wijerathne, Zhaoying Li, Rohan Juneja, Li-Shiuan Peh, Tulika Mitra & Anh Tuan Do; Institute of Microelectronics, Agency for Science, Technology and Research (A*STAR) and National University of Singapore
LSPU: A 20.7ms Low-latency Point Neural Network-based 3D Perception and Semantic LiDAR SLAM System-on-Chip for Autonomous Driving System
Jueun Jung, Seungbin Kim, Bokyoung Seo, Wuyoung Jang, Sangho Lee, Jeongmin Shin, Donghyeon Han & Kyuho Lee; Ulsan National Institute of Science and Technology
RISC-V-based System-on-Chips for IoT Applications
Khai-Duy Nguyen, Tuan-Kiet Dang, Binh Kieu-Do-Nguyen, Cong-Kha Pham & Trong-Thuc Hoang; University of Electro-Communications (UEC), Tokyo, Japan
A Smart Cache for a SmartNIC! Scaling End-Host Networking to 400Gbps and Beyond
Annus Zulfiqar, Ali Imran, Venkat Kunaparaju, Ben Pfaff, Gianni Antichi & Muhammad Shahbaz; Purdue University
Towards True GPU Performance Scaling for OpenGPU
Blaise Tine & Hyesoon Kim; UCLA
CogniVision: A mW Power envelope SoC for Always-on Smart Vision in 40nm
Anuimesh Gupta, Japesh Vohra & Massimo Alioto; National University of Singapore
NeCTAr and RASoC: Tale of Two Class SoCs for Language Model Inference and Robotics in Intel 16
Viansa Schmulbach, Jason Kim, Ethan Gao, Nikhil Jha, Ethan Wu, Oliver Yu, Ben Oliveau, Xiangwei Kong, Brendan Roberts, Connor McMahon, Lixiang Yin, Vamber Yang, Brendan Brenner, George Moujaes, Boyu Hao, Lucy Revina, Kevin Anderson, Bryan Ngo, Yufeng Chi, Hongyi Huang, Reza Sajadiany, Raghav Gupta, Ella Schwarz, Jennifer Zhou, Ken Ho, Jerry Zhao, Anita Flynn & Borivoje Nikolić; University of California, Berkeley