This internship aims to improve the performance of transformer-based AI models by using model compilation and graph rewriting to make more efficient use of the hardware. We'll take a close look at several model optimization strategies, including dispatching, subgraph fusion, and device placement, and apply them to improve performance across a range of model topologies and tasks.