As the demand for generative AI technology continues to rise, industry giants, including Microsoft, Google, AWS, and OpenAI, are exploring the development of their own custom chips tailored for AI workloads. Contrary to popular belief, the primary driver behind this push isn't chip shortages but rather a strategic shift toward optimizing the efficiency and cost-effectiveness of processing generative AI queries.
Speculation has swirled around efforts by OpenAI and Microsoft to develop custom chips for generative AI tasks: Microsoft is reportedly collaborating with AMD on a project codenamed Athena, while OpenAI is rumored to be eyeing acquisitions to bolster its chip-design capabilities. Google and AWS, meanwhile, have already introduced their own chips for AI workloads: Google with its Tensor Processing Units (TPUs), and AWS with its Trainium and Inferentia chips.
So, what's motivating these companies to delve into custom chip development? Analysts and experts point to two key factors: the cost of processing generative AI queries and the efficiency of existing chips, primarily Graphics Processing Units (GPUs). Currently, Nvidia's A100 and H100 GPUs dominate the AI chip market, but their efficiency in handling generative AI workloads is under scrutiny.
Nina Turner, a research manager at IDC, notes that GPUs may not be the most efficient processors for generative AI tasks, and that custom silicon could address this inefficiency. GPUs are highly effective at matrix multiplication, a fundamental mathematical operation in AI, but they are costly to operate. Silicon optimized for specific AI workloads could help alleviate those cost concerns.
Custom silicon, according to Turner, has the potential to reduce power consumption, improve compute interconnectivity, and enhance memory access, ultimately lowering query costs. For instance, a report from research firm SemiAnalysis estimates that operating ChatGPT costs OpenAI roughly $694,444 per day, or about 0.36 cents per query.
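As a rough sanity check on those numbers, the SemiAnalysis estimate of $694,444 per day and about 0.36 cents per query together imply a daily query volume of roughly 193 million. The short sketch below works through that arithmetic. The function name is illustrative, and the implied volume is derived here from the two cost figures rather than taken from the article.

```python
def implied_daily_queries(daily_cost_usd: float, cents_per_query: float) -> float:
    """Queries per day implied by a total daily cost and a per-query cost.

    Both inputs are estimates; the result is only as good as they are.
    """
    if cents_per_query <= 0:
        raise ValueError("cents_per_query must be positive")
    daily_cost_cents = daily_cost_usd * 100  # convert dollars to cents
    return daily_cost_cents / cents_per_query


# Figures as estimated by SemiAnalysis: $694,444 per day, 0.36 cents per query.
queries = implied_daily_queries(694_444, 0.36)
print(f"Implied volume: about {queries / 1e6:.0f} million queries per day")
```

At that scale, even a small reduction in per-query cost adds up quickly, which is the economic case for inference-optimized silicon.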
Furthermore, custom silicon provides the advantage of exerting control over chip access and designing elements tailored specifically for large language models (LLMs), thereby enhancing query speed.
This shift towards custom chip design is likened to Apple's approach to producing chips for its devices, where specialization trumps general-purpose processors. Despite the popularity of Nvidia's GPUs, they, too, are considered general-purpose devices. Custom chips could be the answer to optimizing performance for specific functions, such as image processing and specialized generative AI.
However, experts caution that developing custom chips is no easy feat. It involves significant challenges, including high investment requirements, lengthy design and development timelines, complex supply chain issues, a scarcity of talent, and the need for a sufficient volume of production to justify the expenditure.
For companies embarking on this journey from scratch, the process can take a minimum of two to two and a half years, with the scarcity of chip design talent causing delays. Several large tech companies have mitigated this challenge by either acquiring startups with expertise in chip development or partnering with experienced firms in the field.
Despite ongoing discussions about chip shortages, experts believe that the move towards custom chips by companies like OpenAI and Microsoft is more about serving inference workloads for LLMs, particularly as Microsoft continues to incorporate AI features into its applications. These companies appear to have specific requirements that existing solutions don't meet, and a specialized inference chip that is more cost-effective and efficient than large GPUs may be the answer.
Acquiring a major chip designer may not be a cost-effective approach for OpenAI, given the substantial expenses involved in designing and producing custom chips. Instead, experts suggest that OpenAI could explore the acquisition of startups with AI accelerators, a more economically viable option.
To support inferencing workloads, potential acquisition targets could include AI-accelerator startups such as Groq, Esperanto Technologies, Tenstorrent, and NeuReality. Additionally, SambaNova might be a suitable candidate if OpenAI is willing to transition away from Nvidia GPUs and adopt an on-premises approach, moving beyond a cloud-only paradigm.
The path towards achieving fully autonomous vehicles is a lengthy and intricate journey. Systems that incorporate cutting-edge technologies to enhance vehicle autonomy levels must undergo rigorous safety and durability testing before they can be integrated into vehicles meant for public roads. These systems, collectively referred to as Advanced Driver Assistance Systems (ADAS), encompass a complex network of power supplies, sensors, and electronics. The effectiveness of ADAS largely hinges on the precision of the sensing equipment and the speed and accuracy of the onboard autonomous controller's analysis.
Artificial intelligence (AI) plays a pivotal role in the functioning of autonomous vehicles, particularly in the context of onboard analysis. Market research firm IDTechEx's recent report on AI hardware at the network edge predicts substantial growth, with AI chips – specialized semiconductor components designed to efficiently handle machine learning tasks – projected to generate over $22 billion in revenue by 2034. Among various industry verticals, the automotive sector is anticipated to experience the most significant growth, with a compound annual growth rate (CAGR) of 13% over the next decade.
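To put a 13% CAGR in perspective: compound growth multiplies the starting value by (1 + rate) each year, so ten years at 13% works out to a bit more than a threefold increase. The sketch below shows the calculation in both directions. The function names are illustrative, and no revenue baseline is assumed, since the figures above do not give one for the automotive segment.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Factor by which a value grows after compounding at `cagr` for `years`."""
    return (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that takes start_value to end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


multiple = growth_multiple(0.13, 10)
print(f"10 years at 13% CAGR multiplies revenue by about {multiple:.2f}x")
```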
AI chips in vehicles are typically situated within centrally located microcontrollers (MCUs), which are connected to sensors and antennae to form a functional ADAS. These onboard AI computing capabilities serve various purposes, including driver monitoring (driver-specific adjustments, drowsiness detection, and accident response), driver assistance (object detection and steering or braking corrections), and in-vehicle entertainment (onboard virtual assistants akin to those on smartphones and smart appliances).
Of these functions, driver assistance is the most critical, as it directly influences the level of autonomous driving a vehicle can achieve. The automotive industry's reference point for defining different levels of driving automation is the SAE Levels of Driving Automation, ranging from Level 0 (no automation) to Level 5 (full automation). Presently, the highest state of autonomy for private vehicles is SAE Level 2, with the transition to Level 3 representing a significant technological leap.
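For readers who want the scale in a form software can use, the six levels map naturally onto an ordered enumeration. The sketch below is one illustrative way to model it. The names for Levels 1 through 4 follow the SAE J3016 definitions rather than anything stated above, and the helper function is a hypothetical convenience, not part of any standard.

```python
from enum import IntEnum


class SAELevel(IntEnum):
    """SAE Levels of Driving Automation, ordered from 0 to 5."""
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5


def driver_must_supervise(level: SAELevel) -> bool:
    """True for Levels 0-2, where the human must monitor the road at all times.

    From Level 3 upward the system performs the driving task while engaged,
    which is why the step from Level 2 to Level 3 is such a large one.
    """
    return level <= SAELevel.PARTIAL_AUTOMATION


for level in SAELevel:
    role = "driver supervises" if driver_must_supervise(level) else "system drives when engaged"
    print(f"Level {level.value}: {level.name.replace('_', ' ').title()} ({role})")
```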
A variety of sensors, including LiDAR and vision sensors, installed in the vehicle collect crucial data, which is then processed by the central computing unit for steering and braking adjustments. Effective processing relies on extensive training of the machine learning algorithms employed by the AI chips. This training involves exposing the algorithms to large volumes of ADAS sensor data, enabling them to accurately detect, identify, and differentiate objects, as well as gauge depth of field and distinguish objects from their backgrounds. ADAS functions can be passive (alerting the driver through sounds, lights, or feedback) or active (making real-time adjustments for the driver), necessitating swift and precise calculations.
The development of Systems-on-Chip (SoCs) for vehicular autonomy is a relatively recent phenomenon, but a trend toward smaller process nodes, which deliver higher performance, is already evident. As autonomy levels rise, more computational power is required, and the shift to smaller nodes meets this demand by moving the computational complexity into the semiconductor circuitry itself.
However, transitioning to smaller nodes entails higher manufacturing costs, particularly with the use of advanced lithography machines. This cost factor poses a significant barrier to entry for many semiconductor manufacturers. Consequently, several Integrated Device Manufacturers (IDMs) are outsourcing high-performance chip production to foundries capable of advanced fabrication.
To ensure cost efficiency in the future, chip designers must consider scalability in their systems. As the adoption of autonomous driving levels progresses incrementally, designers who overlook scalability may incur escalating costs in increasingly advanced nodes. Hardware that can adapt to more advanced AI algorithms is essential.
While it will take some time before we witness vehicles with the highest levels of automation on the roads, the technology to reach that point is gaining momentum. The next few years are particularly crucial for the automotive industry as it navigates the path toward autonomous driving.