Research Article Impacting Stock Market
A research article has rarely had such a significant impact on the stock market. Over the past week, chip giants have seen their stock prices plummet, erasing some of the spectacular gains made over the past year. The reason behind this decline is a new technique unveiled by Google researchers that hints at a significant reduction in the memory required to run generative artificial intelligence models. This advancement could lower inference costs while dampening demand for certain chips.
The impact on memory manufacturers is hard to determine. Firstly, because the technology is still in the research stage, with no guarantee of large-scale applications or real benefits. Secondly, because any gains could be offset by increased usage, driven by the decrease in AI costs. Additionally, the demand for these chips continues to outstrip supply. The market’s reaction illustrates a persistent ambivalence between significant commercial prospects and fears of a possible bubble.
Memory Requirements Could Decrease Sixfold
The rise of generative AI does not rely solely on the immense computing power provided by graphics cards. It also depends heavily on vast amounts of memory, particularly high-bandwidth memory (HBM) chips, which are crucial for model training. This massive appetite has led to a production crisis, resulting in a reallocation of capacity towards AI components and a surge in prices. The repercussions extend beyond the sector, affecting chips used in smartphones and PCs.
The Google team’s innovation focuses on the inference phase, the process of generating text or an image. Their algorithm compresses AI models, significantly reducing memory needs without any loss of accuracy. Tests on several open-source models achieved a sixfold reduction. According to the researchers, the implementation is exceptionally efficient and results in negligible execution overhead, and thus negligible additional cost.
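The article does not detail the researchers’ algorithm. To give a sense of how model compression can cut inference memory, here is a minimal sketch of post-training weight quantization, a common generic technique and not necessarily the method described in the paper: 32-bit floating-point weights are mapped to 8-bit integers plus a scale factor, shrinking storage about fourfold with only a small reconstruction error.

```python
import numpy as np

# Hypothetical weight matrix standing in for one layer of a model.
rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256)).astype(np.float32)

# Symmetric int8 quantization: one scale factor for the whole matrix.
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)

# Dequantize on the fly at inference time.
restored = quantized.astype(np.float32) * scale

ratio = weights.nbytes / quantized.nbytes   # 4.0: float32 -> int8
max_error = np.abs(weights - restored).max()
print(f"compression ratio: {ratio:.1f}x, max error: {max_error:.4f}")
```

Real compression schemes go further (per-channel scales, lower bit widths, entropy coding), which is how reductions beyond 4x become possible; this sketch only illustrates the trade-off between memory footprint and numerical precision.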
Disproportionate Reaction?
Despite the efficiency gains, these results do not imply a sixfold reduction in overall memory needs. Google’s compression method could, however, lead to a significant drop in inference costs – a potentially crucial advancement as AI development expands usage. It could also enable local model execution without cloud platform costs.
Analysts believe market concerns about memory demand are exaggerated. Three arguments support this view. Firstly, the most lucrative HBM chips should not be affected. Secondly, efficiency gains could allow more models to run on a fixed amount of memory rather than reducing the volumes required for equivalent use. Lastly, reduced inference costs could accelerate AI deployment and thus increase total demand – a phenomenon known as the Jevons paradox.
For more information:
– Nvidia’s Strategic Shift with its First Dedicated Inference Chip
– AI Boosts Profits for Samsung and SK Hynix