- Researchers unveil vulnerabilities in closed AI models from OpenAI and Google.
- The attack method recovers a hidden layer of the model, the embedding projection layer, through ordinary API queries.
- The cost of attack varies based on model size and query volume.
- Specifics of newer models remain undisclosed, but dimensions of deprecated models are revealed.
- Attack exposes critical aspects of models, raising concerns about replication and misuse.
- Proactive monitoring and robust security measures are essential to safeguard AI assets.
Main AI News:
OpenAI and Google have faced a setback as researchers uncovered vulnerabilities in their closed AI models. Computer scientists from Google DeepMind, ETH Zurich, and other institutions have devised an attack that sheds light on otherwise hidden layers of production transformer models.
This breakthrough partially exposes the inner workings of these so-called “black box” models, revealing crucial details such as the embedding projection layer, the final layer that maps a model’s hidden states to its output logits. The attack is carried out entirely through ordinary API queries, at a cost ranging from a few dollars to several thousand depending on the model’s size and the number of queries required.
In a recent paper, the researchers detail how, for under $20, they extracted the entire projection matrix of OpenAI’s ada and babbage models, confirming hidden dimensions of 1024 and 2048, respectively. They also recovered the hidden dimension of the gpt-3.5-turbo model and estimate it would cost under $2,000 in queries to recover its entire projection matrix.
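The mechanics behind this are straightforward in outline. Because the projection layer maps a comparatively small hidden state onto a vocabulary of tens of thousands of logits, every logit vector the API returns lies in a subspace whose dimension equals the model’s width. The sketch below illustrates that core idea with a locally simulated model in NumPy rather than real API calls; the sizes, the random “model,” and the noise threshold are all illustrative assumptions, not details of OpenAI’s or Google’s systems.

```python
# Minimal sketch of the width-recovery idea, simulated locally with NumPy.
# All sizes and the random "model" below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 8_000   # output vocabulary size (illustrative)
hidden_dim = 256     # the secret width we pretend not to know
n_queries = 512      # number of prompts "queried"; must exceed the width

# The embedding projection layer maps hidden states to logits: logits = W @ h.
W = rng.standard_normal((vocab_size, hidden_dim))

# Each query yields one full logit vector; stacking n_queries of them gives a
# matrix whose rank is at most hidden_dim, since every row lies in W's column space.
H = rng.standard_normal((n_queries, hidden_dim))  # unknown hidden states
Q = H @ W.T                                       # observed logit vectors

# The number of singular values that stand clearly above numerical noise
# equals the hidden dimension -- the model's width.
singular_values = np.linalg.svd(Q, compute_uv=False)
threshold = singular_values[0] * 1e-6
estimated_width = int(np.sum(singular_values > threshold))
print(f"estimated hidden dimension: {estimated_width}")  # -> 256
```

In the real attack the logit vectors come from the provider’s API rather than from a simulated matrix, and the width shows up as a sharp drop-off in the sorted singular values rather than a fixed threshold.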
Both OpenAI and Google were notified of the findings and have since taken steps to fortify their defenses against such attacks. While the dimensions of the newer gpt-3.5-turbo models remain undisclosed, the sizes of the deprecated ada and babbage models were published, a disclosure deemed harmless given their obsolescence.
Although the attack doesn’t unravel a model’s full architecture, it exposes critical components such as the final weight matrix and the model’s width (its hidden dimension), which offer clues about the model’s overall parameter count and capabilities. This level of insight raises concerns about potential replication and misuse of proprietary models, prompting discussions on safeguarding AI advancements.
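A similar sketch, again in a simulated setting with purely illustrative sizes rather than real API data, shows why the attack pins down the final weight matrix only up to an unknown invertible transform of the hidden space, which is what limits it to this one layer rather than the whole model.

```python
# Sketch of final-layer recovery up to an unknown transform, simulated in NumPy.
# Sizes and the random "model" are illustrative assumptions, mirroring the
# width-recovery sketch above.
import numpy as np

rng = np.random.default_rng(1)
vocab_size, hidden_dim, n_queries = 8_000, 256, 512

W = rng.standard_normal((vocab_size, hidden_dim))        # secret final weight matrix
Q = rng.standard_normal((n_queries, hidden_dim)) @ W.T   # observed logit vectors

# The top hidden_dim right singular vectors of Q span the column space of W,
# so they determine W up to multiplication by some invertible matrix that
# cannot be identified from logits alone.
_, _, Vt = np.linalg.svd(Q, full_matrices=False)
basis = Vt[:hidden_dim].T                                # (vocab_size x hidden_dim)

# Projecting the true W onto the recovered subspace reproduces it almost exactly.
residual = W - basis @ (basis.T @ W)
print(f"relative residual: {np.linalg.norm(residual) / np.linalg.norm(W):.2e}")
```

Recovering the width and the span of the final layer in this way is what it means to expose part, but not all, of a proprietary model.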
In response to these developments, experts emphasize the importance of monitoring usage patterns and implementing robust security measures to safeguard sensitive AI assets. As AI continues to evolve, proactive measures are essential to maintain integrity and security in the face of emerging threats.
Conclusion:
The discovery of vulnerabilities in closed AI models presents significant implications for the market. It underscores the importance of robust security measures to safeguard sensitive AI assets against potential exploitation. As AI continues to advance, proactive measures and ongoing vigilance are imperative to maintain integrity and security in the evolving landscape of artificial intelligence.