A global report released Thursday found that AI foundation models such as Meta’s Llama 2 and OpenAI’s GPT-4 lack transparency. The Foundation Model Transparency Index, compiled by AI researchers from Stanford University, MIT Media Lab, and Princeton University, assessed 10 major foundation models on how openly their developers share information about how the models are built and how users interact with their systems.
The report revealed a fundamental lack of transparency in the AI industry, with no major foundation model developer meeting the criteria for adequate transparency. Among the tested models, Meta’s Llama 2 scored the highest at 54%, followed by BloomZ at 53% and OpenAI’s GPT-4 at 48%.
The report also covered several other models: Stability’s Stable Diffusion (47%), Google’s PaLM 2 (40%), Anthropic’s Claude (36%), Cohere’s Command (34%), AI21 Labs’ Jurassic-2 (25%), Inflection’s Inflection-1 (21%), and Amazon’s Titan (12%).
As these models increasingly shape society, the report indicates a concerning decline in transparency. The researchers express worry that if this trend persists, foundation models might end up as opaque as social media platforms and other past technologies, potentially repeating their shortcomings.
The report measured transparency using 100 indicators covering how the models are created, how they function, and how users interact with them. The researchers evaluated each company based on its flagship foundation model, using information the developers had publicly disclosed as of September 15.
Two researchers scored each developer against the 100 indicators, checking whether the developer satisfied each one based on publicly available information. The initial scores were then shared with the companies, which were given the opportunity to contest scores they disagreed with.
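The scoring arithmetic described above is straightforward: each developer's score is the percentage of the 100 binary indicators it satisfies. A minimal sketch, assuming a simple checklist of booleans (the function and indicator names here are hypothetical illustrations, not the researchers' actual code):

```python
# Illustrative sketch: score a developer as the percentage of
# binary transparency indicators it satisfies.

def transparency_score(indicators: dict[str, bool]) -> float:
    """Return the share of satisfied indicators as a percentage."""
    if not indicators:
        return 0.0
    met = sum(indicators.values())  # True counts as 1, False as 0
    return 100 * met / len(indicators)

# Hypothetical checklist for one developer: 54 of 100 indicators met,
# matching Llama 2's reported top score of 54%.
checklist = {f"indicator_{i}": i < 54 for i in range(100)}
print(transparency_score(checklist))  # 54.0
```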
Even though the average score was only 37%, there were positive signs. For 82 of the 100 indicators, at least one developer satisfied the indicator, suggesting ample room for improvement. The researchers argue that developers can enhance transparency by adopting the best practices already demonstrated by their peers. They see this report as a starting point and plan to track improvements in future versions of the Index.