TL;DR:
- Informatica announces CLAIRE GPT and CLAIRE Co-pilot, the latest AI-powered data management platform and its companion application.
- CLAIRE GPT utilizes large language models and generative AI, which allows customers to reduce their time spent on data management tasks by up to 80%.
- CLAIRE processes 54 trillion data management transactions every month, holds 23 petabytes of data, and includes 50,000 metadata-aware connections.
- DQ Insights, a part of Informatica’s data observability capabilities, utilizes generative AI capabilities to provide fixes for detected data quality issues.
- CLAIRE GPT’s generative AI capabilities will also deliver automatic data classification at a petabyte scale to help generate data governance artifacts and write business rules for MDM tasks.
- CLAIRE GPT can learn from Informatica’s massive repository of historical MDM data and use GPT-3.5 and other large language models to generate suggestions.
- Informatica’s CLAIRE Co-pilot can help automate repetitive tasks such as debugging, testing, refactoring, and documentation, and can work as a team with CLAIRE GPT.
- CLAIRE GPT uses OpenAI’s GPT-3.5 alongside other models, including ones developed for specific industries and for individual customers.
- CLAIRE users see a lightning bolt icon next to insights and recommendations on screen.
- The “human in the loop” approach helps minimize possible errors from LLM hallucinations.
Main AI News:
Informatica has recently announced its latest release of the AI-powered data management platform, CLAIRE GPT, which is now available in the cloud, along with its companion application, CLAIRE Co-pilot. According to the company, CLAIRE GPT utilizes large language models and generative AI, which allows customers to reduce their time spent on data management tasks such as data mapping, data quality, and governance by up to 80%.
Since CLAIRE’s inception in 2017, Informatica has been using AI and machine learning technology to improve data management. Because data management can itself be a big data problem, Informatica adopted AI and ML technologies to spot patterns across its platform and generate useful predictions. While some customers remain on-prem, plenty of Informatica customers have migrated to the cloud, where they can benefit from advanced AI-powered data management capabilities while also contributing to the development of these capabilities.
CLAIRE processes a staggering 54 trillion data management transactions every month, representing various data governance tasks such as ETL/ELT, master data management (MDM) matching, data catalog entry, data quality rule-making, and more. CLAIRE holds an enormous 23 petabytes of data and includes 50,000 metadata-aware connections that represent every operating system, database, application, file system, and protocol.
Informatica is now taking its AI/ML game to the next level with the release of CLAIRE GPT, its next-generation data management platform. According to Informatica Chief Product Officer Jitesh Ghai, the company is leveraging all the data in CLAIRE to train large language models that can handle common data quality and MDM tasks on behalf of users.
Ghai states, “Now, in the cloud, all of that metadata and all of the AI and ML algorithms are expanded to support data integration workloads and make it simpler to build data pipelines, to auto-identify data quality issues at petabyte scale. This was not done before. This is new. We call it DQ Insights, a part of our data observability capabilities.” DQ Insights utilizes generative AI capabilities to provide fixes for detected data quality issues.
Moreover, the platform allows automatic data classification at the petabyte scale, which helps to generate data governance artifacts and write business rules for MDM tasks. Some of these generative AI capabilities will be delivered via CLAIRE Co-pilot, which is part of CLAIRE GPT.
With CLAIRE GPT learning from Informatica’s massive repository of historical MDM data and using GPT-3.5 and other large language models to generate suggestions, customers can radically simplify their master data management. Ghai notes that an MDM project that would typically take 12 to 16 months to complete can now be completed in just weeks.
For instance, one of Informatica’s car manufacturing customers previously employed 10 data engineers for over two years to develop 200 classifications of proprietary data types within their data lake. However, with Informatica’s auto-classification, they were able to generate 400 classifications within minutes. This not only saves time and money but also allows organizations to allocate resources more effectively.
Informatica’s latest development, CLAIRE GPT, promises to transform the data management game with its prompt-based data management experiences. With this suite of tools, even users with minimal technical skill sets can create prompts for CLAIRE GPT to follow.
For instance, a user could simply say, “CLAIRE, connect to Salesforce. Aggregate customer account data on a monthly basis. Address data quality inconsistencies with date format. Load into Snowflake.”
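To make the steps in that prompt concrete, here is a minimal sketch in plain Python of the kind of work such a pipeline performs: normalizing inconsistent date formats, then aggregating account data by month. It is not Informatica’s implementation or API. The record fields, the list of date formats, and the function names are all hypothetical, and an in-memory list stands in for the Salesforce source and the Snowflake target.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical set of date formats that a source system might mix together.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_date(raw):
    """Return the date as an ISO string, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def aggregate_monthly(records):
    """Sum amounts per (account, YYYY-MM) and collect rows with unparseable dates."""
    totals = defaultdict(float)
    rejected = []
    for rec in records:
        iso = normalize_date(rec["date"])
        if iso is None:
            rejected.append(rec)  # left for a person to review
            continue
        totals[(rec["account"], iso[:7])] += rec["amount"]
    return dict(totals), rejected


# Stand-in for rows extracted from the source system (illustrative data only).
source_rows = [
    {"account": "Acme", "date": "2023-05-03", "amount": 100.0},
    {"account": "Acme", "date": "05/17/2023", "amount": 50.0},
    {"account": "Acme", "date": "02.06.2023", "amount": 25.0},
    {"account": "Globex", "date": "not a date", "amount": 10.0},
]

monthly_totals, rejected_rows = aggregate_monthly(source_rows)
```

Rows whose dates cannot be parsed are set aside rather than guessed at, so the monthly totals only include records with a recognized date.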
But Informatica doesn’t stop there. CLAIRE Co-pilot, to be launched in Q3 or Q4 of this year, can assist in automating repetitive tasks such as debugging, testing, refactoring, and documentation. As subject matter expert stand-ins, the two AI assistants can work as a team in a manner similar to pair programming. “Pairs programming has its benefits with two people supporting each other and coding,” Ghai says. “Data management and development equally can benefit from an AI assistant, and Claire Co-pilot is that AI assistant [that] delivers automation, insights, and benefits for data integration, for data quality, for master data management, for cataloging for governance, as well as to democratize data through the marketplace to our data marketplace.”
CLAIRE GPT uses OpenAI’s GPT-3.5 to generate responses alongside traditional classification and clustering algorithms. However, Informatica is also working with Google’s Bard and Meta’s LLaMA for some language tasks. “We have what we think of as a system of models, a network of models, and the path you go down depends on the data management operation,” said Ghai. “It depends on the instruction, depends on whether it’s ingestion or ETL or data quality or classification.”
Furthermore, Informatica uses models developed specifically for certain industries, such as financial services or healthcare. “And then we have local tenanted models that are for individual customers bespoke to their operations,” Ghai says. “That’s the magic of interpreting the instruction and then routing it through our network of models depending on the understanding of what is being asked and then what data management operations need to be conducted.”
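The routing idea Ghai describes can be illustrated with a small sketch. This is not Informatica’s design: the model names, the registries, and the precedence order (tenant-specific model first, then industry model, then a general model per operation) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    operation: str                  # e.g. "ingestion", "etl", "data_quality", "classification"
    industry: Optional[str] = None  # e.g. "financial_services", "healthcare"
    tenant: Optional[str] = None    # customer identifier, if a bespoke model exists


# Hypothetical registries; every model name below is a placeholder.
TENANT_MODELS = {("acme-bank", "data_quality"): "acme-bank-dq-model"}
INDUSTRY_MODELS = {("financial_services", "classification"): "finserv-classifier"}
OPERATION_MODELS = {"etl": "general-etl-model", "data_quality": "general-dq-model"}
DEFAULT_MODEL = "general-llm"


def route(req: Request) -> str:
    """Pick the most specific model registered for this request."""
    if req.tenant is not None:
        model = TENANT_MODELS.get((req.tenant, req.operation))
        if model is not None:
            return model
    if req.industry is not None:
        model = INDUSTRY_MODELS.get((req.industry, req.operation))
        if model is not None:
            return model
    return OPERATION_MODELS.get(req.operation, DEFAULT_MODEL)


chosen = route(Request("data_quality", industry="financial_services", tenant="acme-bank"))
```

Trying the most specific registry first means a customer’s bespoke model takes priority when one exists, while every other request falls back to a broader model.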
CLAIRE users see a lightning bolt icon next to insights and recommendations on screen. If data quality issues arise, the AI assistant highlights them as identified issues for the user to validate. This “human in the loop” approach helps minimize possible errors from LLM hallucinations, according to Ghai.
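A human-in-the-loop gate of this kind can be sketched in a few lines. The example below is a generic illustration, not Informatica’s workflow: the `Suggestion` type, its status values, and the sample data are hypothetical. The point it shows is that a machine-generated fix changes the data only after a person has approved it.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    row_id: int
    field: str
    proposed_value: str
    status: str = "pending"  # "pending", "approved", or "rejected"


def apply_approved(rows, suggestions):
    """Return a copy of rows with only the human-approved suggestions applied."""
    updated = {row_id: dict(row) for row_id, row in rows.items()}
    for s in suggestions:
        if s.status == "approved":
            updated[s.row_id][s.field] = s.proposed_value
    return updated


# Illustrative data: two rows and two machine-generated fixes.
rows = {
    1: {"country": "Untied States"},
    2: {"country": "Germany"},
}
suggestions = [
    Suggestion(row_id=1, field="country", proposed_value="United States"),
    Suggestion(row_id=2, field="country", proposed_value="Atlantis"),  # a bad suggestion
]

# A reviewer validates the first fix and rejects the second.
suggestions[0].status = "approved"
suggestions[1].status = "rejected"

cleaned = apply_approved(rows, suggestions)
```

Because pending and rejected suggestions are never applied, a hallucinated fix stays out of the data unless a reviewer explicitly accepts it.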
Conclusion:
Informatica’s new CLAIRE GPT and CLAIRE Co-pilot tools represent a significant step forward in the field of AI-powered data management. By streamlining the data management process and making it more accessible to users with limited technical skills, these tools promise to increase efficiency and productivity across a wide range of industries.
Furthermore, the generative AI capabilities of CLAIRE GPT and the automation offered by CLAIRE Co-pilot have the potential to transform the way businesses approach data management, allowing them to free up resources and focus on core business activities. As such, we expect that these new tools will have a significant impact on the market, driving increased adoption of AI-powered data management solutions in the years to come.