Rapid Adoption of Generative AI Accelerates the Need for Data Management

TL;DR:

  • Interest in large language models intensifies the focus on data management, placing pressure on technology leaders to ensure effective storage, filtering, and protection of data for AI use.
  • Companies with robust data infrastructure can leverage large language models for custom business applications, gaining a competitive edge.
  • CIOs seek assistance from data experts and specialized vendors to establish data frameworks that enable generative AI applications.
  • Data quality, formatting, and organization are crucial for training AI models, emphasizing the need for comprehensive data governance.
  • Startups like Granica offer services to optimize data storage, processing, and protection for generative AI, reducing costs and protecting sensitive data.
  • Concerns regarding data privacy, context, and quality are amplified by the adoption of generative AI.
  • Companies are exploring innovative solutions, such as chatbots built on large language models, to improve operations and gain a competitive advantage.

Main AI News:

In today’s business landscape, the rise of large language models, exemplified by OpenAI’s ChatGPT, has intensified the importance of effective data management. As a result, technology leaders within organizations are facing increased pressure to establish robust systems for storing, filtering, and safeguarding data to support AI applications.

Rob Zelinka, the Chief Information Officer of Jack Henry, a financial technology firm, emphasizes the significance of data governance across industries. He asserts that with the integration of large language models, the need for a well-structured data management approach becomes even more critical.

Companies that have already laid the groundwork for a solid data infrastructure are at a distinct advantage. They can swiftly leverage large language models to enhance various aspects of their operations, including contract management, customer service, and code generation. To outpace their competitors in the race for innovation, business technology leaders now face mounting demands to deliver data frameworks that enable the practical implementation of generative AI applications.

To address this challenge, some Chief Information Officers have sought assistance from in-house data experts and external vendors specializing in data infrastructure setup and cost management. Data, encompassing transaction records, analytics, code, and proprietary information, serves as the backbone of any AI model. It is instrumental in training algorithms to identify patterns and make accurate predictions.

Larry Pickett, the Chief Information and Digital Officer of Syneos Health, takes responsibility for formulating a comprehensive corporate data management strategy. The primary focus of this strategy is to streamline, cleanse, and organize data across all departments. Syneos Health has already integrated data from its operational systems, such as enterprise resource planning and clinical trial information, into a centralized digital repository called a data lake.

To further refine its data infrastructure, Syneos Health dedicated approximately 18 months to preparing its data repository for training AI models. Pickett spearheaded a team of data scientists and business domain experts to construct “feature stores,” which act as centralized repositories for reusable machine learning building blocks.
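
The article does not describe how Syneos Health's feature stores are implemented; purely as an illustration of the concept, the sketch below shows a minimal in-memory feature store in Python, where precomputed features are registered once and reused across models. The class, entity, and feature names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FeatureStore:
    """Minimal in-memory feature store: precomputed features are
    registered once and reused by any model that needs them."""
    # features[entity_id][feature_name] -> value
    features: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def register(self, entity_id: str, name: str, value: Any) -> None:
        """Store a precomputed feature value for an entity (e.g. a clinical trial)."""
        self.features.setdefault(entity_id, {})[name] = value

    def get_vector(self, entity_id: str, names: List[str]) -> List[Any]:
        """Fetch a reusable feature vector for model training or inference."""
        row = self.features.get(entity_id, {})
        return [row.get(n) for n in names]


# Hypothetical usage: features computed once by a data team, reused by many models.
store = FeatureStore()
store.register("trial_001", "site_count", 42)
store.register("trial_001", "median_enrollment_days", 87)
print(store.get_vector("trial_001", ["site_count", "median_enrollment_days"]))
# -> [42, 87]
```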

Moreover, the company prioritizes efficient data storage by deleting obsolete information and retaining only the data required for AI, dashboards, and other applications. Pickett emphasizes the importance of staying vigilant in managing cloud and data storage costs to prevent unnecessary expenditures.

The training of large language models requires access to substantial amounts of data, which makes storage, processing, and protection costly. Startups such as Granica, a recently launched company based in Mountain View, California, offer services tailored to help companies harness generative AI effectively. By compressing data stored on cloud platforms such as Amazon Web Services and Google Cloud, Granica significantly reduces the size and cost of cloud object storage, which houses vast amounts of unstructured data.
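
Granica's compression techniques are proprietary and not described here; purely to illustrate the general idea of shrinking objects before they land in cloud object storage, the sketch below gzips a local file and uploads the compressed bytes to an S3 bucket with boto3. The bucket and key names are hypothetical, and real savings depend on the data and the compression method.

```python
import gzip

import boto3  # AWS SDK for Python


def upload_compressed(path: str, bucket: str, key: str) -> None:
    """Gzip a local file and store the compressed bytes in S3 object storage."""
    with open(path, "rb") as f:
        raw = f.read()
    compressed = gzip.compress(raw)

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=compressed,
                  ContentEncoding="gzip")

    print(f"{path}: {len(raw)} bytes -> {len(compressed)} bytes "
          f"({100 * len(compressed) / max(len(raw), 1):.0f}% of original)")


# Hypothetical bucket and key names.
upload_compressed("events.jsonl", "my-data-lake", "raw/events.jsonl.gz")
```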

Granica’s approach has attracted significant attention, securing $45 million in funding from prominent venture-capital firms such as New Enterprise Associates and Bain Capital Ventures. These investments will allow Granica to continue offering services that help companies leverage generative AI while keeping costs down and protecting sensitive data.

In the pursuit of securing AI training data, Nylas, a provider of email, calendar, and contacts APIs, has turned to Granica’s Screen service. This service excels at removing sensitive company data and personally identifiable information during the data compression process. John Jung, Nylas’s Vice President of Engineering, highlights the importance of scrubbing personally identifiable information from generative AI tools to prevent the generation of false results.
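
How Granica Screen detects sensitive data is not disclosed in the article; as a deliberately simplified stand-in for the idea of stripping personally identifiable information before data reaches a model, the sketch below redacts e-mail addresses and US-style phone numbers with regular expressions. Production PII detection covers far more (names, addresses, identifiers, context-aware matching).

```python
import re

# Hypothetical, intentionally narrow patterns; real PII detection is much broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")


def scrub_pii(text: str) -> str:
    """Replace e-mail addresses and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


print(scrub_pii("Contact jane.doe@example.com or 415-555-0123 for details."))
# -> "Contact [EMAIL] or [PHONE] for details."
```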

Industry analysts predict a surge in startups dedicated to helping companies manage data access and sift through data for generative AI purposes. Organizations increasingly recognize that maintaining data quality is just as important as controlling costs: data must be correctly formatted, organized, and relevant for training AI models. Without proper cleansing and categorization, businesses risk storing meaningless data and incurring unnecessary expenses.
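
What counts as “correctly formatted, organized, and relevant” differs by organization; as one illustrative example of basic cleansing before records are stored for training, the pandas sketch below drops duplicate and empty records and normalizes a text column. The column names and sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical raw records pulled from an operational system.
raw = pd.DataFrame({
    "ticket_id": [101, 101, 102, 103],
    "category":  [" Billing", "Billing", None, "Support "],
    "body":      ["Refund request", "Refund request", "", "Password reset"],
})

cleaned = (
    raw.drop_duplicates(subset="ticket_id")   # remove duplicate records
       .dropna(subset=["category"])           # drop rows missing a label
       .assign(category=lambda d: d["category"].str.strip().str.lower())  # normalize text
       .query("body != ''")                   # discard empty documents
)

print(cleaned)
```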

Jack Henry, for instance, is presently prioritizing data governance efforts. Rob Zelinka collaborates with the company’s Chief Risk Officer to define data access rights and usage policies. Additionally, in partnership with the Chief Technology Officer, the team is exploring ways to integrate generative AI into their products and platforms.

Erick Brethenoux, a Distinguished Vice President Analyst at Gartner, highlights the prevalent concerns among companies regarding data quality, context, and privacy when employing large language models. Although these challenges have existed for some time, the current interest in generative AI has significantly accelerated the urgency to address them.

Syneos Health is preparing to introduce its “Protocol Genius” tool, a chatbot built on the OpenAI large language model behind ChatGPT, which lets users search an extensive database of 400,000 clinical protocols. Driven by a competitive business landscape, Syneos Health aims to lead the way, confident that others will follow suit in adopting similar technologies.
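
The article does not describe how Protocol Genius works beyond its use of OpenAI’s model; a common pattern for searching a large document set with an LLM is to retrieve the most relevant documents first and then ask the model to answer using only those. The sketch below shows that retrieve-then-ask pattern with a naive keyword score and the OpenAI Python client; the model name, corpus, and question are placeholders, not Syneos Health’s actual system.

```python
from openai import OpenAI  # requires the `openai` package and an API key

# Hypothetical miniature corpus standing in for a large protocol database.
protocols = {
    "PROT-001": "Phase 2 oncology trial protocol: dosing schedule, eligibility criteria...",
    "PROT-002": "Phase 3 cardiology trial protocol: endpoints, randomization plan...",
}


def top_match(question: str):
    """Naive keyword retrieval: return the protocol sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(protocols.items(),
               key=lambda kv: len(q_words & set(kv[1].lower().split())))


def ask(question: str) -> str:
    """Retrieve the best-matching protocol, then ask the model to answer from it alone."""
    doc_id, doc_text = top_match(question)
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer using only the provided protocol excerpt."},
            {"role": "user",
             "content": f"Protocol {doc_id}:\n{doc_text}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content


print(ask("What are the eligibility criteria for the oncology trial?"))
```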

Conclusion:

The rise of generative AI has underscored the significance of effective data management in today’s business landscape. Companies that prioritize data governance and invest in robust data infrastructure are better positioned to harness the power of large language models for enhanced business applications. Startups specializing in optimizing data storage and security for generative AI offer valuable solutions, catering to the increasing demand in the market. As organizations navigate the challenges of data privacy, context, and quality, they must remain proactive in adapting their data management strategies to unlock the full potential of generative AI technologies.

Source