Training data companies serving Silicon Valley’s leading AI developers, such as Scale AI and Appen, are recruiting poets and writers to enhance generative AI models

TL;DR:

  • Training data companies like Scale AI and Appen, which supply Silicon Valley’s AI giants, are actively hiring poets, novelists, and writers with advanced degrees to improve generative AI models.
  • Creative writers will craft short stories and assess the literary quality of AI-generated text, in English and in languages less represented online.
  • The collaboration between AI and creative writers signifies a strategic focus on enhancing fluency in poetic forms.
  • This investment holds promise for AI firms, offering a competitive edge in the evolving generative AI landscape.
  • The recruitment of expert creatives addresses AI’s limitations in replicating complex poetic structures and styles.
  • Major AI developers have historically relied on accessible databases, predominantly featuring English content.
  • Compensation for expert creatives in underrepresented languages can be substantial, emphasizing the demand for linguistic expertise.
  • The professionalization of data work is a significant shift, reflecting higher standards for expertise and language command.
  • Embracing creative writers may mitigate copyright infringement concerns, offering a solution to AI’s use of copyrighted material.

Main AI News:

In the competitive landscape of artificial intelligence development, Silicon Valley’s top players are turning to an unexpected source of talent: poets. High-profile training data companies like Scale AI and Appen are actively recruiting poets, novelists, playwrights, and writers with advanced degrees. Beyond English, they are seeking creative minds in languages less represented on the internet, such as Hindi and Japanese. The goal? To harness the creative prowess of wordsmiths to enhance their AI models.

These firms are enlisting contractors to craft short stories on specific topics, which will then be fed into AI models. The contractors will also evaluate the literary quality of AI-generated text, part of the annotation work that underlies AI’s impressive capabilities.

The symbiotic relationship between generative AI and creative writers gained prominence with the launch of ChatGPT in November 2022, celebrated for its ability to compose poems in English. Now, annotation firms are actively amassing a repository of creative writing data across various languages, signaling a strategic emphasis on fluency in poetic forms as they refine their generative writing products.

This investment holds the promise of significant returns for AI firms. Dan Brown, a professor at the University of Waterloo specializing in computational creativity, asserts, “Replicating classical language forms is a way of looking prestigious.” Scale AI and Appen boast prestigious client rosters, including industry giants like OpenAI, Meta, Google, and Microsoft. In the fiercely competitive generative AI race, being the first to conquer specific languages and markets can yield immense advantages.

An Appen spokesperson disclosed a surge in demand for writing contractors, particularly in non-English languages, since the close of 2022. They highlighted the unique expertise of creative writers in crafting high-quality training data for creative AI generation, encompassing poetry, song lyrics, and narrative writing.

Scale AI, while not revealing specific recruitment details, emphasized the enduring role of humans in the AI development loop, citing it as critical for responsible, safe, and accurate AI.

Training AI models to produce high-caliber literary content, like poetry, presents formidable challenges. The measure of creativity often hinges on novelty, meaning how distinct the AI-generated text is from existing literary works. Large language models like ChatGPT, however, are designed primarily to mimic existing human writing rather than to push creative boundaries.

Fabricio Goes, an informatics professor at the University of Leicester, echoes this sentiment, highlighting that AI systems are built to reproduce existing content rather than exhibit creativity. This divergence becomes evident when AI attempts to imitate the distinctive styles and structures of renowned poets, often falling short.

For instance, ChatGPT struggles to mirror the fluid and unstructured verses of Walt Whitman, reverting to rigid four-line stanzas despite attempts to deviate. These challenges intensify when AI is tasked with crafting poetry in languages beyond English. Adapting to diverse poetic styles, such as haiku and waka in Japanese, presents formidable hurdles.

Major AI developers have historically relied on easily accessible databases for training their models, including Project Gutenberg and Archive of Our Own. However, these sources predominantly feature content in English.

To bridge this linguistic gap, Scale AI and Appen are willing to pay a premium for creative writers who can excel in languages where AI struggles. For instance, expert Japanese-language poets can command rates as high as $50 per hour. This demand, combined with the requirement for advanced degrees, elevates the compensation for these professionals.

The shift towards professionalization in data work, as seen in the recruitment of expert creatives, reflects the evolving landscape of AI development. Companies are transitioning from building AI models from scratch to fine-tuning them for specific applications. This transition reflects higher standards for crowd-based data work, where expertise and command over language are paramount.

While the embrace of creative writers in AI development may raise questions about the sustainability of this employment model, it also offers a potential solution to one of the creative industry’s major criticisms—copyright infringement. Recent protests and lawsuits by creators in various industries highlight concerns about AI developers using copyrighted material without permission. Texts generated for Scale AI and Appen, however, are likely to be owned entirely by the training data companies or their clients.

Julian Posada, an assistant professor at Yale University, suggests that this shift towards purchasing creative writing for AI models could reshape the industry, particularly if the ongoing copyright lawsuits against AI developers succeed. The tech sector may increasingly rely on creative writers to craft original content that avoids infringing on copyright.

In harnessing the artistry of poets and writers, Silicon Valley’s AI developers are not merely aiming to replicate human creativity; they are striving to elevate it, heralding a new era in the evolution of artificial intelligence.

Conclusion:

Silicon Valley’s embrace of poets and writers to refine generative AI models signifies a strategic move towards elevating the quality and diversity of AI-generated content. This investment in creative talent not only enhances the capabilities of AI but also addresses copyright concerns in the industry. As AI firms compete to lead in the generative AI race, this approach could offer a unique edge in capturing untapped markets and languages while avoiding legal pitfalls associated with copyrighted material.
