OpenAI’s text-to-video application, Sora, has rocketed to one million downloads in less than five days, a pace that outstrips the initial launch of ChatGPT. The rapid adoption, achieved despite the app being available only to a limited group of invited users in North America, has intensified a global debate over artificial intelligence and the use of copyrighted materials for training purposes. The app’s ability to generate short video clips from simple text prompts has pushed it to the top of Apple’s US App Store, demonstrating significant public interest in generative video technology.
This surge in popularity is forcing a direct confrontation between AI developers and intellectual property holders. The core of the issue lies in the massive datasets of images and videos required to train models like Sora, which creators and rights holders argue often include their work without permission, compensation, or credit. As AI-generated content featuring famous characters and the likenesses of public figures floods social media, the legal and ethical boundaries of “fair use” are being tested in courtrooms and in the court of public opinion, raising critical questions about ownership, creativity, and the financial exposure of a burgeoning industry.
A New Frontier in Content Creation
Sora represents a significant step forward in consumer-facing AI tools, packaging complex video-generation technology in a user-friendly application. Users can input written descriptions and, within moments, receive a ten-second video clip created by the AI. This ease of use has led to an explosion of creative, and often controversial, content. Bill Peebles, who heads the Sora division at OpenAI, acknowledged the “surging growth” in a post on the social media platform X, noting that his team was working diligently to keep up with demand. The app’s launch success, even in its limited-release phase, signals a powerful market appetite for tools that lower the barrier to video production.
However, the platform’s design, which encourages direct sharing to social media, has amplified concerns over its potential for misuse. The speed at which users can create and disseminate realistic videos has outpaced the development of clear legal and ethical guidelines. While some celebrate the democratization of creative tools, others worry about the implications for misinformation, deepfakes, and the erosion of intellectual property rights that have long protected artists and creators. The technology’s capabilities are advancing far faster than the legal frameworks designed to govern them.
Intellectual Property and Public Figures
Almost immediately after its release, Sora became a flashpoint for copyright and likeness rights violations. Users generated and shared videos depicting characters from well-known media franchises, such as Nintendo’s Pokémon, and digital recreations of deceased celebrities, including musicians Michael Jackson and Tupac Shakur. The issue gained personal resonance when Zelda Williams publicly asked people to stop sending her AI-generated clips of her late father, the actor Robin Williams, highlighting the emotional toll such content can take on families.
OpenAI has attempted to navigate this sensitive area by claiming that “strong free speech interests” apply to depictions of historical figures. For those considered “recently deceased,” the company states that authorized individuals can request the removal of a likeness, though it has not publicly defined the timeframe that qualifies as “recent.” The situation was further complicated when a video circulated showing OpenAI’s own CEO, Sam Altman, alongside Pokémon characters while a voiceover stated, “I hope Nintendo doesn’t sue us.” While Nintendo has not announced any formal legal action, the incident underscores the unresolved tension between AI-generated content and established copyright law.
The Murky Waters of Fair Use
The legal defense most often cited by AI developers is the doctrine of “fair use,” which permits the limited use of copyrighted material without permission for purposes such as criticism, commentary, or transformation. AI companies argue that training models on vast datasets is a transformative use because the final output is a new creation, not a direct copy of the source material. However, this interpretation is being fiercely contested. Critics and rights holders argue that the sheer volume of copyrighted data ingested by these models amounts to industrial-scale infringement that undermines the original work’s market value. Courts are now tasked with applying copyright principles written long before generative AI existed to a technology that can analyze and replicate artistic styles in seconds, with dozens of high-profile lawsuits pending.
The Legal and Financial Stakes
The legal battles surrounding AI training data carry immense financial risk for technology firms. A prominent example is the class-action lawsuit filed by authors against the AI company Anthropic. The company agreed to a landmark settlement of $1.5 billion to resolve claims that it used pirated digital copies of books to train its AI model, Claude. The lawsuit, brought by authors including Andrea Bartz and Charles Graeber, alleged that Anthropic downloaded millions of books from pirate websites. A federal court found that while the act of training itself could be considered fair use, acquiring the training data through illicit means was not. The settlement, which works out to roughly $3,000 per affected work across an estimated 500,000 books, sends a powerful message to the AI industry about the severe consequences of using unauthorized data.
Another lawsuit targeting OpenAI directly has gained momentum after a New York district court ordered the company to release internal communications related to the deletion of a dataset allegedly containing pirated books. Plaintiffs argue these communications could demonstrate willful infringement, which carries the potential for much higher statutory damages of up to $150,000 per work. These cases illustrate a growing trend of legal challenges that could reshape the economics of generative AI, potentially forcing companies to license their training data and share revenue with creators.
OpenAI’s Response and Future Plans
In response to the mounting criticism, OpenAI CEO Sam Altman has adopted a conciliatory public stance. In a blog post, he stated that the company is “learning quickly” from how people are using Sora and is actively taking feedback from users and rights holders. He announced plans to provide creators with “more granular control over generation of characters” and floated the idea of future revenue-sharing agreements, though no specific timeline or structure for such a system has been revealed.
Altman has also controversially framed some Sora videos as a new form of “interactive fan fiction,” a term traditionally used to describe written works by fans using existing characters. This argument attempts to position the AI-generated content within a legally contested but often tolerated area of creative expression. However, it remains uncertain whether courts or intellectual property holders will accept this characterization. At the same time, Altman has acknowledged that some users find Sora’s existing safeguards too restrictive, highlighting the difficult balance the company must strike between preventing misuse and fostering innovation. He has asked for “grace,” stating that “the rate of change will be high.”
Broader Industry Implications
The controversy surrounding Sora is not unique to OpenAI; it reflects a sector-wide challenge. Governments are beginning to respond to the rapid advancements. In Japan, officials have formally requested that OpenAI move to an “opt-in” system, which would require prior approval from rights holders before their work can be used as training data, a significant shift from the current “opt-out” model. This move signals a growing international appetite for stricter regulation of AI training practices.
As legal and regulatory pressures mount, the central conflict remains between the immense potential of generative AI as a creative tool and the unresolved ethical framework for its development. The outcomes of the current lawsuits and the direction of new legislation will likely determine the future of content creation. The industry is at a crossroads, where it must decide whether to build its future on a foundation of licensed data and collaboration with creators or continue to test the limits of fair use, risking further legal and public backlash.