The Hidden World of YouTube: Fueling AI with Obscure Videos

Researchers from the University of Massachusetts Amherst have analyzed random samples of YouTube videos to understand what AI models are being trained on. Their findings reveal many videos made for small, personal audiences, a significant proportion of them created by children under 13. This research raises concerns about privacy and copyright as companies like OpenAI use these videos to develop AI models.

PTI | Amherst | Updated: 28-06-2024 11:06 IST | Created: 28-06-2024 11:06 IST

Amherst, Jun 28 (The Conversation)—As the artificial intelligence revolution gathers pace, data remains its lifeblood. OpenAI and Google have turned to YouTube as a rich source of training data. However, what exactly comprises this YouTube archive? A team from the University of Massachusetts Amherst set out to investigate, analyzing random samples of YouTube videos to demystify this extensive dataset.

Their 85-page publication sheds light on the surprising contents of YouTube. They discovered many videos intended for personal use or small groups, with a significant proportion created by children under 13.

While most users experience YouTube through algorithmically recommended videos, a vast iceberg of obscure content remains unexplored. Researchers documented thousands of personal videos with minimal views but high engagement, indicating they were meant for a small audience, such as friends and family. This contrasts with the widely known popular content, exposing another layer of YouTube as a video-centered social network for close-knit groups.

The research gains urgency in the context of a New York Times exposé revealing that OpenAI and Google are leveraging these videos to train their large language models. Concerns about YouTube's terms of service, copyright issues, and the sheer volume of data—including content from kids—are growing.

While not condemning Google, the researchers underscore that OpenAI's opacity about its training materials and the potential inclusion of user-generated content from children pose serious ethical questions. Given the Federal Trade Commission's Children's Online Privacy Protection Rule, regulatory efforts are needed to ensure legal protections for user data, particularly as AI continues to evolve.

(This story has not been edited by Devdiscourse staff and is auto-generated from a syndicated feed.)
