
AI safety and content moderation

Chatbot safety, content filtering, deepfake detection, and child safety in AI systems.

Deepfakes, disinformation, and the fight for media authenticity
Research · Safety

The growing threat of deepfakes and AI-generated misinformation, and the technologies fighting back.

2025-11-13 · 18 min read
Deepfakes · Misinformation · Media
E-commerce content moderation at scale: AI-powered brand safety
Industry · Safety

How AI-powered content moderation handles 500K+ daily submissions while maintaining brand safety standards.

2025-11-10 · 17 min read
E-commerce · Content Moderation · Brand Safety
Enterprise customer service chatbot safety: preventing brand risk at scale
Industry · Safety

How enterprise chatbots can go wrong, and the safety frameworks needed to prevent brand-damaging incidents at scale.

2025-11-08 · 16 min read
Chatbot · Customer Service · Brand Safety
Protecting young minds: AI ethics for children and education
Research · Safety

The unique safety challenges of AI systems designed for children and educational contexts.

2025-11-06 · 15 min read
Children · Education · Safety
AI safety incidents of 2024: lessons from real-world failures
Industry · Safety

An analysis of major AI safety incidents in 2024 and the lessons they teach about building safer AI systems.

2025-11-04 · 21 min read
Safety Incidents · 2024 · Lessons Learned
The future of AI content moderation: smarter, safer, more responsible
Research · Safety

How AI content moderation is evolving beyond keyword filters toward multi-dimensional safety evaluation.

2025-11-02 · 13 min read
Content Moderation · Future · Safety
Ensuring safety in AI responses: the safety aspect
Research · Safety

A detailed look at the safety dimension of RAIL Score and how it measures harmful content in AI outputs.

2025-10-24 · 12 min read
Safety · RAIL Score · Harmful Content
When AI chatbots go wrong: how to fix them
Research · Safety

Common failure modes in AI chatbots, and practical strategies for detecting and preventing harmful responses.

2025-10-20 · 14 min read
Chatbots · Safety · Failure Modes

Try RAIL Score for safety

Evaluate your AI outputs across 8 dimensions of responsible AI.
