Content Moderation on goodinfo.net Daily

Researchers Find ChatGPT Can Generate Violent and Sexualized Images

goodinfo.net — Thu, 18 Jun 2026 07:07:00 +0800

Core Summary

The BBC reports that researchers have found specific prompts that bypass ChatGPT’s safety filters and produce violent and sexualized images. The finding has reignited public debate over the safety boundaries of AI-generated content and highlights the continuing difficulty generative AI models have with content moderation.

Event Details

According to BBC Technology, several independent research teams testing OpenAI’s latest image generation capabilities found that, despite layered built-in safety protections, carefully crafted indirect prompts can still induce the model to output content that violates its usage policies. That content includes images depicting violent scenes and sexually suggestive material.

Researchers point out that the attack methods primarily exploit prompt injection and multi-step guidance techniques. Attackers gradually lower the model’s safety threshold through staged conversations, ultimately bypassing filtering mechanisms. This approach is similar to social engineering attacks, exploiting the model’s limitations in contextual understanding to evade detection.
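One way to see why staged conversations are hard to catch: a filter that scores each message in isolation can pass every turn even when the conversation as a whole drifts past an acceptable limit. The sketch below is purely illustrative. The risk scores, thresholds, and function names are hypothetical and do not describe OpenAI’s actual system.

```python
# Hypothetical illustration: per-message checks vs. a conversation-level check.
# Risk scores are made-up numbers in [0, 1]; no real classifier is involved.

PER_TURN_LIMIT = 0.5       # assumed threshold for a single message
CONVERSATION_LIMIT = 1.0   # assumed threshold for the accumulated total


def passes_per_turn(scores):
    """Return True if every message is individually under the per-turn limit."""
    return all(s < PER_TURN_LIMIT for s in scores)


def passes_conversation(scores):
    """Return True if the running total stays under the conversation limit."""
    total = 0.0
    for s in scores:
        total += s
        if total >= CONVERSATION_LIMIT:
            return False
    return True


# Each step looks mild on its own, but the steps add up.
staged = [0.2, 0.3, 0.3, 0.4]
print(passes_per_turn(staged))      # True
print(passes_conversation(staged))  # False
```

A simple running sum is only one possible way to model context; it is used here because it makes the gap between the two checks easy to see.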

OpenAI responded that it is actively patching the vulnerabilities identified and will continue to invest in strengthening its safety systems. A spokesperson emphasized that no filtering system is perfect, and said the company uses a defense-in-depth strategy that combines automated detection with human review to minimize risk.
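A defense-in-depth arrangement of the kind the spokesperson describes might look roughly like the following sketch, in which automated checks handle clear cases and uncertain ones are escalated to a human reviewer. The thresholds, score inputs, and names are assumptions made for illustration; the article does not describe how OpenAI’s pipeline is actually built.

```python
# Hypothetical sketch of layered moderation: automated checks first,
# with uncertain cases escalated to human review.
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    HUMAN_REVIEW = "human_review"


BLOCK_AT = 0.9    # assumed: scores at or above this are blocked automatically
REVIEW_AT = 0.5   # assumed: scores in [REVIEW_AT, BLOCK_AT) go to a reviewer


def moderate(prompt_score, output_score):
    """Combine an input-side and an output-side risk score into one decision.

    Both scores are assumed to come from separate automated classifiers
    and to lie in [0, 1]. The stricter of the two layers wins.
    """
    worst = max(prompt_score, output_score)
    if worst >= BLOCK_AT:
        return Decision.BLOCK
    if worst >= REVIEW_AT:
        return Decision.HUMAN_REVIEW
    return Decision.ALLOW


print(moderate(0.1, 0.2))   # Decision.ALLOW
print(moderate(0.1, 0.6))   # Decision.HUMAN_REVIEW
print(moderate(0.95, 0.1))  # Decision.BLOCK
```

Taking the worse of the two scores means a request that slips past the input check can still be caught on the output side, which is the point of layering.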

Panoramic Perspective

This incident reveals a core dilemma in AI safety: how to keep content safe while preserving the model’s flexibility. The capability of large language models stems from their broad understanding of language, which lets them complete beneficial tasks but also leaves them open to exploitation.

From a technical perspective, the persistence of prompt injection attacks indicates that methods relying solely on input filtering and output detection are no longer sufficient against increasingly sophisticated attacks. The industry is exploring more advanced approaches, including reinforcement learning-based alignment techniques, real-time content classifiers, and multi-layered safety architectures.

From a regulatory perspective, this incident may accelerate legislation on AI-generated content in multiple countries. The EU AI Act has already placed high-risk applications under strict regulation, and this incident may subject image generation models to stricter compliance requirements.

Multiple Perspectives

Security researchers say the discovery was not unexpected but underscores the urgency of the problem. They call for more transparent vulnerability disclosure mechanisms so the security community can help companies find and patch flaws promptly.

Industry observers note this is not just an OpenAI problem but a shared challenge for the entire AI industry. Every company providing generative AI services needs to keep investing in safety, in what they describe as an arms race with no finish line.

Privacy advocates worry that overly strict content filtering may harm legitimate use cases such as medical education, artistic creation, and historical research. They call for a balance between safety and freedom.

Editor: GoodInfo Global News Team

Meta Repeatedly Refuses EU Body Over Facebook and Instagram User Ban Data

goodinfo.net — Thu, 28 May 2026 22:11:00 +0800

EU regulators have disclosed that Meta has repeatedly failed to provide detailed information about user ban decisions on Facebook and Instagram, marking the third time this year the company has been cited for non-cooperation under the Digital Services Act (DSA).

The EU’s Digital Services Coordinator Network (DSC Network) stated that Meta did not comply with DSA Article 24 requirements to share data on banned accounts within mandated timeframes. This data includes the number of banned accounts, ban reason categories, and appeal outcomes.

EU authorities emphasized that large online platforms are legally obligated to provide transparent content moderation data to regulators, so that enforcement of community rules can be checked for systemic bias or discrimination. Meta’s continued non-cooperation could draw further penalties: DSA violations can result in fines of up to 6% of global annual revenue.
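To give a sense of scale, the 6% ceiling can be worked out for any revenue figure. The snippet below is a minimal sketch: the 6% cap is taken from the paragraph above, while the revenue figure is a round hypothetical number and not Meta’s reported revenue.

```python
# The 6% ceiling comes from the article; the revenue figure is hypothetical.
DSA_MAX_FINE_PERCENT = 6


def max_dsa_fine(global_annual_revenue):
    """Return the upper bound on a DSA fine, using integer arithmetic."""
    return global_annual_revenue * DSA_MAX_FINE_PERCENT // 100


# Example: a company with 100 billion in global annual revenue (made-up figure).
print(max_dsa_fine(100_000_000_000))  # 6000000000
```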

Meta responded that it is engaged in constructive dialogue with EU regulators but cited user privacy laws as limiting certain data-sharing capabilities. The company says it is developing a new reporting system to better balance transparency with privacy protection.

Analysts note the dispute reflects a broader platform governance challenge: the EU seeks to establish robust digital market oversight through the DSA, while tech giants face inherent tensions between data transparency and user privacy.

Meta has already faced multiple DSA-related penalties in the EU. Cumulative fines could exceed €500 million if this investigation leads to further sanctions.

[Brief] Ofcom Says TikTok and YouTube Not Safe Enough for Kids

goodinfo.net — Thu, 21 May 2026 09:54:00 +0800

UK regulator Ofcom has reported that TikTok and YouTube are not safe enough for children, and has called for stronger content moderation and protective measures. The report notes that while both platforms have taken some protective steps, significant gaps remain in preventing children from accessing harmful content. Ofcom is requiring the platforms to invest more in protecting minors and to consider stricter age verification mechanisms.

Source: BBC