Add AI Content Safety image moderation under experiment flag by fisher-alice · Pull Request #71577 · code-dot-org/code-dot-org
This PR adds image moderation via Azure AI Content Safety under an experiment flag 'ai-content-safety'.
PR updates include:
- Added `AzureAiContentSafety` service to the backend and its associated key in secrets (`azure_ai_content_safety_key`)
- Added a call to `AzureAIContentSafety`'s `moderate_image` in `ImageModeration`'s `moderation_image`
- Added a new backend endpoint in `files_api.rb`: `POST /v3/images/moderate-ai-content-safety`
- Removed Firehose logging to 'azure-content-moderation' and the now-unused `image_url` param. See https://codedotorg.atlassian.net/browse/INF-1996
- Added an experiment flag 'ai-content-safety'; when the flag is on, `moderateImage` uses Azure AI Content Safety. The default service is still Azure Content Moderator.
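For illustration, the flag-gated routing described above could be sketched as a pure helper that picks the moderation endpoint based on the enabled experiment flags. This is not code from the PR: the function name and the fallback path for the existing Azure Content Moderator endpoint are hypothetical; only the new `POST /v3/images/moderate-ai-content-safety` path comes from the PR description.

```typescript
// Sketch of flag-gated endpoint selection (names are illustrative).
const AI_CONTENT_SAFETY_FLAG = "ai-content-safety";

// Returns the moderation endpoint to call: the new Azure AI Content
// Safety route when the experiment flag is on, otherwise the existing
// Azure Content Moderator route (path here is a hypothetical stand-in).
function moderationEndpoint(enabledFlags: Set<string>): string {
  return enabledFlags.has(AI_CONTENT_SAFETY_FLAG)
    ? "/v3/images/moderate-ai-content-safety"
    : "/v3/images/moderate"; // hypothetical existing path
}
```

Keeping the selection in one small function like this makes the later follow-up (removing the flag and defaulting to AI Content Safety) a one-line change.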
Note that I added `azure_ai_content_safety_key` to AWS secrets using `bin/update_secrets`, which was cool and much easier than adding a secret manually (which I've done in the past).
I added the severity level blocks (currently at 2 for each category) on the front-end. There has been discussion about folks wanting to know why an image is blocked (see Slack thread). Follow-up can include adding metadata with the categories and severity level when an image is flagged.
FYI, we can test images on https://contentsafety.cognitive.azure.com/image! So if product/curriculum decide on different blocked-severity levels for different categories, we can update.
```typescript
const CATEGORY_SEVERITY_LEVEL_BLOCKED: Record<string, number> = {
  Hate: 2,
  SelfHarm: 2,
  Sexual: 2,
  Violence: 2,
};
```
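To show how such a threshold map could be applied, here is a small sketch of a helper over the Azure AI Content Safety analyze-image result, which reports per-category `{category, severity}` entries. `blockedCategories` is a hypothetical name, not code from the PR:

```typescript
// Shape of one entry in the analyze-image response's category analysis.
interface CategoryAnalysis {
  category: string;
  severity: number;
}

// Hypothetical helper: given the per-category analysis from Azure AI
// Content Safety and a threshold map (e.g. CATEGORY_SEVERITY_LEVEL_BLOCKED),
// return the categories whose severity meets or exceeds the threshold.
// A non-empty result means the image should be blocked.
function blockedCategories(
  analysis: CategoryAnalysis[],
  thresholds: Record<string, number>
): string[] {
  return analysis
    .filter(({category, severity}) => {
      const threshold = thresholds[category];
      return threshold !== undefined && severity >= threshold;
    })
    .map(({category}) => category);
}
```

Returning the offending categories (rather than a bare boolean) would also supply the "why was this image blocked" metadata discussed above.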
Links
Testing story
Tested locally and logged the JSON returned (in the dev console) when the experiment flag was turned on.
Follow-up
- Add categories and severity level in Statsig reporting if image flagged by AI Content Safety.
- Once tested on prod with the flag, remove the experiment flag and migrate to AI Content Safety.
- Remove Azure Content Moderator and associated code.
- Consider moving the scaling of small images to backend.