Content Moderation Glossary
Essential terminology for content moderation APIs, safety systems, compliance, and online platform management. From API concepts to legal requirements.
Allowlist / Whitelist
Moderation: A list of approved words, phrases, domains, or users that are exempt from moderation rules. Allowlist entries override blocklist entries.
API (Application Programming Interface)
Technical: A set of protocols and tools that allows different software applications to communicate. SafeComms provides a REST API for content moderation.
API Key
Technical: A unique authentication token used to identify and authorize requests to the SafeComms API. Keep your API keys secret to prevent unauthorized access.
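In practice, the key is attached to every request, typically in an Authorization header. A minimal sketch (the `Bearer` scheme and the key format shown are illustrative assumptions, not confirmed SafeComms specifics):

```python
def auth_headers(api_key: str) -> dict:
    """Build request headers carrying the API key.

    The Bearer scheme is an assumption for illustration; check the
    SafeComms docs for the actual header name and format.
    """
    if not api_key:
        raise ValueError("API key is required")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = auth_headers("sk_example_123")  # hypothetical key
```

Loading the key from an environment variable rather than hard-coding it is the usual way to keep it secret.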
Appeal
Enforcement: Process allowing users to contest moderation decisions. Required under the EU Digital Services Act (DSA) and considered best practice under GDPR.
Automated Moderation
Moderation: Use of algorithms and machine learning to detect and filter harmful content without human intervention. Faster and more scalable than manual review.
Bad Actor
Safety: A user who intentionally posts harmful, abusive, or rule-violating content. May use evasion techniques to bypass filters.
Ban
Enforcement: Permanent or temporary suspension of a user account, preventing them from accessing the platform. Typically applied for severe or repeated violations.
Blocklist / Blacklist
Moderation: A list of prohibited words, phrases, domains, or users. Content matching blocklist entries is automatically rejected or flagged.
Brigading
Safety: Coordinated mass harassment, review bombing, or reporting campaigns. Often organized off-platform.
Confidence Score
Technical: A numerical value (0-1 or 0-100) indicating how certain the moderation system is that content violates a rule. Higher scores indicate higher confidence.
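A moderation response typically carries one confidence score per category, and a common way to summarize it is to pick the highest-scoring category. A sketch using a made-up response shape (the category names and 0-1 range here are illustrative, not the SafeComms schema):

```python
def top_category(scores: dict) -> tuple:
    """Return the (category, score) pair the system is most confident about.

    `scores` maps category name -> confidence in the 0-1 range; the
    shape is a made-up example, not the SafeComms response schema.
    """
    category = max(scores, key=scores.get)
    return category, scores[category]

# Hypothetical scores for one piece of content
scores = {"toxicity": 0.92, "spam": 0.10, "hate_speech": 0.35}
label, confidence = top_category(scores)
```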
Content Warning
Safety: A notice displayed before potentially disturbing content (violence, adult themes), letting users choose whether to view it.
COPPA (Children's Online Privacy Protection Act)
Legal: US law requiring verifiable parental consent before collecting personal data from children under 13. Affects platforms that allow child users.
CSAM (Child Sexual Abuse Material)
Safety: Illegal content depicting minors in sexual situations. Must be reported to authorities (NCMEC in the US). Zero tolerance for storage or distribution.
Doxing / Doxxing
Safety: Publishing private personal information about someone without consent, often with malicious intent (addresses, phone numbers, family details).
DPA (Data Processing Agreement)
Legal: A legal contract between a data controller and a data processor defining data handling responsibilities. Required under GDPR when using third-party services.
Endpoint
Technical: A specific URL in an API that performs a function. Example: `/v1/moderate/text` is the SafeComms text moderation endpoint.
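Calling an endpoint like `/v1/moderate/text` boils down to an HTTP POST with a JSON body. A standard-library sketch that only constructs the request (the base URL, header scheme, and `content` field are assumptions; the endpoint path is the one named in this glossary):

```python
import json
import urllib.request

def build_moderation_request(text: str, api_key: str) -> urllib.request.Request:
    """Construct (but do not send) a POST to the text moderation endpoint.

    The base URL, Bearer scheme, and "content" field are illustrative
    assumptions; `/v1/moderate/text` is the path from the glossary.
    """
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        "https://api.safecomms.example/v1/moderate/text",  # assumed base URL
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_moderation_request("hello world", "sk_example_123")
# urllib.request.urlopen(req) would actually send it; omitted here.
```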
Evasion
Safety: Techniques used to bypass content filters, such as character substitution ("@" for "a"), zero-width spaces, or deliberate misspellings.
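A first line of defense is normalizing text before matching it against any blocklist: mapping common character substitutions and stripping zero-width characters. A minimal sketch (the substitution table is a tiny illustrative sample; real filters use far larger ones):

```python
# Small sample of common substitutions; real filters use much larger tables.
SUBSTITUTIONS = str.maketrans({"@": "a", "3": "e", "0": "o", "1": "l", "$": "s"})
ZERO_WIDTH = ("\u200b", "\u200c", "\u200d", "\ufeff")

def normalize(text: str) -> str:
    """Undo simple filter-evasion tricks before blocklist matching."""
    for ch in ZERO_WIDTH:
        text = text.replace(ch, "")  # remove invisible characters
    return text.lower().translate(SUBSTITUTIONS)
```

Note the tradeoff: aggressive mappings (e.g. "1" to "l") can also mangle legitimate text, so normalization is usually paired with contextual scoring rather than used alone.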
False Negative
Technical: When harmful content is incorrectly classified as safe. Generally more harmful than a false positive, and harder to detect.
False Positive
Technical: When legitimate content is incorrectly flagged as violating rules. Reducing false positives improves user experience.
GDPR (General Data Protection Regulation)
Legal: EU privacy law governing data collection, processing, and user rights. Applies to any platform serving EU users, regardless of where the platform is based.
Hate Speech
Safety: Content that attacks or demeans individuals or groups based on protected characteristics (race, religion, gender, sexual orientation, etc.).
Hybrid Moderation
Moderation: A combination of automated and manual review: AI handles high-confidence cases, while humans review edge cases and appeals.
IP Ban
Enforcement: Blocking access from a specific IP address. Less effective than account bans because users can switch addresses via VPNs, but useful against bot traffic.
Latency
Technical: The time delay between sending a request and receiving a response. SafeComms averages 120 ms latency for text moderation.
Leetspeak / 1337speak
Safety: Alternative spelling that swaps in numbers and symbols (e.g., "h3ll0" for "hello"). Often used for filter evasion.
Machine Learning (ML)
Technical: AI systems that learn patterns from data. Used in content moderation to detect toxicity, spam, and harmful content with high accuracy.
Manual Moderation
Moderation: Human review of content. Slower and more expensive than automated moderation, but better at handling context and nuance.
Moderation Profile
Moderation: A configuration defining which rules to apply, sensitivity thresholds, and actions to take. Allows different moderation settings for different content types.
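Conceptually, a profile bundles rules, a threshold, and an action into one reusable configuration. A sketch of what such a configuration might look like and how it could be applied (the profile names, field names, and values are illustrative assumptions, not the SafeComms schema):

```python
# Illustrative profiles: stricter settings for a kids' chat than for a forum.
PROFILES = {
    "kids_chat": {"rules": ["profanity", "toxicity", "pii"],
                  "threshold": 0.5, "action": "block"},
    "adult_forum": {"rules": ["hate_speech", "spam"],
                    "threshold": 0.85, "action": "flag"},
}

def apply_profile(scores: dict, profile_name: str) -> str:
    """Return the profile's action if any enabled rule crosses its threshold."""
    profile = PROFILES[profile_name]
    for rule in profile["rules"]:
        if scores.get(rule, 0.0) >= profile["threshold"]:
            return profile["action"]
    return "allow"
```

The same content can then be treated differently depending on where it appears: a mildly rude comment might be blocked in the kids' chat but allowed on the forum.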
Natural Language Processing (NLP)
Technical: AI technology that understands and analyzes human language. Powers contextual moderation beyond simple keyword matching.
NCMEC (National Center for Missing & Exploited Children)
Legal: US organization that receives CSAM reports via its CyberTipline. US providers must report suspected CSAM as soon as reasonably possible.
NSFW (Not Safe For Work)
Safety: Content inappropriate for workplace viewing, typically sexual or violent imagery. May be allowed behind age gates or content warnings.
Payload
Technical: The data sent in an API request or webhook. For SafeComms, the payload includes the content to moderate and optional metadata.
Phishing
Safety: Fraudulent content attempting to steal credentials or financial information, often disguised as legitimate links or services.
PII (Personally Identifiable Information)
Safety: Data that can identify an individual: email addresses, phone numbers, social security numbers, physical addresses, credit card details.
Profanity Filter
Moderation: A system that detects and blocks offensive language. Can be word-based (blocklists) or ML-based (contextual understanding).
Rate Limiting
Technical: Restricting the number of API requests or content submissions from a user or IP address within a time window. Prevents abuse and spam.
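A common server-side implementation is a sliding-window counter per user or IP. A minimal in-memory sketch (production systems typically back this with Redis or similar shared storage so limits hold across servers):

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        while q and now - q[0] >= self.window:  # drop expired timestamps
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the limit: reject this request
        q.append(now)
        return True
```

Rejected requests would normally get an HTTP 429 response so clients know to back off.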
REST API
Technical: A web service architecture using HTTP methods (GET, POST, etc.). SafeComms uses REST for easy integration with any language.
Rule
Moderation: A specific moderation policy, such as "block profanity" or "flag hate speech." Multiple rules can be combined in a profile.
SDK (Software Development Kit)
Technical: A library that simplifies API integration for a specific programming language. SafeComms offers SDKs for Node.js, Python, PHP, and more.
Section 230
Legal: US law shielding platforms from liability for user-generated content. Its "Good Samaritan" provision protects good-faith moderation decisions.
Sentiment Analysis
Technical: Determining the emotional tone of text (positive, negative, or neutral). Used to detect hostile or aggressive language.
Shadowban
Enforcement: Hiding a user's content from others without notifying them: the user believes their posts are visible, but nobody else can see them.
Spam
Safety: Unwanted repetitive content, often commercial or phishing links. Includes comment spam, fake reviews, and bot-generated posts.
Threshold
Technical: The minimum confidence score required to flag content. Lower thresholds are more sensitive (more false positives); higher thresholds are less sensitive (more false negatives).
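The tradeoff is easy to see by applying two thresholds to the same batch of scores: a lower threshold flags more content, catching more violations at the cost of more false positives. A sketch with hypothetical scores:

```python
def flagged(scores: list, threshold: float) -> int:
    """Count how many confidence scores meet or exceed the threshold."""
    return sum(1 for s in scores if s >= threshold)

# Hypothetical confidence scores for a batch of ten messages.
batch = [0.05, 0.12, 0.33, 0.48, 0.55, 0.61, 0.72, 0.80, 0.91, 0.97]

strict = flagged(batch, 0.5)   # sensitive: flags 6 of 10 messages
lenient = flagged(batch, 0.9)  # conservative: flags 2 of 10 messages
```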
Timeout
Technical: The maximum time to wait for an API response before giving up. SafeComms recommends 5-10 second timeouts for moderation requests.
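When a request does time out, a common pattern is to retry with exponential backoff rather than fail immediately. A sketch of the backoff schedule (the retry policy itself is an illustrative assumption, not a SafeComms requirement):

```python
def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule: base, 2*base, 4*base, ..., capped."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

REQUEST_TIMEOUT = 5.0    # seconds, within the recommended 5-10s range
delays = backoff_delays(3)  # wait 1s, 2s, then 4s between retries
```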
Toxicity
Safety: Rude, disrespectful, or hostile language likely to drive users out of a conversation. Includes insults, profanity, and personal attacks.
Tuning
Moderation: Adjusting moderation thresholds and rules to balance false positives and false negatives for your specific use case.
User-Generated Content (UGC)
General: Any content created by users rather than the platform: posts, comments, reviews, messages, images, videos.
Violence / Gore
Safety: Content depicting physical harm, injury, death, or graphic violence. Often prohibited or shown only behind content warnings.
Webhook
Technical: An HTTP callback that sends real-time notifications to your server when events occur, such as when content is flagged.
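Webhook receivers usually verify that a delivery really came from the provider, most commonly by checking an HMAC signature computed over the raw request body. A sketch of that generic pattern (the signing scheme and any header name are assumptions; consult the SafeComms docs for the actual mechanism):

```python
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw webhook body.

    Uses a constant-time comparison to avoid timing attacks. The exact
    algorithm and header are assumptions; check the provider's docs.
    """
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

Verifying before processing prevents attackers from forging "content flagged" events against your server.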
Zero-Day Content
Safety: Newly emerging harmful content patterns that existing filters don't catch. Requires continuous model updates.
Ready to Build Safer Platforms?
Now that you understand the terminology, start implementing world-class content moderation with SafeComms.
Missing a Term?
If you'd like to see additional terms added to this glossary, please contact us at [email protected] or submit feedback through our support portal.