InstallHub
Content Policy Check
The content policy check scans human-readable fields — README, description, artifact names, field labels — for prohibited content including credentials, personally identifiable information, malicious URLs, and abusive language.
Fields Scanned
| Field | Source | Why It Is Scanned |
|---|---|---|
| README.md | Package root | Documentation is publicly visible in the marketplace |
| manifest.description | manifest.json | Shown in marketplace search results |
| manifest.changelog | manifest.json | Displayed on version history page |
| Artifact names | All artifact JSON files | Shown in import preview and audit logs |
| Form field labels | AtlasForm artifact | Rendered to end users during form submission |
| Node display names | ProcessDefinition artifact | Shown in workflow editor and execution logs |
Policy Violation Categories
| Category | Severity | Examples |
|---|---|---|
| Credential Exposure | Critical (FAIL) | API keys matching known formats (AWS, Stripe, GitHub tokens), passwords, connection strings, PEM keys |
| Personally Identifiable Information | High (FAIL) | Real names + email addresses in combination, social security number patterns, passport number patterns |
| Malicious URLs | High (FAIL) | URLs matching threat intelligence blocklists — known phishing domains, malware delivery URLs |
| Prohibited Keywords | Medium (WARN or FAIL) | Profanity, hate speech, competitor disparagement claims, copyright-infringing content |
| Misleading Claims | Medium (WARN) | "Guaranteed to work", "Official BizFirstGO" when publisher is not BizFirstGO |
| Suspicious Encoding | Medium (WARN) | Base64-encoded blobs in text fields, Unicode obfuscation patterns, zero-width characters |
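
The Suspicious Encoding row can be illustrated with a small sketch. It is not the scanner's actual implementation: the function name `suspicious_encoding`, the set of zero-width code points, and the 40-character threshold for a Base64 run are all assumptions chosen for illustration.

```python
import base64
import binascii
import re

# Assumed set of zero-width code points; the real scanner's list is not documented here.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Assumed heuristic: 40 or more Base64-alphabet characters, optionally padded.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def suspicious_encoding(text):
    """Return a list of reasons why `text` looks obfuscated (empty if none)."""
    reasons = []
    if any(ch in ZERO_WIDTH for ch in text):
        reasons.append("zero-width character")
    for run in BASE64_RUN.findall(text):
        try:
            # validate=True rejects bad padding and non-alphabet characters.
            base64.b64decode(run, validate=True)
        except (binascii.Error, ValueError):
            continue
        reasons.append("base64-like blob")
        break  # one report per field is enough for a WARN
    return reasons
```

Requiring the run to decode cleanly filters out some long alphanumeric strings that only resemble Base64, but a heuristic like this can still misfire on long identifiers.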
Credential Pattern Detection
The scanner uses regex patterns to detect common credential formats in text:
// Patterns that trigger CredentialExposure (Critical):
AWS Access Key: AKIA[0-9A-Z]{16}
GitHub Token: gh[pousr]_[A-Za-z0-9_]{36,255}
Stripe API Key: sk_(live|test)_[0-9a-zA-Z]{24,}
Generic API Key: api[_-]?key\s*[=:]\s*['"]?[A-Za-z0-9\-_]{20,}['"]?
PEM Private Key: -----BEGIN (RSA |EC |OPENSSH )?PRIVATE KEY-----
Connection String: (Server|Data Source)=[^;]+;(Initial Catalog|Database)=[^;]+;.*Password=[^;]+
// Example finding:
{
  "check": "ContentPolicy",
  "severity": "Critical",
  "rule": "CredentialExposure.AwsAccessKey",
  "field": "README.md",
  "value": "AKIAIOSFODNN7EXAMPLE",
  "message": "README contains what appears to be an AWS Access Key ID. Remove all real credentials from documentation."
}
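
To check a package locally before submitting, the patterns above can be run against each text field. The following is a minimal sketch, not the scanner's own code: the function name `scan_credentials` is hypothetical, and every rule name except `CredentialExposure.AwsAccessKey` is an assumed naming that follows the example finding. The `message` field is omitted.

```python
import re

# Regexes copied from the pattern list above. Rule names other than
# CredentialExposure.AwsAccessKey are hypothetical.
CREDENTIAL_PATTERNS = {
    "CredentialExposure.AwsAccessKey": r"AKIA[0-9A-Z]{16}",
    "CredentialExposure.GitHubToken": r"gh[pousr]_[A-Za-z0-9_]{36,255}",
    "CredentialExposure.StripeApiKey": r"sk_(live|test)_[0-9a-zA-Z]{24,}",
    "CredentialExposure.GenericApiKey":
        r"api[_-]?key\s*[=:]\s*['\"]?[A-Za-z0-9\-_]{20,}['\"]?",
    "CredentialExposure.PemPrivateKey":
        r"-----BEGIN (RSA |EC |OPENSSH )?PRIVATE KEY-----",
    "CredentialExposure.ConnectionString":
        r"(Server|Data Source)=[^;]+;(Initial Catalog|Database)=[^;]+;.*Password=[^;]+",
}
COMPILED = {rule: re.compile(p) for rule, p in CREDENTIAL_PATTERNS.items()}


def scan_credentials(field, text):
    """Return one finding dict per credential-like match in `text`."""
    findings = []
    for rule, pattern in COMPILED.items():
        for match in pattern.finditer(text):
            findings.append({
                "check": "ContentPolicy",
                "severity": "Critical",
                "rule": rule,
                "field": field,
                "value": match.group(0),
            })
    return findings


if __name__ == "__main__":
    for finding in scan_credentials("README.md", "key: AKIAIOSFODNN7EXAMPLE"):
        print(finding["rule"], finding["value"])
```

The patterns are applied case-sensitively, exactly as written above; whether the real scanner ignores case is not stated.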
URL Scanning
All URLs found in scanned text fields are extracted and checked against:
- Google Safe Browsing API — phishing and malware domain list
- Spamhaus Domain Block List — known spam and abuse domains
- BizFirstAI internal blocklist — domains previously flagged in marketplace submissions
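
The extract-then-check flow can be sketched as follows. This is an illustration under stated assumptions, not the scanner's implementation: it uses a hypothetical in-memory `blocklist` set in place of the three services listed above, and the URL regex and the helper names `extract_hosts` and `is_blocked` are invented for the example.

```python
import re
from urllib.parse import urlsplit

# Assumed, deliberately simple URL matcher; the real extraction rules are not documented here.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_hosts(text):
    """Return the lowercase hostnames of all http(s) URLs found in `text`."""
    hosts = []
    for url in URL_RE.findall(text):
        url = url.rstrip(".,;:!?")  # drop trailing sentence punctuation
        try:
            host = urlsplit(url).hostname
        except ValueError:  # e.g. a malformed IPv6 literal
            continue
        if host:
            hosts.append(host)
    return hosts


def is_blocked(host, blocklist):
    """True if `host` or one of its parent domains is in `blocklist`."""
    labels = host.split(".")
    # For a.b.example.com check a.b.example.com, b.example.com, example.com
    # (the bare top-level label is skipped).
    return any(
        ".".join(labels[i:]) in blocklist
        for i in range(max(1, len(labels) - 1))
    )


if __name__ == "__main__":
    blocklist = {"phish.example"}  # hypothetical entry
    text = "See http://login.phish.example/reset and https://docs.example.org/guide."
    print([h for h in extract_hosts(text) if is_blocked(h, blocklist)])
```

Matching on whole domain labels rather than substrings is a design choice in this sketch: `notphish.example` is not flagged by a `phish.example` entry.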
False Positive Likelihood
Content policy checks have a higher false positive rate than injection checks. Common cases:
- A README that uses an AWS key format as a documentation example (use <YOUR_AWS_KEY> placeholders instead)
- A domain name that shares a substring with a flagged domain
- A field label in a non-English language that phonetically resembles a blocked keyword
In any of these cases, submit a false positive explanation. Content policy false positives are common and are reviewed promptly (typically the same business day for marketplace submissions).