Royal AI Firewall Documentation

Overview

Royal AI Firewall is a free, GPL-licensed WordPress plugin that classifies and controls AI bot traffic at the WordPress application layer. It identifies AI agents by their User-Agent header, exposes the live list in a per-bot dashboard, and lets you set a policy (allow / block / log-only) for each one with a one-click dropdown.

This is the entire plugin. There is no Pro version, no premium tier, no upgrade prompts. Every feature ships on wordpress.org. The only outbound network call the plugin can make — an optional daily fetch of the latest bot catalog — is off by default and requires explicit opt-in via the wizard or Settings page.

Getting Started

Walk through the first-run setup in 60 seconds. The wizard is skippable from every screen if you want to dive straight into the dashboard.

Install from WordPress.org

In wp-admin, go to Plugins → Add New and search for Royal AI Firewall. Click Install Now, then Activate. WordPress redirects you to the 4-step setup wizard automatically.

Welcome screen

Step 1 explains what the plugin does and asks you to continue. No fields to fill in. Hit Let’s go.

Environment detection

Step 2 scans your install for Cloudflare and other security plugins, then shows a checklist of what was found. The next step depends on whether Cloudflare is detected.

Cloudflare dial-down (if detected)

If Cloudflare is in front of your site, step 3 lists exactly which CF settings to turn off so Royal AI Firewall can take over the AI-bot layer, plus which to leave on. See Cloudflare Setup for the full breakdown.

Pick a default policy

The last step asks how the plugin should handle AI bots until you set per-bot rules. There are three options: Log only (recommended for the first 24–48h); Block training bots, allow retrieval bots; or Block all AI bots except verified search engines. You can change this in Settings at any time.

Open the dashboard

Click Finish setup. The plugin lands you on the dashboard. Bot hits start populating as AI agents visit your site, typically within 2–6 hours on a public, indexed site. To see a hit sooner, fire a quick test from the command line:

curl -A "GPTBot/1.2" https://your-site.com/

Refresh the dashboard and you should see a GPTBot hit recorded.

Note: Bundled catalog

The plugin ships with a bundled catalog of 55+ AI bots. You do not need to enable any outbound update to use the plugin — classification works offline from day one. The optional daily catalog refresh is for users who want fresher catalogs between plugin releases. See Catalog Updates for details.

Requirements

Requirement	Minimum	Notes
WordPress	6.4	Tested up to 7.0. Multisite-compatible.
PHP	8.0	Tested on 8.2 and 8.3.
MySQL / MariaDB	5.7 / 10.4	Whatever WordPress 6.4 requires. Plugin uses `%i` placeholders (available in WP 6.2+).
Disk	~80 KB	Plugin zip size. Three custom DB tables, all small.
WP-Cron	working	Two recurring events: hourly rollup, daily log prune. Neither makes a network call. A third event, the daily catalog fetch, is scheduled only if you opt in.
HTTPS	recommended	Not required for the plugin itself, but the bot catalog endpoint (if opted in) uses HTTPS.

AI Bot Catalog

The plugin ships with a bundled catalog of 55+ recognized AI bots, organized into six categories. Each entry includes the bot’s identifier, owner, intended purpose, default policy, and the blocking consequences (for example, “blocking GPTBot may remove your site from ChatGPT search results”).

Training crawlers

Bots that scrape content to build and improve foundation models. Blocking these removes you from future model training datasets but does not affect on-demand retrieval.

GPTBot · ClaudeBot · anthropic-ai · Bytespider · TikTokSpider · FacebookBot · Meta-ExternalAgent · GoogleOther · GoogleOther-AI · Google-Extended · MistralBot · ai2bot · ai2bot-dolma · cohere-ai · Amazonbot · PetalBot

Retrieval bots (on-demand)

Bots that fetch your content when an end user explicitly asks an AI assistant about your site. Blocking these means you don’t show up in chat responses when users mention your URL.

ChatGPT-User · ClaudeBot-User · Claude-Web · Perplexity-User · Meta-ExternalFetcher · facebookexternalhit · APIs-Google

AI search engines

Bots that index your content for AI-native search results pages. Blocking these removes you from results in Perplexity, ChatGPT search, and similar products.

OAI-SearchBot · PerplexityBot · Applebot-Extended · MicrosoftCopilotBot · DuckAssistBot · YouBot · PhindBot · iAsk · Komo · Liner · Brave Leo · Andi

Traditional search engine crawlers

Well-known SEO crawlers that predate the AI wave. Seven of the twelve are protected by the search engine guard — the per-bot dropdown is disabled for them and blocking requires an explicit Settings override with a warning.

Always-allow guarded (7): Googlebot · Googlebot-Image · Googlebot-Video · Googlebot-News · Bingbot · Applebot · DuckDuckBot

Also recognized, no guard: BingPreview · Storebot-Google · Mediapartners-Google · AdsBot-Google · adidxbot

Agent browsers (newer category)

The newest class: AI agents acting as their human user’s browser, navigating your site on the user’s behalf. Treat these similarly to retrieval bots unless you have a reason to do otherwise.

OperatorAgent · ChatGPT-Atlas · Claude-Computer-Use

Dataset scrapers

Bots that build publicly distributed datasets (Common Crawl, etc.) that other AI vendors then use as training input. Blocking these acts one step upstream of the training crawlers.

CCBot (Common Crawl) · Diffbot · ImagesiftBot · Omgilibot · Timpibot

How identification works

This release identifies bots by matching the User-Agent header against the bundled fingerprint catalog. A spoofed User-Agent will match a real bot’s record, so treat the dashboard as the answer to “what’s claiming to be each bot” rather than a verified attribution.
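The matching step can be pictured with a minimal sketch. The catalog shape, the function name `raif_demo_match_bot`, and the substring match are assumptions made for illustration; the real fingerprint format is not documented here. The point it demonstrates is that a genuine crawler and a request with a spoofed header produce the same result.

```php
<?php
// Hypothetical catalog: bot ID => token to look for in the User-Agent.
// The plugin's real bundled catalog may be structured differently.
$demo_catalog = array(
    'gptbot'        => 'GPTBot',
    'claudebot'     => 'ClaudeBot',
    'perplexitybot' => 'PerplexityBot',
);

/**
 * Return the ID of the first catalog entry whose token appears in the
 * User-Agent (case-insensitive), or null when nothing matches.
 */
function raif_demo_match_bot( string $user_agent, array $catalog ): ?string {
    foreach ( $catalog as $bot_id => $token ) {
        if ( false !== stripos( $user_agent, $token ) ) {
            return $bot_id;
        }
    }
    return null;
}

// The header is the only input, so anyone sending "GPTBot/1.2" is recorded as GPTBot.
$claimed = raif_demo_match_bot( 'GPTBot/1.2', $demo_catalog );
$unknown = raif_demo_match_bot( 'curl/8.4.0', $demo_catalog );
```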

For the search-engine guard, blocking is still off by default — a spoofed Googlebot UA can’t be blocked unless you explicitly enable the Search engine override toggle in Settings. Managing the actual edge layer (Cloudflare, your CDN, or a security plugin running before WordPress) remains the right place to enforce identity at the network level.

Policy Modes

Two layers of policy. The default policy applies to every recognized AI bot unless you set a per-bot override on that specific bot.

Default policy (global)

Mode	What it does	When to use
Log only	Records every AI bot hit but blocks nothing. The dashboard fills with data; the bots reach your content.	The recommended starting point. Run for 24–48h to see what’s actually hitting your site before deciding what to block.
Block training, allow retrieval	Blocks training crawlers (GPTBot, ClaudeBot, Bytespider, CCBot, etc.). Allows retrieval bots (ChatGPT-User, Claude-Web, Perplexity-User).	If you want to stay discoverable when users explicitly ask AI assistants about your site, but don’t want your content fed to model training pipelines.
Block all AI bots	Blocks every AI bot in the catalog. Allows traditional search engines (Googlebot, Bingbot, etc.) by the always-allow guard.	Maximum AI-bot lockdown. SEO crawlers still get through. Useful for membership sites, paywalled content, or anything you specifically don’t want AI agents reading.

Per-bot override

Each bot row in the dashboard has a four-option dropdown that takes precedence over the default mode for that specific bot:

Use default policy — falls back to the global default mode you picked above.
Always allow — the bot is allowed regardless of default mode. Use this for a specific training bot you trust.
Log only — the bot is allowed and recorded; never blocked. Use this when you want visibility on a specific bot but aren’t ready to block.
Block — the bot receives a 403 response immediately, before WordPress runs any heavy work.
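The precedence between the two layers can be sketched as a small pure function. The function name and the string values (`default`, `allow`, `log`, `block`) are illustrative assumptions, not the plugin's internal identifiers.

```php
<?php
/**
 * Resolve the action for one bot.
 *
 * $override is the per-bot dropdown value: 'default', 'allow', 'log' or 'block'.
 * $default_action is what the global default mode yields for this bot:
 * 'allow', 'log' or 'block'. All of these values are hypothetical.
 */
function raif_demo_resolve_action( string $override, string $default_action ): string {
    $explicit = array( 'allow', 'log', 'block' );
    if ( in_array( $override, $explicit, true ) ) {
        return $override; // An explicit per-bot choice always wins.
    }
    return $default_action; // "Use default policy" falls back to the global mode.
}
```

With a global default of Log only, a bot set to Block resolves to `block`, while a bot left on Use default policy resolves to `log`.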

The master “Block all” panic button

Every dashboard load shows a one-click Block all AI bots button at the top. Clicking it switches the default policy to Block all and confirms with a redirect. Click it again to revert to Log only. The search-engine guard still applies in Block all mode, so Googlebot et al. stay allowed.

Search Engine Guard

Seven bots are protected from accidental blocking by default: Googlebot, Googlebot-Image, Googlebot-Video, Googlebot-News, Bingbot, Applebot, and DuckDuckBot. The dashboard dropdown is disabled for these, and the REST API endpoints reject block attempts on them with a 409 Conflict response.

Warning: Override is one toggle, with a warning

If you genuinely want to block a search engine, flip the Search engine override toggle in Settings. The toggle ships with a warning that “blocking Googlebot removes your site from Google Search”. Once enabled, the per-bot dropdowns become active for the guarded bots and the REST endpoints accept block requests.

This guard is independent of the default policy mode. Even in Block all AI bots mode, search engines stay allowed unless you’ve explicitly enabled the override.
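As a rough sketch of how such a guard could sit on top of any resolved policy, assuming the seven bot names listed above and hypothetical function and constant names:

```php
<?php
// The seven guarded crawlers named in this section.
const RAIF_DEMO_GUARDED = array(
    'Googlebot', 'Googlebot-Image', 'Googlebot-Video', 'Googlebot-News',
    'Bingbot', 'Applebot', 'DuckDuckBot',
);

/**
 * Downgrade a 'block' decision to 'allow' for guarded crawlers unless the
 * Search engine override is enabled. Action strings are illustrative.
 */
function raif_demo_apply_guard( string $bot_name, string $action, bool $override_enabled ): string {
    $is_guarded = in_array( $bot_name, RAIF_DEMO_GUARDED, true );
    if ( $is_guarded && 'block' === $action && ! $override_enabled ) {
        return 'allow';
    }
    return $action;
}
```

Because the guard runs after the policy is resolved, it behaves the same way whether the block came from the default mode or from a per-bot override.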

Cloudflare Setup

Cloudflare and Royal AI Firewall both have opinions about AI bots. To get the per-bot dashboard and one-click controls in Royal AI Firewall to work, you need to dial down Cloudflare’s AI-specific features so requests reach WordPress where this plugin can see and decide on them.

Cloudflare’s general protections (DDoS, managed WAF, SSL, Bot Fight Mode) are fine to leave on — they don’t conflict with the WordPress-layer controls.

Turn OFF in Cloudflare

Setting	Where to find it	Set to
AI Audit	Security → Settings → AI Audit	Allow
AI Labyrinth	Security → Bots → AI Labyrinth	OFF
Custom WAF rules blocking AI bots	Security → WAF → Custom rules	DELETE (per-bot controls in this plugin replace them)
Security Level	Security → Settings → Security Level	Medium or Low

Leave ON in Cloudflare

These don’t conflict with Royal AI Firewall and provide real value:

DDoS protection: keep on.
Managed WAF rules: keep on.
SSL/TLS: keep on.
Bot Fight Mode (basic tier): keep on. Blocks well-known abusive crawlers that aren’t AI agents.
Browser Integrity Check: keep on.
Caching: keep on. The plugin’s REST endpoints set Cache-Control: no-store headers and define the DONOTCACHEPAGE constant to opt out of caching where it matters.

How Cloudflare detection works

Royal AI Firewall detects Cloudflare on every wp-admin page load by checking for the cf-ray, cf-connecting-ip, or CDN-Loop: cloudflare headers on the incoming request. The detection result is persisted for 24 hours, so the dashboard UI stays stable even when an occasional admin request doesn’t pass through Cloudflare.
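A header check of this kind might look like the sketch below. PHP exposes request headers in `$_SERVER` as upper-case `HTTP_*` keys; the function name is hypothetical, and the 24-hour persistence is left out.

```php
<?php
/**
 * Return true when any of the three Cloudflare signals is present.
 * Pass $_SERVER (or a test array) as $server.
 */
function raif_demo_is_behind_cloudflare( array $server ): bool {
    // cf-ray and cf-connecting-ip: presence alone is the signal.
    if ( ! empty( $server['HTTP_CF_RAY'] ) || ! empty( $server['HTTP_CF_CONNECTING_IP'] ) ) {
        return true;
    }
    // CDN-Loop can name several CDNs, so look for "cloudflare" inside the value.
    $cdn_loop = isset( $server['HTTP_CDN_LOOP'] ) ? (string) $server['HTTP_CDN_LOOP'] : '';
    return false !== stripos( $cdn_loop, 'cloudflare' );
}
```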

When Cloudflare is detected, the wizard’s step 3 surfaces the dial-down guide above and the Cloudflare visibility status card appears on the dashboard with an honest estimate of how many AI bots may have been filtered at the edge before reaching WordPress.

Warning: HostGator / Newfold-brand hosting

Some shared hosting providers run their own Cloudflare layer in front of every site they host. If your CF dashboard shows no AI bot filtering but Royal AI Firewall’s dashboard still shows zero hits after 24 hours, your host may be filtering at a separate CF layer you can’t configure. Check the cf-mitigated response header on a curl probe to your site — if it’s present and you didn’t configure it, that’s the host’s Cloudflare. Contact host support.

Security Plugin Compatibility

Royal AI Firewall coexists cleanly with other security plugins. On activation it scans for common security plugins and shows compatibility notes on the dashboard and Settings page.

Edge-firewall security plugins

Popular edge-firewall plugins run their own firewall before WordPress loads. AI bots they block at their layer won’t appear in Royal AI Firewall’s dashboard — you only see the bots that reach WordPress. The two layers don’t conflict; they just sit at different points in the request path.

Practical impact: if your other security plugin already blocks GPTBot at its edge, you won’t see GPTBot hits in Royal AI Firewall’s dashboard, whatever policy you set here. To get full visibility, allow AI bots at the edge-firewall layer and use Royal AI Firewall’s per-bot controls for the decision instead.

WordPress-layer security plugins

Security plugins that run their checks inside WordPress (after the request reaches PHP) coexist cleanly with Royal AI Firewall. Both layers see every request and can apply their own rules. AI-bot decisions made by Royal AI Firewall happen at parse_request priority 1, before most other plugins run, so blocked bots receive their 403 before any heavy WordPress work fires.
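To illustrate what hooking at priority 1 means in practice, here is a minimal sketch. It is not the plugin's code: the helper name and the placeholder action are assumptions, while `add_action`, `status_header`, and `nocache_headers` are standard WordPress functions.

```php
<?php
/**
 * True when the resolved action means the request should be cut short.
 * The action strings ('allow', 'log', 'block') are illustrative.
 */
function raif_demo_should_short_circuit( string $action ): bool {
    return 'block' === $action;
}

// Register the hook only when WordPress is loaded, so the file also runs standalone.
if ( function_exists( 'add_action' ) ) {
    add_action(
        'parse_request',
        function () {
            $action = 'allow'; // Placeholder: a real handler would classify the request here.
            if ( raif_demo_should_short_circuit( $action ) ) {
                status_header( 403 );
                nocache_headers();
                exit;
            }
        },
        1 // Runs ahead of callbacks registered at the default priority of 10.
    );
}
```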

Royal Plugins integrations

GuardPress — detected automatically. Royal AI Firewall’s dashboard shows a first-party compatibility status. The two plugins handle different layers (GuardPress = login + general security, Royal AI Firewall = AI-bot identification + policy) and run side-by-side with no overlap.
Royal MCP 1.4.33+ — detected automatically. A first-party bridge captures every MCP tool call into the MCP Activity widget on the Royal AI Firewall dashboard, with full tool name and result status. See MCP / Abilities API for details.

MCP / Abilities API Logging

Royal AI Firewall hooks the WordPress Abilities API to log every ability invocation, regardless of which MCP server plugin triggers it. This gives you a unified view of what AI agents are doing through MCP, separate from the HTTP-layer bot dashboard.

What gets logged

For each ability invocation, the plugin records:

Ability ID (e.g. core/get-site-info)
Caller’s User-Agent and IP (when present in the request context)
Response status (success or error code)
Timestamp
For Royal MCP 1.4.33+: tool name, MCP client ID, full result status

What is not logged: argument values. The tool argument array can contain arbitrary customer data (post content, search queries, etc.), so only the keys of the argument array are recorded, not the values.
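The keys-only rule can be shown in a few lines. The function name and the sample arguments are made up for illustration.

```php
<?php
/**
 * Reduce a tool-call argument array to its keys so that no values
 * (post content, search queries, and so on) end up in the log.
 */
function raif_demo_loggable_args( array $arguments ): array {
    return array_keys( $arguments );
}

$arguments = array(
    'title'   => 'Q3 pricing draft',
    'content' => 'Confidential body text',
);
$logged = raif_demo_loggable_args( $arguments ); // Only 'title' and 'content' survive.
```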

Where to see it

The dashboard surfaces an MCP / Abilities API activity widget when ability invocations have been recorded in the last 24 hours. It shows the top 20 abilities by invocation count with success / error breakdown. The widget hides itself when no MCP traffic has been seen, so non-MCP sites don’t see a noisy empty panel.

Compatible MCP server plugins

Royal AI Firewall logs ability invocations from any plugin that implements the WordPress Abilities API hooks (wp_before_execute_ability and wp_after_execute_ability, available in WordPress 6.9+).

If Royal MCP 1.4.33 or later is installed, an additional first-party bridge captures every MCP tool call from that server with full tool name and result status — not just the WordPress Abilities API subset.

Privacy & Telemetry

Royal AI Firewall is privacy-first by default. The plugin makes zero outbound HTTP calls on a fresh install. Telemetry is off. The catalog auto-update is off. There is no license check, no analytics call, no traffic beacon.

What never leaves your server

Regardless of any toggle state, the following are never sent anywhere:

Your site URL or domain
Customer email addresses
Invocation log contents (which bots hit, which URLs, when)
Specific IP addresses
Specific bot identities or hit counts
User-Agent strings of visitors

Anonymous telemetry toggle

The Settings page has an Anonymous usage data toggle (off by default). The toggle exists to reserve the opt-in for a future release; this version of the plugin does not send any telemetry payload even when the toggle is on. If we ever wire up telemetry in a later release, it will be strictly aggregate, non-identifying metrics (plugin version, wizard completion, customized policy count, bucketed bot counts), documented on this page before it ships and never sent unless the toggle is on.

Log retention

Invocation logs default to 7-day retention. Logs older than 7 days are pruned by the daily raif_log_prune cron event. To customize retention, use the raif_log_retention_days filter:

add_filter( 'raif_log_retention_days', function( $days ) {
    return 30; // Keep logs for 30 days instead of 7
} );

Valid range: 1 to 365 days. Values below 1 fall back to the default 7; values above 365 clamp to 365.
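The range rule above can be expressed as a small function, which is handy for checking what a given filter return value will turn into. The function name is hypothetical; only the 1, 7, and 365 boundaries come from this page.

```php
<?php
/**
 * Apply the documented range rule to a retention value:
 * below 1 falls back to 7, above 365 clamps to 365.
 */
function raif_demo_clamp_retention( int $days ): int {
    if ( $days < 1 ) {
        return 7; // Invalid low values use the default, not the minimum.
    }
    return min( $days, 365 );
}
```

Note the asymmetry: a value of 0 yields 7 days, not 1, while a value of 400 yields 365.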

Catalog Updates

Royal AI Firewall ships with a bundled bot fingerprint catalog. That catalog refreshes automatically every time you update the plugin through wp-admin → Plugins — no opt-in required, no outbound network call.

How the per-release refresh works

Each plugin release includes a fresh data/fingerprints-bundled.json with any new bots or User-Agent string changes since the last release. When you update the plugin, Royal AI Firewall checks the bundled catalog’s database_version against what’s currently cached. If the bundled snapshot is newer, the cache is replaced with the new catalog. This happens silently on plugins_loaded after the version bump — you don’t need to reactivate or do anything.
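The comparison can be sketched as follows, assuming database_version is a dotted string that PHP's `version_compare()` can order (for example a date-style `2025.06.01`). That format and the function name are assumptions; the actual version scheme is not documented here.

```php
<?php
/**
 * Decide whether the bundled catalog should replace the cached one.
 * An empty or missing cached version means nothing has been cached yet.
 */
function raif_demo_should_replace_cache( string $bundled_version, ?string $cached_version ): bool {
    if ( null === $cached_version || '' === $cached_version ) {
        return true;
    }
    // Replace only when the bundled snapshot is strictly newer.
    return version_compare( $bundled_version, $cached_version, '>' );
}
```

The strict comparison means a catalog fetched by the optional daily refresh is not overwritten by an older bundled snapshot.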

The optional daily refresh (opt-in)

If you want catalogs fresher than the per-release cadence (typically 2–4 weeks between plugin releases, faster after major AI-vendor launches), there’s an opt-in toggle — Keep catalog updated between releases — on the final wizard step and in Settings → Bot fingerprint database.

When enabled, the plugin schedules a daily WP-Cron event (raif_fingerprint_update) that makes one HTTPS GET to fingerprints.royalplugins.com/v1/index.json. The request body is empty. Only the plugin version in the User-Agent header (e.g. royal-ai-firewall/1.0.0) and a standard If-None-Match cache validator are sent.
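For readers who want to see what such a request carries, the sketch below builds arguments in the shape accepted by WordPress's `wp_remote_get()`. The function name and the stored ETag are hypothetical; the User-Agent format and the If-None-Match header are the two items described above.

```php
<?php
/**
 * Build wp_remote_get() arguments for a conditional catalog fetch.
 * $etag is the validator saved from the previous response, if any.
 */
function raif_demo_fetch_args( string $plugin_version, string $etag = '' ): array {
    $args = array(
        'user-agent' => 'royal-ai-firewall/' . $plugin_version,
        'headers'    => array(),
    );
    if ( '' !== $etag ) {
        // Lets the server answer 304 Not Modified when nothing has changed.
        $args['headers']['If-None-Match'] = $etag;
    }
    return $args;
}

// Inside WordPress the result would be passed along these lines:
// wp_remote_get( 'https://fingerprints.royalplugins.com/v1/index.json', raif_demo_fetch_args( '1.0.0', $etag ) );
```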

The toggle is off by default. Unticking the toggle immediately unschedules the cron — no outbound call will ever fire again unless you re-enable it. The bundled catalog continues to refresh on plugin updates regardless.

How to disable the daily update entirely

Three ways:

Recommended: Untick the Keep catalog updated between releases toggle in Settings.
Code: Use the raif_fingerprint_endpoint filter to return an empty string:
```
add_filter( 'raif_fingerprint_endpoint', '__return_empty_string' );
```
Site-wide: Define WP_HTTP_BLOCK_EXTERNAL as true in your wp-config.php to block all external HTTP requests from WordPress (affects every plugin, not just this one).
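For the site-wide option, the wp-config.php lines would look roughly like this. Both constants are standard WordPress ones; the host list is only an example and should be adjusted to whatever your site still needs to reach.

```php
<?php
// In wp-config.php, above the "That's all, stop editing!" line.
// Blocks outbound requests made through the WordPress HTTP API.
define( 'WP_HTTP_BLOCK_EXTERNAL', true );

// Optional: comma-separated hosts that stay reachable despite the block,
// for example so core and plugin updates keep working.
define( 'WP_ACCESSIBLE_HOSTS', 'api.wordpress.org,*.wordpress.org' );
```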

Uninstall Behavior

By default, data is preserved on uninstall. Your invocation logs, daily rollups, per-bot policy overrides, and all plugin options survive the uninstall so a reinstall picks up where you left off.

To delete everything on uninstall

Tick the Delete all logs, tables, and options when the plugin is uninstalled toggle in Settings → Data before deactivating. Then deactivate and delete the plugin through wp-admin → Plugins.

When the toggle is on, the uninstall script drops the three custom tables (wp_raif_invocation_log, wp_raif_daily_rollup, wp_raif_bot_policy) and deletes every raif_* option. The site returns to the same state it was in before you ever installed the plugin.

Note: Deactivation is non-destructive

Deactivating the plugin (without deleting) only unschedules cron events. Data stays put. Reactivate any time to resume where you left off.

Dashboard Shows Zero Hits

If the dashboard shows zero AI bot hits even though you’re sure AI bots are visiting your site, work through these in order.

Fire a manual test

From a terminal with internet access (not from the WordPress server itself):

curl -A "GPTBot/1.2" https://your-site.com/

Refresh the dashboard. If GPTBot now shows 1 hit, the plugin is working — you just hadn’t had real bot traffic yet. New / low-traffic sites can take 2–6 hours to see the first organic AI bot hit.

Check Cloudflare

If you’re behind Cloudflare and the manual curl test above doesn’t register, Cloudflare may be blocking AI bots at the edge before they reach WordPress. Walk through the Cloudflare setup dial-down guide and run the test again.

Check the REST API response

The dashboard reads its data via the /wp-json/royal-ai-firewall/v1/dashboard REST endpoint. From your wp-admin dashboard, open the browser’s developer tools, switch to the Network tab, and reload. You should see a 200 response with JSON data. If you see a 401, your session expired (refresh the page). If you see a 500, check the WordPress error log.

Check for caching plugins serving stale responses

Some cache plugins intercept REST API responses. The plugin sends Cache-Control: no-store headers and defines the DONOTCACHEPAGE constant, but exotic configurations (Cloudflare APO, server-side fastcgi cache, Varnish) may still cache. Exclude /wp-json/royal-ai-firewall/* from any REST API caching in your cache plugin’s settings.

Check that WP-Cron is running

The hourly rollup cron (raif_rollup_build) aggregates the raw invocation log into the per-bot dashboard summary. If WP-Cron is broken on your site, the dashboard may show stale numbers. From wp-admin → Tools → Site Health, check the Loopback test — if it fails, your site can’t fire WP-Cron events.

Blocking Not Working

If you’ve set a bot to Block but it still shows up with successful 200 responses in the dashboard, the blocking layer isn’t firing for one of these reasons:

Search engine guard

If the bot you’re trying to block is one of the seven guarded search crawlers (Googlebot, Googlebot-Image, Googlebot-Video, Googlebot-News, Bingbot, Applebot, or DuckDuckBot), the search engine guard is overriding your block attempt by design. The Settings page has a toggle to override the guard with a clear warning. See Search Engine Guard.

Edge layer is allowing the request through

The blocking happens at WordPress’s parse_request priority 1 — before WordPress runs most of its work, but after Cloudflare, your CDN, and any edge-layer security plugins have already let the request through. The bot is, at this point, already inside WordPress; we’re just returning a 403 before any of the heavy work runs.

That’s the right place for an AI-bot policy plugin to live. If you need to block at a different layer entirely — before the request hits PHP at all — configure that in your CDN or edge-firewall settings.

Server-side cache is serving the response

If your server-side cache (nginx fastcgi cache, Varnish, or similar) was warmed by a previous unblocked hit, the cached 200 response may be served back even after you set the policy to Block. Clear the cache for the URLs the bot is hitting and re-test.

Per-bot override is set to “Use default policy”

Double-check the dropdown on the bot row. If it says Use default policy, the bot inherits whatever your global default is (Log only, by default). To block this specific bot regardless of default, set the dropdown to Block explicitly.

FAQ

Is there a Pro version?

No. Every feature ships in the free release on WordPress.org. There’s no upgrade prompt, no license key field, no paid tier on the roadmap.

Does the plugin call home?

Not by default. Zero outbound HTTP calls on a fresh install. The bundled bot catalog refreshes from the plugin zip on every plugin update. The optional daily catalog refresh is opt-in only and sends no site data when enabled. See Privacy & Telemetry and Catalog Updates.

Will this block Googlebot?

Not by default. Googlebot (including its Image, Video, and News variants), Bingbot, Applebot, and DuckDuckBot are protected by the search engine guard. Overriding the guard requires explicitly enabling a Settings toggle that carries a warning. See Search Engine Guard.

Does it work on multisite?

Yes. Activate per-site or network-wide. Each site maintains its own bot catalog, log, and per-bot policy overrides.

What WordPress and PHP versions does it require?

WordPress 6.4 minimum (tested up to 7.0). PHP 8.0 minimum (tested on 8.2 and 8.3). See Requirements.

Does it slow down my site?

The classifier runs in-process against a small pre-compiled UA pattern list and is designed to stay in the sub-millisecond range on the hot path. Logging is buffered and flushed on the WordPress shutdown hook, after the response is sent to the visitor, so any database writes never sit on the request’s critical path.
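The buffer-then-flush pattern described here can be sketched as below. The class name, row shape, and writer callback are illustrative assumptions; only the use of the WordPress shutdown hook comes from this page.

```php
<?php
/**
 * Collect log rows in memory and write them in one batch later.
 */
final class Raif_Demo_Log_Buffer {
    /** @var array[] Rows waiting to be written. */
    private array $rows = array();

    public function add( array $row ): void {
        $this->rows[] = $row;
    }

    /** Hand all pending rows to $writer, empty the buffer, return the row count. */
    public function flush( callable $writer ): int {
        $pending    = $this->rows;
        $this->rows = array();
        if ( $pending ) {
            $writer( $pending );
        }
        return count( $pending );
    }
}

$buffer = new Raif_Demo_Log_Buffer();
$buffer->add( array( 'bot_id' => 'gptbot', 'status' => 200 ) );

// Inside WordPress, defer the write to the shutdown hook.
if ( function_exists( 'add_action' ) ) {
    add_action( 'shutdown', function () use ( $buffer ) {
        $buffer->flush( function ( array $rows ) {
            // Placeholder: insert $rows into your own log table here.
        } );
    } );
}
```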

What if I have Wordfence / iThemes Security / etc.?

They coexist cleanly. The plugin auto-detects popular security plugins on activation and notes the compatibility on the dashboard. See Security Plugin Compatibility.

How do I get support?

Use the WordPress.org plugin support forum at wordpress.org/support/plugin/royal-ai-firewall/. We monitor and reply there.

Filters & Actions (for developers)

Royal AI Firewall exposes a small set of filters and actions for developers who want to customize behavior or hook into events.

Filters

Filter	Returns	Purpose
`raif_log_retention_days`	int (1–365)	Change the invocation log retention window. Default 7.
`raif_fingerprint_endpoint`	string (URL)	Change the bot catalog endpoint, or return empty string to disable the daily fetch entirely (when opt-in is on).
`raif_should_skip_classification`	bool	Return true to skip classification and logging for the current request. Useful for private admin areas or specific routes you never want in the log.
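As an example of the skip filter, the sketch below excludes a private path prefix from classification. It assumes the filter passes the current boolean as its first argument, which this page does not state; the `/members-only/` prefix and the `my_plugin_` function name are hypothetical.

```php
<?php
/**
 * Skip classification for requests under a private path prefix.
 * $skip is the value passed in by earlier callbacks on the same filter.
 */
function my_plugin_skip_private_routes( bool $skip, string $request_uri ): bool {
    if ( $skip ) {
        return true; // Respect an earlier decision to skip.
    }
    return 0 === strpos( $request_uri, '/members-only/' );
}

// Register the callback only when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
    add_filter(
        'raif_should_skip_classification',
        function ( $skip ) {
            $uri = isset( $_SERVER['REQUEST_URI'] ) ? (string) $_SERVER['REQUEST_URI'] : '';
            return my_plugin_skip_private_routes( (bool) $skip, $uri );
        }
    );
}
```

Keeping the decision in a plain function makes it easy to unit-test without loading WordPress.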

Actions

Action	Fires when	Args
`raif_request_classified`	After every request is classified (bot or not).	`Classification $classification`
`raif_policy_decided`	After the policy engine decides on an action for a classified request.	`PolicyDecision $decision, Classification $classification`
`raif_request_blocked`	Just before a blocked bot request is short-circuited with a 403.	`Classification $classification, PolicyDecision $decision`
`raif_policy_override_set`	After a per-bot override is created or updated via the REST API.	`string $bot_id, string $action`
`raif_policy_override_cleared`	After a per-bot override is deleted via the REST API.	`string $bot_id`
`raif_fingerprint_db_updated`	After the bot catalog cache is replaced with a fresher snapshot.	`string $new_version, string $old_version`
`raif_log_pruned`	After the daily retention prune runs.	`int $deleted_count, string $cutoff`
`raif_mcp_tool_logged`	After a Royal MCP tool call is captured via the first-party bridge.	`string $tool_name, string $status, Classification $classification`

Example: log classification events to your own table

add_action( 'raif_request_classified', function( $classification ) {
    // Only act on requests classified as bots in the training-crawler category.
    if ( $classification->is_bot() && 'training-crawler' === $classification->category ) {
        // my_plugin_log_training_bot() is a placeholder for your own logging function.
        my_plugin_log_training_bot( $classification->bot_id, $classification->ip );
    }
} );