Deploying Anubis: Stopping AI Web Crawlers in Their Tracks

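Early on the morning of May 1, 2025, I noticed the website was under heavy load. The database server was pinned at 100% CPU, whereas its normal load is about 7%.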

Figure: CPU usage over the past 7 days. It held at about 7% until 3 a.m. today, then jumped to 95-100%.

Looking at the data for the past four weeks, this is highly unusual.

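I found that a single IP address was responsible. Once I blocked it, the load dropped immediately.

The problem had been going on for hours, so it was time to enable Anubis.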

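PoW (proof of work)

Anubis is a proof-of-work middleware solution. It accepts incoming traffic, does its special processing, and then decides whether to forward the request to your web server. How I set that up is covered later in this post. My goals here are to record what I did (for my own reference) and to provide a simple, easy-to-follow deployment guide (for yours).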
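To make the proof-of-work idea concrete, here is a minimal Python sketch of a generic hash-based challenge. It is not Anubis's actual code: the function names, the challenge string, and the "leading zero hex digits" rule are assumptions I picked for illustration. The point is the asymmetry: the client has to grind through many hashes, while the server checks the answer with a single one.

```python
import hashlib
from itertools import count


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash, accepted if the digest starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: try nonces 0, 1, 2, ... until one passes verify()."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    challenge = "example-challenge"  # hypothetical; a real server would issue its own value
    nonce = solve(challenge, difficulty=2)
    print(nonce, verify(challenge, nonce, 2))
```

In this toy scheme each extra zero digit multiplies the expected client work by 16, while the cost of verifying stays at one hash.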

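Implementation overview

This is how I explain adding Anubis to other people. A more detailed Nginx configuration example appears further down.

These instructions assume you already have a working website configuration (including TLS). We will call it the original configuration.

Copy and paste the original configuration into a newly created "server" block that listens on a different port (I chose 127.0.0.1:8080). We will call this the copy.

Remove the TLS configuration from the copy; you no longer need it there. You can leave it in, but the examples in this post assume it has been removed.

In the original configuration, stop serving content directly and forward all requests to Anubis instead.

Configure Anubis to forward requests to the copy you created.

Done.

That is the gist of it. The details follow, and the examples below should help.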
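The steps above amount to a three-hop chain. The sketch below models it in Python using the addresses that appear later in this post. The Hop class and the check_chain helper are hypothetical names of mine, just a way to see that each hop forwards to the address the next one listens on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hop:
    name: str
    listen: str                 # address this hop accepts connections on
    forwards_to: Optional[str]  # address it passes requests to (None for the last hop)


def check_chain(hops: list[Hop]) -> list[str]:
    """Return one message per hop whose target differs from the next hop's listener."""
    problems = []
    for current, nxt in zip(hops, hops[1:]):
        if current.forwards_to != nxt.listen:
            problems.append(
                f"{current.name} forwards to {current.forwards_to}, "
                f"but {nxt.name} listens on {nxt.listen}"
            )
    return problems


chain = [
    # nginx, original server block: terminates TLS, proxies everything to Anubis
    Hop("original (TLS)", "10.55.0.42:443", "127.0.0.1:8923"),
    # Anubis: "-bind :8923", written here as nginx reaches it; "-target" points at the copy
    Hop("anubis", "127.0.0.1:8923", "127.0.0.1:8080"),
    # nginx, the copy: no TLS, serves the actual content
    Hop("copy (no TLS)", "127.0.0.1:8080", None),
]

if __name__ == "__main__":
    print(check_chain(chain) or "chain is consistent")
```

If the port in Anubis's -target and the copy's listen line ever drift apart, this is the kind of mismatch that shows up as a 502 from the front-end server.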

Installation

Installation is as simple as ever:

[11:49 test-nginx01 dvl ~] % sudo pkg install go-anubis
Updating local repository catalogue...
local repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
	go-anubis: 1.15.1

Number of packages to be installed: 1

The process will require 10 MiB more space.
4 MiB to be downloaded.

Proceed with this action? [y/N]: y
[test-nginx01.int.unixathome.org] [1/1] Fetching go-anubis-1.15.1.pkg: 100%    4 MiB   3.9MB/s    00:01    
Checking integrity... done (0 conflicting)
[test-nginx01.int.unixathome.org] [1/1] Installing go-anubis-1.15.1...
[test-nginx01.int.unixathome.org] [1/1] Extracting go-anubis-1.15.1: 100%
=====
Message from go-anubis-1.15.1:

--
Anubis has been installed! Typically anubis is run behind a proxy, such as
nginx, caddy, haproxy, or similar, that passes the `x-real-ip` header to
anubis, which runs by default on port 8923. You will need to supply the
target url that you are protecting.

Amend rc.conf as required. For example:

anubis_enable=YES
anubis_args="-target http://localhost:4000/myapp -bind :8923"

For more information, see https://anubis.techaro.lol/

Configuration

These are my configuration options:

[11:49 test-nginx01 dvl ~] % grep anubis /etc/rc.conf
anubis_enable="YES"
anubis_args="-target http://127.0.0.1:8080 -bind :8923 -difficulty 2 -cookie-domain=freshports.org -policy-fname=/usr/local/etc/anubis-freshports.json"
[11:50 test-nginx01 dvl ~] % 
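The anubis_args value mixes two spellings, "-flag value" and "-flag=value", both of which Go's flag parsing accepts. If you want to sanity-check the string from a script, a small stdlib-only helper like the one below reads it back into a dictionary. The name parse_args is hypothetical, and the helper only knows about the syntax, not about which flags Anubis actually supports.

```python
import shlex


def parse_args(args: str) -> dict[str, str]:
    """Parse a Go-style flag string ('-name value' or '-name=value') into a dict."""
    tokens = shlex.split(args)
    parsed: dict[str, str] = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("-"):
            raise ValueError(f"unexpected token: {token!r}")
        name = token.lstrip("-")
        if "=" in name:
            # -name=value form
            name, value = name.split("=", 1)
        elif i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            # -name value form: consume the next token as the value
            i += 1
            value = tokens[i]
        else:
            # bare flag with no value; treated as a boolean switch
            value = "true"
        parsed[name] = value
        i += 1
    return parsed


if __name__ == "__main__":
    args = (
        "-target http://127.0.0.1:8080 -bind :8923 -difficulty 2 "
        "-cookie-domain=freshports.org "
        "-policy-fname=/usr/local/etc/anubis-freshports.json"
    )
    for name, value in parse_args(args).items():
        print(f"{name} = {value}")
```

One limitation of this sketch: a boolean flag followed by a stray positional word would swallow that word as its value, which is not how Go treats boolean flags.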

When I started Anubis with sudo service anubis start, I saw this in /var/log/messages:

May  1 12:13:29 test-nginx01 anubis[62887]: Rule error IDs:
May  1 12:13:29 test-nginx01 anubis[62887]: 
May  1 12:13:29 test-nginx01 anubis[62887]: {"time":"2025-05-01T12:13:29.021133213Z","level":"INFO","source":{"function":"main.main","file":"github.com/TecharoHQ/anubis/cmd/anubis/main.go","line":222},"msg":"listening","url":"http://localhost:8923","difficulty":4,"serveRobotsTXT":false,"target":"http://127.0.0.1:8080","version":"v1.15.1","debug-x-real-ip-default":""}

It is listening.

[12:13 test-nginx01 dvl /usr/local/etc/rc.d] % sockstat -4 -p 8923                           
USER     COMMAND    PID   FD  PROTO  LOCAL ADDRESS         FOREIGN ADDRESS      
www      anubis     62888 3   tcp4   10.55.0.42:8923       *:*

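But wait, you specified localhost. This is just a vanilla FreeBSD jail, so the listener shows up on the jail's own address. In production I would move it to something like 127.0.0.24:8923 on an lo1 interface.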

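A more detailed example

This is the working Nginx configuration before adding Anubis. Most of the website details are omitted to keep things simple.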

[17:36 stage-nginx01 dvl /usr/local/etc/nginx/includes] % cat freshports.conf 
# redirect from port 80 to 443
server {
  listen 10.55.0.42:80;
 
  server_name test-nginx01.int.unixathome.org;
 
  return 301 https://$server_name$request_uri;
}
 
server {
  listen 10.55.0.42:443 ssl;
  http2 on;
   
  server_name test-nginx01.int.unixathome.org;
 
  include "/usr/local/etc/freshports/virtualhost-common.conf";      # this contains the documentroot etc
  include "/usr/local/etc/freshports/virtualhost-common-ssl.conf";
 
  ssl_certificate     /usr/local/etc/ssl/test-nginx01.int.unixathome.org.fullchain.cer;
  ssl_certificate_key /usr/local/etc/ssl/test-nginx01.int.unixathome.org.key;
 
}

The configuration after adding Anubis:

# redirect from port 80 to 443
# no changes here
server {
  listen 10.55.0.42:80;
 
  server_name test-nginx01.int.unixathome.org;
 
  return 301 https://$server_name$request_uri;
}
 
# take your original ssl server, and comment out all the content, and add in the proxy stuff.
server {
  listen 10.55.0.42:443 ssl;
  http2 on;
   
  server_name test-nginx01.int.unixathome.org;
 
#  include "/usr/local/etc/freshports/virtualhost-common.conf";
#  include "/usr/local/etc/freshports/virtualhost-common-ssl.conf";
 
  # this gets added in
  location / {
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://anubis;
  }
 
  ssl_certificate     /usr/local/etc/ssl/test-nginx01.int.unixathome.org.fullchain.cer;
  ssl_certificate_key /usr/local/etc/ssl/test-nginx01.int.unixathome.org.key;
}
 
# this is what your TLS based config above will redirect to
# this matches settings in /etc/rc.conf - see next section
upstream anubis {
  server 127.0.0.1:8923 max_fails=99999 max_conns=1024 fail_timeout=1s;
  keepalive 256;
  keepalive_timeout 120s;
  keepalive_requests 256;
}
 
 
# this is a copy of the original content, minus ssl and listening on another port
server {
  listen 127.0.0.1:8080;
   
  server_name test-nginx01.int.unixathome.org;
 
  # this ensures the logs created are correct
  set_real_ip_from  10.55.0.42;
  real_ip_header    X-Forwarded-For;
  real_ip_recursive on;
 
 
  include "/usr/local/etc/freshports/virtualhost-common.conf"; # this what serves up your content.
#  include "/usr/local/etc/freshports/virtualhost-common-ssl.conf";
 
#  ssl_certificate     /usr/local/etc/ssl/test-nginx01.int.unixathome.org.fullchain.cer;
#  ssl_certificate_key /usr/local/etc/ssl/test-nginx01.int.unixathome.org.key;
}
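The three real_ip lines in the copy are what keep the client's address, rather than the proxy's, in the access logs. As a rough illustration of the documented behaviour (an approximation in Python, not nginx's code), the hypothetical function below picks the address that would be logged for a given X-Forwarded-For header. It only trusts the header when the connecting peer is listed in set_real_ip_from, and with real_ip_recursive on it walks the header from the right until it finds an address that is not trusted.

```python
from ipaddress import ip_address, ip_network


def real_ip(peer: str, forwarded_for: str, trusted: list[str], recursive: bool = True) -> str:
    """Approximate which client address nginx would use for logging."""
    nets = [ip_network(t) for t in trusted]

    def is_trusted(addr: str) -> bool:
        return any(ip_address(addr) in net for net in nets)

    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    if not is_trusted(peer) or not hops:
        return peer  # header is ignored unless the peer is a trusted proxy
    if not recursive:
        return hops[-1]  # real_ip_recursive off: take the last address in the header
    for addr in reversed(hops):
        if not is_trusted(addr):
            return addr  # last non-trusted address wins
    return hops[0]  # every hop is trusted: fall back to the left-most one


if __name__ == "__main__":
    # Values from this post: the jail address is the trusted proxy,
    # 10.1.98.18 is the browser that made the request.
    print(real_ip("10.55.0.42", "10.1.98.18", ["10.55.0.42"]))
```

With the values from this post the result is the browser's address, 10.1.98.18. A request arriving from an address not listed in set_real_ip_from keeps its own peer address, whatever the header claims.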

The Anubis configuration:

[17:43 test-nginx01 dvl ~] % grep anubis /etc/rc.conf
anubis_enable="YES"
anubis_args="-target http://localhost:8080 -bind :8923 -difficulty 2 -cookie-domain=freshports.org -policy-fname=/usr/local/etc/anubis-freshports.json"

The policy file comes from https://www.techug.com/post/setting-up-anubis-on-freebsd/; I added the backend section.

[17:43 test-nginx01 dvl ~] % cat /usr/local/etc/anubis-freshports.json
{"bots":
        [
        {
                "name": "generic-bot-catchall",
                "user_agent_regex": "(?i:bot|crawler|gpt)",
                "action": "CHALLENGE",
                "challenge": {
                        "difficulty": 16,
                        "report_as": 4,
                        "algorithm": "slow"
                }
        },
        {
                "name": "well-known",
                "path_regex": "^/.well-known/.*$",
                "action": "ALLOW"
        },
        {
                "name": "backend",
                "path_regex": "^/backend/.*$",
                "action": "ALLOW"
        },
        {
                "name": "favicon",
                "path_regex": "^/favicon.ico$",
                "action": "ALLOW"
        },
        {
                "name": "robots-txt",
                "path_regex": "^/robots.txt$",
                "action": "ALLOW"
        },
        {
                "name": "generic-browser",
                "user_agent_regex": "Mozilla",
                "action": "CHALLENGE"
        }
        ]
}
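To get a feel for how a policy like this sorts requests, here is a small Python sketch. It assumes the rules are tried in file order and the first match wins, which is my reading and not something this post verifies, and it uses Python's re module where Anubis uses Go regular expressions. The decide function is a hypothetical name, and the challenge details are left out.

```python
import re
from typing import Optional

# The rules from the policy file above, in the same order (challenge details omitted).
BOTS = [
    {"name": "generic-bot-catchall", "user_agent_regex": "(?i:bot|crawler|gpt)", "action": "CHALLENGE"},
    {"name": "well-known", "path_regex": "^/.well-known/.*$", "action": "ALLOW"},
    {"name": "backend", "path_regex": "^/backend/.*$", "action": "ALLOW"},
    {"name": "favicon", "path_regex": "^/favicon.ico$", "action": "ALLOW"},
    {"name": "robots-txt", "path_regex": "^/robots.txt$", "action": "ALLOW"},
    {"name": "generic-browser", "user_agent_regex": "Mozilla", "action": "CHALLENGE"},
]


def decide(user_agent: str, path: str) -> Optional[tuple[str, str]]:
    """Return (rule name, action) for the first matching rule, or None if nothing matches."""
    for rule in BOTS:
        ua_re = rule.get("user_agent_regex")
        path_re = rule.get("path_regex")
        if (ua_re and re.search(ua_re, user_agent)) or (path_re and re.search(path_re, path)):
            return rule["name"], rule["action"]
    return None


if __name__ == "__main__":
    safari = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15"
    print(decide(safari, "/"))
    print(decide("curl/8.0", "/robots.txt"))
    print(decide("Mozilla/5.0 (compatible; GPTBot/1.0)", "/robots.txt"))
```

If the first-match assumption holds, a client whose User-Agent contains "bot" is challenged even on /robots.txt, because the catch-all rule sits above the path rules. Moving the ALLOW rules to the top would change that.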

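The configuration above should work. If you still run into problems, let me know and I will update this post.

Logging

Here is a log entry from my laptop visiting the website, reflowed to avoid horizontal scrolling: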

May  3 20:20:28 test-nginx01 anubis[58856]: {"time":"2025-05-03T20:20:28.161082763Z","level":"INFO","source":
{"function":"github.com/TecharoHQ/anubis/lib.(*Server).PassChallenge",
"file":"github.com/TecharoHQ/anubis/lib/anubis.go","line":402},"msg":"challenge took",
"user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.4 Safari/605.1.15",
"accept_language":"en-US,en;q=0.9","priority":"u=0, i","x-forwarded-for":"10.1.98.18","x-real-ip":"10.1.98.18",
"check_result":{"name":"bot/generic-browser","rule":"CHALLENGE"},"elapsedTime":87}

That IP address (10.1.98.18) is my laptop.
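Each Anubis entry is a syslog prefix followed by a JSON object, so pulling fields out of /var/log/messages takes only a few lines of Python. This is a minimal sketch; parse_anubis_line is a hypothetical helper, and the sample is the entry above with several fields trimmed for width.

```python
import json


def parse_anubis_line(line: str) -> dict:
    """Drop the syslog prefix and decode the JSON payload that follows it."""
    start = line.index("{")  # the JSON object begins at the first brace
    return json.loads(line[start:])


sample = (
    "May  3 20:20:28 test-nginx01 anubis[58856]: "
    '{"time":"2025-05-03T20:20:28.161082763Z","level":"INFO","msg":"challenge took",'
    '"x-real-ip":"10.1.98.18",'
    '"check_result":{"name":"bot/generic-browser","rule":"CHALLENGE"},"elapsedTime":87}'
)

if __name__ == "__main__":
    record = parse_anubis_line(sample)
    print(record["x-real-ip"], record["check_result"]["name"], record["elapsedTime"])
```

From here it is easy to count challenges per rule or per client address, which is the kind of summary that would have pointed at the offending IP sooner.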
