web_fetch
web_fetch saves fetched content under .stencila/cache/web/.
url | |||
raw |
How it Works
1. URL validation: the URL must use the http:// or https:// scheme.
2. HTTP caching: responses are cached locally with full RFC 7234 compliance. Subsequent fetches of the same URL use conditional requests (If-Modified-Since, If-None-Match) and honor 304 Not Modified responses.
3. Content processing: HTML pages are parsed and converted to Markdown. Images referenced in the page are downloaded in parallel (up to 8 concurrent, with retries) and saved alongside the Markdown file in a media/ subdirectory. Image references in the Markdown are rewritten to point to the local copies.
4. Output: the tool returns a manifest listing the saved files with sizes and line counts, along with instructions to use read_file, grep, or glob to explore the content.
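The validation and caching steps can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function names and the shape of the cache entry (a dict with `etag` and `last_modified` keys) are assumptions made for the example.

```python
from urllib.parse import urlsplit


def validate_scheme(url: str) -> bool:
    """Return True only when the URL uses the http or https scheme."""
    return urlsplit(url).scheme.lower() in ("http", "https")


def conditional_headers(cached: dict) -> dict:
    """Build revalidation headers from a hypothetical cache entry.

    `cached` is assumed to hold the validators stored from the previous
    response: `etag` (ETag header) and `last_modified` (Last-Modified header).
    """
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers
```

If the server answers such a request with 304 Not Modified, the cached copy can be reused without downloading the body again.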
Guard Rules
web.credential_url | |||||
web.metadata_endpoint | |||||
web.internal_network | |||||
web.non_https | https://http:// | ||||
web.high_risk_port | |||||
web.domain_allowlist | allowedDomains | ||||
web.domain_denylist | disallowedDomains | ||||
web.parse_failure | https://example.com/path |
Metadata Hosts
web.metadata_endpoint, web.credential_url
- 169.254.169.254 (AWS, Azure, most cloud providers)
- fd00:ec2::254 (AWS IMDSv2 IPv6 endpoint)
- metadata.google.internal (GCP)
- 100.100.100.200 (Alibaba Cloud)
Credential Path Prefixes
web.credential_url
- /latest/meta-data/iam/security-credentials (AWS IMDSv1/v2)
- /latest/api/token (AWS IMDSv2)
- /computeMetadata/v1/instance/service-accounts (GCP)
- /metadata/identity/oauth2/token (Azure)
- /latest/meta-data/ram/security-credentials (Alibaba Cloud)
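As a rough illustration of how the metadata hosts and credential path prefixes might combine, the sketch below classifies a URL against both lists. The function name `classify_metadata_url` is hypothetical, and the precedence shown (a credential path on a metadata host reports web.credential_url, any other path on a metadata host reports web.metadata_endpoint) is an assumption, not documented behavior.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hosts and prefixes copied from the lists above.
METADATA_HOSTS = {
    "169.254.169.254",
    "fd00:ec2::254",
    "metadata.google.internal",
    "100.100.100.200",
}

CREDENTIAL_PREFIXES = (
    "/latest/meta-data/iam/security-credentials",
    "/latest/api/token",
    "/computeMetadata/v1/instance/service-accounts",
    "/metadata/identity/oauth2/token",
    "/latest/meta-data/ram/security-credentials",
)


def classify_metadata_url(url: str) -> Optional[str]:
    """Return the name of the rule the URL would trip, or None.

    Assumption: the more specific credential rule takes precedence.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # hostname strips IPv6 brackets
    if host not in METADATA_HOSTS:
        return None
    if parts.path.startswith(CREDENTIAL_PREFIXES):
        return "web.credential_url"
    return "web.metadata_endpoint"
```

For example, `classify_metadata_url("http://169.254.169.254/latest/meta-data/hostname")` reports the metadata rule, while the same host with an `/iam/security-credentials` path reports the credential rule.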
Internal Network Detection
web.internal_network
- localhost
- *.local, *.internal hostname suffixes
- Loopback IPs: 127.0.0.0/8, ::1
- Private IPv4: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
- Link-local: 169.254.0.0/16, fe80::/10
- Shared address space: 100.64.0.0/10
- IPv4-mapped IPv6: ::ffff:0:0/96 (when the mapped address is private)
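The checks listed above could be approximated with Python's standard `ipaddress` module. This is a sketch under stated assumptions: the function name `is_internal_host` is hypothetical, only hostnames and IP literals are inspected (no DNS resolution), and an IPv4-mapped IPv6 address is treated as internal when the embedded IPv4 address falls in any of the listed ranges.

```python
import ipaddress

# CIDR ranges copied from the list above.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "127.0.0.0/8", "::1/128",         # loopback
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # private IPv4
        "169.254.0.0/16", "fe80::/10",    # link-local
        "100.64.0.0/10",                  # shared address space
    )
]


def is_internal_host(host: str) -> bool:
    """Return True when the host looks like an internal-network target."""
    host = host.lower().rstrip(".")
    if host == "localhost" or host.endswith((".local", ".internal")):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; resolving names is out of scope here
    # Unwrap ::ffff:a.b.c.d so the embedded IPv4 address is what gets checked.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return any(addr in net for net in INTERNAL_NETWORKS)
```

A production guard would likely also need to check the addresses a hostname resolves to, since a public-looking name can point at a private IP; that step is deliberately left out of this sketch.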
High-Risk Ports
web.high_risk_port
| Port | Service |
|------|---------|
| 22 | SSH |
| 23 | Telnet |
| 25 | SMTP |
| 135 | MS RPC |
| 139 | NetBIOS |
| 445 | SMB |
| 2375 | Docker daemon (unencrypted) |
| 2376 | Docker daemon (TLS) |
| 3306 | MySQL |
| 5432 | PostgreSQL |
| 5900 | VNC |
| 6379 | Redis |
| 6443 | Kubernetes API |
| 8200 | Vault |
| 8500 | Consul |
| 9200 | Elasticsearch |
| 27017 | MongoDB |
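A port check against this table might look like the following sketch. The function name `high_risk_service` is hypothetical, and the example assumes only an explicit port in the URL is examined; how the actual rule treats default ports is not stated here.

```python
from typing import Optional
from urllib.parse import urlsplit

# Port-to-service mapping copied from the table above.
HIGH_RISK_PORTS = {
    22: "SSH", 23: "Telnet", 25: "SMTP", 135: "MS RPC", 139: "NetBIOS",
    445: "SMB", 2375: "Docker daemon (unencrypted)", 2376: "Docker daemon (TLS)",
    3306: "MySQL", 5432: "PostgreSQL", 5900: "VNC", 6379: "Redis",
    6443: "Kubernetes API", 8200: "Vault", 8500: "Consul",
    9200: "Elasticsearch", 27017: "MongoDB",
}


def high_risk_service(url: str) -> Optional[str]:
    """Return the service name if the URL's explicit port is high-risk."""
    port = urlsplit(url).port  # None when the URL carries no explicit port
    return HIGH_RISK_PORTS.get(port)
```

Note that `urlsplit(...).port` raises `ValueError` for a malformed or out-of-range port, so a caller would probably want to treat that case as a parse failure.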