feat: timezone="auto" resolves from any egress + weekly geoip auto-update

Refine timezone="auto" so it ALWAYS resolves (drop the "host" sentinel):
- ""/"auto" resolve from the proxy egress when a proxy is set, else from the
  host's own public IP (direct lookup); an explicit zone is the only opt-out.
- on failure: with a proxy raise; without a proxy fall back to the host TZ.

GeoIP DB now auto-updates against the daijro/geoip-all-in-one weekly rebuild:
cache the latest, re-check after GEOIP_REFRESH_DAYS (7), prune old tags,
reuse a stale cache offline; GEOIP_MMDB_VERSION is only the cold fallback.

tests: test_geo.py (37) + test_geoip_update.py; full unit suite 429 green
plus 8 live combinations (proxy / no-proxy / explicit / failing / freshness).
feder-cr
2026-06-06 05:16:20 +02:00
parent 369f3f7fdb
commit f2664f96e1
8 changed files with 361 additions and 113 deletions
+3 -2
@@ -7,9 +7,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 ## [Unreleased]
 ### Added
-- `timezone="auto"`: resolve the browser timezone from the proxy egress IP. A session with a proxy and no explicit timezone now defaults to `auto` — a foreign proxy paired with the host TZ is the classic `timezone_mismatch` signal. The egress IP is discovered through the proxy (SOCKS supported) and mapped to its IANA zone with an offline mmdb (`daijro/geoip-all-in-one`, downloaded + cached on first use; `STEALTHFOX_GEOIP_MMDB` points at your own). Precedence: an explicit zone wins; `""`/`"auto"` without a proxy stay on the host TZ; `"host"`/`"local"` force the host TZ even behind a proxy. With a proxy, an unresolvable zone raises rather than silently falling back.
+- `timezone="auto"`: the browser timezone is auto-derived from the egress IP. By default (no explicit timezone) it ALWAYS resolves — from the proxy egress when a proxy is set, otherwise from the host's own public IP — so the zone can never disagree with the IP (the classic `timezone_mismatch` signal). An explicit `"Area/City"` is the only way to force a specific zone. On failure: with a proxy the launch raises (no silent host-TZ fallback behind a foreign proxy); without a proxy it falls back to the host TZ so a transient lookup can't break the launch.
+- The egress IP is mapped to its IANA zone with an offline mmdb (`daijro/geoip-all-in-one`). It auto-updates against the upstream weekly rebuild: cached locally, re-checked after `GEOIP_REFRESH_DAYS` (7), older copies pruned, and a stale cache is reused when offline. `STEALTHFOX_GEOIP_MMDB` points at your own `.mmdb` to skip the download.
 - `resolve_session_timezone(timezone, proxy)` and `ensure_geoip_mmdb()` re-exported at the package root (plus `GeoTimezoneError`) so integrations that own their launch can reproduce the resolution.
-- `tests/test_geo.py`: 32 unit tests (precedence policy, proxy→requests translation, egress discovery, IP→IANA mapping, fail-early).
+- `tests/test_geo.py` (37) + `tests/test_geoip_update.py` (freshness / auto-update / offline fallback) unit tests.
 ### Changed
 - New runtime dependencies: `requests[socks]` (SOCKS egress lookup), `maxminddb` (mmdb reader), `tzdata` (IANA database for `zoneinfo`, which Windows lacks).
+5 -10
@@ -146,27 +146,22 @@ Schemes supported: `socks5`, `socks4`, `http`, `https`. Auth works on all of the
 The browser timezone follows `timezone=`:
 ```python
-# default: with a proxy, the timezone is auto-derived from the proxy egress IP
+# default: timezone is auto-derived from the egress IP (proxy egress if a
+# proxy is set, otherwise the host's own public IP)
 with InvisiblePlaywright(proxy=proxy) as browser:
     ...
-# explicit IANA zone always wins
+# explicit IANA zone always wins — the only way to force a specific zone
 with InvisiblePlaywright(proxy=proxy, timezone="America/New_York") as browser:
     ...
-# opt out and keep the host timezone even behind a proxy
-with InvisiblePlaywright(proxy=proxy, timezone="host") as browser:
-    ...
 ```
 | `timezone=` | with proxy | without proxy |
 |---|---|---|
-| `""` (default) | auto-derived from egress IP | host timezone |
-| `"auto"` | auto-derived from egress IP | host timezone |
+| `""` (default) / `"auto"` | auto from proxy egress IP | auto from host public IP |
 | `"Area/City"` | that zone | that zone |
-| `"host"` / `"local"` | host timezone | host timezone |
-A proxy in a different country paired with the host timezone is the classic `timezone_mismatch` signal, so a proxy with no explicit timezone now resolves automatically. The egress IP is looked up through the proxy and mapped to its IANA zone with an offline database ([`daijro/geoip-all-in-one`](https://github.com/daijro/geoip-all-in-one)), downloaded and cached on first use. If a proxy is set but the zone can't be resolved, the launch raises rather than silently falling back to the host zone — pass an explicit `timezone=` or `timezone="host"` to override. Point `STEALTHFOX_GEOIP_MMDB` at your own `.mmdb` to skip the download.
+The timezone always tracks the actual egress, so it can't disagree with the IP — a proxy in a different country paired with the host timezone is the classic `timezone_mismatch` signal. The egress IP is mapped to its IANA zone with an offline database ([`daijro/geoip-all-in-one`](https://github.com/daijro/geoip-all-in-one)), which auto-updates against its weekly rebuild and is cached locally (point `STEALTHFOX_GEOIP_MMDB` at your own `.mmdb` to skip the download). On failure: with a proxy the launch raises rather than silently using the host zone (pass an explicit `timezone=` to override); without a proxy it falls back to the host timezone so a transient lookup failure can't break the launch.
### Pinning specific fingerprint fields ### Pinning specific fingerprint fields
+36 -32
@@ -1,23 +1,23 @@
-"""Resolve the session timezone from the proxy egress IP (``timezone="auto"``).
+"""Resolve the session timezone from the egress IP (``timezone="auto"``).
 Approach B: discover the egress IP with one HTTP request routed *through the
-configured proxy*, then map IP → IANA timezone with an offline mmdb
+proxy* when one is set, otherwise a direct request that sees the host's own
+public IP — then map IP → IANA timezone with an offline mmdb
 (``daijro/geoip-all-in-one``, downloaded + cached by ``download.py``).
 Precedence (see ``resolve_session_timezone``):
-    "host" / "local"   → ""         force host TZ (escape hatch)
-    explicit IANA      → unchanged  explicit always wins
-    "" + no proxy      → ""         host TZ (default, unchanged behaviour)
-    "" + proxy         → egress     NEW default: a proxy with no timezone is
-                                    exactly the timezone_mismatch trap, so we
-                                    auto-resolve it.
-    "auto" + no proxy  → ""         nothing to resolve, fall back to host TZ
-    "auto" + proxy     → egress
+    explicit IANA      → unchanged  explicit always wins
+    "" / "auto"        → egress     ALWAYS resolve. With a proxy, from the proxy
+                                    egress IP; without a proxy, from the host's
+                                    own public IP. This is the default.
-When a proxy IS set we fail loudly rather than silently fall back to the host
-TZ — a foreign proxy paired with the host timezone is the precise signal
-detectors flag as ``timezone_mismatch``.
+On failure:
+    with a proxy       → raise      a foreign proxy paired with the host TZ is
+                                    the precise ``timezone_mismatch`` signal, so
+                                    we fail loudly rather than fall back silently.
+    without a proxy    → "" (host)  the host TZ is a safe default, so a transient
+                                    lookup failure must not break the launch.
 """
 from __future__ import annotations
@@ -79,14 +79,16 @@ def _proxies_for_requests(proxy: Dict[str, str]) -> Dict[str, str]:
 def discover_egress_ip(
-    proxy: Dict[str, str], *, timeout: float = 10.0
+    proxy: Optional[Dict[str, str]] = None, *, timeout: float = 10.0
 ) -> str:
-    """Return the public IP seen when routing through ``proxy``.
+    """Return the public egress IP.
-    Tries each echo endpoint in turn; raises :class:`GeoTimezoneError` if none
-    return a valid IP (SOCKS support requires ``requests[socks]`` / PySocks).
+    Routes the request through ``proxy`` when given (SOCKS support requires
+    ``requests[socks]`` / PySocks); with ``proxy=None`` it makes a direct
+    request that sees the host's own public IP. Tries each echo endpoint in
+    turn; raises :class:`GeoTimezoneError` if none return a valid IP.
     """
-    proxies = _proxies_for_requests(proxy)
+    proxies = _proxies_for_requests(proxy) if proxy else None
     last_err: Optional[Exception] = None
     for url in _IP_ECHO_ENDPOINTS:
         try:
@@ -139,22 +141,24 @@ def resolve_session_timezone(
 ) -> str:
     """Map the user's ``timezone`` setting to a concrete IANA zone (or ``""``).
-    See the module docstring for the full precedence table. Raises
-    :class:`GeoTimezoneError` when a proxy is set but the egress timezone
-    cannot be resolved (fail-early — never silently use the host TZ behind a
-    foreign proxy).
+    See the module docstring for the full precedence table. ``""``/``"auto"``
+    ALWAYS resolve from the egress IP (proxy egress if a proxy is set, else the
+    host's own public IP). On failure: with a proxy we raise
+    :class:`GeoTimezoneError` (never silently use the host TZ behind a foreign
+    proxy); without a proxy we fall back to ``""`` (host TZ) so a transient
+    lookup failure can't break the launch.
     """
     tz = (timezone or "").strip()
-    if tz.lower() in ("host", "local"):
-        return ""
     if tz and tz.lower() != "auto":
         return tz  # explicit IANA wins
-    if not _proxy_is_set(proxy):
-        return ""  # "" / "auto" without a proxy → host TZ
-    # proxy set, tz is "" (new default) or "auto" → resolve from egress.
-    assert proxy is not None
+    # "" or "auto" → always resolve from the egress IP.
     from .download import ensure_geoip_mmdb
-    ip = discover_egress_ip(proxy)
-    mmdb = ensure_geoip_mmdb()
-    return ip_to_timezone(ip, mmdb)
+    proxy_set = _proxy_is_set(proxy)
+    try:
+        ip = discover_egress_ip(proxy if proxy_set else None)
+        return ip_to_timezone(ip, ensure_geoip_mmdb())
+    except Exception:
+        if proxy_set:
+            raise  # fail-early behind a proxy (timezone_mismatch trap)
+        return ""  # no proxy: host TZ is a safe fallback
+4 -2
@@ -53,8 +53,10 @@ RELEASE_URL_TEMPLATE = (
 # daijro/geoip-all-in-one merges IP2Location LITE + GeoLite2 + DB-IP into a
 # single mmdb (country ISO + coordinates + IANA timezone via tzfpy), rebuilt
 # weekly. GPL-3.0, so we DOWNLOAD it at runtime into the user cache (like the
-# Firefox binary) rather than bundling it into this MIT package. Pinned to a
-# known-good weekly tag; bump to refresh. The `-all` variant covers IPv4+IPv6.
+# Firefox binary) rather than bundling it into this MIT package. The `-all`
+# variant covers IPv4+IPv6. download.py tracks the LATEST release and refreshes
+# weekly; GEOIP_MMDB_VERSION is only the cold-cache fallback when the GitHub
+# API is unreachable on a machine that has never downloaded the DB.
 GEOIP_REPO: str = "daijro/geoip-all-in-one"
 GEOIP_MMDB_VERSION: str = "2026.06.03"
 GEOIP_ASSET: str = "geoip-aio-all.mmdb.zip"
+119 -30
@@ -5,9 +5,11 @@ import hashlib
 import os
 import platform
 import re
+import shutil
 import sys
 import tarfile
 import tempfile
+import time
 import zipfile
 from pathlib import Path
@@ -158,46 +160,133 @@ def ensure_binary(version: str = BINARY_VERSION) -> Path:
 # ─────────────────────────────────────────────────────────────────────────
-# GeoIP mmdb (used by timezone="auto" to map proxy egress IP → IANA zone)
+# GeoIP mmdb (timezone="auto" map egress IP → IANA zone)
+#
+# daijro/geoip-all-in-one is rebuilt WEEKLY, so we don't pin a tag. We cache
+# the latest mmdb and, once it's older than GEOIP_REFRESH_DAYS, re-check the
+# latest release and pull a newer build if one exists. Net effect: no download
+# (not even an API call) on a launch within the window; auto-refresh after it;
+# a stale cache is reused when offline rather than breaking the launch.
 # ─────────────────────────────────────────────────────────────────────────
-def geoip_mmdb_path(version: str = GEOIP_MMDB_VERSION) -> Path:
-    """Cache location for the extracted geoip mmdb."""
-    return cache_root() / "geoip" / version / GEOIP_MMDB_NAME
-def ensure_geoip_mmdb(version: str = GEOIP_MMDB_VERSION) -> Path:
-    """Return a path to the geoip mmdb, downloading + caching it if needed.
-    Set ``STEALTHFOX_GEOIP_MMDB`` to point at a user-supplied mmdb (or a test
-    fixture) to skip the download entirely. Otherwise the pinned weekly build
-    of ``daijro/geoip-all-in-one`` is fetched from GitHub Releases (public, no
-    token) into the user cache and unzipped once.
+GEOIP_REFRESH_DAYS = 7  # matches daijro's weekly rebuild cadence
+
+
+def _geoip_root() -> Path:
+    return cache_root() / "geoip"
+
+
+def _geoip_check_marker() -> Path:
+    return _geoip_root() / ".last_check"
+
+
+def _cached_geoip_mmdb() -> Path | None:
+    """Newest cached mmdb across tag dirs, or None. Tag dirs are date strings
+    (e.g. ``2026.06.03``) so a lexical sort is chronological."""
+    root = _geoip_root()
+    if not root.exists():
+        return None
+    cands = sorted(root.glob("*/*.mmdb"))
+    return cands[-1] if cands else None
+
+
+def _geoip_cache_fresh(max_age_days: int) -> bool:
+    marker = _geoip_check_marker()
+    if not marker.exists():
+        return False
+    return (time.time() - marker.stat().st_mtime) < max_age_days * 86400
+
+
+def _touch_geoip_marker() -> None:
+    m = _geoip_check_marker()
+    m.parent.mkdir(parents=True, exist_ok=True)
+    m.touch()
+
+
+def _latest_geoip_tag() -> str:
+    """Latest ``daijro/geoip-all-in-one`` release tag via the GitHub API."""
+    headers = {"Accept": "application/vnd.github+json"}
+    token = _github_token()
+    if token:
+        headers["Authorization"] = f"token {token}"
+    r = requests.get(
+        f"https://api.github.com/repos/{GEOIP_REPO}/releases/latest",
+        headers=headers, timeout=15,
+    )
+    r.raise_for_status()
+    tag = r.json().get("tag_name")
+    if not tag:
+        raise RuntimeError("no tag_name in geoip-all-in-one latest release")
+    return tag
+
+
+def _download_geoip_tag(tag: str) -> Path:
+    """Download + extract a specific tag's mmdb if not already cached."""
+    dst_dir = _geoip_root() / tag
+    target = dst_dir / GEOIP_MMDB_NAME
+    if not target.exists():
+        url = GEOIP_RELEASE_URL_TEMPLATE.format(tag=tag, asset=GEOIP_ASSET)
+        dst_dir.mkdir(parents=True, exist_ok=True)
+        with tempfile.TemporaryDirectory() as td:
+            archive = Path(td) / GEOIP_ASSET
+            _download_file(url, archive)
+            _extract(archive, dst_dir)
+    if target.exists():
+        return target
+    # asset name inside the zip may differ from GEOIP_MMDB_NAME
+    found = sorted(dst_dir.glob("*.mmdb"))
+    if found:
+        return found[0]
+    raise RuntimeError(f"geoip mmdb not found after extraction in {dst_dir}")
+
+
+def _prune_old_geoip_tags(keep: str) -> None:
+    """Drop every cached tag dir except ``keep`` to bound disk usage."""
+    root = _geoip_root()
+    if not root.exists():
+        return
+    for d in root.iterdir():
+        if d.is_dir() and d.name != keep:
+            shutil.rmtree(d, ignore_errors=True)
+
+
+def geoip_mmdb_path() -> Path | None:
+    """Path to the currently-cached mmdb (newest tag), or None if none cached."""
+    return _cached_geoip_mmdb()
+
+
+def ensure_geoip_mmdb(max_age_days: int = GEOIP_REFRESH_DAYS) -> Path:
+    """Return a geoip mmdb, kept fresh against daijro's weekly rebuild.
+    Resolution order:
+    1. ``STEALTHFOX_GEOIP_MMDB`` env → use that file (user-supplied / test).
+    2. A cached mmdb younger than ``max_age_days`` → use it (no network).
+    3. Else ask GitHub for the latest tag, download it if not already cached,
+       prune older tags, and reset the freshness timer.
+    4. If the API/download is unreachable but a cached mmdb exists → use it
+       (and reset the timer so we don't hammer the API while offline).
+    5. Cold cache + no network → fall back to the pinned ``GEOIP_MMDB_VERSION``;
+       if that download also fails, raise.
     """
     override = os.environ.get("STEALTHFOX_GEOIP_MMDB")
     if override:
         p = Path(override)
         if not p.exists():
-            raise RuntimeError(
-                f"STEALTHFOX_GEOIP_MMDB points to a missing file: {p}"
-            )
+            raise RuntimeError(f"STEALTHFOX_GEOIP_MMDB points to a missing file: {p}")
         return p
-    dst = geoip_mmdb_path(version)
-    if dst.exists():
-        return dst
-    url = GEOIP_RELEASE_URL_TEMPLATE.format(tag=version, asset=GEOIP_ASSET)
-    dst.parent.mkdir(parents=True, exist_ok=True)
-    with tempfile.TemporaryDirectory() as td:
-        archive = Path(td) / GEOIP_ASSET
-        _download_file(url, archive)
-        _extract(archive, dst.parent)
-    if dst.exists():
-        return dst
-    # The asset name inside the zip may differ from GEOIP_MMDB_NAME — fall
-    # back to the first .mmdb the archive produced.
-    candidates = sorted(dst.parent.glob("*.mmdb"))
-    if candidates:
-        return candidates[0]
-    raise RuntimeError(f"geoip mmdb not found after extraction in {dst.parent}")
+    cached = _cached_geoip_mmdb()
+    if cached and _geoip_cache_fresh(max_age_days):
+        return cached
+    try:
+        tag = _latest_geoip_tag()
+    except Exception:
+        if cached:
+            _touch_geoip_marker()  # recheck after the window; don't hammer
+            return cached
+        tag = GEOIP_MMDB_VERSION  # cold cache + API down → pinned fallback
+    mmdb = _download_geoip_tag(tag)
+    _prune_old_geoip_tags(mmdb.parent.name)
+    _touch_geoip_marker()
+    return mmdb
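The freshness window and the "newest tag wins" rule in this hunk rest on two small ideas: a marker file whose mtime records the last check, and date-shaped tag names that sort chronologically as strings. The following is a minimal standalone sketch of both, assuming zero-padded `YYYY.MM.DD` tags; `marker_is_fresh`, `newest_tag`, and `demo` are hypothetical names, and the sketch works in a temporary directory rather than the real cache.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

REFRESH_DAYS = 7  # assumed to mirror GEOIP_REFRESH_DAYS


def marker_is_fresh(marker: Path, max_age_days: int = REFRESH_DAYS) -> bool:
    """True when `marker` exists and was touched less than `max_age_days` ago."""
    if not marker.exists():
        return False
    age_seconds = time.time() - marker.stat().st_mtime
    return age_seconds < max_age_days * 86400


def newest_tag(root: Path) -> Optional[str]:
    """Name of the lexically greatest tag directory, or None when there is none.

    Zero-padded date tags such as 2026.06.03 sort chronologically as strings;
    tags without zero padding would not.
    """
    tags = sorted(d.name for d in root.iterdir() if d.is_dir())
    return tags[-1] if tags else None


def demo() -> tuple:
    """Build a throwaway cache layout and exercise both helpers."""
    with tempfile.TemporaryDirectory() as td:
        root = Path(td)
        for tag in ("2026.05.27", "2026.06.03"):
            (root / tag).mkdir()
        marker = root / ".last_check"
        marker.touch()
        fresh_now = marker_is_fresh(marker)
        # Backdate the marker by 30 days to simulate a stale cache.
        old = time.time() - 30 * 86400
        os.utime(marker, (old, old))
        fresh_later = marker_is_fresh(marker)
        return newest_tag(root), fresh_now, fresh_later
```

Because only the marker's mtime is consulted, touching it after a failed API call (as the offline branch does) postpones the next check by a full window without changing which database is served.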
+7 -6
@@ -137,12 +137,13 @@ class InvisiblePlaywright:
         locale: BCP-47 tag (e.g. ``"en-US"``). Drives the
             ``Accept-Language`` header and ``navigator.language``.
         timezone: IANA zone (e.g. ``"America/New_York"``) — used as-is
-            when set. ``""`` (default) or ``"auto"`` resolves the zone
-            from the proxy egress IP when a proxy is set (one lookup
-            through the proxy + an offline mmdb), otherwise the host TZ.
-            ``"host"`` / ``"local"`` forces the host TZ even behind a
-            proxy. With a proxy, an unresolvable zone raises rather than
-            silently falling back to the host TZ (``timezone_mismatch``).
+            when set, the only way to force a specific zone. ``""``
+            (default) or ``"auto"`` ALWAYS resolves from the egress IP:
+            through the proxy when one is set, otherwise from the host's
+            own public IP (one lookup + an offline mmdb). On failure: with
+            a proxy it raises (a foreign proxy on the host TZ is the
+            ``timezone_mismatch`` signal); without a proxy it falls back to
+            the host TZ so a transient lookup failure can't break launch.
         extra_prefs: Optional dict of Firefox prefs overlayed on top
             of the generated profile — useful for niche tweaks
             without monkey-patching the package.
+56 -31
@@ -136,6 +136,20 @@ def test_discover_egress_ip_all_fail_raises(monkeypatch):
         discover_egress_ip(SOCKS)
+@pytest.mark.unit
+def test_discover_egress_ip_no_proxy_is_direct(monkeypatch):
+    # proxy=None → direct request, requests.get must get proxies=None.
+    seen = {}
+    def fake_get(url, **kw):
+        seen["proxies"] = kw.get("proxies", "MISSING")
+        return _FakeResp("192.0.2.55")
+    monkeypatch.setattr(_geo.requests, "get", fake_get)
+    assert discover_egress_ip(None) == "192.0.2.55"
+    assert seen["proxies"] is None
 # ──────────────────────────────────────────────────────────────────────
 # ip_to_timezone — mocked mmdb reader
 # ──────────────────────────────────────────────────────────────────────
@@ -194,8 +208,9 @@ def stub_egress(monkeypatch):
     """Make egress resolution deterministic + offline; record if it ran."""
     state = {"called": False}
-    def fake_discover(proxy, **kw):
+    def fake_discover(proxy=None, **kw):
         state["called"] = True
+        state["proxy_arg"] = proxy
         return "203.0.113.7"
     monkeypatch.setattr(_geo, "discover_egress_ip", fake_discover)
@@ -208,56 +223,66 @@ def stub_egress(monkeypatch):
 @pytest.mark.unit
-@pytest.mark.parametrize("sentinel", ["host", "local", "HOST", "Local"])
-def test_resolve_host_sentinel_forces_host_tz(sentinel, stub_egress):
-    # Even with a proxy set, "host"/"local" force the host TZ and never resolve.
-    assert resolve_session_timezone(sentinel, SOCKS) == ""
-    assert stub_egress["called"] is False
-@pytest.mark.unit
-def test_resolve_explicit_iana_wins_over_proxy(stub_egress):
+def test_resolve_explicit_iana_wins(stub_egress):
+    # An explicit zone wins and never triggers resolution (proxy or not).
     assert resolve_session_timezone("Asia/Tokyo", SOCKS) == "Asia/Tokyo"
-    assert stub_egress["called"] is False  # no resolution when explicit
-@pytest.mark.unit
-def test_resolve_empty_no_proxy_is_host(stub_egress):
-    assert resolve_session_timezone("", None) == ""
+    assert resolve_session_timezone("Asia/Tokyo", None) == "Asia/Tokyo"
     assert stub_egress["called"] is False
 @pytest.mark.unit
-def test_resolve_auto_no_proxy_is_host(stub_egress):
-    assert resolve_session_timezone("auto", None) == ""
-    assert stub_egress["called"] is False
-@pytest.mark.unit
-def test_resolve_empty_with_proxy_defaults_to_auto(stub_egress):
-    # NEW default: a proxy with no timezone auto-resolves from the egress.
+def test_resolve_empty_with_proxy_resolves_from_proxy(stub_egress):
     assert resolve_session_timezone("", SOCKS) == "America/New_York"
     assert stub_egress["called"] is True
+    assert stub_egress["proxy_arg"] == SOCKS  # routed through the proxy
 @pytest.mark.unit
-def test_resolve_auto_with_proxy_resolves(stub_egress):
+def test_resolve_auto_with_proxy_resolves_from_proxy(stub_egress):
     assert resolve_session_timezone("auto", HTTP) == "America/New_York"
+    assert stub_egress["proxy_arg"] == HTTP
+@pytest.mark.unit
+def test_resolve_empty_no_proxy_resolves_from_host(stub_egress):
+    # auto ALWAYS resolves — without a proxy, from the host's own public IP.
+    assert resolve_session_timezone("", None) == "America/New_York"
     assert stub_egress["called"] is True
+    assert stub_egress["proxy_arg"] is None  # direct request, no proxy
 @pytest.mark.unit
-def test_resolve_direct_proxy_treated_as_no_proxy(stub_egress):
-    assert resolve_session_timezone("auto", {"server": "direct://"}) == ""
-    assert stub_egress["called"] is False
+def test_resolve_auto_no_proxy_resolves_from_host(stub_egress):
+    assert resolve_session_timezone("auto", None) == "America/New_York"
+    assert stub_egress["proxy_arg"] is None
 @pytest.mark.unit
-def test_resolve_fail_early_propagates(monkeypatch):
-    # With a proxy set, a discovery failure must raise — never silent host TZ.
-    def boom(proxy, **kw):
+def test_resolve_direct_proxy_resolves_via_host(stub_egress):
+    # direct:// counts as "no proxy" → resolve from the host IP, don't skip.
+    assert resolve_session_timezone("auto", {"server": "direct://"}) == "America/New_York"
+    assert stub_egress["proxy_arg"] is None
+@pytest.mark.unit
+def test_resolve_no_proxy_failure_falls_back_to_host(monkeypatch):
+    # Without a proxy, a lookup failure must NOT break the launch → host TZ ("").
+    def boom(proxy=None, **kw):
+        raise GeoTimezoneError("offline")
+    monkeypatch.setattr(_geo, "discover_egress_ip", boom)
+    assert resolve_session_timezone("auto", None) == ""
+    assert resolve_session_timezone("", None) == ""
+@pytest.mark.unit
+def test_resolve_proxy_failure_raises(monkeypatch):
+    # With a proxy set, a failure must raise — never a silent host-TZ fallback.
+    def boom(proxy=None, **kw):
         raise GeoTimezoneError("no egress")
     monkeypatch.setattr(_geo, "discover_egress_ip", boom)
     with pytest.raises(GeoTimezoneError):
         resolve_session_timezone("auto", SOCKS)
+    with pytest.raises(GeoTimezoneError):
+        resolve_session_timezone("", SOCKS)
+131
@@ -0,0 +1,131 @@
+"""Unit tests for the intelligent geoip mmdb auto-update in `download.py`.
+daijro/geoip-all-in-one rebuilds weekly; `ensure_geoip_mmdb` keeps the cache
+fresh without a download (or API call) on every launch. These tests mock the
+cache root, the latest-tag API, and the per-tag download so nothing touches the
+network.
+"""
+import os
+import time
+import pytest
+import invisible_playwright.download as dl
+
+
+@pytest.fixture
+def cache(tmp_path, monkeypatch):
+    """Point the cache at tmp_path and clear the env override."""
+    monkeypatch.setattr(dl, "cache_root", lambda: tmp_path)
+    monkeypatch.delenv("STEALTHFOX_GEOIP_MMDB", raising=False)
+    return tmp_path
+
+
+def _make_cached(root, tag, name=dl.GEOIP_MMDB_NAME):
+    d = root / "geoip" / tag
+    d.mkdir(parents=True, exist_ok=True)
+    f = d / name
+    f.write_bytes(b"FAKE-MMDB")
+    return f
+
+
+def _set_marker_age(root, days):
+    m = root / "geoip" / ".last_check"
+    m.parent.mkdir(parents=True, exist_ok=True)
+    m.touch()
+    old = time.time() - days * 86400
+    os.utime(m, (old, old))
+
+
+# ──────────────────────────────────────────────────────────────────────
+# env override
+# ──────────────────────────────────────────────────────────────────────
+@pytest.mark.unit
+def test_env_override_returns_file(tmp_path, monkeypatch):
+    f = tmp_path / "mine.mmdb"
+    f.write_bytes(b"X")
+    monkeypatch.setenv("STEALTHFOX_GEOIP_MMDB", str(f))
+    assert dl.ensure_geoip_mmdb() == f
+
+
+@pytest.mark.unit
+def test_env_override_missing_raises(tmp_path, monkeypatch):
+    monkeypatch.setenv("STEALTHFOX_GEOIP_MMDB", str(tmp_path / "nope.mmdb"))
+    with pytest.raises(RuntimeError):
+        dl.ensure_geoip_mmdb()
+
+
+# ──────────────────────────────────────────────────────────────────────
+# freshness window
+# ──────────────────────────────────────────────────────────────────────
+@pytest.mark.unit
+def test_fresh_cache_no_network(cache, monkeypatch):
+    f = _make_cached(cache, "2026.06.03")
+    _set_marker_age(cache, 0)  # just checked
+    def boom():
+        raise AssertionError("latest-tag API must NOT be called within the window")
+    monkeypatch.setattr(dl, "_latest_geoip_tag", boom)
+    assert dl.ensure_geoip_mmdb(max_age_days=7) == f
+
+
+@pytest.mark.unit
+def test_stale_same_tag_no_download(cache, monkeypatch):
+    f = _make_cached(cache, "2026.06.03")
+    _set_marker_age(cache, 30)  # stale → will re-check
+    monkeypatch.setattr(dl, "_latest_geoip_tag", lambda: "2026.06.03")
+    # real _download_geoip_tag runs but target exists, so no actual download:
+    monkeypatch.setattr(dl, "_download_file", lambda *a, **k: (_ for _ in ()).throw(
+        AssertionError("must not download when tag already cached")))
+    assert dl.ensure_geoip_mmdb(max_age_days=7) == f
+
+
+@pytest.mark.unit
+def test_stale_new_tag_downloads_and_prunes(cache, monkeypatch):
+    old = _make_cached(cache, "2026.06.03")
+    _set_marker_age(cache, 30)
+    monkeypatch.setattr(dl, "_latest_geoip_tag", lambda: "2026.06.10")
+    def fake_download(tag):
+        return _make_cached(cache, tag)  # simulate fetch+extract of the new tag
+    monkeypatch.setattr(dl, "_download_geoip_tag", fake_download)
+    got = dl.ensure_geoip_mmdb(max_age_days=7)
+    assert got.parent.name == "2026.06.10"
+    assert not old.parent.exists()  # old tag pruned
+    assert got.exists()
+
+
+# ──────────────────────────────────────────────────────────────────────
+# offline resilience
+# ──────────────────────────────────────────────────────────────────────
+@pytest.mark.unit
+def test_api_down_with_cache_uses_cache(cache, monkeypatch):
+    f = _make_cached(cache, "2026.06.03")
+    _set_marker_age(cache, 30)
+    def boom():
+        raise OSError("offline")
+    monkeypatch.setattr(dl, "_latest_geoip_tag", boom)
+    assert dl.ensure_geoip_mmdb(max_age_days=7) == f  # stale cache reused, no raise
+
+
+@pytest.mark.unit
+def test_cold_cache_api_down_falls_back_to_pinned(cache, monkeypatch):
+    # no cache at all + API unreachable → pinned GEOIP_MMDB_VERSION fallback.
+    def boom():
+        raise OSError("offline")
+    monkeypatch.setattr(dl, "_latest_geoip_tag", boom)
+    captured = {}
+    def fake_download(tag):
+        captured["tag"] = tag
+        return _make_cached(cache, tag)
+    monkeypatch.setattr(dl, "_download_geoip_tag", fake_download)
+    got = dl.ensure_geoip_mmdb(max_age_days=7)
+    assert captured["tag"] == dl.GEOIP_MMDB_VERSION
+    assert got.exists()
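`test_stale_same_tag_no_download` replaces `_download_file` with a lambda that must fail if it is ever called. A `lambda` body is a single expression and `raise` is a statement, so the test throws into an empty generator instead. Below is a minimal sketch of that idiom next to a plain helper that may read more easily; `raiser`, `fail_if_called`, and `outcome` are hypothetical names used only for illustration.

```python
def raiser(exc: BaseException):
    """Return a callable that raises `exc` whenever it is invoked."""
    def _raise(*args, **kwargs):
        raise exc
    return _raise


# Lambda form, as in the test: throwing into a generator that has not started
# raises the exception straight back to the caller.
fail_if_called = lambda *a, **k: (_ for _ in ()).throw(
    AssertionError("must not be called")
)


def outcome(fn) -> str:
    """Call `fn` like a download helper and report what happened."""
    try:
        fn("url", "dest")
    except AssertionError:
        return "AssertionError"
    return "no error"
```

Either form works with `monkeypatch.setattr`; the helper keeps the traceback pointing at an explicit `raise`, which can make a failing test easier to read.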