# Hide IP plugin for DokuWiki

Removes IP addresses from everywhere a DokuWiki admin might see them. Two parts work together:

- **Real-time anonymisation**: an action plugin that rewrites `$_SERVER['REMOTE_ADDR']` to `0.0.0.0` and clears every common forwarding header before any DokuWiki code reads them. From the moment the plugin is enabled, every new changelog entry, page metadata write, mailer header, and `$INFO['client']` value carries `0.0.0.0` instead of the real address.
- **Historical scrub**: an admin page that walks the IP-bearing files DokuWiki already has on disk (changelogs and page metadata) and rewrites them. Authorship (user field) and modification timestamps are preserved; only the IP field changes.

Tested on DokuWiki `2025-05-14b "Librarian"`.

## Why not the existing plugins?

There are three related plugins on dokuwiki.org. None of them quite fits this use case.

**`anonip`** (Andreas Gohr, 2016, [dokuwiki.org/plugin:anonip](https://www.dokuwiki.org/plugin:anonip)) was the canonical answer. It generates a pseudo-IPv6 from `auth_browseruid()`. The problem, [as discussed on the DokuWiki forum](https://forum.dokuwiki.org/d/18284-dont-store-ip-addresses), is that `auth_browseruid()` mixes in the first IPv4 octet of the client address, which makes the hash brute-forceable: an attacker iterating through ~256 octet values and a list of common User-Agent strings can recover the user's first IP octet and User-Agent. A community [fork by adakaleh](https://github.com/adakaleh/dokuwiki-plugin-anonip/tree/replace_auth_browseruid) swaps in `session_id()` instead. That is better, but it adds session-stability assumptions and still produces a pseudo-IP for an admin to see when usernames already differentiate edits.

**`gdpr`** (Michael Große, 2019, [dokuwiki.org/plugin:gdpr](https://www.dokuwiki.org/plugin:gdpr)) takes a different approach: it lets real IPs be recorded as users edit, then strips them from changelog entries older than `$conf['recent_days']` (default 90 days). This works for GDPR retention but doesn't fit a "no IPs anywhere, ever" policy, and it leaves IPs visible in the Recent Changes view for up to three months. It also touches only changelog files and skips the master changelog (`_dokuwiki.changes`) entirely, leaving IPs visible in Recent Changes after old per-page entries are cleaned.

**This plugin** combines what works from both: anonip's real-time interception (simplified to a constant `0.0.0.0` since the wiki has named users) plus a one-shot historical scrub that does cover the master changelogs and page metadata.

## What gets anonymised

### In real time (action component)

`$_SERVER['REMOTE_ADDR']` becomes `0.0.0.0`. These forwarding headers are removed: `HTTP_X_FORWARDED_FOR`, `HTTP_X_REAL_IP`, `HTTP_CLIENT_IP`, `HTTP_FORWARDED`, `HTTP_CF_CONNECTING_IP`, `HTTP_TRUE_CLIENT_IP`. The User-Agent is left alone (it's not an IP).

The hook fires on `INIT_LANG_LOAD`, after DokuWiki's `init.php` has applied any trusted-proxy `X-Forwarded-For` rewriting, but before any other code reads the IP. Every downstream consumer (`clientIP()`, `$INPUT->server->str('REMOTE_ADDR')`, changelog writers, metadata, mailer `X-Originating-IP`, AJAX page-info) sees the placeholder.
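
The rewrite described above can be sketched as a plain function over a `$_SERVER`-style array. This is a minimal illustration, not the plugin's actual code: the function and constant names are hypothetical, and only the header list and the placeholder value come from this README.

```php
<?php
// Hypothetical helper, not taken from the plugin source: returns a copy of a
// $_SERVER-style array with the client address replaced by the placeholder
// and the forwarding headers removed.
const HIDEIP_PLACEHOLDER = '0.0.0.0';

const HIDEIP_FORWARDING_HEADERS = [
    'HTTP_X_FORWARDED_FOR',
    'HTTP_X_REAL_IP',
    'HTTP_CLIENT_IP',
    'HTTP_FORWARDED',
    'HTTP_CF_CONNECTING_IP',
    'HTTP_TRUE_CLIENT_IP',
];

function hideip_anonymise_server(array $server): array
{
    // Overwrite the address every downstream consumer would read.
    $server['REMOTE_ADDR'] = HIDEIP_PLACEHOLDER;

    // Drop proxy headers that could still carry the real address.
    foreach (HIDEIP_FORWARDING_HEADERS as $header) {
        unset($server[$header]);
    }

    // Everything else, including HTTP_USER_AGENT, is left untouched.
    return $server;
}
```

In an action plugin, logic like this would presumably run from the `INIT_LANG_LOAD` handler and be applied to `$_SERVER` itself.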

### On disk (admin component, on demand)

| File pattern | What it is | What changes |
| --- | --- | --- |
| `data/meta/**/*.changes` | Per-page change history; master `_dokuwiki.changes` | TSV field 2 (IP) rewritten to `0.0.0.0` |
| `data/media_meta/**/*.changes` | Per-media change history; master | Same |
| `data/meta/**/*.meta` | Page metadata (serialized) | `$meta[*]['last_change']['ip']` rewritten to `0.0.0.0` |

Timestamps, page IDs, usernames, summaries, and size deltas are preserved verbatim. File mtimes are restored after each write.

**Out of scope by design:**

- `data/attic/` and `data/media_attic/`: historical gzip archives of page revisions. Admins generally don't view these; the project's filesystem owner has access to logs anyway. Rewriting them would be slow and require gzip handling for marginal benefit.
- `data/cache/`, `data/tmp/`, `data/log/`, `data/locks/`: ephemeral or regenerated.

## Why `127.0.0.1` shows up (and why it's left alone)

You may notice `127.0.0.1` in changelog entries (often summarised `external edit`) or in a page's `last_change` metadata, even with real-time anonymisation active and after a scrub. **This is not a real visitor IP and nothing is leaking.**

`127.0.0.1` is a value DokuWiki **hardcodes itself** (`inc/ChangeLog/ChangeLog.php`) as its "external edit" marker. Core stamps it whenever a page's on-disk `.txt` modification time no longer matches its changelog, i.e. the file was created or edited directly on disk rather than through the wiki. This is common in container / bind-mounted setups (volume operations, restores, `git` checkouts, an editor touching a file) and also applies to pages DokuWiki ships without a changelog, such as `wiki:syntax`.

Because it's a literal written by core, not derived from `$_SERVER`, the real-time action component cannot intercept it, and core re-creates it on the next view (into page metadata via `pageinfo()`) and on the next save (into the changelog via `detectExternalEdit()`). Scrubbing it would therefore be an endless treadmill, and it's a loopback address that identifies no one.

So **the preview and scrub deliberately ignore `127.0.0.1`** (alongside the `0.0.0.0` placeholder and already-blank values): it is never counted and never rewritten. Real visitor IPs are still anonymised to `0.0.0.0` exactly as before; only this benign loopback marker is left as-is.
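
The skip rule can be illustrated on a single changelog line. This is a hedged sketch rather than the plugin's implementation: the function and constant names are hypothetical, and the only things taken from this README are the IP's position (TSV field 2) and the three ignored values.

```php
<?php
// Hypothetical sketch, not taken from the plugin source.
// Values the scrub leaves alone: blank, the placeholder itself, and core's
// hardcoded "external edit" loopback marker.
const HIDEIP_SKIP_VALUES = ['', '0.0.0.0', '127.0.0.1'];

/**
 * Rewrite the IP field (TSV field 2) of one changelog line.
 * Expects the line without its trailing newline and returns it unchanged
 * when there is nothing to anonymise.
 */
function hideip_scrub_changelog_line(string $line): string
{
    $fields = explode("\t", $line);

    // Malformed line, or a value that is deliberately ignored.
    if (count($fields) < 2 || in_array($fields[1], HIDEIP_SKIP_VALUES, true)) {
        return $line;
    }

    // Only the IP field changes; all other fields pass through as they were.
    $fields[1] = '0.0.0.0';
    return implode("\t", $fields);
}
```

Because a line that already holds an ignored value comes back byte-for-byte identical, a caller can compare input and output to decide whether a file needs rewriting at all.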

## Page locking still works

For each lock, DokuWiki writes the username if one is set, otherwise IP and session-id. From `inc/common.php`:

```php
if ($INPUT->server->str('REMOTE_USER')) {
    io_saveFile($lock, $INPUT->server->str('REMOTE_USER'));
} else {
    io_saveFile($lock, clientIP() . "\n" . session_id());
}
```

If your wiki has no anonymous edits (the use case this plugin is built for), every lock uses the username and IP isn't consulted. Even if you do allow anonymous edits, `unlock()` checks `session_id()` as a fallback, so a constant `0.0.0.0` still releases your own lock correctly.

## Install

In your wiki:

1. **Admin → Extension Manager → Manual Install**
2. Upload `hideip.zip`, click **Install**
3. Real-time anonymisation is now active. To scrub existing data: refresh the Admin page and open **Hide IP** under Additional Plugins.

Or drop the directory into `lib/plugins/hideip/` directly.

## Run the historical scrub

1. **Admin → Hide IP**
2. Click **Preview (count only)** to see how many files and IP slots would be rewritten.
3. Click **Scrub now** to execute. The scrub is atomic per file (tmp file + rename) and preserves mtimes. It's also idempotent: running it again is a no-op once everything is at the placeholder.

Take a backup with the Site Backup plugin first if you want a recovery point; the scrub is intentionally destructive.

## Security model

- **Admin-only.** `forAdminOnly() = true`, plus an explicit `auth_isadmin()` check inside the scrub method.
- **CSRF-protected.** All actions go through `checkSecurityToken()`.
- **POST-only scrub.** The scrub action rejects GET / HEAD so a stray link or prefetch can't trigger it.
- **Atomic writes.** Every file write goes through a sibling `.hideip_tmp_<8 hex>` file and is `rename()`d into place. A concurrent reader sees either the old file or the new file, never a half-written state.
- **File mtimes preserved.** The on-disk modification time of each file is restored after the scrub, so it doesn't look like everything was just edited.
- **Idempotent.** Re-running the scrub is safe: already-anonymised entries are skipped, so there are no double-writes.
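
The atomic-write and mtime-preservation points can be sketched together. This is an illustration under stated assumptions, not the plugin's code: the function name, the exact temp-file naming (appended to the target path here), and the error handling are all hypothetical.

```php
<?php
// Hypothetical sketch of the tmp-file-plus-rename pattern; not taken from
// the plugin source.
function hideip_atomic_write(string $path, string $contents): bool
{
    // Remember the original mtime (false if the file does not exist yet).
    $mtime = @filemtime($path);

    // Temp file in the same directory, so rename() stays on one filesystem.
    // random_bytes(4) yields 8 hex characters.
    $tmp = $path . '.hideip_tmp_' . bin2hex(random_bytes(4));

    if (file_put_contents($tmp, $contents, LOCK_EX) === false) {
        @unlink($tmp); // clean up a partially written temp file
        return false;
    }

    // Swap the new file into place in one step.
    if (!rename($tmp, $path)) {
        @unlink($tmp);
        return false;
    }

    // Restore the original modification time, if there was one.
    if ($mtime !== false) {
        touch($path, $mtime);
    }
    return true;
}
```

On POSIX filesystems a same-directory `rename()` replaces the target in a single step, which is what lets a concurrent reader see only the old or the new file.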

## Trusted-proxy caveat

The action component replaces `$_SERVER['REMOTE_ADDR']` at `INIT_LANG_LOAD`, which fires after DokuWiki's base-URL constants (`DOKU_URL`, `DOKU_BASE`) are already frozen into place (`init.php:103-104`). However, runtime calls to `is_ssl()` after init, which some URL-building helpers use, also check `REMOTE_ADDR` against `$conf['trustedproxy']` to decide whether to trust an `HTTP_X_FORWARDED_PROTO` header. If your wiki is behind a reverse proxy that relies on this check, set `$conf['baseurl']` explicitly in `conf/local.php` so DokuWiki does not consult `REMOTE_ADDR` for URL or SSL construction.
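
As a rough example, the explicit setting might look like the fragment below. The hostname is a placeholder, not a value from this project; use your wiki's public scheme and host.

```php
<?php
// conf/local.php
// Example value only (assumption): replace with your wiki's public URL,
// including the protocol.
$conf['baseurl'] = 'https://wiki.example.org';
```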

## Compatibility

- DokuWiki `2025-05-14b "Librarian"` and onwards (uses the modern namespaced `dokuwiki\Extension\AdminPlugin` / `ActionPlugin` base classes).
- PHP 8.0+ (`str_ends_with()` used in the admin component).
- No external dependencies.

## License

GPL-2.0-or-later, matching DokuWiki itself.