tailscale-custom/PRODUCTION_ROADMAP.md
huanld 2fb067ecbf
feat: security hardening, production roadmap, admin panel v1
Client security fixes (cmd/tailscale-tray/main.go):
- SSRF protection in Add Server dialog (validateControlURL): reject
  private/loopback/link-local/cloud-metadata IPs via DNS resolution
- RCE gate on AuthURL/BrowseToURL exec paths (validateAuthURL)
- Sanitized URL logging (sanitizeURLForLog drops query auth tokens)
- Error handling on exec.Command with user-facing showError()

Admin panel security (web-admin):
- Bcrypt password hashing (replaces SHA256)
- Rate limiting: 5 failed logins → 15-min lockout
- Session + login attempt cleanup goroutine (hourly)
- url.QueryEscape / encodeURIComponent for all API params
- Fail-hard startup when no TLS and non-loopback bind
- ADMIN_PASSWORD required (no default), password min 12 chars
- Username regex whitelist

Installer hardening (Setup.wxs):
- util:PermissionEx restricts SCM access: only Administrators +
  SYSTEM can start/stop/reconfigure service. Authenticated Users
  limited to QueryStatus/QueryConfig/Interrogate
- Vital="yes" on ServiceInstall

Docs & roadmap:
- PRODUCTION_ROADMAP.md: 5-milestone plan (security + features +
  distribution + ops) with granular tasks, effort, done-when
- CLIENT_SECURITY_AUDIT.md, SECURITY_FIXES.md, DEPLOYMENT.md
- AI assistant rules (.cursorrules, .antigravityrules, etc.)

Build & distribution:
- build-msi.ps1, deploy-and-sign.ps1, sign-release.ps1
- redeploy.ps1, tray-deploy.ps1, test-msi.ps1
- installer/msi/ alternative WXS setup
- Restored .github/workflows/ removed in mirror cleanup

.gitignore hardened: *.pfx, *.p12, *.key, *.pem, .env*
2026-04-22 15:18:11 +07:00

# 🚀 Production Roadmap — Tailscale Custom
**Goal:** Take Tailscale Custom from its current state (functional prototype) to a **production-ready v1.0**, covering **security** + **feature completeness** + **distribution**.
**Commitments once complete:**
- ✅ The `vpn.softs.business` server runs stably 24/7 and can be monitored
- ✅ The client MSI is code-signed; end users install it and use it right away, with no support needed
- ✅ The admin panel has all the features needed for real-world operation (user, node, route, key, audit)
- ✅ No critical security vulnerabilities remain
- ✅ A runbook exists for incidents and rollback
**How to use this roadmap:**
- Tasks are marked `☐`; change to `☑` when done, and commit right after each task
- Each task has a priority (`P0`/`P1`/`P2`) and an effort estimate
- Work through the milestones in order (M1 → M2 → M3...). Within a milestone, you can pick and choose by priority.
**Effort scale:**
- **S** = ≤ 30 minutes
- **M** = 30 minutes to 2 hours
- **L** = 2 to 6 hours
- **XL** = 1 to 3 days
---
## 📊 Table of Contents
- [M0 — Current State (Audit)](#m0--current-state-audit)
- [M1 — Production Blockers (~8h)](#m1--production-blockers-8h)
- [M2 — Core Features for v1.0 (~20h)](#m2--core-features-for-v10-20h)
- [M3 — Distribution Ready (~5h)](#m3--distribution-ready-5h)
- [M4 — Operations & Monitoring (~3h + ongoing)](#m4--operations--monitoring-3h--ongoing)
- [M5 — Future Enhancements (optional)](#m5--future-enhancements-optional)
- [Suggested Timeline](#suggested-timeline)
- [Rollback & Troubleshooting](#rollback--troubleshooting)
---
## M0 — Current State (Audit)
### Tray App (Windows Client)
| Feature | Status | Notes |
|---|---|---|
| Connect / Disconnect | ✅ | Via the IPN bus |
| Login (browser auth) | ✅ | New SSRF + RCE protection |
| Add Server dialog | ✅ | Uses validateControlURL |
| Peer list + copy IP | ✅ | |
| Account switching | ✅ | Multiple profiles |
| Shield icon 3-state | ✅ | green/yellow/gray |
| Single-instance lock | ✅ | Mutex |
| Exit node UI | ❌ | **Missing (P1)** |
| Subnet router UI | ❌ | **Missing (P2)** |
| Settings dialog | ❌ | **Missing (P1)** |
| Notifications | ❌ | **Missing (P1)** |
| Update checker | ❌ | **Missing (P1)** |
| Taildrop UI | ❌ | P2 |
| SSH status | ❌ | P2 |
### Admin Panel
| Feature | Status | Notes |
|---|---|---|
| Login + rate limiting | ✅ | bcrypt, 5 attempts → 15 min lock |
| Password change | ✅ | |
| User CRUD | ✅ | Login account + Headscale user |
| Node management | ✅ | List/delete/expire/rename |
| Preauth keys create + list | ✅ | |
| Routes approve/disable | ✅ | |
| Basic dashboard metrics | ✅ | Counts only |
| Downloads section | ✅ | Serves the MSI from /data/downloads |
| Dark theme | ✅ | |
| TLS + security headers | ⚠️ | Fail-hard on no-TLS is done; headers are in progress in M1 |
| Audit log | ❌ | **Missing (P0)** |
| Pagination | ❌ | **Missing (P1)** |
| i18n Vietnamese | ❌ | **Missing (P1)** |
| Mobile responsive | ❌ | P2 |
| Revoke preauth key | ❌ | **Missing (P1)** |
| MFA / 2FA | ❌ | P2 |
| ACL editor | ❌ | P2 |
| DNS config UI | ❌ | P2 |
### Build & Distribution
| Feature | Status | Notes |
|---|---|---|
| WiX v5 MSI | ✅ | Service + tray + registry |
| Code signing | ✅ | .pfx exists but needs to be moved |
| Service SCM ACL | ✅ | util:PermissionEx, newly added |
| Signed binaries | ⚠️ | The current script hardcodes the path |
| Version management | ❌ | **Hardcoded 1.0.0.0 (P1)** |
| Silent install docs | ❌ | P1 |
| Vietnamese installer UI | ❌ | P2 |
| Auto-update | ❌ | P1 |
| Linux packages | ❌ | P2 |
---
## M1 — Production Blockers (~8h)
**Completing this phase means you can safely deploy to a small group (internal users).**
### M1.1 Remaining Code Security Fixes
#### ☐ `1.1.1` Security Response Headers for the Admin Panel **[P0, S]**
**File:** [web-admin/main.go](web-admin/main.go) (around line 245)
Add the middleware `securityHeaders(mux, tlsCertFile != "")` before listening. See the snippet:
```go
// securityHeaders wraps h and sets the security response headers on every reply.
// HSTS is only sent when the server itself terminates TLS.
func securityHeaders(h http.Handler, isTLS bool) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Content-Type-Options", "nosniff")
		w.Header().Set("X-Frame-Options", "DENY")
		w.Header().Set("Referrer-Policy", "no-referrer")
		w.Header().Set("Permissions-Policy", "geolocation=(), microphone=(), camera=()")
		w.Header().Set("Content-Security-Policy",
			"default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; "+
				"img-src 'self' data:; connect-src 'self'; frame-ancestors 'none'; "+
				"base-uri 'self'; form-action 'self'")
		if isTLS {
			w.Header().Set("Strict-Transport-Security", "max-age=31536000; includeSubDomains")
		}
		h.ServeHTTP(w, r)
	})
}
```
Wrap the mux: `handler := securityHeaders(mux, tlsCertFile != "")`, then pass `handler` to `ListenAndServe*`.
**Done when:** `curl -I https://vpn.softs.business/admin/` returns these 4 headers: CSP, HSTS, X-Frame-Options, X-Content-Type-Options.
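The same check can be automated. The following is a minimal sketch, not code from the repository: `missingSecurityHeaders` and `requiredHeaders` are hypothetical names, and the sample headers in `main` are simulated rather than fetched from the server.

```go
package main

import (
	"fmt"
	"net/http"
)

// requiredHeaders lists the four headers named in the done-when check.
var requiredHeaders = []string{
	"Content-Security-Policy",
	"Strict-Transport-Security",
	"X-Frame-Options",
	"X-Content-Type-Options",
}

// missingSecurityHeaders is a hypothetical helper: it returns the required
// headers that are absent from h, in the order listed above.
func missingSecurityHeaders(h http.Header) []string {
	var missing []string
	for _, name := range requiredHeaders {
		if h.Get(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Simulated response headers; a real check would take them from the
	// response of an HTTP HEAD request to the admin panel URL.
	h := http.Header{}
	h.Set("X-Frame-Options", "DENY")
	h.Set("X-Content-Type-Options", "nosniff")
	fmt.Println("missing:", missingSecurityHeaders(h))
}
```

An empty result means all four headers are present; anything else names the headers still to be added.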
#### ☐ `1.1.2` Remove Debug Panic Trigger **[P0, S]**
**File:** [control/controlclient/direct.go](control/controlclient/direct.go:365) and :506
The two `if strings.Contains(opts.ServerURL, "vpn.softs.business") && envknob.Bool("TS_PANIC_IF_HIT_MAIN_CONTROL")` blocks were copied from upstream Tailscale, but their meaning is inverted. Fix:
```go
// Change the condition to match the upstream domain (so dev builds cannot accidentally hit upstream)
if strings.Contains(opts.ServerURL, "controlplane.tailscale.com") && envknob.Bool("TS_PANIC_IF_HIT_UPSTREAM") {
	panic("accidentally hit upstream Tailscale in dev")
}
```
**Done when:** `grep -rn "TS_PANIC_IF_HIT_MAIN_CONTROL" control/` returns nothing.
#### ☐ `1.1.3` Deduplicate the Whitelist via DefaultControlURL **[P1, S]**
**File:** [cmd/tailscale-tray/main.go](cmd/tailscale-tray/main.go), in `validateAuthURL()`
```go
// Replace the hardcoded "vpn.softs.business" with the host parsed from ipn.DefaultControlURL
allowedDomains := map[string]bool{}
if u, err := url.Parse(ipn.DefaultControlURL); err == nil && u.Host != "" {
	allowedDomains[u.Host] = true
}
```
**Done when:** changing `DefaultControlURL` in `ipn/prefs.go` to another domain makes the tray app follow automatically, with no other edits.
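To illustrate the idea end to end, here is a self-contained sketch. `authURLAllowed` is a hypothetical stand-in for the host check inside `validateAuthURL`, and the constant in `main` only stands in for `ipn.DefaultControlURL`. Unlike the snippet above, it compares `Hostname()` (without port) and ignores case; that is an assumption of this sketch, not the repository's behavior.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// authURLAllowed is a hypothetical helper: it accepts authURL only if it is
// HTTPS and its hostname equals the hostname of defaultControlURL.
func authURLAllowed(defaultControlURL, authURL string) bool {
	def, err := url.Parse(defaultControlURL)
	if err != nil || def.Hostname() == "" {
		return false
	}
	u, err := url.Parse(authURL)
	if err != nil || u.Scheme != "https" {
		return false
	}
	return strings.EqualFold(u.Hostname(), def.Hostname())
}

func main() {
	const def = "https://vpn.softs.business" // stands in for ipn.DefaultControlURL
	fmt.Println(authURLAllowed(def, "https://vpn.softs.business/register/abc"))
	fmt.Println(authURLAllowed(def, "https://vpn.softs.business.evil.example/register/abc"))
}
```

Because the allowed host is derived from a single constant, changing that constant is the only edit needed to follow a new control server.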
### M1.2 Secret Hygiene
#### ☐ `1.2.1` Move the `.pfx` out of the working tree **[P0, S]**
```powershell
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.tailscale-custom-secrets"
Move-Item .\tailscale-custom-codesign.pfx "$env:USERPROFILE\.tailscale-custom-secrets\"
[Environment]::SetEnvironmentVariable("TAILSCALE_CODESIGN_PFX",
"$env:USERPROFILE\.tailscale-custom-secrets\tailscale-custom-codesign.pfx", "User")
```
Update `build-msi.ps1`, `deploy-and-sign.ps1`, and `sign-release.ps1` to read the path from `$env:TAILSCALE_CODESIGN_PFX`, and throw if it is not set.
**Done when:** `Test-Path .\tailscale-custom-codesign.pfx` → `False`, and the build script still runs.
#### ☐ `1.2.2` PFX Password via Windows DPAPI **[P1, S]**
Do not hardcode the password in scripts:
```powershell
# One-time setup
$cred = Get-Credential -UserName "codesign" -Message "PFX password"
$cred.Password | ConvertFrom-SecureString | Out-File "$env:USERPROFILE\.tailscale-custom-secrets\pfx-pass.txt"
# In the build script
$securePass = Get-Content "$env:USERPROFILE\.tailscale-custom-secrets\pfx-pass.txt" | ConvertTo-SecureString
$bstr = [Runtime.InteropServices.Marshal]::SecureStringToBSTR($securePass)
try {
    $plainPass = [Runtime.InteropServices.Marshal]::PtrToStringBSTR($bstr)
} finally {
    # Zero and free the unmanaged copy of the password
    [Runtime.InteropServices.Marshal]::ZeroFreeBSTR($bstr)
}
```
**Done when:** no literal password remains in any `.ps1` file.
#### ☐ `1.2.3` Rotate the Headscale API key **[P0, S]**
```bash
docker compose -f /opt/headscale/docker-compose.yml exec headscale \
headscale apikeys expire <existing-key-prefix>
docker compose -f /opt/headscale/docker-compose.yml exec headscale \
  headscale apikeys create --expiration 90d
```
Update `.env`, then run `docker compose restart headscale-admin`.
**Done when:** the old API key can no longer authenticate.
### M1.3 Server Basics
#### ☐ `1.3.1` Let's Encrypt cert for `vpn.softs.business` **[P0, M]**
```bash
sudo apt install certbot
sudo certbot certonly --standalone -d vpn.softs.business \
--agree-tos -m admin@softs.business --non-interactive
sudo certbot renew --dry-run
```
**Done when:**
```bash
openssl s_client -connect vpn.softs.business:443 -servername vpn.softs.business </dev/null 2>&1 | grep "Verify return code"
# → Verify return code: 0 (ok)
```
#### ☐ `1.3.2` DNS CAA Record **[P0, S]**
On the DNS provider, add:
```
vpn.softs.business. CAA 0 issue "letsencrypt.org"
vpn.softs.business. CAA 0 iodef "mailto:admin@softs.business"
```
**Done when:** `dig CAA vpn.softs.business +short` prints the records.
#### ☐ `1.3.3` Nginx reverse proxy + HSTS **[P0, M]**
Configure nginx to listen on :443 with:
- TLS from Let's Encrypt
- Security headers (HSTS, X-Frame-Options, X-Content-Type-Options)
- Proxy Headscale /api/v1/, /key, /register, /machine, /derpmap, /health → :8080
- Proxy /admin/ → :9080
- HTTP :80 redirect 301 → HTTPS
Ref: the sample config in [DEPLOYMENT.md](DEPLOYMENT.md).
**Done when:**
- `curl -I https://vpn.softs.business/admin/` → 200 + HSTS header
- `curl -I http://vpn.softs.business/` → 301
#### ☐ `1.3.4` Bind the admin panel to loopback only **[P0, S]**
In `web-admin/docker-compose.yml`:
```yaml
ports:
- "127.0.0.1:9080:9080"
```
**Done when:** `sudo ss -tlnp | grep 9080` shows only `127.0.0.1:9080`.
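#### ☐ `1.3.5` UFW firewall **[P0, S]**
```bash
sudo ufw default deny incoming
sudo ufw allow 22/tcp
# Port 80 must stay open for the HTTP → HTTPS redirect (1.3.3) and certbot's HTTP-01 challenge
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw deny 9080/tcp
sudo ufw enable
```
**Done when:** `nmap vpn.softs.business` from outside shows only 22, 80, and 443.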
### M1.4 Verification
#### ☐ `1.4.1` Build & sign the MSI for the first time **[P0, M]**
```powershell
Remove-Item -Recurse -Force .\build, .\dist, .\*.msi, .\*.wixpdb, .\cab*.cab -EA SilentlyContinue
$env:CGO_ENABLED="0"; $env:GOOS="windows"; $env:GOARCH="amd64"
$commit = git rev-parse --short HEAD
$ldflags = "-s -w -X tailscale.com/version.longStamp=1.0.0.0-$commit -X tailscale.com/version.shortStamp=1.0.0.0 -H=windowsgui"
New-Item -ItemType Directory -Force .\dist | Out-Null
go build -trimpath -ldflags $ldflags -o .\dist\tailscaled.exe ./cmd/tailscaled
go build -trimpath -ldflags $ldflags -o .\dist\tailscale.exe ./cmd/tailscale
go build -trimpath -ldflags $ldflags -o .\dist\tailscale-tray.exe ./cmd/tailscale-tray
# Sign the binaries BEFORE packaging the MSI
$pfx = $env:TAILSCALE_CODESIGN_PFX
foreach ($bin in @(".\dist\tailscaled.exe", ".\dist\tailscale.exe", ".\dist\tailscale-tray.exe")) {
    signtool sign /f $pfx /p $env:PFX_PASSWORD /tr http://timestamp.digicert.com /td sha256 /fd sha256 $bin
}
# Build MSI
wix extension add WixToolset.Util.wixext
wix build Setup.wxs -ext WixToolset.Util.wixext -out .\Tailscale-Custom-Setup.msi
# Sign MSI
signtool sign /f $pfx /p $env:PFX_PASSWORD /tr http://timestamp.digicert.com /td sha256 /fd sha256 .\Tailscale-Custom-Setup.msi
signtool verify /pa /v .\Tailscale-Custom-Setup.msi
```
**Done when:** verification passes and the MSI is ~30-50MB.
#### ☐ `1.4.2` Smoke test on a clean Windows VM **[P0, M]**
```powershell
# On the VM
signtool verify /pa /v .\Tailscale-Custom-Setup.msi
msiexec /i .\Tailscale-Custom-Setup.msi /qb /l*v install.log
Get-Service Tailscale-Custom # Running
Get-Process tailscale-tray # process
sc.exe stop Tailscale-Custom # non-admin: Access denied
```
**Done when:** the install succeeds, the service runs, the tray icon appears, and a standard user cannot stop the service.
#### ☐ `1.4.3` SSRF validation test **[P0, S]**
In the tray "Add Server" dialog, test that each of these URLs is **rejected**:
- `http://192.168.1.1` → "only HTTPS allowed"
- `https://192.168.1.1` → "private address not allowed"
- `https://127.0.0.1` → "loopback"
- `https://localhost` → "loopback"
- `https://169.254.169.254` → "cloud metadata"
- `https://10.0.0.5` → "private"
**Done when:** all 6 are rejected with the correct error.
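The six cases can also be exercised in code. The sketch below is a simplified, hypothetical stand-in for `validateControlURL` (the name `rejectReason` and the exact reason strings are assumptions). It classifies only IP literals and `localhost`; the real validator is described as resolving DNS as well, which this sketch does not do.

```go
package main

import (
	"fmt"
	"net/netip"
	"net/url"
)

// rejectReason is a hypothetical helper. It returns a non-empty reason when
// rawURL must be rejected, and "" when this simplified check lets it pass.
func rejectReason(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "invalid URL"
	}
	if u.Scheme != "https" {
		return "only HTTPS allowed"
	}
	host := u.Hostname()
	if host == "localhost" {
		return "loopback"
	}
	addr, err := netip.ParseAddr(host)
	if err != nil {
		// A hostname: a full validator would resolve it and check every IP.
		return ""
	}
	switch {
	case addr.IsLoopback():
		return "loopback"
	case addr == netip.MustParseAddr("169.254.169.254"):
		return "cloud metadata"
	case addr.IsLinkLocalUnicast():
		return "link-local"
	case addr.IsPrivate():
		return "private address not allowed"
	}
	return ""
}

func main() {
	for _, raw := range []string{
		"http://192.168.1.1", "https://192.168.1.1", "https://127.0.0.1",
		"https://localhost", "https://169.254.169.254", "https://10.0.0.5",
	} {
		fmt.Printf("%-28s %q\n", raw, rejectReason(raw))
	}
}
```

The scheme check runs first, which is why `http://192.168.1.1` reports the HTTPS error rather than the private-address error.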
---
## M2 — Core Features for v1.0 (~20h)
**Completing this gives you a client that works as a real company VPN, with no features missing compared to official Tailscale.**
### M2.1 Tray App
#### ☐ `2.1.1` Exit Node UI **[P1, L]**
**File:** [cmd/tailscale-tray/main.go](cmd/tailscale-tray/main.go)
Add an "Exit Node" submenu with:
- "None" (default)
- A list of peers with `ExitNodeOption = true`
- Click → set `ExitNodeID` via `EditPrefs`
Follow the pattern of the existing peer-list menu. Check `st.Peer[...].ExitNodeOption` in `ipnstate.Status`.
**Done when:**
- The menu shows the peers eligible to be exit nodes
- Selecting a peer routes traffic through it
- Verify with `curl -s ifconfig.me` → it shows the exit node's IP
#### ☐ `2.1.2` Settings Dialog **[P1, M]**
A simple dialog (using Win32 CreateDialog, like the existing `inputDialog`):
- [ ] Accept DNS (checkbox → `Prefs.CorpDNS`)
- [ ] Accept subnets (checkbox → `Prefs.RouteAll`)
- [ ] Allow LAN access when using exit node (`Prefs.ExitNodeAllowLANAccess`)
- [ ] Run on startup (registry Run key toggle)
- "Apply" → `EditPrefs` with MaskedPrefs
**Done when:** all 4 checkboxes work and settings persist after reboot.
#### ☐ `2.1.3` Windows Toast Notifications **[P1, M]**
Use `golang.org/x/sys/windows` + `Shell_NotifyIcon` (NIF_INFO flag) or WinToast.
Trigger a notification on:
- State transition: Stopped → Running ("Connected to <server>")
- State transition: Running → NeedsLogin ("Session expired, please login")
- State transition: Any → Stopped ("Disconnected")
- Add Server failed with an error
**Done when:** all 4 notification types appear at the right time, and clicking a notification opens the tray menu.
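The three state-transition triggers reduce to a small pure function, which keeps them testable without any Win32 code. This is a sketch under assumptions: `notificationFor` is a hypothetical name, and states are plain strings here rather than the tray app's own IPN state type. The "Add Server failed" trigger is not a state transition and is not covered.

```go
package main

import "fmt"

// notificationFor is a hypothetical helper: it returns the toast text for a
// transition from prev to next, or "" when no notification is due.
func notificationFor(prev, next, server string) string {
	if prev == next {
		return ""
	}
	switch {
	case next == "Stopped": // Any → Stopped
		return "Disconnected"
	case prev == "Stopped" && next == "Running":
		return "Connected to " + server
	case prev == "Running" && next == "NeedsLogin":
		return "Session expired, please login"
	}
	return ""
}

func main() {
	fmt.Println(notificationFor("Stopped", "Running", "vpn.softs.business"))
	fmt.Println(notificationFor("Running", "NeedsLogin", "vpn.softs.business"))
	fmt.Println(notificationFor("Running", "Stopped", "vpn.softs.business"))
}
```

The caller would remember the previous state from the IPN bus and pass any non-empty result to the toast API.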
#### ☐ `2.1.4` Update Checker **[P1, M]**
Add a `checkForUpdate()` function that runs every 6h:
- GET `https://vpn.softs.business/api/version/latest` → `{version, url, sha256, releaseNotes}`
- Compare against the current version (from `version.Long`)
- If newer → toast notification + "Update available" menu item (click opens the download URL)
Backend: the admin panel adds a `/api/version/latest` endpoint that reads from the file `/data/latest-version.json`.
**Done when:**
- Placing a `/data/latest-version.json` with a higher version → the tray shows that an update is available
- Click → the browser opens the download page
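The comparison step is the part most likely to go wrong, because plain string comparison would rank `1.0.10` below `1.0.9`. A minimal sketch follows, assuming versions are dotted numeric strings: `isNewer` and `manifest` are hypothetical names, and the caller is assumed to strip any build suffix from `version.Long` before comparing.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// manifest mirrors the fields listed for /api/version/latest.
type manifest struct {
	Version      string `json:"version"`
	URL          string `json:"url"`
	SHA256       string `json:"sha256"`
	ReleaseNotes string `json:"releaseNotes"`
}

// isNewer reports whether dotted numeric version latest is greater than
// current. Missing segments count as 0; a non-numeric segment yields false.
func isNewer(current, latest string) bool {
	c := strings.Split(current, ".")
	l := strings.Split(latest, ".")
	for i := 0; i < len(c) || i < len(l); i++ {
		var cv, lv int
		var err error
		if i < len(c) {
			if cv, err = strconv.Atoi(c[i]); err != nil {
				return false
			}
		}
		if i < len(l) {
			if lv, err = strconv.Atoi(l[i]); err != nil {
				return false
			}
		}
		if lv != cv {
			return lv > cv
		}
	}
	return false
}

func main() {
	// Sample response body; a real client would read it from the endpoint.
	body := []byte(`{"version":"1.0.2","url":"https://vpn.softs.business/download/Tailscale-Custom-Setup-1.0.2.msi","sha256":"...","releaseNotes":"..."}`)
	var m manifest
	if err := json.Unmarshal(body, &m); err != nil {
		fmt.Println("bad manifest:", err)
		return
	}
	fmt.Println("update available:", isNewer("1.0.1", m.Version))
}
```

Returning false on malformed input is a deliberate choice in this sketch: a broken manifest should never prompt users to download something.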
#### ☐ `2.1.5` Friendly Error Dialogs **[P2, S]**
Currently many errors are only written with `log.Printf`, so the user never sees them. Replace these with `showError()` at the important points:
- `doLogin` start error
- `doLogin` StartLoginInteractive error
- `addServer` Start error
- `EditPrefs` errors in the menu actions
- Profile switch error
**Done when:** any operation that fails from the menu shows the user a dialog explaining the error.
### M2.2 Admin Panel — Essential Features
#### ☐ `2.2.1` Audit Log **[P0, M]**
**Schema:** `/data/audit.log` in JSONL format (one entry per line).
```go
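type auditEntry struct {
	Time    time.Time `json:"time"`
	Actor   string    `json:"actor"`  // username
	Action  string    `json:"action"` // "login", "create_user", "delete_node", ...
	Target  string    `json:"target"`
	IP      string    `json:"ip"`
	Success bool      `json:"success"`
	Detail  string    `json:"detail,omitempty"`
}

// auditMu serializes writes so concurrent requests cannot interleave lines.
var auditMu sync.Mutex

func logAudit(r *http.Request, action, target string, success bool, detail string) {
	entry := auditEntry{
		// Assumes X-Username is set by the server's auth middleware;
		// a client-supplied value must not be trusted.
		Time: time.Now().UTC(), Actor: r.Header.Get("X-Username"),
		Action: action, Target: target,
		IP: clientIP(r), Success: success, Detail: detail,
	}
	b, err := json.Marshal(entry)
	if err != nil {
		log.Printf("audit: marshal: %v", err)
		return
	}
	auditMu.Lock()
	defer auditMu.Unlock()
	if _, err := auditFile.Write(append(b, '\n')); err != nil {
		log.Printf("audit: write: %v", err)
	}
}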
```
Call `logAudit()` in:
- `handleLogin` (success + failure)
- `handleLogout`
- `handleAdminAccounts` POST/DELETE
- `handleAdminUserByID` password reset + delete
- `handleAdminNodeByID` delete/expire/rename
- `handleAdminKeys` POST
- `handleAdminRoutes` toggle
Add a `GET /api/admin/audit?limit=100&offset=0&actor=<name>&action=<type>` endpoint for viewing the log.
UI: an "Audit" tab in the admin panel that shows the audit table.
**Done when:**
- Every admin action has an entry in `audit.log`
- The "Audit" tab renders and can filter by actor/action/time range
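The query parameters of the audit endpoint map onto a parse step and a filter step. The sketch below shows one possible shape: `parseJSONL`, `filterAudit`, and the trimmed `entry` type are hypothetical names, and time-range filtering is left out for brevity.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
)

// entry holds only the fields this filter needs from each JSONL line.
type entry struct {
	Actor  string `json:"actor"`
	Action string `json:"action"`
	Target string `json:"target"`
}

// parseJSONL decodes one entry per line; blank or malformed lines are skipped.
func parseJSONL(data []byte) []entry {
	var out []entry
	sc := bufio.NewScanner(bytes.NewReader(data))
	for sc.Scan() {
		var e entry
		if json.Unmarshal(sc.Bytes(), &e) == nil {
			out = append(out, e)
		}
	}
	return out
}

// filterAudit keeps entries matching actor and action (an empty string means
// "no filter"), skips the first offset matches, and returns at most limit.
func filterAudit(entries []entry, actor, action string, limit, offset int) []entry {
	var out []entry
	for _, e := range entries {
		if len(out) >= limit {
			break
		}
		if (actor != "" && e.Actor != actor) || (action != "" && e.Action != action) {
			continue
		}
		if offset > 0 {
			offset--
			continue
		}
		out = append(out, e)
	}
	return out
}

func main() {
	log := []byte(`{"actor":"admin","action":"login","target":""}
{"actor":"bob","action":"delete_node","target":"node1"}
{"actor":"admin","action":"delete_node","target":"node2"}
`)
	fmt.Println(filterAudit(parseJSONL(log), "admin", "delete_node", 100, 0))
}
```

Applying the offset after filtering means `offset` counts matching entries, so paging through a filtered view behaves the same as paging through the full log.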
#### ☐ `2.2.2` Revoke Preauth Key **[P1, S]**
**Backend:** Add `DELETE /api/admin/preauthkeys?user=X&key=Y`, which proxies to the Headscale expire endpoint.
**Headscale API:** `POST /api/v1/preauthkey/expire` with body `{"user": "...", "key": "..."}`.
**Frontend:** Add a "Revoke" button on each preauth key row; confirm dialog → call API → refresh.
**Done when:** After clicking Revoke, the key can no longer be used to register a new node.
#### ☐ `2.2.3` Pagination for Nodes & Users **[P1, M]**
When Headscale has > 100 nodes, everything is currently loaded at once, which is slow.
**Backend:** Add query params `?limit=50&offset=0` to `/api/admin/nodes`, `/api/admin/accounts`, `/api/admin/users`. Because the Headscale API does not support pagination, the admin panel must slice the full list after receiving it from Headscale.
**Frontend:** Pagination controls (Previous / 1 2 3 4 / Next). Paginate client-side if the backend does not support it yet.
**Done when:** With 500 fake nodes, only 50 render at a time, with next/prev buttons.
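Since the slicing happens in the admin panel after the full list arrives, it can be a single generic helper shared by all three endpoints. A minimal sketch follows; `paginate` is a hypothetical name, and generics require Go 1.18 or later.

```go
package main

import "fmt"

// paginate returns the window [offset, offset+limit) of items, clamped to the
// slice bounds. A non-positive limit or an out-of-range offset yields nil.
func paginate[T any](items []T, limit, offset int) []T {
	if limit <= 0 || offset < 0 || offset >= len(items) {
		return nil
	}
	end := offset + limit
	if end > len(items) {
		end = len(items)
	}
	return items[offset:end]
}

func main() {
	nodes := make([]int, 500) // 500 fake nodes, as in the done-when check
	for i := range nodes {
		nodes[i] = i
	}
	page := paginate(nodes, 50, 450)
	fmt.Println(len(page), page[0], page[len(page)-1])
}
```

The returned slice shares memory with the input, which is fine for a read-only response but should be copied if the handler mutates it.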
#### ☐ `2.2.4` Search / Filter **[P1, M]**
Add a search input at the top of each list:
- Nodes: filter by hostname/IP/user
- Users: filter by username
- Accounts: filter by username/role
Implement it client-side (filter the already-fetched array) for simplicity.
**Done when:** Typing in the search box filters the list in real time.
#### ☐ `2.2.5` Vietnamese i18n **[P1, M]**
Create `web-admin/static/i18n.js`:
```javascript
const translations = {
  en: { 'login.title': 'Login', 'nodes.title': 'Nodes', ... },
  vi: { 'login.title': 'Đăng nhập', 'nodes.title': 'Thiết bị', ... }
};
function t(key) {
  const lang = localStorage.getItem('lang') || 'vi';
  return translations[lang]?.[key] || translations.en[key] || key;
}
```
Replace the HTML labels with `data-i18n="login.title"` attributes and have JS populate them on page load. Add a language switcher in the header.
**Done when:**
- Toggling vi/en switches the entire UI language
- The choice persists via localStorage
#### ☐ `2.2.6` Better Dashboard Metrics **[P2, M]**
Add cards:
- Total nodes online in the past 24h
- Top 5 users by node count
- Nodes newly registered in the past 7 days
- Preauth keys about to expire
Simple approach: compute client-side from `/api/admin/nodes` + `/api/admin/users` + `/api/admin/preauthkeys`.
**Done when:** The dashboard has 4 new cards with correct data.
### M2.3 Build Process
#### ☐ `2.3.1` Version Auto-injection **[P1, S]**
Instead of hardcoding `1.0.0.0` in `Setup.wxs`:
**Create a file** `VERSION` at the root: `1.0.1`
**Update `build-msi.ps1`:**
```powershell
$version = (Get-Content .\VERSION).Trim()
$msiVersion = "$version.0" # MSI needs 4 segments
# Generate Setup.wxs from the template with the version filled in
(Get-Content .\Setup.wxs.tpl) -replace '{{VERSION}}', $msiVersion | Set-Content .\Setup.wxs
# Embed the version into the Go binaries via ldflags (already done in M1.4.1)
```
Rename `Setup.wxs` → `Setup.wxs.tpl` with a `{{VERSION}}` placeholder, and add `Setup.wxs` to .gitignore.
**Done when:**
- Changing VERSION to `1.0.2` → the build produces an MSI with the matching version
- `wmic datafile where name="C:\\...\\tailscaled.exe" get version` shows the correct version
#### ☐ `2.3.2` Latest Version Manifest **[P1, S]**
After each build, create `/data/latest-version.json`:
```json
{
"version": "1.0.1",
"url": "https://vpn.softs.business/download/Tailscale-Custom-Setup-1.0.1.msi",
"sha256": "...",
"releaseNotes": "Bug fixes and security improvements",
"releasedAt": "2026-05-01T10:00:00Z",
"minimumVersion": "1.0.0"
}
```
Upload this file to the server along with the MSI. The `/api/version/latest` endpoint is public (no auth required).
**Done when:** `curl https://vpn.softs.business/api/version/latest` returns the correct JSON.
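Generating the manifest from the MSI itself avoids hand-copied hashes. The following is a sketch only: `newManifest` is a hypothetical helper, the download URL pattern is taken from the example above, and the input bytes in `main` are a placeholder rather than a real MSI. For a large installer, streaming the file through `sha256.New()` would avoid loading it into memory.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// manifest mirrors the fields of latest-version.json shown above.
type manifest struct {
	Version        string `json:"version"`
	URL            string `json:"url"`
	SHA256         string `json:"sha256"`
	ReleaseNotes   string `json:"releaseNotes"`
	ReleasedAt     string `json:"releasedAt"`
	MinimumVersion string `json:"minimumVersion"`
}

// newManifest is a hypothetical helper: it derives the download URL from the
// version and computes the SHA-256 of the installer bytes.
func newManifest(version string, msi []byte, notes, releasedAt, minVersion string) manifest {
	sum := sha256.Sum256(msi)
	return manifest{
		Version:        version,
		URL:            "https://vpn.softs.business/download/Tailscale-Custom-Setup-" + version + ".msi",
		SHA256:         hex.EncodeToString(sum[:]),
		ReleaseNotes:   notes,
		ReleasedAt:     releasedAt,
		MinimumVersion: minVersion,
	}
}

func main() {
	// Placeholder bytes; a build script would read the signed MSI instead.
	m := newManifest("1.0.1", []byte("placeholder"), "Bug fixes and security improvements",
		"2026-05-01T10:00:00Z", "1.0.0")
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		fmt.Println("marshal:", err)
		return
	}
	fmt.Println(string(out))
}
```

The hash must be computed after the MSI is signed, because signing changes the file contents.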
#### ☐ `2.3.3` Silent Install Documentation **[P2, S]**
Create `docs/INSTALL_ENTERPRISE.md`:
```markdown
# Enterprise / Silent Install
## Default install
msiexec /i Tailscale-Custom-Setup.msi /qb
## Silent, no UI
msiexec /i Tailscale-Custom-Setup.msi /qn /norestart
## With logging
msiexec /i Tailscale-Custom-Setup.msi /qn /l*v install.log
## Uninstall
msiexec /x Tailscale-Custom-Setup.msi /qn
## Group Policy deployment
Copy MSI to \\domain\SYSVOL\software, create GPO Computer Config → Software Installation
## Uninstall via Group Policy
Remove from GPO; MSI uninstalls on next logon
```
**Done when:** The file exists and the commands have been tested.
---
## M3 — Distribution Ready (~5h)
**Completing this means the MSI can be distributed to real users (not just dev testers).**
### ☐ `3.1` End-user documentation **[P1, M]**
Create `docs/USER_GUIDE_VI.md`:
- Installation (provide the MSI link + SHA256 hash for verification)
- First login (open tray → Login → browser opens)
- Seeing other machines in the company (Peer list)
- Copying a machine's IP for SSH/RDP
- Sign out / switch profile
- Common troubleshooting (peers not visible, cannot connect, where the logs are)
**Done when:** A new user can do everything by reading this guide, without asking for help.
### ☐ `3.2` Admin documentation **[P1, M]**
Create `docs/ADMIN_GUIDE_VI.md`:
- Accessing the admin panel
- Creating a new user + issuing a preauth key
- Approving subnet routes
- Revoking a lost or compromised machine
- Resetting a user's password
- Viewing the audit log
- Backup / restore
- How to upgrade the server
**Done when:** A new admin (not a developer) can operate the system.
### ☐ `3.3` Download landing page **[P1, M]**
Add a `/download` page in the admin panel or nginx (public, no auth):
- Logo + company name
- "Download Tailscale Custom for Windows" button → MSI
- SHA256 hash for verification
- Linux build (if available)
- Link to the user guide
- Link to support/contact
**Done when:** `https://vpn.softs.business/download` is reachable without logging in and the MSI downloads.
### ☐ `3.4` Test MSI full lifecycle **[P0, M]**
On clean Windows 10 + Windows 11 VMs:
- [ ] Fresh install → works
- [ ] Upgrade (over an older version) → keeps the old profile, no conflicts
- [ ] Uninstall → service removed, registry cleaned, logs cleaned (or kept, by choice)
- [ ] Silent install (`/qn`) → no popups
- [ ] Reboot → service auto-starts OK
- [ ] Standard (non-admin) user session after install → tray app runs in the user context, service runs as SYSTEM
**Done when:** 6/6 pass on both OSes.
### ☐ `3.5` Linux client packages **[P2, L]**
If Linux support is needed (already mentioned in CUSTOM_CLIENT.md):
- `.deb` for Ubuntu/Debian (using `nfpm` or `dpkg-deb`)
- `.rpm` for RHEL/Fedora
- systemd service file with unit `tailscaled-custom.service`
- Post-install script: enable service
**Done when:**
- `sudo dpkg -i tailscale-custom_1.0.1_amd64.deb` → the service runs
- `sudo tailscale up --login-server https://vpn.softs.business` → connects OK
---
## M4 — Operations & Monitoring (~3h + ongoing)
### ☐ `4.1` Automated Backup **[P0, M]**
Daily cron job on the VPS:
```bash
#!/bin/bash
# /opt/backup-tailscale.sh
set -euo pipefail
BDIR=/backups/tailscale
DATE=$(date +%Y%m%d-%H%M%S)
mkdir -p "$BDIR"
docker compose -f /opt/headscale/docker-compose.yml exec -T headscale \
  sqlite3 /var/lib/headscale/db.sqlite .dump > "$BDIR/headscale-$DATE.sql"
docker cp headscale-admin:/data/users.json "$BDIR/users-$DATE.json"
docker cp headscale-admin:/data/audit.log "$BDIR/audit-$DATE.log"
# Delete backup files older than 30 days
find "$BDIR" -type f -mtime +30 -delete
```
```bash
sudo chmod +x /opt/backup-tailscale.sh
echo "0 3 * * * root /opt/backup-tailscale.sh" | sudo tee /etc/cron.d/tailscale-backup
```
**Done when:** A backup file appears automatically after 24h.
### ☐ `4.2` Backup Restore Test **[P0, S]**
Once per quarter:
```bash
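# Restore the Headscale DB into a test instance
# (-i forwards stdin to the container; the stock alpine image does not ship sqlite3, so install it first)
docker run --rm -i -v /backups/tailscale:/backup alpine \
  sh -c 'apk add --no-cache sqlite >/dev/null && sqlite3 /backup/headscale-test.sqlite' \
  < /backups/tailscale/headscale-LATEST.sql
# Mount it into the test Headscale instance, then verify users/nodes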
```
**Done when:** The restore succeeds and the data matches the backup source.
### ☐ `4.3` Monitoring & Alerts **[P1, M]**
Set up one of these free tools:
**Option A — Uptime Kuma (self-hosted):**
```bash
docker run -d --name uptime-kuma -p 3001:3001 louislam/uptime-kuma:1
# Add monitors:
# - HTTPS https://vpn.softs.business/health (Headscale health, proxied in 1.3.3)
# - HTTPS https://vpn.softs.business/admin/ (200 expected)
# - TCP 443
# - Cert expiry
# Setup notifications: Telegram / Email
```
**Option B — UptimeRobot** (cloud, free tier): monitor 50 endpoints, email alert.
**Done when:**
- An email/Telegram alert fires when the service is down for >2 minutes
- An alert fires when the cert expires in <14 days
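If neither tool's cert-expiry monitor is available, the 14-day rule is easy to check by hand. This is a sketch under assumptions: `certNotAfter`, `certNeedsAlert`, and `referenceNow` are hypothetical names, and `main` uses fixed dates instead of dialing the server so that it runs offline.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"time"
)

// referenceNow is a fixed timestamp used only by the offline demo in main.
var referenceNow = time.Date(2026, 5, 1, 0, 0, 0, 0, time.UTC)

// certNotAfter dials addr (for example "vpn.softs.business:443") and returns
// the expiry time of the leaf certificate the server presents.
func certNotAfter(addr string) (time.Time, error) {
	conn, err := tls.Dial("tcp", addr, nil)
	if err != nil {
		return time.Time{}, err
	}
	defer conn.Close()
	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return time.Time{}, errors.New("no peer certificate")
	}
	return certs[0].NotAfter, nil
}

// certNeedsAlert reports whether a certificate expiring at notAfter has fewer
// than warnDays days left (or has already expired) at time now.
func certNeedsAlert(notAfter, now time.Time, warnDays int) bool {
	return notAfter.Sub(now) < time.Duration(warnDays)*24*time.Hour
}

func main() {
	fmt.Println(certNeedsAlert(referenceNow.AddDate(0, 0, 10), referenceNow, 14))
	fmt.Println(certNeedsAlert(referenceNow.AddDate(0, 0, 60), referenceNow, 14))
}
```

A cron job could call `certNotAfter` and pass the result with `time.Now()` to `certNeedsAlert`; a handshake error from an already-expired certificate should be treated as an alert too.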
### ☐ `4.4` CT Log Monitoring **[P0, S]**
Sign up for [Cert Spotter](https://sslmate.com/certspotter/) for `*.softs.business`:
- Free tier: 5 domains
- Emails you when a new cert is issued
- Verify that the cert is yours (issued via certbot); if it is not → attack alert
**Done when:** You receive the confirmation email from Cert Spotter.
### ☐ `4.5` Incident Runbook **[P1, M]**
Create `docs/INCIDENT_RUNBOOK.md` covering these scenarios:
1. **Server completely down**: check VPS provider, SSH, docker ps, nginx, certbot
2. **Cannot log in to the admin panel**: check logs, reset password via DB
3. **User reports they cannot connect**: check node status, preauth, certbot, DNS
4. **CT log alert, unknown cert**: verify whether the cert is legitimate; if not → revoke via CA + rotate
5. **Disk full**: clean old backups, rotate audit log, clean docker images
6. **Rogue admin action**: check audit log, revoke session, reset password
Each scenario includes:
- Symptoms
- Diagnostic commands
- Fix steps
- Prevention
**Done when:** The file exists and all 6 scenarios have a detailed runbook.
### ☐ `4.6` Log Rotation **[P1, S]**
The audit log and tray log can grow quickly:
**Server audit.log:**
```bash
# /etc/logrotate.d/tailscale-admin
/var/lib/docker/volumes/headscale-admin_data/_data/audit.log {
    daily
    rotate 90
    compress
    missingok
    notifempty
    # The admin panel keeps audit.log open, so truncate in place instead of renaming
    copytruncate
}
```
**Tray log (client-side):** Already opened with `O_TRUNC`, so the file is truncated each time it is opened and does not grow across runs. OK.
**Done when:** The audit log rotates automatically and 90 days are kept.
### ☐ `4.7` Quarterly Rotation **[P1, S]**
Set calendar reminders:
- **Quarterly:** Rotate admin password + Headscale API key
- **Yearly:** Check code signing cert expiry; renew if <6 months remain
- **Quarterly:** Test backup restore
- **Monthly:** Review audit log for anomalies
- **Weekly:** Check fail2ban + failed logins
**Done when:** The calendar has 5 recurring events.
---
## M5 — Future Enhancements (optional)
Do these when time allows or when users request them. They do not block production.
### M5.1 Tray App Advanced
- **Subnet router UI**: advertise routes from the tray
- **Taildrop file share UI**: list inbox, accept/reject files
- **SSH status indicator**: show Tailscale SSH enabled in the menu
- **Split tunnel UI**: per-app routing (hard, requires WFP)
- **MagicDNS toggle**: easy with Prefs.CorpDNS
### M5.2 Admin Panel Advanced
- **ACL policy editor**: UI for Headscale ACLs (JSON/HuJSON)
- **DNS settings UI**: split DNS, search domains, nameservers
- **DERP server management**: custom DERP relay config
- **MFA / 2FA**: TOTP for admin login (library: `github.com/pquerna/otp`)
- **SSO / OAuth2**: Google Workspace / Microsoft 365 integration
- **User groups / tags**: bulk policy for groups
- **Node metrics**: uptime, last-seen granularity, bandwidth (needs a Headscale enhancement)
### M5.3 Distribution
- **Auto-update mechanism**: MSI self-update (using an MSI chain transaction)
- **macOS client**: large; needs a dedicated team
- **iOS / Android**: very large; needs a mobile team
### M5.4 Operations
- **Prometheus metrics**: expose `/metrics` from the admin panel
- **Grafana dashboard**: visualize nodes/users/traffic
- **Remote wipe**: revoke and force-logout a node remotely
- **Session recording**: for compliance
---
## Suggested Timeline
Assuming focused full-time work:
| Week | Milestone | Output |
|---|---|---|
| **1 (Mon-Fri)** | All of M1 | Security baseline + first MSI signed + server TLS/nginx ready |
| **2 (Mon-Wed)** | M2.1 (tray features) + M2.3.1 (version) | Tray v1.0 with exit node + settings + notifications |
| **2 (Thu-Fri)** | M2.2.1-2.2.2 (audit log + revoke key) | Admin panel audit trail + preauth revoke |
| **3 (Mon-Tue)** | M2.2.3-2.2.5 (pagination + search + i18n) | Admin panel v1.0 UX |
| **3 (Wed-Thu)** | M3 (distribution) | User/admin docs + download page + MSI lifecycle test |
| **3 (Fri)** | M4.1-4.4 (ops basics) | Backup + monitoring + CT + runbook |
| **4+** | M5 optional | |
**If working part-time:** double the timeline (about 6 weeks).
---
## Rollback & Troubleshooting
### Rollback Server
```bash
cd /opt/headscale-admin
git log --oneline -10 # find the previous commit
git reset --hard <commit-hash>
docker compose down
docker compose up -d --build
```
### Rollback Client MSI
```powershell
# Uninstall current (the product code is quoted so PowerShell passes the braces through literally)
msiexec /x "{510A8C57-BA8F-4B9F-84E3-8E5C4E091054}" /qn
# Install previous version
msiexec /i .\Tailscale-Custom-Setup-OLD.msi /qb
```
**Always keep:** the previous MSI version with a timestamp in its filename, a DB backup from before each major deploy, and a git tag `stable-YYYYMMDD` for each production version.
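One way to keep the `stable-YYYYMMDD` tag convention consistent is to generate the name rather than type it. The sketch below is illustrative only: `stable_tag_name` is a hypothetical helper, and it assumes the tag date is the deploy date on the machine running it.

```shell
# Hypothetical helper, for illustration only.

# Print today's production tag name in the stable-YYYYMMDD form.
stable_tag_name() {
  date +stable-%Y%m%d
}
```

After a successful deploy, `git tag "$(stable_tag_name)"` followed by `git push origin "$(stable_tag_name)"` would record the release (assuming the remote is named `origin`). A later `git reset --hard stable-YYYYMMDD` could then stand in for the commit hash in the server rollback.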
### Common Troubleshooting
| Symptom | Check | Fix |
|---|---|---|
| MSI install fail "Service not installed" | `wix extension list` | `wix extension add WixToolset.Util.wixext` |
| Tray crash on start | `tray.log` in `%ProgramData%\Tailscale-Custom\Logs\` | Check error, file bug |
| "Invalid server URL" but URL looks right | `nslookup <host>` | DNS resolves to a private IP, so the SSRF protection is correctly blocking it |
| Admin panel 502 | `sudo tail -f /var/log/nginx/error.log` | Backend down: `docker compose up -d` |
| Headscale reject auth | `docker compose logs headscale` | Preauth expired: create new |
| Client does not connect after reboot | `Get-Service Tailscale-Custom` | Service not started: `Start-Service Tailscale-Custom` |
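For the "Invalid server URL" row, a quick way to confirm that an address returned by `nslookup` falls in a private range is sketched below. The exact ranges the client's SSRF protection blocks are not stated here; this example assumes RFC 1918, loopback, and link-local IPv4 only, and `is_private_ipv4` is a hypothetical name.

```shell
# Hypothetical helper, for illustration only.

# Succeed (exit 0) when the IPv4 address in $1 is in an RFC 1918,
# loopback, or link-local range; fail otherwise.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*|127.*|169.254.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_private_ipv4 192.168.1.10 && echo "private: blocked as expected"` would print the message, while a public address such as `8.8.8.8` would not.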
---
## ✅ Definition of Done for Production v1.0
- [ ] All of M1 checked
- [ ] M2.1.1 (Exit Node), M2.1.3 (Notifications), M2.1.4 (Update Checker)
- [ ] M2.2.1 (Audit Log), M2.2.2 (Revoke Key), M2.2.3 (Pagination)
- [ ] M2.3.1 (Version Management)
- [ ] M3.1 (User Guide), M3.2 (Admin Guide), M3.4 (MSI Lifecycle Test)
- [ ] M4.1 (Backup), M4.3 (Monitoring), M4.4 (CT Log)
- [ ] Signed MSI signature verifies OK
- [ ] Fresh Windows VM: install MSI → connect to VPN → browse peer → all works
- [ ] Admin panel: full admin workflow with no blocking bugs
- [ ] Rollback plan has been tested once
---
## 📞 Ownership
| Task | Owner |
|---|---|
| Code (tray, admin panel, server) | Dev (can get help from the AI, i.e. me) |
| Secret management (pfx, api key) | You; do not delegate |
| DNS / CAA | You (domain owner) |
| VPS admin, nginx, firewall | You or a sysadmin |
| MSI signing | You (you hold the pfx) |
| User support | You or the support team |
| Documentation review | You (you understand the Vietnamese context correctly) |
---
**Created:** 2026-04-22
**Roadmap version:** 1.0
**Next update:** After each completed milestone
> 💡 **Tip:** Commit this roadmap. Each time you finish a task, change `☐` to `☑` and commit. After a week you will see clear progress in `git log`.