PaddleOCR Local Service
OpenScreen calls OCR through a local HTTP service. The default endpoint is:
http://127.0.0.1:8866/ocr
The app sends either imageBase64 or path, plus the optional language and profile fields, and expects a response containing OCR blocks:
{
"blocks": [
{
"text": "Settings",
"confidence": 0.97,
"box": { "x": 120, "y": 80, "width": 90, "height": 24 }
}
]
}
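As a rough illustration of this contract, the Python sketch below builds a request body and reads the blocks back out of a response. It assumes the service accepts a JSON body via HTTP POST, which this document does not state explicitly. The helper names build_payload, parse_blocks, and request_ocr are hypothetical and not part of OpenScreen.

```python
import base64
import json
import urllib.request

OCR_URL = "http://127.0.0.1:8866/ocr"  # default endpoint named in this document


def build_payload(image_bytes=None, path=None, language=None, profile=None):
    """Build a request body containing exactly one of imageBase64 or path."""
    if (image_bytes is None) == (path is None):
        raise ValueError("provide exactly one of image_bytes or path")
    payload = {}
    if image_bytes is not None:
        payload["imageBase64"] = base64.b64encode(image_bytes).decode("ascii")
    else:
        payload["path"] = path
    # language and profile are optional, so they are only sent when set.
    if language is not None:
        payload["language"] = language
    if profile is not None:
        payload["profile"] = profile
    return payload


def parse_blocks(response):
    """Return (text, confidence, (x, y, width, height)) for each OCR block."""
    result = []
    for block in response.get("blocks", []):
        box = block["box"]
        result.append(
            (block["text"], block["confidence"],
             (box["x"], box["y"], box["width"], box["height"]))
        )
    return result


def request_ocr(payload, url=OCR_URL, timeout=120):
    """Send the payload to the service. Assumption: JSON body over HTTP POST."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, parse_blocks(request_ocr(build_payload(path="screenshot.png"))) would yield one tuple per recognized text block; the file name here is only a placeholder.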
Install
Use a separate virtual environment because PaddleOCR and PaddlePaddle are large dependencies.
python -m venv .venv-ocr
.\.venv-ocr\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r tools\ocr\requirements.txt
If paddle is still missing after installing paddleocr, install the CPU PaddlePaddle wheel that matches your Python and OS from the official PaddlePaddle install guide.
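One way to check whether both packages are visible inside the activated virtual environment is a short Python script like the one below. It is only a sketch: check_modules is a hypothetical helper, and it tests whether the paddleocr and paddle modules can be located, not whether they load correctly.

```python
import importlib.util


def check_modules(names=("paddleocr", "paddle")):
    """Map each top-level module name to True if it can be found here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'installed' if found else 'MISSING'}")
```

If paddle shows as missing while paddleocr is present, that matches the case described above where the PaddlePaddle wheel needs to be installed separately.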
Run
.\.venv-ocr\Scripts\Activate.ps1
$env:PADDLEOCR_DEVICE="cpu"
$env:OPENSCREEN_OCR_PROFILE="vietnamese"
npm run ocr:paddle
Keep this terminal open while using the Guide OCR step in OpenScreen.
Verify
Invoke-WebRequest http://127.0.0.1:8866/health -UseBasicParsing
Expected response from a healthy environment:
{
"ok": true,
"paddleocrInstalled": true,
"paddleInstalled": true,
"engineReady": false,
"defaultLanguage": "vi,en",
"defaultProfile": "vietnamese"
}
engineReady becomes true after the first OCR request. The first request can be slow because PaddleOCR downloads and loads models.
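The fields of the health response can also be read programmatically. The sketch below is one possible interpretation in Python, assuming the response is shaped like the example above; describe_health and fetch_health are hypothetical helper names, and the status messages are illustrative only.

```python
import json
import urllib.request

HEALTH_URL = "http://127.0.0.1:8866/health"  # health endpoint from this document


def describe_health(status):
    """Turn a /health response dict into a short human-readable summary."""
    if not status.get("ok"):
        return "service reported a problem"
    missing = [
        key for key in ("paddleocrInstalled", "paddleInstalled")
        if not status.get(key)
    ]
    if missing:
        return "missing dependencies: " + ", ".join(missing)
    if not status.get("engineReady"):
        # Normal before the first OCR request: models are not loaded yet.
        return "healthy; engine loads on the first OCR request"
    return "healthy; engine ready"


def fetch_health(url=HEALTH_URL, timeout=5):
    """Fetch and decode the health response from the running service."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling describe_health(fetch_health()) right after startup should report that the engine has not loaded yet, which is expected rather than an error.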
Configuration
PADDLEOCR_DEVICE: cpu, gpu:0, or another PaddleOCR device string.
OPENSCREEN_OCR_PROFILE: fast, vietnamese, or hybrid. The default vietnamese profile upscales and sharpens focused UI screenshots before OCR.
OPENSCREEN_GUIDE_OCR_LANGUAGE: defaults to vi,en.
PADDLEOCR_LANG: optional hard override. Leave unset for the app profile/language settings to work.
PADDLEOCR_VERSION: defaults to PP-OCRv5.
PADDLEOCR_USE_MOBILE: defaults to 1; set to 0 to use the default/server models.
PADDLEOCR_REC_MODEL: optional recognizer model override. The bundled profile uses latin_PP-OCRv5_mobile_rec, which supports Vietnamese Latin-script text.
OPENSCREEN_GUIDE_OCR_URL: OpenScreen OCR endpoint override; defaults to http://127.0.0.1:8866.
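To make the documented defaults concrete, the following Python sketch resolves these variables from an environment mapping. It is not the service's actual code: resolve_settings is a hypothetical helper, variables with no documented default are left as None, and both the precedence of PADDLEOCR_LANG over the language setting and the treatment of any value other than 0 as "mobile" are assumptions drawn from the descriptions above.

```python
import os

# Defaults as stated in this document.
DEFAULTS = {
    "OPENSCREEN_OCR_PROFILE": "vietnamese",
    "OPENSCREEN_GUIDE_OCR_LANGUAGE": "vi,en",
    "PADDLEOCR_VERSION": "PP-OCRv5",
    "PADDLEOCR_USE_MOBILE": "1",
    "OPENSCREEN_GUIDE_OCR_URL": "http://127.0.0.1:8866",
}

# No default is documented for these, so unset means None in this sketch.
NO_DOCUMENTED_DEFAULT = ("PADDLEOCR_DEVICE", "PADDLEOCR_LANG", "PADDLEOCR_REC_MODEL")


def resolve_settings(env=None):
    """Resolve OCR settings from an environment mapping (os.environ by default)."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    for name in NO_DOCUMENTED_DEFAULT:
        settings[name] = env.get(name)
    # Assumption: PADDLEOCR_LANG, when set, replaces the language setting.
    settings["effective_language"] = (
        settings["PADDLEOCR_LANG"] or settings["OPENSCREEN_GUIDE_OCR_LANGUAGE"]
    )
    # Assumption: only the literal value "0" switches off the mobile models.
    settings["use_mobile"] = settings["PADDLEOCR_USE_MOBILE"] != "0"
    return settings
```

For example, resolve_settings({"PADDLEOCR_LANG": "en"}) would report en as the effective language, which illustrates why that variable is best left unset when the app's profile and language settings should apply.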