957 Commits

Author SHA1 Message Date
huanld 24a16c693a Add auto guide generation with bundled OCR 2026-05-28 07:07:30 +07:00
Sid 8117d4826f Merge pull request #657 from aaravshirpurkar/main
fix: increase rounding snap tolerance to 4px to prevent background line bleed when padding is 0
2026-05-26 22:35:42 -07:00
Sid b70994b079 Merge pull request #658 from neurot1cal/fix/streaming-recording-duration
Fix: stream long recordings to disk and patch WebM duration on save (#616)
2026-05-26 22:27:09 -07:00
neurot1cal 5c5cab6903 fix: don't stream when the append IPC is unavailable
Codex re-review: if openRecordingStream exists but appendRecordingChunk
does not (renderer/main version skew), the recorder would open the stream
and switch to streaming mode, but every append silently no-ops and the
save ends up empty. Require both IPC methods before streaming; otherwise
fall back to in-memory buffering. Adds a regression test.

Verified: tsc --noEmit clean; biome clean; vitest 183/183.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 16:39:53 -07:00
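The capability check described in 5c5cab6903 can be sketched as below. This is an illustration only, not the project's code: the RecordingApi shape, its parameter and result types, and the chooseRecordingMode helper are assumptions. Only the two method names (openRecordingStream, appendRecordingChunk) come from the commit message.

```typescript
// Hypothetical shape of the renderer-side IPC surface. Only the two method
// names are taken from the commit message; signatures are assumed.
interface RecordingApi {
  openRecordingStream?: (fileName: string) => Promise<{ success: boolean }>;
  appendRecordingChunk?: (
    fileName: string,
    chunk: Uint8Array,
  ) => Promise<{ success: boolean }>;
}

type RecordingMode = "stream" | "memory";

// Stream only when BOTH methods exist. With renderer/main version skew, an
// API that can open a stream but cannot append would otherwise save nothing.
function chooseRecordingMode(api: RecordingApi): RecordingMode {
  const canOpen = typeof api.openRecordingStream === "function";
  const canAppend = typeof api.appendRecordingChunk === "function";
  return canOpen && canAppend ? "stream" : "memory";
}
```

Checking both methods up front keeps the decision in one place, so the recorder never enters streaming mode with an append path that silently does nothing.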
neurot1cal 36d7d2bdd0 fix: tighten streaming failure handling from re-review
Addresses the CodeRabbit + Codex re-review of the prior commit.

- Normalize a rejected append (channel/handler error, not just a
  { success: false } result) into appendError, so the write queue never
  rejects and isStreaming() stays consistent after a failure (CodeRabbit).
- Handle a rejected open-stream IPC the same as a failed open: fall back
  to in-memory buffering instead of leaving the recorder stuck "pending"
  with an unhandled rejection (CodeRabbit).
- Discard a streamed webcam whose write failed even when the screen save
  succeeds. The cleanup gate is now per-recorder, so a webcam omitted from
  a successful screen-only save no longer leaks its stream and partial
  file (Codex).

Adds tests for the rejected-append and rejected-open paths.

Verified: tsc --noEmit clean; biome clean; vitest 182/182.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 16:28:50 -07:00
neurot1cal f3c5b8a65d fix: harden streaming lifecycle and lift it out of the IPC god-module
Addresses the review feedback on #658 (CodeRabbit + Codex) and the
structural notes from the quality pass.

Correctness:
- Compute the recorder's streaming state at finalize time, not at
  construction. A stream that fails to open is now reported as
  not-streamed, so its buffered chunks are saved as a complete in-memory
  fallback instead of being dropped (was total data loss on open failure).
- Await every in-flight chunk write before onstop resolves, so the main
  process never closes the write stream while a final chunk is still in
  flight (was truncating the tail of a recording under load).
- Open the disk write stream by awaiting its 'open' event, so a bad path
  or permission error rejects up front instead of being acknowledged as
  success and then silently dropping bytes.
- Close the stream and remove the partial file when a streamed recording
  is discarded or fails, so cancelled/failed runs don't leak descriptors
  or orphan partial recordings.
- Surface a mid-stream write failure as a rejected recording rather than
  saving a silently truncated file.

Structure:
- Extract the streaming concern into electron/ipc/recordingStream.ts
  (RecordingStreamRegistry) and src/hooks/recorderHandle.ts, out of the
  2.8k-line handlers.ts and the screen-recorder hook.
- Key write streams by output file name, removing the implicit
  recordingId/+1 contract that spanned the IPC boundary.
- Collapse the duplicated screen/webcam finalize blocks into one helper
  and the repeated duration-validity guard into one check; patch the
  screen and webcam durations in parallel.

Adds unit tests for the registry (real temp-dir fs) and the recorder
handle state machine (open-failure fallback, in-order writes awaited
before stop, mid-stream failure). Extends the vitest include glob to
collect electron-side tests.

Verified: tsc --noEmit clean; biome clean; vitest 180/180.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 16:09:39 -07:00
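One way to picture the "await every in-flight chunk write before onstop resolves" rule from f3c5b8a65d, together with the later "write queue never rejects" refinement in 36d7d2bdd0, is a serialized promise chain. The sketch below is an assumption-laden illustration rather than the contents of src/hooks/recorderHandle.ts: createChunkWriteQueue, AppendFn, and AppendResult are hypothetical names, and the append function stands in for the real IPC call.

```typescript
// Hypothetical result and callback types standing in for the real IPC call.
type AppendResult = { success: boolean; error?: string };
type AppendFn = (chunk: Uint8Array) => Promise<AppendResult>;

function createChunkWriteQueue(append: AppendFn) {
  // Each write is chained onto the previous one, so chunks land in order.
  let tail: Promise<void> = Promise.resolve();
  // First failure seen, whether a { success: false } result or a rejection.
  let appendError: string | null = null;

  return {
    enqueue(chunk: Uint8Array): void {
      tail = tail.then(async () => {
        if (appendError !== null) return; // skip writes after a failure
        try {
          const result = await append(chunk);
          if (!result.success) appendError = result.error ?? "append failed";
        } catch (err) {
          // Normalize a rejected append so the chain itself never rejects.
          appendError = err instanceof Error ? err.message : String(err);
        }
      });
    },
    // Resolves only after every queued write has settled; returns the first
    // error (or null) so the caller can reject the recording instead of
    // saving a truncated file. Assumes no new chunks arrive during drain.
    async drain(): Promise<string | null> {
      await tail;
      return appendError;
    },
  };
}
```

A stop handler would call drain() before asking the main process to close the stream, and treat a non-null result as a failed recording.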
aaravshirpurkar 8c7ea939ab fix: increase rounding snap tolerance to 4px to prevent background line bleed when padding is 0 2026-05-26 13:43:57 +05:30
neurot1cal 727e395fcf fix: stream long recordings to disk and patch WebM duration on save
Recordings longer than ~10 minutes silently fail to save (#616). The
renderer buffers the whole WebM as a Blob[], then on stop makes several
in-memory copies (fixWebmDuration -> arrayBuffer -> Buffer.from) before
writing. A long 1080p recording duplicates hundreds of MB several times
in the renderer, exceeds Electron's memory limit, and the renderer
crashes silently with no file saved.

Two changes:

1. Stream chunks to disk (originally @Amanuel2x's contribution in #617).
   Open an fs.WriteStream in the main process at recording start and send
   each ~1s ondataavailable chunk straight to disk over two new IPC calls
   (open-recording-stream, append-recording-chunk), so the renderer never
   holds more than a single chunk. A full in-memory fallback is preserved
   for environments where the IPC stream cannot open.

2. Patch the WebM Duration header on disk after the stream closes. Browser
   MediaRecorder writes WebM with no Duration element, so streamed files
   save with duration=N/A and the editor's seek bar, timeline, and any
   scrub/trim break. A new electron/recording/webm-duration.ts module
   rewrites the Duration element, writing to a temp file and renaming in
   place so a crash mid-write cannot corrupt the recording.

Streaming is opt-in: the screen recorder and the browser-only webcam
recorder stream to disk; native-capture webcam sidecars (Windows, macOS)
keep buffering in-memory, since their finalize path reads the recorder
blob directly to attach the webcam track.

Verified: tsc --noEmit clean; biome clean; vitest 166/166.

Closes #616
Supersedes #617

Co-Authored-By: Amanuel <amanuel@localboostnetworking.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 17:53:22 -07:00
Sid 54677960d0 Merge pull request #645 from auberginewly/fix/export-panel-hidden-after-zoom-select
fix(export): clear timeline selection when opening export panel
2026-05-25 10:59:13 -07:00
Sid 7f73f9089f Merge pull request #646 from AjTheSpidey/codex/editor-settings-scroll
Fix stale speed controls in the editor
2026-05-25 10:56:14 -07:00
Sid f6b0bbd3ed Merge pull request #653 from nachobh/roundingIncreased
Rounding increased
2026-05-25 10:52:56 -07:00
Ignacio Benito Herrero d32bb00959 Increased max rounding to 64 2026-05-25 10:42:43 +02:00
Ignacio Benito Herrero 3d129f85cd Added .import files to .gitignore 2026-05-25 10:38:26 +02:00
AjTheSpidey d856a52316 fix stale speed selection in editor 2026-05-23 18:32:52 +08:00
auberginewly 37eaacc07b fix(export): clear timeline selection when opening export panel
When a zoom/trim/speed region is selected, hasTimelineSelection is true
and the export panel is gated behind !hasTimelineSelection. Clicking the
Download button only switched activePanelMode locally in SettingsPanel
without clearing the selection in VideoEditor, so the export panel never
rendered.

Add onExportPanelOpen callback prop to SettingsPanel and call it on
Download button click to clear selectedZoomId, selectedTrimId, and
selectedSpeedId — making hasTimelineSelection false and unblocking the
export panel.

Complements PR #611 which fixed the bulk suggest-zooms path; this
covers the manual selection path.
2026-05-23 16:17:12 +08:00
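The gating described in 37eaacc07b can be reduced to a small state sketch. The identifiers selectedZoomId, selectedTrimId, selectedSpeedId, and hasTimelineSelection come from the commit message; the TimelineSelection interface, the string-or-null id type, and the clearTimelineSelection helper are assumptions made for illustration and are not the actual VideoEditor code.

```typescript
// Hypothetical container for the three selection ids named in the commit.
interface TimelineSelection {
  selectedZoomId: string | null;
  selectedTrimId: string | null;
  selectedSpeedId: string | null;
}

// The export panel is rendered only while this returns false.
function hasTimelineSelection(selection: TimelineSelection): boolean {
  return (
    selection.selectedZoomId !== null ||
    selection.selectedTrimId !== null ||
    selection.selectedSpeedId !== null
  );
}

// What an onExportPanelOpen callback would produce: every id cleared, so the
// gate above no longer blocks the export panel.
function clearTimelineSelection(): TimelineSelection {
  return { selectedZoomId: null, selectedTrimId: null, selectedSpeedId: null };
}
```

Clearing all three ids in one step avoids a state where the panel mode has switched to export but a leftover selection still hides the panel.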
Sid 34340c2b29 Merge pull request #526 from Sunwood-ai-labs/codex/allow-png-background-upload
[codex] Allow PNG custom background uploads
2026-05-22 20:37:25 -07:00
Siddharth 2dbdb27bb6 Merge remote-tracking branch 'origin/main' into codex/allow-png-background-upload
# Conflicts:
#	electron/ipc/handlers.ts
#	electron/main.ts
2026-05-22 20:33:19 -07:00
Sid fbd06fca48 Merge pull request #613 from auberginewly/feat/zoom-hold-preview
feat(zoom): hold-to-preview button for zoom focus editing (prototype for #612)
2026-05-22 20:16:28 -07:00
Siddharth 85d0dea9fc Merge remote-tracking branch 'origin/main' into feat/zoom-hold-preview
# Conflicts:
#	src/components/video-editor/VideoPlayback.tsx
2026-05-22 20:15:00 -07:00
Siddharth 259bfa9097 Merge remote-tracking branch 'origin/main' into feat/zoom-hold-preview 2026-05-22 20:12:51 -07:00
Sid b6b37e3718 Merge pull request #639 from siddharthvaddem/codex/fix-windows-paused-recording
[codex] Fix native Windows recording pause
2026-05-22 20:09:05 -07:00
Siddharth a50835e30f Merge remote-tracking branch 'origin/main' into codex/fix-windows-paused-recording
# Conflicts:
#	src/hooks/useScreenRecorder.ts
2026-05-22 20:08:26 -07:00
Sid dfd961393f Merge pull request #605 from EtienneLescot/codex/editor-defaults-ssot
Centralize editor defaults
2026-05-22 20:04:54 -07:00
Siddharth 9eaae72af1 fix: drop removed WEBCAM_TARGET width/height refs after main merge
PR #600 (now on main) removed WEBCAM_TARGET_WIDTH/HEIGHT and switched
this call site to width/height: 0 so the native helper picks the
camera's native dimensions. Align this branch with that so CI's
fresh PR-merge stops erroring on the undeclared identifiers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 20:02:25 -07:00
Siddharth d658ec40f5 Merge remote-tracking branch 'origin/main' into codex/editor-defaults-ssot 2026-05-22 20:00:54 -07:00
Siddharth 84b523df83 fix: drop unused imports and reorder in SettingsPanel
Removes MAX_PLAYBACK_SPEED and DEFAULT_WEBCAM_SIZE_PRESET (TS6133) and
runs biome's organize-imports to satisfy the Lint check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 19:54:13 -07:00
Sid 086039bf67 Merge pull request #600 from sagar290/webcam-layout-constraints
Fix vertical webcam recording layout
2026-05-22 19:47:31 -07:00
Sid b8e78ccbf5 Merge branch 'main' into codex/editor-defaults-ssot 2026-05-22 19:44:37 -07:00
Sid 3b7e78a16b Merge pull request #642 from siddharthvaddem/codex/fix-windows-webcam-sidecar
Fix native Windows webcam sidecar capture
2026-05-22 19:43:29 -07:00
Sid 97fa74aa92 Merge pull request #643 from AjTheSpidey/codex/fix-decimal-playback-speed
Fix decimal custom playback speeds
2026-05-22 19:42:34 -07:00
AjTheSpidey 0daf2295a3 fix: accept decimal custom speeds 2026-05-23 03:47:42 +08:00
EtienneLescot 10614c2950 Address webcam sidecar review feedback 2026-05-22 21:20:51 +02:00
Etienne Lescot b36a32d44b refactor: centralize editor defaults 2026-05-22 21:10:44 +02:00
EtienneLescot ca826d9088 Fix native Windows recording pause 2026-05-22 21:02:33 +02:00
EtienneLescot ef5855f1f4 Fix native Windows webcam sidecar capture
Record browser webcam sidecar when native Windows capture is active.

Add native webcam sidecar output and DirectShow NV12/YUY2 fallback.

Sample exported webcam frames by source timestamp.
2026-05-22 20:56:09 +02:00
Sid 9f7f498e22 Merge pull request #621 from LucaFontanot/refactor-cursor
refactor: Migrate the powershell cursor script into native cursor sampler
2026-05-20 21:23:09 -07:00
Sid a9df720554 Merge pull request #614 from creazyfrog/feature/rename-native-to-original
ux: rename 'Native' aspect ratio label to 'Original'
2026-05-20 21:06:36 -07:00
Sid 37ab35f5a8 Merge pull request #603 from AjTheSpidey/codex/multi-source-recording-editor
test: cover MP4 editor export
2026-05-20 21:05:40 -07:00
Sid 4a55dcdb4c Merge pull request #618 from LucaFontanot/i18n-ita
i18n: Add italian
2026-05-20 20:37:48 -07:00
Luca Fontanot cfe6b9e594 fix: Thread detach before teardown is race-prone. 2026-05-20 11:53:50 +02:00
Luca Fontanot 7826cc44e3 fix: Add <algorithm> include for std::max 2026-05-20 11:49:23 +02:00
AjTheSpidey 57eed2c829 test: tighten export e2e guards 2026-05-20 16:38:14 +08:00
Sid 12350520cc Merge pull request #622 from marcgabe15/fix-locale-save-diagnostics
Fix locale save diagnostics
2026-05-19 20:59:52 -07:00
Marc Diaz a9181e6782 fix: update translations 2026-05-19 23:22:50 -04:00
Marc Diaz 3be317b7f9 refactor: update support key 2026-05-19 22:41:38 -04:00
Luca Fontanot 49ee3ac0db refactor: Migrate the powershell cursor script into the native cursor-sampler.cpp 2026-05-20 00:45:33 +02:00
AjTheSpidey fd2bf21e7e test: wait for export bytes in e2e 2026-05-20 04:07:38 +08:00
AjTheSpidey 94e848452e test: cover MP4 editor exports 2026-05-20 00:14:04 +08:00
Luca Fontanot de3c2ef5bd i18n: CR update src/i18n/locales/it/shortcuts.json
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-05-19 15:14:29 +02:00
Luca Fontanot c4acb9d6e7 i18n: Fix application language selection with electronAPI setLocale 2026-05-19 14:34:45 +02:00
Luca Fontanot a1a75badde i18n: Add italian (and missing russian) language to window main locale 2026-05-19 14:33:16 +02:00
Luca Fontanot 7319ec2db6 i18n: Fixed language sorting and added italian to config and test 2026-05-19 14:19:41 +02:00
Luca Fontanot de2cc6546a i18n: Added italian 2026-05-19 14:15:02 +02:00
Rohit Sharma 9348b9c7a0 ux: rename 'Native' aspect ratio label to 'Original'
The aspect ratio dropdown showed 'Native', which is video-industry jargon
that isn't self-explanatory for most users. Renaming it to 'Original'
makes it immediately clear that this option preserves the source video's
own dimensions.

The internal `"native"` value in the AspectRatio union type is unchanged;
only the display string returned by `getAspectRatioLabel()` is updated.

Closes #607

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 01:19:50 -07:00
Sid 00fbd95452 Merge pull request #609 from kwakseongjae/i18n/ko-kr-missing-keys
i18n(ko-KR): fill missing Korean translation keys
2026-05-18 21:00:31 -07:00
kwakseongjae dd413785f3 fix(i18n): remove duplicate keys in ko-KR settings after main merge
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 12:30:07 +09:00
auberginewly b25c2c8363 revert(zoom): drop selected-zoom preview override, preview the current frame
The a686fa0 override replaced findDominantRegion's resolved region with the
raw stored region (forcing strength=1 / transition=null). findDominantRegion
already resolves focus via getResolvedFocus — for focusMode:"auto" it
interpolates the cursor-followed focus from telemetry and applies clamp/blend/
transition. The override bypassed all of that, so previewing an auto-focus
zoom showed a stale static focus and an instant un-eased zoom that did not
match real playback/export.

Hold-to-preview now shows the natural zoom for the current playhead frame
(true before/after compare). The isPreviewingZoom flag is kept — it only
disables the un-zoomed editing guard so findDominantRegion's result is shown.
Previewing while the playhead is outside any zoom shows no zoom by design.
2026-05-19 11:08:54 +08:00
auberginewly a686fa012a fix(zoom): address PR review — preview selected zoom + keyboard a11y
- VideoPlayback: while holding Preview, render the SELECTED zoom at full
  strength regardless of the playhead, instead of whatever findDominantRegion
  returns at currentTime (which is none/another zoom when the playhead is
  outside the selection). Uses getZoomScale/getRotation3D for the region's
  configured scale and 3D preset.
- SettingsPanel: require both onZoomPreviewStart && onZoomPreviewEnd to render
  the button (full lifecycle), and add keyboard support — Space/Enter keydown
  (repeat-guarded) starts preview, keyup/blur ends it.
2026-05-19 11:02:28 +08:00
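The keyboard behaviour listed for SettingsPanel in a686fa012a (Space/Enter keydown starts the preview, repeats are ignored, keyup or blur ends it) can be expressed as a pure state transition. This is a sketch under assumptions: KeyLike and nextPreviewState are hypothetical names, and the real component presumably wires DOM events to onZoomPreviewStart/onZoomPreviewEnd rather than using a reducer like this.

```typescript
// Minimal stand-in for the fields of a keyboard or focus event that matter.
type KeyLike = {
  type: "keydown" | "keyup" | "blur";
  key?: string;
  repeat?: boolean;
};

// Returns the next "is previewing" state for a hold-to-preview button.
function nextPreviewState(isPreviewing: boolean, ev: KeyLike): boolean {
  if (ev.type === "blur") return false; // losing focus always ends the hold
  const isHoldKey = ev.key === " " || ev.key === "Enter";
  if (!isHoldKey) return isPreviewing; // other keys leave the state alone
  if (ev.type === "keydown") {
    // Auto-repeat keydown events must not restart the preview.
    return ev.repeat ? isPreviewing : true;
  }
  return false; // keyup of Space/Enter ends the preview
}
```

Ending the preview on blur as well as keyup covers the case where focus moves away while the key is still held and the keyup never reaches the button.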
auberginewly 24ce67b5a7 i18n(zoom): clarify previewHold wording to "preview zoom effect"
Make the label explicit about what holding previews (the zoom effect),
across all 11 locales.
2026-05-19 10:51:22 +08:00
auberginewly 2993a57853 feat(zoom): add hold-to-preview button for zoom focus editing
When a zoom region is selected and paused, the editor shows the full
un-zoomed frame for focus-point placement. This adds a press-and-hold
"Preview" button so editors can momentarily see the zoomed result at the
current focus + depth — like a before/after compare — without entering
playback.

- VideoPlayback: new transient isPreviewingZoom prop; shouldShowUnzoomedView
  now also requires !isPreviewingZoom, so the zoom transform is applied at
  the playhead while previewing
- VideoEditor: isPreviewingZoom state wired to VideoPlayback and to
  onZoomPreviewStart/End handlers
- SettingsPanel: hold button in the zoom controls (pointer down/up/leave/
  cancel)
- i18n: zoom.previewHold added across all 11 locales

Prototype for #612 — placement (panel vs overlay) and hold-vs-toggle still
open for maintainer direction.
2026-05-19 10:48:44 +08:00
Sid 36ceca38f9 Merge pull request #611 from auberginewly/fix/suggest-zooms-export-hidden
fix(zoom): keep export panel visible after Suggest Zooms
2026-05-18 19:26:54 -07:00
Sid 7bae09fc88 Merge branch 'main' into i18n/ko-kr-missing-keys 2026-05-18 19:26:05 -07:00
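auberginewly bc8655a4bb fix(zoom): stop batch zoom suggestions from taking over the selection, so the export panel is no longer hidden
handleZoomSuggested called setSelectedZoomId for every suggested zoom it added, so once the loop
finished the last auto-zoom was left selected. SettingsPanel renders the export panel only when
!hasTimelineSelection, so after clicking "Auto-add zooms" the export button disappeared and the
user had to click the clip track to deselect before exporting.

The batch suggest path no longer has the selection side effect; a single manual add
(handleZoomAdded) still auto-selects as before.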

Closes #610
2026-05-19 10:23:37 +08:00
Sid e7ca9ecb5d Merge pull request #573 from EtienneLescot/feat/macos-native-capture-pipeline
feat: add macOS native capture and cursor pipeline
2026-05-18 08:03:21 -07:00
kwakseongjae 0e130b6d49 feat(i18n): fill missing Korean (ko-KR) translation keys
Adds Korean translations for keys that had accumulated in en/* but were
missing from ko-KR/* as other features landed after the initial Korean
localization.

common.json (22 keys, matching macOS Korean menu standards):
- actions.{undo, redo, cut, copy, paste, selectAll, minimize, reload,
  forceReload, toggleDevTools, actualSize, zoomIn, zoomOut,
  toggleFullScreen, recordingStatus, about, services, hide, hideOthers,
  unhide}

settings.json (7 keys):
- zoom.customScale, zoom.position.{title, x, y, hint}
- layout.noWebcam
- effects.on

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 22:37:25 +09:00
Sagar Dash b5c105b5d6 fix: respect native webcam orientation on Windows 2026-05-18 18:28:13 +06:00
Etienne Lescot 788b0a2e9f fix(cursor): default canvas clipping off 2026-05-18 12:38:40 +02:00
auberginewly 8516707880 fix(cursor): address review findings — aria-label, 3D transform sync, i18n
- Add aria-label to cursorClipToBounds Switch so screen readers announce the control
- Mirror composite3D 3D transform onto nativeCursorClipRef so the cursor clip layer
  rotates with the video during 3D zoom regions (cursor stays outside preserve-3d
  so clip-path continues to work; only the transform string is mirrored)
- Fix vi cursor.motionBlur: "Mờ chuyển động" → "Làm mờ chuyển động" to match
  effects.motionBlur phrasing
- Fix zh-TW cursor.motionBlur: "運動模糊" → "動態模糊" to match effects.motionBlur
2026-05-18 12:19:48 +02:00
auberginewly 58722a1d84 feat(cursor): add cursorClipToBounds setting with i18n translations
Add a cursor.clipToBounds toggle to the Settings panel (default on) that controls
whether the native cursor is clipped to the video canvas boundary in both preview
and export. Wire up 11 locale files (ar, en, es, fr, ja-JP, ko-KR, ru, tr, vi,
zh-CN, zh-TW) with the new cursor settings section.
2026-05-18 12:19:48 +02:00
auberginewly 65bb5bc8dd feat(cursor): clip native cursor to camera-aware video bounds in preview and export
- Add nativeCursorClipRef div (outside preserve-3d) with CSS inset() clip-path that
  tracks the camera-transformed video boundary, including border-radius
- Add cameraAwareMaskRect() in FrameRenderer that computes the same boundary for
  Canvas 2D clip in the export path; remove stage-clamping so rounded corners match
  the preview's inset() behavior when zoom/pan pushes the mask off-stage
- Cache maskBorderRadius in LayoutCache so both shadow and direct composite paths
  can apply camera-aware rounded clipping
- Fix double mask.x offset introduced by nativeCursorMaskRef; replace mask div with
  clip-path on the outer wrapper
- Normalize cursor size relative to maskRect.width so preview and export scale match
- Clip cursor to canvas boundary and hide on non-recorded display
- Wire cursorClipToBounds flag through FrameRenderConfig and VideoExporter
2026-05-18 12:19:47 +02:00
Etienne Lescot 31e394fe1c fix: address follow-up review comments 2026-05-18 12:19:47 +02:00
Etienne Lescot e708ae973e fix: address native mac review feedback 2026-05-18 12:19:47 +02:00
Etienne Lescot 179047b834 fix: isolate macOS native capture by platform 2026-05-18 12:19:47 +02:00
Etienne Lescot df6da28ad2 fix: improve macOS HUD interactions and audio preview 2026-05-18 12:19:47 +02:00
Etienne Lescot c1ba82fc71 chore: sync i18n locale keys 2026-05-18 12:19:47 +02:00
Etienne Lescot 73870c65ef feat: support pausing macOS native recordings 2026-05-18 12:19:47 +02:00
Etienne Lescot b2f9afab8c feat: add macOS editable cursor overlay support 2026-05-18 12:19:47 +02:00
Etienne 6a4ddc5dad feat: compose mac native capture with media 2026-05-18 12:19:05 +02:00
Etienne b9e2134749 feat: add macos screencapturekit helper 2026-05-18 12:19:05 +02:00
Etienne 7102110de5 chore: ignore macos native build outputs 2026-05-18 12:19:05 +02:00
EtienneLescot fbdc7d5697 feat: scaffold macOS native capture pipeline 2026-05-18 12:19:05 +02:00
Sid 6018ba0fe1 Merge pull request #594 from EtienneLescot/codex/timeline-empty-space-scrub
[codex] add empty timeline scrubbing
2026-05-16 12:35:30 -07:00
Sid e50f65e3b6 Merge pull request #597 from EtienneLescot/codex/fix-editable-cursor-native
Fix editable cursor mode for native Windows capture
2026-05-16 12:34:57 -07:00
Sid 80e2c3545d Merge pull request #598 from EtienneLescot/codex/fix-high-quality-export
Clarify export resolution presets
2026-05-16 12:34:33 -07:00
Sid 939de7081b Merge pull request #593 from EtienneLescot/codex/fix-native-aspect-ratio-fallback
[codex] fix native aspect ratio fallback
2026-05-16 12:34:05 -07:00
Sid 55dfca05aa Merge pull request #596 from EtienneLescot/codex/fix-language-prompt-clicks
[codex] fix language prompt HUD clicks
2026-05-16 12:33:31 -07:00
EtienneLescot 5e76170307 Clarify MP4 export resolution presets 2026-05-16 20:19:00 +02:00
EtienneLescot 0d3c4df453 fix: relax cursor capture helper validation 2026-05-16 13:54:51 +02:00
EtienneLescot 9d5be8beb4 fix: enforce cursor-free WGC editable mode 2026-05-16 13:44:08 +02:00
EtienneLescot c4eb3003be fix timeline scrub lint formatting 2026-05-16 13:20:20 +02:00
EtienneLescot 7bf07611c3 fix language prompt hud clicks 2026-05-16 12:53:18 +02:00
EtienneLescot c9985a08d4 add empty timeline scrubbing 2026-05-16 12:21:30 +02:00
EtienneLescot 55bc0c9836 fix native aspect ratio fallback 2026-05-16 12:20:54 +02:00
Sid b0293e7d93 Merge pull request #217 from EtienneLescot/feat/cursor-pipeline
feat: add Windows native capture and cursor pipeline
2026-05-10 14:21:01 -07:00
Siddharth 1d36ad239d Merge remote-tracking branch 'origin/main' into feat/cursor-pipeline
# Conflicts:
#	src/components/video-editor/VideoEditor.tsx
2026-05-10 14:17:42 -07:00
Siddharth b41c4f49fc remove macos cursor highlight; wire telemetry session for non-windows 2026-05-10 14:12:54 -07:00
Sid 201729e8ab Merge pull request #536 from yusufm/codex/export-diagnostics
Improve export failure diagnostics
2026-05-10 12:03:24 -07:00
EtienneLescot 0720a6d802 fix: restore native cursor wiring after upstream rebase 2026-05-10 15:19:19 +02:00
EtienneLescot 8137e816fd fix: normalize native Windows audio for AAC 2026-05-10 15:11:38 +02:00
EtienneLescot 4e5b7a4f5a test: log source copy fast path blockers 2026-05-10 15:11:38 +02:00
EtienneLescot afd5e35730 docs: remove README developer notes link 2026-05-10 15:11:37 +02:00
EtienneLescot ac2e34e58c fix: preserve Windows system audio on export 2026-05-10 15:11:37 +02:00
EtienneLescot 4d3bce0f20 feat: add Windows cursor capture mode 2026-05-10 15:11:36 +02:00
EtienneLescot b349c0a27c fix: downmix multichannel export audio 2026-05-10 15:11:35 +02:00
EtienneLescot 238fc97c6d fix: preserve cursor and audio in exports 2026-05-10 15:11:34 +02:00
EtienneLescot 0d9e821171 fix: guard source copy while native cursor data loads 2026-05-10 15:11:34 +02:00
EtienneLescot 34e22d001c fix: restore source copy export fast path 2026-05-10 15:11:33 +02:00
EtienneLescot 722f630117 fix: address maintainer platform regressions 2026-05-10 15:11:32 +02:00
EtienneLescot f4fc7fab9e fix: preserve native cursor click interactions 2026-05-10 15:11:31 +02:00
EtienneLescot f91300a1b7 fix: make native cursor click bounce visible 2026-05-10 15:11:31 +02:00
EtienneLescot 82bffefa54 fix: harden native recorder review paths 2026-05-10 15:11:30 +02:00
EtienneLescot 826790fe52 fix: address native cursor review findings 2026-05-10 15:11:29 +02:00
EtienneLescot 9b85cacec7 test: harden Windows cursor diagnostic 2026-05-10 15:11:28 +02:00
EtienneLescot f76fb423be docs: backlog native cursor click bounce 2026-05-10 15:11:28 +02:00
EtienneLescot e33d2205e6 fix: record native cursor click events 2026-05-10 15:11:28 +02:00
EtienneLescot 3a32a140cc fix: capture quick native cursor clicks 2026-05-10 15:11:27 +02:00
EtienneLescot d0341580d6 feat: apply native cursor visual effects 2026-05-10 15:11:27 +02:00
EtienneLescot ab3d38d90f fix: address native capture review feedback 2026-05-10 15:11:25 +02:00
EtienneLescot c7b43a50ef fix: resolve selected Windows microphone 2026-05-10 15:11:24 +02:00
EtienneLescot 0ebf5c143b test: add Windows native checklist smoke test 2026-05-10 15:11:23 +02:00
EtienneLescot c0deb03414 fix: gate Windows cursor settings 2026-05-10 15:11:22 +02:00
EtienneLescot 38d727eb8e fix: skip black webcam warmup frames 2026-05-10 15:11:21 +02:00
EtienneLescot 84484d6167 fix: support DirectShow virtual webcams 2026-05-10 15:11:21 +02:00
EtienneLescot fdcd882058 fix: honor selected native Windows webcam 2026-05-10 15:11:20 +02:00
EtienneLescot fb85f66875 feat: add native Windows webcam composition 2026-05-10 15:11:19 +02:00
EtienneLescot 048189da72 feat: add native Windows window capture 2026-05-10 15:11:18 +02:00
EtienneLescot 7929aea908 fix: align native mixed audio timeline 2026-05-10 15:11:17 +02:00
EtienneLescot 588a0a7be8 feat: add native Windows microphone capture 2026-05-10 15:11:17 +02:00
EtienneLescot 062cf2a87c feat: add native Windows recorder helper 2026-05-10 15:11:16 +02:00
EtienneLescot d21e5eb34c fix: restore native cursor preview and export 2026-05-10 15:11:15 +02:00
EtienneLescot 87240a919e fix: align native cursor preview and export 2026-05-10 15:11:12 +02:00
EtienneLescot 3d1d4a5ff0 fix: avoid unsupported display media min constraint 2026-05-10 15:11:10 +02:00
EtienneLescot ef36da4a4f feat: complete windows cursor assets 2026-05-10 15:11:09 +02:00
EtienneLescot bb0dec7344 feat: add windows cursor preview diagnostics 2026-05-10 15:11:07 +02:00
EtienneLescot 28ff0fb7bf fix: restore cursor pipeline build after rebase 2026-05-10 15:11:03 +02:00
Etienne Lescot e9650225ba feat: add cursor overlay pipeline for high-fidelity cursor recording and playback
- Implement native bridge for Windows cursor capture via PowerShell/C#
- Add cursor-free capture using getDisplayMedia with setDisplayMediaRequestHandler
- Update video player and exporters to support native cursor telemetry
- Enable system audio capture on Windows via WASAPI loopback
- Add interpolation for smoother cursor movement in playback and export
- Improve cursor scaling and visibility handling in editor and playback
2026-05-10 15:11:00 +02:00
Etienne Lescot 248ebabcf1 feat: add windows native cursor capture and rendering 2026-05-10 15:10:56 +02:00
Etienne Lescot 44f59bfa89 feat: add unified native bridge foundation 2026-05-10 15:10:54 +02:00
Etienne Lescot 6f099b3483 feat: add cursor overlay pipeline 2026-05-10 15:10:53 +02:00
Sunwood-ai-labs d3b51e84f2 Merge remote-tracking branch 'upstream/main' into codex/allow-png-background-upload
# Conflicts:
#	src/components/video-editor/SettingsPanel.tsx
#	src/i18n/locales/ja-JP/settings.json
2026-05-10 14:30:22 +09:00
Sid 162e734b76 Merge pull request #535 from yusufm/codex/lazy-load-editor
Lazy load the editor bundle
2026-05-09 22:22:33 -07:00
Yusuf Mohsinally 2b8ec9e3a5 Merge remote-tracking branch 'origin/main' into codex/export-diagnostics
# Conflicts:
#	src/components/video-editor/VideoEditor.tsx
2026-05-09 20:03:58 -07:00
Siddharth e3d4a330df ui revamp 2026-05-09 19:18:16 -07:00
Siddharth 7bbb855e8e update readme 2026-05-09 17:32:43 -07:00
Siddharth 52cb709a88 readme update 2026-05-09 17:12:27 -07:00
Sid 68a95c642a Merge pull request #567 from siddharthvaddem/chore/bump-nix-1.4.0
chore: bump nix package to v1.4.0
2026-05-09 17:05:05 -07:00
github-actions[bot] 2ae7aca185 chore: bump nix package to v1.4.0 2026-05-10 00:04:08 +00:00
Sid 8afca89520 Merge pull request #566 from siddharthvaddem/chore/nix-auto-bump-workflow
chore: add nix package auto-bump workflow
2026-05-09 16:59:34 -07:00
Siddharth 7feb05cca7 add nix package auto-bump workflow
On every published GitHub Release, opens a PR bumping nix/package.nix:
- version => the new release version
- npmDepsHash => freshly computed via prefetch-npm-deps package-lock.json

Mirrors the brew + winget release-bump pattern, but lands the change in
this repo (not a separate tap), so it opens a PR instead of pushing
directly. Uses GITHUB_TOKEN — note that PRs created by GITHUB_TOKEN do
not auto-trigger CI; the diff is two lines, easy to review and merge.

Refs the long-standing manual-bump pain (e.g. PR #504 fixing a stale
hash). After this lands, Nix users get new releases without anyone
having to remember the manual edit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 16:58:51 -07:00
Sid 9ac4e778f8 Merge pull request #565 from siddharthvaddem/chore/winget-releaser-workflow
chore: add winget-releaser workflow
2026-05-09 16:42:27 -07:00
Siddharth ed825d8b37 add winget-releaser workflow
Auto-publishes new releases to winget via vedantmgoyal9/winget-releaser.
On every "released" event (not pre-release), the action opens a PR against
microsoft/winget-pkgs bumping SiddharthVaddem.OpenScreen.

Requires:
- WINGET_ACC_TOKEN secret: classic PAT with public_repo scope
  (fine-grained PATs are NOT supported by the action).
- A fork of microsoft/winget-pkgs under siddharthvaddem (or pass fork-user
  if forked elsewhere).

Closes #299

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 16:39:31 -07:00
Siddharth b48370e3d0 update readme w brew 2026-05-09 16:34:26 -07:00
Sid 8c0c555640 Merge pull request #564 from siddharthvaddem/chore/homebrew-cask-final-style
chore: final homebrew cask style cleanup
2026-05-09 16:28:49 -07:00
Siddharth 24be97bae7 fix: final homebrew cask style + audit cleanup
- Drop unnecessary verified: stanza (URL host matches homepage host).
- Add blank line between sha256 and url inside on_arm/on_intel
  (rubocop treats them as separate stanza groups).
- Keep no blank line between on_arm and on_intel blocks
  (same outer stanza group).

After re-running the bump workflow, the cask passes both
brew audit --cask and brew style --cask cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 16:28:15 -07:00
Sid c820b416f9 Merge pull request #563 from siddharthvaddem/chore/homebrew-cask-audit-fix
chore: fix homebrew cask audit warnings
2026-05-09 16:25:08 -07:00
Siddharth f42c478725 fix homebrew cask audit warnings
- Use #{version} interpolation in URLs so brew detects them as versioned
  (silences "Use sha256 :no_check when URL is unversioned").
- Drop blank line between on_arm and on_intel (same stanza group).
- Alphabetize zap trash array.
- Add verified: stanza for the GitHub release URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 16:22:58 -07:00
Sid 64c2a43006 Merge pull request #562 from siddharthvaddem/chore/homebrew-cask-workflow
chore: add homebrew cask bump workflow
2026-05-09 16:07:47 -07:00
Siddharth 4a0878c3d0 add homebrew cask bump workflow
Auto-updates the openscreen Homebrew tap on each published release:
finds the macOS DMGs, computes sha256, and rewrites Casks/openscreen.rb
in siddharthvaddem/homebrew-openscreen.

Requires HOMEBREW_TAP_TOKEN secret with contents:write on the tap repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 16:03:50 -07:00
Sid d8da26a41a Merge pull request #561 from auberginewly/fix/electron-screen-capture-permissions
fix(macOS): fix three screen capture permission issues in Electron layer
2026-05-09 15:10:47 -07:00
Sid f0699cc8b0 Merge pull request #507 from hthienloc/add-vietnamese-i18n-1022783609047552672
Add Vietnamese i18n support (vi locale)
2026-05-09 14:48:31 -07:00
Siddharth 3ad3e22a16 test(i18n): add vi to tutorialHelpTranslations locale map 2026-05-09 14:43:56 -07:00
Siddharth 2381e48a46 Merge main into add-vietnamese-i18n-1022783609047552672
Resolve conflict in electron/i18n.ts by keeping both `ar` (from main) and `vi` (from this branch). Also add `vi` to SUPPORTED_LOCALES in src/i18n/config.ts so Vietnamese is selectable in the language picker.
2026-05-09 14:35:03 -07:00
Siddharth 68c35ff01c zoom precision position 2026-05-09 14:32:50 -07:00
auberginewly 3dd7b85ebb fix(build): remove misplaced entitlement key from mac.extendInfo
com.apple.security.device.audio-input is an entitlement key and should
only appear in macos.entitlements. Placing it in extendInfo writes it
into Info.plist where it has no effect and is misleading.

The correct entry already exists in macos.entitlements; this removes
the redundant, incorrectly-placed duplicate.
2026-05-10 05:30:48 +08:00
auberginewly be4e2d0c94 fix(electron/macOS): proactively check screen recording permission on startup
Microphone permission is checked at startup via getMediaAccessStatus, and
camera has a dedicated request-camera-access IPC handler, but screen
recording relied entirely on desktopCapturer.getSources() to implicitly
trigger the TCC prompt — causing the permission dialog to reappear on
every launch (issue #558).

Note: askForMediaAccess() only accepts "microphone" | "camera"; screen
recording TCC is triggered via desktopCapturer.getSources() instead.

Fix:
- Import desktopCapturer in main.ts
- Call getMediaAccessStatus("screen") in app.whenReady(); trigger the
  TCC prompt via getSources when status is "not-determined"
- Add request-screen-access IPC handler symmetric to request-camera-access
2026-05-10 05:30:42 +08:00
auberginewly c9b6074626 fix(electron): add screen and display-capture to Electron permission allowlists
setPermissionCheckHandler and setPermissionRequestHandler only allowed
["media", "audioCapture", "microphone", "videoCapture", "camera"], causing
any renderer-side getUserMedia/desktopCapturer request using a screen source
to be silently denied by Electron before macOS TCC is ever consulted.

Fix: add "screen" and "display-capture" to both handler allowlists.
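
A minimal sketch of the shared allowlist described above, assuming both handlers consult one helper. The name `isPermissionAllowed` is hypothetical and the Electron wiring is shown only in comments, so the block stays runnable without Electron.

```typescript
// Permission names taken from the commit message: the original five plus the two added ones.
const ALLOWED_PERMISSIONS: ReadonlySet<string> = new Set([
  "media",
  "audioCapture",
  "microphone",
  "videoCapture",
  "camera",
  "screen",
  "display-capture",
]);

// Hypothetical helper shared by the check handler and the request handler.
function isPermissionAllowed(permission: string): boolean {
  return ALLOWED_PERMISSIONS.has(permission);
}

// Assumed wiring in the main process (not executed here):
//   session.setPermissionCheckHandler((_wc, permission) => isPermissionAllowed(permission));
//   session.setPermissionRequestHandler((_wc, permission, callback) =>
//     callback(isPermissionAllowed(permission)));
```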
2026-05-10 05:24:19 +08:00
Siddharth c1f6cf67b2 loc first and then export processing 2026-05-09 11:59:52 -07:00
Siddharth 5bd17f4346 fix layout 2026-05-09 11:46:09 -07:00
Sid d3e397e249 Merge pull request #399 from muratclk/fix/trim-handle-boundary-clamp
fix: clamp trim handle end position to timeline boundary
2026-05-09 10:19:12 -07:00
Murat Çelik c771bf8bb9 fix: clamp trim handle end position to timeline boundary
The right-side trim handle could be dragged past the end of the
timeline because clampSpanToBounds did not cap the computed end
value at totalMs. This adds Math.min(…, totalMs) so the handle
snaps to the timeline edge.
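
The clamp can be sketched as follows. The `Span` shape and the exact signature are assumptions, since the real `clampSpanToBounds` is not shown in this log; only the `Math.min(…, totalMs)` cap comes from the message.

```typescript
// Hypothetical span shape, in milliseconds.
interface Span {
  start: number;
  end: number;
}

function clampSpanToBounds(span: Span, totalMs: number): Span {
  // Keep the start inside [0, totalMs].
  const start = Math.max(0, Math.min(span.start, totalMs));
  // Cap the end at totalMs so the right handle cannot pass the timeline edge,
  // and never let it fall before the start.
  const end = Math.min(Math.max(span.end, start), totalMs);
  return { start, end };
}
```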

Fixes #393
2026-05-09 10:07:01 -07:00
Sid 38f2044967 Merge pull request #549 from Ayusman-Singhal/feat/no-webcam-layout-preset
feat: add 'No Webcam' layout preset to hide webcam in final recording
2026-05-09 10:03:45 -07:00
Sid b4f7b4c182 Merge pull request #518 from makaradam/feature/custom-zoom-slider-clean
feat: add custom zoom slider with continuous scale control (#513)
2026-05-09 09:14:37 -07:00
Sid e880f05866 Merge pull request #504 from 0david0mp/fix/package.nix
fix: bumped npmDepsHash on package.nix
2026-05-09 09:02:22 -07:00
Sid b7c85a9b4e Merge pull request #546 from psychosomat/feature/add-russian-localization
Add Russian localization
2026-05-09 08:57:12 -07:00
Sid bc7c51ecdf Merge branch 'main' into feature/add-russian-localization 2026-05-09 08:55:16 -07:00
makaradam 42127e647f fix: add NaN guard in handleZoomCustomScaleChange before state update 2026-05-09 11:23:37 +02:00
makaradam f3dcbf2867 fix: address code review feedback on custom zoom slider
- Clamp and NaN-guard customScale in getZoomScale (defensive sanitization)
- Set customScale on preset button click so slider stays green
- Set customScale on new zoom region creation so slider lights up immediately
2026-05-09 11:23:37 +02:00
makaradam f30090bf88 fix: sanitize customScale in getZoomScale and fix isCustomActive styling 2026-05-09 11:23:36 +02:00
makaradam 37215531c2 feat: add custom zoom slider with continuous scale control (#513)
Adds a Radix UI slider below the zoom preset buttons allowing any scale
between 1.0x and 5.0x. When the slider value matches a preset exactly,
that preset button also shows as active.

- Add `customScale?: number` to `ZoomRegion` and `getZoomScale()` helper
  that returns customScale when set, falling back to ZOOM_DEPTH_SCALES[depth]
- Overlay indicator, playback renderer, and frame exporter all use
  getZoomScale() so preview, playback, and export are consistent
- Fix focus clamping in zoomRegionUtils and frameRenderer to use actual
  scale instead of depth-based preset scale, preventing zoom drift with
  custom values
- Fix drag boundary in VideoPlayback to use clampFocusToScale with the
  actual scale so the full canvas is clickable at high custom zoom levels
- Timeline item label shows custom scale value when set
- Slider styled dark with green thumb/fill when a custom (non-preset) value is active
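
A rough sketch of the `getZoomScale()` fallback, including the clamp and NaN guard added in the follow-up fixes. The preset values and the `ZoomDepth` type are placeholders, not the project's real `ZOOM_DEPTH_SCALES`; only the 1.0x to 5.0x range comes from the message.

```typescript
type ZoomDepth = 1 | 2 | 3;

// Placeholder presets for illustration only.
const ZOOM_DEPTH_SCALES: Record<ZoomDepth, number> = { 1: 1.25, 2: 1.5, 3: 2.0 };
const MIN_SCALE = 1.0;
const MAX_SCALE = 5.0;

interface ZoomRegion {
  depth: ZoomDepth;
  customScale?: number;
}

function getZoomScale(region: ZoomRegion): number {
  const custom = region.customScale;
  // Unset or non-finite custom values fall back to the depth preset.
  if (custom === undefined || !Number.isFinite(custom)) {
    return ZOOM_DEPTH_SCALES[region.depth];
  }
  // Finite custom values are clamped into the slider range.
  return Math.min(MAX_SCALE, Math.max(MIN_SCALE, custom));
}
```

Routing preview, playback, and export through one helper like this is what keeps the three paths consistent.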

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 11:23:36 +02:00
Sid 770a872861 Merge pull request #521 from makaradam/feature/save-dialog-redesign
feat: replace native OS close dialog with custom in-app dialog
2026-05-08 20:14:43 -07:00
Sid e170fe1a83 Merge pull request #554 from creazyfrog/fix/missing-NSScreenCaptureUsageDescription
fix(macOS): add NSScreenCaptureUsageDescription and screen-capture entitlement
2026-05-08 20:07:32 -07:00
Sid 9af318561f Merge pull request #512 from AbhinRustagi/feature/remember-last-export-folder
feat: Add exportFolder to user preferences
2026-05-08 19:30:56 -07:00
Sid b525571f71 Merge pull request #552 from marcgabe15/feature/diagnostics
add diagnostics report
2026-05-08 08:08:29 -07:00
Sid fe93050089 Merge pull request #551 from marcgabe15/fix/tests
fix: tests + how to write them
2026-05-08 08:04:54 -07:00
Trivenzaa-Admin f47fa6bdca fix(macos): add NSScreenCaptureUsageDescription and screen-capture entitlement
Without NSScreenCaptureUsageDescription in Info.plist, macOS silently
blocks desktopCapturer.getSources(), breaking window detection on macOS
10.15+. Also adds the com.apple.security.device.screen-capture entitlement
to macos.entitlements alongside the existing camera and audio-input entries.

Fixes #548
2026-05-08 01:48:52 -07:00
Marc Diaz a0c423de67 add diagnostics report 2026-05-08 00:00:30 -04:00
Marc Diaz c9980b0dca fix: tests + how to write them 2026-05-07 23:22:32 -04:00
AbhinRustagi 1aac6eddb0 Merge branch 'main' of github.com:siddharthvaddem/openscreen into feature/remember-last-export-folder 2026-05-08 05:29:14 +05:30
AbhinRustagi 25cfd2777f fix: resolve comments 2026-05-08 05:24:40 +05:30
Ayusman Singhal ada1f434f7 feat: add 'No Webcam' layout preset to hide webcam in final recording
Adds a new 'No Webcam' option to the webcam layout preset dropdown in the editor. When selected, the webcam feed is completely hidden from both the preview and the exported video, allowing users who recorded with a webcam to exclude it from the final output.

- Add 'no-webcam' to WebcamLayoutPreset type union and preset map

- Handle 'no-webcam' in computeCompositeLayout (returns webcamRect: null)

- Add 'no-webcam' case in project persistence normalization

- Add 'No Webcam' option to the layout preset dropdown in SettingsPanel

- Add 'noWebcam' i18n translation key (en)
2026-05-07 12:19:48 +05:30
psychosomat 9336e3d3c6 Fix Russian translation typo and reorder imports 2026-05-06 13:16:21 +03:00
psychosomat 6130c66be6 Add Russian localization 2026-05-06 12:55:01 +03:00
Siddharth 899504f8e2 fix export mouse overlay 2026-05-05 22:02:21 -07:00
Siddharth 6a6caf618b fix build 2026-05-05 20:29:53 -07:00
Sid b6af435e7f Merge pull request #529 from i1Zeus/arabic-support
feat: add Arabic localization support for editor, launch, settings, s…
2026-05-05 19:09:12 -07:00
Siddharth c13ec0df7d fix build to exclude uiohook 2026-05-04 19:48:30 -07:00
Sid 40f18a9bdf Merge pull request #542 from auberginewly/fix/i18n-add-missing-zoom-threeD-keys
fix(i18n): add missing zoom.threeD translation keys for 7 locales
2026-05-04 18:53:39 -07:00
auberginewly 81b1eb3e8a fix(i18n): add missing zoom.threeD translation keys for 7 locales
The settings.json files for es/fr/ja-JP/ko-KR/tr/zh-CN/zh-TW were all missing
zoom.threeD.title and zoom.threeD.preset.{iso,left,right},
which caused npm run i18n:check to report MISSING.
2026-05-05 06:37:21 +08:00
Yusuf Mohsinally 559e97ddea Wrap export error diagnostics 2026-05-03 18:08:22 -07:00
Siddharth 190d5d8ecb 3d iso,tilt 2026-05-03 17:54:21 -07:00
Yusuf Mohsinally 156e9c1ec5 Improve export failure diagnostics 2026-05-03 14:24:42 -07:00
Yusuf Mohsinally 42c596da66 Lazy load the editor bundle 2026-05-03 14:20:43 -07:00
Siddharth 6fc19314dd fix dock macos lifecycle 2026-05-03 12:03:23 -07:00
Siddharth 7e00cdb1a9 preview intentional perf optimizations 2026-05-03 11:41:03 -07:00
i1Zeus a0d1cfe8c8 added ar to config and added fallback to the main.ts recordingStatus 2026-05-03 20:55:11 +03:00
Sid f7d1bc6f05 Merge pull request #484 from psychosomat/main
Improve Arch Linux support and fix video export on Hyprland
2026-05-03 10:23:27 -07:00
i1Zeus 59ecedb0ac implement i18n support and dynamic application menu in electron main process 2026-05-03 20:21:42 +03:00
i1Zeus bb30e20df7 implement lightweight i18n support for electron main process 2026-05-03 20:05:06 +03:00
i1Zeus b5d37c4270 feat: implement video editor SettingsPanel and add Arabic and English localization files 2026-05-03 20:03:01 +03:00
i1Zeus 679e306d31 feat: add Arabic localization support for editor, launch, settings, shortcuts, timeline, common, and dialogs modules 2026-05-03 19:49:35 +03:00
psychosomat b7d3563272 Upload pacman package in Linux CI artifacts 2026-05-03 12:10:00 +03:00
Siddharth 78f57970e9 fix ci checks 2026-05-02 23:27:38 -07:00
Sid bba5fd34cf Merge pull request #524 from hiroppelx/improve-ja-jp-localization
Improve Japanese localization
2026-05-02 23:23:44 -07:00
Sid 876378b622 Merge pull request #328 from AmirYunus/fix/305-hud-horizontal-scrollbar
fix(hud): avoid horizontal scrollbar when recording on Windows
2026-05-02 23:23:00 -07:00
Siddharth b7d1864a0b Merge main into fix/305-hud-horizontal-scrollbar
Resolved conflicts in src/App.tsx and src/components/launch/LaunchWindow.tsx:
- App.tsx: kept main's split useEffect for loadAllCustomFonts; placed PR's
  HUD-overlay style block inside the original [windowType] effect.
- LaunchWindow.tsx: kept main's systemLocaleSuggestion modal in place of the
  earlier inline language switcher; preserved PR's root-div className change
  that fixes the Windows horizontal-scrollbar bug.
2026-05-02 23:21:12 -07:00
Sunwood-ai-labs 613ed008fc Allow PNG custom background uploads 2026-05-03 15:18:50 +09:00
Siddharth 8d79a14e3b cursor highlighting and clicks 2026-05-02 23:03:14 -07:00
hiroppelx e4eeff0ea3 Improve Japanese translation 2026-05-03 11:03:20 +09:00
Siddharth c8d4e867b2 fix recording inception error 2026-05-02 17:53:43 -07:00
Siddharth 279320d3ef fix save prompt despite being saved 2026-05-02 17:49:40 -07:00
Siddharth 0f28cc0f38 fix missing locales 2026-05-02 17:44:56 -07:00
Siddharth d59db3d839 fix missing spanish locale 2026-05-02 17:34:47 -07:00
Sid 716002f1a9 Merge pull request #370 from BaptisteAuscher/feature/color-wheel
feature/color-wheel
2026-05-02 14:32:44 -07:00
makaradam e2bdfee653 fix: scope IPC close-confirm responses to the originating window
Both ipcMain.once handlers now check event.sender.id against
windowToClose.webContents.id and ignore messages from any other
renderer, preventing cross-window response mix-ups if multiple editor
windows are ever open simultaneously.
2026-05-02 14:36:59 +02:00
makaradam e7e493294b fix: use relative path for logo so it resolves in packaged app
./openscreen.png resolves correctly both in dev (Vite serves public/)
and in production (loadFile sets base to dist/, where public assets land
inside the asar). getAssetPath points to extraResources, which is the
wrong location for bundled dist assets.
2026-05-02 14:33:14 +02:00
makaradam b2cc722613 fix: use getAssetPath for logo so it resolves correctly in packaged app 2026-05-02 13:43:20 +02:00
makaradam 36076aaf2a fix: address code review feedback on custom close dialog 2026-05-02 13:08:52 +02:00
makaradam b3469c469b feat: replace native OS close dialog with custom in-app dialog 2026-05-02 12:28:04 +02:00
AbhinRustagi b801c1ccea fix: resolve comments 2026-05-02 01:19:44 +05:30
AbhinRustagi c40727672f feat: implement handlers to store last export location 2026-05-02 01:05:17 +05:30
AbhinRustagi a38454a7fb feat: update saveExportedVideo fn signature 2026-05-02 01:02:42 +05:30
BaptisteAuscher 8e8b194454 adds support for Japanese and Chinese (Taiwan) 2026-04-30 22:22:46 +02:00
BaptisteAuscher 916d649037 Merge branch 'main' of github.com:siddharthvaddem/openscreen into feature/color-wheel 2026-04-30 22:07:31 +02:00
google-labs-jules[bot] 37c1ea5984 Add Vietnamese i18n support (vi locale)
Co-authored-by: hthienloc <148019203+hthienloc@users.noreply.github.com>
2026-04-30 16:27:57 +00:00
Sid 884021c7d6 Merge pull request #505 from marcgabe15/fix/decodeEarlyBug
Fix/decode early bug
2026-04-29 21:33:18 -07:00
Marc Diaz 93466fdda1 fix: add max duration 2026-04-29 22:52:15 -04:00
Marc Diaz 786165208f misc: remove misc changes 2026-04-29 22:45:41 -04:00
Marc Diaz 0768c449d7 feat: all changes 2026-04-29 22:36:49 -04:00
david dc7259ba09 fix: bumped npmDepsHash on package.nix 2026-04-29 10:31:08 +02:00
Sid a6fe33a0f6 Merge pull request #501 from FabLrc/fix/vp8-vp9-codec-normalization
Fix/vp8 vp9 codec normalization
2026-04-28 19:51:03 -07:00
Sid 608e0abe87 Merge pull request #457 from shaun0927/fix/cursor-telemetry-session-isolation
fix: isolate cursor telemetry samples per recording session
2026-04-28 08:08:08 -07:00
FabLrc f9401f051c fix(exporter): fall back to avc1.640033 for unsupported H.264 codec strings 2026-04-28 14:13:34 +02:00
FabLrc cae71ed49c fix(exporter): add codec normalization for bare avc1/h264 and logging 2026-04-28 14:08:01 +02:00
FabLrc 6577a54418 fix(exporter): normalize bare VP8/VP9 codec strings from web-demuxer 2026-04-28 13:59:10 +02:00
shaun0927 3b9b4192bf fix: key cursor telemetry batches by recordingId for safe discard
discardLatestPending() popped whichever batch happened to be at the
back of the queue. With a Stop → Record → Discard sequence, the
pending queue can have recording B's batch sitting in front of A's by
the time A's finalize callback resolves (because finalizeRecording
awaits fixWebmDuration), so the discard targets the wrong recording.

Tag each completed batch with the recording id supplied at
startSession() time and replace discardLatestPending() with
discardBatch(recordingId). takeNextBatch() now returns the full
{recordingId, samples} shape so prependBatch() can re-queue it on
write-failure without losing the id. The renderer already owns a
stable recordingId (Date.now() in useScreenRecorder) and the IPC
surface threads it through set-recording-state and
discard-cursor-telemetry.
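
The keyed queue can be sketched like this. The class name, `addSample`, `endSession`, and `pendingIds` are hypothetical; `startSession`, `takeNextBatch`, `prependBatch`, and `discardBatch` follow the names in the message, though their real signatures may differ.

```typescript
interface CursorSample {
  x: number;
  y: number;
  t: number;
}

interface TelemetryBatch {
  recordingId: number;
  samples: CursorSample[];
}

class CursorTelemetryBuffer {
  private active: TelemetryBatch | null = null;
  private pending: TelemetryBatch[] = [];

  // Tag the session with the recording id supplied by the renderer.
  startSession(recordingId: number): void {
    this.active = { recordingId, samples: [] };
  }

  addSample(sample: CursorSample): void {
    this.active?.samples.push(sample);
  }

  // Move the finished session to the back of the pending queue.
  endSession(): void {
    if (this.active !== null) {
      this.pending.push(this.active);
      this.active = null;
    }
  }

  // Return the oldest batch with its id so it can be re-queued intact.
  takeNextBatch(): TelemetryBatch | undefined {
    return this.pending.shift();
  }

  // Put a batch back at the front after a failed write.
  prependBatch(batch: TelemetryBatch): void {
    this.pending.unshift(batch);
  }

  // Drop exactly the batch for this recording, wherever it sits in the queue.
  discardBatch(recordingId: number): boolean {
    const index = this.pending.findIndex((b) => b.recordingId === recordingId);
    if (index === -1) return false;
    this.pending.splice(index, 1);
    return true;
  }

  pendingIds(): number[] {
    return this.pending.map((b) => b.recordingId);
  }
}
```

Because the lookup is by id rather than queue position, discarding recording A after B has been queued leaves B untouched.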

Adds a regression test that mirrors FabLrc's scenario in PR #457:
two recordings finalize, A is discarded after B has already been
queued, and the buffer must drop A while keeping B intact.
2026-04-28 18:27:14 +09:00
Siddharth 1fefde8881 auto zoom marker 2026-04-26 17:25:20 -07:00
Siddharth 5e994d214e fix perf playback choppiness 2026-04-26 17:17:49 -07:00
Sid 49213960e2 Merge pull request #419 from rajtiwariee/fix/video-blur
fix: resolve blurry screen recordings and video editor previews
2026-04-25 16:52:48 -07:00
Siddharth 8458cbb40e fix: pass asset base URL to preload via additionalArguments
Sandboxed preloads (Electron's default with contextIsolation) cannot
require node modules. Commit 702b733 added node:path / node:url imports
to preload.ts which fail at load time:

  Unable to load preload script: dist-electron/preload.mjs
  Error: module not found: node:path

This left window.electronAPI undefined, breaking every IPC call.

Compute the asset base URL in main process (windows.ts) and pass it
to preload via webPreferences.additionalArguments. Preload reads it
from process.argv. Sync API for renderer is preserved.
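
A sketch of the preload-side lookup, assuming the main process passes the URL as a single prefixed argument. The `--asset-base-url=` prefix and the function name are hypothetical, since the commit does not state the argument format.

```typescript
// Hypothetical flag prefix. Assumed main-process side (not executed here):
//   webPreferences: { additionalArguments: [ASSET_BASE_FLAG + assetBaseUrl] }
const ASSET_BASE_FLAG = "--asset-base-url=";

// Pure helper: a sandboxed preload would call this with process.argv,
// so no node:path or node:url import is needed.
function readAssetBaseUrl(argv: readonly string[]): string | undefined {
  const arg = argv.find((a) => a.startsWith(ASSET_BASE_FLAG));
  return arg === undefined ? undefined : arg.slice(ASSET_BASE_FLAG.length);
}
```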
2026-04-25 16:50:18 -07:00
Siddharth e1c67c4e92 Revert "Merge pull request #373 from Moncef-Mhz/adjust-zoom-speed"
This reverts commit a6ae0e6d98, reversing
changes made to db10f92c49.
2026-04-25 16:50:18 -07:00
Sid 92f0ed8efe Merge pull request #472 from ichi1007/feature/add-i18n-japanese-key
feat(i18n): add Japanese locale and update translations for existing locales
2026-04-25 16:21:24 -07:00
Sid 67e7048636 Merge pull request #480 from saiganesh47/patch-1
Remove unnecessary newline in i18n-check.mjs
2026-04-25 15:56:17 -07:00
Sid 13c982618a Merge pull request #491 from AmitwalaH/feature/video-playback-fix
Fix video playback initialization and zoom sync
2026-04-25 09:24:00 -07:00
AmitwalaH 657d55bd72 fix: rethrow play error so allowPlaybackRef resets on failure 2026-04-25 15:08:01 +05:30
Sid c53dd2df93 Merge pull request #496 from Enriquefft/fix/wallpaper-export-376
Fix wallpaper backgrounds exporting as black (#376)
2026-04-24 21:34:59 -07:00
Enriquefft e06e40dbc2 clean review nits: typed prefix sentinel, instanceof narrowing, drop dead re-export
- Replace anonymous Error in resolveImageWallpaperUrl with typed
  UnsafeImagePrefixError, mirroring UnsafeAssetPathError so cause
  chains stay discriminable.
- Replace `(err as BackgroundLoadError).cause` casts in wallpaper
  tests with instanceof narrowing (no `as` per project rules).
- Remove unused `WALLPAPER_PATHS` re-export from projectPersistence;
  consumers import directly from @/lib/wallpaper (SSOT).
2026-04-24 22:34:00 -05:00
Enriquefft 373319808e cover Windows drive-letter file URLs in legacy wallpaper normalizer test 2026-04-24 21:58:59 -05:00
Enriquefft af159e8a2b tighten legacy normalizer and guard against BackgroundLoadError double-wrap
Reviewer audit found two real risks in the prior amendment:

1. LEGACY_FILE_WALLPAPER_RE was too permissive. Any file:// URL
   containing /wallpapers/wallpaperN.jpg would match — including a user's
   own file at /home/me/wallpapers/wallpaper1.jpg that happened to share
   the name pattern. Silent data-loss potential: user's photo replaced
   with a bundled asset. In-app upload flow uses data: URIs today so it
   can't actually produce such a value, but the regex should be tight
   on intent. Now requires a known install-layout segment:
   resources/[assets/]wallpapers/ (packaged) or public/wallpapers/ (dev).

2. No upper bound on \d+. A corrupted or future-schema project with
   wallpaper99.jpg was silently rewritten to /wallpapers/wallpaper99.jpg
   which 404s. Now validates against WALLPAPER_PATHS; out-of-set
   bundled-looking values fall back to DEFAULT_WALLPAPER.
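
Both tightenings can be sketched together. The regex below is a reconstruction from the description, not the project's actual `LEGACY_FILE_WALLPAPER_RE`, and the count of 18 bundled wallpapers is an assumption; `normalizeLegacyWallpaper` is a hypothetical name.

```typescript
const DEFAULT_WALLPAPER = "/wallpapers/wallpaper1.jpg";

// Assumed set of bundled wallpapers (wallpaper1.jpg .. wallpaper18.jpg).
const WALLPAPER_PATHS: readonly string[] = Array.from(
  { length: 18 },
  (_, i) => `/wallpapers/wallpaper${i + 1}.jpg`,
);

// Requires a known install-layout segment directly before wallpapers/:
// resources/[assets/] (packaged) or public/ (dev).
const LEGACY_FILE_WALLPAPER_RE =
  /^file:\/\/.*\/(?:resources\/(?:assets\/)?|public\/)wallpapers\/(wallpaper\d+\.jpg)$/;

function normalizeLegacyWallpaper(value: string): string {
  const match = LEGACY_FILE_WALLPAPER_RE.exec(value);
  // Not a bundled-asset URL: leave the user's own value untouched.
  if (match === null) return value;
  const canonical = `/wallpapers/${match[1]}`;
  // Bundled-looking but outside the known set: fall back to the default.
  return WALLPAPER_PATHS.includes(canonical) ? canonical : DEFAULT_WALLPAPER;
}
```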

Also applied R2.2 defensive guard: resolveImageWallpaperUrl's catch
block now checks instanceof BackgroundLoadError and rethrows unchanged
instead of wrapping a second time. Current getAssetPath cannot throw
BackgroundLoadError so this is future-proofing against refactors.

Tests: 56 pass (up from 54). Added coverage for "user file outside
install dir stays untouched" and "bundled-looking but out-of-set falls
back to default".
2026-04-24 18:58:34 -05:00
Enriquefft f2ff7fb21c address review audit: persist canonical wallpaper, dedupe types, tighten edge cases
R1 — Persisted wallpaper is now always the canonical /wallpapers/wallpaperN.jpg
form, never the resolved file:// URL. Swatch clicks pass WALLPAPER_PATHS[i]
(the relative path) to onWallpaperChange; the resolved URL stays in
wallpaperPreviewUrls for rendering only. This prevents machine-specific paths
from being written into project JSON and avoids break-on-upgrade /
break-on-share regressions. Legacy projects carrying resolved file:// URLs are
rewritten by a new normalizer in normalizeProjectEditor:
file://…(/assets)?/wallpapers/wallpaperN.jpg → /wallpapers/wallpaperN.jpg.

R2 — resolveImageWallpaperUrl now catches anything getAssetPath throws
(UnsafeAssetPathError, AssetBaseUnavailableError) and rewraps as
BackgroundLoadError with the original as cause. Callers (videoExporter retry
loop, gifExporter catch, VideoEditor toast) only need one instanceof check and
users always see the translated errors.exportBackgroundLoadFailed toast.

R3 — src/vite-env.d.ts no longer duplicates Window.electronAPI. The interface
had drifted — renderer declaration was missing readBinaryFile, getPlatform,
revealInFolder, getShortcuts, saveShortcuts, hudOverlay*, countdown overlay
methods that electron-env.d.ts already declares. Removed the duplicate and
kept the triple-slash reference so the authoritative declaration is the one
in electron/electron-env.d.ts.

N1 — GRADIENT_RE accepts optional "repeating-" prefix so
repeating-linear/radial/conic-gradient values classify as gradients instead
of falling through to color.

N2 — displayBasename returns "(unknown)" sentinel for URLs without a
meaningful basename (file:///, bare /) instead of leaking the original string.

N3 — electron-builder.json5 extraResources block gets an inline comment
pointing at preload.ts:assetBaseDir so the bidirectional coupling is
discoverable from either file.

Tests: 54 unit tests pass (up from 35). New coverage for repeating
gradients, displayBasename sentinels, BackgroundLoadError cause wrapping,
legacy file:// wallpaper normalization (5 cases).
2026-04-24 18:55:04 -05:00
Enriquefft 702b733074 resolve asset base path synchronously from preload
Every consumer of /wallpapers/*.jpg — SettingsPanel, VideoPlayback,
frameRenderer — was doing async IPC round trips, useEffect dances, and
Promise.all for a value that is a build-time constant per process. Each
consumer showed briefly-empty or briefly-404ing state on first paint
until the handler's reply resolved.

The asset base URL depends only on process.defaultApp and
process.resourcesPath / __dirname — all available in preload at
context-bridge time. Compute once there, expose as a sync string.

- preload.ts resolves baseDir (process.resourcesPath packaged,
  <appRoot>/public unpackaged) and emits assetBaseUrl synchronously.
- get-asset-base-path IPC handler + main-process branching deleted.
- getAssetPath() is now sync. Returns string, not Promise<string>.
  Throws AssetBaseUnavailableError (new) when electronAPI.assetBaseUrl
  is missing — catastrophic preload failure, not silent 404.
- resolveImageWallpaperUrl() sync; same sync throw semantics.
- SettingsPanel: Promise.all + useState + useEffect collapse to one
  useMemo. First paint has real URLs, no 18× ERR_FILE_NOT_FOUND, no
  flicker.
- VideoPlayback: wallpaper-resolve useEffect collapses to useMemo.
- frameRenderer.setupBackground: drops the await.
- electronAPI type decls updated in both .d.ts files.
- 35 unit tests updated to reflect sync signature + new
  AssetBaseUnavailableError contract.

Silent-fallback behavior from getAssetPath (returning /relative when
electronAPI failed) is gone. Renderers now surface preload failures
instead of rendering 404s.
2026-04-24 18:33:03 -05:00
Enriquefft 86c1c483d4 avoid 404s on first swatch render
SettingsPanel fell back to rendering WALLPAPER_PATHS (raw
/wallpapers/*.jpg strings) during the brief window before the
resolveImageWallpaperUrl effect populated wallpaperPaths. In packaged
Electron the browser resolved those against a file:// origin, producing
18 ERR_FILE_NOT_FOUND requests per load / reload. The second render
replaced them with correct URLs, so swatches appeared — but the wasted
requests showed up in devtools and churned the network panel.

Drop the fallback; render nothing until the effect completes. The
resolution is effectively instant and avoids the empty-origin round
trip.
2026-04-24 18:22:27 -05:00
Enriquefft adf3855ac8 harden wallpaper resolver against traversal, PII, and SSOT drift
Adversarial review surfaced four defects and four drive-bys. All applied:

B1 (security, MEDIUM) — Path traversal via encodeRelativeAssetPath.
encodeURIComponent passed "." and ".." through unchanged; percent-encoded
"%2e%2e" got decoded by the URL constructor. Either form escaped the
asset root: new URL("../../etc/passwd", "file:///opt/Openscreen/resources/")
→ file:///opt/etc/passwd. Reject both at src/lib/assetPath.ts via a new
UnsafeAssetPathError thrown when a decoded segment equals "." or "..".
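
A sketch of the segment check, assuming the encoder decodes each segment before comparing it. The body is an illustration rather than the code in src/lib/assetPath.ts; `resolveAssetUrl` is a hypothetical wrapper, and rejecting malformed percent escapes is a choice made here, not something the commit states.

```typescript
class UnsafeAssetPathError extends Error {
  constructor(path: string) {
    super(`Unsafe asset path: ${path}`);
    this.name = "UnsafeAssetPathError";
  }
}

function encodeRelativeAssetPath(relativePath: string): string {
  const segments = relativePath.replace(/^\/+/, "").split("/");
  return segments
    .map((segment) => {
      let decoded: string;
      try {
        // Decode first so "%2e%2e" is seen as "..".
        decoded = decodeURIComponent(segment);
      } catch {
        // Assumption: a malformed escape is treated as unsafe.
        throw new UnsafeAssetPathError(relativePath);
      }
      if (decoded === "." || decoded === "..") {
        throw new UnsafeAssetPathError(relativePath);
      }
      return encodeURIComponent(decoded);
    })
    .join("/");
}

// Hypothetical wrapper: resolve the encoded path against the asset base URL.
function resolveAssetUrl(relativePath: string, baseUrl: string): string {
  return new URL(encodeRelativeAssetPath(relativePath), baseUrl).href;
}
```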

B2 (correctness) — classifyWallpaper returned { kind: "image" } for
conic-gradient(...), rgb(...), hsl(...), oklch(...), empty string,
and named colors like "red". Old frameRenderer's bare fillStyle = value
handled these; new code would throw BackgroundLoadError with a misleading
message. Classification now anchors on regexes, accepts all CSS color
functions and all three gradient types, treats unknown strings as
fallthrough color (old behavior), and normalizes "" to "#000000".
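
The classification order can be sketched as below. Both regexes are assumptions: the image rule in particular is a guess at what counts as an image reference, and the optional `repeating-` prefix reflects the later follow-up (f2ff7fb21c) rather than this commit.

```typescript
type WallpaperKind =
  | { kind: "color"; value: string }
  | { kind: "gradient"; value: string }
  | { kind: "image"; value: string };

// Assumed patterns, anchored at the start of the trimmed value.
const GRADIENT_RE = /^(?:repeating-)?(?:linear|radial|conic)-gradient\(/i;
const IMAGE_RE = /^(?:\/|data:image\/|file:|https?:)/i;

function classifyWallpaper(raw: string): WallpaperKind {
  const value = raw.trim();
  // Empty input is normalized to black.
  if (value === "") return { kind: "color", value: "#000000" };
  if (GRADIENT_RE.test(value)) return { kind: "gradient", value };
  if (IMAGE_RE.test(value)) return { kind: "image", value };
  // Everything else (rgb(), hsl(), oklch(), named colors, unknown strings)
  // falls through to color, matching the old fillStyle behavior.
  return { kind: "color", value };
}
```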

B3 (SSOT) — DEFAULT_WALLPAPER, projectPersistence.WALLPAPER_PATHS, and
SettingsPanel.WALLPAPER_RELATIVE independently hardcoded the same
/wallpapers/wallpaperN.jpg pattern. Three drift sites collapse into one:
WALLPAPER_PATHS lives in src/lib/wallpaper.ts, DEFAULT_WALLPAPER derives
from WALLPAPER_PATHS[0], projectPersistence re-exports from the canonical
module, SettingsPanel imports it directly.

B4 (privacy) — BackgroundLoadError.message and the translated toast
surfaced full file paths like file:///home/<user>/…/wallpaper.jpg —
leaks the user's home directory in copy-pasted bug reports. Added a
displayUrl getter that returns just the basename (or "data:…" for data
URIs), wired into the toast. Full URL remains in console.error and
error.url for debugging.
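
A sketch of the `displayUrl` idea: the full URL stays on the error for debugging while the user-facing text carries only a basename. The `displayBasename` helper body, the message wording, and the use of the "(unknown)" sentinel (introduced in a later commit) are assumptions for illustration.

```typescript
function displayBasename(url: string): string {
  // Data URIs are abbreviated rather than echoed.
  if (url.startsWith("data:")) return "data:…";
  // Strip scheme and authority, then any query or fragment.
  const path = url.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]*/i, "").split(/[?#]/)[0] ?? "";
  const basename = path.slice(path.lastIndexOf("/") + 1);
  return basename === "" ? "(unknown)" : basename;
}

class BackgroundLoadError extends Error {
  readonly url: string;

  constructor(url: string) {
    // The message uses only the basename, so no home directory leaks into it.
    super(`Failed to load background: ${displayBasename(url)}`);
    this.name = "BackgroundLoadError";
    this.url = url;
  }

  // Safe to show in a toast or a copy-pasted bug report.
  get displayUrl(): string {
    return displayBasename(this.url);
  }
}
```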

N1 — resolveImageWallpaperUrl now rejects image paths that don't live
under /wallpapers/ (throws BackgroundLoadError). Narrows the blast
radius of the returned <resourcesPath>/ base so the renderer can only
request files within the wallpapers directory, regardless of what the
project JSON claims.

N2 — videoExporter retry loop no longer calls cleanup() twice in the
BackgroundLoadError branch; the finally handles it.

N3 — Browser tests assert BackgroundLoadError.url contains the failing
path. Guards the {{url}} i18n interpolation contract.

N4 — VideoPlayback wallpaper resolve effect now catches resolver
throws (UnsafeAssetPathError, BackgroundLoadError from /wallpapers/
prefix enforcement). Prevents the new strict-rejection logic from
silently leaving the preview without a background.

Tests: 35 unit tests pass (up from 20); new coverage for all color
functions, all gradient types, empty string, named color fallback,
whitespace trimming, /wallpapers/ prefix enforcement, traversal
rejection, percent-encoded traversal rejection, displayUrl basename
and data-URI abbreviation.
2026-04-24 18:16:57 -05:00
Enriquefft d145f80041 fix: wallpaper backgrounds black in exported video (#376)
Three independent defects plus one SSOT violation caused the reported symptom
of image wallpapers rendering solid black in exported MP4/GIF while
appearing correctly in the editor preview.

Bug A — Dev-mode IPC handler returned <appPath>/public/assets/, but
wallpapers live at public/wallpapers/. No assets/ subdirectory exists in
source.

Bug B — FrameRenderer.setupBackground bypassed getAssetPath and did
window.location.origin + wallpaper, producing file:///wallpapers/*.jpg
404s in packaged Electron.

Bug C — setupBackground silently caught any background-load error and
filled black. This masked Bug B in the export pipeline, which is why the
bug shipped.

Smell D — Asset layout asymmetric: public/wallpapers/ (dev) vs
resources/assets/wallpapers/ (packaged). assets/ subdirectory had no
other consumers.

Fixes:

- Unify asset layout. electron-builder extraResources now copies to
  resources/wallpapers/ (no assets/). Main handler returns
  <resourcesPath>/ packaged and <appPath>/public/ unpackaged. Same
  convention in both modes: /wallpapers/x.jpg maps to <base>/wallpapers/x.jpg.
  Nix package.nix mirror updated.

- New src/lib/wallpaper.ts module owns the wallpaper contract:
  DEFAULT_WALLPAPER, classifyWallpaper (color/gradient/image), and
  resolveImageWallpaperUrl (pure URL resolver, wraps getAssetPath).
  BackgroundLoadError typed error for short-circuit detection.

- FrameRenderer.setupBackground uses the new helpers. Silent black
  fallback removed; rethrows as BackgroundLoadError. Export pipeline
  (VideoExporter + GifExporter) short-circuits encoder-retry loop on
  BackgroundLoadError. VideoEditor catch site dispatches to translated
  exportBackgroundLoadFailed toast.

- VideoPlayback editor preview consolidated onto the same helpers.
  Three default-wallpaper path literals (useEditorHistory,
  projectPersistence, VideoPlayback) collapsed onto DEFAULT_WALLPAPER.

- i18n: new errors.exportBackgroundLoadFailed key added to all seven
  locales (en, zh-CN, zh-TW, es, fr, tr, ko-KR).

- Tests: 20 unit tests for wallpaper module (classifyWallpaper +
  resolveImageWallpaperUrl branches + BackgroundLoadError).
  videoExporter.browser.test.ts and gifExporter.browser.test.ts extended
  with image-wallpaper happy path and BackgroundLoadError failure path.

Migration note: packaged users upgrading in place may retain an empty
resources/assets/ directory from the prior layout. Unreferenced at
runtime; cosmetic only. DMG/AppImage fresh installs get the new layout
directly.
2026-04-24 17:59:21 -05:00
AmitwalaH 466fad399a fix: seed overlaySize on mount and guard ResizeObserver entries 2026-04-24 15:28:53 +05:30
AmitwalaH 1673daabe4 merge: resolve conflicts and update video playback system 2026-04-24 15:07:05 +05:30
Marc Diaz cffca5f2ff fix: just use one test 2026-04-23 17:37:08 -04:00
Marc Diaz d1087af63c fix: lint 2026-04-23 15:46:35 -04:00
Raj Tiwari 8e1c7e035a fix: correct motion blur state caching logic 2026-04-23 23:02:04 +05:30
Raj Tiwari a26eb3cbab perf: cache motion blur state in ticker 2026-04-23 22:55:40 +05:30
Raj Tiwari f4e10b28cc style: fix linting errors for biome check 2026-04-23 22:50:21 +05:30
AmitwalaH 9a361a9f2e fix(video-playback): resolve initialization timing issues and ensure smooth zoom & layout rendering 2026-04-23 15:10:59 +05:30
Sid 67ec57751f Merge pull request #390 from FabLrc/update-french-translation
fix(i18n): Update French translations for dialogs, editor, and settings
2026-04-22 20:56:05 -07:00
Sid 0264d8cb9e Merge pull request #482 from FabLrc/chore/update-dependencies-security-2026-04
Update dependencies and resolve vite compatibility issues
2026-04-22 20:54:45 -07:00
Sid fafe8ff82d Merge pull request #486 from FabLrc/update-readme
doc: Update README
2026-04-22 07:30:14 -07:00
FabLrc d59ef6a8dd Update README with additional badges for Trendshift (top repository of the day) and Discord badge update 2026-04-22 16:06:25 +02:00
FabLrc d823f3f011 Add Star History section to README 2026-04-22 12:27:15 +02:00
psychosomat d6d872e529 Fix CodeRabbit review comments
- Add buildDialogOptions helper function to safely attach parent window only when valid and not destroyed
- Update all dialog calls (save-exported-video, open-video-file-picker, save-project-file, load-project-file) to use the helper
- Fix supportsWindowOpacity logic by removing || isWayland so Linux always follows no-opacity codepath
- Change incorrect Chromium feature name 'PipeWire' to 'WebRTCPipeWireCapturer' in main.ts
- Remove unused isWayland variable in handlers.ts
2026-04-22 02:23:31 +03:00
psychosomat 31f0483c65 Improve Arch Linux support and fix video export on Hyprland
- Add pacman package build target for Arch Linux in electron-builder.json5
- Update build:linux script in package.json to include pacman target
- Fix dialog window issues on Wayland/Hyprland:
  * Pass mainWindow reference to dialog.showSaveDialog and dialog.showOpenDialog in electron/ipc/handlers.ts
  * Required for proper dialog functionality on Wayland compositors
  * Previously dialogs opened without parent window attachment causing issues on Hyprland

Changes ensure:
- Correct video export on Arch Linux + Hyprland systems
- Ability to install via pacman package manager
- Improved compatibility with Wayland compositors
2026-04-22 02:01:20 +03:00
FabLrc 9613e714e1 chore: align @types/node with engine and fix package-lock.json cross-platform resolution 2026-04-21 15:06:57 +02:00
FabLrc 7573d8822c fix: regenerate package-lock.json 2026-04-21 15:00:16 +02:00
FabLrc 659affa88c fix: upgrade vite to 7.x to resolve lockfile/platform issues
vitest@4.1.4 requires vite ^6||^7||^8. With vite@6 at project level,
npm@10 installs a separate vite@8 for vitest, which pulls in rolldown
(native .node bindings) that npm ci cannot install cross-platform due
to npm bug #4828.

vite@7 avoids rolldown entirely (uses rollup) and npm@10 deduplicates
correctly with the project-level vite@7. Also adds esbuild@^0.27.0
explicitly (required by vite-plugin-electron-renderer) and aligns with
vite@7's own esbuild@^0.27.0 so no duplicate installs.

- vite: ^6.4.2 → ^7.3.2
- @vitejs/plugin-react: ^4.7.0 → ^5.2.0 (adds vite@7/8 support)
- esbuild: ^0.27.0 added explicitly
- vite.config.ts: manualChunks converted to function form (rollup compat)
2026-04-21 14:34:09 +02:00
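For readers unfamiliar with the function form of manualChunks mentioned in the commit above, a minimal sketch follows. Rollup calls the function with each module id and uses the returned string as the chunk name. The chunk name "vendor" and the node_modules rule are assumptions for illustration, not the repository's actual chunking.

```typescript
// Function form of Rollup's manualChunks option (hypothetical rule, not the repo's real config).
// Returning a string assigns the module to that chunk; returning undefined leaves
// the decision to Rollup's default chunking.
function manualChunks(id: string): string | undefined {
  if (id.includes("node_modules")) {
    return "vendor"; // assumed chunk name
  }
  return undefined;
}

// In vite.config.ts this would be passed as:
// build: { rollupOptions: { output: { manualChunks } } }
```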
Sai Ganesh Maganti f60a11820e Remove unnecessary newline in i18n-check.mjs 2026-04-21 18:01:59 +05:30
FabLrc b472c768ce style: migrate biome config to 2.4.12 and fix formatting (CRLF → LF) 2026-04-21 14:11:31 +02:00
FabLrc 018ba08eb9 fix(security): remove unused electron-icon-builder and electron-rebuild
Both packages were listed as devDependencies but not referenced in any
scripts or source files. Removing them eliminates all 22 npm audit
vulnerabilities (2 critical, 5 high, 13 moderate, 2 low) introduced by
their unmaintained transitive dependency chain (phantomjs-prebuilt,
request, tar, etc.).
2026-04-21 14:07:23 +02:00
FabLrc 41a26f3e66 fix: upgrade vite to 6.x to satisfy vitest 4.x peer dependency
vitest ^4.1.4 requires vite ^6+, which conflicted with the pinned
vite 5.4.21 and caused npm ci to fail with an inconsistent lockfile.
Also bumps vite-plugin-electron to 0.29.1.
2026-04-21 14:06:59 +02:00
FabLrc 9d365ca406 fix: Update French translations for editor, launch, and settings 2026-04-21 12:48:29 +02:00
FabLrc a1762b2691 Update French translations for dialogs, editor, and settings 2026-04-21 12:33:39 +02:00
FabLrc 9e345660e6 chore: update dependencies to latest versions 2026-04-21 12:27:13 +02:00
shaun0927 96765e483d docs: correct cx/cy units and sanitize buffer option limits
Two follow-up fixes for CodeRabbit feedback on the docs commit:

- CursorTelemetryPoint JSDoc previously described cx/cy as 'device-pixel
  positions'. The producer sampleCursorPoint() in electron/ipc/handlers.ts
  clamps them to the [0, 1] range after dividing by the source display's
  width/height, so they are normalised ratios, not pixel values. Correct
  the doc comment accordingly.
- createCursorTelemetryBuffer now sanitizes maxActiveSamples and
  maxPendingBatches: non-finite, zero, or negative values fall back to
  safe positive-integer defaults. Without this, a caller passing Infinity
  or NaN would hang the trim loops.

New test covers the sanitisation path for both options.
2026-04-21 18:12:28 +09:00
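The option sanitisation described in the commit above could look roughly like the sketch below. The helper name sanitizePositiveInt is hypothetical; only the behaviour (non-finite, zero, or negative values fall back to a positive-integer default) comes from the commit message. The default of 8 for pending batches is the one quoted elsewhere in this log.

```typescript
// Hypothetical helper: coerce a caller-supplied limit to a safe positive integer.
// Non-finite, zero, negative, or missing values fall back to the default.
function sanitizePositiveInt(value: number | undefined, fallback: number): number {
  if (value === undefined || !Number.isFinite(value) || value <= 0) {
    return fallback;
  }
  // Fractions are floored, but never below 1, so a trim loop always terminates.
  return Math.max(1, Math.floor(value));
}

// Example: a caller passing Infinity gets the default cap instead.
const maxPendingBatches = sanitizePositiveInt(Infinity, 8);
```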
shaun0927 adc610544c docs: document cursor telemetry buffer API and surface drop events
Add JSDoc to every public export in cursorTelemetryBuffer so the module
meets the 80% docstring-coverage threshold, and make two silent-drop
paths observable:

- endSession() now returns the number of pending batches evicted by the
  maxPendingBatches cap and emits console.warn when any are dropped.
- prependBatch() defensively trims and warns if an unusual retry pattern
  would push the queue past the cap (normal retry after takeNextBatch()
  stays a no-op).

Tests cover both drop paths.
2026-04-21 17:07:19 +09:00
Marc Diaz 95c7b7fc2b fix: add webm inflated duration and fix 2026-04-20 23:11:58 -04:00
Sid cccb966fda Merge pull request #460 from Galactic99/feat/countdown-before-record-start
feat:add countdown before record start
2026-04-20 08:25:30 -07:00
Aaryash Khalkar c033984ccb Merge branch 'main' into feat/countdown-before-record-start 2026-04-20 20:52:29 +05:30
Sid ae6b6ca860 Merge pull request #357 from imAaryash/main
Update LaunchWindow.tsx
2026-04-20 08:12:14 -07:00
Sid 1f99fcb4ad Merge pull request #325 from dheerajmr01/fix/camera-bugs
fix: camera light flashes and turns off when clicking webcam button (…
2026-04-20 08:10:37 -07:00
Fabien Laurence 0bb14f3a33 Update src/components/launch/LaunchWindow.tsx
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-20 17:07:17 +02:00
Fabien Laurence 97fdefa433 Merge branch 'main' into main 2026-04-20 17:05:46 +02:00
Sid 2b1c93146d Merge pull request #471 from tmchow/fix/449-cjk-text-wrap
fix(annotations): wrap CJK text at character boundaries in export renderer
2026-04-19 11:53:50 -07:00
Trevin Chow dd622f83c1 fix(annotations): use Unicode script properties for CJK detection
Address review feedback on #471 from @coderabbitai. The BMP-only
codepoint ranges missed two classes of characters:

- Non-BMP Han extensions (CJK Unified Ideographs Extension B, C, D, E, F)
  such as 𠀀. A long string of Extension-B characters would still be
  tokenized as a single unbreakable unit and overflow the box.
- Halfwidth Katakana (U+FF65-U+FF9F) such as カ. Same failure mode.

Switch to Unicode script property escapes (\p{Script=Han},
\p{Script=Hiragana}, \p{Script=Katakana}, \p{Script=Hangul}) which
cover these cases without enumerating ranges. tsconfig target is ES2020;
property escapes require ES2018+ so this is safe.

Verified coverage: 漢 あ ア 가 𠀀 カ all match; A and digits do not.
2026-04-19 10:05:48 -07:00
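A minimal sketch of the per-character tokenization that the commit above describes, under some assumptions: the function name tokenizeForWrap is taken from the commit messages, but its body here is illustrative only, and emitting whitespace as separate tokens is a simplification rather than the repository's confirmed behaviour. The pattern is built with the RegExp constructor, so the backslashes are doubled inside the string.

```typescript
// One character from the Han, Hiragana, Katakana, or Hangul scripts.
// The "u" flag is required for \p{...} property escapes.
const CJK_CHAR = new RegExp(
  "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}\\p{Script=Hangul}]",
  "u",
);

// Illustrative tokenizer: each CJK character becomes its own token,
// runs of other non-space characters stay together as words.
function tokenizeForWrap(line: string): string[] {
  const tokens: string[] = [];
  let word = "";
  // Array.from splits by code point, so non-BMP characters such as 𠀀 stay whole.
  for (const ch of Array.from(line)) {
    const isBreakable = CJK_CHAR.test(ch) || /\s/.test(ch);
    if (isBreakable) {
      if (word) {
        tokens.push(word);
        word = "";
      }
      tokens.push(ch);
    } else {
      word += ch;
    }
  }
  if (word) tokens.push(word);
  return tokens;
}
```

A width-measuring wrap loop can then break between any two tokens, which gives CJK text a break opportunity after every character.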
ichi d6bf31cb3f feat(i18n): add Japanese locale and update translations for existing locales
- Added Japanese (ja-JP) translations for common, editor, dialogs, launch, settings, shortcuts, and timeline.
- Updated translations for existing locales (en, es, fr, ko-KR, tr, zh-CN, zh-TW) to include new keys for "showInFolder", "loadingVideo", "trim", and "speed".
- Refactored VideoEditor and timeline Item components to utilize localized strings for various user interface elements and notifications.
- Enhanced user experience by providing localized messages for project loading, exporting, and timeline actions.
2026-04-19 20:44:07 +09:00
Trevin Chow f04c2b7c14 fix(annotations): wrap CJK text at character boundaries in export renderer
renderText split each line on whitespace, which works for Latin text
but leaves CJK strings as a single unbreakable token because CJK
scripts have no word-separating whitespace. Result: CJK annotation
text overflows the clipped annotation box even though the editor's
HTML preview wraps it correctly via CSS word-break: break-word.

Replace the ad-hoc whitespace split with a tokenizeForWrap helper
that emits each CJK character (Hiragana, Katakana, Hangul Syllables,
CJK Unified Ideographs + Extension A, and CJK Compatibility
Ideographs) as its own token, while keeping Latin words + whitespace
intact. The existing width-measurement wrap loop then handles CJK
per-character, matching the editor's behavior.

Closes #449
2026-04-19 02:49:17 -07:00
Galactic99 4a65ab8171 chore:safewrapper consistency and hide countdown overlay before starting recording setup. 2026-04-19 12:57:17 +05:30
Galactic99 7e02856836 fix:hide handler actually hides window instead of just clearing value 2026-04-19 12:37:19 +05:30
Galactic99 65b9d189e8 fix:improve ui of the countdown by adding a low opacity circle background 2026-04-19 12:37:19 +05:30
Galactic99 3ba9e901c9 fix:Claim the countdown run before the first await. 2026-04-19 12:37:18 +05:30
Galactic99 331e126d3c fix:handle hideCountdownOverlay rejections in cleanup/cancel paths. 2026-04-19 12:37:18 +05:30
Galactic99 d04bab732b prioritize recording stop over countdown cancel 2026-04-19 12:37:18 +05:30
Galactic99 ea68e4cfc3 fix:prevent stale countdown IPC updates from repainting overlay 2026-04-19 12:37:18 +05:30
Galactic99 6b08a0a72a fix:flickering, stale runs, macOS bugs provided by coderabbit and thread countdown token 2026-04-19 12:37:17 +05:30
Galactic99 1670db41a8 feat:add countdown before record start 2026-04-19 12:37:17 +05:30
Sid fd6a0778fb Merge pull request #469 from imAaryash/feat/discord-actions
Improve Discord API error handling and webhook checks
2026-04-18 17:52:22 -07:00
Aaryash Khalkar cfc6579e37 Improve Discord API error handling and webhook checks
Refactor error handling for Discord API responses and improve webhook secret checks.
2026-04-19 06:19:05 +05:30
Siddharth 10463f882f rm 2026-04-18 17:46:46 -07:00
Sid 3e436087b7 Merge pull request #467 from imAaryash/feat/discord-actions
updated discord workflow
2026-04-18 17:43:06 -07:00
Aaryash Khalkar 63c850bc08 Change pull_request to pull_request_target in workflow 2026-04-19 05:47:52 +05:30
Siddharth dc74db13ad test 2026-04-18 11:36:59 -07:00
Siddharth 33eb245aea codeowner 2026-04-18 11:29:12 -07:00
Siddharth d22c4190cf fix 2026-04-18 11:05:33 -07:00
Sid 57c6a590a9 Merge pull request #423 from org-cyber/fix/windows-export-clean
fix(windows): Fixed windows Export Issue and early decode Crash
2026-04-18 10:54:13 -07:00
Sid 88ab1eabdd Merge pull request #401 from hobostay/fix/bug-fixes-security-and-reliability
Fix security and reliability issues
2026-04-18 10:50:18 -07:00
Sid a20a31f27d Merge branch 'main' into fix/bug-fixes-security-and-reliability 2026-04-18 10:50:05 -07:00
Sid 9ef1f756b4 Merge pull request #448 from theopfr/fix/cpu-readback-only-for-linux
fix: improve performance on windows and macos by passing canvas directly to `VideoFrame()`
2026-04-18 10:49:09 -07:00
Sid b0529c87a6 Merge pull request #450 from michthemaker/feat/hud-overlay-ux-overhaul
Feat/hud overlay ux overhaul
2026-04-18 10:47:01 -07:00
Sid 974fde4f1d Merge pull request #344 from ekkoitac/fix/tutorial-help-missing-translations
Fix/tutorial help missing translations
2026-04-18 10:44:04 -07:00
Sid e7247d880d Merge pull request #434 from Enriquefft/fix/export-audio-duration-validation
fix: validate export duration and fix audio trim in speed-aware path
2026-04-18 10:41:38 -07:00
Sid 56d3d59598 Merge pull request #342 from kuishou68/cocoon/feature-duplicate-annotation
feat(editor): duplicate annotations
2026-04-18 10:39:32 -07:00
Sid 0ec18358d5 Merge branch 'main' into cocoon/feature-duplicate-annotation 2026-04-18 10:37:56 -07:00
Sid e85d07ba78 Merge pull request #461 from imAaryash/discord-actions
Refactor Discord webhook URL handling in workflow
2026-04-18 10:29:02 -07:00
Test User 721e8f4759 Fix lint, type check errors, and apply CodeRabbit review feedback
- Remove trailing comma in SUPPORTED_LOCALES that caused Locale type to
  include undefined, fixing all downstream type errors
- Remove unused webcamSizePreset from useMemo dependency array
- Use parsed.toString() instead of raw url in shell.openExternal per
  Electron security best practice

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 21:37:16 +08:00
Theodor Peifer 9e4ec790f3 chore: fix linting issue 2026-04-18 11:32:42 +02:00
Theodor Peifer 2f24038cb5 fix: use existing getPlatform() so the OS based CPU readback check also works in the browser 2026-04-18 11:31:09 +02:00
Theodor Peifer 934f05cc80 fix: pass platform from video/gifExporter to FrameRenderer, skip readback also for canvas composition for non-linux 2026-04-18 11:31:09 +02:00
Theodor Peifer d12f3980f9 fix: only read back frames from canvas if the OS is linux, workaround not necessary for other OSes like win or darwin 2026-04-18 11:31:09 +02:00
ekkoitac 485c95b672 resolve conflicts: adopt main's tutorial i18n restructuring
The main branch has already applied the same tutorial.help key
restructuring with slightly different intermediate values.
Adopting main's version to resolve merge conflicts.
2026-04-18 08:52:26 +08:00
Enriquefft dd8c001f6d refactor: require validatedDurationSec in AudioProcessor, drop fallbacks
AudioProcessor.process and renderPitchPreservedTimelineAudio accepted
validatedDurationSec as optional, so the speed-aware path fell back to
media.duration when it was absent. HTMLMediaElement.duration can be
Infinity for the same MediaRecorder/Chromium Linux containers this PR
targets, which would make effectiveEnd and the playback stop checks
unreliable.

The only caller (VideoExporter.process) already threads
streamingDecoder's validatedDuration through, so make the parameter
required. Drop the media.duration fallback, the Number.isFinite guard
on readEndSec, and the two `!== undefined` checks in the tick loop.

While here:
- Document that +0.5 on readEndSec mirrors streamingDecoder.decodeAll's
  read window so trim-only and speed-aware paths stay in sync.
- Replace the unreachable silent-blob fallback at the end of
  renderPitchPreservedTimelineAudio with a loud invariant throw, so a
  broken recorder contract surfaces instead of yielding empty audio.
2026-04-16 14:49:27 -05:00
Enriquefft 0c01db7afa fix: fall back to unbounded packet scan when duration hints missing
The earlier NaN/Infinity guard collapsed both duration hints to 0 when
the container reported invalid values, which turned scanEndSec into
0.5s. The packet scan then read only the first half-second, scannedDuration
capped there, and validateDuration fell back to that wrong value for the
entire export — exactly the Chromium Linux case this PR is meant to fix.

Use a 24h sentinel as the read endpoint when no hint is usable. An
explicit end is still required (some containers are truncated without
one, per prior comment), but the sentinel is large enough to exceed any
realistic recording so the scan reaches real EOF.
2026-04-16 14:33:27 -05:00
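The scan-range selection described in the commit above might be sketched as follows. The function name computeScanEndSec and both constant names are hypothetical. The 0.5 s buffer and the 24 h sentinel are taken from the commit messages, and treating zero or negative hints as unusable is an assumption.

```typescript
const SCAN_BUFFER_SEC = 0.5; // buffer added past the best duration hint
const NO_HINT_SENTINEL_SEC = 24 * 60 * 60; // 24 h: larger than any realistic recording

// Hypothetical helper: choose the explicit end bound for the packet scan.
function computeScanEndSec(containerDurationSec: number, streamDurationSec: number): number {
  // Keep only finite, positive hints; NaN and Infinity are discarded.
  const hints = [containerDurationSec, streamDurationSec].filter(
    (d) => Number.isFinite(d) && d > 0,
  );
  if (hints.length === 0) {
    // No usable hint: scan to a far sentinel so the read reaches real EOF.
    return NO_HINT_SENTINEL_SEC;
  }
  return Math.max(...hints) + SCAN_BUFFER_SEC;
}
```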
Enriquefft 4d4b08db07 fix: skip chained initial trims before recording starts
Startup trim-skip only consulted the first active region at t=0, so
back-to-back or overlapping trims starting at zero (e.g. [0,500ms]
followed by [500ms,1000ms]) left the second region un-skipped. The
in-flight tick loop would catch it, but MediaRecorder was already
running by then, capturing up to one rAF frame of trimmed audio into
the blob and shifting the downstream timeline.

Loop findActiveTrimRegion from the advancing startPosition until no
region matches or startPosition >= effectiveEnd, bounded by
trimRegions.length for safety. Recompute initialSpeedRegion from the
final startPosition so playbackRate reflects the true start point.
2026-04-16 14:31:51 -05:00
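The chained-trim skip from the commit above can be illustrated with a small loop. This is a sketch under assumptions: the TrimRegion field names, the millisecond units, and the function skipInitialTrims are hypothetical; findActiveTrimRegion is named in the commit, but its body here is a guess at the obvious behaviour.

```typescript
interface TrimRegion {
  startMs: number;
  endMs: number; // exclusive (assumed)
}

// Return the trim region containing the given position, if any.
function findActiveTrimRegion(regions: TrimRegion[], positionMs: number): TrimRegion | undefined {
  return regions.find((r) => positionMs >= r.startMs && positionMs < r.endMs);
}

// Advance the start position past every trim region that covers it,
// so back-to-back or overlapping trims at t=0 are all skipped before recording starts.
function skipInitialTrims(regions: TrimRegion[], effectiveEndMs: number): number {
  let startPosition = 0;
  // Bounded by regions.length so a malformed region list cannot loop forever.
  for (let i = 0; i < regions.length && startPosition < effectiveEndMs; i++) {
    const active = findActiveTrimRegion(regions, startPosition);
    if (!active) break;
    startPosition = active.endMs;
  }
  return startPosition;
}
```

With the regions [0, 500 ms] and [500 ms, 1000 ms] from the commit message, the loop advances to 500 and then to 1000 before stopping.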
Enriquefft 61e895a75a fix: sanitize packet-scan range against NaN/Infinity duration
mediaInfo.duration from web-demuxer can be NaN or Infinity on Chromium
Linux (same MediaRecorder bug this PR otherwise addresses). That value
flowed straight into Math.max + demuxer.read() as scanEndSec, producing
an invalid range argument and breaking the ground-truth packet scan.

Guard both mediaInfo.duration and videoStream.duration with
Number.isFinite before Math.max; validateDuration() already handled the
downstream use.

Drop redundant WebDemuxer.read() / getDecoderConfig() type casts while
here — the generics infer the chunk/config type from the media string
literal, so the `as ReadableStream<EncodedVideoChunk>` and
`as AudioDecoderConfig` are no-ops.
2026-04-16 14:18:40 -05:00
Enriquefft 83ea025ed8 fix: handle NaN in zero-scan fallback and symmetric divergence check
- validateDuration returns 0 instead of NaN when the container duration
  is NaN and the scanned duration is zero
- Use Math.abs for divergence check so container under-reporting is
  also corrected (not just over-reporting)
2026-04-16 13:50:09 -05:00
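A rough sketch of the duration validation these commits describe: compare the container's reported duration against the packet-scanned duration and prefer the scanned value when they diverge in either direction. The function name validateDuration appears in the commit messages, but the signature, the tolerance value, and the exact fallback order below are assumptions.

```typescript
// Assumed tolerance; the repository's actual threshold is not stated in the log.
const DIVERGENCE_TOLERANCE_SEC = 0.5;

function validateDuration(containerSec: number, scannedSec: number): number {
  const containerOk = Number.isFinite(containerSec) && containerSec > 0;
  const scannedOk = Number.isFinite(scannedSec) && scannedSec > 0;

  // Nothing scanned: trust the container if it is sane, otherwise return 0 (never NaN).
  if (!scannedOk) return containerOk ? containerSec : 0;
  // Container reports NaN, Infinity, or 0: use the packet-based ground truth.
  if (!containerOk) return scannedSec;

  // Symmetric check: corrects both over-reporting and under-reporting.
  const diverges = Math.abs(containerSec - scannedSec) > DIVERGENCE_TOLERANCE_SEC;
  return diverges ? scannedSec : containerSec;
}
```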
Enriquefft 337838294d fix: pass explicit range to packet scan read
Some containers are truncated when read() has no end bound.
Use container/stream duration + buffer as scan range, matching
the same pattern used in decodeAll().
2026-04-16 13:50:09 -05:00
Enriquefft 5e62ad3215 fix: validate export duration and fix audio trim in speed-aware path
Two bugs in the export pipeline:

1. Container duration from WebM metadata can be unreliable (Chromium bug
   on Linux — reports Infinity, 0, or inflated values). The pipeline
   trusted this value, causing inflated exports, frozen video, and
   "decode ended early" errors.

   Fix: scan actual packet timestamps in loadMetadata() and compare
   against container duration. Use packet-based ground truth when they
   diverge.

2. The speed-aware audio path (renderPitchPreservedTimelineAudio)
   recorded in real-time via MediaRecorder but never paused recording
   during trim-region seeks. Seek dead time was captured as audio,
   inflating the audio track beyond the video duration.

   Fix: pause MediaRecorder during trim seeks, skip past initial trim
   before recording starts, wait for seek completion before resuming.

Fixes #276, #433. Partially addresses #428.
2026-04-16 13:50:09 -05:00
Aaryash Khalkar 7264b9989e Refactor Discord webhook URL handling in workflow
Updated Discord webhook handling to allow for a fallback to DISCORD_PR_FORUM_WEBHOOK if DISCORD_WEBHOOK_URL is not set. Added checks to ensure webhook URL is provided, especially for fork PR events.
2026-04-16 17:16:56 +05:30
Cocoon-Break 501c4f20a1 fix: remove unused COMPARE_LOCALES variable in i18n-check.mjs to pass Biome lint 2026-04-16 17:29:05 +08:00
Cocoon-Break 64e011f798 style: wrap long onDuplicate prop to fix Biome formatter 2026-04-16 17:01:02 +08:00
Cocoon-Break 8b7047365c style: sort lucide-react imports alphabetically to fix Biome lint 2026-04-16 17:00:48 +08:00
Azeru 5caee9bc2d chore(merge): resolve merge conflict in streamingDecoder.ts
Address merge conflict markers added during resolution of Windows export fixes, ensuring clean integration of decode termination logic updates.
2026-04-16 09:51:26 +01:00
Charles Ikechukwu 61b3182f87 Merge branch 'main' into feat/hud-overlay-ux-overhaul 2026-04-16 09:43:08 +01:00
themaker cb44dec81e Merge branch 'feat/hud-overlay-ux-overhaul' of https://github.com/michthemaker/openscreen into feat/hud-overlay-ux-overhaul 2026-04-16 09:41:01 +01:00
themaker 17bed0956d fix(package-lock.json): update package-lock.json to resolve dependencies mismatch 2026-04-16 09:40:31 +01:00
Sid 6d449a46c4 Merge pull request #362 from imAaryash/detect-system-lang
feat(launch): refine recording HUD and language switching UX
2026-04-15 23:10:47 -07:00
Sid e2c4f3f62a Merge pull request #414 from theopfr/fix/correct-frame-count
fix: export frame counter exceeding total frames
2026-04-15 23:06:37 -07:00
Sid ff52e55fa1 Merge branch 'main' into detect-system-lang 2026-04-15 23:02:34 -07:00
Sid 8aa85413f9 Merge pull request #444 from AmitwalaH/fix/read-binary-error
fix: prevent crash in read-binary-file handler and improve error debugging
2026-04-15 23:01:21 -07:00
Sid 4f05cf572e Merge pull request #432 from Enriquefft/feature/nix-support
feat: add Nix flake with dev shell, package, and NixOS/Home Manager modules
2026-04-15 22:57:12 -07:00
Sid cefcf443e4 Merge pull request #436 from Dopiz/feat/i18n-zh-TW
feat(i18n): add zh-TW locale
2026-04-15 22:55:42 -07:00
Sid 20aecc13ae Merge pull request #431 from LorenzoLancia/feat/blur-mosaic-and-black
feat: add mosaic blur and black shading option
2026-04-15 22:51:47 -07:00
Sid 89fce713e5 Merge pull request #430 from SimulAffect/fix/tutorial-help-i18n
fix(i18n): sync tutorial help translations
2026-04-15 22:50:28 -07:00
Sid 847647d310 Merge pull request #421 from pufferfish3e/main
Add documentation section to README
2026-04-15 22:46:21 -07:00
Sid 97d1957b78 Merge pull request #452 from imAaryash/discord-actions
added discord.yaml
2026-04-15 22:13:51 -07:00
JunghwanNA fac0b405d3 fix: handle recording discard and write-failure in cursor telemetry buffer
Address two issues raised during review:

P1 – When a recording is cancelled or restarted, setRecordingState(false)
enqueues its cursor batch but store-recorded-session is never called,
leaving a stale batch that contaminates the next recording's telemetry.
Add discardLatestPending() to the buffer and a discard-cursor-telemetry
IPC handler; the renderer now calls it on the discard path.

P2 – takeNextBatch() dequeued the batch before fs.writeFile, so a write
failure would permanently lose the telemetry. Wrap the write in
try/catch and re-insert the batch via prependBatch() on failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 11:58:16 +09:00
Cocoon-Break 12f3be02f2 fix: sort lucide-react imports alphabetically
Signed-off-by: Cocoon-Break <54054995+kuishou68@users.noreply.github.com>
2026-04-16 09:31:37 +08:00
shaun0927 84ec5a7e68 fix: isolate cursor telemetry samples per recording session
Previously, the main process kept two module-scope arrays —
activeCursorSamples and pendingCursorSamples — and set-recording-state
on a new recording wiped BOTH. When a user stopped recording and
immediately started a new one before store-recorded-session fired,
the previous recording's pending samples were discarded or later
overwritten with the new session's data, producing empty or mismatched
.cursor.json files.

Replace the two arrays with a small FIFO buffer
(createCursorTelemetryBuffer) that:
- Keeps pending batches per completed recording, never wiping them on
  a new session start.
- Yields batches in arrival order to storeRecordedSessionFiles.
- Caps pending batches (default 8) so a never-stored sequence cannot
  leak unbounded memory.

Unit-tested directly in src/lib/cursorTelemetryBuffer.test.ts, including
the rapid-restart race that motivated the change.
2026-04-16 10:27:20 +09:00
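The FIFO design in the commit above can be sketched in a few lines. This is an illustration, not the repository's module: createCursorTelemetryBuffer, endSession, and takeNextBatch are names from the commit log, while startSession, push, the CursorSample shape, and the return value of endSession are assumptions. The sketch does not sanitize maxPendingBatches.

```typescript
interface CursorSample {
  timeMs: number;
  cx: number; // normalised 0..1 (assumed shape)
  cy: number;
}

function createCursorTelemetryBuffer(maxPendingBatches = 8) {
  let active: CursorSample[] = [];
  const pending: CursorSample[][] = []; // oldest batch first

  return {
    // Starting a session resets only the active samples, never the pending batches.
    startSession(): void {
      active = [];
    },
    push(sample: CursorSample): void {
      active.push(sample);
    },
    // Move the active samples into the queue; returns how many old batches were evicted.
    endSession(): number {
      pending.push(active);
      active = [];
      let dropped = 0;
      while (pending.length > maxPendingBatches) {
        pending.shift();
        dropped++;
      }
      return dropped;
    },
    // Batches come back in arrival order, one per stored recording.
    takeNextBatch(): CursorSample[] | undefined {
      return pending.shift();
    },
  };
}
```

In the rapid-restart case, two recordings that end before either is stored each keep their own batch, and the store step receives them in order.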
imAaryash ee395b7896 added discord.yaml 2026-04-15 22:01:28 +05:30
Charles Ikechukwu 9998b43acc Update src/components/launch/SourceSelector.module.css
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-04-15 14:57:26 +01:00
themaker 1cdb8ed1cd feat(ui): add squircle corner shape to SourceSelector and polish sources spinner ui
Added corner-shape: squircle; to SourceSelector.module.css for more visually appealing rounded corners.

Customized windows source selector scrollbar to be more subtle but carry the product colour.

Removed box-shadow on SourceSelector because electron doesn't round corners of the shadow, thereby leaving a square border shadow conflicting with the rounded corners of the SourceSelector.
2026-04-15 14:25:30 +01:00
themaker 566830a866 feat(): changed .gitignore 2026-04-15 09:39:03 +01:00
themaker 143cd1e772 additions 2026-04-14 23:17:06 +01:00
Theodor Peifer 14bbe8f183 fix: align frame cap with epsilon boundary to prevent frame count mismatch 2026-04-14 20:26:21 +02:00
themaker 5bcdf4c558 me and you 2026-04-14 15:58:52 +01:00
themaker df06369b75 me and you 2026-04-14 15:58:39 +01:00
AmitwalaH 6441e96035 fix: prevent crash in read-binary-file handler and improve error debugging 2026-04-14 12:45:02 +05:30
Theodor Peifer 46c611bd3f fix: include epsilon subtraction in totalFrame calculation 2026-04-13 17:30:16 +02:00
Enriquefft d20a062150 fix(nix): handle store path sources for path: flake inputs
gitTracked uses builtins.fetchGit which fails when the source is
already a store path (happens with path: flake inputs from consuming
flakes). Detect store paths at eval time and fall back to cleanSource.
2026-04-13 06:17:07 -05:00
Dopiz 515baf1d84 feat: add zh-TW locale 2026-04-13 17:19:45 +08:00
Enriquefft f106cc6835 fix(nix): restrict package source to git-tracked files
Replace denylist approach with gitTracked to exclude node_modules,
dist, .git, and any other untracked artifacts from the derivation.
Keeps the nix/flake/md exclusions as they are nix-only or non-source.
2026-04-12 18:14:44 -05:00
Enriquefft 456816ab2e fix(nix): correct Electron binary path to libexec/electron
Electron 41.x in nixpkgs places the binary at libexec/electron/,
not lib/electron/. Without this fix, npm run dev fails with ENOENT.
2026-04-12 17:55:43 -05:00
Enriquefft 64cdc0dd3c feat: add Nix flake with dev shell, package, and NixOS/Home Manager modules
Reproducible development environment for NixOS/Nix contributors:
- Dev shell with Node 22, system Electron, Playwright, LD_LIBRARY_PATH
  for X11/Wayland/audio libs, activated automatically via direnv
- buildNpmPackage derivation wrapping system Electron with desktop file
  and hicolor icons
- NixOS module (programs.openscreen.enable) with xdg-desktop-portal
- Home Manager module for per-user installation
- Overlay for composing with other flakes

Tested: nix flake show, nix develop, nix build, nixos-rebuild switch
2026-04-12 13:33:13 -05:00
LorenzoLancia 8bcce473d5 feat: add mosaic blur with black shading 2026-04-12 18:04:43 +02:00
SimulAffect 0efd2d64ed fix(i18n): sync tutorial help translations 2026-04-12 17:26:45 +08:00
Sid a6ae0e6d98 Merge pull request #373 from Moncef-Mhz/adjust-zoom-speed
feat: implement zoom speed
2026-04-11 20:23:10 -07:00
imAaryash d1c9555464 feat(i18n): auto-discover valid locales and harden language menu
- derive available locales from locale folders with required namespace validation

- exclude incomplete locales and report missing namespace files

- align system-language suggestion and selectors with discovered locales

- improve launch HUD language menu interaction, scrolling, and viewport clipping

- make i18n-check discover locale folders automatically
2026-04-12 05:13:31 +05:30
imAaryash 1ef30ff1c7 Merge branch 'detect-system-lang' of https://github.com/imAaryash/openscreen into detect-system-lang 2026-04-12 04:24:00 +05:30
imAaryash e96478e813 Revert "Merge pull request #365 from AmitwalaH/fix-tutorial-translations"
This reverts commit 5494acb5ba.
2026-04-12 04:23:41 +05:30
imAaryash 97fbb01801 fix(i18n): resolve prompt persistence and language menu behavior 2026-04-12 04:23:39 +05:30
imAaryash c9c2634db4 fix(launch): polish language menu behavior 2026-04-12 04:23:37 +05:30
imAaryash 0c627da22c feat(launch): refine recording HUD and language switching UX 2026-04-12 04:23:35 +05:30
moncef e8d6fe3d1b Merge branch 'main' into adjust-zoom-speed 2026-04-11 23:27:50 +01:00
Sid db10f92c49 Merge pull request #300 from samirpatil2000/main
feat: configure macOS hardened runtime, entitlements, and build envir…
2026-04-11 11:45:20 -07:00
Sid bbf75a27e7 Merge pull request #418 from Orchardxyz/fix/icon-size
fix: adjust icon size for macOS platform compatibility
2026-04-11 11:41:58 -07:00
Sid 5781be0ba1 Merge pull request #409 from Scottlexium/fix/hud-follows-spaces
fix: HUD overlay and source selector follow across macOS Spaces
2026-04-11 11:18:50 -07:00
Sid 26c243950a Merge pull request #407 from kwakseongjae/feat/i18n-ko-KR
feat(i18n): add Korean (ko-KR) localization
2026-04-11 11:15:23 -07:00
Sid 0e4fc249ce Merge pull request #392 from imAaryash/patch-1
Fix SUPPORTED_LOCALES array syntax
2026-04-11 10:59:55 -07:00
Sid 321c4983ca Merge pull request #316 from lueckpeter76-lgtm/revert-293-fix/restart-recording-windows
Revert "fix: prevent double-finalize race condition in restartRecording on Windows"
2026-04-11 10:45:38 -07:00
Sid d9114877ff Merge pull request #389 from richard950825-sys/fix/zh-CN-missing-newRecording-translation
fix(i18n): add missing zh-CN translation for newRecording dialog
2026-04-11 10:44:35 -07:00
Azeru e4d4ce284b fix(export): compute requiredEndSec for decode termination handling
Add requiredEndSec calculation to properly handle early decode termination by using the last segment's end time. This addresses issues with export processing on Windows platforms.
2026-04-11 18:35:00 +01:00
Siddharth b713b6a9e8 fix: zoom focus now matches indicator position including wallpaper edges 2026-04-11 10:26:26 -07:00
Siddharth 40028cfd55 feat: add dual frame webcam layout preset (#347) 2026-04-11 10:01:19 -07:00
Siddharth 7169e583c7 revert: undo local merge of PR #347 2026-04-11 09:58:15 -07:00
Azeru d40f40d69d fix(export): compute requiredEndSec for decode termination handling
Add requiredEndSec calculation to properly handle early decode termination by using the last segment's end time. This addresses issues with export processing on Windows platforms.
2026-04-11 17:55:05 +01:00
Siddharth de7518549c feat: add dual frame webcam layout preset (#347) 2026-04-11 09:54:30 -07:00
Azeru 05da56fdc8 fix(export): relax early decode termination on Windows
On Windows, tolerate small decode gaps (<=3 seconds) to work around driver quirks, allowing export to complete with available frames.
2026-04-11 17:45:23 +01:00
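The tolerance rule in the commit above could be expressed as a small predicate. Everything here is a hypothetical sketch: the function name, the parameters, and the use of "win32" as the platform string are assumptions. Only the 3-second limit and the idea of comparing against a required end time (requiredEndSec, the last segment's end) come from the commit messages.

```typescript
const WINDOWS_MAX_DECODE_GAP_SEC = 3; // from the commit: tolerate gaps <= 3 seconds

// Hypothetical predicate: may the export continue when decoding ended
// before the required end time?
function canAcceptEarlyDecodeEnd(
  platform: string,
  lastDecodedSec: number,
  requiredEndSec: number,
): boolean {
  const gapSec = requiredEndSec - lastDecodedSec;
  if (gapSec <= 0) return true; // decoding covered everything that was needed
  // Only Windows gets the relaxed rule; other platforms still treat a gap as an error.
  return platform === "win32" && gapSec <= WINDOWS_MAX_DECODE_GAP_SEC;
}
```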
Shreyas b1a1f45e93 refactor: simplify dual frame preset normalization 2026-04-11 09:30:01 -07:00
Shreyas bce1957505 fix: clear webcam position for non-pip layouts 2026-04-11 09:30:00 -07:00
Azeru 08aff31351 fix(windows): normalize export save path and relax early decode end 2026-04-11 17:27:52 +01:00
Shreyas 24b4b4254a fix: normalize dual frame preset for portrait projects 2026-04-11 09:26:15 -07:00
Shreyas 16cba73cb2 fix: avoid double-scaling dual frame export radius 2026-04-11 09:26:15 -07:00
Shreyas c55f462f1c feat: add dual frame webcam layout preset 2026-04-11 09:20:34 -07:00
Orchard d526ab4cda fix(tray): standardize icon size to 16px on macOS 2026-04-11 22:21:22 +08:00
Kendrick 363683d288 Add documentation section to README
Added a documentation section with a link to OpenScreen Docs.
2026-04-11 21:57:46 +08:00
곽성재 71cdd5f0e0 Merge branch 'main' into feat/i18n-ko-KR 2026-04-11 20:55:36 +09:00
Raj Tiwari 90d04c734e fix(video): prioritize h264 codec and fix pixi render blur 2026-04-11 13:07:07 +05:30
Orchard 33a60fed8c fix(tray): adjust icon size for macOS platform compatibility 2026-04-11 10:39:55 +08:00
Theodor Peifer d21dd1cbf1 fix: export frame counter exceeding total frames 2026-04-10 22:24:37 +02:00
Sid 68295b21ec Merge pull request #394 from LorenzoLancia/feature/blur-selection
feat: add blur selection (rectangle, oval)
2026-04-10 07:10:29 -07:00
Scott Lexium 0bde359421 docs: add JSDoc comments to window factory functions 2026-04-10 12:28:47 +01:00
Scott Lexium e7d82e1478 fix: make HUD overlay and source selector follow across macOS Spaces
Both windows had alwaysOnTop but lacked setVisibleOnAllWorkspaces, so
they stayed pinned to the Space they were first opened on. Users moving
to a different virtual desktop would lose sight of the overlay.

Calls setVisibleOnAllWorkspaces(true, { visibleOnFullScreen: true })
on macOS only — no-op on Windows/Linux so cross-platform behaviour is
unchanged.
2026-04-10 12:13:54 +01:00
kwakseongjae d512f59826 feat(i18n): add Korean (ko-KR) localization
- Add complete Korean locale across all 7 i18n namespaces
- All translation keys match the English baseline 1:1
- Register ko-KR in SUPPORTED_LOCALES and i18n-check validation

Refs siddharthvaddem/openscreen#406
2026-04-10 16:11:23 +09:00
LorenzoLancia 3232918197 Add the Shortcut Blur 2026-04-09 21:51:27 +02:00
Test User cf6dce552e Fix security and reliability issues
1. Validate URL scheme in open-external-url handler
   - Prevent opening file:// or other dangerous schemes via shell.openExternal
   - Only allow http:, https:, and mailto: protocols

2. Fix latest video detection using mtime instead of lexicographic sort
   - Lexicographic sort gives wrong results (e.g. recording-9 > recording-10)
   - Now sorts by file modification time for reliable latest-file detection

3. Add null guard for AudioData.format in cloneWithTimestamp
   - Replace non-null assertion (!) with proper validation
   - Throws descriptive error if format is unexpectedly null

4. Prevent encodeQueue counter underflow in VideoExporter
   - Use Math.max(0, ...) to prevent negative queue count

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 16:58:12 +08:00
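Three of the fixes listed above lend themselves to a short illustration. The following is a sketch under assumptions rather than the committed code: `isAllowedExternalUrl`, `latestByMtime`, `decrementQueue`, and the `FileEntry` shape are hypothetical names, and only the allowed protocol list and the `Math.max(0, ...)` guard come from the commit message.

```typescript
// Protocols named in the commit message as the allowed set.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:", "mailto:"]);

// True only for parseable URLs whose scheme is on the allow-list.
function isAllowedExternalUrl(raw: string): boolean {
  try {
    return ALLOWED_PROTOCOLS.has(new URL(raw).protocol);
  } catch {
    return false; // not a valid absolute URL
  }
}

// Hypothetical record: a file name plus its modification time in milliseconds.
interface FileEntry {
  name: string;
  mtimeMs: number;
}

// Pick the most recently modified entry instead of comparing names as strings.
function latestByMtime(entries: FileEntry[]): FileEntry | undefined {
  let best: FileEntry | undefined;
  for (const entry of entries) {
    if (best === undefined || entry.mtimeMs > best.mtimeMs) {
      best = entry;
    }
  }
  return best;
}

// Decrement a queue counter without letting it drop below zero.
function decrementQueue(count: number): number {
  return Math.max(0, count - 1);
}
```

Sorting by modification time avoids the ordering problem the commit describes, where `recording-9` compares greater than `recording-10` as a string.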
moncef 8118a0cf89 Merge branch 'main' into adjust-zoom-speed 2026-04-08 22:10:55 +01:00
BaptisteAuscher 283fa406b2 languages: tr and fr 2026-04-08 23:00:33 +02:00
BaptisteAuscher 33c384a827 Merge branch 'main' of github.com:siddharthvaddem/openscreen into feature/color-wheel 2026-04-08 22:56:02 +02:00
BaptisteAuscher c3faca19fd small fix: color block handles transparent values 2026-04-08 22:45:27 +02:00
LorenzoLancia 38d72217c2 fix little things blur 2026-04-08 22:43:30 +02:00
BaptisteAuscher 765434b935 code rabbit 2026-04-08 22:23:52 +02:00
LorenzoLancia f6b7c463f0 Fix last issues 2026-04-08 22:21:19 +02:00
BaptisteAuscher 545c02b5bb handle transparent values for the color wheel 2026-04-08 22:04:19 +02:00
LorenzoLancia f8232d9c76 Fix some little issues 2026-04-08 21:36:53 +02:00
LorenzoLancia 5a9c85c345 Fix formatting and locale config 2026-04-08 20:26:16 +02:00
Lorenzo Lancia a4f1c6a2ee feat: add blur selection (rectangle, oval) 2026-04-08 16:42:12 +02:00
Aaryash Khalkar fdfeb51c00 Fix SUPPORTED_LOCALES array syntax 2026-04-08 19:19:08 +05:30
Sid 5494acb5ba Merge pull request #365 from AmitwalaH/fix-tutorial-translations
fix(i18n): add missing tutorial dialog translation keys
2026-04-08 18:55:21 +05:30
Richard 85bd215f1f fix(i18n): add missing zh-CN translation for newRecording dialog
The zh-CN locale was missing the 'newRecording' section in editor.json,
which is present in the en locale. This commit adds the translation for:
- title: 返回录屏
- description: 当前会话已保存。
- cancel: 取消
- confirm: 确认
2026-04-08 13:58:53 +08:00
Sid e7d5f51740 Merge pull request #345 from GarryLaly/feature/webcam-resize-slider
feat: Add webcam size with slider
2026-04-07 22:40:15 -07:00
Sid 9e6b05815f Merge pull request #375 from mehmetnadir/feat/turkish-locale
feat(i18n): add Turkish (tr) locale support
2026-04-07 22:30:30 -07:00
Sid 7bd993a97b Merge branch 'main' into feat/turkish-locale 2026-04-07 22:30:16 -07:00
Sid 558379702a Merge pull request #330 from maxbailey/main
fix: resolve green MP4 exports on CachyOS/Arch Linux (Wayland)
2026-04-07 22:28:00 -07:00
Sid 09b99563f5 Merge pull request #380 from FabLrc/french-traduction
feat(i18n): add French translations
2026-04-07 22:21:34 -07:00
Sid b34961f6af Merge pull request #365 from AmitwalaH/fix-tutorial-translations
fix(i18n): add missing tutorial dialog translation keys
2026-04-07 22:21:02 -07:00
Sid 5a36179454 Merge pull request #383 from marcgabe15/exportTesting
feat: Add unit tests for exporting videos
2026-04-07 22:02:17 -07:00
BaptisteAuscher 10a8feb71d changes after review, factor the color picker component and add validation for the input 2026-04-07 22:33:39 +02:00
Marc Diaz 3482be9864 refactor: remove extraneous comments 2026-04-07 13:50:26 -04:00
Marc Diaz b8fe1a1ec8 fix(playwright): use one version 2026-04-07 13:32:49 -04:00
Marc Diaz 33609432e1 fix: use npm for install 2026-04-07 13:05:27 -04:00
Marc Diaz b65c68d139 fix: use headless 2026-04-07 13:02:11 -04:00
Marc Diaz 6bff2a2a2c feat: use export testing 2026-04-07 12:58:33 -04:00
samirpatil2000 dfbaf3f176 ci: update build workflow configuration 2026-04-07 22:13:28 +05:30
samirpatil2000 3709342c6c ci: update build workflow configuration and dependencies 2026-04-07 22:00:42 +05:30
Samir Patil 0489d7b9f5 Merge branch 'siddharthvaddem:main' into main 2026-04-07 21:59:25 +05:30
moncef 0cb298d20b Fix PR reviews 2026-04-07 11:58:45 +01:00
moncef 7409631207 Fix PR review SelecedSpeedId 2026-04-07 11:43:20 +01:00
moncef 8f35cf090c feat: add zoomRegionUtils to calculate dominant zoom regions and handle smooth transitions between connected regions 2026-04-07 11:40:39 +01:00
FabLrc 1f56bb42c3 fix(i18n): update French translations for cycle annotations shortcuts 2026-04-07 12:17:53 +02:00
FabLrc 7a8fb807e6 feat(i18n): add French translations for common and dialogs namespaces 2026-04-07 12:17:10 +02:00
Garry Priambudi 0e1a69a7b2 Merge branch 'main' into feature/webcam-resize-slider 2026-04-07 17:13:38 +07:00
FabLrc e739653b3f feat(i18n): add French translations for various application components 2026-04-07 12:05:36 +02:00
Sid 9024eaae61 Merge pull request #307 from Ayush765-spec/main
Added a New Recording button so that the user does not have to exit the entire application
2026-04-06 23:00:24 -07:00
Sid c5882b06b1 Merge pull request #334 from matthew-hre/matthew-hre/jj-przmrvurqkow
fix: handle av1 VideoDecoder errors
2026-04-06 22:47:50 -07:00
Sid 306b61a902 Merge pull request #291 from 1shanpanta/feat/extended-speed-options
feat: extend speed options with higher presets and custom speed input
2026-04-06 22:14:08 -07:00
Nadir A. c36349d950 feat(i18n): add Turkish (tr) locale support
Add complete Turkish translation across all 7 i18n namespaces:
- common: actions, playback controls, locale metadata
- launch: HUD tooltips, audio/webcam controls, source selector
- editor: error messages, export, project, recording permissions
- dialogs: export progress, trim tutorial, unsaved changes, file dialogs
- settings: all panels (zoom, speed, trim, layout, effects, background,
  crop, export, annotations, custom fonts, language, audio)
- shortcuts: keyboard shortcuts panel and all actions
- timeline: toolbar buttons, hints, labels, errors, success messages

Also adds "tr" to SUPPORTED_LOCALES config and i18n validation script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 03:05:21 +03:00
moncef 112f02fe03 feat: implement video editor timeline components with interactive zoom, trim, and speed region controls. 2026-04-07 00:30:23 +01:00
BaptisteAuscher 2c10073d30 ai review changes 2026-04-06 21:02:50 +02:00
BaptisteAuscher 7e563166a3 add color wheel to background and annotations 2026-04-06 20:37:05 +02:00
AmitwalaH 4e2a53b200 fix: spacing issues in tutorial translations 2026-04-06 15:19:24 +05:30
AmitwalaH 90ba713323 fix(i18n): update tutorial dialog translation keys for all locales 2026-04-06 15:08:49 +05:30
Sid 24928164ca Merge pull request #355 from getSono/main
Adding automatic debian builds
2026-04-05 22:56:56 -07:00
imAaryash 3d20c67c63 fix(i18n): resolve prompt persistence and language menu behavior 2026-04-06 10:15:41 +05:30
imAaryash 4e43b59b42 fix(launch): polish language menu behavior 2026-04-06 10:11:07 +05:30
imAaryash 08b5580ca2 feat(launch): refine recording HUD and language switching UX 2026-04-06 09:41:42 +05:30
imAaryash 36453d740f Update LaunchWindow.tsx 2026-04-06 08:18:40 +05:30
Garry Priambudi 5320f76aae Merge branch 'main' into feature/webcam-resize-slider 2026-04-06 07:56:28 +07:00
Sid e571ecbf8d Merge pull request #352 from siddharthvaddem/sid/fix-read-handler-security
fix(security): prevent path traversal in IPC file read handlers
2026-04-05 17:04:27 -07:00
Siddharth d4c50c9a5e ci: remove flaky e2e test job from CI pipeline 2026-04-05 16:48:53 -07:00
Siddharth 3e6dff9c34 fix: wrap evaluate in try/catch for expected HUD window close
The HUD window now closes faster after switchToEditor, causing the
Playwright page context to terminate before evaluate returns.
2026-04-05 16:34:35 -07:00
Siddharth 1b6f4cce46 fix: restore original e2e test with minimal security fix additions
Revert to exact working version (7e65d52), only adding:
- recordings dir copy for path security check
- --enable-unsafe-swiftshader for CI WebGL
2026-04-05 16:29:54 -07:00
Julian Wolf ec946f0807 Merge branch 'siddharthvaddem:main' into main 2026-04-06 01:26:58 +02:00
Julian Wolf 37e1a82d05 Update ci.yml 2026-04-06 01:26:43 +02:00
Sid 2ce4dc49b3 Merge pull request #354 from notrudyyy/patch-1
Improve grammar and mobile images in README
2026-04-05 16:23:53 -07:00
Siddharth db815e362a ci: trigger checks 2026-04-05 16:22:22 -07:00
Siddharth ed9b8689f7 fix: catch expected page close error in e2e test evaluate call
switchToEditor closes the HUD window, which terminates the Playwright
page context before evaluate can return. Catch at the outer level.
2026-04-05 16:20:29 -07:00
Siddharth dc0856282f fix: add --enable-unsafe-swiftshader to e2e test for CI WebGL support
The headless CI environment fails to create valid WebGL framebuffers,
causing PixiJS pixel reads to fail silently and GIF export to hang.
SwiftShader provides a software WebGL implementation that works reliably.
2026-04-05 16:14:34 -07:00
Anirudh Vempati da79dab756 Update preview image sizes to be dynamic 2026-04-06 04:37:49 +05:30
Siddharth 1dc2c06ee4 fix: revert e2e test to fire-and-forget setCurrentVideoPath with reload
Restore the original test approach that was passing: fire-and-forget
setCurrentVideoPath, catch the switchToEditor context close, and reload
the editor window for WebCodecs initialization.
2026-04-05 16:04:01 -07:00
Anirudh Vempati b6803eb6e3 Update README.md 2026-04-06 04:28:15 +05:30
Siddharth 8013cc97bb fix: remove editor reload in e2e test that was clearing video state
The reload was intended to ensure WebCodecs registered, but it clears
the video path state set before the editor opened, causing the editor
to load blank and the export to never complete.
2026-04-05 15:56:28 -07:00
Siddharth e45611ade4 fix: e2e test — copy fixture into recordings dir for path security check
The test fixture path is outside RECORDINGS_DIR, so set-current-video-path
rejects it after the path traversal fix. Copy the fixture into the app
recordings directory before loading it.
2026-04-05 15:42:25 -07:00
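The rejection described above, where a path outside the recordings directory fails the security check, can be illustrated with a containment test like the one below. This is a sketch only: `isInsideDir` is a hypothetical helper, and the real handler may validate paths differently.

```typescript
import * as path from "node:path";

// Hypothetical helper: true when `candidate` resolves to a location strictly
// inside `baseDir`, after ".." segments have been normalized away.
function isInsideDir(baseDir: string, candidate: string): boolean {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, candidate);
  const rel = path.relative(base, target);
  // "" is the directory itself; a leading ".." segment or an absolute
  // result (e.g. another drive on Windows) means the target escapes it.
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  return rel !== "" && !escapes;
}
```

Comparing the relative path, rather than checking a string prefix, keeps a sibling directory such as `recordings-evil` from passing as if it were inside `recordings`.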
Siddharth b986148d5d fix exporter test 2026-04-05 15:36:29 -07:00
Siddharth fe0c2829a7 fix 2026-04-05 15:33:39 -07:00
Siddharth e4672811de fix(security): prevent path traversal in IPC file read handlers 2026-04-05 14:58:28 -07:00
Julian Wolf 3e28e5860d Fix JSON formatting in package.json 2026-04-05 22:06:06 +02:00
Julian Wolf 925a7e532d Add author information to package.json 2026-04-05 22:04:16 +02:00
Sid f3d761b28d Merge pull request #324 from JasonOA888/fix/306-persist-user-settings
fix: persist user settings across sessions
2026-04-05 12:55:31 -07:00
Julian Wolf c9861dbef8 Refactor CI workflow for E2E tests
Updated CI workflow to include E2E tests conditionally.
2026-04-05 21:54:03 +02:00
Julian Wolf 1591edbeca Add .deb files to build artifacts 2026-04-05 21:35:33 +02:00
Julian Wolf a3c2ed8ed1 Update Linux build command to include AppImage and deb 2026-04-05 21:34:26 +02:00
Siddharth ae971bc480 fix: resolve type error, formatting, and import order from PR #321 2026-04-05 11:03:45 -07:00
Sid 213637967e fix(editor): track unsaved changes for new projects (#321)
fix(editor): track unsaved changes for new projects
2026-04-05 11:02:42 -07:00
JasonOA888 a8427b950e fix: resolve lint errors for CI
- Add updateState to useEffect dependency array
- Remove ineffective biome-ignore suppression comment
- Fix formatting in userPreferences.ts per biome rules
2026-04-06 02:01:01 +08:00
Siddharth c868469be5 fix: auto-finalize duration bug, restore cancelRecording, and add i18n for pause tooltips 2026-04-05 10:17:35 -07:00
Sid e90bba82ef feat: add pause/resume recording (#314)
added a new Feature that allows user to pause/resume while screen rec…
2026-04-05 10:05:38 -07:00
Siddharth 475cbcd76c revert: undo manual merge of PR #314 2026-04-05 10:05:04 -07:00
Siddharth 08f66c7c25 feat: add pause/resume recording with duration fix 2026-04-05 09:57:35 -07:00
Ayush765-spec 013312be1f Refactor: update 'New Recording' dialog and atomize confirm workflow (plus lint fixes) 2026-04-05 22:27:32 +05:30
Ayush Mukherjee 735dd2a191 Merge branch 'siddharthvaddem:main' into main 2026-04-05 22:14:49 +05:30
Siddharth 7072c05edd fix: duration bug in auto-finalize path and add i18n for pause tooltip 2026-04-05 09:39:28 -07:00
Manish 0bc3bbca6b Merge branch 'main' into feature/pause-button 2026-04-05 22:03:35 +05:30
Sid 5340272530 Merge pull request #313 from theaiagent/feature/frame-step-navigation
feat: add arrow key frame-by-frame playhead navigation
2026-04-05 08:49:43 -07:00
Garry Laly 2ee7ccd89c fix: feedback coderabbit 2026-04-05 20:19:31 +07:00
Garry Laly 79201569c5 feat: Add webcam size presets with slider 2026-04-05 20:00:44 +07:00
Garry Laly ca962ff16b feat: Add webcam size presets (small/medium/large) 2026-04-05 19:45:50 +07:00
cocoon 5426b6284c feat(editor): duplicate annotations 2026-04-05 09:16:04 +00:00
Sid da16872809 Merge pull request #295 from abres33/feature/cancel-recording
feat: add Cancel Recording button to HUD
2026-04-04 22:10:17 -07:00
Sid 11788ad703 Merge pull request #332 from marcgabe15/addDiscord
Add discord
2026-04-04 20:35:23 -07:00
Matthew Hrehirchuk 2712d8a41b fix: use view-aware byte extraction for BufferSource inputs 2026-04-04 21:00:16 -06:00
Matthew Hrehirchuk 21361d9bf8 fix: handle av1 VideoDecoder errors 2026-04-04 20:33:39 -06:00
Marc Diaz 66f9172a35 feat: add discord to readme 2026-04-04 20:52:14 -04:00
Marc Diaz bd604cb658 add discord to readme 2026-04-04 20:43:41 -04:00
Max Bailey 3b5ad5064e fix: resolve green MP4 exports on CachyOS/Arch Linux (Wayland)
On Linux/Wayland the implicit GPU-to-2D texture-sharing path used by
drawImage(webglCanvas) fails silently (EGL/Ozone), producing green
frames. Use explicit gl.readPixels to copy from GPU to CPU memory,
bypassing that path.
2026-04-04 19:12:15 -05:00
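An explicit GPU-to-CPU copy of the kind described above might look like the following. This is a hedged sketch, not the committed fix: `PixelReader` is a hypothetical structural subset of `WebGLRenderingContext`, `readFrameTopDown` is an assumed name, and the row flip is an added assumption (the commit does not mention it) based on `readPixels` returning rows bottom-first.

```typescript
// Hypothetical subset of WebGLRenderingContext needed by this sketch.
interface PixelReader {
  readonly RGBA: number;
  readonly UNSIGNED_BYTE: number;
  readPixels(
    x: number,
    y: number,
    width: number,
    height: number,
    format: number,
    type: number,
    pixels: Uint8Array,
  ): void;
}

// Copy the framebuffer into CPU memory and reorder rows top-to-bottom.
function readFrameTopDown(gl: PixelReader, width: number, height: number): Uint8Array {
  const rowBytes = width * 4; // RGBA, one byte per channel
  const bottomUp = new Uint8Array(rowBytes * height);
  gl.readPixels(0, 0, width, height, gl.RGBA, gl.UNSIGNED_BYTE, bottomUp);

  const topDown = new Uint8Array(bottomUp.length);
  for (let row = 0; row < height; row++) {
    const src = (height - 1 - row) * rowBytes;
    topDown.set(bottomUp.subarray(src, src + rowBytes), row * rowBytes);
  }
  return topDown;
}
```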
dheerajmr01 210baee0da added acquireId guard to prevent stale getUserMedia from repopulating webcamStream 2026-04-04 14:25:48 -05:00
Samir Patil 0e3106f7ec Merge branch 'main' into main 2026-04-05 00:42:10 +05:30
dheerajmr01 5ff613922f fix: address review comments - clear track.onended before intentional stop to prevent disconnect toast 2026-04-04 14:03:26 -05:00
dheerajmr01 b270affb25 trigger re-review 2026-04-04 12:42:23 -05:00
Amir Yunus 1b980d6264 fix(hud): avoid horizontal scrollbar when recording on Windows
Use full-size layout and overflow clipping instead of 100vw/100vh on the HUD shell so the fixed 600×160 overlay does not gain a horizontal scrollbar when recording widens the toolbar.

Fixes #305
2026-04-05 01:33:25 +08:00
dheerajmr01 954b99e962 fix: addresses review - differentiate webcam error types and handle stream acquisition 2026-04-04 12:31:28 -05:00
JasonOA888 4f48ecd4bc fix: address code review feedback for settings persistence
- Replace useRef with useState for prefsHydrated to prevent race condition
- Wrap localStorage.getItem in try/catch in loadUserPreferences
- Validate aspectRatio against known valid values
- Include 'good' in exportQuality validation, 'mp4' in exportFormat validation
2026-04-04 23:58:25 +08:00
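The guarded load described in this commit could be sketched as below. Everything beyond the commit text is an assumption: the `StorageLike` interface, the storage key, the default values, and the value lists (only `'good'` and `'mp4'` are named in the message; the other entries are placeholders).

```typescript
// Minimal stand-in for the part of window.localStorage that is used.
interface StorageLike {
  getItem(key: string): string | null;
}

// Placeholder value lists; only "good" and "mp4" come from the commit message.
const QUALITIES = ["medium", "good", "source"] as const;
const FORMATS = ["mp4", "gif"] as const;

interface UserPreferences {
  exportQuality: (typeof QUALITIES)[number];
  exportFormat: (typeof FORMATS)[number];
}

// Assumed defaults, for illustration only.
const DEFAULTS: UserPreferences = { exportQuality: "good", exportFormat: "mp4" };

// Return `value` if it is one of the allowed strings, otherwise the fallback.
function pick<T extends string>(allowed: readonly T[], value: unknown, fallback: T): T {
  return typeof value === "string" && (allowed as readonly string[]).includes(value)
    ? (value as T)
    : fallback;
}

function loadUserPreferences(storage: StorageLike, key = "userPreferences"): UserPreferences {
  let parsed: unknown;
  try {
    const raw = storage.getItem(key); // may throw when storage is blocked
    if (raw === null) return { ...DEFAULTS };
    parsed = JSON.parse(raw); // may throw on corrupt data
  } catch {
    return { ...DEFAULTS };
  }
  if (typeof parsed !== "object" || parsed === null) return { ...DEFAULTS };
  const record = parsed as Record<string, unknown>;
  return {
    exportQuality: pick(QUALITIES, record["exportQuality"], DEFAULTS.exportQuality),
    exportFormat: pick(FORMATS, record["exportFormat"], DEFAULTS.exportFormat),
  };
}
```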
JasonOA888 7d746196d2 fix: persist user settings across sessions (closes #306)
Load saved preferences (padding, aspect ratio, export quality, export format)
on mount and auto-save whenever these settings change. Uses the existing
userPreferences.ts utility with a ref guard to prevent overwriting saved prefs
with defaults before the initial load completes.
2026-04-04 23:27:56 +08:00
JasonOA888 d5f59a7b8e fix: persist user settings across sessions
Add userPreferences module to save/load padding, aspect ratio,
export format and quality to localStorage. Applied on mount
in VideoEditor.

Closes #306
2026-04-04 23:16:39 +08:00
cocoon 478fe316dc fix(editor): track unsaved changes for new projects 2026-04-04 13:23:51 +00:00
dheerajmr01 20b0899c05 fix: camera light flashes and turns off when clicking webcam button (#308) 2026-04-04 01:43:54 -05:00
Ayush765-spec b451bdc03d Merge branch 'main' of https://github.com/Ayush765-spec/openscreen 2026-04-04 11:51:49 +05:30
Ayush765-spec 43ec6ee9cd fix(editor): localize new recording dialog and fix session clear behavior 2026-04-04 11:51:05 +05:30
Ayush Mukherjee 98da431da0 Merge branch 'siddharthvaddem:main' into main 2026-04-04 11:38:01 +05:30
Sid 21893f07af Merge pull request #288 from gulivan/feature/webcam-mask-shapes
Add webcam mask shape support
2026-04-03 22:56:01 -07:00
Sid 763c187f87 Merge pull request #281 from GuilhermeFaga/main
fix(#264): read raw pixels from canvas for VideoFrame to avoid silent failures on Linux
2026-04-03 22:50:15 -07:00
Sid 20567db245 Merge pull request #257 from xKeCo/feature/auto-follow-zoom
feat: add auto-follow zoom mode with cursor tracking
2026-04-03 22:42:02 -07:00
Sid 7a1113827c Merge pull request #318 from tmchow/feat/219-appimage-update-info
feat: embed AppImage update information for delta updates
2026-04-03 22:32:56 -07:00
Trevin Chow 7e298d3bbf feat: embed AppImage update information for delta updates
Add a top-level publish config in electron-builder.json5 pointing to
GitHub Releases. This embeds the update information URL in the AppImage
header, enabling tools like AppImageUpdate, AppImageLauncher, and
AppManager to perform delta updates instead of full re-downloads.

Also update the Linux build workflow to upload the generated .zsync file
alongside the .AppImage artifact.

Fixes #219
2026-04-03 20:14:20 -07:00
lueckpeter76-lgtm f972556443 Revert "fix: prevent double-finalize race condition in restartRecording on Windows" 2026-04-03 18:33:54 -06:00
theaiagent 97c9a73578 fix: skip frame-step on ARIA widgets that own arrow keys
Expand the arrow key guard to also skip elements with
role="separator" (PanelResizeHandle), role="slider", and
role="spinbutton" so keyboard panel resizing is not intercepted.
2026-04-03 23:02:12 +03:00
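The guard described here can be sketched as a predicate over the focused element. The names are assumptions: `FocusTarget` is a hypothetical stand-in for the DOM element, `ownsArrowKeys` is an invented helper, and treating `INPUT` and `TEXTAREA` as already guarded is a guess; the roles, `SELECT`, and contentEditable come from the commit messages.

```typescript
// Hypothetical stand-in for the properties read from the focused element.
interface FocusTarget {
  tagName: string;
  isContentEditable: boolean;
  role: string | null;
}

// ARIA roles named in the commit as owning arrow keys.
const ARROW_KEY_ROLES = new Set(["separator", "slider", "spinbutton"]);

// Form controls; INPUT and TEXTAREA are assumed to have been guarded already.
const FORM_TAGS = new Set(["INPUT", "TEXTAREA", "SELECT"]);

// True when frame stepping should be skipped for this target.
function ownsArrowKeys(target: FocusTarget): boolean {
  if (target.isContentEditable) return true;
  if (FORM_TAGS.has(target.tagName.toUpperCase())) return true;
  return target.role !== null && ARROW_KEY_ROLES.has(target.role);
}
```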
theaiagent 3bfcd8576b fix: read live video.currentTime for rapid frame steps and add JSDoc
- Read currentTime directly from the video element instead of the React
  ref so rapid arrow key presses each advance by exactly one frame
- Add JSDoc docstrings to frameStep.ts exports
2026-04-03 22:44:25 +03:00
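Reading the live position on every key press, as this commit describes, might look like the sketch below. It is illustrative only: `SeekableVideo` is a hypothetical stand-in for `HTMLVideoElement`, `stepOneFrame` is an assumed name, and the frame rate is passed in because the log does not say how it is obtained.

```typescript
// Hypothetical stand-in for the part of HTMLVideoElement used here.
interface SeekableVideo {
  currentTime: number;
  duration: number;
}

// Move the playhead by one frame, reading the live currentTime so that
// rapid repeated calls accumulate instead of reusing a stale value.
function stepOneFrame(video: SeekableVideo, direction: 1 | -1, fps: number): number {
  const next = video.currentTime + direction / fps;
  const clamped = Math.min(video.duration, Math.max(0, next));
  video.currentTime = clamped;
  return clamped;
}
```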
maniesh6900 b002f2a485 added a new feature that allows users to pause/resume while screen recording 2026-04-04 00:56:14 +05:30
theaiagent cd0f2ab318 fix: expand arrow key guard for form controls and wire i18n for fixed shortcuts
- Add HTMLSelectElement and contentEditable to the arrow key input guard
  to prevent intercepting native keyboard behavior on form controls
- Add i18nKey field to FixedShortcut interface and wire up i18n lookups
  in ShortcutsConfigDialog and KeyboardShortcutsHelp so fixed shortcut
  labels are properly localized
2026-04-03 22:06:45 +03:00
xKeCo 54df597160 feat: enhance adaptive smoothing for auto-follow zoom in video playback 2026-04-03 12:26:07 -05:00
theaiagent e5430eed39 feat: add arrow key frame-by-frame playhead navigation (#302) 2026-04-03 17:50:53 +03:00
theaiagent baa30a9d6a test: add unit tests for frame step time computation 2026-04-03 17:44:26 +03:00
theaiagent b709d0d240 feat: add frame step entries to FIXED_SHORTCUTS display list 2026-04-03 17:37:13 +03:00
theaiagent 11bad60eb2 feat: add i18n labels for frame step shortcuts (en, es, zh-CN) 2026-04-03 17:36:43 +03:00
Ayush Mukherjee 5259ae5d87 Merge branch 'siddharthvaddem:main' into main 2026-04-03 18:58:00 +05:30
Ayush765-spec 14cd045e65 [Feature]: Ability to start a new recording from the editor 2026-04-03 18:57:05 +05:30
samirpatil2000 78901a8076 feat: configure macOS hardened runtime, entitlements, and build environment variables for notarization 2026-04-03 15:16:45 +05:30
Adam 27853cc2c3 fix: await setCurrentVideoPath and narrow catch in gif-export E2E test 2026-04-03 02:32:47 -05:00
Adam d6933813bd fix: move try/catch outside evaluate() in gif-export E2E test 2026-04-03 02:25:29 -05:00
Adam 2b471783c0 feat: add Cancel Recording button to HUD 2026-04-03 02:00:36 -05:00
Sid b101820ab8 Merge pull request #293 from abres33/fix/restart-recording-windows
fix: prevent double-finalize race condition in restartRecording on Windows
2026-04-02 23:35:46 -07:00
Sid 3061c141c6 Merge pull request #249 from EtienneLescot/feat/webcam-selector-optimization
feat: added webcam source selector and optimized horizontal UI
2026-04-02 23:30:30 -07:00
Adam 846cf71e09 fix: prevent double-finalize race condition in restartRecording on Windows 2026-04-03 01:12:26 -05:00
Ishan Panta 3895ca985f [add] extend speed options with higher presets and custom speed input
add 3x, 4x, 5x speed presets and a custom playback speed input field
that accepts any integer value up to 16x. change PlaybackSpeed type
from a fixed union to number with min/max constants and clamp utility.
update project persistence to validate any speed in range instead of
exact value matching. add i18n keys for en, es, zh-CN.

closes #252
2026-04-03 08:37:16 +05:45
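The move from a fixed union to a ranged number could be sketched as follows. This is an assumption-laden illustration: the constant and function names are invented, the 16x upper bound comes from the commit message, and the 0.25 lower bound and the fallback of 1 are placeholders.

```typescript
// Upper bound from the commit message; lower bound is a placeholder.
const MIN_PLAYBACK_SPEED = 0.25;
const MAX_PLAYBACK_SPEED = 16;

// Clamp arbitrary input into the supported range; non-finite input falls
// back to normal speed (an assumed choice).
function clampPlaybackSpeed(value: number): number {
  if (!Number.isFinite(value)) return 1;
  return Math.min(MAX_PLAYBACK_SPEED, Math.max(MIN_PLAYBACK_SPEED, value));
}

// Range-based validation for persisted projects, replacing exact-value matching.
function isValidPlaybackSpeed(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isFinite(value) &&
    value >= MIN_PLAYBACK_SPEED &&
    value <= MAX_PLAYBACK_SPEED
  );
}
```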
Ivan 9d0ccf3bde Add webcam mask shape support 2026-04-03 00:09:51 +03:00
Faga 914a3c7f7b fix: read raw pixels from canvas for VideoFrame to avoid silent failures on Linux 2026-04-02 11:55:21 -03:00
Siddharth 2f36160174 version bump 2026-04-01 22:08:43 -07:00
xKeCo 05a87a8ab1 Revert "demo: add example project file for auto-follow zoom"
This reverts commit 5c6621293a.
2026-04-01 02:53:03 -05:00
xKeCo 5c6621293a demo: add example project file for auto-follow zoom
Contains the zoom region configuration used in the PR demo video:
two auto-follow zoom regions and one manual zoom region.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 02:51:25 -05:00
xKeCo 163b12d6fc ♻️ refactor: refactor zoom focus handling in video editor settings and playback 2026-04-01 02:38:42 -05:00
xKeCo 3be195cc15 feat: smooth auto-follow zoom with export parity 2026-04-01 01:41:20 -05:00
Etienne Lescot baec9a7585 fix: focusable element when webcam expanded with no devices, add error test
- LaunchWindow: render sr-only <select> when webcamExpanded but
  cameraDevices.length === 0 (loading/error/empty), so keyboard users
  always have a focusable element even in no-camera states
- useCameraDevices.test: add error-branch test asserting error message,
  empty devices array and isLoading=false when enumerateDevices rejects
2026-03-27 16:28:53 +01:00
Etienne Lescot 9817c85acf fix: address coderabbit review (concurrent stream, collapsed label, unified select, test quality)
- useCameraDevices: remove getUserMedia label probe to avoid conflict with
  useScreenRecorder acquiring the real stream; use enumerateDevices only and
  fall back to 'Camera <id>' for unlabeled devices; gate effect on enabled flag
- LaunchWindow: fix selectedCameraLabel to reflect loading/error/empty states
  in the collapsed view (was always showing 'Default Camera')
- LaunchWindow: unify webcam <select> to a single always-mounted element
  (sr-only when unavailable); mirrors the mic selector pattern
- useCameraDevices.test.ts: re-seed mockGetUserMedia in beforeEach after
  vi.resetAllMocks(); update permission test to assert fallback label behavior
2026-03-27 15:15:43 +01:00
Etienne Lescot 9762448929 fix: address coderabbit comments (loading state + keyboard access)
- LaunchWindow: expose isLoading/error from useCameraDevices; show
  'Searching...' only while enumeration is in flight, 'Camera unavailable'
  on error, 'No camera found' when list is empty (fixes perpetual loading state)
- LaunchWindow: keep <select> always mounted (sr-only when collapsed) and
  expand panel on focus as well as hover; fixes keyboard inaccessibility for
  both mic and webcam selectors
- i18n: add webcam.noneFound and webcam.unavailable to en/es/zh-CN locales
2026-03-27 14:53:41 +01:00
Etienne Lescot eade28079d fix: address PR review comments
- useCameraDevices: remove selectedDeviceId from useEffect deps (use ref instead)
- useCameraDevices: fall back to first available device when selected device is unplugged
- i18n: add missing keys (audio.defaultMicrophone, webcam.defaultCamera, webcam.searching) to en/es/zh-CN
- LaunchWindow: replace hardcoded strings with t() i18n calls
- tests: add afterEach(vi.resetAllMocks()), improve permission test assertions, add stale device fallback test
2026-03-27 14:39:19 +01:00
Etienne Lescot fed5a44b5a fix: enforce identical badge height with h-[36px] on both selectors 2026-03-27 14:25:28 +01:00
Etienne Lescot a9222c9484 fix: equalize badge heights and reduce window to 160px 2026-03-27 14:15:46 +01:00
Etienne Lescot e72851d4ef fix: use fixed positioning for HUD and device selectors to avoid h-full clipping 2026-03-27 14:09:54 +01:00
Etienne Lescot 317089d57f fix: restore flex layout to ensure HUD renders in transparent Electron window 2026-03-27 14:04:04 +01:00
Etienne Lescot 0a5e57ce76 feat: add webcam source selector with stable HUD layout
- Add useCameraDevices hook to enumerate video input devices
- Update useScreenRecorder to support webcamDeviceId selection
- Add device selector UI above HUD bar (mic + webcam, hover-to-expand)
- All selectors and HUD bar are absolute-positioned to prevent layout shifts
- Increase HUD window to 600x200px to accommodate device panels
- Add unit tests for useCameraDevices hook
2026-03-27 13:45:52 +01:00
Sid e35780bd85 Merge pull request #244 from ateendra24/fix-issue-226
feat: add fullscreen video player
2026-03-22 09:39:54 -07:00
AP Solanki eae3f119a4 feat: Implement PlaybackControls component and add i18n files for common terms in English, Spanish, and Chinese. 2026-03-22 15:33:43 +05:30
AP Solanki 5d561ff06f feat: add fullscreen video player 2026-03-22 14:17:44 +05:30
Siddharth a8bb0e88d5 improved vertical split gated behind 9:16 2026-03-21 23:15:46 -07:00
Siddharth cbbe2d7fbf movable camera pip 2026-03-21 22:04:10 -07:00
Siddharth 7aca8b8bc1 move project settings to top 2026-03-21 20:07:09 -07:00
Siddharth 4a299063c3 lang support 2026-03-21 18:18:43 -07:00
Siddharth 3d680e8521 Merge feat(export): allow re-saving exported video on dialog cancel (PR #181) 2026-03-21 17:06:36 -07:00
Siddharth c322825969 feat(export): allow re-saving exported video on dialog cancel 2026-03-21 17:06:25 -07:00
Sid ece93683b8 Merge pull request #243 from ryujh030820/feature/improve-timeline-navigation
Improve timeline navigation while scrubbing and scrolling
2026-03-21 16:43:25 -07:00
JH 203282be43 fix: pan timeline on row scroll 2026-03-20 16:52:16 +09:00
JH d8871d9228 fix: pan timeline when dragging playhead to edges 2026-03-20 16:25:19 +09:00
Sid dd0b7d6586 Merge pull request #210 from linyqh/codex/exporter-timeout-fallback
Stabilize video export on Windows
2026-03-19 20:53:32 -07:00
linyq 459b71f792 fix: satisfy biome formatting in video exporter 2026-03-20 10:19:49 +08:00
Sid adc0cf795c Merge pull request #237 from ryujh030820/fix/gradient-export-rendering
fix: fix gradient background export rendering
2026-03-19 18:27:43 -07:00
JH 796506819d Merge branch 'main' into fix/gradient-export-rendering 2026-03-20 09:32:55 +09:00
linyqh 2a2d7e7aba Stabilize video export on Windows 2026-03-20 00:04:34 +08:00
Sid 3eeecc46cf Merge pull request #241 from marcusschiesser/codex/add-multiple-layout-presets-for-video
Add selectable webcam layout presets (Picture in Picture, Vertical Stack)
2026-03-19 08:28:20 -07:00
Marcus Schiesser 6236d2a13d fix: handle export and camera access edge cases 2026-03-19 20:03:55 +08:00
Marcus Schiesser c84c244761 Pin Node and npm versions 2026-03-19 19:25:07 +08:00
Marcus Schiesser 83a60926d8 fix: center stacked screen and webcam layout 2026-03-19 17:51:51 +08:00
Marcus Schiesser 579887e2f8 fix: improve camera permission handling 2026-03-19 16:49:46 +08:00
Marcus Schiesser a0682e6716 feat: add selectable webcam layout presets 2026-03-19 13:05:42 +08:00
JH 038d6c40ab fix: fix gradient background export rendering 2026-03-18 14:30:21 +09:00
Sid 45636410fe Merge pull request #234 from siddharthvaddem/codex/issue-231
fix 231
2026-03-17 20:35:50 -07:00
Siddharth 69f1b4d20f fix 231 2026-03-17 20:25:34 -07:00
Sid d968689975 Merge pull request #233 from siddharthvaddem/codex/issue-230
fix: avoid false early decode failures
2026-03-17 20:10:40 -07:00
Siddharth 7e65d52847 fix 2026-03-17 20:07:15 -07:00
Siddharth 1680ef9b77 fix: guard exported file paths in export flow 2026-03-17 19:46:56 -07:00
Siddharth b7070f3ac8 Merge remote-tracking branch 'origin/main' into codex/issue-230 2026-03-17 19:45:27 -07:00
Siddharth de18a2f46f fix: avoid false early decode failures 2026-03-17 19:30:47 -07:00
Sid 7a6efc5df9 Merge pull request #232 from siddharthvaddem/codex/saved-to-location
bring back show folder
2026-03-17 19:06:53 -07:00
Siddharth 4b8c95f04f bring back show folder 2026-03-17 19:05:59 -07:00
Siddharth 0f123283b3 Merge remote-tracking branch 'origin/main' into main 2026-03-17 18:55:46 -07:00
Siddharth b33ec5e2d7 fix: restore webcam sessions and stop export deadlocks 2026-03-17 18:50:05 -07:00
Siddharth 0a0dd088c3 Merge branch 'codex/pr-229' into main 2026-03-17 18:47:19 -07:00
Sid 2669b380a3 Merge pull request #216 from prayaslashkari/feature/restart-recording
feat: Add Restart Recording Functionality
2026-03-17 16:22:30 -07:00
Sid 0935dac70a Merge pull request #228 from prayaslashkari/feature/resizeable-video-editor
refactor: Resizable Video Editor Layout, Migrated inline styles to TailwindCSS
2026-03-17 15:37:56 -07:00
Prayas Lashkari e2147bec63 feat: enhance restart recording functionality to prevent concurrent restarts 2026-03-17 13:48:31 -04:00
Marcus Schiesser 3d2d0a4dbc fix: always release exporter video frames 2026-03-17 20:35:21 +08:00
Marcus Schiesser 1591f7dfcb fix: restore passing checks for webcam overlay changes 2026-03-17 20:29:13 +08:00
Marcus Schiesser c3e4c86b33 fix: reset webcam state on access denial 2026-03-17 20:07:10 +08:00
Marcus Schiesser 942a7e599a fix: allow webcam toggle while recording 2026-03-17 20:05:37 +08:00
Marcus Schiesser 776ed954f2 fix: always tear down webcam export queues 2026-03-17 20:03:14 +08:00
Marcus Schiesser f1a453b9b2 fix: finalize externally stopped recordings 2026-03-17 19:57:45 +08:00
Marcus Schiesser e4263d4597 fix: sync webcam preview playback speed 2026-03-17 19:37:12 +08:00
Marcus Schiesser 2fb5b3b574 Add webcam recording overlay support 2026-03-17 19:09:34 +08:00
Prayas Lashkari 9a5d94a1c8 refactor: update VideoEditor layout and add config.json for setup and teardown 2026-03-17 02:12:44 -04:00
Prayas Lashkari 119c3acb18 feat: implement async restart recording functionality to ensure proper session handling 2026-03-17 01:57:55 -04:00
Sid 881acdb26f Merge pull request #225 from elevchyt/notification-area-hud-open
notification area hud open fix with small window open refactor
2026-03-16 21:03:37 -07:00
Sid 4a308fde12 Merge pull request #223 from marcgabe15/marcdiaz/e2e
E2E Testing with Playwright
2026-03-16 20:59:04 -07:00
Sid fc8a4db8f1 Merge pull request #222 from EtienneLescot/fix/export-local-file-loading
fix: read local export sources through electron IPC
2026-03-16 20:56:15 -07:00
Marc Diaz ac4f82484b revert change 2026-03-16 13:46:57 -04:00
Marc Diaz e9f0fda397 fix: possible race condition on test 2026-03-16 13:27:44 -04:00
elevchyt 4655e71ca5 notification area hud open fix with small window open refactor 2026-03-16 19:25:12 +02:00
Marc Diaz e82332647a fix: remove ffmpeg 2026-03-16 11:31:05 -04:00
Marc Diaz 9fb91dd17b Merge pull request #1 from marcgabe15/marcdiaz/test
feat(test): add an e2e test
2026-03-16 11:28:13 -04:00
Sid c8cf052fc9 Merge pull request #221 from EtienneLescot/feat/motion-blur-slider
feat: replace motion blur toggle with intensity slider
2026-03-16 08:27:38 -07:00
Marc Diaz 61d89831bb fix: add xvfb run 2026-03-16 11:24:30 -04:00
Marc Diaz 9f6ef0f582 feat(test): add an e2e test 2026-03-16 11:17:26 -04:00
Etienne Lescot ea68300634 fix: read local export sources via electron ipc 2026-03-16 13:01:32 +01:00
Etienne Lescot 446e3a35fc fix: avoid history checkpoint spam on motion blur drag 2026-03-16 12:51:54 +01:00
Etienne Lescot c35a33203b fix: increase motion blur intensity range 2026-03-16 12:40:08 +01:00
Etienne Lescot dd84edaf41 feat: replace motion blur toggle with intensity slider
Motion blur was a boolean switch (on/off). This changes it to a slider
from 0 (off) to 1 (full intensity), with 0.35 as the recommended sweet
spot per feedback on PR #207.

- EditorState/ProjectEditorState: motionBlurEnabled:bool → motionBlurAmount:number
- SettingsPanel: Switch → Slider (0–1, step 0.01); shows 'off' or value
- VideoPlayback/zoomTransform: scale blur by amount instead of boolean gate
- FrameRenderer/VideoExporter/GifExporter: propagate numeric amount
- projectPersistence: backward-compat loader (old true → 0.35, false → 0)
2026-03-16 12:22:16 +01:00
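The backward-compat mapping described in this commit (old `true` → 0.35, old `false` → 0) can be sketched roughly as below. This is only an illustration: the function name, the `LegacyProjectState` shape, the clamping to 0–1, and the fallback of 0 when neither field is present are assumptions, not the actual `projectPersistence` code.

```typescript
// Hypothetical sketch; only the true → 0.35 / false → 0 mapping comes from the commit message.
const RECOMMENDED_MOTION_BLUR_AMOUNT = 0.35;

interface LegacyProjectState {
  motionBlurEnabled?: boolean; // old boolean field
  motionBlurAmount?: number; // new numeric field, 0 (off) to 1 (full intensity)
}

function loadMotionBlurAmount(state: LegacyProjectState): number {
  // Prefer the new numeric field when it is present and valid.
  if (typeof state.motionBlurAmount === "number" && Number.isFinite(state.motionBlurAmount)) {
    return Math.min(1, Math.max(0, state.motionBlurAmount));
  }
  // Otherwise translate the old boolean switch.
  if (typeof state.motionBlurEnabled === "boolean") {
    return state.motionBlurEnabled ? RECOMMENDED_MOTION_BLUR_AMOUNT : 0;
  }
  // Assumed fallback: treat a missing setting as "off".
  return 0;
}
```

A project saved before this change with `motionBlurEnabled: true` would then open with the slider at the recommended value instead of full intensity.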
Sid 9d71f509b8 Merge pull request #207 from EtienneLescot/feat/recordly-cursor-pipeline
feat: rework zoom transitions and motion blur
2026-03-15 19:30:45 -07:00
Siddharth 9687157aba Merge main into PR #186 and resolve SourceSelector conflict 2026-03-15 18:27:29 -07:00
Siddharth e2075f15e9 Merge main into PR #185 and resolve native aspect conflicts 2026-03-15 17:13:53 -07:00
Siddharth d182854270 Merge PR #184: resolve crop control conflicts 2026-03-15 16:52:46 -07:00
Etienne Lescot 7a8d0f449a feat: narrow PR to zoom transitions and motion blur 2026-03-15 10:29:23 +01:00
Prayas Lashkari 0727b61de7 feat: add restart recording functionality in LaunchWindow and useScreenRecorder 2026-03-15 02:07:39 -04:00
Siddharth 56988e86e2 custom install loc 2026-03-14 15:57:22 -07:00
Sid 965d3e5f4c Merge pull request #211 from prayaslashkari/bug/crop-window
fix: Fix crop window behavior
2026-03-14 12:48:10 -07:00
Siddharth 5f6576768c normalize paths on all OS 2026-03-14 12:43:12 -07:00
Prayas Lashkari 6c086be1b6 fix: rename crop dropdown state to crop modal for clarity 2026-03-14 15:32:59 -04:00
Prayas Lashkari b52d27bf56 fix: add peer dependencies to package-lock.json 2026-03-14 15:26:27 -04:00
Prayas Lashkari e3c922d032 feat: add crop functionality with snapshot handling in SettingsPanel 2026-03-14 15:25:51 -04:00
Siddharth 16dea49fa8 fix audio desync and speed issue 2026-03-14 11:58:43 -07:00
Sid 575a339550 Merge pull request #206 from EtienneLescot/fix/windows-export-stall
Fix Windows export finalization stalls
2026-03-14 11:15:26 -07:00
Sid e5fa783b59 Merge pull request #205 from EtienneLescot/fix/issue-197-windows-paths
Fix Windows cursor telemetry path resolution
2026-03-14 09:46:51 -07:00
Etienne Lescot b5cc7777d7 Fix export finalization stalls on Windows 2026-03-14 11:57:59 +01:00
Etienne Lescot e72fb8252c Fix Windows cursor telemetry path resolution 2026-03-14 11:22:45 +01:00
Siddharth 5e8bb99e96 fix playback callback to not be in pixi setup dependency 2026-03-13 23:00:11 -07:00
Siddharth 1b08618831 project save/ close fix 2026-03-13 19:37:00 -07:00
Sid 144e34318e Merge pull request #204 from marcgabe15/feature/increase-worker-count
feat(gif-worker): increase amount of web workers based on hardwarecon…
2026-03-13 19:18:58 -07:00
Sid 6035719252 Merge pull request #202 from prayaslashkari/refactor/launch-window-ux
feat: UX Improvements in Launch Window
2026-03-13 18:50:03 -07:00
Sid 2af33894e9 Merge pull request #174 from FabLrc/feature/undo-redo
feat: implement undo/redo functionality in video editor
2026-03-13 18:49:27 -07:00
Marc Diaz 63fd87612e feat(gif-worker): increase number of web workers based on hardwareConcurrency 2026-03-13 17:32:20 -04:00
FabLrc 4b79909116 fix: stabilize lint/typecheck and shortcut typing 2026-03-13 11:24:54 +01:00
FabLrc 0a6895e89f Merge origin/main into feature/undo-redo 2026-03-13 10:55:40 +01:00
Prayas Lashkari 36a0a304d5 refactor: clean up imports and streamline JSX formatting in LaunchWindow component 2026-03-13 00:17:08 -04:00
Siddharth 4f68df1db8 fix exporter 2026-03-12 21:16:20 -07:00
Prayas Lashkari 7422e16b1e refactor: update package-lock.json to version 1.2.0 and add @radix-ui/react-tooltip dependency 2026-03-12 22:47:28 -04:00
Prayas Lashkari 151a3b2902 refactor: integrate Tooltip component and enhance LaunchWindow with tooltips 2026-03-12 22:14:44 -04:00
Prayas Lashkari 066832a3bd refactor: enhance LaunchWindow styles and structure for improved UX 2026-03-12 18:43:36 -04:00
Prayas Lashkari 118158b8ee refactor: add new animations and boxShadow styles for mic panel and recording effects 2026-03-12 18:43:24 -04:00
Prayas Lashkari 948e2b1e4a refactor: added timeUtils 2026-03-12 18:43:10 -04:00
Prayas Lashkari c48243360b refactor: improve icon handling and formatting in LaunchWindow component 2026-03-12 17:15:18 -04:00
Siddharth 7833dee014 fix microphone permission in build 2026-03-08 14:07:42 -07:00
Siddharth 991727d1c5 replace img 2026-03-07 22:13:33 -08:00
Siddharth 8e1e0e33e3 version update 2026-03-07 20:39:14 -08:00
Siddharth e02ef0d2c0 unsaved changes warning and loading project in hud 2026-03-07 19:44:00 -08:00
Siddharth fc7c1d28e5 update readme for new release 2026-03-07 19:32:53 -08:00
Siddharth 2553442a7d feat: add .editorconfig 2026-03-07 18:37:45 -08:00
Siddharth b3247a6a97 update ci check 2026-03-07 18:32:46 -08:00
Siddharth 124f2da992 fix unused 2026-03-07 18:17:29 -08:00
Siddharth 9343453365 add ci workflow 2026-03-07 18:12:44 -08:00
Siddharth 1802725581 pre commit hook and biome lint check 2026-03-07 18:03:32 -08:00
Siddharth 885d66c4a4 biome linting refactor 2026-03-07 17:59:41 -08:00
Siddharth 555b199e03 revamped HUD 2026-03-07 17:06:22 -08:00
Siddharth 371f79a35f system audio 2026-03-07 16:44:10 -08:00
Siddharth 64bc261c20 audio recording and settings 2026-03-07 15:56:11 -08:00
Siddharth 21e9f38be6 untrack 2026-03-07 13:51:08 -08:00
Sid a4fa260727 Merge pull request #182 from FabLrc/feature/fixing-timeline-on-long-video
Fixing timeline on long video
2026-03-07 13:35:20 -08:00
Siddharth 546bc7352c fix errors 2026-03-07 13:14:13 -08:00
Sid 9540a8c0a9 Merge pull request #163 from varaprasadreddy9676/feature/reveal-export-folder
feat: add reveal in folder option after export
2026-03-04 22:48:17 -08:00
Hemkesh dcf35a6ede Default to Windows tab when no screens available and show source counts
On Linux (e.g. Ubuntu), screen sources are often empty. This defaults
the source selector to the Windows tab when there are no screens, and
shows the count of each source type in the tab labels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 21:51:09 -06:00
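The tab-selection behavior described in this commit might look something like the sketch below. The type names, function names, and the `Label (count)` format are hypothetical; the commit message only says that the Windows tab becomes the default when there are no screens and that counts appear in the tab labels.

```typescript
// Hypothetical sketch of the default-tab rule; not the actual SourceSelector code.
type SourceTab = "screens" | "windows";

interface SourceLists {
  screens: unknown[];
  windows: unknown[];
}

// Fall back to the Windows tab when no screen sources were found (common on Linux).
function pickDefaultTab(sources: SourceLists): SourceTab {
  return sources.screens.length === 0 ? "windows" : "screens";
}

// Assumed label format: the tab name followed by the source count in parentheses.
function tabLabel(tab: SourceTab, sources: SourceLists): string {
  const name = tab === "screens" ? "Screens" : "Windows";
  return `${name} (${sources[tab].length})`;
}
```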
Hemkesh c8ebef026b Add "Native" aspect ratio option to export at cropped video dimensions
Adds a "Native" option to the aspect ratio dropdown that uses the cropped
video's actual aspect ratio, so the video fills the entire frame with no
background visible. Selecting Native also sets padding to 0 automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 21:48:47 -06:00
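As a rough illustration of the "Native" option, the export aspect ratio can be derived from the cropped dimensions and the padding forced to 0. The names `CropSize` and `nativeExportSettings` are made up for this sketch, and the real export settings likely carry more fields.

```typescript
// Hypothetical sketch; the real exporter's types and field names are not shown in this log.
interface CropSize {
  width: number; // cropped width in pixels
  height: number; // cropped height in pixels
}

interface NativeExportSettings {
  aspectRatio: number; // width divided by height
  padding: number;
}

function nativeExportSettings(crop: CropSize): NativeExportSettings {
  if (crop.width <= 0 || crop.height <= 0) {
    throw new Error("crop dimensions must be positive");
  }
  // Using the crop's own ratio means the video fills the frame, so no background shows;
  // padding is set to 0 to match the behavior described in the commit message.
  return { aspectRatio: crop.width / crop.height, padding: 0 };
}
```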
Hemkesh 7226632fc4 Add precise crop controls with numeric inputs, aspect ratio presets, and drag-to-move
- Add X, Y, W, H pixel input fields in the crop modal for exact positioning
- Add aspect ratio preset dropdown (16:9, 9:16, 4:3, 3:4, 1:1, 21:9, Free)
- Add lock/unlock button to maintain aspect ratio when resizing
- Display source video resolution for reference
- Add drag-to-move: click inside the crop area to pan it around
- Fix dropdown styling for dark mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 21:37:17 -06:00
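The lock button's effect on resizing can be sketched as follows: when the ratio is locked, the height is derived from the width, and the width is capped so the crop stays inside the source video. This is a simplified, assumed model (resizing from the top-left anchor only); `Rect` and `resizeLocked` are illustrative names, not the crop modal's real API.

```typescript
// Hypothetical sketch of aspect-ratio-locked resizing; pixel units throughout.
interface Rect {
  x: number;
  y: number;
  w: number;
  h: number;
}

function resizeLocked(rect: Rect, requestedW: number, ratio: number, srcW: number, srcH: number): Rect {
  // Largest width that keeps both the width and the derived height inside the source.
  const maxW = Math.min(srcW - rect.x, (srcH - rect.y) * ratio);
  const w = Math.max(1, Math.min(requestedW, maxW));
  // With the ratio locked, the height always follows the width.
  return { x: rect.x, y: rect.y, w, h: w / ratio };
}
```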
SaiVaraprasad Medapati 60d3dfaef5 Merge branch 'main' into feature/reveal-export-folder 2026-03-04 19:33:39 +05:30
FabLrc 2ccead5fb9 Merge remote-tracking branch 'origin/main' into feature/undo-redo
# Conflicts:
#	src/components/video-editor/VideoEditor.tsx
2026-03-04 12:41:49 +01:00
FabLrc cbfc242308 fix: adjust minimum item width and duration for better interaction on timeline 2026-03-03 12:35:20 +01:00
FabLrc f0779c96a3 fix: ensure minimum dimensions for timeline items and adjust duration constraints 2026-03-03 12:35:20 +01:00
Sid 9eb362012b Merge pull request #153 from yusufm/projectsave
Add project save/load files with File menu integration
2026-03-02 18:36:10 -08:00
FabLrc 6d44dafd96 fix: make speed changes undoable and add undo/redo to the shortcuts configuration list 2026-03-02 16:26:42 +01:00
FabLrc e6e3abb88c Merge branch 'main' into feature/undo-redo
# Conflicts:
#	src/components/video-editor/KeyboardShortcutsHelp.tsx
#	src/components/video-editor/VideoEditor.tsx
#	src/components/video-editor/timeline/TimelineEditor.tsx
2026-03-02 15:45:03 +01:00
Yusuf Mohsinally 843c130834 Merge main and address PR #153 review feedback 2026-03-01 21:13:19 -08:00
Sid f384338765 Merge pull request #179 from FabLrc/feature/speed-shortcut-configurable
fix: Add configurable shortcut for speed adjustment in TimelineEditor
2026-03-01 16:10:58 -08:00
FabLrc 0e082fff9c fix: Add configurable shortcut for speed adjustment in TimelineEditor 2026-03-01 23:08:05 +01:00
Sid 6ca24c3411 Merge pull request #176 from Brodypen/feature/speed-option
feat: Add speed option
2026-03-01 09:45:36 -08:00
Sid 31bc733415 Merge branch 'main' into feature/speed-option 2026-03-01 09:45:19 -08:00
Sid 451bb203b7 Merge pull request #172 from FabLrc/feature/shortcuts-configuration
Configurable keyboard shortcuts system
2026-03-01 09:36:25 -08:00
FabLrc 0e85679b14 feat: implement undo/redo functionality in video editor 2026-03-01 12:47:52 +01:00
Fabien Laurence 57fdad0646 Merge branch 'main' into feature/shortcuts-configuration 2026-03-01 12:31:56 +01:00
Sid 71bb09c82e Merge pull request #177 from Brodypen/worktree-refactor/magic-number
refactor: replace magic numbers with named constants in useScreenRecorder
2026-02-28 13:03:08 -08:00
Siddharth 4ab8f3d1f1 export zoom focus clamping 2026-02-28 12:36:50 -08:00
Sid 4bac15cb44 Merge pull request #154 from yusufm/feat/cursor-telemetry-zoom-suggestions
feat: cursor telemetry-driven zoom suggestions
2026-02-28 12:13:44 -08:00
FabLrc d76f38fb35 feat: enhance shortcuts configuration with conflict detection and fixed shortcuts 2026-02-28 11:11:12 +01:00
Yusuf Mohsinally 236ca4da29 address PR #153 review feedback 2026-02-28 00:28:01 -08:00
Brodypen cf8d211eb2 feat: add the speed to exporter lol 2026-02-28 02:16:03 -06:00
Yusuf Mohsinally 4ecd18086c refactor: move zoom suggestion logic into timeline util 2026-02-28 00:06:29 -08:00
Yusuf Mohsinally a2b9eea90a feat: add cursor telemetry-driven zoom suggestions 2026-02-28 00:06:29 -08:00
Siddharth 4b3afcf535 annotation bounding and canvas wrapping 2026-02-27 23:44:02 -08:00
Brodypen 185969a9d1 build: package-lock stuff 2026-02-28 01:27:01 -06:00
Sid 5f20820735 Merge pull request #173 from FabLrc/feature/enhancing-export
fix: improve encoder queue management and adjust latency mode for better throughput
2026-02-27 23:26:05 -08:00
Brodypen 397a943426 feat: speed thing 2026-02-28 01:20:04 -06:00
Brodypen 83d3e7b6b8 refactor: replace magic numbers with named constants in useScreenRecorder 2026-02-28 01:08:19 -06:00
Siddharth 5573c9f427 rm testing files 2026-02-27 21:04:31 -08:00
FabLrc 92d2a41296 fix: improve encoder queue management and adjust latency mode for better throughput 2026-02-27 00:24:27 +01:00
FabLrc 9bc2c78b4d feat: implement keyboard shortcuts management and configuration 2026-02-26 15:41:32 +01:00
Sid 87735c2716 Merge pull request #118 from IdrisGit/feat-add-biome-formatter-linter
feat: remove eslint and add biome for formatter and linter (RFC)
2026-02-23 14:42:51 -08:00
Idris Gadi 91c9de2561 feat: update package 2026-02-22 09:49:04 +05:30
Idris Gadi 9df9264d25 Merge branch 'main' into feat-add-biome-formatter-linter 2026-02-22 09:34:51 +05:30
saivaraprasadreddy medapati c6d33aa82a fix: await openPath fallback and other review fixes
- Fix IPC handler to properly await shell.openPath() promise
- Export dialog now shows file name below the button for better UX
- Toast message now generic (works for both video and GIF exports)
- Fixed formatting in electron type definitions
2026-02-21 02:06:20 +05:30
saivaraprasadreddy medapati 85f2388041 feat: add reveal in folder option after export
- Added electron IPC handler 'reveal-in-folder' to show exported file in finder
- Created toast notification with clickable action to reveal exported video
- Added Show in Folder button in export success dialog
- Implemented proper state management for exported file path
- Fixed timing issue where exportedFilePath was reset too early
2026-02-21 01:53:27 +05:30
Sid 44cf97c7a1 Merge pull request #146 from KoopaCode/main
Refined Launch Styling
2026-02-19 19:06:13 -08:00
Yusuf Mohsinally bd50b193a1 Add Save Project As menu action and force prompt behavior 2026-02-18 11:08:01 -08:00
Yusuf Mohsinally 491db0ab2e Add project file save/load workflow, menu actions, and persistence tests 2026-02-18 11:01:14 -08:00
Sid 518fe4ca15 Merge pull request #147 from NureddinSoltan/fix/linux-sandbox-docs
docs: add troubleshooting for Linux sandbox error
2026-02-17 07:44:52 -08:00
NureddinSoltan 647705fb58 docs: add troubleshooting for Linux sandbox error 2026-02-17 01:37:55 +03:00
Andrew P. Harper 59d786bfda Refined launch style
Added .hudBar style & tweaked background gradient, reduce blur/saturation, Added scrollbar style.
2026-02-16 00:11:04 -05:00
Siddharth 19476da5cc upgrade electron-builder version 2026-02-13 22:07:27 -08:00
Siddharth cdff1a9b5d fix path 2026-02-13 21:06:49 -08:00
Siddharth fac4af40c7 demuxer and CFR conversion 2026-02-13 20:46:12 -08:00
Siddharth d9177b4a44 more timeline ux qol improvements: 2026-02-12 22:53:19 -08:00
Siddharth 4d7e2a2d85 UX improved for timeline 2026-02-12 22:33:41 -08:00
Siddharth 8e94dcbc2c keyframe snap and move 2026-02-06 22:14:39 -08:00
Siddharth 05f4e74de6 google fonts 2026-02-06 21:58:07 -08:00
Siddharth a89198ccdc anti aliasing on 2026-02-06 21:28:42 -08:00
Sid 170dd2efd2 Merge pull request #120 from IdrisGit/feat-add-support-for-16-10-aspect-ratio
feat: add support for 16:10 aspect ratio
2026-02-06 20:24:21 -08:00
Sid d4704d554d Merge pull request #134 from JustinBenito/patch-1
Update README with Full Disk Access instructions
2026-02-06 20:23:00 -08:00
Justin Benito B 485d2fd098 Update README with Full Disk Access instructions
Added note about granting Full Disk Access for terminal.
2026-02-04 17:48:45 +05:30
Sid d5dbcb3a7a Merge pull request #119 from IdrisGit/docs-move-issue-templates-from-markdown-to-yaml-forms
docs: move issue templates to YAML forms
2026-01-31 20:55:21 -08:00
Sid e1760020ac Merge pull request #121 from IdrisGit/docs-readme-cleanup
docs: add the beta warning at top and move assets to public
2026-01-31 20:53:32 -08:00
Siddharth c8a4becaf0 version fix 2026-01-31 20:15:52 -08:00
Siddharth e6db74f183 stale closure bug 2026-01-31 20:05:19 -08:00
Idris Gadi 271a3ee443 chore: remove unused svgs 2026-01-27 20:53:36 +05:30
Idris Gadi cef987cd33 docs: add the beta warning at top and move assets to public
moving the assets to public prevents the root from getting populated with
a lot of files
2026-01-27 20:53:36 +05:30
Idris Gadi 95b4df0ae4 fix: types 2026-01-27 16:23:45 +05:30
Idris Gadi 0d27f4fc36 feat: add support for 16:10 aspect ratio 2026-01-27 15:59:01 +05:30
Idris Gadi 23aced6007 docs: move issue templates to YAML forms
GitHub has official support for forms
(https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/syntax-for-issue-forms)

Forms are much easier and more intuitive for people to fill out than
Markdown, which makes creating new issues easier and more structured.

should also help with preventing random issues.

I am intentionally not adding a blank template for now, if required it
can be added later.
2026-01-27 14:43:29 +05:30
Idris Gadi db9cf960f8 feat: remove eslint and add biome for formatter and linter 2026-01-27 12:53:33 +05:30
Sid afbd0740a3 Merge pull request #117 from IdrisGit/fix-turn-off-motion-blur-by-default
fix: set motion blur to disabled by default
2026-01-26 22:17:55 -08:00
Idris Gadi f30e2d654e fix: remove extra semi colon 2026-01-27 11:46:46 +05:30
Idris Gadi f2c6d8ff0f fix: set motion blur to disabled by default 2026-01-27 11:10:12 +05:30
Siddharth 9821e926d9 fix version 2026-01-24 22:45:07 -08:00
Siddharth a2ca0799d4 accordion & settings cleanup 2026-01-20 21:10:22 -08:00
Siddharth 6d2e1edb5b fix build errors 2026-01-20 20:05:14 -08:00
Sid 08f58b3539 Merge pull request #101 from Al-Farhan/bug/source-selector-tabs-content-height
fix(ui): increase height for source selector tabs content
2026-01-16 19:20:45 -08:00
Farhan Shaikh 1586dbe65e fix(ui): increase height for source selector tabs content 2026-01-13 16:42:19 +05:30
Sid 3a63617c55 Merge pull request #84 from solnikhil/export
feat(export): add GIF exporting, Frame Rate & Output Size selection, and loop animation
2026-01-11 10:27:18 -08:00
Nikhil Solanki e8d2c19b7d Merge branch 'main' into export 2026-01-11 21:52:29 +05:30
Nikhil Solanki 23ede0fcfa Update GIF export options to remove 10 FPS and small size
Removed 10 FPS from valid GIF frame rates and the 'small' size preset from GIF export options. Updated UI grid layouts and tests to reflect these changes for consistency.
2026-01-11 21:50:42 +05:30
Sid 78cfb64f98 Merge pull request #97 from twinkalp10/bug/fix-bg-image-selection-height
fix(ui): set minimum height for image selection tab in settings panel
2026-01-10 12:05:56 -08:00
Twinkal P 3124342309 fix(ui): set minimum height for image selection tab in settings panel 2026-01-10 17:33:11 +00:00
Sid a9bd2ac820 Merge pull request #90 from gerrywastaken/claude/openscreen-linux-boot-performance-8Xq9v
fix(linux): Greatly reduce AppImage boot time from ~20s to ~2s
2026-01-01 12:05:13 -08:00
Claude e190915c48 fix(linux): reduce AppImage boot time from ~50s to near-instant
Change compression from "maximum" to "normal" for electron-builder.

The "maximum" compression setting causes gzip/xz compression in the
squashfs filesystem, which has extremely poor random access performance
(~35 MB/s). This results in 50+ second boot times on Linux AppImage
releases due to FUSE overhead during Electron's many small file reads
at startup.

With "normal" compression, the AppImage uses faster decompression
algorithms, dramatically improving startup time while only marginally
increasing package size.

Refs: electron-userland/electron-builder#6317
Refs: electron-userland/electron-builder#7483
2026-01-01 01:52:44 +00:00
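For reference, the change described here comes down to a single electron-builder setting. The sketch below writes it as a plain TypeScript object rather than the project's actual config file, whose location and other fields are not shown in this log; `BuilderConfigSketch` and `builderConfig` are illustrative names, and the `linux.target` entry is an assumption added only for context.

```typescript
// Minimal sketch of the relevant electron-builder option; not the project's real config.
// electron-builder's `compression` option accepts "store", "normal", or "maximum".
type Compression = "store" | "normal" | "maximum";

interface BuilderConfigSketch {
  compression: Compression;
  linux: { target: string[] };
}

const builderConfig: BuilderConfigSketch = {
  // "normal" trades a slightly larger package for much faster reads at startup.
  compression: "normal",
  linux: { target: ["AppImage"] },
};
```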
Sid 171a02aef4 Merge pull request #72 from ateendra24/fix-issue-34
feat: add tutorial help component for video trimming guidance
2025-12-28 19:03:35 -08:00
Nikhil Solanki 085ebad38f Add custom hidden scrollbar styles and clean up code
Introduced CSS classes to hide scrollbars while maintaining scrollability across browsers. Also removed unnecessary blank lines in frameRenderer.ts for code cleanliness.
2025-12-25 16:12:34 +05:30
Nikhil Solanki f3e12629c2 Adjust layout and sizing of SettingsPanel tabs
Updated the Tabs and TabsList components in SettingsPanel to use fixed min and max heights and improved flex properties for better layout consistency and scrolling behavior.
2025-12-25 15:09:26 +05:30
Nikhil Solanki f00d381f94 fixed sloppy gitignore 2025-12-25 14:48:24 +05:30
Nikhil Solanki fb92b0b6d9 / 2025-12-25 02:05:38 +05:30
Nikhil Solanki 8ca2b8362a Update .gitignore 2025-12-25 02:02:08 +05:30
Nikhil Solanki 134f392553 Update .gitignore 2025-12-25 02:01:20 +05:30
Nikhil Solanki c7e81c6b7f Update .gitignore 2025-12-25 02:00:27 +05:30
Nikhil Solanki f58b8b2897 Mega gitignore 2025-12-25 02:00:06 +05:30
Nikhil Solanki 6e6ecba172 Add GIF export feature to video editor
Implements GIF export alongside MP4, including new export types, a GIF exporter module, UI components for format selection and GIF options, and integration into the export dialog and video editor. Adds property-based and unit tests for GIF export correctness, updates dependencies to include gif.js and related types, and refines Electron save dialog to support GIF files.
2025-12-25 01:50:02 +05:30
AP Solanki 175bb36eda feat: add tutorial help component for video trimming guidance 2025-12-18 10:38:07 +05:30
Sid 2ca99136ba Update README.md 2025-12-17 00:41:24 -07:00
Siddharth b7485865f3 fix:Border radius appears smaller in export compared to preview 2025-12-16 13:52:53 -07:00
Sid 7db2fa4e01 Merge pull request #65 from LauZzL/feature/system-tray-icon
feat(electron): implement dynamic tray icon and menu updates
2025-12-16 12:28:25 -08:00
LauZzL 81b59cad7c feat(electron): implement dynamic tray icon and menu updates
- Show "Stop Recording" menu & recording icon when recording
- Show "Open/Quit" menu & default icon when not recording
2025-12-16 21:28:18 +08:00
Sid 7e0ce53df0 Update package.json 2025-12-14 10:47:47 -07:00
Sid d57140b031 Merge pull request #59 from kamikazebr/feat/linux-support
feat: add Linux support
2025-12-14 09:33:11 -08:00
Felipe Novaes F Rocha 5dd85abaee chore: remove pnpm lock files from tracking 2025-12-14 14:29:23 -03:00
Felipe Novaes F Rocha 78fbd30b15 ci: add Linux build to workflow 2025-12-13 20:25:15 -03:00
Felipe Novaes F Rocha ebb1d29375 feat: add Linux support 2025-12-13 20:07:41 -03:00
Siddharth 250fc5d221 disable user-select 2025-12-10 21:30:25 -07:00
Sid 58b5ea0f9b Merge pull request #50 from suenyiyang/feat/optimize-clamped-content
feat: add content-clamp component to show full text when truncated
2025-12-09 13:00:38 -08:00
Yiyang Suen 4bc8a1e970 feat: remove content clamp popover content box shadow 2025-12-09 09:31:35 +08:00
Yiyang Suen 16752a7ae8 feat: add content-clamp component to show clamped text 2025-12-09 09:22:30 +08:00
Sid 5f4d20b26d Merge pull request #46 from LauZzL/fix/windows-close-button-not-work
fix(electron): remove platform check for hud overlay close event
2025-12-07 10:31:44 -08:00
LauZzL 8cbdcf2d7a fix(electron): remove platform check for hud overlay close event
This check causes the close button to stop working on Windows.
2025-12-07 17:51:19 +08:00
Siddharth f1f507e6e9 replace 2025-12-06 11:44:57 -07:00
Sid a7fb7670a7 - 2025-12-06 11:29:42 -07:00
Sid eccccf583b - 2025-12-06 11:23:34 -07:00
Siddharth d6d1a3eca6 build fix 2025-12-05 23:31:36 -07:00
Siddharth c5aa622898 hex based inputs for brand consistency 2025-12-05 23:22:30 -07:00
Siddharth 1345c8109c rename export res 2025-12-05 23:04:58 -07:00
Siddharth d91ed78fc2 delete trim ux improvement 2025-12-05 22:32:26 -07:00
Siddharth 5d7b817586 fix default wallpaper missing from export in build 2025-12-05 22:22:17 -07:00
Sid fecb9a9b22 Merge pull request #28 from ilGianfri/main
Add platform-aware keyboard shortcut formatting
2025-12-04 16:46:28 -08:00
Sid cbdef41667 Merge branch 'main' into main 2025-12-04 16:46:16 -08:00
Alessandro Spisso f34bd19183 feat: implement platform-aware keyboard shortcuts and add IPC handler for platform detection 2025-12-04 23:53:25 +01:00
Siddharth 7a7db0b277 revert exporter 2025-12-04 10:22:20 -07:00
Siddharth 3a4ec9c470 update assets 2025-12-02 23:25:40 -07:00
Siddharth d2a62b137d cleanup settings and readme 2025-12-02 21:49:38 -07:00
Siddharth c9e9d1d1bd missing shortcut 2025-12-02 18:50:53 -07:00
Siddharth cce88b3dab build errors and version update 2025-12-02 18:31:31 -07:00
Siddharth 4018741648 settings update 2025-12-02 18:11:00 -07:00
Siddharth ed3cdab64e export quality options 2025-12-02 17:41:30 -07:00
Siddharth 4ffa9c6ecb reduce seek bottleneck 2025-12-02 16:32:35 -07:00
Siddharth 899e55d257 update usescreenrecorder 2025-12-01 22:16:38 -07:00
Siddharth 977be1e3b1 draggable playhead and pause/play shortcut 2025-12-01 15:36:03 -07:00
Siddharth 262745a97f final annotation preview and export 2025-12-01 11:20:05 -07:00
Siddharth 6ac712eaac final annotation settings 2025-11-30 21:14:55 -07:00
Siddharth 79e40cef68 improved annotation experience 2025-11-30 19:19:08 -07:00
Siddharth c847953a52 allow multiple annotation conflicts, and cycle using Tab 2025-11-30 18:39:56 -07:00
Alessandro Spisso 391938049b Add platform-aware keyboard shortcut formatting
Introduces a new utility (platformUtils.ts) to format keyboard shortcuts based on the user's platform (macOS or others). Updates KeyboardShortcutsHelp and TimelineEditor to use the new formatShortcut function for displaying shortcuts, ensuring correct symbols are shown for modifier keys.
2025-12-01 00:19:34 +01:00
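A platform-aware formatter along the lines described in this commit could look like the sketch below. The commit names `formatShortcut` in `platformUtils.ts`, but its real signature is not shown here, so the parameters, the key names (`mod`, `shift`, `alt`, `ctrl`), and the joining rules are all assumptions.

```typescript
// Hypothetical sketch; the real formatShortcut signature and symbol table may differ.
const MAC_SYMBOLS = new Map<string, string>([
  ["mod", "⌘"],
  ["shift", "⇧"],
  ["alt", "⌥"],
  ["ctrl", "⌃"],
]);

const OTHER_LABELS = new Map<string, string>([
  ["mod", "Ctrl"],
  ["shift", "Shift"],
  ["alt", "Alt"],
  ["ctrl", "Ctrl"],
]);

function formatShortcut(keys: string[], isMac: boolean): string {
  const table = isMac ? MAC_SYMBOLS : OTHER_LABELS;
  // Modifier keys are looked up in the table; any other key is shown in upper case.
  const parts = keys.map((key) => table.get(key.toLowerCase()) ?? key.toUpperCase());
  // Assumed convention: macOS symbols are written together, other platforms use "+".
  return parts.join(isMac ? "" : "+");
}
```

With this sketch, the same shortcut definition renders with macOS modifier symbols on a Mac and with spelled-out modifier names elsewhere.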
Siddharth 71ba4e4cea rm dead code 2025-11-30 15:35:03 -07:00
Siddharth ec3b9b46a1 annotations in preview 2025-11-30 14:47:22 -07:00
Sid 2ad5899417 Merge pull request #23 from suenyiyang/feat/disable-source-selection-when-recording
feat: disable source selection and project selection when recording
2025-11-29 19:34:41 -08:00
Yiyang Suen 1f08d3ca26 feat: disable source selection and project selection when recording 2025-11-30 10:19:12 +08:00
Sid bae17c0d1b Merge pull request #20 from siddharthvaddem/aspect-ratio
Aspect ratio
2025-11-29 10:42:48 -08:00
Siddharth 0c89e3e01a export aspect ratio 2025-11-29 11:38:09 -07:00
Siddharth d2ee511466 preview aspect ratio 2025-11-28 23:54:58 -07:00
Siddharth 4c725dfceb settings cleanup 2025-11-28 21:55:42 -07:00
Siddharth 71e2b51f5b padding video control 2025-11-28 21:46:05 -07:00
Siddharth c9321240d8 enable custom border radius 2025-11-28 19:15:56 -07:00
Siddharth 159f770da8 missing item depth level 2025-11-28 18:48:11 -07:00
Siddharth 443e4b0581 motion blur export stale closure 2025-11-28 18:04:38 -07:00
Siddharth 65bc21f153 fix the bug i just introduced lol 2025-11-28 17:41:30 -07:00
Siddharth 59807662d8 aspect ratio bug 2025-11-28 17:14:55 -07:00
Siddharth 8f31bde518 motion-blur switch 2025-11-28 16:08:11 -07:00
Sid 6fb9a24834 Merge pull request #14 from siddharthvaddem/feature/trim
Feature/trim
2025-11-27 22:31:33 -08:00
Siddharth 6a8b99c7bd spacing 2025-11-27 22:33:34 -07:00
Siddharth 1241de6e1a trim integration export 2025-11-27 22:24:17 -07:00
Siddharth 3998af5398 skip trimmed area seeking 2025-11-27 21:44:07 -07:00
Siddharth 2b5b15f3e8 basic trim setup 2025-11-27 16:35:21 -07:00
Siddharth e549850b75 hud overlay UX improvements 2025-11-27 15:12:38 -07:00
Siddharth b6d6fe6a70 draggable UI resizer 2025-11-27 14:17:10 -07:00
Siddharth e6cb86fafc prevent branch target runs 2025-11-26 11:46:52 -07:00
Sid 1dbbf8721c Merge pull request #11 from FlyingThaCat/main
add untracked uuid packages
2025-11-26 10:43:30 -08:00
john c63f5eddeb make the hud in right bottom corner 2025-11-26 23:12:04 +07:00
john 6475814541 add untracked uuid packages 2025-11-26 22:46:47 +07:00
Siddharth d85e6e1254 wallpaper count 2025-11-25 22:03:37 -07:00
Siddharth 6baeebec96 more zoom options, info popup 2025-11-25 21:43:30 -07:00
Siddharth ddf30ed60e record/ select your own video 2025-11-25 21:18:57 -07:00
Siddharth 98d6acaa6a keyframes 2025-11-25 18:45:37 -07:00
Siddharth 060a7bab92 timeline updates 2025-11-25 17:24:35 -07:00
Siddharth 48253cc31d file dialog choose location 2025-11-25 15:37:03 -07:00
Siddharth f887d09865 shadow intensity 2025-11-25 15:00:06 -07:00
Siddharth 6634d0eb53 editor ui improvements 2025-11-25 14:20:13 -07:00
Siddharth 6617fd39f6 configsupportcheck and throttling 2025-11-24 21:39:03 -07:00
Siddharth 8ec426d200 level 5.1 w software encoding 2025-11-24 18:49:08 -07:00
Siddharth 188ba94aad test win codec fix 2025-11-24 17:11:37 -07:00
Siddharth 7129d55f86 add support in app 2025-11-24 02:39:12 -07:00
Siddharth 483ace8e46 update fndg 2025-11-24 02:00:34 -07:00
Siddharth bf63154801 arm64 support-fix 2025-11-24 01:06:51 -07:00
Siddharth 437eded23a arm64 support 2025-11-24 01:03:23 -07:00
Siddharth 472d4053f4 arm64 support 2025-11-24 00:56:36 -07:00
Siddharth 864902b660 testing win editor issue 2025-11-24 00:44:46 -07:00
Siddharth dac826a5bc icon for win in builder 2025-11-24 00:02:15 -07:00
Siddharth dd17551c18 test actions 2025-11-23 23:56:06 -07:00
Siddharth dae7dc5212 rm uiohook-napi 2025-11-23 23:32:52 -07:00
Siddharth 210977faf4 pr template 2025-11-23 22:06:30 -07:00
Sid 9c6ac891e4 Merge pull request #2 from siddharthvaddem/v0.1.1
V0.1.1
2025-11-23 17:04:49 -07:00
Siddharth 1306c6e4ea logo update 2025-11-23 17:03:56 -07:00
Siddharth b181546ad3 reduce installer size 50% and app bundle size by 30% 2025-11-23 16:49:53 -07:00
Siddharth 0d5c4529d1 migrate to mediabunny 2025-11-23 12:24:56 -07:00
Sid 4a78bb999b Update README with macOS installation instructions 2025-11-23 00:56:35 -07:00
Sid f034c9fe01 Merge pull request #1 from siddharthvaddem/v0.1
V0.1
2025-11-23 00:53:37 -07:00
Siddharth 325c239a3a ... 2025-11-23 00:48:16 -07:00
Siddharth 9fedc5c167 3x faster exports 2025-11-23 00:33:10 -07:00
Siddharth 0e0f5003ff fix screen recording, optimize exporting pipeline 2025-11-22 23:40:39 -07:00
Siddharth 55a373c7ef recording optimizations 2025-11-22 22:28:58 -07:00
Siddharth Vaddem e14ecbff56 Modify funding sources in FUNDING.yml 2025-11-22 18:45:33 -07:00
Siddharth Vaddem 996d0d8c5b Update issue templates 2025-11-20 22:20:12 -07:00
Siddharth ba72eaf83c update readme 2025-11-20 22:09:17 -07:00
Siddharth 7f5b010c00 wallpaper fix 2025-11-20 21:22:21 -07:00
Siddharth dda08172d9 sunset windows support 2025-11-20 18:45:06 -07:00
Siddharth 27377c2194 export on windows 2025-11-20 16:58:55 -07:00
Siddharth 632816057b fix layout issues and export on windows 2025-11-20 16:10:58 -07:00
Siddharth bd8e6f94ef upload custom wallpaper 2025-11-20 14:13:25 -07:00
Siddharth 7a0b756cea timeline ux improvements 2025-11-20 13:47:46 -07:00
Siddharth c6dbf1fa67 ui improvements & more wallpapers 2025-11-20 13:27:39 -07:00
Siddharth 6081747b7d window consistency across mac and win 2025-11-20 12:25:46 -07:00
Siddharth 2e2ce5e151 workflow testing for win attempt 6 2025-11-19 23:24:22 -07:00
Siddharth cef1da37c9 workflow testing for win attempt 5 2025-11-19 23:18:34 -07:00
Siddharth 6e68c54ed7 workflow testing for win attempt 4 2025-11-19 23:08:42 -07:00
Siddharth c9d24c42b3 workflow testing for win attempt 3 2025-11-19 22:51:57 -07:00
Siddharth 87e439fb00 workflow testing for win 2025-11-19 22:42:08 -07:00
Siddharth 897799a926 workflow testing for win 2025-11-19 22:40:45 -07:00
Siddharth a43a492830 update readme 2025-11-19 13:04:29 -07:00
Siddharth eda876f618 contributions rm 2025-11-19 13:03:10 -07:00
Siddharth 500a85542a contributions 2025-11-19 13:01:35 -07:00
Siddharth 49ff17327a readme update 2025-11-19 12:55:24 -07:00
Siddharth Vaddem 3f3f7d5235 Add LICENSE file 2025-11-19 00:46:37 -07:00
Siddharth 78ed8621b2 readme 2025-11-18 01:17:57 -07:00
Siddharth d9a9f48ab9 cleanup+ readme updates 2025-11-18 00:58:09 -07:00
Siddharth fd8417b221 readme 2025-11-18 00:02:00 -07:00
Siddharth 965f779fe6 new icons 2025-11-16 22:30:27 -07:00
Siddharth dbc78cb867 fix wallpaper access in build 2025-11-16 22:12:22 -07:00
Siddharth 99f2af587c errors, icons 2025-11-16 21:26:37 -07:00
Siddharth 382f6d348c fix issue with export diff size 2025-11-16 21:04:26 -07:00
Siddharth 87c4eae9b0 longer wait 2025-11-16 18:14:07 -07:00
Siddharth faa0037bf0 faster exports 2025-11-16 17:21:10 -07:00
Siddharth 34e9efdb73 export working 2025-11-16 16:02:21 -07:00
Siddharth 75388e1218 cleanup 2025-11-16 01:44:41 -07:00
Siddharth 921ecebb1a ui changes 2025-11-16 01:43:24 -07:00
Siddharth c080168fb5 ui updates 2025-11-16 01:27:03 -07:00
Siddharth 6287fa90c8 timeline ui 2025-11-16 00:10:49 -07:00
Siddharth 41572298d6 overlay and source ui improvements 2025-11-15 23:31:36 -07:00
Siddharth 096396fdce isc 2025-11-09 18:08:44 -07:00
Siddharth ee8b64e590 external url direct handler 2025-11-09 17:36:32 -07:00
Siddharth ddd0adcea2 ui-updates for crop and playhead 2025-11-09 16:54:18 -07:00
Siddharth 5ebef96935 better ux for previewing zoom only during playback and not showing default while applying 2025-11-09 16:44:11 -07:00
Siddharth fa780786c0 fix crop and layout pos+scale on change 2025-11-09 16:34:01 -07:00
Siddharth d404ead557 fix blur and filter 2025-11-09 14:47:06 -07:00
Siddharth 98d3c3b6bd fix crop with zoom and camera frame 2025-11-09 13:44:32 -07:00
Siddharth e3b38c00f1 fix playback on load 2025-11-09 12:07:36 -07:00
Siddharth 0e946fb260 crop effect 2025-11-09 00:43:45 -07:00
Siddharth 307ac02ec3 shadow effect 2025-11-08 22:44:36 -07:00
Siddharth 43116a1bc3 zoom levels and selection 2025-11-08 22:35:12 -07:00
Siddharth 4cc1ae7a56 refactoring 2025-11-08 22:19:56 -07:00
Siddharth 61137c3233 delete zoom option 2025-11-08 21:33:03 -07:00
Siddharth 44c2bdb020 fix autoplay issue 2025-11-08 20:20:10 -07:00
Siddharth 0d6845dd00 pan and zoom effects 2025-11-08 20:00:00 -07:00
Siddharth 31364066e7 change to pixi container 2025-11-08 14:22:47 -07:00
Siddharth a597ea619d basic timeline synced to video playback 2025-10-31 22:37:12 -07:00
Siddharth 5440a39146 dnd-kit-timeline 2025-10-18 15:32:48 -07:00
Siddharth 3380bbab46 pixi install 2025-10-18 12:38:30 -07:00
Siddharth 588bafcf38 resizing layouts 2025-10-18 12:21:14 -07:00
Siddharth adf22a1408 gradients, colorpicker tabs 2025-10-18 12:02:20 -07:00
Siddharth 5eaa43c247 empty shadow switch 2025-10-17 22:28:57 -07:00
Siddharth 0d072b5038 rounding video preview 2025-10-17 22:00:52 -07:00
Siddharth e47d56d5b1 thumbnail on editor load 2025-10-17 21:24:37 -07:00
Siddharth 568d9ca21b background selection 2025-10-17 21:14:48 -07:00
Siddharth c3eb97116a stop via tray 2025-10-17 20:05:17 -07:00
Siddharth ec37cd7f11 code cleanup 2025-10-17 17:06:03 -07:00
Siddharth d43becbf81 video editor improvements 2025-10-15 20:10:13 -07:00
Siddharth 9c095e98de fix webm metadata duration 2025-10-15 19:11:48 -07:00
Siddharth 52563e6142 editor layout 2025-10-15 18:13:16 -07:00
Siddharth 8b01b55b36 canvas draw post recording 2025-10-15 17:10:50 -07:00
Siddharth 310bd40593 sourcewindow fix spacing 2025-10-15 12:50:31 -07:00
Siddharth a578e659e6 tmp files & video editor preview 2025-10-14 23:16:03 -07:00
Siddharth 5459eb3bc2 uiohook refactoring 2025-10-13 16:00:30 -07:00
Siddharth 240794b2b1 uiohook mouse integration 2025-10-13 15:44:56 -07:00
Siddharth 7428afaa6d qol impr 2025-10-12 17:42:06 -07:00
Siddharth ac849a3337 source selection 2025-10-12 17:13:31 -07:00
Siddharth de6d1aed98 apply transparent bg dynamically 2025-10-12 14:45:22 -07:00
Siddharth 632baa2552 update startup layout 2025-10-12 13:53:21 -07:00
Siddharth 1d3ca85332 window ui redesign 2025-10-12 12:01:14 -07:00
Siddharth 80221a5624 rm memory leak, more functional improvements 2025-10-10 00:30:04 -07:00
Siddharth 273a01895c basic screen recording function 2025-10-09 22:37:32 -07:00
437 changed files with 76735 additions and 16 deletions
+14
@@ -0,0 +1,14 @@
root = true
[*]
indent_style = tab
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
[*.{json,yml,yaml}]
indent_size = 2
[*.md]
trim_trailing_whitespace = false
+10
@@ -0,0 +1,10 @@
APP_NAME=Openscreen
BUNDLE_ID=com.siddharthvaddem.openscreen
APPLE_ID=
TEAM_ID=
SIGN_IDENTITY="Developer ID Application: Samir Patil ()"
CSC_NAME="Samir Patil ()"
NOTARY_PROFILE=OpenScreen-notary
APPLE_APP_SPECIFIC_PASSWORD=
+1
@@ -0,0 +1 @@
use flake
+1
@@ -0,0 +1 @@
* @siddharthvaddem
+149
@@ -0,0 +1,149 @@
name: Bug Report
description: Create a report to help us improve
title: "[Bug]: "
labels: ["bug", "triage"]
body:
- type: checkboxes
attributes:
label: Search existing issues
description: Please search to see if an issue already exists for the bug you encountered.
options:
- label: I have searched the existing issues
required: true
- type: textarea
id: bug-description
attributes:
label: Describe the bug
description: A clear and concise description of what the bug is.
placeholder: e.g., When I click submit, nothing happens...
validations:
required: true
- type: textarea
id: expected-behavior
attributes:
label: Expected behavior
description: A clear and concise description of what you expected to happen.
placeholder: e.g., The form should submit and show a success message
validations:
required: true
- type: textarea
id: steps-to-reproduce
attributes:
label: To Reproduce
description: Steps to reproduce the behavior.
placeholder: |
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
validations:
required: false
- type: textarea
id: screenshots
attributes:
label: Screenshots
description: If applicable, add screenshots to help explain your problem.
placeholder: Drag and drop images here or paste them
validations:
required: false
- type: dropdown
id: os-type
attributes:
label: OS
description: Operating system
options:
- Windows
- macOS
- Linux
- iOS
- Android
- Other
validations:
required: false
- type: input
id: os-version
attributes:
label: OS Version
description: Please specify your OS version
placeholder: e.g., Windows 11, macOS Sonoma, Ubuntu 22.04
validations:
required: false
- type: input
id: os-other
attributes:
label: Other OS
description: If you selected "Other" for OS, please specify your operating system
placeholder: e.g., FreeBSD, Solaris
validations:
required: false
- type: dropdown
id: browser
attributes:
label: Browser
description: What browser are you using?
options:
- Chrome
- Firefox
- Safari
- Edge
- Other
validations:
required: false
- type: input
id: browser-version
attributes:
label: Browser Version
description: Please specify your browser version
placeholder: e.g., 120.0, 121.0.1
validations:
required: false
- type: input
id: browser-other
attributes:
label: Other Browser
description: If you selected "Other" for Browser, please specify your browser
placeholder: e.g., Brave, Vivaldi, Opera
validations:
required: false
- type: dropdown
id: device-type
attributes:
label: Device Type
description: Device category
options:
- Desktop
- Laptop
- Tablet
- Mobile
- Other
validations:
required: false
- type: input
id: device-other
attributes:
label: Other Device
description: If you selected "Other" for Device Type, please specify your device
placeholder: e.g., Smart TV, IoT device
validations:
required: false
- type: textarea
id: additional-context
attributes:
label: Additional context
description: Add any other context about the problem here.
placeholder: Links, references, or any additional information
validations:
required: false
@@ -0,0 +1,48 @@
name: Feature Request
description: Suggest an idea for this project
title: "[Feature]: "
labels: ["enhancement", "feature-request"]
body:
- type: checkboxes
attributes:
label: Search existing issues
description: Please search to see if an issue already exists for this feature request.
options:
- label: I have searched the existing issues
required: true
- type: textarea
id: problem-description
attributes:
label: Is your feature request related to a problem?
description: A clear and concise description of what the problem is.
placeholder: e.g., I'm always frustrated when I have to...
validations:
required: true
- type: textarea
id: solution-description
attributes:
label: Describe the solution you'd like
description: A clear and concise description of what you want to happen.
placeholder: Describe the feature or change you're proposing
validations:
required: false
- type: textarea
id: alternatives
attributes:
label: Describe alternatives you've considered
description: A clear and concise description of any alternative solutions or features you've considered.
placeholder: Have you considered any workarounds or alternative approaches?
validations:
required: false
- type: textarea
id: additional-context
attributes:
label: Additional context
description: Add any other context or screenshots about the feature request here.
placeholder: Links, mockups, or any additional information
validations:
required: false
+43
@@ -0,0 +1,43 @@
# Pull Request Template
## Description
<!-- Briefly describe the purpose of this PR. -->
## Motivation
<!-- Explain why this change is needed. What problem does it solve? -->
## Type of Change
- [ ] New Feature
- [ ] Bug Fix
- [ ] Refactor / Code Cleanup
- [ ] Documentation Update
- [ ] Other (please specify)
## Related Issue(s)
<!-- Link to any related issue(s) (e.g., #123) -->
## Screenshots / Video
<!-- Include screenshots or a short video demonstrating the change. If the change adds a new UI feature, attach an image. If it adds functionality best shown via video, embed a video. -->
**Screenshot** (if applicable):
```markdown
![Screenshot Description](path/to/screenshot.png)
```
**Video** (if applicable):
```html
<video src="path/to/video.mp4" controls width="600"></video>
```
## Testing
<!-- Describe how reviewers can test the changes. Include steps, commands, or environment setup. -->
## Checklist
- [ ] I have performed a self-review of my code.
- [ ] I have added any necessary screenshots or videos.
- [ ] I have linked related issue(s) and updated the changelog if applicable.
---
*Thank you for contributing!*
+253
@@ -0,0 +1,253 @@
name: Build Electron App
on:
workflow_dispatch:
inputs:
arch:
description: 'Architecture to build'
required: true
default: 'both'
type: choice
options:
- arm64
- x64
- both
jobs:
build-windows:
runs-on: windows-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '22'
- name: Install dependencies
run: npm ci
- name: Build Windows app
run: npm run build:win
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Upload Windows build
uses: actions/upload-artifact@v4
with:
name: windows-installer
path: release/**/*.exe
retention-days: 30
build-macos:
runs-on: macos-latest
strategy:
matrix:
arch: ${{ github.event.inputs.arch == 'both' && fromJSON('["arm64", "x64"]') || fromJSON(format('["{0}"]', github.event.inputs.arch)) }}
steps:
# ─── Checkout ─────────────────────────────────────────────
- name: Checkout code
uses: actions/checkout@v4
# ─── Setup Node.js ────────────────────────────────────────
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
# ─── Setup Python (needed by some native deps) ────────────
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
# ─── Install Dependencies ─────────────────────────────────
- name: Install dependencies
run: npm ci
# ─── Import Code Signing Certificate ──────────────────────
# This is the KEY step that makes CI signing work.
# We create a temporary keychain, import the .p12 cert into it,
# and set it as the default so codesign can find it.
- name: Import code signing certificate
env:
MAC_CERTIFICATE_P12: ${{ secrets.MAC_CERTIFICATE_P12 }}
MAC_CERTIFICATE_PASSWORD: ${{ secrets.MAC_CERTIFICATE_PASSWORD }}
run: |
# Create a temporary keychain
KEYCHAIN_PATH=$RUNNER_TEMP/build.keychain-db
KEYCHAIN_PASSWORD=$(openssl rand -base64 32)
# Create and configure keychain
security create-keychain -p "$KEYCHAIN_PASSWORD" "$KEYCHAIN_PATH"
security set-keychain-settings -lut 21600 "$KEYCHAIN_PATH"
security unlock-keychain -p "$KEYCHAIN_PASSWORD" "$KEYCHAIN_PATH"
# Decode and import certificate
echo "$MAC_CERTIFICATE_P12" | base64 --decode > $RUNNER_TEMP/certificate.p12
security import $RUNNER_TEMP/certificate.p12 \
-k "$KEYCHAIN_PATH" \
-P "$MAC_CERTIFICATE_PASSWORD" \
-T /usr/bin/codesign \
-T /usr/bin/security
# Allow codesign to access the keychain without UI prompt
security set-key-partition-list -S apple-tool:,apple: -k "$KEYCHAIN_PASSWORD" "$KEYCHAIN_PATH"
# Add to keychain search path (makes it the default)
security list-keychains -d user -s "$KEYCHAIN_PATH" $(security list-keychains -d user | tr -d '"')
# Verify the identity is available
security find-identity -v -p codesigning "$KEYCHAIN_PATH"
# Clean up the .p12 file
rm -f $RUNNER_TEMP/certificate.p12
# ─── Build Vite + Electron ────────────────────────────────
- name: Build Vite + Electron
run: npx tsc && npx vite build
# ─── Package with electron-builder ────────────────────────
# electron-builder handles deep codesigning the .app bundle
# "notarize: false" in electron-builder.json5 prevents it from
# trying its own notarization flow
- name: Package .app bundle
run: npx electron-builder --mac --${{ matrix.arch }} --dir
env:
CSC_NAME: "Samir Patil (N26FZ4GW28)"
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# ─── Read version from package.json ───────────────────────
- name: Get version
id: version
run: echo "version=$(node -p "require('./package.json').version")" >> $GITHUB_OUTPUT
# ─── Locate the .app bundle ───────────────────────────────
- name: Find .app bundle
id: find_app
run: |
VERSION="${{ steps.version.outputs.version }}"
echo "=== Release directory contents ==="
ls -laR "release/${VERSION}/" || echo "release/${VERSION}/ not found"
echo "=== Searching for .app bundle ==="
APP_BUNDLE=$(find "release/${VERSION}" -maxdepth 4 -name "*.app" -type d | head -n1)
if [ -z "$APP_BUNDLE" ]; then
echo "::error::No .app bundle found in release/${VERSION}/"
exit 1
fi
echo "app_bundle=$APP_BUNDLE" >> $GITHUB_OUTPUT
echo "Found: $APP_BUNDLE"
# ─── Verify .app signature ────────────────────────────────
- name: Verify .app code signature
run: codesign --verify --deep --strict "${{ steps.find_app.outputs.app_bundle }}"
# ─── Create DMG ───────────────────────────────────────────
- name: Create DMG
id: dmg
run: |
VERSION="${{ steps.version.outputs.version }}"
ARCH="${{ matrix.arch }}"
DMG_NAME="Openscreen-Mac-${ARCH}-${VERSION}.dmg"
RELEASE_DIR="release/${VERSION}"
DMG_OUTPUT="${RELEASE_DIR}/${DMG_NAME}"
STAGING="${RELEASE_DIR}/dmg-staging"
mkdir -p "$STAGING"
cp -R "${{ steps.find_app.outputs.app_bundle }}" "$STAGING/"
ln -s /Applications "$STAGING/Applications"
hdiutil create \
-srcfolder "$STAGING" \
-volname "Openscreen" \
-fs HFS+ \
-fsargs "-c c=64,a=16,e=16" \
-format UDBZ \
"$DMG_OUTPUT"
rm -rf "$STAGING"
echo "dmg_path=$DMG_OUTPUT" >> $GITHUB_OUTPUT
echo "dmg_name=$DMG_NAME" >> $GITHUB_OUTPUT
# ─── Sign DMG ─────────────────────────────────────────────
- name: Sign DMG
run: |
codesign --force \
--sign "Developer ID Application: Samir Patil (N26FZ4GW28)" \
--timestamp \
"${{ steps.dmg.outputs.dmg_path }}"
# ─── Notarize DMG ────────────────────────────────────────
# On CI we can't use keychain profiles for notarytool, so we
# pass credentials directly via env vars / flags
- name: Notarize DMG
run: |
xcrun notarytool submit "${{ steps.dmg.outputs.dmg_path }}" \
--apple-id "${{ secrets.APPLE_ID }}" \
--team-id "${{ secrets.APPLE_TEAM_ID }}" \
--password "${{ secrets.APPLE_APP_SPECIFIC_PASSWORD }}" \
--wait
timeout-minutes: 15
# ─── Staple ───────────────────────────────────────────────
- name: Staple notarization ticket
run: xcrun stapler staple "${{ steps.dmg.outputs.dmg_path }}"
# ─── Validate ─────────────────────────────────────────────
- name: Validate stapled DMG
run: |
xcrun stapler validate "${{ steps.dmg.outputs.dmg_path }}"
spctl -a -vv -t install "${{ steps.dmg.outputs.dmg_path }}"
# ─── Upload Artifact ──────────────────────────────────────
- name: Upload notarized DMG
uses: actions/upload-artifact@v4
with:
name: openscreen-mac-${{ matrix.arch }}
path: ${{ steps.dmg.outputs.dmg_path }}
retention-days: 30
# ─── Cleanup Keychain ─────────────────────────────────────
- name: Cleanup keychain
if: always()
run: security delete-keychain $RUNNER_TEMP/build.keychain-db || true
build-linux:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '22'
- name: Install dependencies
run: npm ci
# bsdtar (from libarchive-tools) is required by fpm to build pacman
# packages. AppImage and deb don't need it; ubuntu-latest doesn't ship it.
- name: Install pacman build dependencies
run: sudo apt-get update && sudo apt-get install -y libarchive-tools
- name: Build Linux app
run: npm run build:linux
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Upload Linux build
uses: actions/upload-artifact@v4
with:
name: linux-installer
path: |
release/**/*.AppImage
release/**/*.zsync
release/**/*.deb
release/**/*.pacman
retention-days: 30
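As a reading aid for the `build-macos` job above, here is a minimal sketch of how the `arch` input appears to expand into the job matrix, and how each matrix entry maps to a DMG file name. `resolveArchMatrix` and `dmgName` are hypothetical helper names that do not exist in the workflow; they only mirror the `matrix.arch` expression and the `DMG_NAME` assignment. The version string is a placeholder.

```javascript
// Hypothetical helper: mirrors the matrix expression
//   inputs.arch == 'both' && ["arm64", "x64"] || [inputs.arch]
function resolveArchMatrix(archInput) {
  return archInput === "both" ? ["arm64", "x64"] : [archInput];
}

// Hypothetical helper: mirrors DMG_NAME="Openscreen-Mac-${ARCH}-${VERSION}.dmg"
function dmgName(arch, version) {
  return `Openscreen-Mac-${arch}-${version}.dmg`;
}

// "1.5.0" is a placeholder; the workflow reads the real value from package.json.
for (const arch of resolveArchMatrix("both")) {
  console.log(dmgName(arch, "1.5.0"));
}
```

Selecting `both` therefore yields two matrix jobs, each producing its own signed and notarized DMG artifact.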
+118
@@ -0,0 +1,118 @@
name: Bump Nix package on release
on:
release:
types: [published]
workflow_dispatch:
inputs:
tag:
description: "Release tag to bump (e.g. v1.5.0)"
required: true
type: string
permissions:
contents: write
pull-requests: write
jobs:
bump:
runs-on: ubuntu-latest
if: github.event_name == 'workflow_dispatch' || !github.event.release.prerelease
steps:
- name: Resolve tag and version
id: meta
env:
GH_EVENT_TAG: ${{ github.event.release.tag_name }}
INPUT_TAG: ${{ inputs.tag }}
run: |
set -euo pipefail
TAG="${GH_EVENT_TAG:-$INPUT_TAG}"
if [[ -z "$TAG" ]]; then
echo "::error::No tag resolved from release event or workflow input"
exit 1
fi
VERSION="${TAG#v}"
BRANCH="chore/bump-nix-${VERSION}"
echo "tag=$TAG" >> "$GITHUB_OUTPUT"
echo "version=$VERSION" >> "$GITHUB_OUTPUT"
echo "branch=$BRANCH" >> "$GITHUB_OUTPUT"
- name: Checkout main
uses: actions/checkout@v4
with:
ref: main
fetch-depth: 0
- name: Install Nix
uses: cachix/install-nix-action@v27
with:
nix_path: nixpkgs=channel:nixos-unstable
extra_nix_config: |
experimental-features = nix-command flakes
- name: Compute npmDepsHash
id: hash
run: |
set -euo pipefail
HASH=$(nix run nixpkgs#prefetch-npm-deps -- package-lock.json)
if [[ -z "$HASH" ]]; then
echo "::error::prefetch-npm-deps returned an empty hash"
exit 1
fi
echo "hash=$HASH" >> "$GITHUB_OUTPUT"
echo "Computed npmDepsHash: $HASH"
- name: Update nix/package.nix
env:
VERSION: ${{ steps.meta.outputs.version }}
HASH: ${{ steps.hash.outputs.hash }}
run: |
set -euo pipefail
# Update version line: ` version = "<anything>";`
sed -i -E "s|^([[:space:]]*version[[:space:]]*=[[:space:]]*)\"[^\"]*\";|\1\"${VERSION}\";|" nix/package.nix
# Update npmDepsHash line: ` npmDepsHash = "<anything>";`
sed -i -E "s|^([[:space:]]*npmDepsHash[[:space:]]*=[[:space:]]*)\"[^\"]*\";|\1\"${HASH}\";|" nix/package.nix
echo "=== diff ==="
git --no-pager diff nix/package.nix || true
- name: Create PR
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
VERSION: ${{ steps.meta.outputs.version }}
HASH: ${{ steps.hash.outputs.hash }}
BRANCH: ${{ steps.meta.outputs.branch }}
TAG: ${{ steps.meta.outputs.tag }}
run: |
set -euo pipefail
if git diff --quiet -- nix/package.nix; then
echo "nix/package.nix already at v${VERSION} with this hash — nothing to do."
exit 0
fi
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
# Replace any prior bump branch to keep the workflow idempotent.
git push origin --delete "$BRANCH" 2>/dev/null || true
git checkout -b "$BRANCH"
git add nix/package.nix
git commit -m "chore: bump nix package to v${VERSION}"
git push -u origin "$BRANCH"
gh pr create \
--title "chore: bump nix package to v${VERSION}" \
--base main \
--head "$BRANCH" \
--body "$(cat <<EOF
Automated bump triggered by release \`${TAG}\`.
- \`version\` → \`${VERSION}\`
- \`npmDepsHash\` → \`${HASH}\` (computed via \`prefetch-npm-deps package-lock.json\`)
Merge this so Nix users (NixOS, Home Manager, \`nix run github:siddharthvaddem/openscreen\`) pick up the new release.
> Note: PRs opened by \`GITHUB_TOKEN\` don't auto-trigger CI. The diff is two lines — review the change here, then merge. If you want CI to run, push an empty commit to this branch or close-and-reopen the PR.
EOF
)"
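The bump workflow above boils down to two small transformations: deriving a version and branch name from the release tag, and rewriting two attribute lines in `nix/package.nix`. The following is a rough JavaScript sketch of that logic, not the workflow's actual implementation (which uses bash parameter expansion and `sed -E`). `resolveMeta`, `setNixAttr`, and the sample file contents are assumptions made for illustration.

```javascript
// Hypothetical helper: mirrors TAG -> VERSION ("${TAG#v}") -> BRANCH.
function resolveMeta(tag) {
  const version = tag.startsWith("v") ? tag.slice(1) : tag;
  return { tag, version, branch: `chore/bump-nix-${version}` };
}

// Hypothetical helper: JavaScript counterpart of the sed substitution
//   s|^(<ws>attr<ws>=<ws>)"[^"]*";|\1"value";|
// The "m" flag anchors ^ at each line start, as sed does per line.
function setNixAttr(source, attr, value) {
  const pattern = new RegExp(`^([ \\t]*${attr}[ \\t]*=[ \\t]*)"[^"]*";`, "gm");
  return source.replace(pattern, (_match, prefix) => `${prefix}"${value}";`);
}

// Made-up stand-in for nix/package.nix; the real file is not shown here.
const sample = [
  "{",
  '  pname = "openscreen";',
  '  version = "1.4.0";',
  '  npmDepsHash = "sha256-OLD";',
  "}",
].join("\n");

const meta = resolveMeta("v1.5.0");
const bumped = setNixAttr(
  setNixAttr(sample, "version", meta.version),
  "npmDepsHash",
  "sha256-NEW", // placeholder; the workflow computes this with prefetch-npm-deps
);

console.log(meta.branch);
console.log(bumped);
```

Because the pattern is anchored at the start of a line, other attributes such as `pname` are left untouched.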
+58
@@ -0,0 +1,58 @@
name: CI
on:
pull_request:
branches: [main]
push:
branches: [main]
jobs:
lint:
name: Lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
- run: npm ci
- run: npm run lint
typecheck:
name: Type Check
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
- run: npm ci
- run: npx tsc --noEmit
test:
name: Test
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
- run: npm ci
- run: npm run test
- run: npm run test:browser:install
- run: npm run test:browser
build:
name: Build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
- run: npm ci
- run: npx vite build
+519
@@ -0,0 +1,519 @@
name: PR to Discord Forum
on:
pull_request_target:
types: [opened, reopened, ready_for_review, converted_to_draft, synchronize, edited, labeled, unlabeled, closed]
pull_request_review:
types: [submitted]
issue_comment:
types: [created]
schedule:
- cron: "0 12 * * 1"
workflow_dispatch:
permissions:
contents: read
pull-requests: write
issues: read
jobs:
notify:
if: github.event_name != 'schedule' && github.actor != 'github-actions[bot]'
concurrency:
group: discord-pr-sync-${{ github.repository }}-${{ github.event.pull_request.number || github.event.issue.number || github.run_id }}
cancel-in-progress: false
runs-on: ubuntu-latest
steps:
- name: Sync PR activity to Discord forum thread
id: sync
uses: actions/github-script@v7
env:
DISCORD_WEBHOOK_URL: ${{ secrets.DISCORD_WEBHOOK_URL }}
DISCORD_PR_FORUM_WEBHOOK: ${{ secrets.DISCORD_PR_FORUM_WEBHOOK }}
DISCORD_WEBHOOK_USERNAME: ${{ secrets.DISCORD_WEBHOOK_USERNAME }}
DISCORD_WEBHOOK_AVATAR_URL: ${{ secrets.DISCORD_WEBHOOK_AVATAR_URL }}
DISCORD_BOT_TOKEN: ${{ secrets.DISCORD_BOT_TOKEN }}
DISCORD_REVIEWER_ROLE_ID: ${{ secrets.DISCORD_REVIEWER_ROLE_ID }}
DISCORD_ALERT_WEBHOOK_URL: ${{ secrets.DISCORD_ALERT_WEBHOOK_URL }}
with:
script: |
const WEBHOOK_USERNAME = (process.env.DISCORD_WEBHOOK_USERNAME || "OpenScreen").trim();
const WEBHOOK_AVATAR = (process.env.DISCORD_WEBHOOK_AVATAR_URL || "").trim();
const THREAD_MARKER_REGEX = /<!--\s*discord-thread-id:(\d+)\s*-->/i;
const webhookUrl = (process.env.DISCORD_WEBHOOK_URL || process.env.DISCORD_PR_FORUM_WEBHOOK || "").trim();
const botToken = (process.env.DISCORD_BOT_TOKEN || "").trim();
const reviewerRoleId = (process.env.DISCORD_REVIEWER_ROLE_ID || "").trim();
const alertWebhookUrl = (process.env.DISCORD_ALERT_WEBHOOK_URL || "").trim();
const TAGS = {
open: "1493976692967080096",
draft: "1493976782028935279",
ready: "1493976833626996756",
changes: "1493976909875515564",
approved: "1493976951038152764",
merged: "1493977049709281320",
closed: "1493977108102516786",
};
const labelTagMap = {
bug: "1493977562773458975",
enhancement: "1493977619216207993",
documentation: "1493978565153394830",
};
function cleanDescription(text, maxLen = 3500) {
if (!text) return "No description provided.";
const normalized = text
.replace(/\r\n/g, "\n")
.replace(/\n{3,}/g, "\n\n")
.trim();
if (normalized.length <= maxLen) return normalized;
return `${normalized.slice(0, maxLen - 1)}…`;
}
function trimThreadName(name) {
return name.length > 95 ? name.slice(0, 95) : name;
}
function extractThreadId(body) {
if (!body) return null;
const match = body.match(THREAD_MARKER_REGEX);
return match ? match[1] : null;
}
function upsertThreadMarker(body, threadId) {
const cleaned = (body || "").replace(THREAD_MARKER_REGEX, "").trim();
return `${cleaned}\n\n<!-- discord-thread-id:${threadId} -->`.trim();
}
async function discordPost(payload, options = {}) {
const endpoint = new URL(webhookUrl);
endpoint.searchParams.set("wait", "true");
if (options.threadId) endpoint.searchParams.set("thread_id", String(options.threadId));
const response = await fetch(endpoint.toString(), {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
username: WEBHOOK_USERNAME,
avatar_url: WEBHOOK_AVATAR,
allowed_mentions: { parse: [] },
...payload,
})
});
const contentType = (response.headers.get("content-type") || "").toLowerCase();
const text = await response.text();
if (!response.ok) {
throw new Error(`Discord API error ${response.status}: ${text}`);
}
if (!text) return {};
if (contentType.includes("application/json")) return JSON.parse(text);
// Some proxy/CDN edge responses may return HTML with 2xx; avoid crashing on JSON parse.
core.warning(`Discord webhook returned non-JSON response (content-type: ${contentType || "unknown"}).`);
return {};
}
async function patchDiscordThread(threadId, patchBody) {
if (!botToken || !threadId) return;
const response = await fetch(`https://discord.com/api/v10/channels/${threadId}`, {
method: "PATCH",
headers: {
"Authorization": `Bot ${botToken}`,
"Content-Type": "application/json",
},
body: JSON.stringify(patchBody),
});
if (!response.ok) {
const text = await response.text();
core.warning(`Discord thread patch failed (${response.status}): ${text}`);
}
}
function desiredStatusTag(prState) {
if (prState.merged && TAGS.merged) return TAGS.merged;
if (prState.closed && !prState.merged && TAGS.closed) return TAGS.closed;
if (prState.reviewState === "CHANGES_REQUESTED" && TAGS.changes) return TAGS.changes;
if (prState.reviewState === "APPROVED" && TAGS.approved) return TAGS.approved;
if (prState.draft && TAGS.draft) return TAGS.draft;
if (!prState.draft && TAGS.ready) return TAGS.ready;
return TAGS.open || null;
}
function tagIdsFromLabels(labels) {
const out = [];
for (const label of labels) {
const mapped = labelTagMap[label.toLowerCase()] || labelTagMap[label];
if (mapped) out.push(String(mapped));
}
return out;
}
async function getPullRequest() {
if (context.eventName === "pull_request_target" || context.eventName === "pull_request_review") {
return context.payload.pull_request || null;
}
if (context.eventName === "issue_comment") {
const issue = context.payload.issue;
if (!issue?.pull_request) return null;
const { data } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: issue.number,
});
return data;
}
return null;
}
async function getReviewState(owner, repo, pullNumber) {
const { data } = await github.rest.pulls.listReviews({ owner, repo, pull_number: pullNumber, per_page: 100 });
let hasChanges = false;
let hasApproved = false;
for (const r of data) {
const s = (r.state || "").toUpperCase();
if (s === "CHANGES_REQUESTED") hasChanges = true;
if (s === "APPROVED") hasApproved = true;
}
if (hasChanges) return "CHANGES_REQUESTED";
if (hasApproved) return "APPROVED";
return "NONE";
}
async function sendFailureAlert(message) {
if (!alertWebhookUrl) return;
try {
await fetch(alertWebhookUrl, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
username: "OpenScreen",
avatar_url: WEBHOOK_AVATAR,
content: `⚠️ PR Discord sync failed\n${message}\nRun: ${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`,
allowed_mentions: { parse: [] }
})
});
} catch {
core.warning("Failed to send failure alert webhook.");
}
}
try {
const pr = await getPullRequest();
if (!pr) {
core.info("No PR context found. Skipping.");
return;
}
if (!webhookUrl) {
const strictEvents = new Set(["pull_request_target", "workflow_dispatch"]);
const msg =
`Discord sync skipped: webhook secret unavailable for event '${context.eventName}'. ` +
"Set either DISCORD_WEBHOOK_URL or DISCORD_PR_FORUM_WEBHOOK in repository secrets.";
if (strictEvents.has(context.eventName)) {
core.setFailed(msg);
} else {
core.warning(msg);
}
return;
}
const action = context.payload.action || "";
const owner = context.repo.owner;
const repo = context.repo.repo;
const number = pr.number;
const title = pr.title;
const author = pr.user?.login || "unknown";
const url = pr.html_url;
const authorUrl = pr.user?.html_url || "";
const authorAvatar = pr.user?.avatar_url || "";
const base = pr.base?.ref || "";
const head = pr.head?.ref || "";
const repoFullName = pr.base?.repo?.full_name || `${owner}/${repo}`;
const labels = (pr.labels || []).map((l) => l.name);
const body = (pr.body || "").trim();
const reviewState = await getReviewState(owner, repo, number);
let threadId = extractThreadId(body);
const shouldCreateThread =
context.eventName === "pull_request_target" &&
["opened", "reopened", "ready_for_review"].includes(action) &&
!threadId;
if (shouldCreateThread) {
const fields = [
{ name: "PR", value: `[#${number}](${url})`, inline: true },
{ name: "Author", value: `[${author}](${authorUrl || url})`, inline: true },
{ name: "Status", value: pr.draft ? "Draft" : "Open", inline: true },
{ name: "Branches", value: `\`${head}\` -> \`${base}\``, inline: true },
{ name: "Changes", value: `+${pr.additions} / -${pr.deletions}`, inline: true },
{ name: "Files Changed", value: String(pr.changed_files), inline: true }
];
if (labels.length) {
fields.push({
name: "Labels",
value: labels.map((l) => `\`${l}\``).join(" "),
inline: false,
});
}
const statusTag = desiredStatusTag({ draft: pr.draft, reviewState, merged: false, closed: false });
const mappedLabelTags = tagIdsFromLabels(labels);
const appliedTags = [...new Set([statusTag, ...mappedLabelTags].filter(Boolean))];
const createPayload = {
content: action === "ready_for_review" ? "🔔 PR is now ready for review" : "🔔 New pull request opened",
thread_name: trimThreadName(`PR #${number} - ${title}`),
applied_tags: appliedTags,
embeds: [
{
title: `PR #${number}: ${title}`,
url,
description: cleanDescription(body),
color: pr.draft ? 15105570 : 1998671,
author: {
name: author,
url: authorUrl || undefined,
icon_url: authorAvatar || undefined,
},
fields,
footer: { text: repoFullName },
timestamp: new Date().toISOString(),
},
],
};
const result = await discordPost(createPayload);
const createdThreadId = result.channel_id || null;
if (createdThreadId) {
const updatedBody = upsertThreadMarker(body, createdThreadId);
await github.rest.pulls.update({ owner, repo, pull_number: number, body: updatedBody });
core.info(`Created Discord thread ${createdThreadId} and stored mapping.`);
} else {
core.warning("Discord thread created but channel_id missing in response.");
}
return;
}
if (!threadId) {
core.info("No mapped Discord thread ID found; skipping update event.");
return;
}
if (context.eventName === "pull_request_target" && ["edited", "labeled", "unlabeled", "ready_for_review", "converted_to_draft"].includes(action)) {
const statusTag = desiredStatusTag({
draft: action === "converted_to_draft" ? true : pr.draft,
reviewState,
merged: false,
closed: false,
});
const mappedLabelTags = tagIdsFromLabels(labels);
const appliedTags = [...new Set([statusTag, ...mappedLabelTags].filter(Boolean))];
await patchDiscordThread(threadId, {
name: trimThreadName(`PR #${number} - ${title}`),
...(appliedTags.length ? { applied_tags: appliedTags } : {}),
});
}
let updateMessage = null;
let updateEmbed = null;
if (context.eventName === "pull_request_target") {
if (action === "synchronize") {
const { data: commits } = await github.rest.pulls.listCommits({ owner, repo, pull_number: number, per_page: 5 });
const list = commits.map((c) => `- \`${c.sha.slice(0, 7)}\` ${c.commit.message.split("\n")[0]}`).join("\n") || "- No commit details";
updateMessage = `🧩 New commits pushed to PR #${number}`;
updateEmbed = {
title: `Commit Update • PR #${number}`,
url: `${url}/files`,
description: `${list}`,
color: 1998671,
footer: { text: repoFullName },
timestamp: new Date().toISOString(),
};
} else if (action === "edited") {
updateMessage = `✏️ PR #${number} details were edited`;
updateEmbed = {
title: `PR Updated • #${number}`,
url,
description: cleanDescription(body, 1200),
color: 1998671,
timestamp: new Date().toISOString(),
};
} else if (action === "closed") {
const isMerged = !!pr.merged;
const statusTag = desiredStatusTag({ draft: false, reviewState, merged: isMerged, closed: true });
const mappedLabelTags = tagIdsFromLabels(labels);
const appliedTags = [...new Set([statusTag, ...mappedLabelTags].filter(Boolean))];
await patchDiscordThread(threadId, {
...(appliedTags.length ? { applied_tags: appliedTags } : {}),
...(isMerged ? { archived: true, locked: true } : {}),
});
updateMessage = isMerged
? `✅ PR #${number} was merged`
: `🛑 PR #${number} was closed without merge`;
updateEmbed = {
title: isMerged ? `Merged • PR #${number}` : `Closed • PR #${number}`,
url,
description: isMerged ? "This PR has been merged into the base branch." : "This PR was closed before merge.",
color: isMerged ? 5763719 : 15158332,
timestamp: new Date().toISOString(),
};
} else if (action === "ready_for_review") {
updateMessage = `🚀 PR #${number} moved from draft to ready for review`;
if (reviewerRoleId) updateMessage += ` <@&${reviewerRoleId}>`;
} else if (action === "converted_to_draft") {
updateMessage = `📝 PR #${number} converted to draft`;
}
} else if (context.eventName === "pull_request_review") {
const review = context.payload.review;
if (review) {
const state = (review.state || "commented").toUpperCase();
const reviewer = review.user?.login || "reviewer";
updateMessage = `🧪 Review ${state} by **${reviewer}** on PR #${number}`;
if (state === "CHANGES_REQUESTED" && reviewerRoleId) updateMessage += ` <@&${reviewerRoleId}>`;
updateEmbed = {
title: `Review ${state} • PR #${number}`,
url: review.html_url || url,
description: cleanDescription(review.body || "No review note.", 1000),
color: state === "APPROVED" ? 5763719 : state === "CHANGES_REQUESTED" ? 15158332 : 1998671,
timestamp: new Date().toISOString(),
};
if (state === "CHANGES_REQUESTED" || state === "APPROVED") {
const statusTag = desiredStatusTag({ draft: pr.draft, reviewState: state, merged: false, closed: false });
const mappedLabelTags = tagIdsFromLabels(labels);
const appliedTags = [...new Set([statusTag, ...mappedLabelTags].filter(Boolean))];
await patchDiscordThread(threadId, {
...(appliedTags.length ? { applied_tags: appliedTags } : {}),
});
}
}
} else if (context.eventName === "issue_comment") {
const comment = context.payload.comment;
if (comment) {
const commenter = comment.user?.login || "user";
updateMessage = `💬 New comment by **${commenter}** on PR #${number}`;
updateEmbed = {
title: `New PR Comment • #${number}`,
url: comment.html_url || url,
description: cleanDescription(comment.body || "No comment body.", 1000),
color: 1998671,
timestamp: new Date().toISOString(),
};
}
}
if (!updateMessage && !updateEmbed) {
core.info("No Discord update message for this event/action. Skipping.");
return;
}
const payload = { content: updateMessage || "" };
if (updateEmbed) payload.embeds = [updateEmbed];
await discordPost(payload, { threadId });
core.info(`Posted update to Discord thread ${threadId}.`);
} catch (err) {
const msg = err && err.message ? err.message : String(err);
core.setFailed(msg);
const alertWebhook = process.env.DISCORD_ALERT_WEBHOOK_URL;
if (alertWebhook) {
try {
await fetch(alertWebhook, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
username: "OpenScreen",
avatar_url: WEBHOOK_AVATAR,
content: `⚠️ PR->Discord sync failed\n${msg}\nRun: ${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`,
allowed_mentions: { parse: [] }
})
});
} catch {
core.warning("Failed to send alert webhook.");
}
}
}
weekly-contributor-leaderboard:
if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
steps:
- name: Post weekly contributor leaderboard
uses: actions/github-script@v7
env:
DISCORD_SPOTLIGHT_WEBHOOK_URL: ${{ secrets.DISCORD_SPOTLIGHT_WEBHOOK_URL }}
DISCORD_WEBHOOK_USERNAME: ${{ secrets.DISCORD_WEBHOOK_USERNAME }}
DISCORD_WEBHOOK_AVATAR_URL: ${{ secrets.DISCORD_WEBHOOK_AVATAR_URL }}
with:
script: |
const spotlightWebhook = (process.env.DISCORD_SPOTLIGHT_WEBHOOK_URL || "").trim();
const webhookUsername = (process.env.DISCORD_WEBHOOK_USERNAME || "OpenScreen").trim();
const webhookAvatar = (process.env.DISCORD_WEBHOOK_AVATAR_URL || "").trim();
if (!spotlightWebhook) {
core.info("DISCORD_SPOTLIGHT_WEBHOOK_URL missing. Skipping leaderboard post.");
return;
}
const since = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
const owner = context.repo.owner;
const repo = context.repo.repo;
const q = `repo:${owner}/${repo} is:pr is:merged merged:>=${since.substring(0, 10)}`;
const search = await github.rest.search.issuesAndPullRequests({
q,
per_page: 100,
});
const counter = new Map();
for (const item of search.data.items) {
const login = item.user?.login;
if (!login) continue;
counter.set(login, (counter.get(login) || 0) + 1);
}
const ranked = [...counter.entries()]
.sort((a, b) => b[1] - a[1])
.slice(0, 10);
const totalMerged = search.data.items.length;
const lines = ranked.length
? ranked.map(([user, count], idx) => `${idx + 1}. **${user}** - ${count} merged PR(s)`).join("\n")
: "No merged PRs this week.";
const payload = {
username: webhookUsername,
...(webhookAvatar ? { avatar_url: webhookAvatar } : {}),
embeds: [
{
title: "🌟 Weekly Contributor Leaderboard",
description: lines,
color: 1998671,
fields: [
{ name: "Merged PRs (7d)", value: String(totalMerged), inline: true },
{ name: "Repository", value: `${owner}/${repo}`, inline: true },
{ name: "Period", value: "Last 7 days", inline: true }
],
timestamp: new Date().toISOString()
}
],
allowed_mentions: { parse: [] }
};
const res = await fetch(`${spotlightWebhook}?wait=true`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(payload)
});
if (!res.ok) {
const txt = await res.text();
core.setFailed(`Leaderboard post failed ${res.status}: ${txt}`);
}
@@ -0,0 +1,26 @@
name: Publish release to WinGet
on:
release:
types: [released]
workflow_dispatch:
inputs:
tag:
description: "Release tag to publish to winget (e.g. v1.4.0)"
required: true
type: string
jobs:
publish:
runs-on: windows-latest
if: github.event_name == 'workflow_dispatch' || !github.event.release.prerelease
steps:
- uses: vedantmgoyal9/winget-releaser@v2
with:
identifier: SiddharthVaddem.OpenScreen
# Match the Windows installer asset attached to each release.
# Today: "Openscreen.Setup.latest.exe". Adjust this regex if you
# ever rename the installer to include a version (e.g. "Setup\.\d+\.\d+\.\d+\.exe").
installers-regex: 'Setup\..*\.exe$'
release-tag: ${{ inputs.tag || github.event.release.tag_name }}
token: ${{ secrets.WINGET_ACC_TOKEN }}
@@ -0,0 +1,168 @@
name: Update Homebrew Cask
on:
release:
types: [published]
workflow_dispatch:
inputs:
tag:
description: "Release tag to publish to the tap (e.g. v1.4.0)"
required: true
type: string
permissions:
contents: read
jobs:
update-cask:
runs-on: ubuntu-latest
if: github.event_name == 'workflow_dispatch' || !github.event.release.prerelease
env:
TAP_OWNER: siddharthvaddem
TAP_REPO: homebrew-openscreen
CASK_NAME: openscreen
steps:
- name: Resolve tag and version
id: meta
env:
GH_EVENT_TAG: ${{ github.event.release.tag_name }}
INPUT_TAG: ${{ inputs.tag }}
run: |
set -euo pipefail
TAG="${GH_EVENT_TAG:-$INPUT_TAG}"
if [[ -z "$TAG" ]]; then
echo "::error::No tag resolved from release event or workflow input"
exit 1
fi
VERSION="${TAG#v}"
echo "tag=$TAG" >> "$GITHUB_OUTPUT"
echo "version=$VERSION" >> "$GITHUB_OUTPUT"
- name: Find macOS DMG assets
id: assets
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
TAG: ${{ steps.meta.outputs.tag }}
REPO: ${{ github.repository }}
run: |
set -euo pipefail
NAMES=$(gh release view "$TAG" --repo "$REPO" --json assets --jq '.assets[].name')
# arm64 DMG: explicit "arm64" / "apple silicon" / fallback to any .dmg
# whose name does NOT contain "x64" or non-mac platform markers.
ARM_NAME=$(echo "$NAMES" | grep -iE '\.dmg$' \
| grep -iE '(arm64|apple[-_. ]?silicon)' | head -n1 || true)
if [[ -z "$ARM_NAME" ]]; then
ARM_NAME=$(echo "$NAMES" | grep -iE '\.dmg$' \
| grep -iv 'x64' | grep -iv 'linux' | grep -iv 'win' | head -n1 || true)
fi
# x64 DMG
X64_NAME=$(echo "$NAMES" | grep -iE '\.dmg$' \
| grep -iE '(x64|x86[-_]?64|intel)' | head -n1 || true)
if [[ -z "$ARM_NAME" || -z "$X64_NAME" ]]; then
echo "::error::Could not locate both arm64 and x64 DMGs in release assets"
echo "Available assets:"
echo "$NAMES"
exit 1
fi
echo "arm_name=$ARM_NAME" >> "$GITHUB_OUTPUT"
echo "x64_name=$X64_NAME" >> "$GITHUB_OUTPUT"
echo "Found arm64 asset: $ARM_NAME"
echo "Found x64 asset: $X64_NAME"
- name: Download DMGs and compute sha256
id: shas
env:
REPO: ${{ github.repository }}
TAG: ${{ steps.meta.outputs.tag }}
ARM_NAME: ${{ steps.assets.outputs.arm_name }}
X64_NAME: ${{ steps.assets.outputs.x64_name }}
run: |
set -euo pipefail
BASE="https://github.com/${REPO}/releases/download/${TAG}"
curl -fsSL --retry 3 -o /tmp/arm.dmg "${BASE}/${ARM_NAME}"
curl -fsSL --retry 3 -o /tmp/x64.dmg "${BASE}/${X64_NAME}"
ARM_SHA=$(sha256sum /tmp/arm.dmg | awk '{print $1}')
X64_SHA=$(sha256sum /tmp/x64.dmg | awk '{print $1}')
echo "arm_sha=$ARM_SHA" >> "$GITHUB_OUTPUT"
echo "x64_sha=$X64_SHA" >> "$GITHUB_OUTPUT"
- name: Checkout tap
uses: actions/checkout@v4
with:
repository: ${{ env.TAP_OWNER }}/${{ env.TAP_REPO }}
token: ${{ secrets.HOMEBREW_TAP_TOKEN }}
path: tap
- name: Write cask file
env:
REPO: ${{ github.repository }}
TAG: ${{ steps.meta.outputs.tag }}
VERSION: ${{ steps.meta.outputs.version }}
ARM_NAME: ${{ steps.assets.outputs.arm_name }}
X64_NAME: ${{ steps.assets.outputs.x64_name }}
ARM_SHA: ${{ steps.shas.outputs.arm_sha }}
X64_SHA: ${{ steps.shas.outputs.x64_sha }}
run: |
set -euo pipefail
mkdir -p tap/Casks
BASE="https://github.com/${REPO}/releases/download/${TAG}"
# #{version} is Ruby interpolation written literally to the cask
# file (the bash heredoc leaves "#{...}" alone). ${VERSION}, ${ARM_SHA},
# etc. are bash variables expanded by the heredoc. The literal
# #{version} fixes Homebrew's "URL is unversioned" audit warning by
# making the version string statically detectable.
cat > "tap/Casks/${CASK_NAME}.rb" <<EOF
cask "${CASK_NAME}" do
version "${VERSION}"
on_arm do
sha256 "${ARM_SHA}"
url "https://github.com/${REPO}/releases/download/v#{version}/${ARM_NAME}"
end
on_intel do
sha256 "${X64_SHA}"
url "https://github.com/${REPO}/releases/download/v#{version}/${X64_NAME}"
end
name "Openscreen"
desc "Screen recorder and video editor"
homepage "https://github.com/${REPO}"
auto_updates false
depends_on macos: ">= :big_sur"
app "Openscreen.app"
zap trash: [
"~/Library/Application Support/Openscreen",
"~/Library/Caches/com.siddharthvaddem.openscreen",
"~/Library/Logs/Openscreen",
"~/Library/Preferences/com.siddharthvaddem.openscreen.plist",
"~/Library/Saved Application State/com.siddharthvaddem.openscreen.savedState",
]
end
EOF
- name: Commit and push to tap
working-directory: tap
env:
VERSION: ${{ steps.meta.outputs.version }}
run: |
set -euo pipefail
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git add "Casks/${CASK_NAME}.rb"
if git diff --cached --quiet; then
echo "Cask already up to date for ${VERSION} — nothing to commit."
exit 0
fi
git commit -m "Bump ${CASK_NAME} to ${VERSION}"
git push
@@ -0,0 +1,76 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
pnpm-debug.log*
lerna-debug.log*
node_modules
dist
dist-electron
dist-ssr
*.local
.env
# Native helper build outputs
/electron/native/wgc-capture/build/
/electron/native/screencapturekit/build/
/electron/native/screencapturekit/.build/
/electron/native/screencapturekit/.swiftpm/
/electron/native/bin/
/tools/ocr/build/
/tools/ocr/dist/
/tools/ocr/models/**/.gitattributes
/tools/ocr/models/**/README.md
# Native macOS generated files
DerivedData/
*.xcuserstate
xcuserdata/
# Editor directories and files
.vscode/*
.zed/
!.vscode/extensions.json
.idea
.DS_Store
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?
release/**
*.kiro/
.claude/
__pycache__/
*.py[cod]
# npx electron-builder --mac --win
# Playwright
test-results
playwright-report/
# Vitest browser mode screenshots
__screenshots__/
# shell files
/shell.sh
# Nix
result
result-*
.direnv/
#kilocode
.kilo/
#others
**/*.import
# Local agent/tooling state
/.agent/
/.serena/
/.venv-ocr-build/
@@ -0,0 +1 @@
npx lint-staged
@@ -0,0 +1 @@
22.22.1
@@ -0,0 +1,57 @@
# Contribution Guidelines
Thank you for considering contributing to this project! By contributing, you help make this project better for everyone. Please take a moment to review these guidelines to ensure a smooth contribution process.
## How to Contribute
1. **Fork the Repository**
- Click the "Fork" button at the top right of this repository to create your own copy.
2. **Clone Your Fork**
- Clone your forked repository to your local machine:
```bash
git clone https://github.com/your-username/openscreen.git
```
3. **Create a New Branch**
- Create a branch for your feature or bug fix:
```bash
git checkout -b feature/your-feature-name
```
4. **Make Changes**
- Make your changes.
5. **Test Your Changes**
- Test your changes thoroughly to ensure they work as expected and do not break existing functionality.
6. **Commit Your Changes**
- Commit your changes with a clear and concise commit message:
```bash
git add .
git commit -m "Add a brief description of your changes"
```
7. **Push Your Changes**
- Push your branch to your forked repository:
```bash
git push origin feature/your-feature-name
```
8. **Open a Pull Request**
- Go to the original repository and open a pull request from your branch. Provide a clear description of your changes and the problem they solve.
## Reporting Issues
If you encounter a bug or have a feature request, please open an issue in the [Issues](https://github.com/siddharthvaddem/openscreen/issues) section of this repository. Provide as much detail as possible to help us address the issue effectively.
## Style Guide
- Write clear, concise, and descriptive commit messages.
- Include comments where necessary to explain complex code.
## License
By contributing to this project, you agree that your contributions will be licensed under the [MIT License](./LICENSE).
Thank you for your contributions!
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 Siddharth Vaddem
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Binary file not shown.
@@ -1 +0,0 @@
9ADA2779B4AE6CA85A24956ADDA1614DA3B9F70201F0EF7954EAF8EAD7594998 Openscreen-1.4.10.msi
Binary file not shown.
Binary file not shown.
@@ -1 +0,0 @@
C081544750442C2198BC7D1B35BBED154E07DDA3ADBE1B82F2DF80442499A05D Openscreen-Setup-1.4.10.exe
@@ -1,9 +1,190 @@
# OpenScreen 1.4.10 release assets
> [!WARNING]
> This started as a side project that took off — it's not production grade and you'll hit bugs, but hopefully it covers what you need.
Large installer artifacts are stored on this branch because release uploads are limited by the reverse proxy.
<p align="center">
<img src="public/openscreen.png" alt="OpenScreen Logo" width="64" />
<br />
<br />
<a href="https://trendshift.io/repositories/17427" target="_blank"><img src="https://trendshift.io/api/badge/repositories/17427" alt="siddharthvaddem%2Fopenscreen | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
<br />
<br />
<a href="https://deepwiki.com/siddharthvaddem/openscreen">
<img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki" />
</a>
&nbsp;
<a href="https://discord.gg/yAQQhRaEeg">
<img src="https://dcbadge.limes.pink/api/server/https://discord.gg/yAQQhRaEeg?style=flat" alt="Join Discord" />
</a>
</p>
- Openscreen-Setup-1.4.10.exe
- Openscreen-1.4.10.msi
# <p align="center">OpenScreen</p>
MSI SHA256: 9ADA2779B4AE6CA85A24956ADDA1614DA3B9F70201F0EF7954EAF8EAD7594998
MSI signing: Azure Trusted Signing Private Trust
<p align="center"><strong>OpenScreen is your free, open-source alternative to Screen Studio (sort of).</strong></p>
If you don't want to pay $29/month for Screen Studio but want a much simpler version that does what most people seem to need (quick, polished product demos and walkthroughs you'd post on X or Reddit), OpenScreen is for you. It does not offer all Screen Studio features, but it covers the basics well!
Screen Studio is an awesome product and this is definitely not a 1:1 clone. OpenScreen is a much simpler take, just the basics for folks who want control and don't want to pay. If you need all the fancy features, your best bet is to support Screen Studio (they really do a great job, haha). But if you just want something free (no gotchas) and open, this project does the job!
**100% free** for both **personal** and **commercial** use. Use it, modify it, distribute it — just be cool 😁 and shout out the project if you feel like it.
<p align="center">
<img src="public/preview3.png" alt="OpenScreen App Preview 3" style="height: 0.2467; margin-right: 12px;" />
<img src="public/preview4.png" alt="OpenScreen App Preview 4" style="height: 0.1678; margin-right: 12px;" />
</p>
## Core Features
- Record a specific window, region, or your whole screen.
- Record microphone and system audio.
- Webcam overlay with picture-in-picture, drag-to-position, and shape options.
- Auto or manual zooms with adjustable depth, duration, easing, and pixel-precise position.
- Wallpapers, solid colors, gradients, or a custom background.
- Motion blur for smoother pan and zoom transitions.
- Crop, trim, and per-segment speed control on the timeline.
- Blur effects to hide sensitive parts of the screen.
- Cursor and click highlighting.
- Text, arrow, and image annotations.
- Save and reopen projects without re-recording.
- Export to MP4 or GIF in multiple aspect ratios and resolutions.
- Translated into Arabic, English, Spanish, French, Japanese, Korean, Russian, Turkish, Vietnamese, Simplified Chinese, and Traditional Chinese.
## Installation
Download the latest installer for your platform from the [GitHub Releases](https://github.com/siddharthvaddem/openscreen/releases) page.
### macOS
The easiest way to install on macOS is via [Homebrew](https://brew.sh):
```bash
brew install --cask siddharthvaddem/openscreen/openscreen
```
Brew automatically picks the right build for Apple Silicon or Intel, and verifies the download against a notarized signature so Gatekeeper won't block it.
To update later: `brew upgrade --cask openscreen`
To uninstall: `brew uninstall --cask openscreen` (add `--zap` to also remove app data)
#### Manual install (if you prefer)
If you'd rather grab the `.dmg` directly from the [Releases page](https://github.com/siddharthvaddem/openscreen/releases) and encounter Gatekeeper blocking the app, you can bypass it by running the following command in your terminal after installation:
```bash
xattr -rd com.apple.quarantine /Applications/Openscreen.app
```
Note: Before running the command above, give your terminal Full Disk Access in **System Settings > Privacy & Security**.
After running this command, go to **System Settings > Privacy & Security** (**System Preferences > Security & Privacy** on macOS 12 and earlier) and grant the "screen recording" and "accessibility" permissions. Once permissions are granted, you can launch the app.
### Windows
Install via [winget](https://learn.microsoft.com/en-us/windows/package-manager/winget/):
```bash
winget install SiddharthVaddem.OpenScreen
```
To update later: `winget upgrade SiddharthVaddem.OpenScreen`
To uninstall: `winget uninstall SiddharthVaddem.OpenScreen`
If you'd rather grab the `.exe` installer directly, download it from the [Releases page](https://github.com/siddharthvaddem/openscreen/releases).
### Linux
Three packages are published to the [Releases page](https://github.com/siddharthvaddem/openscreen/releases) for each version. Pick the one that matches your distro:
**Debian / Ubuntu / Pop!_OS (`.deb`)**
```bash
sudo apt install ./Openscreen-Linux-latest.deb
```
**Arch / Manjaro (`.pacman`)**
```bash
sudo pacman -U Openscreen-Linux-latest.pacman
```
**Any distro (`.AppImage`)**
```bash
chmod +x Openscreen-Linux-*.AppImage
./Openscreen-Linux-*.AppImage
```
**NixOS / Nix (flake)**
Try without installing:
```bash
nix run github:siddharthvaddem/openscreen
```
Install into your user profile:
```bash
nix profile install github:siddharthvaddem/openscreen
```
For a NixOS system config (flake):
```nix
{
inputs.openscreen.url = "github:siddharthvaddem/openscreen";
outputs = { nixpkgs, openscreen, ... }: {
nixosConfigurations.<host> = nixpkgs.lib.nixosSystem {
modules = [
openscreen.nixosModules.default
{ programs.openscreen.enable = true; }
];
};
};
}
```
For Home Manager, use `openscreen.homeManagerModules.default` with the same `programs.openscreen.enable = true;`.
You may need to grant screen recording permissions depending on your desktop environment.
**Sandbox error:** If the AppImage fails to launch with a "sandbox" error, run it with `--no-sandbox`:
```bash
./Openscreen-Linux-*.AppImage --no-sandbox
```
### Limitations
System audio capture relies on Electron's [desktopCapturer](https://www.electronjs.org/docs/latest/api/desktop-capturer) and has some platform-specific quirks:
- **macOS**: Requires macOS 13+. On macOS 14.2+ you'll be prompted to grant audio capture permission. macOS 12 and below do not support system audio (mic still works).
- **Windows**: Works out of the box.
- **Linux**: Needs PipeWire (default on Ubuntu 22.04+, Fedora 34+). Older PulseAudio-only setups may not support system audio (mic should still work).
## Built with
- Electron
- React
- TypeScript
- Vite
- PixiJS
- dnd-timeline
---
## Documentation
See the documentation here:
[OpenScreen Docs](https://deepwiki.com/siddharthvaddem/openscreen)
Refresh if outdated.
## Contributing
Contributions are welcome - please **include screenshots or a short video** for any UI change or new user-facing feature. If it touches what users see or do, show it. Skip only when it genuinely doesn't apply. PRs that don't follow this will be closed.
## Star History
<a href="https://www.star-history.com/?repos=siddharthvaddem%2Fopenscreen&type=date&legend=top-left">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/chart?repos=siddharthvaddem/openscreen&type=date&theme=dark&legend=top-left" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/chart?repos=siddharthvaddem/openscreen&type=date&legend=top-left" />
<img alt="Star History Chart" src="https://api.star-history.com/chart?repos=siddharthvaddem/openscreen&type=date&legend=top-left" />
</picture>
</a>
## License
This project is licensed under the [MIT License](./LICENSE). By using this software, you agree that the authors are not liable for any issues, damages, or claims arising from its use.
@@ -0,0 +1,134 @@
{
"$schema": "https://biomejs.dev/schemas/2.4.12/schema.json",
"vcs": { "enabled": true, "clientKind": "git", "useIgnoreFile": true },
"files": { "ignoreUnknown": false, "includes": ["**", "!**/*.css"] },
"formatter": {
"enabled": true,
"indentStyle": "tab",
"formatWithErrors": true,
"lineEnding": "lf",
"lineWidth": 100,
"attributePosition": "auto"
},
"linter": {
"enabled": true,
"rules": {
"recommended": false,
"complexity": {
"noAdjacentSpacesInRegex": "error",
"noBannedTypes": "error",
"noExtraBooleanCast": "error",
"noUselessCatch": "error",
"noUselessEscapeInRegex": "error",
"noUselessThisAlias": "error",
"noUselessTypeConstraint": "error"
},
"correctness": {
"noConstAssign": "error",
"noConstantCondition": "error",
"noEmptyCharacterClassInRegex": "error",
"noEmptyPattern": "error",
"noGlobalObjectCalls": "error",
"noInnerDeclarations": "error",
"noInvalidConstructorSuper": "error",
"noNonoctalDecimalEscape": "error",
"noPrecisionLoss": "error",
"noSelfAssign": "error",
"noSetterReturn": "error",
"noSwitchDeclarations": "error",
"noUndeclaredVariables": "error",
"noUnreachable": "error",
"noUnreachableSuper": "error",
"noUnsafeFinally": "error",
"noUnsafeOptionalChaining": "error",
"noUnusedLabels": "error",
"noUnusedVariables": "error",
"useExhaustiveDependencies": "warn",
"useHookAtTopLevel": "error",
"useIsNan": "error",
"useValidForDirection": "error",
"useValidTypeof": "error",
"useYield": "error"
},
"style": {
"noNamespace": "off",
"useArrayLiterals": "error",
"useAsConstAssertion": "error",
"useComponentExportOnlyModules": "off"
},
"suspicious": {
"noAssignInExpressions": "error",
"noAsyncPromiseExecutor": "error",
"noCatchAssign": "error",
"noClassAssign": "error",
"noCompareNegZero": "error",
"noControlCharactersInRegex": "error",
"noDebugger": "error",
"noDuplicateCase": "error",
"noDuplicateClassMembers": "error",
"noDuplicateElseIf": "error",
"noDuplicateObjectKeys": "error",
"noDuplicateParameters": "error",
"noEmptyBlockStatements": "warn",
"noExplicitAny": "warn",
"noExtraNonNullAssertion": "error",
"noFallthroughSwitchClause": "error",
"noFunctionAssign": "error",
"noGlobalAssign": "error",
"noImportAssign": "error",
"noIrregularWhitespace": "error",
"noMisleadingCharacterClass": "error",
"noMisleadingInstantiator": "error",
"noNonNullAssertedOptionalChain": "error",
"noPrototypeBuiltins": "error",
"noRedeclare": "error",
"noShadowRestrictedNames": "error",
"noSparseArray": "error",
"noTsIgnore": "error",
"noUnsafeDeclarationMerging": "error",
"noUnsafeNegation": "error",
"noUselessRegexBackrefs": "error",
"noWith": "error",
"useGetterReturn": "error"
}
},
"includes": ["**", "**/dist", "**/.eslintrc.cjs", "!**/*.css"]
},
"javascript": { "formatter": { "quoteStyle": "double" } },
"overrides": [
{
"includes": ["*.ts", "*.tsx", "*.mts", "*.cts"],
"linter": {
"rules": {
"complexity": { "noArguments": "error" },
"correctness": {
"noConstAssign": "off",
"noGlobalObjectCalls": "off",
"noInvalidBuiltinInstantiation": "off",
"noInvalidConstructorSuper": "off",
"noSetterReturn": "off",
"noUndeclaredVariables": "off",
"noUnreachable": "off",
"noUnreachableSuper": "off"
},
"style": { "useConst": "error" },
"suspicious": {
"noDuplicateClassMembers": "off",
"noDuplicateObjectKeys": "off",
"noDuplicateParameters": "off",
"noFunctionAssign": "off",
"noImportAssign": "off",
"noRedeclare": "off",
"noUnsafeNegation": "off",
"noVar": "error",
"useGetterReturn": "off"
}
}
}
}
],
"assist": {
"enabled": true,
"actions": { "source": { "organizeImports": "on" } }
}
}
@@ -0,0 +1,22 @@
{
"$schema": "https://ui.shadcn.com/schema.json",
"style": "new-york",
"rsc": false,
"tsx": true,
"tailwind": {
"config": "tailwind.config.cjs",
"css": "src/index.css",
"baseColor": "stone",
"cssVariables": true,
"prefix": ""
},
"iconLibrary": "lucide",
"aliases": {
"components": "@/components",
"utils": "@/lib/utils",
"ui": "@/components/ui",
"lib": "@/lib",
"hooks": "@/hooks"
},
"registries": {}
}
@@ -0,0 +1,39 @@
# Native Bridge Architecture
## Goal
Provide a single, resilient source of truth for platform-native capabilities while keeping Electron transport thin and renderer APIs unified.
## Layers
1. Native adapters
Platform-specific providers implement stable domain interfaces such as cursor telemetry or system asset discovery.
2. Main-process services
Services orchestrate adapters, own runtime state, and expose domain-level operations.
3. Unified IPC transport
Renderer code talks to a single `native-bridge:invoke` channel using versioned contracts.
4. Renderer client
React code should consume `src/native/client.ts` rather than binding directly to ad hoc Electron APIs.
## Principles
- Single source of truth: runtime-native state lives in the Electron main process.
- Capability-first: renderer can query support before attempting native behavior.
- Versioned contracts: requests and responses are explicit and evolve predictably.
- Resilience: every response uses a consistent result envelope with stable error codes.
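The versioned-contract and result-envelope principles above can be illustrated with a small sketch. Everything here is an assumption for illustration: the type names, the error codes, and `createDispatcher` are hypothetical and are not taken from `src/native/contracts.ts`.

```typescript
// Hypothetical contract shapes; the real ones live in src/native/contracts.ts.
type NativeErrorCode = "UNSUPPORTED" | "INVALID_REQUEST" | "INTERNAL";

interface NativeRequest {
  version: number; // contract version the caller was built against
  domain: string; // e.g. "cursor"
  action: string; // e.g. "getCapabilities"
  payload: unknown;
}

// Every response uses the same envelope, whether it succeeded or failed.
type NativeResult =
  | { ok: true; data: unknown }
  | { ok: false; error: { code: NativeErrorCode; message: string } };

type NativeHandler = (payload: unknown) => unknown;

// Builds one dispatch function, standing in for a single IPC channel handler.
function createDispatcher(handlers: Record<string, NativeHandler>) {
  return function dispatch(request: NativeRequest): NativeResult {
    if (request.version !== 1) {
      return {
        ok: false,
        error: { code: "INVALID_REQUEST", message: "Unsupported contract version" },
      };
    }
    const key = `${request.domain}.${request.action}`;
    const handler = handlers[key];
    if (!handler) {
      // Unknown capability: a stable code instead of a thrown error.
      return { ok: false, error: { code: "UNSUPPORTED", message: `No handler for ${key}` } };
    }
    try {
      return { ok: true, data: handler(request.payload) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { ok: false, error: { code: "INTERNAL", message } };
    }
  };
}
```

With a shape like this, the renderer can branch on `ok` and on `error.code` rather than parsing error strings, which is what makes capability checks cheap.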
## Current rollout
This repository now contains the initial scaffold:
- shared contracts in `src/native/contracts.ts`
- renderer SDK in `src/native/client.ts`
- main-process state store in `electron/native-bridge/store.ts`
- cursor telemetry adapter in `electron/native-bridge/cursor/telemetryCursorAdapter.ts`
- domain services in `electron/native-bridge/services/*`
- unified handler registration in `electron/ipc/nativeBridge.ts`
The legacy `window.electronAPI` surface still exists for backward compatibility. New native-facing features should prefer the unified bridge client.
@@ -0,0 +1,935 @@
# Implementation Workflow for Auto User Guide Generation
The goal of this feature is to turn OpenScreen from a screen recording tool into a tool that automatically generates software user guides. The user enables Guide Mode and records their actions as usual. The system logs the time of each click or hotkey, extracts frames from the video after recording ends, runs local OCR to read the text in the UI, and then uses AI to draft a step-by-step guide.
This document is written so that coding can start immediately: it covers the architecture, schema, files to add or modify, task order, test criteria, and the MVP definition.
## Current MVP Status
- Guide Mode is available in the HUD and records clicks/markers to `.guide.json`.
- The editor has a GuidePanel to run: prepare events, capture snapshots, OCR, generate draft, export Markdown/HTML.
- A local deterministic draft is available for testing without a DeepSeek key.
- DeepSeek is called when the `DeepSeek` provider is selected and `DEEPSEEK_API_KEY` is set.
- By default, local OCR calls `OPENSCREEN_GUIDE_OCR_URL` or `http://127.0.0.1:8866/ocr`.
- Current verification: targeted guide tests pass, `npm test` passes, `npm run build-vite` passes, `npm run i18n:check` passes.
## Product Goals
User flow:
1. Enable Guide Mode.
2. Record the screen of the software to be documented.
3. During recording, the system automatically logs the timestamp of each mouse click.
4. The user can press a hotkey or marker button to mark a step manually.
5. After recording stops, the system extracts screenshots from the video at those timestamps.
6. Local OCR reads the text in the UI screenshots.
7. The system maps each click position to the nearest text/control.
8. The AI agent generates a step-by-step document.
9. The user reviews and edits the content, then exports Markdown/HTML.
Example output:
```md
# How to export a report
## Step 1: Open settings
Click the **Settings** button in the left navigation bar.
## Step 2: Choose Export
On the Settings screen, select **Export report**.
```
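Output in the shape of the example above could be produced by a very small exporter. The following is a minimal sketch, assuming a hypothetical `GuideStepDraft` shape and function name; it is not the actual `markdownExporter.ts`.

```typescript
// Hypothetical step shape for illustration; the real schema may differ.
interface GuideStepDraft {
  title: string; // short step heading, e.g. "Open settings"
  body: string; // instruction text, may contain Markdown such as **Settings**
}

// Renders a guide as Markdown: one H1 title, then one H2 section per step.
// stepLabel is a parameter so the heading word can be localized.
function exportGuideMarkdown(
  title: string,
  steps: GuideStepDraft[],
  stepLabel = "Step",
): string {
  const sections = steps.map(
    (step, index) => `## ${stepLabel} ${index + 1}: ${step.title}\n${step.body}`,
  );
  // Blank line between sections, single trailing newline at the end.
  return [`# ${title}`, ...sections].join("\n\n") + "\n";
}
```

Keeping the exporter a pure function of the step list means `guide.md` can be regenerated at any time from stored guide data.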
## MVP Scope
The MVP must:
- Allow Guide Mode to be toggled before recording.
- Reuse the existing recorder rather than writing a new one.
- Reuse the existing `.cursor.json` to get click timestamps.
- Add markers via a hotkey or a HUD button.
- Create a separate `.guide.json` sidecar for the guide.
- Extract screenshots after recording ends, from the saved video.
- Run local OCR through a PaddleOCR service.
- Build step candidates from click position + OCR blocks.
- Call DeepSeek with text metadata only, without sending images by default.
- Provide a review panel in the editor.
- Export Markdown and HTML.
Out of scope for the MVP:
- No real-time screenshot capture during recording until the necessary benchmarks exist.
- No sending raw screenshots to cloud AI by default.
- No changes to the `.cursor.json` schema unless strictly required.
- No full UI automation engine.
- No PDF/DOCX export yet.
- No bundling of the OCR runtime into the packaged app yet.
## Existing Code to Reuse
What the codebase already provides:
- Recording orchestration: `src/hooks/useScreenRecorder.ts`
- Launch/HUD UI: `src/components/launch/LaunchWindow.tsx`
- Source selection: `src/components/launch/SourceSelector.tsx`
- Main editor: `src/components/video-editor/VideoEditor.tsx`
- Project/session persistence: `src/components/video-editor/projectPersistence.ts`
- Cursor contracts: `src/native/contracts.ts`
- Cursor data hook: `src/native/hooks/useCursorRecordingData.ts`
- IPC main handlers: `electron/ipc/handlers.ts`
- Native bridge: `electron/ipc/nativeBridge.ts`
- Cursor service: `electron/native-bridge/services/cursorService.ts`
- Windows cursor recording: `electron/native-bridge/cursor/recording/windowsNativeRecordingSession.ts`
- macOS cursor recording: `electron/native-bridge/cursor/recording/macNativeCursorRecordingSession.ts`
- Frame/export primitives: `src/lib/exporter/frameRenderer.ts`
Technical assessment:
- Native cursor recording on Windows/macOS already captures click data.
- Cursor samples can currently carry `interactionType: "click" | "mouseup" | "move"`.
- The editor already uses click timestamps to render click effects.
- Because the cursor schema is used in many places, the MVP should create a separate `.guide.json` instead of extending `.cursor.json`.
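As a rough illustration of how click timestamps could be lifted out of cursor samples without touching the cursor schema, here is a minimal sketch. The `CursorSampleLike` and `ClickCandidate` shapes, the function name, and the 250 ms de-duplication window are all assumptions for illustration, not the actual `eventBuilder.ts`.

```typescript
// Hypothetical, reduced view of a cursor sample; the real type lives in
// src/native/contracts.ts and may carry more fields.
interface CursorSampleLike {
  timeMs: number;
  x: number;
  y: number;
  interactionType?: "click" | "mouseup" | "move";
}

// Hypothetical subset of a guide event, enough to seed a .guide.json sidecar.
interface ClickCandidate {
  id: string;
  kind: "click";
  source: "cursor-recording";
  timeMs: number;
  x: number;
  y: number;
}

// Keeps only click samples, ordered by time, and drops clicks that follow the
// previously kept click by less than minGapMs (an assumed heuristic so that a
// double-click yields one step).
function buildClickCandidates(samples: CursorSampleLike[], minGapMs = 250): ClickCandidate[] {
  const clicks = samples
    .filter((sample) => sample.interactionType === "click")
    .sort((a, b) => a.timeMs - b.timeMs);
  const candidates: ClickCandidate[] = [];
  for (const sample of clicks) {
    const previous = candidates[candidates.length - 1];
    if (previous && sample.timeMs - previous.timeMs < minGapMs) continue;
    candidates.push({
      id: `click-${String(candidates.length + 1).padStart(3, "0")}`,
      kind: "click",
      source: "cursor-recording",
      timeMs: sample.timeMs,
      x: sample.x,
      y: sample.y,
    });
  }
  return candidates;
}
```

The function only reads cursor samples and returns new objects, so the cursor data stays untouched.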
## Overall Architecture
```mermaid
flowchart TD
A["User enables Guide Mode"] --> B["Record video with the existing recorder"]
B --> C["Cursor recorder logs click timestamps"]
B --> D["Hotkey/HUD marker logs manual events"]
C --> E["Stop recording"]
D --> E
E --> F["Guide assembler creates .guide.json"]
F --> G["Snapshot extractor seeks the video and exports PNGs"]
G --> H["Local PaddleOCR reads text + bounding boxes"]
H --> I["Target mapper maps clicks to OCR text/control"]
I --> J["DeepSeek/local LLM writes the guide draft"]
J --> K["GuidePanel lets the user review/edit"]
K --> L["Export Markdown/HTML"]
```
Key decisions:
- Real-time recording only logs events/timestamps and does no OCR/AI processing.
- Screenshots are extracted from the video after recording, to avoid affecting recorder performance.
- OCR runs local-first.
- DeepSeek receives only text metadata unless the user opts in to sending images.
- Guide data lives next to the recording artifact.
## File Cần Thêm
```text
src/guide/
  contracts.ts
  eventBuilder.ts
  targetMapper.ts
  promptBuilder.ts
  generatedGuideSchema.ts
  snapshot/
    extractGuideSnapshots.ts
  export/
    markdownExporter.ts
    htmlExporter.ts
  __tests__/
    eventBuilder.test.ts
    targetMapper.test.ts
    promptBuilder.test.ts
    markdownExporter.test.ts
src/components/video-editor/guide/
  GuidePanel.tsx
  GuideStepList.tsx
  GuideStepEditor.tsx
  GuideSnapshotPreview.tsx
electron/guide/
  guideStore.ts
  guidePaths.ts
  guideIpc.ts
  ocr/
    paddleOcrClient.ts
  ai/
    deepseekGuideClient.ts
```
Existing files that will likely need changes:
- `src/hooks/useScreenRecorder.ts`
- `src/components/launch/LaunchWindow.tsx`
- `src/components/video-editor/VideoEditor.tsx`
- `electron/ipc/handlers.ts`
- `electron/preload.ts`
- the type declaration file for `window.electronAPI`
- `package.json`, if a test script or a small dependency is added
## Output Artifacts
For the video `recording-123.mp4`, the system creates:
```text
recording-123.mp4
recording-123.cursor.json
recording-123.guide.json
recording-123-guide/
  step-001.png
  step-002.png
  ocr.json
  guide.md
  guide.html
```
Rules:
- `.cursor.json` remains the original cursor data.
- `.guide.json` is the source of truth for the guide workflow.
- The `recording-123-guide/` folder holds files derived from the guide.
- `guide.md` and `guide.html` can be regenerated from `.guide.json`.
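The artifact layout above can be derived from the video path alone. The sketch below is one possible shape for `electron/guide/guidePaths.ts`; the function and interface names (`deriveGuidePaths`, `snapshotFileName`, `GuideArtifactPaths`) are assumptions for illustration, not existing code.

```typescript
// Hypothetical helpers for electron/guide/guidePaths.ts; all names are illustrative.
export interface GuideArtifactPaths {
  guidePath: string;
  outputDir: string;
  ocrPath: string;
  markdownPath: string;
  htmlPath: string;
}

export function deriveGuidePaths(videoPath: string): GuideArtifactPaths {
  // Drop only the final extension so "recording-123.mp4" becomes "recording-123".
  const base = videoPath.replace(/\.[^./\\]+$/, "");
  // Reuse the separator style of the input path (Windows vs POSIX).
  const sep = videoPath.includes("\\") ? "\\" : "/";
  const outputDir = `${base}-guide`;
  return {
    guidePath: `${base}.guide.json`,
    outputDir,
    ocrPath: `${outputDir}${sep}ocr.json`,
    markdownPath: `${outputDir}${sep}guide.md`,
    htmlPath: `${outputDir}${sep}guide.html`,
  };
}

// Zero-padded, 1-based snapshot name: index 0 maps to "step-001.png".
export function snapshotFileName(index: number): string {
  return `step-${String(index + 1).padStart(3, "0")}.png`;
}
```

Keeping path derivation in one pure function makes the "correct path" store test in Phase 1 straightforward to write.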
## Main Contracts
Create `src/guide/contracts.ts`.
```ts
export type GuideEventKind = "click" | "hotkey" | "manual";
export type GuideEventSource =
| "cursor-recording"
| "guide-hotkey"
| "review-ui";
export interface GuideEvent {
id: string;
recordingId: string;
kind: GuideEventKind;
source: GuideEventSource;
timeMs: number;
x?: number;
y?: number;
normalizedX?: number;
normalizedY?: number;
button?: "left" | "right" | "middle" | "unknown";
label?: string;
screenshotOffsetMs?: number;
createdAt: string;
}
export interface GuideSnapshot {
id: string;
eventId: string;
timeMs: number;
offsetMs: number;
path: string;
width: number;
height: number;
}
export interface OcrBlock {
id: string;
snapshotId: string;
text: string;
confidence: number;
box: {
x: number;
y: number;
width: number;
height: number;
};
}
export interface GuideStepCandidate {
id: string;
eventId: string;
snapshotId?: string;
timeMs: number;
action: "click" | "choose" | "type" | "wait" | "manual";
targetText?: string;
targetRole?: "button" | "menu" | "tab" | "field" | "link" | "unknown";
nearbyText: string[];
confidence: number;
}
export interface GeneratedGuideStep {
id: string;
order: number;
title: string;
instruction: string;
screenshotPath?: string;
sourceCandidateId?: string;
}
export interface GeneratedGuide {
title: string;
summary?: string;
steps: GeneratedGuideStep[];
}
export interface GuideSession {
schemaVersion: 1;
recordingId: string;
videoPath: string;
cursorPath?: string;
guidePath: string;
outputDir: string;
status:
| "recording"
| "events-ready"
| "snapshots-ready"
| "ocr-ready"
| "draft-ready"
| "reviewed";
events: GuideEvent[];
snapshots: GuideSnapshot[];
ocrBlocks: OcrBlock[];
candidates: GuideStepCandidate[];
generatedGuide?: GeneratedGuide;
createdAt: string;
updatedAt: string;
}
```
Data rules:
- `timeMs` is always measured on the final video timeline.
- `x/y` are pixel coordinates when available.
- `normalizedX/Y` guard against drift when the video is scaled.
- `screenshotOffsetMs` defaults to `500`: the frame is taken 0.5 seconds after the click to capture the UI state after the action.
- AI output is only a draft; the user's edits are the final content.
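A minimal sketch of the coordinate normalization rule, assuming normalized values are fractions in `[0, 1]` of the video width or height. The plan does not define the helper's body, so the clamping and the `undefined` handling below are assumptions.

```typescript
// Assumed behavior: returns a 0..1 fraction, or undefined when the input cannot be normalized.
export function normalize(
  value: number | undefined,
  size: number | undefined,
): number | undefined {
  if (value === undefined || size === undefined) return undefined;
  if (!Number.isFinite(value) || !(size > 0)) return undefined;
  // Clamp so that off-screen samples still map into the frame.
  return Math.min(1, Math.max(0, value / size));
}

// Maps a normalized coordinate back to pixels for a (possibly rescaled) frame.
export function denormalize(fraction: number, size: number): number {
  return Math.round(fraction * size);
}
```

For example, a click at `x = 960` in a 1920-pixel-wide recording normalizes to `0.5`, which maps back to `640` on a 1280-pixel-wide snapshot.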
## IPC To Add
The MVP uses app-level Electron IPC; it does not need to go through the native bridge because this is an application-level workflow.
Proposed preload API:
```ts
// Shape of the API exposed by the preload script as `window.electronAPI.guide`.
// The interface name is a placeholder.
export interface GuideElectronApi {
  startSession(recordingId: string): Promise<GuideSession>;
  addMarker(input: AddGuideMarkerInput): Promise<GuideEvent>;
  finalizeEvents(input: FinalizeGuideEventsInput): Promise<GuideSession>;
  writeSnapshot(input: WriteGuideSnapshotInput): Promise<GuideSnapshot>;
  runOcr(input: RunGuideOcrInput): Promise<GuideSession>;
  generateDraft(input: GenerateGuideDraftInput): Promise<GuideSession>;
  saveGuide(input: SaveGuideInput): Promise<GuideSession>;
  exportMarkdown(input: ExportGuideInput): Promise<{ path: string }>;
  exportHtml(input: ExportGuideInput): Promise<{ path: string }>;
}
```
Input types:
```ts
export interface AddGuideMarkerInput {
recordingId: string;
timeMs: number;
kind: "hotkey" | "manual";
label?: string;
}
export interface FinalizeGuideEventsInput {
recordingId: string;
videoPath: string;
cursorPath?: string;
}
export interface WriteGuideSnapshotInput {
recordingId: string;
eventId: string;
timeMs: number;
offsetMs: number;
pngBytes: ArrayBuffer;
width: number;
height: number;
}
export interface RunGuideOcrInput {
recordingId: string;
snapshotIds?: string[];
}
export interface GenerateGuideDraftInput {
recordingId: string;
language: "vi" | "en";
provider: "deepseek" | "local";
}
export interface SaveGuideInput {
recordingId: string;
generatedGuide: GeneratedGuide;
}
export interface ExportGuideInput {
recordingId: string;
}
```
## Phase 1: Contracts, Store, IPC
Goal: build the `.guide.json` storage scaffolding without touching the recorder.
Coding tasks:
1. Create `src/guide/contracts.ts`.
2. Create `electron/guide/guidePaths.ts`.
3. Create `electron/guide/guideStore.ts`.
4. Create `electron/guide/guideIpc.ts`.
5. Register the guide IPC in `electron/ipc/handlers.ts`.
6. Expose the API in `electron/preload.ts`.
7. Add types for `window.electronAPI.guide`.
Technical requirements:
- Write files atomically: write a temp file, then rename it.
- Validate `schemaVersion`.
- Do not throw raw errors to the renderer; return stable error codes.
- Do not require AI/OCR in this phase.
Acceptance:
- A fake guide session can be created through IPC.
- `.guide.json` read/write round-trips without data loss.
- Input missing `recordingId` or `videoPath` is rejected with a clear error.
Test:
- `guideStore` builds the correct paths.
- `guideStore` reads a file with an invalid schema and returns an error.
- The IPC handler rejects input with missing fields.
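The write-temp-then-rename and schema-validation requirements could look roughly like this. It is a sketch under assumptions: the function names and result shape are hypothetical, the calls are synchronous for brevity (a real store in Electron main would more likely use the async `fs` API), and the temp-file suffix is arbitrary.

```typescript
import * as fs from "node:fs";

// Hypothetical result shape; the codes reuse names from the plan's error code list.
export type GuideReadResult =
  | { ok: true; session: Record<string, unknown> }
  | { ok: false; code: "guide-session-not-found" | "guide-invalid-schema" };

export function writeGuideFileAtomic(guidePath: string, session: unknown): void {
  // Write next to the target so the rename stays on the same volume.
  const tempPath = `${guidePath}.${Date.now()}.tmp`;
  fs.writeFileSync(tempPath, JSON.stringify(session, null, 2), "utf8");
  // Readers see either the old file or the complete new one, never a partial write.
  fs.renameSync(tempPath, guidePath);
}

export function readGuideFile(guidePath: string): GuideReadResult {
  let raw: string;
  try {
    raw = fs.readFileSync(guidePath, "utf8");
  } catch {
    return { ok: false, code: "guide-session-not-found" };
  }
  try {
    const parsed: unknown = JSON.parse(raw);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as Record<string, unknown>).schemaVersion === 1
    ) {
      return { ok: true, session: parsed as Record<string, unknown> };
    }
  } catch {
    // Malformed JSON falls through to the schema error below.
  }
  return { ok: false, code: "guide-invalid-schema" };
}
```

Returning a tagged result instead of throwing keeps the store aligned with the requirement that the renderer only ever sees stable error codes.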
## Phase 2: Build Events From Cursor Clicks
Goal: derive click events from the existing `.cursor.json`.
Coding tasks:
1. Create `src/guide/eventBuilder.ts`.
2. Add the `buildGuideEventsFromCursor` function.
3. Filter samples with `interactionType === "click"`.
4. Convert them to `GuideEvent`.
5. De-duplicate clicks within a `250ms` window.
6. Sort by `timeMs`.
7. Merge with manual markers, if any.
Pseudo-code:
```ts
export function buildGuideEventsFromCursor(input: {
  recordingId: string;
  samples: CursorRecordingSample[];
  videoWidth?: number;
  videoHeight?: number;
}): GuideEvent[] {
  const events = input.samples
    .filter((sample) => sample.interactionType === "click")
    .map((sample) => ({
      id: createGuideEventId(input.recordingId, sample.timeMs),
      recordingId: input.recordingId,
      kind: "click" as const,
      source: "cursor-recording" as const,
      timeMs: sample.timeMs,
      x: sample.cx,
      y: sample.cy,
      normalizedX: normalize(sample.cx, input.videoWidth),
      normalizedY: normalize(sample.cy, input.videoHeight),
      button: "left" as const,
      screenshotOffsetMs: 500,
      createdAt: new Date().toISOString(),
    }));
  return sortGuideEvents(dedupeGuideEvents(events));
}
```
Acceptance:
- 5 click samples produce 5 guide events.
- `move` and `mouseup` samples do not create steps.
- A double click or click bounce inside the dedupe window does not create extra steps.
- Manual markers still work when there are no cursor clicks.
Test:
- convert a click sample.
- skip move/mouseup.
- dedupe by time.
- sort in the correct order.
- handle samples with missing coordinates.
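The pseudo-code calls `sortGuideEvents` and `dedupeGuideEvents` without defining them. One possible reading, assuming "dedupe" means dropping any event that follows the last kept event by less than the window, is sketched here; the `TimedEvent` type is a stand-in for `GuideEvent`.

```typescript
// Minimal stand-in for GuideEvent; only the fields the helpers need.
interface TimedEvent {
  id: string;
  timeMs: number;
}

export function sortGuideEvents<T extends TimedEvent>(events: T[]): T[] {
  // Copy first so the caller's array is not mutated.
  return [...events].sort((a, b) => a.timeMs - b.timeMs);
}

export function dedupeGuideEvents<T extends TimedEvent>(
  events: T[],
  windowMs = 250,
): T[] {
  const kept: T[] = [];
  for (const event of sortGuideEvents(events)) {
    const last = kept[kept.length - 1];
    // Keep the event only if it is at least windowMs after the last kept one.
    if (last === undefined || event.timeMs - last.timeMs >= windowMs) {
      kept.push(event);
    }
  }
  return kept;
}
```

With this rule a double click at 0 ms and 100 ms collapses into one step, while clicks at 0 ms and 300 ms stay separate.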
## Phase 3: Guide Mode UI And Manual Markers
Goal: the user can turn on Guide Mode and mark steps manually.
Coding tasks:
1. Add a Guide Mode toggle to `LaunchWindow.tsx`.
2. Pass the guide state into the recording flow in `useScreenRecorder.ts`.
3. When recording starts with Guide Mode on, call `guide.startSession(recordingId)`.
4. Add a marker button to the HUD.
5. Add a global hotkey in Electron main, for example `CommandOrControl+Shift+G`.
6. When the marker button or hotkey is pressed, call `guide.addMarker`.
7. When recording stops, call `guide.finalizeEvents`.
Notes:
- The global hotkey must live in Electron main because the app being recorded may hold focus.
- If hotkey registration fails, the UI still offers the marker button.
- Behavior must not change when Guide Mode is off.
Acceptance:
- Guide Mode off: recording/editing/export work as before.
- Guide Mode on: stopping the recording creates `.guide.json`.
- The hotkey creates an event with the correct timestamp.
- Cancelling a recording leaves no stray guide artifacts.
## Phase 4: Snapshot Extraction
Goal: extract a PNG for each event after recording finishes.
MVP decisions:
- No realtime capture during recording.
- Use the saved video and seek to the required timestamp.
- Run in the renderer/editor with a hidden `<video>` + `<canvas>`.
- Persist PNGs through IPC.
Coding tasks:
1. Create `src/guide/snapshot/extractGuideSnapshots.ts`.
2. Accept a `GuideSession` + `videoPath`.
3. For each event, compute the timestamp `event.timeMs + screenshotOffsetMs`.
4. Clamp the timestamp to the video duration.
5. Seek the hidden video to the timestamp.
6. Draw the frame onto the canvas.
7. Convert the canvas to PNG bytes.
8. Call `guide.writeSnapshot`.
9. Update `.guide.json` with the snapshot list.
Acceptance:
- Every event has a `step-xxx.png` image.
- If one event's snapshot fails, the other events still run.
- Images are saved in `recording-123-guide/`.
- The UI reports a recoverable error and does not crash the editor.
Test:
- clamp the timestamp.
- file names follow the step order.
- handle seek timeout.
- do not abort the whole batch when one frame fails.
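The timestamp computation and clamp can be kept as a pure function so the clamp test does not need a real video. This is a sketch: the end margin (staying slightly before the end of the video so the seek lands on a decodable frame) is an assumption, not something the plan specifies.

```typescript
// Assumed defaults: 500 ms offset (from the data rules) and a 50 ms end margin (illustrative).
export function snapshotTimeMs(
  eventTimeMs: number,
  durationMs: number,
  offsetMs = 500,
  endMarginMs = 50,
): number {
  const latest = Math.max(0, durationMs - endMarginMs);
  return Math.min(Math.max(0, eventTimeMs + offsetMs), latest);
}

// HTMLMediaElement.currentTime is expressed in seconds, so convert before seeking.
export function toVideoCurrentTime(timeMs: number): number {
  return timeMs / 1000;
}
```

A click 100 ms before the end of a 10-second video would otherwise target 10.4 s; the clamp pulls it back to 9.95 s.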
## Phase 5: Local OCR
Goal: read the on-screen text of the software UI from the screenshots.
Recommendations:
- Use PaddleOCR as the primary OCR.
- Tesseract should only be a simple fallback.
- Local VLMs such as Gemma 3 4B/MiniCPM/Qwen-VL are only for hard icon/no-text cases, not for primary OCR.
MVP architecture:
- Run PaddleOCR as a local HTTP service at `127.0.0.1:8866`.
- Electron main calls the OCR service.
- The renderer does not call OCR directly.
Proposed local OCR API:
```http
GET /health
POST /ocr
Content-Type: application/json
{
"imagePath": "D:\\Code\\OpenScreen\\recording-123-guide\\step-001.png",
"language": "vi,en"
}
```
Response:
```json
{
"blocks": [
{
"text": "Settings",
"confidence": 0.97,
"box": { "x": 120, "y": 80, "width": 90, "height": 24 }
}
]
}
```
Coding tasks:
1. Create `electron/guide/ocr/paddleOcrClient.ts`.
2. Add a health check.
3. Call OCR for each snapshot.
4. Convert the output to `OcrBlock`.
5. Write `ocrBlocks` to `.guide.json`.
6. Write the combined result to `recording-123-guide/ocr.json`.
Proposed config:
```ts
export interface GuideOcrConfig {
provider: "paddleocr";
baseUrl: string; // default http://127.0.0.1:8866
language: string; // default vi,en
timeoutMs: number; // default 30000
}
```
Acceptance:
- When the OCR service is offline, the UI reports a clear error.
- An OCR failure does not delete snapshots.
- OCR results include text, confidence, and bounding box.
- The guide can still be exported manually if OCR does not run.
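The "convert output to `OcrBlock`" task could be sketched as below, assuming the service response has the `blocks` shape shown in the proposed API. The ID format and the choice to silently skip malformed entries are assumptions.

```typescript
export interface OcrBlock {
  id: string;
  snapshotId: string;
  text: string;
  confidence: number;
  box: { x: number; y: number; width: number; height: number };
}

function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value);
}

// Converts an untrusted service response into OcrBlock[]; malformed entries are skipped.
export function toOcrBlocks(snapshotId: string, response: unknown): OcrBlock[] {
  const blocks = (response as { blocks?: unknown } | null)?.blocks;
  if (!Array.isArray(blocks)) return [];
  const result: OcrBlock[] = [];
  blocks.forEach((entry: unknown, index: number) => {
    if (typeof entry !== "object" || entry === null) return;
    const candidate = entry as Record<string, unknown>;
    const text = candidate.text;
    const confidence = candidate.confidence;
    const rawBox = candidate.box;
    if (typeof text !== "string" || !isFiniteNumber(confidence)) return;
    if (typeof rawBox !== "object" || rawBox === null) return;
    const { x, y, width, height } = rawBox as Record<string, unknown>;
    if (!isFiniteNumber(x) || !isFiniteNumber(y)) return;
    if (!isFiniteNumber(width) || !isFiniteNumber(height)) return;
    result.push({
      // Hypothetical ID scheme: snapshot ID plus 1-based block position.
      id: `${snapshotId}-ocr-${index + 1}`,
      snapshotId,
      text,
      confidence,
      box: { x, y, width, height },
    });
  });
  return result;
}
```

Validating at this boundary means the target mapper can rely on every block having numeric geometry.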
## Phase 6: Target Mapper
Goal: determine which button/menu/field the user clicked, based on the click coordinates and OCR.
Coding tasks:
1. Create `src/guide/targetMapper.ts`.
2. For each `GuideEvent`, get the matching snapshot.
3. Get that snapshot's OCR blocks.
4. Score each OCR block.
5. Pick the best target.
6. Produce a `GuideStepCandidate`.
Proposed scoring:
- `+100` if the click falls inside the OCR box.
- Higher score the closer the box center is to the click.
- Bonus if the text is short and looks like a button/menu label.
- Penalty if confidence is low.
- If no block is good enough, set `targetRole: "unknown"`.
Role heuristic:
- `button`: click on/near text that looks like an action label.
- `menu`: text sits in a vertical list.
- `tab`: text sits in a horizontal row near the top of the UI.
- `field`: click on an area that looks like an input.
- `unknown`: not confident enough.
Acceptance:
- A click directly on a text button maps to the correct target text.
- A click near a label maps to the nearest OCR block.
- A click on an icon/no-text area creates a low-confidence candidate for user review.
Test:
- click inside box.
- nearest box.
- low-confidence penalty.
- no OCR fallback.
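A sketch of the scoring rules, to make the test cases concrete. Only the `+100` inside-box bonus comes from the plan; the distance falloff, the short-text bonus, the confidence threshold, and the minimum score are illustrative numbers that would need tuning on real recordings.

```typescript
export interface ScorableBlock {
  text: string;
  confidence: number;
  box: { x: number; y: number; width: number; height: number };
}

export function scoreBlock(clickX: number, clickY: number, block: ScorableBlock): number {
  const { x, y, width, height } = block.box;
  let score = 0;
  const inside =
    clickX >= x && clickX <= x + width && clickY >= y && clickY <= y + height;
  if (inside) score += 100; // from the plan
  // Assumed falloff: up to 50 points, losing 1 point per 4 px from the box center.
  const distance = Math.hypot(clickX - (x + width / 2), clickY - (y + height / 2));
  score += Math.max(0, 50 - distance / 4);
  // Assumed bonus for short, label-like text.
  if (block.text.trim().length <= 24) score += 10;
  // Assumed penalty for low OCR confidence.
  if (block.confidence < 0.6) score -= 30;
  return score;
}

// Returns the best block, or undefined when nothing reaches minScore (maps to targetRole "unknown").
export function pickTarget<T extends ScorableBlock>(
  clickX: number,
  clickY: number,
  blocks: T[],
  minScore = 40,
): T | undefined {
  let best: T | undefined;
  let bestScore = -Infinity;
  for (const block of blocks) {
    const score = scoreBlock(clickX, clickY, block);
    if (score > bestScore) {
      best = block;
      bestScore = score;
    }
  }
  return bestScore >= minScore ? best : undefined;
}
```

Returning `undefined` rather than a weak match keeps icon-only clicks visible as low-confidence candidates for user review.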
## Phase 7: AI Draft Generation
Goal: generate a draft guide from candidate metadata.
MVP providers:
- DeepSeek API for cloud text generation.
- A local LLM can be added later through the same prompt contract.
- Do not send images to DeepSeek by default.
Coding tasks:
1. Create `src/guide/promptBuilder.ts`.
2. Create `electron/guide/ai/deepseekGuideClient.ts`.
3. Read the API key in Electron main from env/config.
4. Build the prompt from candidates + nearby OCR text.
5. Require JSON output.
6. Validate the output.
7. Write `generatedGuide` to `.guide.json`.
Env:
```powershell
$env:DEEPSEEK_API_KEY="..."
$env:DEEPSEEK_BASE_URL="https://api.deepseek.com"
$env:DEEPSEEK_MODEL="deepseek-v4-flash"
```
Prompt input:
```json
{
"language": "vi",
"softwareContext": {
"recordingName": "recording-123",
"userGoal": "Tạo báo cáo"
},
"steps": [
{
"order": 1,
"eventKind": "click",
"targetText": "Settings",
"targetRole": "button",
"nearbyText": ["Home", "Settings", "Account"],
"confidence": 0.91
}
]
}
```
Expected AI output:
```json
{
"title": "Hướng dẫn thao tác",
"summary": "Tài liệu này mô tả các bước thực hiện thao tác đã ghi hình.",
"steps": [
{
"order": 1,
"title": "Mở phần cài đặt",
"instruction": "Nhấn nút Settings ở thanh điều hướng bên trái."
}
]
}
```
Acceptance:
- A missing API key produces a clear error in the UI.
- Invalid JSON from the AI is rejected and can be retried.
- Output is validated before it is saved.
- Vietnamese output can be generated.
Test:
- promptBuilder does not put raw images into the prompt.
- the parser rejects JSON that does not match the schema.
- the DeepSeek client handles timeout/401/rate limit.
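The "validate output" step can be sketched as a hand-written parser that mirrors the expected AI output above. The plan lists a `generatedGuideSchema.ts` file but does not say how it validates, so this dependency-free approach and its names are assumptions.

```typescript
export interface DraftStep {
  order: number;
  title: string;
  instruction: string;
}

export interface DraftGuide {
  title: string;
  summary?: string;
  steps: DraftStep[];
}

export type DraftParseResult =
  | { ok: true; guide: DraftGuide }
  | { ok: false; code: "guide-ai-invalid-output" };

const INVALID: DraftParseResult = { ok: false, code: "guide-ai-invalid-output" };

export function parseGeneratedGuide(raw: string): DraftParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return INVALID;
  }
  if (typeof parsed !== "object" || parsed === null) return INVALID;
  const candidate = parsed as Record<string, unknown>;
  const title = candidate.title;
  const rawSteps = candidate.steps;
  if (typeof title !== "string" || !Array.isArray(rawSteps)) return INVALID;
  if (candidate.summary !== undefined && typeof candidate.summary !== "string") {
    return INVALID;
  }
  const steps: DraftStep[] = [];
  for (const entry of rawSteps as unknown[]) {
    if (typeof entry !== "object" || entry === null) return INVALID;
    const step = entry as Record<string, unknown>;
    const order = step.order;
    const stepTitle = step.title;
    const instruction = step.instruction;
    if (typeof order !== "number" || !Number.isInteger(order)) return INVALID;
    if (typeof stepTitle !== "string" || typeof instruction !== "string") return INVALID;
    steps.push({ order, title: stepTitle, instruction });
  }
  const guide: DraftGuide = { title, steps };
  if (typeof candidate.summary === "string") guide.summary = candidate.summary;
  return { ok: true, guide };
}
```

Rejecting the whole response on the first bad step keeps the retry path simple: the UI either gets a complete draft or a single `guide-ai-invalid-output` code.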
## Phase 8: GuidePanel Review UI
Goal: the user can edit the guide before exporting.
Coding tasks:
1. Create `src/components/video-editor/guide/GuidePanel.tsx`.
2. Create `GuideStepList.tsx`.
3. Create `GuideStepEditor.tsx`.
4. Create `GuideSnapshotPreview.tsx`.
5. Mount the panel in `VideoEditor.tsx`.
6. Load `.guide.json` when opening a video that has a guide sidecar.
7. Add actions:
- Generate snapshots.
- Run OCR.
- Generate AI draft.
- Save edits.
- Export Markdown.
- Export HTML.
UX:
- AI output is a draft and does not lock the content.
- The user edits the title/instruction of each step.
- The user deletes noisy steps.
- The user merges adjacent steps if needed.
- Confidence is shown small so it does not clutter the UI.
- Guide failures do not affect video editing.
Acceptance:
- The user can edit a step and save it.
- The user can delete a step.
- Regenerating requires confirmation when manual edits exist.
- Export uses the edited content, not the raw AI output.
## Phase 9: Export Markdown/HTML
Goal: produce documents that are usable right away.
Coding tasks:
1. Create `src/guide/export/markdownExporter.ts`.
2. Create `src/guide/export/htmlExporter.ts`.
3. Call the exporters from Electron IPC.
4. Write to `recording-123-guide/guide.md`.
5. Write to `recording-123-guide/guide.html`.
6. Use relative screenshot links.
Markdown format:
```md
# Hướng dẫn thao tác
Tài liệu này mô tả các bước thực hiện thao tác đã ghi hình.
## Bước 1: Mở phần cài đặt
Nhấn nút **Settings** ở thanh điều hướng bên trái.
![Bước 1](step-001.png)
```
Acceptance:
- The Markdown opens and shows the local images.
- The HTML opens in a browser.
- Export still works when the guide is written by hand, without AI.
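A minimal sketch of the Markdown exporter that produces the format shown above. The `stepLabel` parameter (for example `"Bước"` for Vietnamese output) and the blank-line spacing between blocks are assumptions; the link uses only the screenshot file name so it stays relative to `guide.md`.

```typescript
export interface ExportStep {
  order: number;
  title: string;
  instruction: string;
  screenshotPath?: string;
}

export interface ExportGuide {
  title: string;
  summary?: string;
  steps: ExportStep[];
}

export function exportGuideMarkdown(guide: ExportGuide, stepLabel = "Step"): string {
  const lines: string[] = [`# ${guide.title}`];
  if (guide.summary) lines.push(guide.summary);
  const ordered = [...guide.steps].sort((a, b) => a.order - b.order);
  for (const step of ordered) {
    const label = `${stepLabel} ${step.order}`;
    lines.push(`## ${label}: ${step.title}`, step.instruction);
    if (step.screenshotPath) {
      // Keep only the file name: the image sits in the same folder as guide.md.
      const fileName = step.screenshotPath.split(/[\\/]/).pop() ?? step.screenshotPath;
      lines.push(`![${label}](${fileName})`);
    }
  }
  return `${lines.join("\n\n")}\n`;
}
```

Because the exporter only reads the reviewed `GeneratedGuide`, it works the same whether the steps came from AI or were typed by hand.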
## Immediate Coding Order
Work in this order to reduce risk:
1. Create `src/guide/contracts.ts`.
2. Create `electron/guide/guidePaths.ts`.
3. Create `electron/guide/guideStore.ts`.
4. Create `electron/guide/guideIpc.ts`.
5. Expose `window.electronAPI.guide`.
6. Write unit tests for the guide store.
7. Create `src/guide/eventBuilder.ts`.
8. Write unit tests for converting cursor samples to guide events.
9. Add the Guide Mode toggle to the launch UI.
10. Call `startSession` when recording starts.
11. Call `finalizeEvents` when recording stops.
12. Create the snapshot extractor in the renderer.
13. Create `paddleOcrClient`.
14. Create `targetMapper`.
15. Create `promptBuilder`.
16. Create `deepseekGuideClient`.
17. Create `GuidePanel`.
18. Create the Markdown/HTML exporters.
19. Run lint/test/build.
20. Manually test the full flow.
Suggested PR split:
- PR 1: contracts, store, IPC, unit tests.
- PR 2: cursor-click event builder, Guide Mode toggle, manual marker.
- PR 3: snapshot extraction and GuidePanel shell.
- PR 4: PaddleOCR integration and target mapping.
- PR 5: DeepSeek generation, review UI, Markdown/HTML export.
## Error Codes
Use stable error codes so the UI can handle them:
```ts
export type GuideErrorCode =
| "guide-session-not-found"
| "guide-invalid-schema"
| "guide-video-load-failed"
| "guide-snapshot-failed"
| "guide-ocr-unavailable"
| "guide-ocr-failed"
| "guide-ai-key-missing"
| "guide-ai-request-failed"
| "guide-ai-invalid-output"
| "guide-export-failed";
```
Rules:
- IPC does not throw raw provider errors to the renderer.
- An OCR failure is recoverable.
- An AI failure is recoverable.
- An export failure must leave `.guide.json` untouched.
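One way to honor these rules is to funnel every handler through a guard that converts thrown errors into a code-only failure. The sketch below is synchronous for brevity (a real IPC handler would `await` inside the `try`), and the `recoverable` flag, the set of recoverable codes, and all function names are assumptions.

```typescript
export type GuideErrorCode =
  | "guide-session-not-found"
  | "guide-invalid-schema"
  | "guide-video-load-failed"
  | "guide-snapshot-failed"
  | "guide-ocr-unavailable"
  | "guide-ocr-failed"
  | "guide-ai-key-missing"
  | "guide-ai-request-failed"
  | "guide-ai-invalid-output"
  | "guide-export-failed";

export interface GuideFailure {
  ok: false;
  code: GuideErrorCode;
  recoverable: boolean;
}

export type GuideResult<T> = { ok: true; value: T } | GuideFailure;

// Assumed mapping of the "OCR/AI failures are recoverable" rules onto codes.
const RECOVERABLE = new Set<GuideErrorCode>([
  "guide-ocr-unavailable",
  "guide-ocr-failed",
  "guide-ai-key-missing",
  "guide-ai-request-failed",
  "guide-ai-invalid-output",
]);

export function toGuideFailure(code: GuideErrorCode): GuideFailure {
  return { ok: false, code, recoverable: RECOVERABLE.has(code) };
}

// Runs fn and converts any thrown error into a stable code; the raw error never leaves this function.
export function guardGuideCall<T>(code: GuideErrorCode, fn: () => T): GuideResult<T> {
  try {
    return { ok: true, value: fn() };
  } catch {
    // The raw provider error would be logged in Electron main here, not returned.
    return toGuideFailure(code);
  }
}
```

The renderer can then branch on `code` and `recoverable` without ever parsing provider-specific messages.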
## Local Development
Baseline:
```powershell
npm install
npm run lint
npm test
npm run build-vite
```
OCR service dev:
```powershell
python -m venv .venv-ocr
.venv-ocr\Scripts\Activate.ps1
pip install paddleocr fastapi uvicorn
uvicorn local_ocr_service:app --host 127.0.0.1 --port 8866
```
DeepSeek env:
```powershell
$env:DEEPSEEK_API_KEY="..."
$env:DEEPSEEK_BASE_URL="https://api.deepseek.com"
$env:DEEPSEEK_MODEL="deepseek-v4-flash"
```
Do not commit API keys.
## Testing Matrix
Unit tests:
- `eventBuilder`: cursor sample -> guide events.
- `targetMapper`: OCR blocks -> step candidates.
- `promptBuilder`: candidates -> AI prompt.
- `markdownExporter`: generated guide -> Markdown.
- `htmlExporter`: generated guide -> HTML.
Renderer/browser tests:
- snapshot extractor seek video fixture.
- GuidePanel edit/delete/save step.
Manual integration:
1. Record with Guide Mode off and confirm existing behavior is unchanged.
2. Record with Guide Mode on and 3 clicks.
3. Check that `.guide.json` has 3 click events.
4. Generate snapshots.
5. Run OCR local.
6. Generate Vietnamese draft.
7. Edit one step.
8. Export Markdown.
9. Open the Markdown/HTML and check the local images.
10. Stop the OCR service and test the recoverable error.
11. Remove the DeepSeek key and test the recoverable error.
Commands before merging:
```powershell
npm run lint
npm test
npm run build-vite
```
If a phase does not touch the native recorder, the native helper tests do not need to run yet.
## Definition Of Done For The MVP
The MVP is done when:
- Guide Mode can be turned on/off.
- Guide Mode off does not affect existing recording.
- Click events are derived from the existing cursor telemetry.
- Hotkey/HUD markers create manual events.
- `.guide.json` is created next to the recording.
- Snapshot PNGs are extracted from the final video.
- Local PaddleOCR reads text and bounding boxes.
- The target mapper creates step candidates.
- DeepSeek produces a Vietnamese draft from text metadata.
- The user can review/edit/delete steps.
- Markdown/HTML export uses the reviewed content.
- Lint/test/build pass.
## Post-MVP Upgrades
- Export PDF/DOCX.
- Bundle the local OCR runtime into the packaged app.
- Add a local VLM fallback for icon-only controls.
- Let the user opt in to sending image crops to a remote vision model.
- Smarter step merge/dedupe for double clicks/menu navigation.
- Use a voice transcript as additional context.
- Templates per software type.
- Computer vision UI element detection beyond OCR.
@@ -0,0 +1,210 @@
# macOS Native Recorder Roadmap
OpenScreen's macOS recorder should follow the same architecture boundaries as the Windows native recorder: Electron owns session orchestration and persistence, while a platform-native helper owns capture, timing, encoding, and platform-specific permissions.
This work is intentionally scoped as a macOS-only port. Windows native capture remains owned by the WGC helper, and Linux remains on the existing Electron path.
## Goals
- Capture displays and windows through ScreenCaptureKit.
- Exclude the real system cursor during capture when using the editable OpenScreen cursor overlay.
- Preserve the current high-quality cursor overlay path in preview and export.
- Capture macOS system audio through ScreenCaptureKit on supported macOS versions.
- Capture microphone audio through the same native timing domain where the OS supports it, or through an explicit companion path until it can be moved into the helper.
- Mix system audio and microphone audio into the primary MP4 without renderer-side track assembly.
- Capture webcam video natively and compose it into the helper-owned MP4 during the native-recording migration.
- Keep screen video, audio, webcam, and cursor aligned to one native timing origin.
- Package per-architecture helper binaries with macOS builds.
## Non-Goals
- Replacing the editor/export pipeline.
- Changing Windows native capture behavior.
- Adding Linux native capture.
- Shipping a silent fallback from native macOS capture to Electron capture when the user explicitly requested a native-only feature.
## Architecture
The renderer keeps the existing recording controls. On macOS, `useScreenRecorder` should eventually send a complete recording request to Electron instead of assembling display, audio, microphone, webcam, and cursor streams in the browser.
Electron owns the native recording session:
- resolves the selected display/window source;
- resolves output paths;
- starts cursor telemetry capture when editable cursor mode is selected;
- starts the ScreenCaptureKit helper process;
- sends pause/resume/stop/cancel commands;
- writes `RecordingSession` manifests;
- reports explicit errors when a macOS-native capability is unavailable.
The helper owns macOS media capture:
- ScreenCaptureKit display/window frames;
- ScreenCaptureKit system audio where supported;
- microphone capture or helper-owned companion audio capture;
- webcam capture and initial picture-in-picture composition;
- AVFoundation/VideoToolbox encoding and muxing;
- stream timestamp normalization.
## Helper Contract V1
The helper receives a single JSON argument:
```json
{
"schemaVersion": 1,
"recordingId": 1234567890,
"source": {
"type": "display",
"sourceId": "screen:0:0",
"displayId": 1,
"windowId": null,
"bounds": { "x": 0, "y": 0, "width": 1920, "height": 1080 }
},
"video": {
"fps": 60,
"width": 1920,
"height": 1080,
"bitrate": 18000000,
"hideSystemCursor": true
},
"audio": {
"system": { "enabled": true },
"microphone": {
"enabled": true,
"deviceId": "default",
"deviceName": "MacBook Pro Microphone",
"gain": 1.4
}
},
"webcam": {
"enabled": true,
"deviceId": "default",
"deviceName": "FaceTime HD Camera",
"width": 1280,
"height": 720,
"fps": 30
},
"cursor": {
"mode": "editable-overlay"
},
"outputs": {
"screenPath": "/Users/me/Library/Application Support/openscreen/recordings/recording-123.mp4",
"manifestPath": "/Users/me/Library/Application Support/openscreen/recordings/recording-123.session.json"
}
}
```
The helper emits newline-delimited JSON events to stdout:
```json
{ "event": "ready", "schemaVersion": 1 }
{ "event": "recording-started", "timestampMs": 1234567890 }
{ "event": "warning", "code": "microphone-unavailable", "message": "..." }
{ "event": "recording-stopped", "screenPath": "..." }
{ "event": "error", "code": "screen-permission-denied", "message": "..." }
```
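On the Electron side, the newline-delimited event stream has to be reassembled from arbitrary stdout chunks. The sketch below shows one way to buffer partial lines; the class name is hypothetical, and skipping lines that are not valid JSON objects with an `event` string is an assumption about how stray output should be treated.

```typescript
export interface HelperEvent {
  event: string;
  [key: string]: unknown;
}

// Accumulates stdout chunks and yields one parsed event per complete line.
export class NdjsonEventBuffer {
  private pending = "";

  push(chunk: string): HelperEvent[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "") and waits for the next chunk.
    this.pending = lines.pop() ?? "";
    const events: HelperEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        const parsed: unknown = JSON.parse(trimmed);
        if (
          typeof parsed === "object" &&
          parsed !== null &&
          typeof (parsed as { event?: unknown }).event === "string"
        ) {
          events.push(parsed as HelperEvent);
        }
      } catch {
        // Assumption: non-JSON lines are ignored rather than treated as fatal.
      }
    }
    return events;
  }
}
```

In use, the main process would call `push` from the helper's stdout `data` handler and dispatch each returned event by its `event` field.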
## Implementation Phases
Current PR status: macOS screen/window capture routes through the ScreenCaptureKit helper when it is available so editable-cursor recordings can hide the system cursor. The helper now writes ScreenCaptureKit system audio into the primary MP4 and attempts runtime-gated native microphone capture on macOS versions that expose ScreenCaptureKit microphone output. Webcam capture is currently an Electron-recorded sidecar attached to the same recording session; native AVFoundation webcam composition remains the target end state.
### 1. Native Session Boundary
- Add a structured macOS native recording request type.
- Add a macOS helper resolver and build script placeholders.
- Keep the helper contract process-based, matching the Windows helper boundary.
- Do not route production macOS recording through this helper until the helper is available and validated.
Acceptance:
- TypeScript build passes.
- The macOS helper path and request contract are documented and testable without affecting Windows/Linux behavior.
### 2. ScreenCaptureKit Display Capture
- Implement a Swift helper using ScreenCaptureKit.
- Select display captures by `displayId`.
- Encode H.264 MP4 through AVFoundation/VideoToolbox.
- Set `showsCursor = false` when editable cursor overlay mode is selected.
Acceptance:
- Display-only recording produces a valid MP4.
- The real cursor is not baked into editable-cursor recordings.
### 3. ScreenCaptureKit Window Capture
- Resolve Electron `window:*` selections to ScreenCaptureKit window ids.
- Capture `SCContentFilter(desktopIndependentWindow:)`.
- Handle closed/minimized/protected windows with explicit errors.
- Keep window selection and capture source resolution in Electron/main, not the renderer.
Acceptance:
- Capturing a normal app window works with cursor/audio/webcam disabled.
- Unsupported windows return clear native errors.
### 4. System Audio
- Enable ScreenCaptureKit system audio on supported macOS versions.
- Keep audio format and timing owned by the helper.
- Encode or mux AAC audio into the primary MP4.
Acceptance:
- System-audio-only recordings produce a valid AAC track.
- Unsupported macOS versions return an explicit capability error.
### 5. Microphone
- Resolve the selected microphone device from the renderer-provided browser `deviceId` and user-visible label.
- Capture microphone audio in the helper timing domain.
- Apply OpenScreen microphone gain policy.
- Mix system and microphone audio before final AAC output.
Acceptance:
- Mic-only and mic-plus-system recordings produce a valid, balanced AAC track.
- Device selection honors the selected microphone, not only the default device.
### 6. Webcam Composition
- Capture the selected camera natively through AVFoundation.
- Match browser device id first where possible, then user-visible label.
- Compose an initial picture-in-picture overlay into the primary MP4.
- Hide webcam output until the first usable frame to avoid black startup flashes.
Acceptance:
- Native display/window recordings can include webcam without returning to Electron capture.
- Selected camera is honored.
### 7. Runtime Controls
- Add pause/resume commands to the helper.
- Add cancel command that removes partial outputs.
- Keep restart as stop-discard-start until the helper exposes a native restart operation.
Acceptance:
- Pause/resume keeps output duration coherent.
- Cancel leaves no stale media/session files.
### 8. Test Pipeline
- `npm run build:native:mac`: builds Swift helper binaries on macOS.
- `npm run test:sck-helper:mac`: display-only helper smoke test.
- `npm run test:sck-window:mac`: window capture smoke test.
- `npm run test:sck-audio:mac`: system audio smoke test when supported.
- `npm run test:sck-mic:mac`: microphone smoke test.
- `npm run test:sck-webcam:mac`: webcam smoke test when a webcam is available.
- Packaging check: confirms helpers are available under `electron/native/bin/darwin-${arch}` in packaged builds.
## SSOT Rules
- `src/lib/nativeMacRecording.ts` is the renderer/main TypeScript request contract.
- This document is the feature-level contract and phase checklist.
- The Swift helper owns ScreenCaptureKit/AVFoundation media timing.
- Electron owns output paths, session manifests, and selected source/device resolution.
- Renderer code must use existing hooks/client APIs and should not bind directly to helper process details.
@@ -0,0 +1,73 @@
# PaddleOCR Local Service
OpenScreen calls OCR through a local HTTP service. The default endpoint is:
```text
http://127.0.0.1:8866/ocr
```
The app sends either `imageBase64` or `path` and expects OCR blocks:
```json
{
"blocks": [
{
"text": "Settings",
"confidence": 0.97,
"box": { "x": 120, "y": 80, "width": 90, "height": 24 }
}
]
}
```
## Install
Use a separate virtual environment because PaddleOCR and PaddlePaddle are large dependencies.
```powershell
python -m venv .venv-ocr
.\.venv-ocr\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r tools\ocr\requirements.txt
```
If `paddle` is still missing after installing `paddleocr`, install the CPU PaddlePaddle wheel that matches your Python and OS from the official PaddlePaddle install guide.
## Run
```powershell
.\.venv-ocr\Scripts\Activate.ps1
$env:PADDLEOCR_DEVICE="cpu"
$env:PADDLEOCR_LANG="latin"
npm run ocr:paddle
```
Keep this terminal open while using the Guide OCR step in OpenScreen.
## Verify
```powershell
Invoke-WebRequest http://127.0.0.1:8866/health -UseBasicParsing
```
Expected healthy environment:
```json
{
"ok": true,
"paddleocrInstalled": true,
"paddleInstalled": true,
"engineReady": false,
"defaultLanguage": "latin"
}
```
`engineReady` becomes `true` after the first OCR request. The first request can be slow because PaddleOCR downloads and loads models.
## Configuration
- `PADDLEOCR_DEVICE`: `cpu`, `gpu:0`, or another PaddleOCR device string.
- `PADDLEOCR_LANG`: defaults to `latin`; this is preferred for Vietnamese UI text because it uses a Latin-script recognition model.
- `PADDLEOCR_VERSION`: defaults to `PP-OCRv5`.
- `PADDLEOCR_USE_MOBILE`: defaults to `1`; set to `0` to use the default/server models.
- `OPENSCREEN_GUIDE_OCR_URL`: OpenScreen OCR endpoint override; defaults to `http://127.0.0.1:8866`.
@@ -0,0 +1,248 @@
# Windows Native Recorder Roadmap
OpenScreen's Windows recorder should be owned by one native backend. Electron capture can remain available for non-Windows platforms and temporary developer diagnostics, but Windows production recording should not silently fall back to `getDisplayMedia` / `MediaRecorder`.
## Goals
- Capture displays and windows through Windows Graphics Capture (WGC).
- Render the native Windows cursor as OpenScreen's high-quality scalable cursor overlay.
- Capture system audio through WASAPI loopback.
- Capture microphone audio through WASAPI.
- Mix system audio and microphone audio into the primary screen recording.
- Capture webcam video natively and compose it into the Windows helper MP4 during the native-recording migration.
- Keep preview/export aligned because screen video, audio, webcam, and cursor share one native timing origin.
- Keep exported MP4s Windows-friendly: H.264 video plus AAC audio. Opus-in-MP4 is not an acceptable Windows export target.
- Package the native helper with the Windows app.
## Non-Goals
- Replacing the editor/export pipeline.
- Keeping a separate editable native `webcamVideoPath` in this milestone. A later pass can reintroduce it; the current Windows-native milestone prioritizes a helper-owned multi-stream MP4 with deterministic screen/audio/mic/webcam sync.
- Adding a native fallback for macOS or Linux in this branch.
## Target Architecture
The renderer keeps the existing recording controls. On Windows, `useScreenRecorder` sends a complete recording request to Electron and does not assemble Windows `MediaStream` tracks with `MediaRecorder`.
Electron owns the native recording session:
- resolves the selected source;
- resolves output paths;
- starts cursor sampling;
- starts the helper process;
- sends pause/resume/stop/cancel commands;
- writes `RecordingSession` manifests;
- reports explicit errors when a Windows-native capability is unavailable.
The helper owns Windows media capture:
- WGC screen/window frames;
- WASAPI system loopback;
- WASAPI microphone input;
- Media Foundation webcam capture;
- DirectShow webcam fallback for virtual cameras not visible to Media Foundation;
- Media Foundation encoding/muxing;
- stream timestamp normalization.
## Helper Contract V2
The helper receives a single JSON argument:
```json
{
"schemaVersion": 2,
"recordingId": 1234567890,
"source": {
"type": "display",
"sourceId": "screen:0:0",
"displayId": 123,
"windowHandle": null,
"bounds": { "x": 0, "y": 0, "width": 1920, "height": 1080 }
},
"video": {
"fps": 60,
"width": 1920,
"height": 1080,
"bitrate": 18000000
},
"audio": {
"system": { "enabled": true },
"microphone": { "enabled": true, "deviceId": "default", "gain": 1.4 }
},
"webcam": {
"enabled": true,
"deviceId": "default",
"deviceName": "Camera (NVIDIA Broadcast)",
"width": 1280,
"height": 720,
"fps": 30,
"bitrate": 18000000
},
"outputs": {
"screenPath": "C:\\Users\\me\\recording-123.mp4",
"manifestPath": "C:\\Users\\me\\recording-123.session.json"
}
}
```
The helper emits newline-delimited JSON events to stdout:
```json
{ "event": "ready", "schemaVersion": 2 }
{ "event": "recording-started", "timestampMs": 1234567890 }
{ "event": "warning", "code": "audio-device-unavailable", "message": "..." }
{ "event": "recording-stopped", "screenPath": "..." }
{ "event": "error", "code": "unsupported-window-source", "message": "..." }
```
During migration, Electron also accepts the current textual helper messages so existing display-only smoke tests keep working.
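A minimal sketch of how the Electron side might consume this stdout stream, assuming chunks can split a line at any byte boundary. The names `createHelperLineParser`, `HelperEvent`, and `ParsedLine` are hypothetical and not taken from the repository; lines that are not JSON events are passed through as text so the legacy messages can still be handled elsewhere.

```ts
// Hypothetical event shape: only the `event` discriminator is assumed.
type HelperEvent = { event: string; [key: string]: unknown };

type ParsedLine =
  | { kind: "event"; event: HelperEvent }
  | { kind: "text"; line: string };

// Parses one complete line. Anything that is not a JSON object with a string
// `event` field is returned as plain text (for example a legacy message).
function parseHelperLine(line: string): ParsedLine {
  if (line.startsWith("{")) {
    try {
      const value: unknown = JSON.parse(line);
      if (
        typeof value === "object" &&
        value !== null &&
        typeof (value as { event?: unknown }).event === "string"
      ) {
        return { kind: "event", event: value as HelperEvent };
      }
    } catch {
      // Not valid JSON: fall through and treat the line as text.
    }
  }
  return { kind: "text", line };
}

// Returns a stateful function that accepts raw stdout chunks and returns the
// lines completed by that chunk. A trailing partial line stays buffered.
function createHelperLineParser(): (chunk: string) => ParsedLine[] {
  let buffer = "";
  return (chunk: string): ParsedLine[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";
    const parsed: ParsedLine[] = [];
    for (const raw of lines) {
      const line = raw.trim(); // also strips "\r" from CRLF output
      if (line.length > 0) {
        parsed.push(parseHelperLine(line));
      }
    }
    return parsed;
  };
}
```

In practice the returned function would be called from the child process `stdout` `data` handler with each chunk decoded as UTF-8.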
## Implementation Phases
### 1. Native Session Boundary
- Add a structured Windows native recording request type.
- Pass source kind, audio flags, microphone device, webcam flags, and output paths into the helper.
- On Windows, do not silently fall back to Electron capture. If the helper is unavailable or a native feature is missing, show a clear error.
- Keep Electron fallback only for non-Windows and optional developer diagnostics.
Acceptance:
- Display-only recording still works.
- Enabling an unsupported native feature returns an explicit native error instead of recording through Electron.
### 2. WASAPI System Audio
Status: initial implementation landed. The helper captures the default render endpoint with WASAPI loopback, passes the runtime mix format into `MFEncoder`, and muxes AAC audio into the primary MP4. Long-run drift correction and explicit silence insertion remain follow-up hardening work.
- Add `WasapiLoopbackCapture`.
- Capture the default render endpoint in shared loopback mode.
- Keep `WasapiLoopbackCapture` responsible only for device activation, packet capture, and packet timestamps.
- Keep `MFEncoder` responsible for all Media Foundation stream definitions and muxing.
- Feed the endpoint mix format into `MFEncoder` as the single source of truth for audio stream shape: sample rate, channel count, bits per sample, block alignment, average bytes/sec, and subtype (`PCM` or `Float`).
- Encode the primary screen MP4 with H.264 video and AAC audio through one `IMFSinkWriter`.
- Timestamp audio from the captured frame count in 100ns units. The first implementation uses the WASAPI packet timeline; later drift correction will add explicit silence or resampling if long recordings show measurable clock skew.
- Treat microphone mixing as a later phase. System loopback must land first without introducing renderer-side audio code.
Acceptance:
- Screen MP4 has an AAC audio track when system audio is enabled.
- A 5-minute recording has audio/video duration drift below one frame.
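The timestamp rule and the drift criterion above reduce to simple arithmetic. The sketch below is written in TypeScript for illustration only (the helper itself is native code), and the function names are hypothetical. It converts a captured frame count into 100 ns units without exceeding the safe integer range, then checks drift against one video frame at an assumed frame rate.

```ts
const HNS_PER_SECOND = 10_000_000; // number of 100 ns ticks in one second

// Converts a count of captured audio frames into a timestamp in 100 ns units.
// Whole seconds and the remainder are converted separately so the
// intermediate product stays well inside Number.MAX_SAFE_INTEGER.
function audioFramesToHns(frameCount: number, sampleRate: number): number {
  if (!Number.isInteger(frameCount) || frameCount < 0) {
    throw new RangeError("frameCount must be a non-negative integer");
  }
  if (!Number.isInteger(sampleRate) || sampleRate <= 0) {
    throw new RangeError("sampleRate must be a positive integer");
  }
  const wholeSeconds = Math.floor(frameCount / sampleRate);
  const remainderFrames = frameCount % sampleRate;
  return (
    wholeSeconds * HNS_PER_SECOND +
    Math.floor((remainderFrames * HNS_PER_SECOND) / sampleRate)
  );
}

// True when audio and video durations differ by less than one video frame.
function isDriftBelowOneFrame(audioHns: number, videoHns: number, fps: number): boolean {
  const frameHns = HNS_PER_SECOND / fps;
  return Math.abs(audioHns - videoHns) < frameHns;
}
```

For example, at 60 fps one frame lasts about 166,667 ticks, so a 5-minute recording passes only if the audio and video durations differ by less than roughly 16.7 ms.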
Single source of truth (SSOT) rules for this phase:
- `src/lib/nativeWindowsRecording.ts` is the renderer/main TypeScript request contract.
- `docs/engineering/windows-native-recorder-roadmap.md` is the feature-level contract and phase checklist.
- `WgcSession::captureWidth()/captureHeight()` is the encoded screen frame size until a dedicated native scaling stage exists.
- `WasapiLoopbackCapture::inputFormat()` is the runtime audio format source used by `MFEncoder`.
- The renderer passes both the browser webcam `deviceId` and selected display label as `deviceName`; `electron/native/wgc-capture/src/webcam_capture.*` is the only place that maps those values to Media Foundation devices.
- Electron resolves the selected label to a DirectShow filter CLSID once and passes it as `webcamDirectShowClsid`; the helper must not independently guess among DirectShow filters.
- No duplicated hard-coded audio format assumptions in `main.cpp`.
### 3. WASAPI Microphone
Status: initial implementation in progress. The helper can open the default WASAPI capture endpoint, apply the OpenScreen microphone gain, encode mic-only audio, and mix system loopback plus microphone through a single queued `AudioMixer` timeline when both endpoints expose the same runtime format. Audio endpoints are warmed before WGC starts, the mixer drops pre-roll and begins its paced timeline on the first encoded video frame, then cuts queued tail audio on stop so the MP4 does not drift past the video. Browser `deviceId` to MMDevice id mapping, resampling between mismatched endpoint formats, and drift correction remain follow-up hardening work.
- Add microphone device enumeration and stable device-id mapping.
- Capture selected/default microphone through WASAPI.
- Apply OpenScreen's current mic gain policy.
- Mix microphone and system audio before AAC encoding.
Acceptance:
- Mic-only, system-only, and mixed audio recordings produce a valid AAC track.
- Device unplug/permission failure produces an explicit error or warning.
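As an illustration of mixing microphone and system audio before encoding, here is a minimal sketch that assumes both buffers are float PCM in the same format (same sample rate and channel layout). The additive mix with a hard clamp is an assumption made for this example; the source does not describe how `AudioMixer` combines samples, and `mixFloatPcm` is a hypothetical name.

```ts
// Mixes two float PCM buffers of the same format, applying a gain to the
// microphone samples. The shorter buffer is treated as silence past its end.
function mixFloatPcm(
  system: Float32Array,
  microphone: Float32Array,
  micGain: number,
): Float32Array {
  const length = Math.max(system.length, microphone.length);
  const mixed = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    const s = i < system.length ? system[i] : 0;
    const m = i < microphone.length ? microphone[i] * micGain : 0;
    // Hard clamp to [-1, 1] so a loud sum cannot overflow the sample range.
    mixed[i] = Math.max(-1, Math.min(1, s + m));
  }
  return mixed;
}
```

A production mixer would more likely use a limiter than a hard clamp, and would need resampling when the endpoint formats differ, which the status note lists as follow-up work.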
### 4. Webcam Capture
- Add Media Foundation webcam source reader.
- Select requested dimensions/fps or the nearest format accepted by Media Foundation.
- Convert webcam samples to BGRA and compose them into the primary helper MP4 as an initial bottom-right picture-in-picture overlay.
- Ignore black webcam warmup frames and keep the overlay hidden until the first visible frame is available, so virtual cameras do not flash a black picture-in-picture rectangle at recording start.
- Keep the helper process as the SSOT for screen/window, WASAPI system audio, microphone, webcam, and mux timing.
- Match the requested webcam through Media Foundation friendly names first, then browser device ids/symbolic links, so UI selection remains stable across Chromium and Windows native device namespaces.
- Use the Electron-resolved DirectShow CLSID when the selected virtual camera, for example NVIDIA Broadcast, is registered for DirectShow but absent from Media Foundation enumeration.
- Later: promote the same webcam capture source to a separate editable native `webcamVideoPath` if product requirements need post-recording layout edits.
Acceptance:
- Native display/window recordings can include webcam without returning to Electron capture.
- `npm run test:wgc-webcam:win` validates the helper path when a webcam is available and skips explicitly when no webcam device exists.
- Combined webcam + system audio + microphone produces one MP4 with H.264 video and AAC audio.
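The black warmup frame rule can be sketched as a small latch: frames are rejected until the first one with visible content arrives, after which every frame is shown. The detection criterion below (average Rec. 601 luma under a threshold of 8) is an assumption for illustration; the source does not state how the helper decides that a frame is black, and `isBlackBgraFrame` and `WebcamOverlayGate` are hypothetical names.

```ts
// Returns true when the average luma of a BGRA frame is below `threshold`
// (0 to 255). An empty frame is treated as black.
function isBlackBgraFrame(bgra: Uint8Array, threshold = 8): boolean {
  const pixelCount = Math.floor(bgra.length / 4);
  if (pixelCount === 0) return true;
  let lumaSum = 0;
  for (let i = 0; i < pixelCount; i++) {
    const b = bgra[i * 4];
    const g = bgra[i * 4 + 1];
    const r = bgra[i * 4 + 2];
    // Rec. 601 luma weights; the alpha byte is ignored.
    lumaSum += 0.299 * r + 0.587 * g + 0.114 * b;
  }
  return lumaSum / pixelCount < threshold;
}

// Keeps the overlay hidden until the first visible frame, then stays open.
class WebcamOverlayGate {
  private visible = false;

  // Returns true when the frame should be composed into the output.
  accept(bgra: Uint8Array): boolean {
    if (!this.visible && isBlackBgraFrame(bgra)) {
      return false;
    }
    this.visible = true;
    return true;
  }
}
```

Once the gate has opened, later dark frames are still composed, so a user covering the camera mid-recording does not make the overlay disappear.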
### 5. Native Window Capture
Status: initial implementation in progress. Electron parses the `window:<HWND>:...` desktop source id through the shared native Windows recording contract and passes `windowHandle` to the helper. The helper resolves the `HWND`, validates it with `IsWindow`, and creates the WGC item with `CreateForWindow(HWND)`. Resize/minimize/move hardening and protected-window diagnostics remain follow-up work.
- Resolve Electron `window:*` selections to an `HWND`.
- Use WGC `CreateForWindow(HWND)`.
- Handle window close, minimize, resize, DPI scaling, and monitor moves.
- Return clear errors for unsupported protected windows.
Acceptance:
- Capturing a normal app window works with cursor/audio/mic/webcam.
- Window resize and movement do not corrupt the recording.
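For illustration, resolving a `window:<HWND>:...` source id to a handle could look like the sketch below. It assumes the handle is encoded as a decimal integer in the second field, which the source does not state explicitly; `parseWindowHandle` is a hypothetical name and not the function exported by the shared contract.

```ts
// Extracts the window handle from a desktop source id such as
// "window:1312864:0". Returns null for non-window ids or invalid handles.
function parseWindowHandle(sourceId: string): number | null {
  const match = /^window:(\d+):/.exec(sourceId);
  if (!match) return null;
  const handle = Number(match[1]);
  // A zero handle is never a valid window; reject it before calling the helper.
  return Number.isSafeInteger(handle) && handle > 0 ? handle : null;
}
```

The helper still validates the handle with `IsWindow`, so this check only rejects ids that cannot be a window handle at all.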
### 6. Runtime Controls
- Add pause/resume commands to the helper.
- Add cancel command that removes partial screen/webcam outputs.
- Keep restart as stop-discard-start from Electron until the helper supports a native restart event.
Acceptance:
- Pause/resume keeps preview duration coherent.
- Cancel leaves no stale media/session/cursor files.
### 7. Test Pipeline
- `npm run test:wgc-helper:win`: display-only helper smoke test.
- `npm run test:wgc-audio:win`: validates AAC track presence and duration.
- `npm run test:wgc-window:win`: captures a fixture window by HWND.
- `npm run test:wgc-webcam:win`: validates webcam output when a webcam is available, otherwise skips explicitly.
- Packaging check: confirms the helper is in `app.asar.unpacked`.
- Export check: exported MP4s generated from native recordings keep an AAC audio track when the source has audio.
- `npm run test:wgc-mic:win`: validates default-microphone capture writes an AAC track when an input endpoint is available.
- `npm run test:wgc-mixed-audio:win`: validates system loopback plus microphone writes one mixed AAC track when endpoint formats are compatible.
## Backlog
### Native Cursor Click Bounce Is Not Visibly Applied
Status: open. Do not treat Windows native cursor `Click Bounce` as shipped.
Problem:
- The cursor settings UI exposes `Size`, `Smoothing`, `Motion Blur`, and `Click Bounce`.
- On Windows native cursor recordings, `Size`, `Smoothing`, and `Motion Blur` are visibly applied in preview/export.
- `Click Bounce` still has no visible effect in manual packaged-app testing, even after adding click-related sample metadata.
What has already been tried:
- Added `interactionType: "click" | "mouseup" | "move"` to native cursor samples.
- Added polling-based left-button state through `GetAsyncKeyState`.
- Added the `GetAsyncKeyState` low-bit path to catch quick clicks between samples.
- Added a PowerShell/C# `WH_MOUSE_LL` mouse hook experiment and launched the sampler through a temporary `.ps1` file to avoid Windows command-line length limits.
- Updated `npm run test:cursor-native:win` so the diagnostic can observe a synthetic short click and emit `clickSampleCount`.
Current diagnosis:
- The diagnostic can observe synthetic click events, but this has not translated into a visible `Click Bounce` effect in the real packaged app.
- The test currently proves that some click metadata can be recorded, not that the full OpenScreen record -> preview -> export path displays a bounce at the expected time.
- The current native implementation may be animating from metadata that is not present in the real recording session, may be using the wrong timestamp origin, or may be applying a scale change too subtle to notice on the DOM/native cursor path.
Next investigation when resumed:
- Inspect the actual `.cursor.json`/session sidecar generated by a packaged-app manual recording and confirm whether real clicks produce `interactionType: "click"` at the right `timeMs`.
- Add a targeted end-to-end fixture that records a known click, loads the generated project, and asserts the preview/export cursor scale changes across adjacent frames.
- Compare the native DOM cursor path against the older `PixiCursorOverlay` click visual state and decide whether native cursor bounce should be a scale-only animation, an additional click ring, or a short explicit keyframe animation independent of sample cadence.
- If event capture remains unreliable in the PowerShell sampler, move click events into a small native cursor helper instead of PowerShell/C# script injection.
## Ship Criteria
- Windows display capture works with cursor, system audio, microphone, and webcam.
- Windows window capture works with cursor, system audio, microphone, and webcam.
- Preview and export show no cursor position drift.
- Preview and export show no measurable audio/video/webcam drift.
- Windows production builds do not depend on Electron capture fallback.
@@ -0,0 +1,130 @@
# Windows native cursor test pipeline
This branch includes two Windows-focused diagnostics for fast iteration on native cursor capture and rendering. They are intentionally local developer tools: they create short videos and JSON reports so cursor changes can be inspected without doing a full manual record/edit/export cycle.
## Native sampler diagnostic
```powershell
npm run test:cursor-native:win
```
This script does not launch OpenScreen. It:
- starts a Windows `GetCursorInfo` sampler
- moves the real OS pointer with `SetCursorPos`
- captures native cursor handles, hotspots, assets, and standard `IDC_*` cursor types
- writes normalized `CursorRecordingData`
- generates an abstract preview video
- generates a real-screen preview video using screenshots of the current desktop
The output directory is printed in the command result, for example:
```text
C:\Users\<user>\AppData\Local\Temp\openscreen-cursor-native-...
```
Useful files:
- `report.json`: sample counts, asset counts, cursor handles, and generated artifact paths
- `cursor-recording-data.json`: sidecar-compatible cursor data
- `preview.webm`: abstract path/asset/hotspot preview
- `real-capture-preview.webm`: real desktop screenshot background with reconstructed cursor overlay
- `assets/*.png`: raw cursor bitmaps captured from Windows
Environment overrides:
```powershell
$env:CURSOR_TEST_DURATION_MS = "3000"
$env:CURSOR_TEST_SAMPLE_INTERVAL_MS = "16"
$env:CURSOR_TEST_SCREEN_FRAME_INTERVAL_MS = "80"
$env:CURSOR_TEST_OUTPUT_DIR = "C:\temp\openscreen-cursor-test"
npm run test:cursor-native:win
```
## OpenScreen preview capture
```powershell
npm run capture:openscreen-preview
```
This script launches the real Electron app, injects a fixture video plus cursor sidecar data, opens the editor, captures frames from the actual OpenScreen preview UI, and encodes them into a WebM.
By default it uses the latest `cursor-recording-data.json` generated by `npm run test:cursor-native:win`. To force a specific sidecar:
```powershell
$env:CURSOR_RECORDING_DATA_PATH = "C:\path\to\cursor-recording-data.json"
npm run capture:openscreen-preview
```
Useful environment overrides:
```powershell
$env:OPENSCREEN_PREVIEW_SKIP_BUILD = "true"
$env:OPENSCREEN_PREVIEW_FRAME_COUNT = "120"
$env:OPENSCREEN_PREVIEW_FPS = "30"
$env:OPENSCREEN_PREVIEW_OUTPUT_DIR = "C:\temp\openscreen-preview"
npm run capture:openscreen-preview
```
Useful files:
- `openscreen-preview.webm`: video of the real OpenScreen editor preview
- `frames/*.png`: captured preview frames
- `report.json`: fixture paths, source sidecar, frame count, and output path
## What these tests validate
Together, the scripts make it quick to inspect:
- whether Windows cursor samples are visible and continuous
- whether native hotspots stay anchored when scaling to `3x`
- whether standard Windows cursors are recognized via `IDC_*`
- whether high-quality SVG cursor replacements follow the native hotspot
- whether the real OpenScreen preview renders the same cursor behavior as the diagnostic pipeline
They are not a full substitute for an end-to-end manual recording pass. Before shipping cursor changes, also test a real capture session and export from the packaged app.
## Known Gap
Windows native cursor `Click Bounce` is currently backlogged. `Size`, `Smoothing`, and `Motion Blur` can be validated through preview/export, but `Click Bounce` has not shown a visible effect in packaged-app manual testing. The current diagnostic can observe synthetic click metadata, but that is not enough to validate the real OpenScreen record -> preview -> export path.
Track the open item in `docs/engineering/windows-native-recorder-roadmap.md` under `Native Cursor Click Bounce Is Not Visibly Applied`.
## Native Windows capture backend
The app now routes Windows recordings through an external WGC helper instead of Electron `getDisplayMedia`. This is meant to remove the coordinate and clock split that made the reconstructed cursor drift in the preview/export path.
Current native availability rules:
- Windows 10 build 19041 or newer
- a helper executable is available
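The two rules above can be expressed as one pure check. This is a sketch under assumptions: the OS version is taken from a string in the `major.minor.build` form that Node's `os.release()` returns on Windows, and the reason code `unsupported-os-build` is invented for this example (`unsupported-platform` and `missing-helper` appear elsewhere in the codebase typings). `checkNativeWindowsCapture` is a hypothetical name.

```ts
type NativeAvailability =
  | { available: true }
  | {
      available: false;
      reason: "unsupported-platform" | "unsupported-os-build" | "missing-helper";
    };

// Pure check: the caller supplies the platform, OS release string, and
// whether the helper executable was found on disk.
function checkNativeWindowsCapture(
  platform: string,
  osRelease: string,
  helperExists: boolean,
): NativeAvailability {
  if (platform !== "win32") {
    return { available: false, reason: "unsupported-platform" };
  }
  // Windows 11 still reports major version 10, with a higher build number.
  const parts = osRelease.split(".").map((part) => Number.parseInt(part, 10));
  const major = parts[0] ?? NaN;
  const build = parts[2] ?? NaN;
  // Comparisons against NaN are false, so malformed strings are rejected.
  const buildOk = major > 10 || (major === 10 && build >= 19041);
  if (!buildOk) {
    return { available: false, reason: "unsupported-os-build" };
  }
  if (!helperExists) {
    return { available: false, reason: "missing-helper" };
  }
  return { available: true };
}
```

A caller in the main process could pass `process.platform`, `os.release()`, and the result of `fs.existsSync` on the resolved helper path.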
The helper currently implements display/window video capture, system audio loopback, default microphone capture, Media Foundation webcam capture, and DirectShow fallback for selected virtual cameras such as NVIDIA Broadcast. Webcam frames are composed into the primary MP4 as a bottom-right picture-in-picture overlay, and black webcam warmup frames are ignored until the first visible frame is available.
Build OpenScreen's helper locally:
```powershell
npm run build:native:win
```
Smoke-test the helper directly:
```powershell
npm run test:wgc-helper:win
npm run test:wgc-helper:win -- --capture-cursor
npm run test:wgc-window:win
npm run test:wgc-audio:win
npm run test:wgc-mic:win
npm run test:wgc-mixed-audio:win
npm run test:wgc-webcam:win
```
For local diagnostics with another compatible helper, point OpenScreen at that executable:
```powershell
$env:OPENSCREEN_WGC_CAPTURE_EXE = "C:\path\to\wgc-capture.exe"
npm run build-vite
npm run dev
```
The helper receives one JSON config argument, emits JSON lifecycle events, prints the legacy `Recording started` marker, accepts `stop` on stdin, and prints `Recording stopped. Output path: <path>`. See `electron/native/README.md` for the exact contract and build output paths.
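A minimal sketch of recognizing the two legacy text markers quoted above, assuming each arrives as its own stdout line. The type and function names are hypothetical; only the marker strings come from this document.

```ts
type LegacyHelperMessage =
  | { kind: "started" }
  | { kind: "stopped"; outputPath: string }
  | { kind: "unknown"; line: string };

const STARTED_MARKER = "Recording started";
const STOPPED_PREFIX = "Recording stopped. Output path: ";

// Classifies one line of helper stdout against the legacy text markers.
function parseLegacyHelperLine(line: string): LegacyHelperMessage {
  const trimmed = line.trim();
  if (trimmed === STARTED_MARKER) {
    return { kind: "started" };
  }
  if (trimmed.startsWith(STOPPED_PREFIX)) {
    const outputPath = trimmed.slice(STOPPED_PREFIX.length).trim();
    if (outputPath.length > 0) {
      return { kind: "stopped", outputPath };
    }
  }
  // JSON lifecycle events and any other output end up here.
  return { kind: "unknown", line: trimmed };
}
```

A stopped marker with an empty path is reported as `unknown` so the caller never treats a blank string as a recording file.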
@@ -0,0 +1,149 @@
# Writing Tests
This project uses [Vitest](https://vitest.dev/) for both unit/integration tests and browser tests. There are two separate configs — each targets a different set of files.
## Unit tests
**Config:** `vitest.config.ts`
**Runs in:** jsdom (simulated DOM, no real browser)
**File pattern:** `src/**/*.test.ts` — anything that does **not** end in `.browser.test.ts`
**CI command:** `npm run test`
Use unit tests for pure logic, utility functions, data transformations, and anything that doesn't need real browser APIs (Canvas, WebCodecs, MediaRecorder, etc.).
### File placement
Co-locate the test file next to the source file, or put it in a `__tests__/` folder in the same directory.
```
src/lib/compositeLayout.ts
src/lib/compositeLayout.test.ts # co-located
src/i18n/__tests__/tutorialHelpTranslations.test.ts # grouped
```
### Example
```ts
import { describe, expect, it } from "vitest";
import { computeCompositeLayout } from "./compositeLayout";
describe("computeCompositeLayout", () => {
it("anchors the overlay in the lower-right corner", () => {
const layout = computeCompositeLayout({
canvasSize: { width: 1920, height: 1080 },
screenSize: { width: 1920, height: 1080 },
webcamSize: { width: 1280, height: 720 },
});
expect(layout).not.toBeNull();
expect(layout!.webcamRect!.x).toBeGreaterThan(1920 / 2);
expect(layout!.webcamRect!.y).toBeGreaterThan(1080 / 2);
});
});
```
### Path aliases
The `@/` alias resolves to `src/`. Use it for imports that would otherwise need long relative paths.
```ts
import { SUPPORTED_LOCALES } from "@/i18n/config";
```
### Running locally
```bash
npm run test # run once
npm run test:watch # watch mode
```
---
## Browser tests
**Config:** `vitest.browser.config.ts`
**Runs in:** real Chromium via Playwright (headless)
**File pattern:** `src/**/*.browser.test.ts`
**CI commands:** `npm run test:browser:install` then `npm run test:browser`
Use browser tests when the code under test depends on real browser APIs that jsdom doesn't implement: `VideoDecoder`, `VideoEncoder`, `MediaRecorder`, `OffscreenCanvas`, `WebGL`, etc.
### File placement
Name the file `<subject>.browser.test.ts` and place it next to the source file.
```
src/lib/exporter/videoExporter.ts
src/lib/exporter/videoExporter.browser.test.ts
```
### Loading fixture assets
Static assets (video files, images) live in `tests/fixtures/`. Import them with Vite's `?url` suffix so Vite serves them through the dev server.
```ts
import sampleVideoUrl from "../../../tests/fixtures/sample.webm?url";
```
### Example
```ts
import { describe, expect, it } from "vitest";
import sampleVideoUrl from "../../../tests/fixtures/sample.webm?url";
import { VideoExporter } from "./videoExporter";
describe("VideoExporter (real browser)", () => {
it("exports a valid MP4 blob from a real video", async () => {
const exporter = new VideoExporter({
videoUrl: sampleVideoUrl,
width: 320,
height: 180,
frameRate: 15,
bitrate: 1_000_000,
wallpaper: "#1a1a2e",
zoomRegions: [],
showShadow: false,
shadowIntensity: 0,
showBlur: false,
cropRegion: { x: 0, y: 0, width: 1, height: 1 },
});
const result = await exporter.export();
expect(result.success, result.error).toBe(true);
expect(result.blob).toBeInstanceOf(Blob);
});
});
```
### Timeouts
Browser tests have a default timeout of 120 seconds per test and 30 seconds per hook (set in `vitest.browser.config.ts`). Export operations are slow — prefer small fixture dimensions (320×180) and low bitrates to keep tests fast.
### Running locally
First install the browser (one-time):
```bash
npm run test:browser:install
```
Then run the tests:
```bash
npm run test:browser
```
---
## Choosing the right type
| Situation | Use |
|---|---|
| Pure function / data transformation | Unit test |
| i18n key coverage | Unit test |
| React hook logic (no real browser APIs) | Unit test |
| `VideoDecoder` / `VideoEncoder` / `MediaRecorder` | Browser test |
| `OffscreenCanvas` / WebGL / Pixi.js rendering | Browser test |
| File export producing a real `Blob` | Browser test |
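The file-pattern rules from the two sections above can be summarized as a small classifier. This is only an illustration of the naming convention, with a hypothetical function name; the authoritative include and exclude patterns live in `vitest.config.ts` and `vitest.browser.config.ts`.

```ts
type TestKind = "browser" | "unit" | "none";

// Decides which Vitest config would pick up a file, based on its path.
function classifyTestFile(path: string): TestKind {
  // Normalize Windows separators so the same rules apply on every platform.
  const normalized = path.replace(/\\/g, "/");
  if (!normalized.startsWith("src/")) return "none";
  // Check the more specific suffix first: ".browser.test.ts" also ends in ".test.ts".
  if (normalized.endsWith(".browser.test.ts")) return "browser";
  if (normalized.endsWith(".test.ts")) return "unit";
  return "none";
}
```

The order of the two suffix checks matters, because every browser test file name also matches the unit test suffix.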
@@ -0,0 +1,106 @@
// @see - https://www.electron.build/configuration/configuration
{
"$schema": "https://raw.githubusercontent.com/electron-userland/electron-builder/master/packages/app-builder-lib/scheme.json",
"appId": "com.siddharthvaddem.openscreen",
"asar": true,
// .node binaries cannot be loaded from inside an asar; keep them unpacked.
"asarUnpack": [
"**/*.node"
],
"productName": "Openscreen",
"toolsets": {
"winCodeSign": "1.1.0"
},
"npmRebuild": true,
"buildDependenciesFromSource": true,
"compression": "normal",
"directories": {
"output": "release/${version}"
},
"files": [
"dist",
"dist-electron",
"!*.png",
"!preview*.png",
"!*.md",
"!README.md",
"!CONTRIBUTING.md",
"!LICENSE"
],
// Asset layout contract: "wallpapers/" under resourcesPath must align with
// assetBaseDir in electron/preload.ts (packaged branch).
"extraResources": [
{
"from": "public/wallpapers",
"to": "wallpapers"
}
],
"mac": {
"notarize": false,
"hardenedRuntime": true,
"entitlements": "macos.entitlements",
"entitlementsInherit": "macos.entitlements",
"target": [
{
"target": "dmg",
"arch": ["x64", "arm64"]
}
],
"icon": "icons/icons/mac/icon.icns",
"artifactName": "${productName}-Mac-${arch}-${version}-Installer.${ext}",
"extraResources": [
{
"from": "electron/native/bin",
"to": "electron/native/bin",
"filter": ["darwin-*/*"]
}
],
"extendInfo": {
"NSAudioCaptureUsageDescription": "OpenScreen needs audio capture permission to record system audio.",
"NSMicrophoneUsageDescription": "OpenScreen needs microphone access to record voice audio.",
"NSCameraUsageDescription": "OpenScreen needs camera access to record webcam video.",
"NSScreenCaptureUsageDescription": "OpenScreen needs screen recording permission to detect and capture windows.",
"NSCameraUseContinuityCameraDeviceType": true
}
},
"linux": {
"target": [
"AppImage",
"deb",
"pacman"
],
"icon": "icons/icons/png",
"artifactName": "${productName}-Linux-${version}.${ext}",
"category": "AudioVideo"
},
"win": {
"target": [
"nsis"
],
"icon": "icons/icons/win/icon.ico",
"signAndEditExecutable": false,
"signExts": ["!.exe"],
"extraResources": [
{
"from": "electron/native/bin",
"to": "electron/native/bin",
"filter": ["win32-*/*"]
},
{
"from": "tools/ocr/dist/openscreen-ocr-service",
"to": "ocr-service",
"filter": ["**/*"]
},
{
"from": "tools/ocr/models/paddlex",
"to": "ocr-models/paddlex",
"filter": ["**/*"]
}
]
},
"nsis": {
"oneClick": false,
"allowToChangeInstallationDirectory": true
}
}
@@ -0,0 +1,380 @@
/// <reference types="vite-plugin-electron/electron-env" />
declare namespace NodeJS {
interface ProcessEnv {
/**
* The built directory structure
*
* ```tree
* ├─┬─┬ dist
* │ │ └── index.html
* │ │
* │ ├─┬ dist-electron
* │ │ ├── main.js
* │ │ └── preload.js
* │
* ```
*/
APP_ROOT: string;
/** /dist/ or /public/ */
VITE_PUBLIC: string;
}
}
// Used in Renderer process, expose in `preload.ts`
interface Window {
electronAPI: {
invokeNativeBridge: <TData = unknown>(
request: import("../src/native/contracts").NativeBridgeRequest,
) => Promise<import("../src/native/contracts").NativeBridgeResponse<TData>>;
guide: {
startSession: (
recordingId: import("../src/guide/contracts").GuideRecordingIdInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
readSession: (
recordingId: import("../src/guide/contracts").GuideRecordingIdInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
addMarker: (input: import("../src/guide/contracts").AddGuideMarkerInput) => Promise<
import("../src/guide/contracts").GuideIpcResult<{
session: import("../src/guide/contracts").GuideSession;
event: import("../src/guide/contracts").GuideEvent;
}>
>;
finalizeEvents: (
input: import("../src/guide/contracts").FinalizeGuideEventsInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
writeSnapshot: (
input: import("../src/guide/contracts").WriteGuideSnapshotInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
runOcr: (
input: import("../src/guide/contracts").RunGuideOcrInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
generateDraft: (
input: import("../src/guide/contracts").GenerateGuideDraftInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
getAiSettings: () => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideAiSettings
>
>;
saveAiSettings: (
input: import("../src/guide/contracts").SaveGuideAiSettingsInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideAiSettings
>
>;
saveGuide: (
input: import("../src/guide/contracts").SaveGuideInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").GuideSession
>
>;
exportMarkdown: (
input: import("../src/guide/contracts").ExportGuideInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").ExportGuideResult
>
>;
exportHtml: (
input: import("../src/guide/contracts").ExportGuideInput,
) => Promise<
import("../src/guide/contracts").GuideIpcResult<
import("../src/guide/contracts").ExportGuideResult
>
>;
discardSession: (input: import("../src/guide/contracts").DiscardGuideSessionInput) => Promise<
import("../src/guide/contracts").GuideIpcResult<{
discarded: true;
}>
>;
};
getSources: (opts: Electron.SourcesOptions) => Promise<ProcessedDesktopSource[]>;
switchToEditor: () => Promise<void>;
switchToHud: () => Promise<void>;
startNewRecording: () => Promise<{ success: boolean; error?: string }>;
openSourceSelector: () => Promise<{
opened: boolean;
reason?: string;
access?: {
success: boolean;
granted: boolean;
status: string;
error?: string;
};
}>;
selectSource: (source: ProcessedDesktopSource) => Promise<ProcessedDesktopSource | null>;
getSelectedSource: () => Promise<ProcessedDesktopSource | null>;
requestCameraAccess: () => Promise<{
success: boolean;
granted: boolean;
status: string;
error?: string;
}>;
requestScreenAccess: () => Promise<{
success: boolean;
granted: boolean;
status: string;
error?: string;
}>;
requestNativeMacCursorAccess: () => Promise<{
success: boolean;
granted: boolean;
status: string;
error?: string;
}>;
assetBaseUrl: string;
storeRecordedVideo: (
videoData: ArrayBuffer,
fileName: string,
) => Promise<{
success: boolean;
path?: string;
session?: import("../src/lib/recordingSession").RecordingSession;
message?: string;
error?: string;
}>;
storeRecordedSession: (
payload: import("../src/lib/recordingSession").StoreRecordedSessionInput,
) => Promise<{
success: boolean;
path?: string;
session?: import("../src/lib/recordingSession").RecordingSession;
message?: string;
error?: string;
}>;
openRecordingStream: (fileName: string) => Promise<{ success: boolean; error?: string }>;
appendRecordingChunk: (
fileName: string,
chunk: ArrayBuffer,
) => Promise<{ success: boolean; error?: string }>;
closeRecordingStream: (fileName: string) => Promise<{ success: boolean; error?: string }>;
getRecordedVideoPath: () => Promise<{
success: boolean;
path?: string;
message?: string;
error?: string;
}>;
setRecordingState: (
recording: boolean,
recordingId?: number,
cursorCaptureMode?: import("../src/lib/recordingSession").CursorCaptureMode,
) => Promise<void>;
isNativeWindowsCaptureAvailable: () => Promise<{
success: boolean;
available: boolean;
helperPath?: string;
reason?: string;
error?: string;
}>;
isNativeMacCaptureAvailable: () => Promise<{
success: boolean;
available: boolean;
helperPath?: string;
reason?: "unsupported-platform" | "missing-helper" | string;
error?: string;
}>;
startNativeWindowsRecording: (
request: import("../src/lib/nativeWindowsRecording").NativeWindowsRecordingRequest,
) => Promise<import("../src/lib/nativeWindowsRecording").NativeWindowsRecordingStartResult>;
stopNativeWindowsRecording: (discard?: boolean) => Promise<{
success: boolean;
path?: string;
session?: import("../src/lib/recordingSession").RecordingSession;
message?: string;
discarded?: boolean;
error?: string;
}>;
pauseNativeWindowsRecording: () => Promise<{
success: boolean;
error?: string;
}>;
resumeNativeWindowsRecording: () => Promise<{
success: boolean;
error?: string;
}>;
startNativeMacRecording: (
request: import("../src/lib/nativeMacRecording").NativeMacRecordingRequest,
) => Promise<import("../src/lib/nativeMacRecording").NativeMacRecordingStartResult>;
pauseNativeMacRecording: () => Promise<{
success: boolean;
error?: string;
}>;
resumeNativeMacRecording: () => Promise<{
success: boolean;
error?: string;
}>;
stopNativeMacRecording: (discard?: boolean) => Promise<{
success: boolean;
path?: string;
session?: import("../src/lib/recordingSession").RecordingSession;
message?: string;
discarded?: boolean;
error?: string;
}>;
attachNativeMacWebcamRecording: (payload: {
screenVideoPath: string;
recordingId: number;
webcam: import("../src/lib/recordingSession").RecordedVideoAssetInput;
cursorCaptureMode?: import("../src/lib/recordingSession").CursorCaptureMode;
}) => Promise<{
success: boolean;
path?: string;
session?: import("../src/lib/recordingSession").RecordingSession;
message?: string;
error?: string;
}>;
discardCursorTelemetry: (recordingId: number) => Promise<void>;
getCursorTelemetry: (videoPath?: string) => Promise<{
success: boolean;
samples: CursorTelemetryPoint[];
clicks: number[];
message?: string;
error?: string;
}>;
onStopRecordingFromTray: (callback: () => void) => () => void;
openExternalUrl: (url: string) => Promise<{ success: boolean; error?: string }>;
pickExportSavePath: (
fileName: string,
exportFolder?: string,
) => Promise<{
success: boolean;
path?: string;
message?: string;
canceled?: boolean;
error?: string;
}>;
writeExportToPath: (
videoData: ArrayBuffer,
filePath: string,
) => Promise<{
success: boolean;
path?: string;
message?: string;
error?: string;
}>;
openVideoFilePicker: () => Promise<{ success: boolean; path?: string; canceled?: boolean }>;
setCurrentVideoPath: (path: string) => Promise<{ success: boolean }>;
setCurrentRecordingSession: (
session: import("../src/lib/recordingSession").RecordingSession | null,
) => Promise<{
success: boolean;
session?: import("../src/lib/recordingSession").RecordingSession;
}>;
getCurrentVideoPath: () => Promise<{ success: boolean; path?: string }>;
getCurrentRecordingSession: () => Promise<{
success: boolean;
session?: import("../src/lib/recordingSession").RecordingSession;
}>;
readBinaryFile: (filePath: string) => Promise<{
success: boolean;
data?: ArrayBuffer;
path?: string;
message?: string;
error?: string;
}>;
preparePreviewAudioTrack: (filePath: string) => Promise<{
success: boolean;
path?: string | null;
message?: string;
error?: string;
}>;
clearCurrentVideoPath: () => Promise<{ success: boolean }>;
saveProjectFile: (
projectData: unknown,
suggestedName?: string,
existingProjectPath?: string,
) => Promise<{
success: boolean;
path?: string;
message?: string;
canceled?: boolean;
error?: string;
}>;
loadProjectFile: () => Promise<{
success: boolean;
path?: string;
project?: unknown;
message?: string;
canceled?: boolean;
error?: string;
}>;
loadCurrentProjectFile: () => Promise<{
success: boolean;
path?: string;
project?: unknown;
message?: string;
canceled?: boolean;
error?: string;
}>;
onMenuLoadProject: (callback: () => void) => () => void;
onMenuSaveProject: (callback: () => void) => () => void;
onMenuSaveProjectAs: (callback: () => void) => () => void;
getPlatform: () => Promise<string>;
revealInFolder: (
filePath: string,
) => Promise<{ success: boolean; error?: string; message?: string }>;
getShortcuts: () => Promise<Record<string, unknown> | null>;
saveShortcuts: (shortcuts: unknown) => Promise<{ success: boolean; error?: string }>;
hudOverlayHide: () => void;
hudOverlayClose: () => void;
setHudOverlayIgnoreMouseEvents: (ignore: boolean) => void;
moveHudOverlayBy: (deltaX: number, deltaY: number) => void;
showCountdownOverlay: (value: number, runId: number) => Promise<void>;
setCountdownOverlayValue: (value: number, runId: number) => Promise<void>;
hideCountdownOverlay: (runId: number) => Promise<void>;
onCountdownOverlayValue: (callback: (value: number | null) => void) => () => void;
setMicrophoneExpanded: (expanded: boolean) => void;
setHasUnsavedChanges: (hasChanges: boolean) => void;
onRequestSaveBeforeClose: (callback: () => Promise<boolean> | boolean) => () => void;
onRequestCloseConfirm: (callback: () => void) => () => void;
sendCloseConfirmResponse: (choice: "save" | "discard" | "cancel") => void;
setLocale: (locale: string) => Promise<void>;
saveDiagnostic: (payload: {
error: string;
stack?: string;
projectState: unknown;
logs: string[];
}) => Promise<{ success: boolean; path?: string; canceled?: boolean; error?: string }>;
};
}
interface ProcessedDesktopSource {
id: string;
name: string;
display_id: string;
thumbnail: string | null;
appIcon: string | null;
}
interface CursorTelemetryPoint {
timeMs: number;
cx: number;
cy: number;
}
+181
@@ -0,0 +1,181 @@
import type {
GeneratedGuide,
GuideLanguage,
GuideSession,
GuideStepCandidate,
} from "../../../src/guide/contracts";
import { buildGuideDraftPrompt } from "../../../src/guide/promptBuilder";
import type { DeepSeekGuideConfigProvider } from "./deepseekSettingsStore";
export interface GuideDraftClient {
generate(input: {
session: GuideSession;
candidates: GuideStepCandidate[];
language: GuideLanguage;
}): Promise<GeneratedGuide>;
}
export class DeepSeekGuideClientError extends Error {
constructor(
readonly code: "guide-ai-key-missing" | "guide-ai-request-failed" | "guide-ai-invalid-output",
message: string,
readonly retryable = false,
) {
super(message);
this.name = "DeepSeekGuideClientError";
}
}
interface DeepSeekChatResponse {
choices?: Array<{
message?: {
content?: string;
};
}>;
}
export class DeepSeekGuideClient implements GuideDraftClient {
constructor(
private readonly configProvider?: DeepSeekGuideConfigProvider,
private readonly fallbackApiKey = process.env.DEEPSEEK_API_KEY,
private readonly fallbackBaseUrl = process.env.DEEPSEEK_BASE_URL ?? "https://api.deepseek.com",
private readonly fallbackModel = process.env.DEEPSEEK_MODEL ?? "deepseek-chat",
) {}
async generate(input: {
session: GuideSession;
candidates: GuideStepCandidate[];
language: GuideLanguage;
}): Promise<GeneratedGuide> {
const config = await this.resolveConfig();
if (!config.apiKey) {
throw new DeepSeekGuideClientError(
"guide-ai-key-missing",
"DeepSeek API key is not configured.",
);
}
let response: Response;
try {
response = await fetch(`${config.baseUrl.replace(/\/$/, "")}/chat/completions`, {
method: "POST",
headers: {
"content-type": "application/json",
authorization: `Bearer ${config.apiKey}`,
},
body: JSON.stringify({
model: config.model,
temperature: 0.2,
response_format: { type: "json_object" },
messages: [
{
role: "system",
content:
"You convert UI interaction telemetry into concise software user-guide steps.",
},
{
role: "user",
content: buildGuideDraftPrompt(input),
},
],
}),
});
} catch (error) {
throw new DeepSeekGuideClientError(
"guide-ai-request-failed",
`DeepSeek request failed: ${error instanceof Error ? error.message : String(error)}`,
true,
);
}
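if (!response.ok) {
// Only rate limiting and server-side failures are worth retrying; other 4xx responses will fail again.
const retryable = response.status === 429 || response.status >= 500;
throw new DeepSeekGuideClientError(
"guide-ai-request-failed",
`DeepSeek returned HTTP ${response.status}.`,
retryable,
);
}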
const payload = (await response.json()) as DeepSeekChatResponse;
const content = payload.choices?.[0]?.message?.content;
if (!content) {
throw new DeepSeekGuideClientError(
"guide-ai-invalid-output",
"DeepSeek returned an empty response.",
);
}
return parseGeneratedGuide(content);
}
private async resolveConfig(): Promise<{ apiKey?: string; baseUrl: string; model: string }> {
if (this.configProvider) {
return await this.configProvider.getDeepSeekConfig();
}
return {
apiKey: this.fallbackApiKey,
baseUrl: this.fallbackBaseUrl,
model: this.fallbackModel,
};
}
}
function parseGeneratedGuide(content: string): GeneratedGuide {
try {
const parsed = JSON.parse(stripCodeFence(content)) as unknown;
const normalized = normalizeGeneratedGuide(parsed);
if (!normalized) {
throw new Error("Unexpected guide JSON shape.");
}
return normalized;
} catch (error) {
throw new DeepSeekGuideClientError(
"guide-ai-invalid-output",
`DeepSeek response is not valid guide JSON: ${error instanceof Error ? error.message : String(error)}`,
);
}
}
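function stripCodeFence(content: string): string {
// Trim first: the patterns are anchored to the string start and end, so padding around the fence would stop them matching.
return content
.trim()
.replace(/^```(?:json)?\s*/i, "")
.replace(/\s*```$/i, "")
.trim();
}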
function normalizeGeneratedGuide(value: unknown): GeneratedGuide | null {
if (!value || typeof value !== "object") {
return null;
}
const guide = value as Partial<GeneratedGuide>;
if (typeof guide.title !== "string" || !Array.isArray(guide.steps)) {
return null;
}
const steps = guide.steps
.map((step, index) => {
if (!step || typeof step !== "object") {
return null;
}
const raw = step as Partial<GeneratedGuide["steps"][number]>;
if (typeof raw.title !== "string" || typeof raw.instruction !== "string") {
return null;
}
const order =
typeof raw.order === "number" && Number.isFinite(raw.order) ? raw.order : index + 1;
return {
id: typeof raw.id === "string" && raw.id.trim() ? raw.id : `guide-step-${order}`,
order,
title: raw.title,
instruction: raw.instruction,
...(typeof raw.screenshotPath === "string" ? { screenshotPath: raw.screenshotPath } : {}),
...(typeof raw.sourceCandidateId === "string"
? { sourceCandidateId: raw.sourceCandidateId }
: {}),
};
})
.filter((step): step is GeneratedGuide["steps"][number] => step !== null);
return {
title: guide.title,
summary: typeof guide.summary === "string" ? guide.summary : undefined,
steps,
};
}
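A minimal sketch of the step normalization above, assuming a simplified step shape (`SketchStep` and `normalizeSteps` are hypothetical names used only for illustration; the real `GeneratedGuide` type lives in `src/guide/contracts`). It illustrates one consequence of mapping before filtering: the fallback `order` comes from the step's original index, so a dropped step leaves a gap in the numbering.

```typescript
// Hypothetical, simplified step shape for illustration only.
interface SketchStep {
  id: string;
  order: number;
  title: string;
  instruction: string;
}

function normalizeSteps(rawSteps: unknown[]): SketchStep[] {
  return rawSteps
    .map((step, index): SketchStep | null => {
      if (!step || typeof step !== "object") {
        return null;
      }
      const raw = step as Partial<SketchStep>;
      // Steps without a title and an instruction are dropped.
      if (typeof raw.title !== "string" || typeof raw.instruction !== "string") {
        return null;
      }
      // The fallback order uses the original array index, not the surviving position.
      const order =
        typeof raw.order === "number" && Number.isFinite(raw.order) ? raw.order : index + 1;
      const id = typeof raw.id === "string" && raw.id.trim() ? raw.id : `guide-step-${order}`;
      return { id, order, title: raw.title, instruction: raw.instruction };
    })
    .filter((step): step is SketchStep => step !== null);
}

const normalizedSteps = normalizeSteps([
  { title: "Open settings", instruction: "Click Settings." },
  { title: "Missing instruction" },
  { title: "Save", instruction: "Click Save." },
]);

// prints [ 'guide-step-1:1', 'guide-step-3:3' ]
console.log(normalizedSteps.map((step) => `${step.id}:${step.order}`));
```

If contiguous numbering matters to a caller, renumbering after the filter would be one option; the code above does not do that.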
+157
@@ -0,0 +1,157 @@
import fs from "node:fs/promises";
import path from "node:path";
import type { GuideAiSettings, SaveGuideAiSettingsInput } from "../../../src/guide/contracts";
export interface DeepSeekGuideConfig {
apiKey?: string;
baseUrl: string;
model: string;
}
export interface DeepSeekGuideConfigProvider {
getDeepSeekConfig(): Promise<DeepSeekGuideConfig>;
}
interface PersistedGuideAiSettings {
schemaVersion: 1;
deepseek?: {
apiKeyEnvName?: string;
baseUrl?: string;
model?: string;
updatedAt?: string;
};
}
const DEFAULT_DEEPSEEK_API_KEY_ENV_NAME = "DEEPSEEK_API_KEY";
const DEFAULT_DEEPSEEK_BASE_URL = "https://api.deepseek.com";
const DEFAULT_DEEPSEEK_MODEL = "deepseek-chat";
export class DeepSeekSettingsStore implements DeepSeekGuideConfigProvider {
constructor(private readonly filePath: string) {}
async getStatus(): Promise<GuideAiSettings> {
const raw = await this.readSettings();
const apiKeyEnvName = normalizeEnvName(raw?.deepseek?.apiKeyEnvName);
const activeApiKey = process.env[apiKeyEnvName];
return {
deepseek: {
hasApiKey: Boolean(activeApiKey),
apiKeyEnvName,
baseUrl: normalizeBaseUrl(raw?.deepseek?.baseUrl ?? process.env.DEEPSEEK_BASE_URL),
model: normalizeModel(raw?.deepseek?.model ?? process.env.DEEPSEEK_MODEL),
storage: activeApiKey ? "environment" : "none",
encryptionAvailable: false,
updatedAt: raw?.deepseek?.updatedAt,
},
};
}
async save(input: SaveGuideAiSettingsInput): Promise<GuideAiSettings> {
const current = (await this.readSettings()) ?? { schemaVersion: 1 };
const currentDeepSeek = current.deepseek ?? {};
const nextDeepSeek = {
...currentDeepSeek,
baseUrl: normalizeBaseUrl(input.baseUrl ?? currentDeepSeek.baseUrl),
model: normalizeModel(input.model ?? currentDeepSeek.model),
updatedAt: new Date().toISOString(),
};
if (input.clearDeepseekApiKeyEnvName) {
delete nextDeepSeek.apiKeyEnvName;
} else if (input.deepseekApiKeyEnvName !== undefined) {
nextDeepSeek.apiKeyEnvName = normalizeEnvName(input.deepseekApiKeyEnvName);
}
await this.writeSettings({
schemaVersion: 1,
deepseek: nextDeepSeek,
});
return await this.getStatus();
}
async getDeepSeekConfig(): Promise<DeepSeekGuideConfig> {
const raw = await this.readSettings();
const apiKeyEnvName = normalizeEnvName(raw?.deepseek?.apiKeyEnvName);
return {
apiKey: process.env[apiKeyEnvName],
baseUrl: normalizeBaseUrl(raw?.deepseek?.baseUrl ?? process.env.DEEPSEEK_BASE_URL),
model: normalizeModel(raw?.deepseek?.model ?? process.env.DEEPSEEK_MODEL),
};
}
private async readSettings(): Promise<PersistedGuideAiSettings | null> {
try {
const content = await fs.readFile(this.filePath, "utf-8");
const parsed = JSON.parse(content) as unknown;
const normalized = normalizePersistedSettings(parsed);
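if (normalized && hasLegacyStoredSecret(parsed)) {
// Best-effort scrub of the legacy secret; a failed rewrite must not hide settings that were read successfully.
await this.writeSettings(normalized).catch(() => undefined);
}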
return normalized;
} catch {
return null;
}
}
private async writeSettings(settings: PersistedGuideAiSettings): Promise<void> {
await fs.mkdir(path.dirname(this.filePath), { recursive: true });
const tempPath = `${this.filePath}.${process.pid}.${Date.now()}.tmp`;
await fs.writeFile(tempPath, JSON.stringify(settings, null, 2), "utf-8");
await fs.rename(tempPath, this.filePath);
}
}
function hasLegacyStoredSecret(input: unknown): boolean {
return (
typeof input === "object" &&
input !== null &&
typeof (input as { deepseek?: { apiKey?: unknown } }).deepseek?.apiKey === "object"
);
}
function normalizePersistedSettings(input: unknown): PersistedGuideAiSettings | null {
if (!input || typeof input !== "object") {
return null;
}
const raw = input as Partial<PersistedGuideAiSettings>;
if (raw.schemaVersion !== 1) {
return null;
}
return {
schemaVersion: 1,
deepseek: {
apiKeyEnvName: normalizeEnvName(raw.deepseek?.apiKeyEnvName),
baseUrl: raw.deepseek?.baseUrl,
model: raw.deepseek?.model,
updatedAt: raw.deepseek?.updatedAt,
},
};
}
function normalizeEnvName(value: string | undefined): string {
const normalized = value?.trim();
if (!normalized) {
return DEFAULT_DEEPSEEK_API_KEY_ENV_NAME;
}
return /^[A-Za-z_][A-Za-z0-9_]*$/.test(normalized)
? normalized
: DEFAULT_DEEPSEEK_API_KEY_ENV_NAME;
}
function normalizeBaseUrl(value: string | undefined): string {
const candidate = value?.trim() || DEFAULT_DEEPSEEK_BASE_URL;
try {
const url = new URL(candidate);
if (url.protocol !== "https:" && url.protocol !== "http:") {
return DEFAULT_DEEPSEEK_BASE_URL;
}
return url.toString().replace(/\/$/, "");
} catch {
return DEFAULT_DEEPSEEK_BASE_URL;
}
}
function normalizeModel(value: string | undefined): string {
return value?.trim() || DEFAULT_DEEPSEEK_MODEL;
}
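A small sketch of how the normalizers above treat a few inputs, assuming standalone copies of the logic (`normalizeEnvNameSketch`, `normalizeBaseUrlSketch`, and the sample values are hypothetical and exist only for this illustration). Anything that is not a plain identifier or an http(s) URL falls back to the defaults.

```typescript
const FALLBACK_ENV_NAME = "DEEPSEEK_API_KEY";
const FALLBACK_BASE_URL = "https://api.deepseek.com";

function normalizeEnvNameSketch(value: string | undefined): string {
  const normalized = value?.trim();
  if (!normalized) {
    return FALLBACK_ENV_NAME;
  }
  // Only identifier-like names are accepted as environment variable names.
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(normalized) ? normalized : FALLBACK_ENV_NAME;
}

function normalizeBaseUrlSketch(value: string | undefined): string {
  const candidate = value?.trim() || FALLBACK_BASE_URL;
  try {
    const url = new URL(candidate);
    if (url.protocol !== "https:" && url.protocol !== "http:") {
      return FALLBACK_BASE_URL;
    }
    // Drop one trailing slash so callers can append "/chat/completions" directly.
    return url.toString().replace(/\/$/, "");
  } catch {
    return FALLBACK_BASE_URL;
  }
}

console.log(normalizeEnvNameSketch("  TEAM_DEEPSEEK_KEY ")); // prints TEAM_DEEPSEEK_KEY
console.log(normalizeEnvNameSketch("MY-KEY")); // prints DEEPSEEK_API_KEY
console.log(normalizeBaseUrlSketch("  https://example.test/v1/  ")); // prints https://example.test/v1
console.log(normalizeBaseUrlSketch("ftp://example.test")); // prints https://api.deepseek.com
console.log(normalizeBaseUrlSketch("not a url")); // prints https://api.deepseek.com
```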
+152
@@ -0,0 +1,152 @@
import type { IpcMain } from "electron";
import type {
AddGuideMarkerInput,
DiscardGuideSessionInput,
ExportGuideInput,
ExportGuideResult,
FinalizeGuideEventsInput,
GenerateGuideDraftInput,
GuideAiSettings,
GuideEvent,
GuideIpcResult,
GuideSession,
RunGuideOcrInput,
SaveGuideAiSettingsInput,
SaveGuideInput,
WriteGuideSnapshotInput,
} from "../../src/guide/contracts";
import type { DeepSeekSettingsStore } from "./ai/deepseekSettingsStore";
import { GuideStore, GuideStoreError } from "./guideStore";
export function registerGuideIpcHandlers(
ipcMain: IpcMain,
store: GuideStore,
aiSettingsStore?: DeepSeekSettingsStore,
): void {
ipcMain.handle(
"guide:start-session",
async (_, recordingId): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.startSession(recordingId));
},
);
ipcMain.handle(
"guide:read-session",
async (_, recordingId): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.readSession(recordingId));
},
);
ipcMain.handle(
"guide:add-marker",
async (
_,
input: AddGuideMarkerInput,
): Promise<GuideIpcResult<{ session: GuideSession; event: GuideEvent }>> => {
return await toGuideResult(() => store.addMarker(input));
},
);
ipcMain.handle(
"guide:finalize-events",
async (_, input: FinalizeGuideEventsInput): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.finalizeEvents(input));
},
);
ipcMain.handle(
"guide:write-snapshot",
async (_, input: WriteGuideSnapshotInput): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.writeSnapshot(input));
},
);
ipcMain.handle(
"guide:run-ocr",
async (_, input: RunGuideOcrInput): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.runOcr(input));
},
);
ipcMain.handle(
"guide:generate-draft",
async (_, input: GenerateGuideDraftInput): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.generateDraft(input));
},
);
ipcMain.handle("guide:get-ai-settings", async (): Promise<GuideIpcResult<GuideAiSettings>> => {
return await toGuideResult(() => requireAiSettingsStore(aiSettingsStore).getStatus());
});
ipcMain.handle(
"guide:save-ai-settings",
async (_, input: SaveGuideAiSettingsInput): Promise<GuideIpcResult<GuideAiSettings>> => {
return await toGuideResult(() => requireAiSettingsStore(aiSettingsStore).save(input));
},
);
ipcMain.handle(
"guide:save-guide",
async (_, input: SaveGuideInput): Promise<GuideIpcResult<GuideSession>> => {
return await toGuideResult(() => store.saveGuide(input));
},
);
ipcMain.handle(
"guide:export-markdown",
async (_, input: ExportGuideInput): Promise<GuideIpcResult<ExportGuideResult>> => {
return await toGuideResult(() => store.exportMarkdown(input));
},
);
ipcMain.handle(
"guide:export-html",
async (_, input: ExportGuideInput): Promise<GuideIpcResult<ExportGuideResult>> => {
return await toGuideResult(() => store.exportHtml(input));
},
);
ipcMain.handle(
"guide:discard-session",
async (_, input: DiscardGuideSessionInput): Promise<GuideIpcResult<{ discarded: true }>> => {
return await toGuideResult(async () => {
await store.discardSession(input);
return { discarded: true };
});
},
);
}
function requireAiSettingsStore(store: DeepSeekSettingsStore | undefined): DeepSeekSettingsStore {
if (!store) {
throw new GuideStoreError("guide-internal-error", "Guide AI settings store is unavailable.");
}
return store;
}
async function toGuideResult<TData>(action: () => Promise<TData>): Promise<GuideIpcResult<TData>> {
try {
return {
success: true,
data: await action(),
};
} catch (error) {
if (error instanceof GuideStoreError) {
return {
success: false,
code: error.code,
error: error.message,
retryable: error.retryable,
};
}
console.error("Guide IPC failed:", error);
return {
success: false,
code: "guide-internal-error",
error: error instanceof Error ? error.message : String(error),
retryable: false,
};
}
}
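A possible way for a caller to consume the envelope that `toGuideResult` produces, sketched under assumptions: `SketchIpcResult` mirrors the fields returned above but is not the real `GuideIpcResult` from `src/guide/contracts`, and `unwrapGuideResult` is a hypothetical helper, not part of this change.

```typescript
// Assumed envelope shape, inferred from the fields toGuideResult returns.
type SketchIpcResult<TData> =
  | { success: true; data: TData }
  | { success: false; code: string; error: string; retryable: boolean };

// Hypothetical helper: returns the payload or throws an Error carrying code and retryable.
function unwrapGuideResult<TData>(result: SketchIpcResult<TData>): TData {
  if (result.success) {
    return result.data;
  }
  throw Object.assign(new Error(result.error), {
    code: result.code,
    retryable: result.retryable,
  });
}

const discardResult = unwrapGuideResult({ success: true, data: { discarded: true } });
console.log(discardResult.discarded); // prints true

let caughtCode = "";
let caughtRetryable = true;
try {
  unwrapGuideResult<number>({
    success: false,
    code: "guide-invalid-input",
    error: "Guide marker is missing recordingId.",
    retryable: false,
  });
} catch (error) {
  const failure = error as { code?: string; retryable?: boolean };
  caughtCode = failure.code ?? "";
  caughtRetryable = failure.retryable ?? true;
}
console.log(caughtCode, caughtRetryable); // prints guide-invalid-input false
```

Keeping failures as data across the IPC boundary and converting them to exceptions only at the call site lets the `retryable` flag reach the UI instead of being lost in a rejected promise.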
+57
@@ -0,0 +1,57 @@
import path from "node:path";
import type { GuideRecordingIdInput } from "../../src/guide/contracts";
export const GUIDE_SESSION_SUFFIX = ".guide.json";
export const GUIDE_OUTPUT_DIR_SUFFIX = "-guide";
export interface GuidePaths {
recordingId: string;
baseName: string;
baseDir: string;
guidePath: string;
outputDir: string;
}
export function normalizeGuideRecordingId(recordingId: GuideRecordingIdInput): string | null {
if (typeof recordingId === "number") {
return Number.isFinite(recordingId) ? String(Math.trunc(recordingId)) : null;
}
if (typeof recordingId !== "string") {
return null;
}
const trimmed = recordingId.trim();
return trimmed.length > 0 ? trimmed : null;
}
export function resolveGuidePaths(input: {
recordingsDir: string;
recordingId: GuideRecordingIdInput;
videoPath?: string | null;
}): GuidePaths | null {
const recordingId = normalizeGuideRecordingId(input.recordingId);
if (!recordingId) {
return null;
}
const normalizedVideoPath =
typeof input.videoPath === "string" && input.videoPath.trim()
? path.resolve(input.videoPath.trim())
: null;
const parsedVideoPath = normalizedVideoPath ? path.parse(normalizedVideoPath) : null;
const baseName = parsedVideoPath?.name ?? defaultGuideBaseName(recordingId);
const baseDir = parsedVideoPath?.dir ?? path.resolve(input.recordingsDir);
return {
recordingId,
baseName,
baseDir,
guidePath: path.join(baseDir, `${baseName}${GUIDE_SESSION_SUFFIX}`),
outputDir: path.join(baseDir, `${baseName}${GUIDE_OUTPUT_DIR_SUFFIX}`),
};
}
function defaultGuideBaseName(recordingId: string): string {
return recordingId.startsWith("recording-") ? recordingId : `recording-${recordingId}`;
}
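A short sketch of the naming convention implemented above, limited to the default case where no `videoPath` is supplied. `normalizeRecordingIdSketch` and `artifactNamesSketch` are hypothetical standalone copies for illustration; directory handling through `node:path` is deliberately left out.

```typescript
// Suffixes copied from the constants above.
const SESSION_SUFFIX = ".guide.json";
const OUTPUT_DIR_SUFFIX = "-guide";

function normalizeRecordingIdSketch(recordingId: unknown): string | null {
  if (typeof recordingId === "number") {
    // Non-finite numbers are rejected; fractional ids are truncated toward zero.
    return Number.isFinite(recordingId) ? String(Math.trunc(recordingId)) : null;
  }
  if (typeof recordingId !== "string") {
    return null;
  }
  const trimmed = recordingId.trim();
  return trimmed.length > 0 ? trimmed : null;
}

function artifactNamesSketch(
  recordingId: unknown,
): { sessionFile: string; outputDir: string } | null {
  const id = normalizeRecordingIdSketch(recordingId);
  if (!id) {
    return null;
  }
  // Ids that already carry the "recording-" prefix are not prefixed twice.
  const baseName = id.startsWith("recording-") ? id : `recording-${id}`;
  return {
    sessionFile: `${baseName}${SESSION_SUFFIX}`,
    outputDir: `${baseName}${OUTPUT_DIR_SUFFIX}`,
  };
}

console.log(artifactNamesSketch(123.9)?.sessionFile); // prints recording-123.guide.json
console.log(artifactNamesSketch(" recording-7 ")?.outputDir); // prints recording-7-guide
console.log(artifactNamesSketch(Number.NaN)); // prints null
```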
+233
@@ -0,0 +1,233 @@
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it } from "vitest";
import { GuideStore, GuideStoreError } from "./guideStore";
let recordingsDir = "";
beforeEach(async () => {
recordingsDir = await fs.mkdtemp(path.join(os.tmpdir(), "openscreen-guide-"));
});
afterEach(async () => {
if (recordingsDir) {
await fs.rm(recordingsDir, { recursive: true, force: true });
}
});
describe("GuideStore", () => {
it("creates and reads an empty guide session", async () => {
const store = new GuideStore(recordingsDir);
const session = await store.startSession(123);
const readSession = await store.readSession(123);
expect(session.recordingId).toBe("123");
expect(session.status).toBe("recording");
expect(session.guidePath).toBe(path.join(recordingsDir, "recording-123.guide.json"));
expect(readSession).toEqual(session);
await expect(fs.stat(session.outputDir)).resolves.toMatchObject({
isDirectory: expect.any(Function),
});
});
it("adds marker events in timeline order", async () => {
const store = new GuideStore(recordingsDir);
await store.startSession(456);
await store.addMarker({ recordingId: 456, kind: "manual", timeMs: 2000, label: "Later" });
const result = await store.addMarker({
recordingId: 456,
kind: "hotkey",
timeMs: 500,
label: "First",
});
expect(result.event.kind).toBe("hotkey");
expect(result.session.events.map((event) => event.timeMs)).toEqual([500, 2000]);
expect(result.session.events[0]?.source).toBe("guide-hotkey");
expect(result.session.events[1]?.source).toBe("review-ui");
});
it("finalizes a session against the saved video path", async () => {
const store = new GuideStore(recordingsDir);
await store.startSession(789);
const videoPath = path.join(recordingsDir, "recording-789.mp4");
await fs.writeFile(videoPath, "");
const session = await store.finalizeEvents({ recordingId: 789, videoPath });
expect(session.status).toBe("events-ready");
expect(session.videoPath).toBe(videoPath);
expect(session.guidePath).toBe(path.join(recordingsDir, "recording-789.guide.json"));
});
it("adds cursor click events when finalizing a session", async () => {
const store = new GuideStore(recordingsDir);
await store.startSession(790);
await store.addMarker({ recordingId: 790, kind: "manual", timeMs: 250, label: "Manual" });
const videoPath = path.join(recordingsDir, "recording-790.mp4");
await fs.writeFile(videoPath, "");
await fs.writeFile(
`${videoPath}.cursor.json`,
JSON.stringify({
version: 2,
provider: "native",
assets: [],
samples: [
{ timeMs: 100, cx: 0.2, cy: 0.3, interactionType: "move" },
{ timeMs: 200, cx: 0.4, cy: 0.5, interactionType: "click" },
{ timeMs: 225, cx: 0.401, cy: 0.501, interactionType: "click" },
],
}),
"utf-8",
);
const session = await store.finalizeEvents({ recordingId: 790, videoPath });
expect(session.cursorPath).toBe(`${videoPath}.cursor.json`);
expect(session.events.map((event) => event.kind)).toEqual(["click", "manual"]);
expect(session.events[0]).toMatchObject({
timeMs: 200,
normalizedX: 0.4,
normalizedY: 0.5,
});
});
it("rejects guide artifacts outside the recordings directory", async () => {
const store = new GuideStore(recordingsDir);
await store.startSession(321);
const outsideVideoPath = path.join(path.dirname(recordingsDir), "outside.mp4");
await expect(
store.finalizeEvents({ recordingId: 321, videoPath: outsideVideoPath }),
).rejects.toMatchObject({
code: "guide-invalid-input",
});
});
it("rejects invalid guide session schema", async () => {
const store = new GuideStore(recordingsDir);
await fs.writeFile(
path.join(recordingsDir, "recording-bad.guide.json"),
JSON.stringify({ schemaVersion: 999 }),
"utf-8",
);
await expect(store.readSession("bad")).rejects.toBeInstanceOf(GuideStoreError);
await expect(store.readSession("bad")).rejects.toMatchObject({
code: "guide-invalid-schema",
});
});
it("saves a reviewed generated guide", async () => {
const store = new GuideStore(recordingsDir);
await store.startSession(654);
const session = await store.saveGuide({
recordingId: 654,
generatedGuide: {
title: "Huong dan thao tac",
steps: [
{
id: "step-1",
order: 1,
title: "Mo cai dat",
instruction: "Nhan nut Settings.",
},
],
},
});
expect(session.status).toBe("reviewed");
expect(session.generatedGuide?.steps).toHaveLength(1);
});
it("writes snapshots and builds candidates without OCR", async () => {
const store = new GuideStore(recordingsDir);
await store.startSession(112);
await store.addMarker({ recordingId: 112, kind: "manual", timeMs: 500, label: "Save" });
const videoPath = path.join(recordingsDir, "recording-112.mp4");
await fs.writeFile(videoPath, "");
const eventsSession = await store.finalizeEvents({ recordingId: 112, videoPath });
const session = await store.writeSnapshot({
recordingId: 112,
eventId: eventsSession.events[0]?.id ?? "",
timeMs: 1000,
offsetMs: 500,
width: 800,
height: 600,
pngBytes: new Uint8Array([137, 80, 78, 71]).buffer,
});
expect(session.status).toBe("snapshots-ready");
expect(session.snapshots).toHaveLength(1);
expect(session.candidates[0]).toMatchObject({ targetText: "Save" });
await expect(fs.readFile(session.snapshots[0]?.path ?? "")).resolves.toEqual(
Buffer.from([137, 80, 78, 71]),
);
});
it("runs OCR, generates a local draft, and exports files", async () => {
const store = new GuideStore(recordingsDir, {
ocrClient: {
recognize: async (snapshot) => [
{
id: `ocr-${snapshot.id}-1`,
snapshotId: snapshot.id,
text: "Save",
confidence: 0.95,
box: { x: 0.45, y: 0.45, width: 0.15, height: 0.08 },
},
],
},
});
await store.startSession(113);
const videoPath = path.join(recordingsDir, "recording-113.mp4");
await fs.writeFile(videoPath, "");
await fs.writeFile(
`${videoPath}.cursor.json`,
JSON.stringify({
samples: [{ timeMs: 200, cx: 0.5, cy: 0.5, interactionType: "click" }],
}),
"utf-8",
);
const eventsSession = await store.finalizeEvents({ recordingId: 113, videoPath });
await store.writeSnapshot({
recordingId: 113,
eventId: eventsSession.events[0]?.id ?? "",
timeMs: 700,
offsetMs: 500,
width: 800,
height: 600,
pngBytes: new Uint8Array([1, 2, 3]).buffer,
});
const ocrSession = await store.runOcr({ recordingId: 113 });
const draftSession = await store.generateDraft({
recordingId: 113,
language: "en",
provider: "local",
});
const markdown = await store.exportMarkdown({ recordingId: 113 });
const html = await store.exportHtml({ recordingId: 113 });
expect(ocrSession.candidates[0]).toMatchObject({ targetText: "Save" });
expect(draftSession.generatedGuide?.steps[0]?.instruction).toBe('Click "Save".');
await expect(fs.readFile(markdown.path, "utf-8")).resolves.toContain("# User guide");
await expect(fs.readFile(html.path, "utf-8")).resolves.toContain("<!doctype html>");
});
it("discards a guide session and output directory", async () => {
const store = new GuideStore(recordingsDir);
const session = await store.startSession(111);
await fs.writeFile(path.join(session.outputDir, "step-001.png"), "");
await store.discardSession({ recordingId: 111 });
await expect(fs.stat(session.guidePath)).rejects.toMatchObject({ code: "ENOENT" });
await expect(fs.stat(session.outputDir)).rejects.toMatchObject({ code: "ENOENT" });
});
});
+824
@@ -0,0 +1,824 @@
import { randomUUID } from "node:crypto";
import fs from "node:fs/promises";
import path from "node:path";
import {
type AddGuideMarkerInput,
type DiscardGuideSessionInput,
type ExportGuideInput,
type ExportGuideResult,
type FinalizeGuideEventsInput,
type GeneratedGuide,
type GeneratedGuideStep,
type GenerateGuideDraftInput,
GUIDE_SCHEMA_VERSION,
type GuideErrorCode,
type GuideEvent,
type GuideEventKind,
type GuideEventSource,
type GuideSession,
type GuideSessionStatus,
type GuideSnapshot,
type GuideStepCandidate,
type OcrBlock,
type RunGuideOcrInput,
type SaveGuideInput,
type WriteGuideSnapshotInput,
} from "../../src/guide/contracts";
import { buildGuideEventsFromCursor, mergeGuideEvents } from "../../src/guide/eventBuilder";
import { exportGuideToHtml, exportGuideToMarkdown } from "../../src/guide/exporters";
import { buildLocalGuideDraft } from "../../src/guide/promptBuilder";
import { buildGuideStepCandidates } from "../../src/guide/targetMapper";
import type { CursorRecordingSample } from "../../src/native/contracts";
import {
DeepSeekGuideClient,
DeepSeekGuideClientError,
type GuideDraftClient,
} from "./ai/deepseekGuideClient";
import type { DeepSeekGuideConfigProvider } from "./ai/deepseekSettingsStore";
import { type GuidePaths, normalizeGuideRecordingId, resolveGuidePaths } from "./guidePaths";
import { createFocusedOcrSnapshot, remapFocusedOcrBlocks } from "./ocr/focusedOcrSnapshot";
import { DefaultGuideOcrClient, type GuideOcrClient } from "./ocr/paddleOcrClient";
const VALID_SESSION_STATUSES = new Set<GuideSessionStatus>([
"recording",
"events-ready",
"snapshots-ready",
"ocr-ready",
"draft-ready",
"reviewed",
]);
const VALID_EVENT_KINDS = new Set<GuideEventKind>(["click", "hotkey", "manual"]);
const VALID_EVENT_SOURCES = new Set<GuideEventSource>([
"cursor-recording",
"guide-hotkey",
"review-ui",
]);
export class GuideStoreError extends Error {
constructor(
readonly code: GuideErrorCode,
message: string,
readonly retryable = false,
) {
super(message);
this.name = "GuideStoreError";
}
}
export interface GuideStoreDependencies {
ocrClient?: GuideOcrClient;
draftClient?: GuideDraftClient;
deepSeekConfigProvider?: DeepSeekGuideConfigProvider;
focusOcrSnapshots?: boolean;
}
export class GuideStore {
constructor(
private readonly recordingsDir: string,
private readonly dependencies: GuideStoreDependencies = {},
) {}
async startSession(recordingIdInput: AddGuideMarkerInput["recordingId"]): Promise<GuideSession> {
const paths = this.requireGuidePaths(recordingIdInput);
const now = new Date().toISOString();
const session: GuideSession = {
schemaVersion: GUIDE_SCHEMA_VERSION,
recordingId: paths.recordingId,
videoPath: "",
guidePath: paths.guidePath,
outputDir: paths.outputDir,
status: "recording",
events: [],
snapshots: [],
ocrBlocks: [],
candidates: [],
createdAt: now,
updatedAt: now,
};
await this.writeSession(session);
return session;
}
async readSession(recordingIdInput: AddGuideMarkerInput["recordingId"]): Promise<GuideSession> {
const paths = this.requireGuidePaths(recordingIdInput);
return await this.readSessionAtPath(paths.guidePath);
}
async addMarker(
input: AddGuideMarkerInput,
): Promise<{ session: GuideSession; event: GuideEvent }> {
const recordingId = normalizeGuideRecordingId(input.recordingId);
if (!recordingId) {
throw new GuideStoreError("guide-invalid-input", "Guide marker is missing recordingId.");
}
if (input.kind !== "hotkey" && input.kind !== "manual") {
throw new GuideStoreError("guide-invalid-input", "Guide marker kind is invalid.");
}
if (!Number.isFinite(input.timeMs) || input.timeMs < 0) {
throw new GuideStoreError(
"guide-invalid-input",
"Guide marker timeMs must be a finite, non-negative number.",
);
}
const session = await this.readSession(recordingId);
const event: GuideEvent = {
id: `guide-event-${randomUUID()}`,
recordingId,
kind: input.kind,
source: input.kind === "hotkey" ? "guide-hotkey" : "review-ui",
timeMs: Math.max(0, input.timeMs),
label: normalizeOptionalString(input.label),
screenshotOffsetMs: 500,
createdAt: new Date().toISOString(),
};
const updatedSession = touchSession({
...session,
events: sortGuideEvents([...session.events, event]),
});
await this.writeSession(updatedSession);
return { session: updatedSession, event };
}
async finalizeEvents(input: FinalizeGuideEventsInput): Promise<GuideSession> {
const recordingId = normalizeGuideRecordingId(input.recordingId);
if (!recordingId) {
throw new GuideStoreError(
"guide-invalid-input",
"Guide finalization is missing recordingId.",
);
}
if (typeof input.videoPath !== "string" || input.videoPath.trim().length === 0) {
throw new GuideStoreError("guide-invalid-input", "Guide finalization is missing videoPath.");
}
const videoPath = path.resolve(input.videoPath);
const currentSession = await this.readSession(recordingId);
const nextPaths = this.requireGuidePaths(recordingId, videoPath);
const cursorPath = await this.resolveCursorPath(videoPath, input.cursorPath);
const cursorEvents = cursorPath
? await this.readCursorGuideEvents(recordingId, cursorPath)
: [];
const manualEvents = currentSession.events.filter(
(event) => event.source !== "cursor-recording",
);
const updatedSession = touchSession({
...currentSession,
videoPath,
cursorPath,
guidePath: nextPaths.guidePath,
outputDir: nextPaths.outputDir,
status: "events-ready",
events: mergeGuideEvents([...cursorEvents, ...manualEvents]),
});
await this.writeSession(updatedSession);
if (path.resolve(currentSession.guidePath) !== path.resolve(updatedSession.guidePath)) {
await fs.unlink(currentSession.guidePath).catch(() => undefined);
}
return updatedSession;
}
async writeSnapshot(input: WriteGuideSnapshotInput): Promise<GuideSession> {
const recordingId = normalizeGuideRecordingId(input.recordingId);
if (!recordingId) {
throw new GuideStoreError("guide-invalid-input", "Snapshot write is missing recordingId.");
}
if (!input.eventId || !Number.isFinite(input.timeMs) || input.timeMs < 0) {
throw new GuideStoreError("guide-invalid-input", "Snapshot metadata is invalid.");
}
if (!input.pngBytes || input.pngBytes.byteLength === 0) {
throw new GuideStoreError("guide-invalid-input", "Snapshot PNG data is empty.");
}
if (
!Number.isFinite(input.width) ||
input.width <= 0 ||
!Number.isFinite(input.height) ||
input.height <= 0
) {
throw new GuideStoreError("guide-invalid-input", "Snapshot dimensions are invalid.");
}
const session = await this.readSession(recordingId);
const eventIndex = session.events.findIndex((event) => event.id === input.eventId);
if (eventIndex === -1) {
throw new GuideStoreError("guide-invalid-input", "Snapshot event does not exist.");
}
this.assertGuidePathIsAllowed(session.outputDir);
await fs.mkdir(session.outputDir, { recursive: true });
const fileName = `step-${String(eventIndex + 1).padStart(3, "0")}.png`;
const snapshotPath = path.join(session.outputDir, fileName);
this.assertGuidePathIsAllowed(snapshotPath);
await fs.writeFile(snapshotPath, Buffer.from(new Uint8Array(input.pngBytes)));
const snapshot: GuideSnapshot = {
id: `snapshot-${input.eventId}`,
eventId: input.eventId,
timeMs: Math.max(0, input.timeMs),
offsetMs: input.offsetMs,
path: snapshotPath,
width: Math.round(input.width),
height: Math.round(input.height),
};
const updatedSnapshots = [
...session.snapshots.filter((existing) => existing.eventId !== input.eventId),
snapshot,
].sort((left, right) => left.timeMs - right.timeMs);
// OCR blocks for the replaced snapshot describe the old image, so drop them once and reuse the result.
const remainingOcrBlocks = session.ocrBlocks.filter(
(block) => block.snapshotId !== snapshot.id,
);
const updatedSession = touchSession({
...session,
status: "snapshots-ready",
snapshots: updatedSnapshots,
ocrBlocks: remainingOcrBlocks,
candidates: buildGuideStepCandidates({
...session,
snapshots: updatedSnapshots,
ocrBlocks: remainingOcrBlocks,
}),
generatedGuide: undefined,
});
await this.writeSession(updatedSession);
return updatedSession;
}
async runOcr(input: RunGuideOcrInput): Promise<GuideSession> {
const session = await this.readSession(input.recordingId);
const requestedIds = new Set(input.snapshotIds ?? []);
const snapshots =
requestedIds.size > 0
? session.snapshots.filter((snapshot) => requestedIds.has(snapshot.id))
: session.snapshots;
if (snapshots.length === 0) {
throw new GuideStoreError("guide-invalid-input", "No guide snapshots are available for OCR.");
}
const ocrClient = this.dependencies.ocrClient ?? new DefaultGuideOcrClient();
const shouldFocusOcrSnapshots =
this.dependencies.focusOcrSnapshots ?? this.dependencies.ocrClient === undefined;
const eventsById = new Map(session.events.map((event) => [event.id, event]));
const blocks: OcrBlock[] = [];
try {
for (const snapshot of snapshots) {
const focusedSnapshot = shouldFocusOcrSnapshots
? await createFocusedOcrSnapshot({
snapshot,
event: eventsById.get(snapshot.eventId),
outputDir: session.outputDir,
})
: { snapshot };
const recognizedBlocks = await ocrClient.recognize(focusedSnapshot.snapshot);
blocks.push(...remapFocusedOcrBlocks(recognizedBlocks, focusedSnapshot.transform));
}
} catch (error) {
throw new GuideStoreError(
"guide-ocr-unavailable",
error instanceof Error ? error.message : "OCR failed.",
true,
);
}
const snapshotIds = new Set(snapshots.map((snapshot) => snapshot.id));
const updatedOcrBlocks = [
...session.ocrBlocks.filter((block) => !snapshotIds.has(block.snapshotId)),
...blocks,
];
const draftSession = {
...session,
ocrBlocks: updatedOcrBlocks,
};
const updatedSession = touchSession({
...draftSession,
status: "ocr-ready",
candidates: buildGuideStepCandidates(draftSession),
generatedGuide: undefined,
});
await this.writeSession(updatedSession);
return updatedSession;
}
async generateDraft(input: GenerateGuideDraftInput): Promise<GuideSession> {
const session = await this.readSession(input.recordingId);
const candidates =
session.candidates.length > 0 ? session.candidates : buildGuideStepCandidates(session);
if (candidates.length === 0) {
throw new GuideStoreError(
"guide-invalid-input",
"No guide events are available for drafting.",
);
}
let generatedGuide: GeneratedGuide;
if (input.provider === "local") {
generatedGuide = buildLocalGuideDraft(session, candidates, input.language);
} else {
const draftClient =
this.dependencies.draftClient ??
new DeepSeekGuideClient(this.dependencies.deepSeekConfigProvider);
try {
generatedGuide = await draftClient.generate({
session,
candidates,
language: input.language,
});
} catch (error) {
if (error instanceof DeepSeekGuideClientError) {
throw new GuideStoreError(error.code, error.message, error.retryable);
}
throw new GuideStoreError(
"guide-ai-request-failed",
error instanceof Error ? error.message : "Guide draft generation failed.",
true,
);
}
}
const updatedSession = touchSession({
...session,
candidates,
generatedGuide: normalizeGeneratedGuide(generatedGuide) ?? generatedGuide,
status: "draft-ready",
});
await this.writeSession(updatedSession);
return updatedSession;
}
async saveGuide(input: SaveGuideInput): Promise<GuideSession> {
const session = await this.readSession(input.recordingId);
const generatedGuide = normalizeGeneratedGuide(input.generatedGuide);
if (!generatedGuide) {
throw new GuideStoreError("guide-invalid-input", "Generated guide shape is invalid.");
}
const updatedSession = touchSession({
...session,
generatedGuide,
status: "reviewed",
});
await this.writeSession(updatedSession);
return updatedSession;
}
async exportMarkdown(input: ExportGuideInput): Promise<ExportGuideResult> {
const session = await this.readSession(input.recordingId);
return await this.writeGuideExport(session, "guide.md", () => exportGuideToMarkdown(session));
}
async exportHtml(input: ExportGuideInput): Promise<ExportGuideResult> {
const session = await this.readSession(input.recordingId);
return await this.writeGuideExport(session, "guide.html", () => exportGuideToHtml(session));
}
async discardSession(input: DiscardGuideSessionInput): Promise<void> {
const paths = this.requireGuidePaths(input.recordingId);
const session = await this.readSession(input.recordingId).catch(() => null);
const guidePath = session?.guidePath ?? paths.guidePath;
const outputDir = session?.outputDir ?? paths.outputDir;
this.assertGuidePathIsAllowed(guidePath);
this.assertGuidePathIsAllowed(outputDir);
await fs.unlink(guidePath).catch(() => undefined);
await fs.rm(outputDir, { recursive: true, force: true });
}
private async writeGuideExport(
session: GuideSession,
fileName: string,
renderContent: () => string,
): Promise<ExportGuideResult> {
if (!session.generatedGuide) {
throw new GuideStoreError("guide-invalid-input", "Generate a guide draft before exporting.");
}
const exportPath = path.join(session.outputDir, fileName);
this.assertGuidePathIsAllowed(exportPath);
try {
await fs.mkdir(session.outputDir, { recursive: true });
await fs.writeFile(exportPath, renderContent(), "utf-8");
} catch (error) {
throw new GuideStoreError(
"guide-export-failed",
error instanceof Error ? error.message : "Guide export failed.",
true,
);
}
return { path: exportPath, session };
}
async writeSession(session: GuideSession): Promise<void> {
const normalized = normalizeGuideSession(session);
if (!normalized) {
throw new GuideStoreError("guide-invalid-schema", "Guide session schema is invalid.");
}
this.assertGuidePathIsAllowed(normalized.guidePath);
this.assertGuidePathIsAllowed(normalized.outputDir);
await fs.mkdir(path.dirname(normalized.guidePath), { recursive: true });
await fs.mkdir(normalized.outputDir, { recursive: true });
await atomicWriteJson(normalized.guidePath, normalized);
}
private async readSessionAtPath(guidePath: string): Promise<GuideSession> {
this.assertGuidePathIsAllowed(guidePath);
try {
const content = await fs.readFile(guidePath, "utf-8");
const session = normalizeGuideSession(JSON.parse(content));
if (!session) {
throw new GuideStoreError("guide-invalid-schema", "Guide session schema is invalid.");
}
return session;
} catch (error) {
if (error instanceof GuideStoreError) {
throw error;
}
const nodeError = error as NodeJS.ErrnoException;
if (nodeError.code === "ENOENT") {
throw new GuideStoreError("guide-session-not-found", "Guide session was not found.");
}
throw error;
}
}
private requireGuidePaths(
recordingIdInput: AddGuideMarkerInput["recordingId"],
videoPath?: string | null,
): GuidePaths {
const paths = resolveGuidePaths({
recordingsDir: this.recordingsDir,
recordingId: recordingIdInput,
videoPath,
});
if (!paths) {
throw new GuideStoreError("guide-invalid-input", "Guide recordingId is invalid.");
}
this.assertGuidePathIsAllowed(paths.guidePath);
this.assertGuidePathIsAllowed(paths.outputDir);
return paths;
}
private assertGuidePathIsAllowed(targetPath: string): void {
if (this.isPathAllowed(targetPath)) {
return;
}
throw new GuideStoreError(
"guide-invalid-input",
"Guide artifacts must be stored inside the recordings directory.",
);
}
private async resolveCursorPath(
videoPath: string,
explicitCursorPath?: string,
): Promise<string | undefined> {
const candidates = [
normalizeOptionalString(explicitCursorPath),
`${videoPath}.cursor.json`,
].filter((candidate): candidate is string => Boolean(candidate));
for (const candidate of candidates) {
const resolvedCandidate = path.resolve(candidate);
if (!this.isPathAllowed(resolvedCandidate)) {
continue;
}
try {
await fs.access(resolvedCandidate);
return resolvedCandidate;
} catch {
// Cursor telemetry is optional for guide sessions.
}
}
return undefined;
}
private async readCursorGuideEvents(
recordingId: string,
cursorPath: string,
): Promise<GuideEvent[]> {
try {
const content = await fs.readFile(cursorPath, "utf-8");
const parsed = JSON.parse(content) as unknown;
const rawSamples =
isRecord(parsed) && Array.isArray(parsed.samples) ? parsed.samples : parsed;
const samples = Array.isArray(rawSamples)
? rawSamples
.map(normalizeCursorSampleForGuide)
.filter((sample): sample is CursorRecordingSample => sample !== null)
: [];
return buildGuideEventsFromCursor({ recordingId, samples });
} catch (error) {
console.warn("Failed to read cursor telemetry for guide events:", error);
return [];
}
}
private isPathAllowed(targetPath: string): boolean {
const resolvedTarget = path.resolve(targetPath);
const resolvedRecordingsDir = path.resolve(this.recordingsDir);
const relative = path.relative(resolvedRecordingsDir, resolvedTarget);
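// Only ".." itself or a "../" prefix escapes the directory; a child named "..foo" does not.
return (
relative === "" ||
(relative !== ".." && !relative.startsWith(`..${path.sep}`) && !path.isAbsolute(relative))
);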
}
}
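The snapshot bookkeeping in the store above follows an upsert pattern: entries recorded for the same event are dropped, the new entry is appended, and the list is re-sorted by time. A minimal standalone sketch of that pattern, assuming a simplified `Snap` shape and an `upsertSnapshot` helper that are both hypothetical and not part of the store:

```typescript
// Hypothetical, simplified stand-in for GuideSnapshot; only the fields the pattern needs.
interface Snap {
  id: string;
  eventId: string;
  timeMs: number;
}

// Replace any snapshot recorded for the same event, then keep the list ordered by time.
// The input array is left untouched; a new array is returned.
function upsertSnapshot(existing: Snap[], next: Snap): Snap[] {
  return [...existing.filter((snap) => snap.eventId !== next.eventId), next].sort(
    (left, right) => left.timeMs - right.timeMs,
  );
}

const before: Snap[] = [
  { id: "snapshot-a", eventId: "a", timeMs: 100 },
  { id: "snapshot-b", eventId: "b", timeMs: 300 },
];
const after = upsertSnapshot(before, { id: "snapshot-a", eventId: "a", timeMs: 500 });
console.log(after.map((snap) => `${snap.eventId}@${snap.timeMs}`).join(", "));
// prints "b@300, a@500"
```

Keying on `eventId` rather than on the snapshot `id` means a re-captured frame for the same event always replaces the earlier one, whatever its timestamp.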
function touchSession(session: GuideSession): GuideSession {
return {
...session,
updatedAt: new Date().toISOString(),
};
}
function sortGuideEvents(events: GuideEvent[]): GuideEvent[] {
return [...events].sort((left, right) => left.timeMs - right.timeMs);
}
function normalizeCursorSampleForGuide(input: unknown): CursorRecordingSample | null {
if (!isRecord(input)) {
return null;
}
const interactionType =
input.interactionType === "click" ||
input.interactionType === "mouseup" ||
input.interactionType === "move"
? input.interactionType
: "move";
const timeMs = normalizeNonNegativeNumber(input.timeMs);
const cx = normalizeOptionalNumber(input.cx);
const cy = normalizeOptionalNumber(input.cy);
if (timeMs === null || cx === undefined || cy === undefined) {
return null;
}
return {
timeMs,
cx,
cy,
interactionType,
};
}
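async function atomicWriteJson(filePath: string, value: unknown): Promise<void> {
const tempPath = `${filePath}.${process.pid}.${Date.now()}.tmp`;
try {
await fs.writeFile(tempPath, JSON.stringify(value, null, 2), "utf-8");
await fs.rename(tempPath, filePath);
} catch (error) {
await fs.rm(tempPath, { force: true }).catch(() => undefined); // best-effort temp cleanup
throw error;
}
}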
function normalizeGuideSession(input: unknown): GuideSession | null {
if (!isRecord(input) || input.schemaVersion !== GUIDE_SCHEMA_VERSION) {
return null;
}
const recordingId = normalizeString(input.recordingId);
const videoPath = normalizeString(input.videoPath);
const guidePath = normalizeString(input.guidePath);
const outputDir = normalizeString(input.outputDir);
const status = normalizeSessionStatus(input.status);
const createdAt = normalizeString(input.createdAt);
const updatedAt = normalizeString(input.updatedAt);
if (
!recordingId ||
videoPath === null ||
!guidePath ||
!outputDir ||
!status ||
!createdAt ||
!updatedAt
) {
return null;
}
const generatedGuide =
input.generatedGuide === undefined ? undefined : normalizeGeneratedGuide(input.generatedGuide);
if (generatedGuide === null) {
return null;
}
return {
schemaVersion: GUIDE_SCHEMA_VERSION,
recordingId,
videoPath,
cursorPath: normalizeOptionalString(input.cursorPath),
guidePath,
outputDir,
status,
events: normalizeArray(input.events, normalizeGuideEvent),
snapshots: normalizeArray(input.snapshots, normalizeGuideSnapshot),
ocrBlocks: normalizeArray(input.ocrBlocks, normalizeOcrBlock),
candidates: normalizeArray(input.candidates, normalizeGuideStepCandidate),
generatedGuide,
createdAt,
updatedAt,
};
}
function normalizeGuideEvent(input: unknown): GuideEvent | null {
if (!isRecord(input)) {
return null;
}
const id = normalizeString(input.id);
const recordingId = normalizeString(input.recordingId);
const kind = VALID_EVENT_KINDS.has(input.kind as GuideEventKind)
? (input.kind as GuideEventKind)
: null;
const source = VALID_EVENT_SOURCES.has(input.source as GuideEventSource)
? (input.source as GuideEventSource)
: null;
const timeMs = normalizeNonNegativeNumber(input.timeMs);
const createdAt = normalizeString(input.createdAt);
if (!id || !recordingId || !kind || !source || timeMs === null || !createdAt) {
return null;
}
return {
id,
recordingId,
kind,
source,
timeMs,
x: normalizeOptionalNumber(input.x),
y: normalizeOptionalNumber(input.y),
normalizedX: normalizeOptionalNumber(input.normalizedX),
normalizedY: normalizeOptionalNumber(input.normalizedY),
button:
input.button === "left" ||
input.button === "right" ||
input.button === "middle" ||
input.button === "unknown"
? input.button
: undefined,
label: normalizeOptionalString(input.label),
screenshotOffsetMs: normalizeOptionalNumber(input.screenshotOffsetMs),
createdAt,
};
}
function normalizeGuideSnapshot(input: unknown): GuideSnapshot | null {
if (!isRecord(input)) {
return null;
}
const id = normalizeString(input.id);
const eventId = normalizeString(input.eventId);
const pathValue = normalizeString(input.path);
const timeMs = normalizeNonNegativeNumber(input.timeMs);
const offsetMs = normalizeOptionalNumber(input.offsetMs);
const width = normalizePositiveInteger(input.width);
const height = normalizePositiveInteger(input.height);
if (
!id ||
!eventId ||
!pathValue ||
timeMs === null ||
offsetMs === undefined ||
width === null ||
height === null
) {
return null;
}
return { id, eventId, timeMs, offsetMs, path: pathValue, width, height };
}
function normalizeOcrBlock(input: unknown): OcrBlock | null {
if (!isRecord(input) || !isRecord(input.box)) {
return null;
}
const id = normalizeString(input.id);
const snapshotId = normalizeString(input.snapshotId);
const text = normalizeString(input.text);
const confidence = normalizeOptionalNumber(input.confidence);
const x = normalizeOptionalNumber(input.box.x);
const y = normalizeOptionalNumber(input.box.y);
const width = normalizeOptionalNumber(input.box.width);
const height = normalizeOptionalNumber(input.box.height);
if (
!id ||
!snapshotId ||
text === null ||
confidence === undefined ||
x === undefined ||
y === undefined ||
width === undefined ||
height === undefined
) {
return null;
}
return { id, snapshotId, text, confidence, box: { x, y, width, height } };
}
function normalizeGuideStepCandidate(input: unknown): GuideStepCandidate | null {
if (!isRecord(input)) {
return null;
}
const id = normalizeString(input.id);
const eventId = normalizeString(input.eventId);
const timeMs = normalizeNonNegativeNumber(input.timeMs);
const confidence = normalizeOptionalNumber(input.confidence);
const nearbyText = Array.isArray(input.nearbyText)
? input.nearbyText.map(normalizeString).filter((text): text is string => text !== null)
: [];
if (!id || !eventId || timeMs === null || confidence === undefined) {
return null;
}
return {
id,
eventId,
snapshotId: normalizeOptionalString(input.snapshotId),
timeMs,
action:
input.action === "click" ||
input.action === "choose" ||
input.action === "type" ||
input.action === "wait" ||
input.action === "manual"
? input.action
: "manual",
targetText: normalizeOptionalString(input.targetText),
targetRole:
input.targetRole === "button" ||
input.targetRole === "menu" ||
input.targetRole === "tab" ||
input.targetRole === "field" ||
input.targetRole === "link" ||
input.targetRole === "unknown"
? input.targetRole
: undefined,
nearbyText,
confidence,
};
}
function normalizeGeneratedGuide(input: unknown): GeneratedGuide | null {
if (!isRecord(input)) {
return null;
}
const title = normalizeString(input.title);
if (!title || !Array.isArray(input.steps)) {
return null;
}
const steps = input.steps
.map((step): GeneratedGuideStep | null => {
if (!isRecord(step)) {
return null;
}
const id = normalizeString(step.id);
const order = normalizePositiveInteger(step.order);
const stepTitle = normalizeString(step.title);
const instruction = normalizeString(step.instruction);
if (!id || order === null || !stepTitle || !instruction) {
return null;
}
return {
id,
order,
title: stepTitle,
instruction,
screenshotPath: normalizeOptionalString(step.screenshotPath),
sourceCandidateId: normalizeOptionalString(step.sourceCandidateId),
};
})
.filter((step): step is GeneratedGuide["steps"][number] => step !== null);
return {
title,
summary: normalizeOptionalString(input.summary),
steps,
};
}
function normalizeArray<T>(input: unknown, normalize: (value: unknown) => T | null): T[] {
return Array.isArray(input)
? input.map((value) => normalize(value)).filter((value): value is T => value !== null)
: [];
}
function normalizeSessionStatus(value: unknown): GuideSessionStatus | null {
return VALID_SESSION_STATUSES.has(value as GuideSessionStatus)
? (value as GuideSessionStatus)
: null;
}
function normalizeString(value: unknown): string | null {
return typeof value === "string" ? value : null;
}
function normalizeOptionalString(value: unknown): string | undefined {
const text = normalizeString(value);
return text === null || text.length === 0 ? undefined : text;
}
function normalizeNonNegativeNumber(value: unknown): number | null {
return typeof value === "number" && Number.isFinite(value) && value >= 0 ? value : null;
}
function normalizeOptionalNumber(value: unknown): number | undefined {
return typeof value === "number" && Number.isFinite(value) ? value : undefined;
}
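function normalizePositiveInteger(value: unknown): number | null {
if (typeof value !== "number" || !Number.isFinite(value)) {
return null;
}
// Round before the positivity check so inputs such as 0.4 cannot normalize to 0.
const rounded = Math.round(value);
return rounded > 0 ? rounded : null;
}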
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null;
}
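The normalizers above share one convention: a normalizer returns `null` for anything it cannot trust, and `normalizeArray` drops those entries instead of failing. A small sketch of that convention, using a hypothetical `normalizeLabel` normalizer that is not part of the store:

```typescript
// Same shape as the store's helper: map through a normalizer, drop rejected (null) items.
function normalizeArray<T>(input: unknown, normalize: (value: unknown) => T | null): T[] {
  return Array.isArray(input)
    ? input.map((value) => normalize(value)).filter((value): value is T => value !== null)
    : [];
}

// Hypothetical normalizer for illustration only: accept non-empty strings.
function normalizeLabel(value: unknown): string | null {
  return typeof value === "string" && value.length > 0 ? value : null;
}

const labels = normalizeArray(["Save", 42, "", null, "Next"], normalizeLabel);
const notAnArray = normalizeArray({ length: 2 }, normalizeLabel);
console.log(labels, notAnArray);
// labels is ["Save", "Next"]; notAnArray is []
```

Under this convention a corrupt list entry shrinks the list rather than invalidating the session. In `normalizeGuideSession`, only the required top-level fields and an invalid `generatedGuide` cause the whole session to be rejected.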
@@ -0,0 +1,232 @@
import { type ChildProcessWithoutNullStreams, spawn } from "node:child_process";
import fs from "node:fs/promises";
import path from "node:path";
import { app } from "electron";
const DEFAULT_OCR_BASE_URL = "http://127.0.0.1:8866";
const DEFAULT_OCR_PORT = "8866";
const SERVICE_EXE_NAME = "openscreen-ocr-service.exe";
const HEALTH_TIMEOUT_MS = 1000;
const STARTUP_TIMEOUT_MS = 90000;
const PADDLEX_MODEL_NAMES = ["PP-OCRv5_mobile_det", "latin_PP-OCRv5_mobile_rec"];
let ocrProcess: ChildProcessWithoutNullStreams | null = null;
let startupPromise: Promise<void> | null = null;
let quitHookRegistered = false;
export async function ensureBundledOcrServiceRunning(
baseUrl = DEFAULT_OCR_BASE_URL,
): Promise<void> {
if (!shouldManageOcrService(baseUrl)) {
return;
}
if (await isOcrServiceHealthy(baseUrl, HEALTH_TIMEOUT_MS)) {
return;
}
const executablePath = await findBundledOcrServiceExecutable();
if (!executablePath) {
return;
}
if (!startupPromise) {
startupPromise = startAndWaitForOcrService(executablePath, baseUrl).finally(() => {
startupPromise = null;
});
}
await startupPromise;
}
function shouldManageOcrService(baseUrl: string): boolean {
try {
const url = new URL(baseUrl);
const hostname = url.hostname.toLowerCase();
return (
(url.protocol === "http:" || url.protocol === "https:") &&
(hostname === "127.0.0.1" || hostname === "localhost") &&
// An empty port means the scheme default (80/443), which the bundled service never uses.
url.port === DEFAULT_OCR_PORT
);
} catch {
return false;
}
}
async function findBundledOcrServiceExecutable(): Promise<string | null> {
const candidates = [
process.env.OPENSCREEN_GUIDE_OCR_EXE,
path.join(process.resourcesPath, "ocr-service", SERVICE_EXE_NAME),
path.join(process.resourcesPath, "ocr-service", "openscreen-ocr-service", SERVICE_EXE_NAME),
path.resolve(process.cwd(), "tools", "ocr", "dist", "openscreen-ocr-service", SERVICE_EXE_NAME),
].filter(
(candidate): candidate is string => typeof candidate === "string" && candidate.length > 0,
);
for (const candidate of candidates) {
try {
const stats = await fs.stat(candidate);
if (stats.isFile()) {
return candidate;
}
} catch {
// Try the next candidate.
}
}
return null;
}
async function startAndWaitForOcrService(executablePath: string, baseUrl: string): Promise<void> {
const runtimePaths = await prepareOcrRuntimePaths();
if (!ocrProcess || ocrProcess.exitCode !== null || ocrProcess.killed) {
startOcrServiceProcess(executablePath, runtimePaths);
}
await waitForOcrServiceHealth(baseUrl, STARTUP_TIMEOUT_MS);
}
async function prepareOcrRuntimePaths(): Promise<{
modelCachePath: string;
paddlexCachePath: string;
}> {
const modelCachePath = path.join(app.getPath("userData"), "ocr-models");
const paddlexCachePath = path.join(modelCachePath, "paddlex");
await seedBundledPaddlexModels(paddlexCachePath);
return { modelCachePath, paddlexCachePath };
}
async function seedBundledPaddlexModels(destinationCachePath: string): Promise<void> {
const sourceCachePath = await findBundledPaddlexModelCache();
if (!sourceCachePath) {
return;
}
const sourceOfficialModels = path.join(sourceCachePath, "official_models");
const destinationOfficialModels = path.join(destinationCachePath, "official_models");
await fs.mkdir(destinationOfficialModels, { recursive: true });
for (const modelName of PADDLEX_MODEL_NAMES) {
const sourceModelPath = path.join(sourceOfficialModels, modelName);
const destinationModelPath = path.join(destinationOfficialModels, modelName);
if (!(await pathExists(sourceModelPath)) || (await pathExists(destinationModelPath))) {
continue;
}
await fs.cp(sourceModelPath, destinationModelPath, {
recursive: true,
errorOnExist: false,
force: false,
});
}
}
async function findBundledPaddlexModelCache(): Promise<string | null> {
const candidates = [
path.join(process.resourcesPath, "ocr-models", "paddlex"),
path.resolve(process.cwd(), "tools", "ocr", "models", "paddlex"),
];
for (const candidate of candidates) {
try {
const stats = await fs.stat(candidate);
if (stats.isDirectory()) {
return candidate;
}
} catch {
// Try the next candidate.
}
}
return null;
}
async function pathExists(value: string): Promise<boolean> {
try {
await fs.access(value);
return true;
} catch {
return false;
}
}
function startOcrServiceProcess(
executablePath: string,
runtimePaths: { modelCachePath: string; paddlexCachePath: string },
): void {
registerQuitHook();
ocrProcess = spawn(executablePath, [], {
cwd: path.dirname(executablePath),
env: {
...process.env,
OPENSCREEN_OCR_HOST: "127.0.0.1",
OPENSCREEN_OCR_PORT: DEFAULT_OCR_PORT,
PADDLEOCR_DEVICE: process.env.PADDLEOCR_DEVICE ?? "cpu",
PADDLEOCR_ENABLE_MKLDNN: process.env.PADDLEOCR_ENABLE_MKLDNN ?? "0",
PADDLEOCR_LANG: process.env.PADDLEOCR_LANG ?? "latin",
PADDLEOCR_USE_MOBILE: process.env.PADDLEOCR_USE_MOBILE ?? "1",
PADDLE_PDX_ENABLE_MKLDNN_BYDEFAULT: process.env.PADDLE_PDX_ENABLE_MKLDNN_BYDEFAULT ?? "False",
PADDLE_PDX_CACHE_HOME: process.env.PADDLE_PDX_CACHE_HOME ?? runtimePaths.paddlexCachePath,
PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK:
process.env.PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK ?? "True",
PADDLE_HOME: process.env.PADDLE_HOME ?? path.join(runtimePaths.modelCachePath, "paddle"),
PADDLEOCR_HOME:
process.env.PADDLEOCR_HOME ?? path.join(runtimePaths.modelCachePath, "paddleocr"),
PYTHONUTF8: "1",
},
windowsHide: true,
});
ocrProcess.stdout.on("data", (chunk) => {
console.info(`[guide-ocr-service] ${chunk.toString().trim()}`);
});
ocrProcess.stderr.on("data", (chunk) => {
console.warn(`[guide-ocr-service] ${chunk.toString().trim()}`);
});
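ocrProcess.on("error", (error) => {
// Without an "error" listener a failed spawn would surface as an uncaught exception.
console.warn("[guide-ocr-service] failed to start", error);
ocrProcess = null;
});
ocrProcess.on("exit", (code, signal) => {
console.info("[guide-ocr-service] exited", { code, signal });
ocrProcess = null;
});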
}
function registerQuitHook(): void {
if (quitHookRegistered) {
return;
}
quitHookRegistered = true;
app.once("before-quit", () => {
const processToStop = ocrProcess;
ocrProcess = null;
processToStop?.kill();
});
}
async function waitForOcrServiceHealth(baseUrl: string, timeoutMs: number): Promise<void> {
const startedAt = Date.now();
while (Date.now() - startedAt < timeoutMs) {
if (await isOcrServiceHealthy(baseUrl, HEALTH_TIMEOUT_MS)) {
return;
}
// The "exit" handler resets ocrProcess to null, so a missing process here means the
// service died before it became healthy. Fail fast instead of waiting for the timeout.
if (!ocrProcess || ocrProcess.exitCode !== null) {
throw new Error("Bundled OCR service exited before it became healthy.");
}
await sleep(750);
}
throw new Error("Timed out waiting for bundled OCR service to start.");
}
async function isOcrServiceHealthy(baseUrl: string, timeoutMs: number): Promise<boolean> {
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
try {
const response = await fetch(`${baseUrl.replace(/\/$/, "")}/health`, {
signal: controller.signal,
});
return response.ok;
} catch {
return false;
} finally {
clearTimeout(timeoutId);
}
}
function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
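The decision about whether the app should manage the OCR process comes down to a URL check: loopback host, HTTP(S) scheme, and the port the bundled service listens on. A standalone sketch of that check follows. The name `isManagedLoopbackUrl` and the `managedPort` parameter are hypothetical, and the sketch assumes the managed service always listens on an explicit port, so URLs that rely on the scheme default are treated as unmanaged.

```typescript
// Sketch: decide whether a base URL points at a loopback service this app could own.
// Assumption: the managed service always listens on the explicit port passed in here.
function isManagedLoopbackUrl(baseUrl: string, managedPort = "8866"): boolean {
  try {
    const url = new URL(baseUrl);
    const hostname = url.hostname.toLowerCase();
    const isHttp = url.protocol === "http:" || url.protocol === "https:";
    const isLoopback = hostname === "127.0.0.1" || hostname === "localhost";
    return isHttp && isLoopback && url.port === managedPort;
  } catch {
    // new URL() throws on malformed input; treat that as "not ours to manage".
    return false;
  }
}

console.log(isManagedLoopbackUrl("http://127.0.0.1:8866")); // true
console.log(isManagedLoopbackUrl("http://LOCALHOST:8866/")); // true
console.log(isManagedLoopbackUrl("http://ocr.example.com:8866")); // false
console.log(isManagedLoopbackUrl("not a url")); // false
```

Returning `false` for anything that does not match keeps a user-configured remote OCR endpoint untouched: the app never tries to spawn a local process on its behalf.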
@@ -0,0 +1,33 @@
import { describe, expect, it } from "vitest";
import type { OcrBlock } from "../../../src/guide/contracts";
import { remapFocusedOcrBlocks } from "./focusedOcrSnapshot";
describe("remapFocusedOcrBlocks", () => {
it("maps boxes from a focused crop back to the original snapshot coordinates", () => {
const blocks: OcrBlock[] = [
{
id: "ocr-1",
snapshotId: "snapshot-1",
text: "Settings",
confidence: 0.9,
box: { x: 0.25, y: 0.5, width: 0.2, height: 0.1 },
},
];
const remapped = remapFocusedOcrBlocks(blocks, {
cropX: 320,
cropY: 180,
cropWidth: 640,
cropHeight: 360,
originalWidth: 1280,
originalHeight: 720,
});
expect(remapped[0]?.box).toEqual({
x: 0.375,
y: 0.5,
width: 0.1,
height: 0.05,
});
});
});
@@ -0,0 +1,225 @@
import { execFile } from "node:child_process";
import fs from "node:fs/promises";
import path from "node:path";
import { promisify } from "node:util";
import type { GuideEvent, GuideSnapshot, OcrBlock } from "../../../src/guide/contracts";
const execFileAsync = promisify(execFile);
interface FocusTransform {
cropX: number;
cropY: number;
cropWidth: number;
cropHeight: number;
originalWidth: number;
originalHeight: number;
}
export interface FocusedOcrSnapshot {
snapshot: GuideSnapshot;
transform?: FocusTransform;
}
export async function createFocusedOcrSnapshot(input: {
snapshot: GuideSnapshot;
event?: GuideEvent;
outputDir: string;
}): Promise<FocusedOcrSnapshot> {
if (process.platform !== "win32") {
return { snapshot: input.snapshot };
}
const click = getEventPoint(input.event, input.snapshot);
if (!click) {
return { snapshot: input.snapshot };
}
const crop = calculateFocusCrop(input.snapshot, click);
if (
!crop ||
(crop.cropWidth === input.snapshot.width && crop.cropHeight === input.snapshot.height)
) {
return { snapshot: input.snapshot };
}
const focusDir = path.join(input.outputDir, "ocr-focus");
await fs.mkdir(focusDir, { recursive: true });
const focusPath = path.join(focusDir, `${path.parse(input.snapshot.path).name}-focus.png`);
const zoom = 2;
const focusedSnapshot: GuideSnapshot = {
...input.snapshot,
path: focusPath,
width: crop.cropWidth * zoom,
height: crop.cropHeight * zoom,
};
try {
await writeFocusedPng({
sourcePath: input.snapshot.path,
outputPath: focusPath,
cropX: crop.cropX,
cropY: crop.cropY,
cropWidth: crop.cropWidth,
cropHeight: crop.cropHeight,
outputWidth: focusedSnapshot.width,
outputHeight: focusedSnapshot.height,
});
return { snapshot: focusedSnapshot, transform: crop };
} catch {
return { snapshot: input.snapshot };
}
}
export function remapFocusedOcrBlocks(
blocks: OcrBlock[],
transform: FocusedOcrSnapshot["transform"],
): OcrBlock[] {
if (!transform) {
return blocks;
}
return blocks.map((block) => ({
...block,
box: {
x: clamp01((transform.cropX + block.box.x * transform.cropWidth) / transform.originalWidth),
y: clamp01((transform.cropY + block.box.y * transform.cropHeight) / transform.originalHeight),
width: clamp01((block.box.width * transform.cropWidth) / transform.originalWidth),
height: clamp01((block.box.height * transform.cropHeight) / transform.originalHeight),
},
}));
}
function getEventPoint(
event: GuideEvent | undefined,
snapshot: GuideSnapshot,
): { x: number; y: number } | null {
if (!event) {
return null;
}
if (isNormalizedNumber(event.normalizedX) && isNormalizedNumber(event.normalizedY)) {
return { x: event.normalizedX, y: event.normalizedY };
}
if (isNormalizedNumber(event.x) && isNormalizedNumber(event.y)) {
return { x: event.x, y: event.y };
}
if (
typeof event.x === "number" &&
typeof event.y === "number" &&
event.x >= 0 &&
event.y >= 0 &&
event.x <= snapshot.width &&
event.y <= snapshot.height
) {
return { x: clamp01(event.x / snapshot.width), y: clamp01(event.y / snapshot.height) };
}
return null;
}
function calculateFocusCrop(
snapshot: GuideSnapshot,
click: { x: number; y: number },
): FocusTransform | null {
if (snapshot.width <= 0 || snapshot.height <= 0) {
return null;
}
const cropWidth = clampInteger(
Math.round(snapshot.width * 0.42),
Math.min(360, snapshot.width),
Math.min(720, snapshot.width),
);
const cropHeight = clampInteger(
Math.round(snapshot.height * 0.42),
Math.min(240, snapshot.height),
Math.min(520, snapshot.height),
);
const clickX = Math.round(clamp01(click.x) * snapshot.width);
const clickY = Math.round(clamp01(click.y) * snapshot.height);
return {
cropX: clampInteger(Math.round(clickX - cropWidth / 2), 0, snapshot.width - cropWidth),
cropY: clampInteger(Math.round(clickY - cropHeight / 2), 0, snapshot.height - cropHeight),
cropWidth,
cropHeight,
originalWidth: snapshot.width,
originalHeight: snapshot.height,
};
}
async function writeFocusedPng(input: {
sourcePath: string;
outputPath: string;
cropX: number;
cropY: number;
cropWidth: number;
cropHeight: number;
outputWidth: number;
outputHeight: number;
}): Promise<void> {
const script = buildCropScript(input);
const encodedCommand = Buffer.from(script, "utf16le").toString("base64");
await execFileAsync(
"powershell.exe",
["-NoProfile", "-ExecutionPolicy", "Bypass", "-EncodedCommand", encodedCommand],
{
timeout: 30000,
maxBuffer: 1024 * 1024,
windowsHide: true,
},
);
}
function buildCropScript(input: {
sourcePath: string;
outputPath: string;
cropX: number;
cropY: number;
cropWidth: number;
cropHeight: number;
outputWidth: number;
outputHeight: number;
}): string {
const sourcePathBase64 = Buffer.from(input.sourcePath, "utf8").toString("base64");
const outputPathBase64 = Buffer.from(input.outputPath, "utf8").toString("base64");
return `
$ErrorActionPreference = "Stop"
$sourcePath = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("${sourcePathBase64}"))
$outputPath = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("${outputPathBase64}"))
Add-Type -AssemblyName System.Drawing
$source = [System.Drawing.Image]::FromFile($sourcePath)
$target = [System.Drawing.Bitmap]::new(${input.outputWidth}, ${input.outputHeight})
$graphics = [System.Drawing.Graphics]::FromImage($target)
try {
$graphics.Clear([System.Drawing.Color]::White)
$graphics.InterpolationMode = [System.Drawing.Drawing2D.InterpolationMode]::HighQualityBicubic
$graphics.SmoothingMode = [System.Drawing.Drawing2D.SmoothingMode]::HighQuality
$graphics.PixelOffsetMode = [System.Drawing.Drawing2D.PixelOffsetMode]::HighQuality
$sourceRect = [System.Drawing.Rectangle]::new(${input.cropX}, ${input.cropY}, ${input.cropWidth}, ${input.cropHeight})
$targetRect = [System.Drawing.Rectangle]::new(0, 0, ${input.outputWidth}, ${input.outputHeight})
$graphics.DrawImage($source, $targetRect, $sourceRect, [System.Drawing.GraphicsUnit]::Pixel)
$target.Save($outputPath, [System.Drawing.Imaging.ImageFormat]::Png)
} finally {
$graphics.Dispose()
$target.Dispose()
$source.Dispose()
}
`;
}
function isNormalizedNumber(value: unknown): value is number {
return typeof value === "number" && Number.isFinite(value) && value >= 0 && value <= 1;
}
function clampInteger(value: number, min: number, max: number): number {
if (max < min) {
return min;
}
return Math.round(Math.min(max, Math.max(min, value)));
}
function clamp01(value: number): number {
if (!Number.isFinite(value)) {
return 0;
}
return Math.min(1, Math.max(0, value));
}
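The crop arithmetic in `calculateFocusCrop` is easier to follow with concrete numbers. The sketch below re-implements the same sizing rules (42% of each dimension, clamped to 360–720 px wide and 240–520 px tall, then shifted to stay inside the image) as a standalone function. `computeFocusCrop` and the `Crop` type are hypothetical names used only for this illustration.

```typescript
interface Crop {
  cropX: number;
  cropY: number;
  cropWidth: number;
  cropHeight: number;
}

// Clamp to [min, max] and round; if the range is empty, fall back to min.
function clampInteger(value: number, min: number, max: number): number {
  if (max < min) {
    return min;
  }
  return Math.round(Math.min(max, Math.max(min, value)));
}

// Hypothetical standalone version of the crop sizing; click is in normalized 0..1 coordinates.
function computeFocusCrop(width: number, height: number, click: { x: number; y: number }): Crop {
  const cropWidth = clampInteger(Math.round(width * 0.42), Math.min(360, width), Math.min(720, width));
  const cropHeight = clampInteger(Math.round(height * 0.42), Math.min(240, height), Math.min(520, height));
  const clickX = Math.round(click.x * width);
  const clickY = Math.round(click.y * height);
  return {
    // Center the crop on the click, then push it back inside the image bounds.
    cropX: clampInteger(Math.round(clickX - cropWidth / 2), 0, width - cropWidth),
    cropY: clampInteger(Math.round(clickY - cropHeight / 2), 0, height - cropHeight),
    cropWidth,
    cropHeight,
  };
}

const centered = computeFocusCrop(1920, 1080, { x: 0.5, y: 0.5 });
const topRight = computeFocusCrop(1920, 1080, { x: 1, y: 0 });
console.log(centered); // { cropX: 600, cropY: 313, cropWidth: 720, cropHeight: 454 }
console.log(topRight); // { cropX: 1200, cropY: 0, cropWidth: 720, cropHeight: 454 }
```

On a 1920×1080 snapshot, 42% of the width is 806 px, which the upper bound caps at 720 px, while 42% of the height (454 px) already falls inside its range. A click in the top-right corner would center the crop partly outside the image, so the origin is clamped to `(1200, 0)`.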
@@ -0,0 +1,110 @@
import { describe, expect, it } from "vitest";
import type { GuideSnapshot, OcrBlock } from "../../../src/guide/contracts";
import {
DefaultGuideOcrClient,
normalizeOcrResponse,
parseWindowsOcrPayload,
} from "./paddleOcrClient";
const snapshot: GuideSnapshot = {
id: "snapshot-1",
eventId: "event-1",
timeMs: 1000,
offsetMs: 500,
path: "/tmp/step-001.png",
width: 1000,
height: 800,
};
describe("normalizeOcrResponse", () => {
it("normalizes pixel boxes into guide OCR blocks", () => {
const blocks = normalizeOcrResponse(
{
blocks: [
{
text: "Save",
confidence: 92,
box: { x: 400, y: 320, width: 120, height: 40 },
},
],
},
snapshot,
);
expect(blocks).toEqual([
{
id: "ocr-snapshot-1-1",
snapshotId: "snapshot-1",
text: "Save",
confidence: 0.92,
box: { x: 0.4, y: 0.4, width: 0.12, height: 0.05 },
},
]);
});
it("normalizes polygon responses", () => {
const blocks = normalizeOcrResponse(
[
{
text: "Next",
score: 0.8,
bbox: [
[100, 200],
[300, 200],
[300, 260],
[100, 260],
],
},
],
snapshot,
);
expect(blocks[0]).toMatchObject({
text: "Next",
confidence: 0.8,
box: { x: 0.1, y: 0.25, width: 0.2, height: 0.075 },
});
});
});
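The polygon case above implies a two-step conversion: collapse the four corner points to an axis-aligned bounding box, then divide by the snapshot size. The implementation of `normalizeOcrResponse` is not shown in this part of the diff, so the following is only a sketch of one way to get the expected numbers; `polygonToNormalizedBox` is a hypothetical helper.

```typescript
type Point = [number, number];

interface NormalizedBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Sketch only: bounding box of a non-empty polygon, scaled to 0..1 by the image size.
// The real normalizeOcrResponse may handle this differently.
function polygonToNormalizedBox(
  polygon: Point[],
  imageWidth: number,
  imageHeight: number,
): NormalizedBox {
  const xs = polygon.map(([x]) => x);
  const ys = polygon.map(([, y]) => y);
  const left = Math.min(...xs);
  const top = Math.min(...ys);
  return {
    x: left / imageWidth,
    y: top / imageHeight,
    width: (Math.max(...xs) - left) / imageWidth,
    height: (Math.max(...ys) - top) / imageHeight,
  };
}

const box = polygonToNormalizedBox(
  [
    [100, 200],
    [300, 200],
    [300, 260],
    [100, 260],
  ],
  1000,
  800,
);
console.log(box); // { x: 0.1, y: 0.25, width: 0.2, height: 0.075 }
```

With the 1000×800 test snapshot, the polygon spans 100–300 px horizontally and 200–260 px vertically, which gives the `0.1 / 0.25 / 0.2 / 0.075` box the test asserts.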
describe("DefaultGuideOcrClient", () => {
it("falls back when the HTTP OCR service is unavailable", async () => {
const fallbackBlock: OcrBlock = {
id: "ocr-snapshot-1-1",
snapshotId: "snapshot-1",
text: "Save",
confidence: 0.75,
box: { x: 0.1, y: 0.2, width: 0.3, height: 0.4 },
};
const client = new DefaultGuideOcrClient(
{
recognize: async () => {
throw new Error("HTTP down");
},
},
{
recognize: async () => [fallbackBlock],
},
);
await expect(client.recognize(snapshot)).resolves.toEqual([fallbackBlock]);
});
});
describe("parseWindowsOcrPayload", () => {
it("recovers from raw control characters in OCR text", () => {
const payload = parseWindowsOcrPayload(
'{"blocks":[{"text":"Save\u0001now","confidence":0.75,"box":{"x":1,"y":2,"width":3,"height":4}}]}',
);
expect(payload).toEqual({
blocks: [
{
text: "Save now",
confidence: 0.75,
box: { x: 1, y: 2, width: 3, height: 4 },
},
],
});
});
});
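The last test above exercises a two-stage parse: a strict `JSON.parse` first, then a retry after raw control characters are replaced with spaces. A minimal standalone sketch of that recovery idea follows. The names `stripControlChars` and `parseLenient` are hypothetical and are not the module's exports; the behavior mirrors what the test expects, under the assumption that the only damage in the payload is unescaped control characters.

```typescript
// Hypothetical helper: replace raw ASCII control characters (and DEL) with spaces.
function stripControlChars(value: string): string {
  let result = "";
  for (const character of value) {
    const code = character.charCodeAt(0);
    result += code < 32 || code === 127 ? " " : character;
  }
  return result;
}

// Hypothetical helper: try a strict parse first and only sanitize when that fails,
// so well-formed payloads are never altered.
function parseLenient(text: string): unknown {
  const trimmed = text.replace(/^\uFEFF/, "").trim();
  try {
    return JSON.parse(trimmed);
  } catch {
    return JSON.parse(stripControlChars(trimmed));
  }
}

// "\u0001" inside a JS string literal is the raw control character, which strict JSON rejects.
const recovered = parseLenient('{"text":"Save\u0001now"}') as { text: string };
console.log(recovered.text);
```

Sanitizing only on failure keeps the common path untouched. The trade-off is that escaped-versus-raw distinctions are lost in the retry, which is presumably acceptable for OCR text.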
@@ -0,0 +1,372 @@
import { execFile } from "node:child_process";
import fs from "node:fs/promises";
import { promisify } from "node:util";
import type { GuideSnapshot, OcrBlock } from "../../../src/guide/contracts";
import { ensureBundledOcrServiceRunning } from "./bundledOcrService";
const execFileAsync = promisify(execFile);
export interface GuideOcrClient {
recognize(snapshot: GuideSnapshot): Promise<OcrBlock[]>;
}
interface PaddleOcrResponseBlock {
text?: unknown;
confidence?: unknown;
score?: unknown;
box?: unknown;
bbox?: unknown;
}
export class PaddleOcrHttpClient implements GuideOcrClient {
constructor(
private readonly baseUrl = process.env.OPENSCREEN_GUIDE_OCR_URL ?? "http://127.0.0.1:8866",
private readonly language = process.env.OPENSCREEN_GUIDE_OCR_LANGUAGE ?? "vi,en",
) {}
async recognize(snapshot: GuideSnapshot): Promise<OcrBlock[]> {
await ensureBundledOcrServiceRunning(this.baseUrl);
const imageBase64 = await fs.readFile(snapshot.path, "base64");
let response: Response;
try {
response = await fetch(`${this.baseUrl.replace(/\/$/, "")}/ocr`, {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({
imageBase64,
path: snapshot.path,
language: this.language,
}),
});
} catch (error) {
throw new Error(
`OCR service is unavailable: ${error instanceof Error ? error.message : String(error)}`,
);
}
if (!response.ok) {
throw new Error(`OCR service returned HTTP ${response.status}.`);
}
const payload = (await response.json()) as unknown;
return normalizeOcrResponse(payload, snapshot);
}
}
export class WindowsOcrClient implements GuideOcrClient {
constructor(private readonly language = process.env.OPENSCREEN_GUIDE_OCR_LANGUAGE ?? "vi,en") {}
async recognize(snapshot: GuideSnapshot): Promise<OcrBlock[]> {
if (process.platform !== "win32") {
throw new Error("Windows OCR fallback is only available on Windows.");
}
const script = buildWindowsOcrScript(snapshot.path, this.language);
const encodedCommand = Buffer.from(script, "utf16le").toString("base64");
let stdout: string;
try {
const result = await execFileAsync(
"powershell.exe",
["-NoProfile", "-ExecutionPolicy", "Bypass", "-EncodedCommand", encodedCommand],
{
maxBuffer: 8 * 1024 * 1024,
timeout: 30000,
windowsHide: true,
},
);
stdout = result.stdout;
} catch (error) {
throw new Error(
`Windows OCR failed: ${error instanceof Error ? error.message : String(error)}`,
);
}
let payload: unknown;
try {
payload = parseWindowsOcrPayload(stdout);
} catch (error) {
throw new Error(
`Windows OCR returned invalid JSON: ${
error instanceof Error ? error.message : String(error)
}`,
);
}
return normalizeOcrResponse(payload, snapshot);
}
}
export class DefaultGuideOcrClient implements GuideOcrClient {
constructor(
private readonly httpClient = new PaddleOcrHttpClient(),
private readonly windowsClient = new WindowsOcrClient(),
) {}
async recognize(snapshot: GuideSnapshot): Promise<OcrBlock[]> {
try {
return await this.httpClient.recognize(snapshot);
} catch (httpError) {
try {
return await this.windowsClient.recognize(snapshot);
} catch (fallbackError) {
throw new Error(
[
httpError instanceof Error ? httpError.message : String(httpError),
fallbackError instanceof Error ? fallbackError.message : String(fallbackError),
].join(" "),
);
}
}
}
}
export function parseWindowsOcrPayload(stdout: string): unknown {
const normalized = stdout.replace(/^\uFEFF/, "").trim();
try {
return JSON.parse(normalized);
} catch {
return JSON.parse(replaceRawJsonControlCharacters(normalized));
}
}
function replaceRawJsonControlCharacters(value: string): string {
let result = "";
for (const character of value) {
const code = character.charCodeAt(0);
result += code < 32 || code === 127 ? " " : character;
}
return result;
}
export function normalizeOcrResponse(payload: unknown, snapshot: GuideSnapshot): OcrBlock[] {
const rawBlocks = extractRawBlocks(payload);
return rawBlocks
.map((raw, index) => normalizeBlock(raw, snapshot, index))
.filter((block): block is OcrBlock => block !== null);
}
function extractRawBlocks(payload: unknown): PaddleOcrResponseBlock[] {
if (Array.isArray(payload)) {
return payload as PaddleOcrResponseBlock[];
}
if (isRecord(payload)) {
if (Array.isArray(payload.blocks)) {
return payload.blocks as PaddleOcrResponseBlock[];
}
if (Array.isArray(payload.results)) {
return payload.results as PaddleOcrResponseBlock[];
}
if (Array.isArray(payload.data)) {
return payload.data as PaddleOcrResponseBlock[];
}
}
return [];
}
function normalizeBlock(
raw: PaddleOcrResponseBlock,
snapshot: GuideSnapshot,
index: number,
): OcrBlock | null {
if (!isRecord(raw)) {
return null;
}
const text = typeof raw.text === "string" ? raw.text.trim() : "";
if (!text) {
return null;
}
const confidence = normalizeConfidence(raw.confidence ?? raw.score);
const box = normalizeBox(raw.box ?? raw.bbox, snapshot);
if (!box) {
return null;
}
return {
id: `ocr-${snapshot.id}-${index + 1}`,
snapshotId: snapshot.id,
text,
confidence,
box,
};
}
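// Confidence may arrive as a fraction (0..1) or a percentage (0..100): values above 1
// are divided by 100. Missing or non-finite values default to 0.5.
function normalizeConfidence(value: unknown): number {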
if (typeof value !== "number" || !Number.isFinite(value)) {
return 0.5;
}
return value > 1 ? clamp01(value / 100) : clamp01(value);
}
function normalizeBox(
value: unknown,
snapshot: GuideSnapshot,
): { x: number; y: number; width: number; height: number } | null {
if (Array.isArray(value)) {
return normalizeArrayBox(value, snapshot);
}
if (!isRecord(value)) {
return null;
}
const x = normalizeNumber(value.x);
const y = normalizeNumber(value.y);
const width = normalizeNumber(value.width ?? value.w);
const height = normalizeNumber(value.height ?? value.h);
if (x === null || y === null || width === null || height === null) {
return null;
}
return normalizeBoxDimensions({ x, y, width, height }, snapshot);
}
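// Accepts either a four-point polygon (eight numbers, reduced to its bounding rectangle)
// or a flat [x, y, width, height] array. Nested arrays are flattened first.
function normalizeArrayBox(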
value: unknown[],
snapshot: GuideSnapshot,
): { x: number; y: number; width: number; height: number } | null {
const numbers = value.flat(2).filter((item): item is number => typeof item === "number");
if (numbers.length >= 8) {
const xs = [numbers[0], numbers[2], numbers[4], numbers[6]];
const ys = [numbers[1], numbers[3], numbers[5], numbers[7]];
const minX = Math.min(...xs);
const maxX = Math.max(...xs);
const minY = Math.min(...ys);
const maxY = Math.max(...ys);
return normalizeBoxDimensions(
{ x: minX, y: minY, width: maxX - minX, height: maxY - minY },
snapshot,
);
}
if (numbers.length >= 4) {
return normalizeBoxDimensions(
{ x: numbers[0] ?? 0, y: numbers[1] ?? 0, width: numbers[2] ?? 0, height: numbers[3] ?? 0 },
snapshot,
);
}
return null;
}
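// Heuristic: if any coordinate, extent, or far edge exceeds 1, the box is assumed to be
// in pixels and is scaled by the snapshot size; otherwise it is treated as already
// normalized. Every component is clamped to [0, 1].
function normalizeBoxDimensions(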
box: { x: number; y: number; width: number; height: number },
snapshot: GuideSnapshot,
): { x: number; y: number; width: number; height: number } {
const usesPixels =
box.x > 1 ||
box.y > 1 ||
box.width > 1 ||
box.height > 1 ||
box.x + box.width > 1 ||
box.y + box.height > 1;
const scaleX = usesPixels ? snapshot.width : 1;
const scaleY = usesPixels ? snapshot.height : 1;
return {
x: clamp01(box.x / scaleX),
y: clamp01(box.y / scaleY),
width: clamp01(box.width / scaleX),
height: clamp01(box.height / scaleY),
};
}
function normalizeNumber(value: unknown): number | null {
return typeof value === "number" && Number.isFinite(value) ? value : null;
}
function clamp01(value: number): number {
if (!Number.isFinite(value)) {
return 0;
}
return Math.min(1, Math.max(0, value));
}
function isRecord(value: unknown): value is Record<string, unknown> {
return typeof value === "object" && value !== null;
}
function buildWindowsOcrScript(imagePath: string, language: string): string {
const imagePathBase64 = Buffer.from(imagePath, "utf8").toString("base64");
const languageBase64 = Buffer.from(language, "utf8").toString("base64");
return `
$ErrorActionPreference = "Stop"
[Console]::OutputEncoding = [System.Text.UTF8Encoding]::new($false)
$OutputEncoding = [System.Text.UTF8Encoding]::new($false)
$imagePath = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("${imagePathBase64}"))
$languageSetting = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String("${languageBase64}"))
Add-Type -AssemblyName System.Runtime.WindowsRuntime
[void][Windows.Storage.StorageFile, Windows.Storage, ContentType=WindowsRuntime]
[void][Windows.Storage.FileAccessMode, Windows.Storage, ContentType=WindowsRuntime]
[void][Windows.Graphics.Imaging.BitmapDecoder, Windows.Graphics.Imaging, ContentType=WindowsRuntime]
[void][Windows.Graphics.Imaging.SoftwareBitmap, Windows.Graphics.Imaging, ContentType=WindowsRuntime]
[void][Windows.Media.Ocr.OcrEngine, Windows.Foundation, ContentType=WindowsRuntime]
[void][Windows.Globalization.Language, Windows.Globalization, ContentType=WindowsRuntime]
$asTaskGeneric = ([System.WindowsRuntimeSystemExtensions].GetMethods() | Where-Object {
$_.Name -eq "AsTask" -and $_.IsGenericMethodDefinition -and $_.GetParameters().Count -eq 1
})[0]
function Await-WinRt($operation, [Type]$resultType) {
$asTask = $asTaskGeneric.MakeGenericMethod($resultType)
$task = $asTask.Invoke($null, @($operation))
$task.Wait()
return $task.Result
}
function New-OcrEngine($languageSetting) {
$languageTags = @()
foreach ($item in $languageSetting.Split(",")) {
$tag = $item.Trim()
if ($tag -eq "vi") { $tag = "vi-VN" }
if ($tag -eq "en") { $tag = "en-US" }
if ($tag.Length -gt 0) { $languageTags += $tag }
}
foreach ($tag in $languageTags) {
try {
$language = [Windows.Globalization.Language]::new($tag)
$engine = [Windows.Media.Ocr.OcrEngine]::TryCreateFromLanguage($language)
if ($null -ne $engine) { return $engine }
} catch {}
}
$profileEngine = [Windows.Media.Ocr.OcrEngine]::TryCreateFromUserProfileLanguages()
if ($null -ne $profileEngine) { return $profileEngine }
return [Windows.Media.Ocr.OcrEngine]::TryCreateFromLanguage([Windows.Globalization.Language]::new("en-US"))
}
function Normalize-OcrText($value) {
if ($null -eq $value) { return "" }
$text = [string]$value
$text = [System.Text.RegularExpressions.Regex]::Replace($text, "[\\x00-\\x1F\\x7F]", " ")
return $text.Trim()
}
$file = Await-WinRt ([Windows.Storage.StorageFile]::GetFileFromPathAsync($imagePath)) ([Windows.Storage.StorageFile])
$stream = Await-WinRt ($file.OpenAsync([Windows.Storage.FileAccessMode]::Read)) ([Windows.Storage.Streams.IRandomAccessStream])
$decoder = Await-WinRt ([Windows.Graphics.Imaging.BitmapDecoder]::CreateAsync($stream)) ([Windows.Graphics.Imaging.BitmapDecoder])
$bitmap = Await-WinRt ($decoder.GetSoftwareBitmapAsync()) ([Windows.Graphics.Imaging.SoftwareBitmap])
$engine = New-OcrEngine $languageSetting
if ($null -eq $engine) { throw "No Windows OCR engine is available." }
$result = Await-WinRt ($engine.RecognizeAsync($bitmap)) ([Windows.Media.Ocr.OcrResult])
$blocks = @()
$index = 0
foreach ($line in $result.Lines) {
foreach ($word in $line.Words) {
$rect = $word.BoundingRect
$text = Normalize-OcrText $word.Text
if ($text.Length -gt 0) {
$index += 1
$blocks += [PSCustomObject]@{
text = $text
confidence = 0.75
box = [PSCustomObject]@{
x = [double]$rect.X
y = [double]$rect.Y
width = [double]$rect.Width
height = [double]$rect.Height
}
}
}
}
}
[PSCustomObject]@{ blocks = $blocks } | ConvertTo-Json -Depth 6 -Compress
`;
}
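The box handling above decides between pixel and normalized coordinates with a simple threshold test. The following sketch restates that heuristic in isolation so its behavior can be checked by hand. `toUnitBox` and the `Box` interface are illustrative names introduced here, not part of the module, and the sample dimensions (1000 by 800) are taken from the test snapshot.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

function clamp01(value: number): number {
  return Number.isFinite(value) ? Math.min(1, Math.max(0, value)) : 0;
}

// Illustrative restatement: a box is treated as pixel-based when any coordinate,
// extent, or far edge lies outside the unit square.
function toUnitBox(box: Box, imageWidth: number, imageHeight: number): Box {
  const usesPixels =
    box.x > 1 ||
    box.y > 1 ||
    box.width > 1 ||
    box.height > 1 ||
    box.x + box.width > 1 ||
    box.y + box.height > 1;
  const scaleX = usesPixels ? imageWidth : 1;
  const scaleY = usesPixels ? imageHeight : 1;
  return {
    x: clamp01(box.x / scaleX),
    y: clamp01(box.y / scaleY),
    width: clamp01(box.width / scaleX),
    height: clamp01(box.height / scaleY),
  };
}

// A pixel box on a 1000x800 image and a box that is already normalized.
const pixelBox = toUnitBox({ x: 400, y: 320, width: 120, height: 40 }, 1000, 800);
const unitBox = toUnitBox({ x: 0.1, y: 0.2, width: 0.3, height: 0.4 }, 1000, 800);
console.log(pixelBox, unitBox);
```

One consequence of a threshold of this kind is that a pixel box lying entirely within the top-left pixel (for example `{ x: 0, y: 0, width: 1, height: 1 }`) is indistinguishable from a normalized full-image box. For OCR word boxes this case is unlikely to matter in practice.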
@@ -0,0 +1,111 @@
// Lightweight i18n for the Electron main process.
// Imports the same JSON translation files used by the renderer.
import commonAr from "../src/i18n/locales/ar/common.json";
import dialogsAr from "../src/i18n/locales/ar/dialogs.json";
import commonEn from "../src/i18n/locales/en/common.json";
import dialogsEn from "../src/i18n/locales/en/dialogs.json";
import commonEs from "../src/i18n/locales/es/common.json";
import dialogsEs from "../src/i18n/locales/es/dialogs.json";
import commonFr from "../src/i18n/locales/fr/common.json";
import dialogsFr from "../src/i18n/locales/fr/dialogs.json";
import commonIt from "../src/i18n/locales/it/common.json";
import dialogsIt from "../src/i18n/locales/it/dialogs.json";
import commonJa from "../src/i18n/locales/ja-JP/common.json";
import dialogsJa from "../src/i18n/locales/ja-JP/dialogs.json";
import commonKo from "../src/i18n/locales/ko-KR/common.json";
import dialogsKo from "../src/i18n/locales/ko-KR/dialogs.json";
import commonRu from "../src/i18n/locales/ru/common.json";
import dialogsRu from "../src/i18n/locales/ru/dialogs.json";
import commonTr from "../src/i18n/locales/tr/common.json";
import dialogsTr from "../src/i18n/locales/tr/dialogs.json";
import commonVi from "../src/i18n/locales/vi/common.json";
import dialogsVi from "../src/i18n/locales/vi/dialogs.json";
import commonZh from "../src/i18n/locales/zh-CN/common.json";
import dialogsZh from "../src/i18n/locales/zh-CN/dialogs.json";
import commonZhTw from "../src/i18n/locales/zh-TW/common.json";
import dialogsZhTw from "../src/i18n/locales/zh-TW/dialogs.json";
type Locale =
| "en"
| "ar"
| "es"
| "fr"
| "it"
| "ja-JP"
| "ko-KR"
| "ru"
| "tr"
| "vi"
| "zh-CN"
| "zh-TW";
type Namespace = "common" | "dialogs";
type MessageMap = Record<string, unknown>;
const messages: Record<Locale, Record<Namespace, MessageMap>> = {
en: { common: commonEn, dialogs: dialogsEn },
ar: { common: commonAr, dialogs: dialogsAr },
es: { common: commonEs, dialogs: dialogsEs },
fr: { common: commonFr, dialogs: dialogsFr },
it: { common: commonIt, dialogs: dialogsIt },
"ja-JP": { common: commonJa, dialogs: dialogsJa },
"ko-KR": { common: commonKo, dialogs: dialogsKo },
ru: { common: commonRu, dialogs: dialogsRu },
tr: { common: commonTr, dialogs: dialogsTr },
vi: { common: commonVi, dialogs: dialogsVi },
"zh-CN": { common: commonZh, dialogs: dialogsZh },
"zh-TW": { common: commonZhTw, dialogs: dialogsZhTw },
};
let currentLocale: Locale = "en";
export function setMainLocale(locale: string) {
if (
locale === "en" ||
locale === "ar" ||
locale === "es" ||
locale === "fr" ||
locale === "it" ||
locale === "ja-JP" ||
locale === "ko-KR" ||
locale === "ru" ||
locale === "tr" ||
locale === "vi" ||
locale === "zh-CN" ||
locale === "zh-TW"
) {
currentLocale = locale;
}
}
export function getMainLocale(): Locale {
return currentLocale;
}
function getMessageValue(obj: unknown, dotPath: string): string | undefined {
const keys = dotPath.split(".");
let current: unknown = obj;
for (const key of keys) {
if (current == null || typeof current !== "object") return undefined;
current = (current as Record<string, unknown>)[key];
}
return typeof current === "string" ? current : undefined;
}
function interpolate(str: string, vars?: Record<string, string | number>): string {
if (!vars) return str;
return str.replace(/\{\{(\w+)\}\}/g, (_, key: string) => String(vars[key] ?? `{{${key}}}`));
}
export function mainT(
namespace: Namespace,
key: string,
vars?: Record<string, string | number>,
): string {
const value =
getMessageValue(messages[currentLocale]?.[namespace], key) ??
getMessageValue(messages.en?.[namespace], key);
if (value == null) return `${namespace}.${key}`;
return interpolate(value, vars);
}
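The lookup in `mainT` combines three steps: a dot-path walk, a fallback to English, and `{{name}}` interpolation. The sketch below shows the same flow against two tiny made-up catalogs. The catalogs, the `lookup`, `fill`, and `translate` names, and the sample strings are all assumptions for illustration and do not come from the real locale files.

```typescript
type Messages = Record<string, unknown>;

// Walk a dot-separated path; return the value only if it is a string.
function lookup(messages: Messages, dotPath: string): string | undefined {
  let current: unknown = messages;
  for (const key of dotPath.split(".")) {
    if (current == null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return typeof current === "string" ? current : undefined;
}

// Replace {{name}} placeholders; unknown placeholders are left as written.
function fill(template: string, vars: Record<string, string | number> = {}): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, key: string) =>
    String(vars[key] ?? `{{${key}}}`),
  );
}

// Made-up sample catalogs: the "active" locale is deliberately incomplete.
const english: Messages = { actions: { quit: "Quit {{app}}", about: "About" } };
const active: Messages = { actions: { about: "Giới thiệu" } };

function translate(key: string, vars?: Record<string, string | number>): string {
  const value = lookup(active, key) ?? lookup(english, key);
  return value === undefined ? key : fill(value, vars);
}

console.log(translate("actions.about"));
console.log(translate("actions.quit", { app: "OpenScreen" }));
console.log(translate("actions.missing"));
```

A lookup of this shape returns the key itself when nothing matches, so it never yields an empty string for a missing entry. A `||` fallback at a call site would therefore only apply if a translated string were itself empty.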
File diff suppressed because it is too large
@@ -0,0 +1,229 @@
import { ipcMain } from "electron";
import {
NATIVE_BRIDGE_CHANNEL,
NATIVE_BRIDGE_VERSION,
type NativeBridgeErrorCode,
type NativeBridgeRequest,
type NativeBridgeResponse,
type NativePlatform,
type ProjectFileResult,
type ProjectPathResult,
} from "../../src/native/contracts";
import type { CursorTelemetryLoadResult } from "../native-bridge/cursor/adapter";
import { TelemetryCursorAdapter } from "../native-bridge/cursor/telemetryCursorAdapter";
import { CursorService } from "../native-bridge/services/cursorService";
import { ProjectService } from "../native-bridge/services/projectService";
import { SystemService } from "../native-bridge/services/systemService";
import { NativeBridgeStateStore } from "../native-bridge/store";
export interface NativeBridgeContext {
getPlatform: () => NodeJS.Platform;
getCurrentProjectPath: () => string | null;
getCurrentVideoPath: () => string | null;
saveProjectFile: (
projectData: unknown,
suggestedName?: string,
existingProjectPath?: string,
) => Promise<ProjectFileResult>;
loadProjectFile: () => Promise<ProjectFileResult>;
loadCurrentProjectFile: () => Promise<ProjectFileResult>;
setCurrentVideoPath: (path: string) => ProjectPathResult | Promise<ProjectPathResult>;
getCurrentVideoPathResult: () => ProjectPathResult;
clearCurrentVideoPath: () => ProjectPathResult;
resolveAssetBasePath: () => string | null;
resolveVideoPath: (videoPath?: string | null) => string | null;
loadCursorRecordingData: (
videoPath: string,
) => Promise<import("../../src/native/contracts").CursorRecordingData>;
loadCursorTelemetry: (videoPath: string) => Promise<CursorTelemetryLoadResult>;
}
function normalizePlatform(platform: NodeJS.Platform): NativePlatform {
if (platform === "darwin" || platform === "win32") {
return platform;
}
return "linux";
}
function createMeta(requestId?: string) {
return {
version: NATIVE_BRIDGE_VERSION,
requestId: requestId || `native-${Date.now()}`,
timestampMs: Date.now(),
} as const;
}
function createSuccessResponse<TData>(requestId: string | undefined, data: TData) {
return {
ok: true,
data,
meta: createMeta(requestId),
} satisfies NativeBridgeResponse<TData>;
}
function createErrorResponse(
requestId: string | undefined,
code: NativeBridgeErrorCode,
message: string,
retryable = false,
) {
return {
ok: false,
error: {
code,
message,
retryable,
},
meta: createMeta(requestId),
} satisfies NativeBridgeResponse;
}
function isBridgeRequest(value: unknown): value is NativeBridgeRequest {
if (!value || typeof value !== "object") {
return false;
}
const candidate = value as Partial<NativeBridgeRequest>;
return typeof candidate.domain === "string" && typeof candidate.action === "string";
}
export function registerNativeBridgeHandlers(context: NativeBridgeContext) {
ipcMain.removeHandler(NATIVE_BRIDGE_CHANNEL);
const platform = normalizePlatform(context.getPlatform());
const store = new NativeBridgeStateStore(platform);
const projectService = new ProjectService({
store,
getCurrentProjectPath: context.getCurrentProjectPath,
getCurrentVideoPath: context.getCurrentVideoPath,
saveProjectFile: context.saveProjectFile,
loadProjectFile: context.loadProjectFile,
loadCurrentProjectFile: context.loadCurrentProjectFile,
setCurrentVideoPath: context.setCurrentVideoPath,
getCurrentVideoPathResult: context.getCurrentVideoPathResult,
clearCurrentVideoPath: context.clearCurrentVideoPath,
});
const cursorService = new CursorService({
store,
adapter: new TelemetryCursorAdapter({
loadRecordingData: context.loadCursorRecordingData,
resolveVideoPath: context.resolveVideoPath,
loadTelemetry: context.loadCursorTelemetry,
}),
});
const systemService = new SystemService({
store,
getPlatform: () => platform,
getAssetBasePath: context.resolveAssetBasePath,
getCursorCapabilities: () => cursorService.getCapabilities(),
});
ipcMain.handle(NATIVE_BRIDGE_CHANNEL, async (_, request: unknown) => {
if (!isBridgeRequest(request)) {
return createErrorResponse(undefined, "INVALID_REQUEST", "Invalid native bridge request.");
}
const requestId = request.requestId;
const domain = request.domain as string;
try {
switch (request.domain) {
case "system": {
const action = request.action as string;
switch (request.action) {
case "getPlatform":
return createSuccessResponse(requestId, systemService.getPlatform());
case "getAssetBasePath":
return createSuccessResponse(requestId, systemService.getAssetBasePath());
case "getCapabilities":
return createSuccessResponse(requestId, await systemService.getCapabilities());
default:
return createErrorResponse(
requestId,
"UNSUPPORTED_ACTION",
`Unsupported system action: ${action}`,
);
}
}
case "project": {
const action = request.action as string;
switch (request.action) {
case "getCurrentContext":
return createSuccessResponse(requestId, projectService.getCurrentContext());
case "saveProjectFile":
return createSuccessResponse(
requestId,
await projectService.saveProjectFile(
request.payload.projectData,
request.payload.suggestedName,
request.payload.existingProjectPath,
),
);
case "loadProjectFile":
return createSuccessResponse(requestId, await projectService.loadProjectFile());
case "loadCurrentProjectFile":
return createSuccessResponse(
requestId,
await projectService.loadCurrentProjectFile(),
);
case "setCurrentVideoPath":
return createSuccessResponse(
requestId,
await projectService.setCurrentVideoPath(request.payload.path),
);
case "getCurrentVideoPath":
return createSuccessResponse(requestId, projectService.getCurrentVideoPath());
case "clearCurrentVideoPath":
return createSuccessResponse(requestId, projectService.clearCurrentVideoPath());
default:
return createErrorResponse(
requestId,
"UNSUPPORTED_ACTION",
`Unsupported project action: ${action}`,
);
}
}
case "cursor": {
const action = request.action as string;
switch (request.action) {
case "getCapabilities":
return createSuccessResponse(requestId, await cursorService.getCapabilities());
case "getTelemetry":
return createSuccessResponse(
requestId,
await cursorService.getTelemetry(request.payload?.videoPath),
);
case "getRecordingData":
return createSuccessResponse(
requestId,
await cursorService.getRecordingData(request.payload?.videoPath),
);
default:
return createErrorResponse(
requestId,
"UNSUPPORTED_ACTION",
`Unsupported cursor action: ${action}`,
);
}
}
default:
return createErrorResponse(
requestId,
"UNSUPPORTED_ACTION",
`Unsupported bridge domain: ${domain}`,
);
}
} catch (error) {
return createErrorResponse(
requestId,
"INTERNAL_ERROR",
error instanceof Error ? error.message : "Unknown native bridge error.",
true,
);
}
});
}
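Every handler above returns the same envelope: `ok`, then either `data` or `error`, plus `meta`. A caller can narrow on `ok` to get typed access to one branch. The sketch below is a hypothetical local mirror of that envelope; the real types live in `src/native/contracts` and may differ (in particular, the type of `version` is an assumption here), and `unwrap` is an illustrative helper that is not part of the bridge.

```typescript
// Assumed shapes, modeled on createSuccessResponse/createErrorResponse above.
interface BridgeMeta {
  version: number; // assumption: the real NATIVE_BRIDGE_VERSION type is not shown
  requestId: string;
  timestampMs: number;
}

type BridgeResponse<T> =
  | { ok: true; data: T; meta: BridgeMeta }
  | { ok: false; error: { code: string; message: string; retryable: boolean }; meta: BridgeMeta };

// Illustrative helper: return the payload or throw with the error code attached.
function unwrap<T>(response: BridgeResponse<T>): T {
  if (response.ok) {
    return response.data;
  }
  throw new Error(`[${response.error.code}] ${response.error.message}`);
}

const meta: BridgeMeta = { version: 1, requestId: "req-1", timestampMs: Date.now() };
const platform = unwrap<string>({ ok: true, data: "linux", meta });
console.log(platform);
```

Throwing on the error branch is one possible design. A caller that wants to retry could instead inspect `error.retryable` before deciding, since the handler marks internal errors as retryable.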
@@ -0,0 +1,84 @@
import { mkdtemp, readFile, rm, stat } from "node:fs/promises";
import { tmpdir } from "node:os";
import path from "node:path";
import { afterEach, beforeEach, describe, expect, it } from "vitest";
import { RecordingStreamRegistry } from "./recordingStream";
describe("RecordingStreamRegistry", () => {
let dir: string;
const pathFor = (name: string) => path.join(dir, name);
beforeEach(async () => {
dir = await mkdtemp(path.join(tmpdir(), "openscreen-stream-"));
});
afterEach(async () => {
await rm(dir, { recursive: true, force: true });
});
it("streams chunks to disk in order and reports streamed on finalize", async () => {
const registry = new RecordingStreamRegistry();
await registry.open("rec.webm", pathFor("rec.webm"));
await registry.append("rec.webm", Buffer.from("hello "));
await registry.append("rec.webm", Buffer.from("world"));
const streamed = await registry.finalize("rec.webm");
expect(streamed).toBe(true);
expect(await readFile(pathFor("rec.webm"), "utf8")).toBe("hello world");
// A second finalize has nothing to close.
expect(await registry.finalize("rec.webm")).toBe(false);
});
it("reports not-streamed when no stream was opened", async () => {
const registry = new RecordingStreamRegistry();
expect(await registry.finalize("missing.webm")).toBe(false);
expect(registry.has("missing.webm")).toBe(false);
});
it("rejects open when the target path is not writable (open is awaited, not assumed)", async () => {
const registry = new RecordingStreamRegistry();
// Parent directory does not exist, so createWriteStream emits 'error' on open.
await expect(
registry.open("rec.webm", path.join(dir, "does-not-exist", "rec.webm")),
).rejects.toThrow();
// A failed open must not register a stream the renderer would treat as live.
expect(registry.has("rec.webm")).toBe(false);
});
it("rejects append when no stream is open", async () => {
const registry = new RecordingStreamRegistry();
await expect(registry.append("rec.webm", Buffer.from("x"))).rejects.toThrow(
/No active recording stream/,
);
});
it("discard closes the stream and removes the partial file", async () => {
const registry = new RecordingStreamRegistry();
await registry.open("rec.webm", pathFor("rec.webm"));
await registry.append("rec.webm", Buffer.from("partial"));
await registry.discard("rec.webm", pathFor("rec.webm"));
expect(registry.has("rec.webm")).toBe(false);
await expect(stat(pathFor("rec.webm"))).rejects.toThrow();
// Nothing left to finalize after a discard.
expect(await registry.finalize("rec.webm")).toBe(false);
});
it("discard tolerates a missing file", async () => {
const registry = new RecordingStreamRegistry();
await expect(registry.discard("never.webm", pathFor("never.webm"))).resolves.toBeUndefined();
});
it("opening the same file twice replaces the prior stream", async () => {
const registry = new RecordingStreamRegistry();
await registry.open("rec.webm", pathFor("rec.webm"));
await registry.append("rec.webm", Buffer.from("first"));
await registry.open("rec.webm", pathFor("rec.webm"));
await registry.append("rec.webm", Buffer.from("second"));
await registry.finalize("rec.webm");
expect(await readFile(pathFor("rec.webm"), "utf8")).toBe("second");
});
});
@@ -0,0 +1,147 @@
import { createWriteStream, type WriteStream } from "node:fs";
import { unlink } from "node:fs/promises";
import type { IpcMain } from "electron";
/**
* Owns the lifecycle of on-disk write streams for in-progress recordings, keyed
* by the recording's output file name. Browser MediaRecorder chunks are appended
* here as they arrive so a long recording never buffers the whole video in the
* renderer (the #616 fix).
*
* The file name is the key because it is the one value the renderer and main
* process already exchange and it is globally unique per recording, so there is
* no derived/offset key to keep in sync across the IPC boundary.
*/
export class RecordingStreamRegistry {
private readonly streams = new Map<string, WriteStream>();
/**
* Open a write stream and resolve only once the OS confirms it is writable.
* Resolving on the `open` event (rather than on `createWriteStream` returning)
* means a bad path or permission error rejects here instead of surfacing as a
* silent chunk drop later, so the renderer's fallback can take over.
*/
async open(fileName: string, filePath: string): Promise<void> {
await this.endStream(fileName);
const ws = createWriteStream(filePath, { flags: "w" });
await new Promise<void>((resolve, reject) => {
const onError = (error: Error) => reject(error);
ws.once("error", onError);
ws.once("open", () => {
ws.removeListener("error", onError);
resolve();
});
});
// Keep a listener for the stream's lifetime so a late error logs rather
// than crashing the main process with an unhandled 'error' event. Per-write
// failures still surface through the `append` callback below.
ws.on("error", (error) => {
console.error(`[recording-stream] ${fileName}:`, error);
});
this.streams.set(fileName, ws);
}
has(fileName: string): boolean {
return this.streams.has(fileName);
}
/** Append a chunk; rejects if no stream is open or the write fails. */
async append(fileName: string, chunk: Buffer): Promise<void> {
const ws = this.streams.get(fileName);
if (!ws) {
throw new Error(`No active recording stream for ${fileName}`);
}
await new Promise<void>((resolve, reject) => {
ws.write(chunk, (error) => (error ? reject(error) : resolve()));
});
}
/**
* Flush and close the stream, keeping the file. Returns whether a stream was
* open — i.e. whether the recording was streamed to disk (true) or needs its
* in-memory buffer written by the caller (false).
*/
async finalize(fileName: string): Promise<boolean> {
const ws = this.streams.get(fileName);
if (!ws) {
return false;
}
this.streams.delete(fileName);
await new Promise<void>((resolve, reject) => {
ws.end((error?: Error | null) => (error ? reject(error) : resolve()));
});
return true;
}
/**
* Close the stream (if any) and delete the partial file. Used when a streamed
* recording is discarded or fails before a successful save, so cancelled runs
* don't leak file descriptors or orphan partial recordings on disk.
*/
async discard(fileName: string, filePath: string): Promise<void> {
await this.endStream(fileName);
await unlink(filePath).catch(() => undefined);
}
private async endStream(fileName: string): Promise<void> {
const ws = this.streams.get(fileName);
if (!ws) {
return;
}
this.streams.delete(fileName);
await new Promise<void>((resolve) => ws.end(() => resolve()));
}
}
/**
* Register the streaming IPC handlers. Thin wrappers that translate the
* registry's throw-on-failure contract into the `{ success, error }` shape the
* renderer expects.
*/
export function registerRecordingStreamHandlers(
ipcMain: IpcMain,
registry: RecordingStreamRegistry,
resolveRecordingOutputPath: (fileName: string) => string,
): void {
ipcMain.handle(
"open-recording-stream",
async (_, fileName: string): Promise<{ success: boolean; error?: string }> => {
try {
await registry.open(fileName, resolveRecordingOutputPath(fileName));
return { success: true };
} catch (error) {
return { success: false, error: String(error) };
}
},
);
ipcMain.handle(
"append-recording-chunk",
async (
_,
fileName: string,
chunk: ArrayBuffer,
): Promise<{ success: boolean; error?: string }> => {
try {
await registry.append(fileName, Buffer.from(chunk));
return { success: true };
} catch (error) {
return { success: false, error: String(error) };
}
},
);
// Note: this channel closes the stream and deletes the partial file (it calls `discard`).
ipcMain.handle(
"close-recording-stream",
async (_, fileName: string): Promise<{ success: boolean; error?: string }> => {
try {
await registry.discard(fileName, resolveRecordingOutputPath(fileName));
return { success: true };
} catch (error) {
return { success: false, error: String(error) };
}
},
);
}
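The registry's `open` resolves on the stream's `open` event rather than when `createWriteStream` returns, so a bad path fails immediately instead of surfacing later as dropped chunks. The standalone sketch below isolates that pattern with Node's `fs` streams only. The helper names (`openWhenReady`, `writeChunk`, `closeStream`, `demo`) and the temp-directory prefix are assumptions for illustration; this is not the registry itself.

```typescript
import { createWriteStream, mkdtempSync, readFileSync, type WriteStream } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Resolve only after the OS has opened the file; reject if opening fails.
function openWhenReady(filePath: string): Promise<WriteStream> {
  const ws = createWriteStream(filePath, { flags: "w" });
  return new Promise<WriteStream>((resolve, reject) => {
    const onError = (error: Error) => reject(error);
    ws.once("error", onError);
    ws.once("open", () => {
      ws.removeListener("error", onError);
      // Keep a listener attached so a late error is logged instead of being unhandled.
      ws.on("error", (error) => console.error("[stream-sketch]", error));
      resolve(ws);
    });
  });
}

// Resolve once the chunk has been handed off; reject if the write fails.
function writeChunk(ws: WriteStream, chunk: Buffer): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    ws.write(chunk, (error) => (error ? reject(error) : resolve()));
  });
}

// Flush and close the stream.
function closeStream(ws: WriteStream): Promise<void> {
  return new Promise<void>((resolve) => ws.end(() => resolve()));
}

async function demo(): Promise<string> {
  const filePath = join(mkdtempSync(join(tmpdir(), "stream-sketch-")), "rec.bin");
  const ws = await openWhenReady(filePath);
  await writeChunk(ws, Buffer.from("hello "));
  await writeChunk(ws, Buffer.from("world"));
  await closeStream(ws);
  return readFileSync(filePath, "utf8");
}
```

Calling `openWhenReady` with a path whose parent directory does not exist rejects, which is the signal a caller can use to fall back to in-memory buffering.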
@@ -0,0 +1,549 @@
import fs from "node:fs/promises";
import path from "node:path";
import { fileURLToPath } from "node:url";
import {
app,
BrowserWindow,
ipcMain,
Menu,
nativeImage,
session,
systemPreferences,
Tray,
} from "electron";
import { mainT, setMainLocale } from "./i18n";
import { getSelectedDesktopSource, registerIpcHandlers } from "./ipc/handlers";
import {
createCountdownOverlayWindow,
createEditorWindow,
createHudOverlayWindow,
createSourceSelectorWindow,
} from "./windows";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
// Use Screen & System Audio Recording permissions instead of CoreAudio Tap API on macOS.
// CoreAudio Tap requires NSAudioCaptureUsageDescription in the parent app's Info.plist,
// which doesn't work when running from a terminal/IDE during development, so disabling it simplifies local runs.
if (process.platform === "darwin") {
app.commandLine.appendSwitch("disable-features", "MacCatapLoopbackAudioForScreenShare");
}
// Enable Wayland support for proper screen capture and window management
// on Wayland compositors (Hyprland, GNOME, KDE, etc.)
if (process.platform === "linux") {
const isWayland =
process.env.XDG_SESSION_TYPE === "wayland" || process.env.WAYLAND_DISPLAY !== undefined;
if (isWayland) {
app.commandLine.appendSwitch("ozone-platform", "wayland");
// Enable WebRTCPipeWireCapturer for screen capture on Wayland
app.commandLine.appendSwitch("enable-features", "WaylandWindowDrag,WebRTCPipeWireCapturer");
}
}
export const RECORDINGS_DIR = path.join(app.getPath("userData"), "recordings");
async function ensureRecordingsDir() {
try {
await fs.mkdir(RECORDINGS_DIR, { recursive: true });
console.log("RECORDINGS_DIR:", RECORDINGS_DIR);
console.log("User Data Path:", app.getPath("userData"));
} catch (error) {
console.error("Failed to create recordings directory:", error);
}
}
// The built directory structure
//
// ├─┬─┬ dist
// │ │ └── index.html
// │ │
// │ ├─┬ dist-electron
// │ │ ├── main.js
// │ │ └── preload.mjs
// │
process.env.APP_ROOT = path.join(__dirname, "..");
// Use ['ENV_NAME'] to avoid the vite:define plugin (Vite@2.x)
export const VITE_DEV_SERVER_URL = process.env["VITE_DEV_SERVER_URL"];
export const MAIN_DIST = path.join(process.env.APP_ROOT, "dist-electron");
export const RENDERER_DIST = path.join(process.env.APP_ROOT, "dist");
process.env.VITE_PUBLIC = VITE_DEV_SERVER_URL
? path.join(process.env.APP_ROOT, "public")
: RENDERER_DIST;
// Window references
let mainWindow: BrowserWindow | null = null;
let sourceSelectorWindow: BrowserWindow | null = null;
let countdownOverlayWindow: BrowserWindow | null = null;
let tray: Tray | null = null;
let selectedSourceName = "";
const isMac = process.platform === "darwin";
const trayIconSize = isMac ? 16 : 24;
// Tray Icons
const defaultTrayIcon = getTrayIcon("openscreen.png", trayIconSize);
const recordingTrayIcon = getTrayIcon("rec-button.png", trayIconSize);
function createWindow() {
mainWindow = createHudOverlayWindow();
}
function showMainWindow() {
if (mainWindow && !mainWindow.isDestroyed()) {
if (mainWindow.isMinimized()) {
mainWindow.restore();
}
mainWindow.show();
mainWindow.focus();
return;
}
createWindow();
}
function isEditorWindow(window: BrowserWindow) {
return window.webContents.getURL().includes("windowType=editor");
}
function sendEditorMenuAction(
channel: "menu-load-project" | "menu-save-project" | "menu-save-project-as",
) {
let targetWindow = BrowserWindow.getFocusedWindow() ?? mainWindow;
if (!targetWindow || targetWindow.isDestroyed() || !isEditorWindow(targetWindow)) {
createEditorWindowWrapper();
targetWindow = mainWindow;
if (!targetWindow || targetWindow.isDestroyed()) return;
targetWindow.webContents.once("did-finish-load", () => {
if (!targetWindow || targetWindow.isDestroyed()) return;
targetWindow.webContents.send(channel);
});
return;
}
targetWindow.webContents.send(channel);
}
function setupApplicationMenu() {
const isMac = process.platform === "darwin";
const template: Electron.MenuItemConstructorOptions[] = [];
if (isMac) {
template.push({
label: app.name,
submenu: [
{
role: "about",
label: mainT("common", "actions.about") || "About OpenScreen",
},
{ type: "separator" },
{
role: "services",
label: mainT("common", "actions.services") || "Services",
},
{ type: "separator" },
{
role: "hide",
label: mainT("common", "actions.hide") || "Hide OpenScreen",
},
{
role: "hideOthers",
label: mainT("common", "actions.hideOthers") || "Hide Others",
},
{
role: "unhide",
label: mainT("common", "actions.unhide") || "Show All",
},
{ type: "separator" },
{ role: "quit", label: mainT("common", "actions.quit") || "Quit" },
],
});
}
template.push(
{
label: mainT("common", "actions.file") || "File",
submenu: [
{
label: mainT("dialogs", "unsavedChanges.loadProject") || "Load Project…",
accelerator: "CmdOrCtrl+O",
click: () => sendEditorMenuAction("menu-load-project"),
},
{
label: mainT("dialogs", "unsavedChanges.saveProject") || "Save Project…",
accelerator: "CmdOrCtrl+S",
click: () => sendEditorMenuAction("menu-save-project"),
},
{
label: mainT("dialogs", "unsavedChanges.saveProjectAs") || "Save Project As…",
accelerator: "CmdOrCtrl+Shift+S",
click: () => sendEditorMenuAction("menu-save-project-as"),
},
...(isMac
? []
: [
{ type: "separator" as const },
{
role: "quit" as const,
label: mainT("common", "actions.quit") || "Quit",
},
]),
],
},
{
label: mainT("common", "actions.edit") || "Edit",
submenu: [
{ role: "undo", label: mainT("common", "actions.undo") || "Undo" },
{ role: "redo", label: mainT("common", "actions.redo") || "Redo" },
{ type: "separator" },
{ role: "cut", label: mainT("common", "actions.cut") || "Cut" },
{ role: "copy", label: mainT("common", "actions.copy") || "Copy" },
{ role: "paste", label: mainT("common", "actions.paste") || "Paste" },
{
role: "selectAll",
label: mainT("common", "actions.selectAll") || "Select All",
},
],
},
{
label: mainT("common", "actions.view") || "View",
submenu: [
{
role: "reload",
label: mainT("common", "actions.reload") || "Reload",
},
{
role: "forceReload",
label: mainT("common", "actions.forceReload") || "Force Reload",
},
{
role: "toggleDevTools",
label: mainT("common", "actions.toggleDevTools") || "Toggle Developer Tools",
},
{ type: "separator" },
{
role: "resetZoom",
label: mainT("common", "actions.actualSize") || "Actual Size",
},
{
role: "zoomIn",
label: mainT("common", "actions.zoomIn") || "Zoom In",
},
{
role: "zoomOut",
label: mainT("common", "actions.zoomOut") || "Zoom Out",
},
{ type: "separator" },
{
role: "togglefullscreen",
label: mainT("common", "actions.toggleFullScreen") || "Toggle Full Screen",
},
],
},
{
label: mainT("common", "actions.window") || "Window",
submenu: isMac
? [
{
role: "minimize",
label: mainT("common", "actions.minimize") || "Minimize",
},
{ role: "zoom" },
{ type: "separator" },
{ role: "front" },
]
: [
{
role: "minimize",
label: mainT("common", "actions.minimize") || "Minimize",
},
{
role: "close",
label: mainT("common", "actions.close") || "Close",
},
],
},
);
const menu = Menu.buildFromTemplate(template);
Menu.setApplicationMenu(menu);
}
function createTray() {
tray = new Tray(defaultTrayIcon);
tray.on("click", () => {
showMainWindow();
});
tray.on("double-click", () => {
showMainWindow();
});
}
function getTrayIcon(filename: string, size: number) {
return nativeImage
.createFromPath(path.join(process.env.VITE_PUBLIC || RENDERER_DIST, filename))
.resize({
width: size,
height: size,
quality: "best",
});
}
function updateTrayMenu(recording: boolean = false) {
if (!tray) return;
const trayIcon = recording ? recordingTrayIcon : defaultTrayIcon;
const trayToolTip = recording
? mainT("common", "actions.recordingStatus", {
source: selectedSourceName,
}) || `Recording: ${selectedSourceName}`
: "OpenScreen";
const menuTemplate = recording
? [
{
label: mainT("common", "actions.stopRecording") || "Stop Recording",
click: () => {
if (mainWindow && !mainWindow.isDestroyed()) {
mainWindow.webContents.send("stop-recording-from-tray");
}
},
},
]
: [
{
label: mainT("common", "actions.open") || "Open",
click: () => {
showMainWindow();
},
},
{
label: mainT("common", "actions.quit") || "Quit",
click: () => {
app.quit();
},
},
];
tray.setImage(trayIcon);
tray.setToolTip(trayToolTip);
tray.setContextMenu(Menu.buildFromTemplate(menuTemplate));
}
let editorHasUnsavedChanges = false;
let isForceClosing = false;
let isCloseConfirmInFlight = false;
ipcMain.on("set-has-unsaved-changes", (_, hasChanges: boolean) => {
editorHasUnsavedChanges = hasChanges;
});
function forceCloseEditorWindow(windowToClose: BrowserWindow | null) {
if (!windowToClose || windowToClose.isDestroyed()) return;
isForceClosing = true;
setImmediate(() => {
try {
if (!windowToClose.isDestroyed()) {
windowToClose.close();
}
} finally {
isForceClosing = false;
}
});
}
function createEditorWindowWrapper() {
if (mainWindow) {
isForceClosing = true;
mainWindow.close();
isForceClosing = false;
mainWindow = null;
}
mainWindow = createEditorWindow();
editorHasUnsavedChanges = false;
mainWindow.on("close", (event) => {
if (isForceClosing || !editorHasUnsavedChanges || isCloseConfirmInFlight) return;
event.preventDefault();
isCloseConfirmInFlight = true;
const windowToClose = mainWindow;
if (!windowToClose || windowToClose.isDestroyed()) return;
// Ask renderer to show the custom in-app dialog
windowToClose.webContents.send("request-close-confirm");
ipcMain.once("close-confirm-response", (event, choice: "save" | "discard" | "cancel") => {
if (event.sender.id !== windowToClose?.webContents.id) return;
isCloseConfirmInFlight = false;
if (!windowToClose || windowToClose.isDestroyed()) return;
if (choice === "save") {
// Tell renderer to save the project, then close when done
windowToClose.webContents.send("request-save-before-close");
ipcMain.once("save-before-close-done", (event, shouldClose: boolean) => {
if (event.sender.id !== windowToClose?.webContents.id) return;
if (!shouldClose) return;
forceCloseEditorWindow(windowToClose);
});
} else if (choice === "discard") {
forceCloseEditorWindow(windowToClose);
}
// "cancel": flag reset, window stays open
});
});
}
function createSourceSelectorWindowWrapper() {
sourceSelectorWindow = createSourceSelectorWindow();
sourceSelectorWindow.on("closed", () => {
sourceSelectorWindow = null;
});
return sourceSelectorWindow;
}
function createCountdownOverlayWindowWrapper() {
if (countdownOverlayWindow && !countdownOverlayWindow.isDestroyed()) {
return countdownOverlayWindow;
}
countdownOverlayWindow = createCountdownOverlayWindow();
countdownOverlayWindow.on("closed", () => {
countdownOverlayWindow = null;
});
return countdownOverlayWindow;
}
// Closing every window quits the app entirely (tray icon goes too).
// The in-app "Return to Recorder" button covers the editor → HUD round-trip,
// so closing the last window is an explicit "I'm done" signal.
app.on("window-all-closed", () => {
app.quit();
});
app.on("activate", () => {
// On macOS it's common to re-create a window in the app when the
// dock icon is clicked and there are no other windows open.
const hasVisibleWindow = BrowserWindow.getAllWindows().some((window) => {
if (window.isDestroyed() || !window.isVisible()) {
return false;
}
const url = window.webContents.getURL();
const isCountdownOverlayWindow = url.includes("windowType=countdown-overlay");
return !isCountdownOverlayWindow;
});
if (!hasVisibleWindow) {
showMainWindow();
}
});
// Register all IPC handlers when app is ready
app.whenReady().then(async () => {
// Force the app into "regular" activation policy so the Dock icon appears.
// The HUD overlay (transparent + frameless + skipTaskbar) is the first
// window we open, and AppKit otherwise classifies us as an accessory app.
if (process.platform === "darwin") {
app.dock?.show();
}
// Allow microphone/media/screen permission checks
session.defaultSession.setPermissionCheckHandler((_webContents, permission) => {
const allowed = [
"media",
"audioCapture",
"microphone",
"videoCapture",
"camera",
"screen",
"display-capture",
];
return allowed.includes(permission);
});
session.defaultSession.setPermissionRequestHandler((_webContents, permission, callback) => {
const allowed = [
"media",
"audioCapture",
"microphone",
"videoCapture",
"camera",
"screen",
"display-capture",
];
callback(allowed.includes(permission));
});
session.defaultSession.setDisplayMediaRequestHandler(
(request, callback) => {
const source = getSelectedDesktopSource();
if (!request.videoRequested || !source) {
callback({});
return;
}
callback({
video: source,
...(request.audioRequested && process.platform === "win32" ? { audio: "loopback" } : {}),
});
},
{ useSystemPicker: false },
);
// Request microphone permission from macOS. Screen Recording is requested
// lazily from the source-picker action so the system prompt is not hidden
// behind OpenScreen's source selector window.
if (process.platform === "darwin") {
const micStatus = systemPreferences.getMediaAccessStatus("microphone");
if (micStatus !== "granted") {
await systemPreferences.askForMediaAccess("microphone");
}
}
// Listen for HUD overlay quit event (macOS only)
ipcMain.on("hud-overlay-close", () => {
app.quit();
});
ipcMain.handle("set-locale", (_, locale: string) => {
setMainLocale(locale);
setupApplicationMenu();
updateTrayMenu();
});
createTray();
updateTrayMenu();
setupApplicationMenu();
// Ensure recordings directory exists
await ensureRecordingsDir();
function switchToHudWrapper() {
if (mainWindow) {
isForceClosing = true;
mainWindow.close();
isForceClosing = false;
mainWindow = null;
}
showMainWindow();
}
registerIpcHandlers(
createEditorWindowWrapper,
createSourceSelectorWindowWrapper,
createCountdownOverlayWindowWrapper,
() => mainWindow,
() => sourceSelectorWindow,
() => countdownOverlayWindow,
(recording: boolean, sourceName: string) => {
selectedSourceName = sourceName;
if (!tray) createTray();
updateTrayMenu(recording);
if (!recording) {
showMainWindow();
}
},
switchToHudWrapper,
);
createWindow();
});
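The permission-check and permission-request handlers above each carry their own copy of the same allowlist. A minimal sketch of sharing one list between them, assuming a hypothetical `isAllowedPermission` helper and `ALLOWED_PERMISSIONS` constant that are not part of this commit:

```typescript
// Hypothetical helper (not in this commit): a single allowlist that both
// setPermissionCheckHandler and setPermissionRequestHandler could consult.
const ALLOWED_PERMISSIONS: ReadonlySet<string> = new Set([
	"media",
	"audioCapture",
	"microphone",
	"videoCapture",
	"camera",
	"screen",
	"display-capture",
]);

function isAllowedPermission(permission: string): boolean {
	return ALLOWED_PERMISSIONS.has(permission);
}
```

With a helper like this, the check handler would return `isAllowedPermission(permission)` and the request handler would call `callback(isAllowedPermission(permission))`, so the two lists cannot drift apart.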
@@ -0,0 +1,20 @@
import type {
CursorCapabilities,
CursorProviderKind,
CursorRecordingData,
CursorTelemetryPoint,
} from "../../../src/native/contracts";
export interface CursorTelemetryLoadResult {
success: boolean;
samples: CursorTelemetryPoint[];
message?: string;
error?: string;
}
export interface CursorNativeAdapter {
readonly kind: CursorProviderKind;
getCapabilities(): Promise<CursorCapabilities>;
getRecordingData(videoPath?: string | null): Promise<CursorRecordingData>;
getTelemetry(videoPath?: string | null): Promise<CursorTelemetryLoadResult>;
}
@@ -0,0 +1,46 @@
import type { Rectangle } from "electron";
import { MacNativeCursorRecordingSession } from "./macNativeCursorRecordingSession";
import type { CursorRecordingSession } from "./session";
import { TelemetryRecordingSession } from "./telemetryRecordingSession";
import { WindowsNativeRecordingSession } from "./windowsNativeRecordingSession";
interface CreateCursorRecordingSessionOptions {
getDisplayBounds: () => Rectangle | null;
maxSamples: number;
platform: NodeJS.Platform;
sampleIntervalMs: number;
sourceId?: string | null;
startTimeMs?: number;
}
export function createCursorRecordingSession(
options: CreateCursorRecordingSessionOptions,
): CursorRecordingSession {
if (options.platform === "win32") {
return new WindowsNativeRecordingSession({
getDisplayBounds: options.getDisplayBounds,
maxSamples: options.maxSamples,
sampleIntervalMs: options.sampleIntervalMs,
sourceId: options.sourceId,
startTimeMs: options.startTimeMs,
});
}
if (options.platform === "darwin") {
return new MacNativeCursorRecordingSession({
getDisplayBounds: options.getDisplayBounds,
maxSamples: options.maxSamples,
sampleIntervalMs: options.sampleIntervalMs,
startTimeMs: options.startTimeMs,
});
}
// Linux: capture cursor positions via Electron's `screen` API on an interval.
// No cursor sprites/assets and no clicks — just position telemetry.
return new TelemetryRecordingSession({
getDisplayBounds: options.getDisplayBounds,
maxSamples: options.maxSamples,
sampleIntervalMs: options.sampleIntervalMs,
startTimeMs: options.startTimeMs,
});
}
@@ -0,0 +1,411 @@
import { type ChildProcessByStdio, spawn } from "node:child_process";
import { accessSync, constants as fsConstants } from "node:fs";
import path from "node:path";
import type { Readable } from "node:stream";
import { type Rectangle, screen, systemPreferences } from "electron";
import type {
CursorRecordingData,
CursorRecordingSample,
NativeCursorType,
} from "../../../../src/native/contracts";
import type { CursorRecordingSession } from "./session";
interface MacNativeCursorRecordingSessionOptions {
getDisplayBounds: () => Rectangle | null;
maxSamples: number;
sampleIntervalMs: number;
startTimeMs?: number;
}
type MacCursorEvent =
| {
type: "ready";
timestampMs: number;
accessibilityTrusted?: boolean;
mouseTapReady?: boolean;
}
| {
type: "sample";
timestampMs: number;
cursorType?: NativeCursorType | null;
leftButtonDown?: boolean;
leftButtonPressed?: boolean;
leftButtonReleased?: boolean;
};
const HELPER_NAME = "openscreen-macos-cursor-helper";
const READY_TIMEOUT_MS = 5_000;
function helperCandidates() {
const envPath = process.env.OPENSCREEN_MAC_CURSOR_HELPER_EXE?.trim();
const appRoot = process.env.APP_ROOT ? path.resolve(process.env.APP_ROOT) : process.cwd();
const archTag = process.arch === "arm64" ? "darwin-arm64" : "darwin-x64";
const resourceRoot =
typeof process.resourcesPath === "string"
? process.resourcesPath
: path.join(appRoot, "resources");
return [
envPath,
path.join(appRoot, "electron", "native", "screencapturekit", "build", HELPER_NAME),
path.join(appRoot, "electron", "native", "bin", archTag, HELPER_NAME),
path.join(resourceRoot, "electron", "native", "bin", archTag, HELPER_NAME),
].filter((candidate): candidate is string => Boolean(candidate));
}
export function findMacCursorHelperPath() {
for (const candidate of helperCandidates()) {
try {
accessSync(candidate, fsConstants.X_OK);
return candidate;
} catch {
// Try the next helper location.
}
}
return null;
}
export async function requestMacCursorAccessibilityAccess() {
if (process.platform !== "darwin") {
return { success: true, granted: true, status: "granted" };
}
try {
systemPreferences.isTrustedAccessibilityClient(true);
} catch {
// Continue with helper probing; it can trigger the same macOS prompt.
}
const helperPath = findMacCursorHelperPath();
if (!helperPath) {
return { success: true, granted: false, status: "missing-helper" };
}
return new Promise<{ success: boolean; granted: boolean; status: string; error?: string }>(
(resolve) => {
const child = spawn(helperPath, [JSON.stringify({ sampleIntervalMs: 250 })], {
stdio: ["ignore", "pipe", "pipe"],
});
let settled = false;
let lineBuffer = "";
const finish = (result: {
success: boolean;
granted: boolean;
status: string;
error?: string;
}) => {
if (settled) {
return;
}
settled = true;
clearTimeout(timer);
if (!child.killed) {
child.kill("SIGTERM");
}
resolve(result);
};
const timer = setTimeout(() => {
finish({
success: false,
granted: false,
status: "timeout",
error: "Timed out waiting for macOS cursor helper",
});
}, READY_TIMEOUT_MS);
child.stdout.setEncoding("utf8");
child.stdout.on("data", (chunk: string) => {
lineBuffer += chunk;
const lines = lineBuffer.split(/\r?\n/);
lineBuffer = lines.pop() ?? "";
for (const line of lines) {
const trimmed = line.trim();
if (!trimmed) {
continue;
}
try {
const event = JSON.parse(trimmed) as MacCursorEvent;
if (event.type === "ready") {
finish({
success: true,
granted: event.accessibilityTrusted === true,
status: event.accessibilityTrusted === true ? "granted" : "not-determined",
});
return;
}
} catch {
// Ignore non-JSON helper output.
}
}
});
child.once("error", (error) => {
finish({
success: false,
granted: false,
status: "error",
error: error.message,
});
});
child.once("exit", (code, signal) => {
finish({
success: false,
granted: false,
status: "exited",
error: `macOS cursor helper exited before ready (code=${code}, signal=${signal})`,
});
});
},
);
}
function clamp(value: number, min: number, max: number) {
return Math.min(max, Math.max(min, value));
}
function normalizeCursorType(value: unknown): NativeCursorType | null {
return value === "arrow" || value === "pointer" || value === "text" ? value : null;
}
export class MacNativeCursorRecordingSession implements CursorRecordingSession {
private samples: CursorRecordingSample[] = [];
private process: ChildProcessByStdio<null, Readable, Readable> | null = null;
private lineBuffer = "";
private startTimeMs = 0;
private fallbackInterval: NodeJS.Timeout | null = null;
private readyResolve: (() => void) | null = null;
private readyReject: ((error: Error) => void) | null = null;
private readyTimer: NodeJS.Timeout | null = null;
private previousLeftButtonDown = false;
private consecutiveOutsideSamples = 0;
// Only hide after this many consecutive out-of-bounds samples (≈100ms at 33ms interval).
// Fast swipes that briefly exit the display are clipped by clip-path instead of disappearing.
private static readonly OUTSIDE_HIDE_THRESHOLD = 3;
constructor(private readonly options: MacNativeCursorRecordingSessionOptions) {}
async start(): Promise<void> {
this.samples = [];
this.lineBuffer = "";
this.startTimeMs = this.options.startTimeMs ?? Date.now();
this.previousLeftButtonDown = false;
this.consecutiveOutsideSamples = 0;
try {
systemPreferences.isTrustedAccessibilityClient(true);
} catch {
// Link cursor detection degrades to arrow when Accessibility is unavailable.
}
const helperPath = findMacCursorHelperPath();
if (!helperPath) {
this.startPositionOnlyFallback();
return;
}
const child = spawn(
helperPath,
[
JSON.stringify({
sampleIntervalMs: this.options.sampleIntervalMs,
}),
],
{
stdio: ["ignore", "pipe", "pipe"],
},
);
this.process = child;
child.stdout.setEncoding("utf8");
child.stdout.on("data", (chunk: string) => this.handleStdoutChunk(chunk));
child.stderr.setEncoding("utf8");
child.stderr.on("data", (chunk: string) => {
const message = chunk.trim();
if (message) {
console.error("[cursor-macos]", message);
}
});
child.once("exit", (code, signal) => {
this.rejectReady(
new Error(`macOS cursor helper exited before ready (code=${code}, signal=${signal})`),
);
this.process = null;
});
child.once("error", (error) => {
this.rejectReady(error);
this.process = null;
});
try {
await this.waitUntilReady();
} catch (error) {
this.killHelperProcess(child);
this.process = null;
console.warn("[cursor-macos] falling back to position-only cursor telemetry:", error);
this.startPositionOnlyFallback();
}
}
async stop(): Promise<CursorRecordingData> {
const child = this.process;
this.process = null;
this.clearReadyState();
if (this.fallbackInterval) {
clearInterval(this.fallbackInterval);
this.fallbackInterval = null;
}
if (child) {
this.killHelperProcess(child);
}
return {
version: 2,
provider: "none",
samples: this.samples,
assets: [],
};
}
private startPositionOnlyFallback() {
this.captureSample(Date.now(), null, false, false, false);
this.fallbackInterval = setInterval(() => {
this.captureSample(Date.now(), null, false, false, false);
}, this.options.sampleIntervalMs);
}
private handleStdoutChunk(chunk: string) {
this.lineBuffer += chunk;
const lines = this.lineBuffer.split(/\r?\n/);
this.lineBuffer = lines.pop() ?? "";
for (const line of lines) {
const trimmedLine = line.trim();
if (!trimmedLine) {
continue;
}
try {
this.handleEvent(JSON.parse(trimmedLine) as MacCursorEvent);
} catch (error) {
console.error("Failed to parse macOS cursor helper output:", error, trimmedLine);
}
}
}
private handleEvent(payload: MacCursorEvent) {
if (payload.type === "ready") {
if (payload.accessibilityTrusted === false) {
console.warn(
"[cursor-macos] Accessibility is not trusted; cursor shape detection will be arrow-only.",
);
}
this.resolveReady();
return;
}
if (payload.type === "sample") {
this.captureSample(
payload.timestampMs,
normalizeCursorType(payload.cursorType),
payload.leftButtonDown === true,
payload.leftButtonPressed === true,
payload.leftButtonReleased === true,
);
}
}
private captureSample(
timestampMs: number,
cursorType: NativeCursorType | null,
leftButtonDown: boolean,
leftButtonPressed: boolean,
leftButtonReleased: boolean,
) {
const cursor = screen.getCursorScreenPoint();
const bounds = this.options.getDisplayBounds() ?? screen.getDisplayNearestPoint(cursor).bounds;
const width = Math.max(1, bounds.width);
const height = Math.max(1, bounds.height);
const normalizedX = (cursor.x - bounds.x) / width;
const normalizedY = (cursor.y - bounds.y) / height;
const isOutsideDisplay =
normalizedX < 0 || normalizedX > 1 || normalizedY < 0 || normalizedY > 1;
// Fast swipes that briefly exit the display (<THRESHOLD samples) are handled by
// clip-path — the cursor clips to the canvas edge instead of snapping invisible.
// Sustained exits (≥THRESHOLD samples, ≈100ms) mark visible=false to prevent
// ghost cursors and motion trails from multi-display movement.
if (isOutsideDisplay) {
this.consecutiveOutsideSamples++;
} else {
this.consecutiveOutsideSamples = 0;
}
const visible =
this.consecutiveOutsideSamples < MacNativeCursorRecordingSession.OUTSIDE_HIDE_THRESHOLD;
const interactionType =
leftButtonPressed || (leftButtonDown && !this.previousLeftButtonDown)
? "click"
: leftButtonReleased || (!leftButtonDown && this.previousLeftButtonDown)
? "mouseup"
: "move";
this.previousLeftButtonDown = leftButtonDown;
this.samples.push({
timeMs: Math.max(0, timestampMs - this.startTimeMs),
cx: clamp(normalizedX, 0, 1),
cy: clamp(normalizedY, 0, 1),
visible,
interactionType,
...(cursorType ? { cursorType } : {}),
});
if (this.samples.length > this.options.maxSamples) {
this.samples.shift();
}
}
private waitUntilReady() {
return new Promise<void>((resolve, reject) => {
this.readyResolve = resolve;
this.readyReject = reject;
this.readyTimer = setTimeout(() => {
this.rejectReady(new Error("Timed out waiting for macOS cursor helper"));
}, READY_TIMEOUT_MS);
});
}
private resolveReady() {
const resolve = this.readyResolve;
this.clearReadyState();
resolve?.();
}
private rejectReady(error: Error) {
const reject = this.readyReject;
this.clearReadyState();
reject?.(error);
}
private clearReadyState() {
if (this.readyTimer) {
clearTimeout(this.readyTimer);
this.readyTimer = null;
}
this.readyResolve = null;
this.readyReject = null;
}
private killHelperProcess(child: ChildProcessByStdio<null, Readable, Readable>) {
if (child.killed) {
return;
}
child.kill("SIGTERM");
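setTimeout(() => {
// `child.killed` only reports that a signal was sent, so it is already true here.
// Check the exit state instead so SIGKILL is sent when SIGTERM was ignored.
if (child.exitCode === null && child.signalCode === null) {
child.kill("SIGKILL");
}
}, 500).unref();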
}
}
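The helper sessions read newline-delimited JSON from the child's stdout, and a chunk can end in the middle of a line. A minimal standalone sketch of the buffering that `handleStdoutChunk` performs, assuming a hypothetical `createJsonLineParser` function that is not part of this commit:

```typescript
// Hypothetical standalone version of the line buffering in handleStdoutChunk.
function createJsonLineParser<T>(onEvent: (event: T) => void) {
	let buffer = "";
	return (chunk: string): void => {
		buffer += chunk;
		const lines = buffer.split(/\r?\n/);
		// The last element is an incomplete line, or "" when the chunk ended on a newline.
		buffer = lines.pop() ?? "";
		for (const line of lines) {
			const trimmed = line.trim();
			if (!trimmed) continue;
			try {
				onEvent(JSON.parse(trimmed) as T);
			} catch {
				// Skip lines that are not valid JSON.
			}
		}
	};
}

// Usage: one event split across two chunks, plus a non-JSON line and a trailing partial line.
const received: Array<{ type: string }> = [];
const feed = createJsonLineParser<{ type: string }>((event) => {
	received.push(event);
});
feed('{"type":"rea');
feed('dy"}\n{"type":"sample"}\nnot json\n{"type":"sam');
```

After both calls, `received` holds the `ready` and `sample` events; the non-JSON line is dropped and the trailing partial line stays in the buffer until its newline arrives.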
@@ -0,0 +1,6 @@
import type { CursorRecordingData } from "../../../../src/native/contracts";
export interface CursorRecordingSession {
start(): Promise<void>;
stop(): Promise<CursorRecordingData>;
}
@@ -0,0 +1,63 @@
import { type Rectangle, screen } from "electron";
import type { CursorRecordingData, CursorRecordingSample } from "../../../../src/native/contracts";
import type { CursorRecordingSession } from "./session";
interface TelemetryRecordingSessionOptions {
getDisplayBounds: () => Rectangle | null;
maxSamples: number;
sampleIntervalMs: number;
startTimeMs?: number;
}
function clamp(value: number, min: number, max: number) {
return Math.min(max, Math.max(min, value));
}
export class TelemetryRecordingSession implements CursorRecordingSession {
private samples: CursorRecordingSample[] = [];
private interval: NodeJS.Timeout | null = null;
private startTimeMs = 0;
constructor(private readonly options: TelemetryRecordingSessionOptions) {}
async start(): Promise<void> {
this.samples = [];
this.startTimeMs = this.options.startTimeMs ?? Date.now();
this.captureSample();
this.interval = setInterval(() => {
this.captureSample();
}, this.options.sampleIntervalMs);
}
async stop(): Promise<CursorRecordingData> {
if (this.interval) {
clearInterval(this.interval);
this.interval = null;
}
return {
version: 2,
provider: "none",
samples: this.samples,
assets: [],
};
}
private captureSample() {
const cursor = screen.getCursorScreenPoint();
const display = this.options.getDisplayBounds() ?? screen.getDisplayNearestPoint(cursor).bounds;
const width = Math.max(1, display.width);
const height = Math.max(1, display.height);
this.samples.push({
timeMs: Math.max(0, Date.now() - this.startTimeMs),
cx: clamp((cursor.x - display.x) / width, 0, 1),
cy: clamp((cursor.y - display.y) / height, 0, 1),
visible: true,
});
if (this.samples.length > this.options.maxSamples) {
this.samples.shift();
}
}
}
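The macOS and telemetry sessions both convert screen coordinates into values in the 0..1 range relative to the display bounds, and the macOS session additionally hides the cursor only after several consecutive out-of-bounds samples. A minimal sketch of that arithmetic as pure functions, assuming hypothetical names `normalizePoint` and `createVisibilityTracker` (neither is in this commit) and the threshold of 3 used by `OUTSIDE_HIDE_THRESHOLD`:

```typescript
interface Bounds {
	x: number;
	y: number;
	width: number;
	height: number;
}

// Hypothetical pure helper mirroring the normalization in captureSample.
function normalizePoint(point: { x: number; y: number }, bounds: Bounds) {
	const width = Math.max(1, bounds.width);
	const height = Math.max(1, bounds.height);
	const nx = (point.x - bounds.x) / width;
	const ny = (point.y - bounds.y) / height;
	const outside = nx < 0 || nx > 1 || ny < 0 || ny > 1;
	const clamp01 = (value: number) => Math.min(1, Math.max(0, value));
	return { cx: clamp01(nx), cy: clamp01(ny), outside };
}

// Hypothetical tracker: the cursor stays visible until `threshold`
// consecutive samples fall outside the display.
function createVisibilityTracker(threshold = 3) {
	let consecutiveOutside = 0;
	return (outside: boolean): boolean => {
		consecutiveOutside = outside ? consecutiveOutside + 1 : 0;
		return consecutiveOutside < threshold;
	};
}

// Usage: a secondary display positioned to the right of the primary one.
const display: Bounds = { x: 1920, y: 0, width: 1920, height: 1080 };
const center = normalizePoint({ x: 2880, y: 540 }, display);
const offLeft = normalizePoint({ x: 1000, y: 540 }, display);
const isVisible = createVisibilityTracker();
const visibility = [true, true, true, false].map((outside) => isVisible(outside));
```

Here `center` normalizes to (0.5, 0.5), `offLeft` is clamped to the left edge and flagged as outside, and the tracker reports the cursor hidden only on the third consecutive outside sample before resetting when the cursor returns.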
@@ -0,0 +1,326 @@
import { type ChildProcessByStdio, spawn } from "node:child_process";
import { existsSync } from "node:fs";
import { join } from "node:path";
import type { Readable } from "node:stream";
import { app, screen } from "electron";
import { parseWindowHandleFromSourceId } from "../../../../src/lib/nativeWindowsRecording";
import type {
CursorRecordingData,
CursorRecordingSample,
NativeCursorAsset,
} from "../../../../src/native/contracts";
import type { CursorRecordingSession } from "./session";
import type {
WindowsCursorEvent,
WindowsNativeRecordingSessionOptions,
} from "./windowsNativeRecordingSession.types";
function getCursorSamplerCandidates(): string[] {
const envPath = process.env.OPENSCREEN_CURSOR_SAMPLER_EXE?.trim();
const archTag = process.arch === "arm64" ? "win32-arm64" : "win32-x64";
const resolve = (...segs: string[]) => {
const p = join(app.getAppPath(), ...segs);
return app.isPackaged ? p.replace(/\.asar([/\\])/, ".asar.unpacked$1") : p;
};
return [
envPath,
resolve("electron", "native", "wgc-capture", "build", "cursor-sampler.exe"),
resolve("electron", "native", "bin", archTag, "cursor-sampler.exe"),
].filter((c): c is string => Boolean(c));
}
function findCursorSamplerPath(): string | null {
for (const candidate of getCursorSamplerCandidates()) {
if (existsSync(candidate)) return candidate;
}
return null;
}
const READY_TIMEOUT_MS = 5_000;
interface NormalizedSample {
sample: CursorRecordingSample;
withinBounds: boolean;
}
export class WindowsNativeRecordingSession implements CursorRecordingSession {
private assets = new Map<string, NativeCursorAsset>();
private samples: CursorRecordingSample[] = [];
private process: ChildProcessByStdio<null, Readable, Readable> | null = null;
private lineBuffer = "";
private startTimeMs = 0;
private readyResolve: (() => void) | null = null;
private readyReject: ((error: Error) => void) | null = null;
private readyTimer: NodeJS.Timeout | null = null;
private sampleCount = 0;
private outOfBoundsSampleCount = 0;
private previousLeftButtonDown = false;
constructor(private readonly options: WindowsNativeRecordingSessionOptions) {}
async start(): Promise<void> {
this.assets.clear();
this.samples = [];
this.lineBuffer = "";
this.startTimeMs = this.options.startTimeMs ?? Date.now();
this.sampleCount = 0;
this.outOfBoundsSampleCount = 0;
this.previousLeftButtonDown = false;
const helperPath = findCursorSamplerPath();
if (!helperPath) {
throw new Error("Windows cursor sampler helper is not available.");
}
const windowHandle = parseWindowHandleFromSourceId(this.options.sourceId);
const args = [String(this.options.sampleIntervalMs)];
if (windowHandle) args.push(windowHandle);
const child = spawn(helperPath, args, {
stdio: ["ignore", "pipe", "pipe"],
windowsHide: true,
});
this.process = child;
this.logDiagnostic("spawn", {
pid: child.pid ?? null,
sampleIntervalMs: this.options.sampleIntervalMs,
sourceId: this.options.sourceId ?? null,
windowHandle,
});
child.stdout.setEncoding("utf8");
child.stdout.on("data", (chunk: string) => {
this.handleStdoutChunk(chunk);
});
child.stderr.setEncoding("utf8");
child.stderr.on("data", (chunk: string) => {
const message = chunk.trim();
if (message) {
this.logDiagnostic("stderr", { message });
// Only log non-empty stderr output, matching the macOS session.
console.error("[cursor-native]", message);
}
});
child.once("exit", (code, signal) => {
this.logDiagnostic("exit", {
code,
signal,
sampleCount: this.sampleCount,
assetCount: this.assets.size,
outOfBoundsSampleCount: this.outOfBoundsSampleCount,
});
this.rejectReady(
new Error(`Windows cursor helper exited before ready (code=${code}, signal=${signal})`),
);
});
child.once("error", (error) => {
this.logDiagnostic("process-error", { message: error.message });
this.rejectReady(error);
});
try {
await this.waitUntilReady();
} catch (error) {
this.terminateHelperProcess();
throw error;
}
}
async stop(): Promise<CursorRecordingData> {
const child = this.process;
this.process = null;
this.clearReadyState();
this.killHelperProcess(child);
this.logDiagnostic("stop", {
sampleCount: this.sampleCount,
assetCount: this.assets.size,
outOfBoundsSampleCount: this.outOfBoundsSampleCount,
});
return {
version: 2,
provider: this.assets.size > 0 ? "native" : "none",
samples: this.samples,
assets: [...this.assets.values()],
};
}
private handleStdoutChunk(chunk: string) {
this.lineBuffer += chunk;
const lines = this.lineBuffer.split(/\r?\n/);
this.lineBuffer = lines.pop() ?? "";
for (const line of lines) {
const trimmedLine = line.trim();
if (!trimmedLine) {
continue;
}
try {
const payload = JSON.parse(trimmedLine) as WindowsCursorEvent;
this.handleEvent(payload);
} catch (error) {
console.error("Failed to parse Windows cursor helper output:", error, trimmedLine);
}
}
}
private handleEvent(payload: WindowsCursorEvent) {
if (payload.type === "error") {
this.logDiagnostic("helper-error", { message: payload.message });
console.error("Windows cursor helper error:", payload.message);
this.failHelper(new Error(payload.message));
return;
}
if (payload.type === "ready") {
this.logDiagnostic("ready", { timestampMs: payload.timestampMs });
this.resolveReady();
return;
}
if (payload.asset?.id && !this.assets.has(payload.asset.id)) {
const assetDisplay = screen.getDisplayNearestPoint({ x: payload.x, y: payload.y });
this.assets.set(payload.asset.id, {
id: payload.asset.id,
platform: "win32",
imageDataUrl: payload.asset.imageDataUrl,
width: payload.asset.width,
height: payload.asset.height,
hotspotX: payload.asset.hotspotX,
hotspotY: payload.asset.hotspotY,
scaleFactor: assetDisplay.scaleFactor,
cursorType: payload.asset.cursorType ?? payload.cursorType ?? null,
});
this.logDiagnostic("asset", {
id: payload.asset.id,
width: payload.asset.width,
height: payload.asset.height,
hotspotX: payload.asset.hotspotX,
hotspotY: payload.asset.hotspotY,
scaleFactor: assetDisplay.scaleFactor,
});
}
const normalized = this.normalizeSample(payload);
this.sampleCount += 1;
if (!normalized.withinBounds) {
this.outOfBoundsSampleCount += 1;
}
this.samples.push(normalized.sample);
if (this.samples.length > this.options.maxSamples) {
this.samples.shift();
}
}
private normalizeSample(
payload: Extract<WindowsCursorEvent, { type: "sample" }>,
): NormalizedSample {
const bounds =
payload.bounds ?? this.options.getDisplayBounds() ?? screen.getPrimaryDisplay().bounds;
const width = Math.max(1, bounds.width);
const height = Math.max(1, bounds.height);
const normalizedX = (payload.x - bounds.x) / width;
const normalizedY = (payload.y - bounds.y) / height;
const withinBounds =
normalizedX >= 0 && normalizedX <= 1 && normalizedY >= 0 && normalizedY <= 1;
const leftButtonDown = payload.leftButtonDown === true;
const leftButtonPressed = payload.leftButtonPressed === true;
const leftButtonReleased = payload.leftButtonReleased === true;
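// Prefer the helper's explicit press/release edge flags. If they are absent,
// infer the edge from the change in button state since the previous sample.
const interactionType =
leftButtonPressed || (leftButtonDown && !this.previousLeftButtonDown)
? "click"
: leftButtonReleased || (!leftButtonDown && this.previousLeftButtonDown)
? "mouseup"
: "move";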
this.previousLeftButtonDown = leftButtonDown;
if (this.sampleCount === 0 || (!withinBounds && this.outOfBoundsSampleCount === 0)) {
this.logDiagnostic("sample", {
rawX: payload.x,
rawY: payload.y,
normalizedX,
normalizedY,
visible: payload.visible,
withinBounds,
bounds,
handle: payload.handle,
});
}
return {
withinBounds,
sample: {
timeMs: Math.max(0, payload.timestampMs - this.startTimeMs),
cx: normalizedX,
cy: normalizedY,
assetId: payload.handle,
visible: payload.visible && withinBounds,
cursorType: payload.cursorType ?? payload.asset?.cursorType ?? null,
interactionType,
},
};
}
private waitUntilReady() {
return new Promise<void>((resolve, reject) => {
this.readyResolve = resolve;
this.readyReject = reject;
this.readyTimer = setTimeout(() => {
this.rejectReady(new Error("Timed out waiting for Windows cursor helper readiness"));
}, READY_TIMEOUT_MS);
});
}
private resolveReady() {
const resolve = this.readyResolve;
this.clearReadyState();
resolve?.();
}
private rejectReady(error: Error) {
const reject = this.readyReject;
this.clearReadyState();
reject?.(error);
}
private failHelper(error: Error) {
this.rejectReady(error);
this.terminateHelperProcess();
}
private terminateHelperProcess() {
const child = this.process;
this.process = null;
this.killHelperProcess(child);
}
private killHelperProcess(child: ChildProcessByStdio<null, Readable, Readable> | null) {
if (child && !child.killed) {
child.kill();
}
}
private clearReadyState() {
if (this.readyTimer) {
clearTimeout(this.readyTimer);
this.readyTimer = null;
}
this.readyResolve = null;
this.readyReject = null;
}
private logDiagnostic(event: string, data: Record<string, unknown>) {
console.info(
"[cursor-native][win32]",
JSON.stringify({
event,
...data,
}),
);
}
}
@@ -0,0 +1,56 @@
import type { Rectangle } from "electron";
import type { NativeCursorType } from "../../../../src/native/contracts";
export interface WindowsCursorSampleEvent {
type: "sample";
timestampMs: number;
x: number;
y: number;
visible: boolean;
handle: string | null;
cursorType?: NativeCursorType | null;
leftButtonDown?: boolean;
leftButtonPressed?: boolean;
leftButtonReleased?: boolean;
bounds?: {
x: number;
y: number;
width: number;
height: number;
} | null;
asset: WindowsCursorAssetPayload | null;
}
export interface WindowsCursorReadyEvent {
type: "ready";
timestampMs: number;
}
export interface WindowsCursorErrorEvent {
type: "error";
timestampMs: number;
message: string;
}
export interface WindowsCursorAssetPayload {
id: string;
imageDataUrl: string;
width: number;
height: number;
hotspotX: number;
hotspotY: number;
cursorType?: NativeCursorType | null;
}
export type WindowsCursorEvent =
| WindowsCursorSampleEvent
| WindowsCursorReadyEvent
| WindowsCursorErrorEvent;
export interface WindowsNativeRecordingSessionOptions {
getDisplayBounds: () => Rectangle | null;
maxSamples: number;
sampleIntervalMs: number;
sourceId?: string | null;
startTimeMs?: number;
}
@@ -0,0 +1,49 @@
import type { CursorCapabilities, CursorRecordingData } from "../../../src/native/contracts";
import type { CursorNativeAdapter, CursorTelemetryLoadResult } from "./adapter";
interface TelemetryCursorAdapterOptions {
loadRecordingData: (videoPath: string) => Promise<CursorRecordingData>;
resolveVideoPath: (videoPath?: string | null) => string | null;
loadTelemetry: (videoPath: string) => Promise<CursorTelemetryLoadResult>;
}
export class TelemetryCursorAdapter implements CursorNativeAdapter {
readonly kind = "none" as const;
constructor(private readonly options: TelemetryCursorAdapterOptions) {}
async getCapabilities(): Promise<CursorCapabilities> {
return {
telemetry: true,
systemAssets: false,
provider: this.kind,
};
}
async getRecordingData(videoPath?: string | null): Promise<CursorRecordingData> {
const resolvedVideoPath = this.options.resolveVideoPath(videoPath);
if (!resolvedVideoPath) {
return {
version: 2,
provider: this.kind,
samples: [],
assets: [],
};
}
return this.options.loadRecordingData(resolvedVideoPath);
}
async getTelemetry(videoPath?: string | null) {
const resolvedVideoPath = this.options.resolveVideoPath(videoPath);
if (!resolvedVideoPath) {
return {
success: false,
message: "No video path is available for cursor telemetry",
samples: [],
} satisfies CursorTelemetryLoadResult;
}
return this.options.loadTelemetry(resolvedVideoPath);
}
}
@@ -0,0 +1,46 @@
import type {
CursorCapabilities,
CursorRecordingData,
CursorTelemetryPoint,
} from "../../../src/native/contracts";
import type { CursorNativeAdapter } from "../cursor/adapter";
import type { NativeBridgeStateStore } from "../store";
interface CursorServiceOptions {
store: NativeBridgeStateStore;
adapter: CursorNativeAdapter;
}
export class CursorService {
constructor(private readonly options: CursorServiceOptions) {}
async getCapabilities(): Promise<CursorCapabilities> {
const capabilities = await this.options.adapter.getCapabilities();
this.options.store.setCursorCapabilities(capabilities);
return capabilities;
}
async getTelemetry(videoPath?: string | null): Promise<CursorTelemetryPoint[]> {
const result = await this.options.adapter.getTelemetry(videoPath);
if (!result.success) {
throw new Error(result.message || result.error || "Failed to load cursor telemetry");
}
const resolvedVideoPath = videoPath ?? this.options.store.getState().project.currentVideoPath;
if (resolvedVideoPath) {
this.options.store.markCursorTelemetryLoaded(resolvedVideoPath, result.samples.length);
}
return result.samples;
}
async getRecordingData(videoPath?: string | null): Promise<CursorRecordingData> {
const data = await this.options.adapter.getRecordingData(videoPath);
const resolvedVideoPath = videoPath ?? this.options.store.getState().project.currentVideoPath;
if (resolvedVideoPath) {
this.options.store.markCursorTelemetryLoaded(resolvedVideoPath, data.samples.length);
}
return data;
}
}
@@ -0,0 +1,80 @@
import type {
ProjectContext,
ProjectFileResult,
ProjectPathResult,
} from "../../../src/native/contracts";
import type { NativeBridgeStateStore } from "../store";
interface ProjectServiceOptions {
store: NativeBridgeStateStore;
getCurrentProjectPath: () => string | null;
getCurrentVideoPath: () => string | null;
saveProjectFile: (
projectData: unknown,
suggestedName?: string,
existingProjectPath?: string,
) => Promise<ProjectFileResult>;
loadProjectFile: () => Promise<ProjectFileResult>;
loadCurrentProjectFile: () => Promise<ProjectFileResult>;
setCurrentVideoPath: (path: string) => ProjectPathResult | Promise<ProjectPathResult>;
getCurrentVideoPathResult: () => ProjectPathResult;
clearCurrentVideoPath: () => ProjectPathResult;
}
export class ProjectService {
constructor(private readonly options: ProjectServiceOptions) {}
getCurrentContext(): ProjectContext {
const context = {
currentProjectPath: this.options.getCurrentProjectPath(),
currentVideoPath: this.options.getCurrentVideoPath(),
};
this.options.store.setProjectContext(context);
return context;
}
async saveProjectFile(
projectData: unknown,
suggestedName?: string,
existingProjectPath?: string,
) {
const result = await this.options.saveProjectFile(
projectData,
suggestedName,
existingProjectPath,
);
this.getCurrentContext();
return result;
}
async loadProjectFile() {
const result = await this.options.loadProjectFile();
this.getCurrentContext();
return result;
}
async loadCurrentProjectFile() {
const result = await this.options.loadCurrentProjectFile();
this.getCurrentContext();
return result;
}
async setCurrentVideoPath(path: string) {
const result = await this.options.setCurrentVideoPath(path);
this.getCurrentContext();
return result;
}
getCurrentVideoPath() {
const result = this.options.getCurrentVideoPathResult();
this.getCurrentContext();
return result;
}
clearCurrentVideoPath() {
const result = this.options.clearCurrentVideoPath();
this.getCurrentContext();
return result;
}
}
@@ -0,0 +1,43 @@
import type {
CursorCapabilities,
NativePlatform,
SystemCapabilities,
} from "../../../src/native/contracts";
import { NATIVE_BRIDGE_VERSION } from "../../../src/native/contracts";
import type { NativeBridgeStateStore } from "../store";
interface SystemServiceOptions {
store: NativeBridgeStateStore;
getPlatform: () => NativePlatform;
getAssetBasePath: () => string | null;
getCursorCapabilities: () => Promise<CursorCapabilities>;
}
export class SystemService {
constructor(private readonly options: SystemServiceOptions) {}
getPlatform() {
return this.options.getPlatform();
}
getAssetBasePath() {
return this.options.getAssetBasePath();
}
async getCapabilities(): Promise<SystemCapabilities> {
const platform = this.getPlatform();
const cursorCapabilities = await this.options.getCursorCapabilities();
const capabilities: SystemCapabilities = {
bridgeVersion: NATIVE_BRIDGE_VERSION,
platform,
cursor: cursorCapabilities,
project: {
currentContext: true,
},
};
this.options.store.setSystemCapabilities(capabilities);
return capabilities;
}
}
@@ -0,0 +1,88 @@
import type {
CursorCapabilities,
NativePlatform,
ProjectContext,
SystemCapabilities,
} from "../../src/native/contracts";
export interface NativeBridgeState {
system: {
platform: NativePlatform;
capabilities: SystemCapabilities | null;
};
project: ProjectContext;
cursor: {
capabilities: CursorCapabilities | null;
lastTelemetryLoad: {
videoPath: string;
sampleCount: number;
loadedAt: number;
} | null;
};
}
export class NativeBridgeStateStore {
private state: NativeBridgeState;
constructor(platform: NativePlatform) {
this.state = {
system: {
platform,
capabilities: null,
},
project: {
currentProjectPath: null,
currentVideoPath: null,
},
cursor: {
capabilities: null,
lastTelemetryLoad: null,
},
};
}
getState() {
return this.state;
}
setProjectContext(project: ProjectContext) {
this.state = {
...this.state,
project,
};
}
setSystemCapabilities(capabilities: SystemCapabilities) {
this.state = {
...this.state,
system: {
...this.state.system,
capabilities,
},
};
}
setCursorCapabilities(capabilities: CursorCapabilities) {
this.state = {
...this.state,
cursor: {
...this.state.cursor,
capabilities,
},
};
}
markCursorTelemetryLoaded(videoPath: string, sampleCount: number) {
this.state = {
...this.state,
cursor: {
...this.state.cursor,
lastTelemetryLoad: {
videoPath,
sampleCount,
loadedAt: Date.now(),
},
},
};
}
}
@@ -0,0 +1,111 @@
# Native capture helpers
## macOS
macOS native recording uses a ScreenCaptureKit helper with the same process boundary as the Windows WGC helper:
1. Electron resolves the selected source, output paths, and user-selected devices.
2. The helper receives one structured JSON request.
3. The helper owns ScreenCaptureKit/AVFoundation capture, timing, encoding, and muxing.
4. Electron persists the resulting media/session manifest and reports helper errors explicitly.
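The request in step 2 can be sketched as follows. This is a minimal illustration, not the app's actual builder: the field names mirror the `RecordingRequest` struct in the Swift helper source, while `buildDisplayRequest`, `toHelperArgument`, and every concrete value (including `schemaVersion` and the cursor `mode`) are assumptions made for the example.

```typescript
// Illustrative only: field names mirror the Swift `RecordingRequest` struct;
// every value below is a placeholder, not a documented default.
interface MacRecordingRequest {
  schemaVersion: number;
  recordingId: number;
  source: {
    type: "display" | "window";
    sourceId: string;
    displayId?: number;
    windowId?: number;
  };
  video: { fps: number; width: number; height: number; hideSystemCursor: boolean };
  audio: {
    system: { enabled: boolean };
    microphone: { enabled: boolean; deviceId?: string; deviceName?: string; gain: number };
  };
  webcam: { enabled: boolean; width: number; height: number; fps: number };
  cursor: { mode: string };
  outputs: { screenPath: string; manifestPath?: string };
}

// Hypothetical helper: builds a display-capture request with audio and webcam disabled.
function buildDisplayRequest(
  sourceId: string,
  displayId: number,
  screenPath: string,
): MacRecordingRequest {
  return {
    schemaVersion: 1, // assumption: the request schema version is not stated in this README
    recordingId: Date.now(),
    source: { type: "display", sourceId, displayId },
    video: { fps: 60, width: 1920, height: 1080, hideSystemCursor: true },
    audio: {
      system: { enabled: false },
      microphone: { enabled: false, gain: 1 },
    },
    webcam: { enabled: false, width: 1280, height: 720, fps: 30 },
    cursor: { mode: "placeholder" }, // assumption: the accepted modes are not listed here
    outputs: { screenPath },
  };
}

// The helper expects the whole request as one JSON command-line argument.
function toHelperArgument(request: MacRecordingRequest): string {
  return JSON.stringify(request);
}
```

Keeping the request a plain serializable object means Electron can log or persist exactly what the helper was asked to do.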
Helper locations:
1. `OPENSCREEN_SCK_CAPTURE_EXE`, for local development and diagnostics.
2. `electron/native/screencapturekit/build/openscreen-screencapturekit-helper`, for locally built Swift output.
3. `electron/native/bin/darwin-arm64/openscreen-screencapturekit-helper` or `electron/native/bin/darwin-x64/openscreen-screencapturekit-helper`, for packaged prebuilt helpers.
The macOS cursor-shape helper is resolved from `OPENSCREEN_MAC_CURSOR_HELPER_EXE` first, then the matching `openscreen-macos-cursor-helper` binary in the same local build and packaged `electron/native/bin/darwin-${arch}` directories.
Build the macOS helper with:
```bash
npm run build:native:mac
```
On non-macOS hosts this command exits successfully and does not affect Windows/Linux development. On macOS it builds the Swift package at `electron/native/screencapturekit`, writes the development binaries to `electron/native/screencapturekit/build`, and copies redistributable binaries to `electron/native/bin/darwin-${arch}`.
The current helper implementation supports display/window ScreenCaptureKit video capture, cursor exclusion through `SCStreamConfiguration.showsCursor`, H.264 encoding, MP4 muxing, and ScreenCaptureKit system audio. It also attempts native ScreenCaptureKit microphone capture when the running macOS version exposes that capability. Webcam recording currently stays as an Electron sidecar and is attached to the same recording session after the native screen capture stops.
Electron exposes `is-native-mac-capture-available` for capability probing. It resolves the same helper locations listed above and reports `missing-helper` until a Swift helper binary is present. When available, macOS recording routes screen/window capture through the native helper so editable cursor recordings do not bake the system cursor into the video. Cursor positions are sampled in Electron; when the cursor helper is available and Accessibility is granted, samples are also tagged with link/text cursor hints such as `pointer`.
See `docs/engineering/macos-native-recorder-roadmap.md` for the contract, rollout phases, and SSOT rules.
## Windows
Windows native recording is resolved from one of these locations:
1. `OPENSCREEN_WGC_CAPTURE_EXE`, for local development and diagnostics.
2. `electron/native/wgc-capture/build/wgc-capture.exe`, for a locally built Ninja helper.
3. `electron/native/wgc-capture/build/Release/wgc-capture.exe`, for a locally built multi-config helper.
4. `electron/native/bin/win32-x64/wgc-capture.exe` or `electron/native/bin/win32-arm64/wgc-capture.exe`, for packaged prebuilt helpers.
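A minimal sketch of this lookup, assuming the locations are tried in the order listed and the first existing file wins (the README does not state the precedence explicitly). `wgcHelperCandidates` and `resolveWgcHelper` are hypothetical names, and the `exists` callback stands in for a real file-system check such as `fs.existsSync`.

```typescript
type WinArch = "x64" | "arm64";

// Hypothetical: lists candidate helper paths, relative to the repository root,
// in the order given in the list above.
function wgcHelperCandidates(
  env: Record<string, string | undefined>,
  arch: WinArch,
): string[] {
  const candidates: string[] = [];
  const override = env.OPENSCREEN_WGC_CAPTURE_EXE;
  if (override) {
    candidates.push(override);
  }
  candidates.push(
    "electron/native/wgc-capture/build/wgc-capture.exe",
    "electron/native/wgc-capture/build/Release/wgc-capture.exe",
    `electron/native/bin/win32-${arch}/wgc-capture.exe`,
  );
  return candidates;
}

// Assumption: the first candidate that exists is used; null means no helper was found.
function resolveWgcHelper(
  env: Record<string, string | undefined>,
  arch: WinArch,
  exists: (candidate: string) => boolean,
): string | null {
  return wgcHelperCandidates(env, arch).find(exists) ?? null;
}
```

Injecting `exists` keeps the function pure, so the precedence can be unit-tested without touching the disk.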
Build the Windows helper with:
```powershell
npm run build:native:win
```
The build writes the CMake output to `electron/native/wgc-capture/build/wgc-capture.exe` and copies the redistributable binary to `electron/native/bin/win32-x64/wgc-capture.exe`.
The helper contract is process-based: the app starts the process with one JSON argument and then sends commands on stdin; `stop\n` finalizes the recording. During the migration, the helper prints both newline-delimited JSON events and the legacy text messages `Recording started` / `Recording stopped. Output path: <path>`.
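Because stdout mixes two formats, a consumer has to classify each line. The sketch below is one hedged way to do that; `drainLines`, `classifyHelperLine`, and the `HelperLine` type are hypothetical names, and the parsing rules (JSON lines start with `{`, legacy messages match the two strings above exactly) are assumptions rather than the app's actual implementation.

```typescript
type HelperLine =
  | { kind: "event"; payload: Record<string, unknown> }
  | { kind: "legacy-started" }
  | { kind: "legacy-stopped"; outputPath: string }
  | { kind: "unknown"; text: string };

const LEGACY_STOPPED_PREFIX = "Recording stopped. Output path: ";

// Splits buffered stdout into complete lines and keeps the trailing partial line.
function drainLines(buffer: string): { lines: string[]; rest: string } {
  const parts = buffer.split(/\r?\n/);
  const rest = parts.pop() ?? "";
  return { lines: parts.filter((line) => line.trim().length > 0), rest };
}

// Classifies one complete stdout line as a JSON event or a legacy text message.
function classifyHelperLine(rawLine: string): HelperLine {
  const line = rawLine.trim();
  if (line.startsWith("{")) {
    try {
      const parsed: unknown = JSON.parse(line);
      if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
        return { kind: "event", payload: parsed as Record<string, unknown> };
      }
    } catch {
      // Not valid JSON: fall through to the legacy checks below.
    }
  }
  if (line === "Recording started") {
    return { kind: "legacy-started" };
  }
  if (line.startsWith(LEGACY_STOPPED_PREFIX)) {
    return { kind: "legacy-stopped", outputPath: line.slice(LEGACY_STOPPED_PREFIX.length) };
  }
  return { kind: "unknown", text: line };
}
```

A caller would append each stdout chunk to its buffer, call `drainLines`, keep `rest` for the next chunk, and write `stop\n` to the helper's stdin when the user ends the recording.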
Current V2 JSON shape:
```json
{
"schemaVersion": 2,
"recordingId": 123,
"sourceType": "display",
"sourceId": "screen:0:0",
"displayId": 1,
"windowHandle": null,
"outputPath": "C:\\path\\recording-123.mp4",
"videoWidth": 1920,
"videoHeight": 1080,
"fps": 60,
"captureSystemAudio": false,
"captureMic": false,
"microphoneDeviceId": "default",
"microphoneDeviceName": "Microphone (NVIDIA Broadcast)",
"microphoneGain": 1.4,
"webcamEnabled": true,
"webcamDeviceId": "default",
"webcamDeviceName": "Camera (NVIDIA Broadcast)",
"webcamWidth": 1280,
"webcamHeight": 720,
"webcamFps": 30,
"outputs": {
"screenPath": "C:\\path\\recording-123.mp4"
}
}
```
The current helper implementation supports display/window video capture, system audio loopback, selected-microphone capture, Media Foundation webcam capture, and a DirectShow webcam fallback for virtual cameras that are not exposed through Media Foundation. Webcam frames are currently composed into the primary MP4 as a bottom-right picture-in-picture overlay. Browser `deviceId` values do not always map to Media Foundation symbolic links or WASAPI endpoint IDs, so the renderer passes both browser IDs and user-visible device names. For microphones, the helper tries the requested WASAPI endpoint ID first, then resolves an active capture endpoint by `microphoneDeviceName`, then falls back to the default endpoint. For webcams, Electron resolves a matching DirectShow filter CLSID for the selected label; the helper uses Media Foundation first, then that exact DirectShow filter when the requested camera is absent from Media Foundation.
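The microphone fallback order described above can be illustrated with a small sketch. The real resolution happens inside the native helper against WASAPI; the `CaptureEndpoint` shape, the `pickMicrophone` name, and the exact-match comparison on the device name are assumptions made only for this example.

```typescript
// Hypothetical, simplified view of an active capture endpoint.
interface CaptureEndpoint {
  id: string;
  name: string;
  isDefault: boolean;
}

// Resolution order: requested endpoint ID, then device name, then the default endpoint.
function pickMicrophone(
  endpoints: CaptureEndpoint[],
  requestedId?: string | null,
  requestedName?: string | null,
): CaptureEndpoint | null {
  if (requestedId) {
    const byId = endpoints.find((endpoint) => endpoint.id === requestedId);
    if (byId) {
      return byId;
    }
  }
  if (requestedName) {
    // Assumption: names are compared exactly; the helper's matching rule may differ.
    const byName = endpoints.find((endpoint) => endpoint.name === requestedName);
    if (byName) {
      return byName;
    }
  }
  return endpoints.find((endpoint) => endpoint.isDefault) ?? null;
}
```

Passing both the ID and the name matters because a browser `deviceId` such as `"default"` usually matches no native endpoint ID, in which case the name lookup is what selects the intended device.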
Smoke-test the helper with:
```powershell
npm run test:wgc-helper:win
npm run test:wgc-window:win
npm run test:wgc-audio:win
npm run test:wgc-mic:win
npm run test:wgc-mixed-audio:win
npm run test:wgc-webcam:win
```
To validate a specific native webcam manually:
```powershell
$env:OPENSCREEN_WGC_TEST_WEBCAM_DEVICE_NAME = "NVIDIA Broadcast"
npm run test:wgc-webcam:win
Remove-Item Env:OPENSCREEN_WGC_TEST_WEBCAM_DEVICE_NAME
```
To validate a specific native microphone manually:
```powershell
$env:OPENSCREEN_WGC_TEST_MICROPHONE_DEVICE_NAME = "Microphone (NVIDIA Broadcast)"
npm run test:wgc-mic:win
Remove-Item Env:OPENSCREEN_WGC_TEST_MICROPHONE_DEVICE_NAME
```
@@ -0,0 +1,30 @@
// swift-tools-version: 5.9
import PackageDescription
let package = Package(
name: "OpenScreenScreenCaptureKitHelper",
platforms: [
.macOS(.v13)
],
products: [
.executable(
name: "openscreen-screencapturekit-helper",
targets: ["OpenScreenScreenCaptureKitHelper"]
),
.executable(
name: "openscreen-macos-cursor-helper",
targets: ["OpenScreenMacOSCursorHelper"]
)
],
targets: [
.executableTarget(
name: "OpenScreenScreenCaptureKitHelper",
path: "Sources/OpenScreenScreenCaptureKitHelper"
),
.executableTarget(
name: "OpenScreenMacOSCursorHelper",
path: "Sources/OpenScreenMacOSCursorHelper"
)
]
)
@@ -0,0 +1,268 @@
import AppKit
import ApplicationServices
import Foundation
struct CursorHelperRequest: Decodable {
let sampleIntervalMs: Int?
}
final class MouseButtonTracker {
private let lock = NSLock()
private var leftDownCount = 0
private var leftUpCount = 0
private var eventTap: CFMachPort?
private var runLoopSource: CFRunLoopSource?
struct Events {
let leftDownCount: Int
let leftUpCount: Int
}
func start() -> Bool {
let mask =
(1 << CGEventType.leftMouseDown.rawValue) |
(1 << CGEventType.leftMouseUp.rawValue)
guard let tap = CGEvent.tapCreate(
tap: .cgSessionEventTap,
place: .headInsertEventTap,
options: .listenOnly,
eventsOfInterest: CGEventMask(mask),
callback: { _, type, event, userInfo in
if let userInfo {
let tracker = Unmanaged<MouseButtonTracker>.fromOpaque(userInfo).takeUnretainedValue()
tracker.record(type)
}
return Unmanaged.passUnretained(event)
},
userInfo: UnsafeMutableRawPointer(Unmanaged.passUnretained(self).toOpaque())
) else {
return false
}
guard let source = CFMachPortCreateRunLoopSource(kCFAllocatorDefault, tap, 0) else {
return false
}
eventTap = tap
runLoopSource = source
CFRunLoopAddSource(CFRunLoopGetCurrent(), source, .commonModes)
CGEvent.tapEnable(tap: tap, enable: true)
return true
}
func pump() {
CFRunLoopRunInMode(.defaultMode, 0.001, false)
}
func consume() -> Events {
lock.lock()
defer { lock.unlock() }
let events = Events(leftDownCount: leftDownCount, leftUpCount: leftUpCount)
leftDownCount = 0
leftUpCount = 0
return events
}
private func record(_ type: CGEventType) {
lock.lock()
defer { lock.unlock() }
if type == .tapDisabledByTimeout || type == .tapDisabledByUserInput {
reenableTap()
return
}
if type == .leftMouseDown {
leftDownCount += 1
} else if type == .leftMouseUp {
leftUpCount += 1
}
}
private func reenableTap() {
if let eventTap {
CGEvent.tapEnable(tap: eventTap, enable: true)
}
}
}
func emit(_ fields: [String: Any?]) {
let compacted = fields.compactMapValues { $0 }
if let data = try? JSONSerialization.data(withJSONObject: compacted, options: []),
let line = String(data: data, encoding: .utf8)
{
print(line)
fflush(stdout)
}
}
func stringAttribute(_ element: AXUIElement, _ attribute: String) -> String? {
var value: CFTypeRef?
let result = AXUIElementCopyAttributeValue(element, attribute as CFString, &value)
guard result == .success else {
return nil
}
return value as? String
}
func parentElement(_ element: AXUIElement) -> AXUIElement? {
var value: CFTypeRef?
let result = AXUIElementCopyAttributeValue(element, kAXParentAttribute as CFString, &value)
guard result == .success else {
return nil
}
guard let value, CFGetTypeID(value) == AXUIElementGetTypeID() else {
return nil
}
return (value as! AXUIElement)
}
func roleDescription(_ element: AXUIElement) -> String? {
var value: CFTypeRef?
let result = AXUIElementCopyAttributeValue(element, kAXRoleDescriptionAttribute as CFString, &value)
guard result == .success else {
return nil
}
return value as? String
}
func actionNames(_ element: AXUIElement) -> [String] {
var value: CFArray?
let result = AXUIElementCopyActionNames(element, &value)
guard result == .success, let value else {
return []
}
return (value as NSArray).compactMap { $0 as? String }
}
func isTextInputRole(_ role: String?) -> Bool {
role == "AXTextField" ||
role == "AXTextArea" ||
role == "AXTextView" ||
role == "AXComboBox"
}
func isPointerRole(_ role: String?, _ subrole: String?, _ description: String?) -> Bool {
if role == "AXLink" ||
subrole?.localizedCaseInsensitiveContains("link") == true ||
description?.contains("link") == true
{
return true
}
return role == "AXButton" ||
role == "AXMenuButton" ||
role == "AXPopUpButton" ||
role == "AXCheckBox" ||
role == "AXRadioButton" ||
role == "AXSwitch" ||
role == "AXDisclosureTriangle" ||
role == "AXTab" ||
role == "AXMenuItem"
}
func cursorTypeForElement(_ element: AXUIElement) -> String? {
var current: AXUIElement? = element
for _ in 0..<5 {
guard let element = current else {
break
}
let role = stringAttribute(element, kAXRoleAttribute)
let subrole = stringAttribute(element, kAXSubroleAttribute)
let description = roleDescription(element)?.lowercased()
if isTextInputRole(role) {
return "text"
}
if isPointerRole(role, subrole, description) {
return "pointer"
}
current = parentElement(element)
}
return nil
}
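// NSEvent.mouseLocation uses a bottom-left origin, while the Accessibility API
// expects top-left screen coordinates, so flip Y against the primary screen height.
func accessibilityPointForMouse() -> CGPoint {
let mouse = NSEvent.mouseLocation
let primaryHeight = NSScreen.screens.first?.frame.height ?? NSScreen.main?.frame.height ?? 0
return CGPoint(x: mouse.x, y: primaryHeight - mouse.y)
}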
func currentCursorType() -> String? {
guard AXIsProcessTrusted() else {
return nil
}
let point = accessibilityPointForMouse()
let systemWide = AXUIElementCreateSystemWide()
var element: AXUIElement?
let result = AXUIElementCopyElementAtPosition(
systemWide,
Float(point.x),
Float(point.y),
&element
)
guard result == .success, let element else {
return "arrow"
}
return cursorTypeForElement(element) ?? "arrow"
}
func timestampMs() -> Int {
Int(Date().timeIntervalSince1970 * 1000)
}
func leftButtonDown() -> Bool {
CGEventSource.buttonState(.hidSystemState, button: .left)
}
func requestAccessibilityTrust() -> Bool {
let options = [
kAXTrustedCheckOptionPrompt.takeUnretainedValue() as String: true
] as CFDictionary
return AXIsProcessTrustedWithOptions(options)
}
let request: CursorHelperRequest
if CommandLine.arguments.count >= 2,
let data = CommandLine.arguments[1].data(using: .utf8),
let decoded = try? JSONDecoder().decode(CursorHelperRequest.self, from: data)
{
request = decoded
} else {
request = CursorHelperRequest(sampleIntervalMs: nil)
}
let intervalMs = max(8, request.sampleIntervalMs ?? 33)
let accessibilityTrusted = requestAccessibilityTrust()
let mouseTracker = MouseButtonTracker()
let mouseTapReady = mouseTracker.start()
emit([
"type": "ready",
"timestampMs": timestampMs(),
"accessibilityTrusted": accessibilityTrusted,
"mouseTapReady": mouseTapReady,
])
while true {
mouseTracker.pump()
let mouseEvents = mouseTracker.consume()
emit([
"type": "sample",
"timestampMs": timestampMs(),
"cursorType": currentCursorType(),
"leftButtonDown": leftButtonDown(),
"leftButtonPressed": mouseEvents.leftDownCount > 0,
"leftButtonReleased": mouseEvents.leftUpCount > 0,
])
Thread.sleep(forTimeInterval: Double(intervalMs) / 1000.0)
}
@@ -0,0 +1,673 @@
import AVFoundation
import CoreGraphics
import CoreMedia
import Foundation
import ScreenCaptureKit
struct Rectangle: Decodable {
let x: Double
let y: Double
let width: Double
let height: Double
}
struct RecordingRequest: Decodable {
struct Source: Decodable {
let type: String
let sourceId: String
let displayId: UInt32?
let windowId: UInt32?
let bounds: Rectangle?
}
struct Video: Decodable {
let fps: Int
let width: Int
let height: Int
let bitrate: Int?
let hideSystemCursor: Bool
}
struct Audio: Decodable {
struct SystemAudio: Decodable {
let enabled: Bool
}
struct Microphone: Decodable {
let enabled: Bool
let deviceId: String?
let deviceName: String?
let gain: Double
}
let system: SystemAudio
let microphone: Microphone
}
struct Webcam: Decodable {
let enabled: Bool
let deviceId: String?
let deviceName: String?
let width: Int
let height: Int
let fps: Int
}
struct Cursor: Decodable {
let mode: String
}
struct Outputs: Decodable {
let screenPath: String
let manifestPath: String?
}
let schemaVersion: Int?
let recordingId: Int?
let source: Source
let video: Video
let audio: Audio
let webcam: Webcam
let cursor: Cursor
let outputs: Outputs
}
enum HelperError: Error, CustomStringConvertible {
case invalidArguments
case unsupportedMacOS
case unsupportedFeature(String)
case sourceNotFound(String)
case invalidSourceType(String)
case permissionDenied(String)
case writerSetupFailed(String)
var description: String {
switch self {
case .invalidArguments:
return "Expected one JSON recording request argument."
case .unsupportedMacOS:
return "ScreenCaptureKit recording requires macOS 13 or newer."
case .unsupportedFeature(let message):
return message
case .sourceNotFound(let message):
return message
case .invalidSourceType(let sourceType):
return "Unsupported source type: \(sourceType)."
case .permissionDenied(let message):
return message
case .writerSetupFailed(let message):
return message
}
}
}
func emit(_ fields: [String: Any]) {
if let data = try? JSONSerialization.data(withJSONObject: fields, options: []),
let line = String(data: data, encoding: .utf8)
{
print(line)
fflush(stdout)
}
}
func emitError(code: String, message: String) {
emit([
"event": "error",
"code": code,
"message": message,
])
}
@available(macOS 13.0, *)
final class ScreenCaptureRecorder: NSObject, SCStreamOutput, SCStreamDelegate {
private struct CaptureTarget {
let filter: SCContentFilter
let width: Int
let height: Int
}
private let request: RecordingRequest
private let sampleQueue = DispatchQueue(label: "app.openscreen.sck-helper.samples")
private let stateQueue = DispatchQueue(label: "app.openscreen.sck-helper.state")
private var stream: SCStream?
private var writer: AVAssetWriter?
private var videoInput: AVAssetWriterInput?
private var systemAudioInput: AVAssetWriterInput?
private var microphoneAudioInput: AVAssetWriterInput?
private var didStartWriting = false
private var didEmitRecordingStarted = false
private var isStopping = false
private var isPaused = false
private var pauseStartedAt: CMTime?
private var totalPausedDuration = CMTime.zero
private var nativeMicrophoneEnabled = false
private var outputWidth = 1920
private var outputHeight = 1080
private let microphoneOutputTypeRawValue = 2
private let hostClock = CMClockGetHostTimeClock()
init(request: RecordingRequest) {
self.request = request
}
func start() async throws {
try ensureRequestedPermissions()
let content = try await SCShareableContent.excludingDesktopWindows(
false,
onScreenWindowsOnly: true
)
let target = try makeCaptureTarget(from: content)
outputWidth = target.width
outputHeight = target.height
let configuration = makeStreamConfiguration()
let stream = SCStream(filter: target.filter, configuration: configuration, delegate: self)
try stream.addStreamOutput(self, type: .screen, sampleHandlerQueue: sampleQueue)
if request.audio.system.enabled {
try stream.addStreamOutput(self, type: .audio, sampleHandlerQueue: sampleQueue)
}
if nativeMicrophoneEnabled {
guard let microphoneOutputType = SCStreamOutputType(rawValue: microphoneOutputTypeRawValue) else {
throw HelperError.unsupportedFeature(
"Native microphone capture requires a macOS version with ScreenCaptureKit microphone output."
)
}
try stream.addStreamOutput(self, type: microphoneOutputType, sampleHandlerQueue: sampleQueue)
}
try setupWriter()
self.stream = stream
emit(["event": "ready", "schemaVersion": 1])
try await stream.startCapture()
}
func stop() async {
let shouldStop = stateQueue.sync {
if isStopping {
return false
}
isStopping = true
return true
}
if !shouldStop {
return
}
do {
try await stream?.stopCapture()
} catch {
emit([
"event": "warning",
"code": "stop-capture-failed",
"message": "\(error)",
])
}
await finishWriter()
}
func pause() {
let didPause = stateQueue.sync {
if isStopping || isPaused {
return false
}
isPaused = true
pauseStartedAt = CMClockGetTime(hostClock)
return true
}
if didPause {
emit([
"event": "recording-paused",
"timestampMs": Int(Date().timeIntervalSince1970 * 1000),
])
}
}
func resume() {
let didResume = stateQueue.sync {
if isStopping || !isPaused {
return false
}
if let pauseStartedAt {
let now = CMClockGetTime(hostClock)
totalPausedDuration = CMTimeAdd(
totalPausedDuration,
CMTimeSubtract(now, pauseStartedAt)
)
}
isPaused = false
pauseStartedAt = nil
return true
}
if didResume {
emit([
"event": "recording-resumed",
"timestampMs": Int(Date().timeIntervalSince1970 * 1000),
])
}
}
func stream(_ stream: SCStream, didStopWithError error: Error) {
emitError(code: "capture-stopped-with-error", message: "\(error)")
Task {
await stop()
}
}
func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
guard CMSampleBufferDataIsReady(sampleBuffer) else {
return
}
let pauseState = currentPauseState()
if pauseState.paused {
return
}
guard let sampleBuffer = retimedSampleBuffer(sampleBuffer, subtracting: pauseState.offset) else {
return
}
if type == .audio {
appendAudioSampleBuffer(sampleBuffer, to: systemAudioInput)
return
}
if type.rawValue == microphoneOutputTypeRawValue {
appendAudioSampleBuffer(sampleBuffer, to: microphoneAudioInput)
return
}
guard type == .screen else {
return
}
guard isCompleteFrame(sampleBuffer) else {
return
}
guard let videoInput, let writer else {
return
}
let presentationTime = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
if !didStartWriting {
writer.startWriting()
writer.startSession(atSourceTime: presentationTime)
didStartWriting = true
}
if videoInput.isReadyForMoreMediaData {
if videoInput.append(sampleBuffer), !didEmitRecordingStarted {
didEmitRecordingStarted = true
emit([
"event": "recording-started",
"timestampMs": Int(Date().timeIntervalSince1970 * 1000),
"width": outputWidth,
"height": outputHeight,
])
}
}
}
private func ensureRequestedPermissions() throws {
if !CGPreflightScreenCaptureAccess() {
let granted = CGRequestScreenCaptureAccess()
if !granted {
throw HelperError.permissionDenied("Screen recording permission is required for ScreenCaptureKit capture.")
}
}
if request.audio.microphone.enabled {
switch AVCaptureDevice.authorizationStatus(for: .audio) {
case .authorized:
break
case .notDetermined:
let semaphore = DispatchSemaphore(value: 0)
AVCaptureDevice.requestAccess(for: .audio) { _ in
semaphore.signal()
}
let waitResult = semaphore.wait(timeout: .now() + 30)
if waitResult == .timedOut || AVCaptureDevice.authorizationStatus(for: .audio) != .authorized {
throw HelperError.permissionDenied("Microphone permission is required for native microphone capture.")
}
default:
throw HelperError.permissionDenied("Microphone permission is required for native microphone capture.")
}
}
}
private func makeCaptureTarget(from content: SCShareableContent) throws -> CaptureTarget {
switch request.source.type {
case "display":
guard let displayId = request.source.displayId else {
throw HelperError.sourceNotFound("Display capture requires source.displayId.")
}
guard let display = content.displays.first(where: { $0.displayID == displayId }) else {
throw HelperError.sourceNotFound("No ScreenCaptureKit display found for id \(displayId).")
}
let width = Int(CGDisplayPixelsWide(display.displayID))
let height = Int(CGDisplayPixelsHigh(display.displayID))
return CaptureTarget(
filter: SCContentFilter(display: display, excludingWindows: []),
width: clampCaptureDimension(width, fallback: request.video.width),
height: clampCaptureDimension(height, fallback: request.video.height)
)
case "window":
guard let windowId = request.source.windowId else {
throw HelperError.sourceNotFound("Window capture requires source.windowId.")
}
guard let window = content.windows.first(where: { $0.windowID == windowId }) else {
throw HelperError.sourceNotFound("No ScreenCaptureKit window found for id \(windowId).")
}
let candidateDisplay = content.displays.first {
$0.frame.intersects(window.frame) || $0.frame.contains(CGPoint(x: window.frame.midX, y: window.frame.midY))
}
let scaleFactor = Self.scaleFactor(for: candidateDisplay?.displayID ?? CGMainDisplayID())
let width = Int(window.frame.width) * scaleFactor
let height = Int(window.frame.height) * scaleFactor
return CaptureTarget(
filter: SCContentFilter(desktopIndependentWindow: window),
width: clampCaptureDimension(width, fallback: request.video.width),
height: clampCaptureDimension(height, fallback: request.video.height)
)
default:
throw HelperError.invalidSourceType(request.source.type)
}
}
private func makeStreamConfiguration() -> SCStreamConfiguration {
let configuration = SCStreamConfiguration()
configuration.width = outputWidth
configuration.height = outputHeight
configuration.minimumFrameInterval = CMTime(value: 1, timescale: CMTimeScale(max(1, request.video.fps)))
configuration.queueDepth = 6
configuration.showsCursor = !request.video.hideSystemCursor
configuration.pixelFormat = kCVPixelFormatType_32BGRA
configuration.sampleRate = 48_000
configuration.channelCount = 2
configuration.excludesCurrentProcessAudio = true
configuration.capturesAudio = request.audio.system.enabled
if request.audio.microphone.enabled {
guard supportsNativeMicrophoneCapture(streamConfig: configuration) else {
nativeMicrophoneEnabled = false
emit([
"event": "warning",
"code": "microphone-unavailable",
"message": "Native microphone capture requires ScreenCaptureKit microphone support on this macOS version.",
])
return configuration
}
nativeMicrophoneEnabled = true
configuration.capturesAudio = true
configuration.setValue(true, forKey: "captureMicrophone")
if let deviceId = resolveMicrophoneCaptureDeviceID() {
configuration.setValue(deviceId, forKey: "microphoneCaptureDeviceID")
}
} else {
nativeMicrophoneEnabled = false
}
return configuration
}
private func setupWriter() throws {
let outputUrl = URL(fileURLWithPath: request.outputs.screenPath)
try? FileManager.default.removeItem(at: outputUrl)
try FileManager.default.createDirectory(
at: outputUrl.deletingLastPathComponent(),
withIntermediateDirectories: true
)
let writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
let settings: [String: Any] = [
AVVideoCodecKey: AVVideoCodecType.h264,
AVVideoWidthKey: outputWidth,
AVVideoHeightKey: outputHeight,
AVVideoCompressionPropertiesKey: [
AVVideoAverageBitRateKey: request.video.bitrate ?? 18_000_000,
AVVideoExpectedSourceFrameRateKey: request.video.fps,
],
]
let input = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
input.expectsMediaDataInRealTime = true
guard writer.canAdd(input) else {
throw HelperError.writerSetupFailed("Unable to add H.264 video input to AVAssetWriter.")
}
writer.add(input)
self.writer = writer
self.videoInput = input
if request.audio.system.enabled {
systemAudioInput = try addAudioInput(to: writer, bitRate: 192_000)
}
if nativeMicrophoneEnabled {
microphoneAudioInput = try addAudioInput(to: writer, bitRate: 128_000)
}
}
private func finishWriter() async {
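guard let writer else {
return
}
// markAsFinished() and finishWriting raise an exception unless the writer is
// in the .writing state, e.g. when stop arrives before the first frame.
guard writer.status == .writing else {
emitError(
code: "writer-failed",
message: writer.error.map { "\($0)" } ?? "AVAssetWriter was not writing (status \(writer.status.rawValue))."
)
return
}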
videoInput?.markAsFinished()
systemAudioInput?.markAsFinished()
microphoneAudioInput?.markAsFinished()
await withCheckedContinuation { continuation in
writer.finishWriting {
continuation.resume()
}
}
if writer.status == .completed {
emit([
"event": "recording-stopped",
"screenPath": request.outputs.screenPath,
])
} else {
emitError(
code: "writer-failed",
message: writer.error.map { "\($0)" } ?? "AVAssetWriter failed with status \(writer.status.rawValue)."
)
}
}
private func addAudioInput(to writer: AVAssetWriter, bitRate: Int) throws -> AVAssetWriterInput {
let settings: [String: Any] = [
AVFormatIDKey: kAudioFormatMPEG4AAC,
AVSampleRateKey: 48_000,
AVNumberOfChannelsKey: 2,
AVEncoderBitRateKey: bitRate,
]
let input = AVAssetWriterInput(mediaType: .audio, outputSettings: settings)
input.expectsMediaDataInRealTime = true
guard writer.canAdd(input) else {
throw HelperError.writerSetupFailed("Unable to add AAC audio input to AVAssetWriter.")
}
writer.add(input)
return input
}
private func appendAudioSampleBuffer(_ sampleBuffer: CMSampleBuffer, to input: AVAssetWriterInput?) {
guard didStartWriting else {
return
}
guard let input, input.isReadyForMoreMediaData else {
return
}
input.append(sampleBuffer)
}
private func currentPauseState() -> (paused: Bool, offset: CMTime) {
stateQueue.sync {
(isPaused, totalPausedDuration)
}
}
private func retimedSampleBuffer(_ sampleBuffer: CMSampleBuffer, subtracting offset: CMTime) -> CMSampleBuffer? {
if !offset.isValid || offset == .zero {
return sampleBuffer
}
let sampleCount = CMSampleBufferGetNumSamples(sampleBuffer)
if sampleCount <= 0 {
return sampleBuffer
}
// A buffer may carry a single timing entry shared by all of its samples
// (typical for audio), so record how many entries were actually filled and
// pass that count, not sampleCount, to the retimed copy.
var timing = Array(repeating: CMSampleTimingInfo(), count: sampleCount)
var timingCount: CMItemCount = 0
let timingStatus = CMSampleBufferGetSampleTimingInfoArray(
sampleBuffer,
entryCount: sampleCount,
arrayToFill: &timing,
entriesNeededOut: &timingCount
)
if timingStatus != noErr {
emit([
"event": "warning",
"code": "sample-retime-failed",
"message": "Unable to read sample timing info: \(timingStatus).",
])
return sampleBuffer
}
guard timingCount > 0, timingCount <= timing.count else {
return sampleBuffer
}
// Drop the unfilled placeholder entries.
timing.removeSubrange(timingCount...)
for index in timing.indices {
if timing[index].presentationTimeStamp.isValid {
timing[index].presentationTimeStamp = CMTimeSubtract(
timing[index].presentationTimeStamp,
offset
)
}
if timing[index].decodeTimeStamp.isValid {
timing[index].decodeTimeStamp = CMTimeSubtract(timing[index].decodeTimeStamp, offset)
}
}
var retimedBuffer: CMSampleBuffer?
let copyStatus = CMSampleBufferCreateCopyWithNewTiming(
allocator: kCFAllocatorDefault,
sampleBuffer: sampleBuffer,
sampleTimingEntryCount: timing.count,
sampleTimingArray: &timing,
sampleBufferOut: &retimedBuffer
)
if copyStatus != noErr {
emit([
"event": "warning",
"code": "sample-retime-failed",
"message": "Unable to copy sample timing info: \(copyStatus).",
])
return sampleBuffer
}
return retimedBuffer
}
private func isCompleteFrame(_ sampleBuffer: CMSampleBuffer) -> Bool {
guard let attachments = CMSampleBufferGetSampleAttachmentsArray(
sampleBuffer,
createIfNecessary: false
) as? [[SCStreamFrameInfo: Any]],
let attachment = attachments.first,
let statusRawValue = attachment[SCStreamFrameInfo.status] as? Int,
let status = SCFrameStatus(rawValue: statusRawValue)
else {
return true
}
return status == .complete
}
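// Caps a measured dimension at the requested size (`fallback`), falls back to
// the requested size when the measurement is not positive, and rounds down to
// an even value of at least 2.
private func clampCaptureDimension(_ value: Int, fallback: Int) -> Int {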
let requested = max(2, fallback)
let candidate = value > 0 ? value : requested
let clamped = min(candidate, requested)
return max(2, clamped - (clamped % 2))
}
private static func scaleFactor(for displayId: CGDirectDisplayID) -> Int {
guard let mode = CGDisplayCopyDisplayMode(displayId) else {
return 1
}
return max(1, mode.pixelWidth / max(1, mode.width))
}
private func supportsNativeMicrophoneCapture(streamConfig: SCStreamConfiguration) -> Bool {
streamConfig.responds(to: Selector(("setCaptureMicrophone:"))) &&
streamConfig.responds(to: Selector(("setMicrophoneCaptureDeviceID:"))) &&
SCStreamOutputType(rawValue: microphoneOutputTypeRawValue) != nil
}
private func resolveMicrophoneCaptureDeviceID() -> String? {
let devices = AVCaptureDevice.devices(for: .audio)
if let deviceName = request.audio.microphone.deviceName?.trimmingCharacters(in: .whitespacesAndNewlines),
!deviceName.isEmpty,
let device = devices.first(where: { $0.localizedName == deviceName })
{
return device.uniqueID
}
if let deviceId = request.audio.microphone.deviceId?.trimmingCharacters(in: .whitespacesAndNewlines),
!deviceId.isEmpty,
devices.contains(where: { $0.uniqueID == deviceId })
{
return deviceId
}
return nil
}
}
@main
struct OpenScreenScreenCaptureKitHelper {
static func main() async {
do {
guard CommandLine.arguments.count == 2 else {
throw HelperError.invalidArguments
}
guard #available(macOS 13.0, *) else {
throw HelperError.unsupportedMacOS
}
let requestData = Data(CommandLine.arguments[1].utf8)
let decoder = JSONDecoder()
let request = try decoder.decode(RecordingRequest.self, from: requestData)
let recorder = ScreenCaptureRecorder(request: request)
let stopTask = Task.detached {
while let line = readLine() {
let command = line.trimmingCharacters(in: .whitespacesAndNewlines)
switch command {
case "pause":
recorder.pause()
case "resume":
recorder.resume()
case "stop":
await recorder.stop()
exit(0)
default:
break
}
}
}
try await recorder.start()
await stopTask.value
} catch let error as HelperError {
emitError(code: "helper-error", message: error.description)
exit(1)
} catch {
emitError(code: "helper-error", message: "\(error)")
exit(1)
}
}
}
@@ -0,0 +1,67 @@
cmake_minimum_required(VERSION 3.20)
# The local Windows SDK image used by some contributors can miss gdi32.lib,
# while CMake's default MSVC console template links it unconditionally. This
# helper does not use GDI, so keep the standard library set minimal and explicit.
set(CMAKE_CXX_STANDARD_LIBRARIES
"kernel32.lib user32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib"
CACHE STRING "" FORCE)
project(openscreen-wgc-capture LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF)
add_executable(wgc-capture
src/audio_sample_utils.cpp
src/audio_sample_utils.h
src/dshow_webcam_capture.cpp
src/dshow_webcam_capture.h
src/main.cpp
src/mf_encoder.cpp
src/mf_encoder.h
src/monitor_utils.cpp
src/monitor_utils.h
src/wasapi_loopback_capture.cpp
src/wasapi_loopback_capture.h
src/webcam_capture.cpp
src/webcam_capture.h
src/wgc_session.cpp
src/wgc_session.h
)
target_compile_definitions(wgc-capture PRIVATE
NOMINMAX
WIN32_LEAN_AND_MEAN
_WIN32_WINNT=0x0A00
)
target_compile_options(wgc-capture PRIVATE /EHsc /W4 /utf-8)
target_link_libraries(wgc-capture PRIVATE
d3d11
dxgi
mf
mfplat
mfreadwrite
mfuuid
runtimeobject
windowsapp
)
add_executable(cursor-sampler
src/cursor-sampler.cpp
)
target_compile_definitions(cursor-sampler PRIVATE
NOMINMAX
_WIN32_WINNT=0x0A00
)
target_compile_options(cursor-sampler PRIVATE /EHsc /W4 /utf-8)
target_link_libraries(cursor-sampler PRIVATE
gdi32
gdiplus
)
@@ -0,0 +1,439 @@
#include "audio_sample_utils.h"
#include <mfapi.h>
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstring>
#include <limits>
namespace {
bool isFloatFormat(const AudioInputFormat& format) {
return format.subtype == MFAudioFormat_Float && format.bitsPerSample == 32;
}
bool isPcmFormat(const AudioInputFormat& format, UINT32 bitsPerSample) {
return format.subtype == MFAudioFormat_PCM && format.bitsPerSample == bitsPerSample;
}
template <typename T>
T clampTo(double value) {
const double minValue = static_cast<double>(std::numeric_limits<T>::min());
const double maxValue = static_cast<double>(std::numeric_limits<T>::max());
return static_cast<T>(std::clamp(std::round(value), minValue, maxValue));
}
size_t bytesPerSample(const AudioInputFormat& format) {
return format.bitsPerSample / 8;
}
double readSampleAsDouble(const BYTE* source, const AudioInputFormat& format, size_t frameIndex, UINT32 channelIndex) {
if (!source || format.blockAlign == 0 || channelIndex >= format.channels) {
return 0.0;
}
const size_t offset = frameIndex * format.blockAlign + channelIndex * bytesPerSample(format);
if (isFloatFormat(format)) {
return static_cast<double>(*reinterpret_cast<const float*>(source + offset));
}
if (isPcmFormat(format, 16)) {
return static_cast<double>(*reinterpret_cast<const int16_t*>(source + offset)) / 32768.0;
}
if (isPcmFormat(format, 32)) {
return static_cast<double>(*reinterpret_cast<const int32_t*>(source + offset)) / 2147483648.0;
}
return 0.0;
}
void writeSampleFromDouble(BYTE* destination, const AudioInputFormat& format, size_t frameIndex, UINT32 channelIndex, double value) {
if (!destination || format.blockAlign == 0 || channelIndex >= format.channels) {
return;
}
const double clamped = std::clamp(value, -1.0, 1.0);
const size_t offset = frameIndex * format.blockAlign + channelIndex * bytesPerSample(format);
if (isFloatFormat(format)) {
*reinterpret_cast<float*>(destination + offset) = static_cast<float>(clamped);
return;
}
if (isPcmFormat(format, 16)) {
*reinterpret_cast<int16_t*>(destination + offset) = clampTo<int16_t>(clamped * 32767.0);
return;
}
if (isPcmFormat(format, 32)) {
*reinterpret_cast<int32_t*>(destination + offset) = clampTo<int32_t>(clamped * 2147483647.0);
}
}
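// Reads one target-channel sample from a source frame. Matching layouts pass
// through, a mono source feeds every target channel, a multichannel source is
// averaged down for a mono target, and any other layout reuses the source
// channel with the same index (clamped to the last source channel).
double readMappedChannel(const BYTE* source, const AudioInputFormat& format, size_t frameIndex, UINT32 targetChannel, UINT32 targetChannels) {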
if (format.channels == 0) {
return 0.0;
}
if (format.channels == targetChannels && targetChannel < format.channels) {
return readSampleAsDouble(source, format, frameIndex, targetChannel);
}
if (format.channels == 1) {
return readSampleAsDouble(source, format, frameIndex, 0);
}
if (targetChannels == 1) {
double sum = 0.0;
for (UINT32 channel = 0; channel < format.channels; ++channel) {
sum += readSampleAsDouble(source, format, frameIndex, channel);
}
return sum / static_cast<double>(format.channels);
}
return readSampleAsDouble(source, format, frameIndex, std::min(targetChannel, format.channels - 1));
}
} // namespace
constexpr int64_t HnsPerSecond = 10'000'000;
bool sameAudioFormatForMixing(const AudioInputFormat& left, const AudioInputFormat& right) {
return left.subtype == right.subtype &&
left.sampleRate == right.sampleRate &&
left.channels == right.channels &&
left.bitsPerSample == right.bitsPerSample &&
left.blockAlign == right.blockAlign &&
left.avgBytesPerSec == right.avgBytesPerSec;
}
AudioInputFormat makeAacCompatibleAudioFormat(const AudioInputFormat& source) {
AudioInputFormat format{};
format.subtype = MFAudioFormat_PCM;
format.sampleRate = source.sampleRate > 0 ? source.sampleRate : 48000;
format.channels = 2;
format.bitsPerSample = 16;
format.blockAlign = format.channels * (format.bitsPerSample / 8);
format.avgBytesPerSec = format.sampleRate * format.blockAlign;
return format;
}
void copyAudioWithGain(
const BYTE* source,
DWORD byteCount,
const AudioInputFormat& format,
double gain,
std::vector<BYTE>& destination) {
destination.resize(byteCount);
if (!source || byteCount == 0) {
std::fill(destination.begin(), destination.end(), static_cast<BYTE>(0));
return;
}
if (std::abs(gain - 1.0) < 0.0001) {
std::memcpy(destination.data(), source, byteCount);
return;
}
if (isFloatFormat(format)) {
const auto* input = reinterpret_cast<const float*>(source);
auto* output = reinterpret_cast<float*>(destination.data());
const size_t sampleCount = byteCount / sizeof(float);
for (size_t index = 0; index < sampleCount; index += 1) {
output[index] = static_cast<float>(std::clamp(input[index] * gain, -1.0, 1.0));
}
return;
}
if (isPcmFormat(format, 16)) {
const auto* input = reinterpret_cast<const int16_t*>(source);
auto* output = reinterpret_cast<int16_t*>(destination.data());
const size_t sampleCount = byteCount / sizeof(int16_t);
for (size_t index = 0; index < sampleCount; index += 1) {
output[index] = clampTo<int16_t>(static_cast<double>(input[index]) * gain);
}
return;
}
if (isPcmFormat(format, 32)) {
const auto* input = reinterpret_cast<const int32_t*>(source);
auto* output = reinterpret_cast<int32_t*>(destination.data());
const size_t sampleCount = byteCount / sizeof(int32_t);
for (size_t index = 0; index < sampleCount; index += 1) {
output[index] = clampTo<int32_t>(static_cast<double>(input[index]) * gain);
}
return;
}
std::memcpy(destination.data(), source, byteCount);
}
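// Converts `source` to `targetFormat` and applies `gain`. Identical formats take
// the copyAudioWithGain fast path; otherwise each target frame is taken from
// the nearest source frame (no interpolation or filtering) and channels are
// remapped by readMappedChannel.
void convertAudioWithGain(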
const BYTE* source,
DWORD byteCount,
const AudioInputFormat& sourceFormat,
const AudioInputFormat& targetFormat,
double gain,
std::vector<BYTE>& destination) {
if (!source || byteCount == 0 || sourceFormat.blockAlign == 0 || targetFormat.blockAlign == 0 ||
sourceFormat.sampleRate == 0 || targetFormat.sampleRate == 0 || sourceFormat.channels == 0 ||
targetFormat.channels == 0) {
destination.clear();
return;
}
if (sameAudioFormatForMixing(sourceFormat, targetFormat)) {
copyAudioWithGain(source, byteCount, targetFormat, gain, destination);
return;
}
const size_t sourceFrames = byteCount / sourceFormat.blockAlign;
if (sourceFrames == 0) {
destination.clear();
return;
}
const double rateRatio = static_cast<double>(targetFormat.sampleRate) /
static_cast<double>(sourceFormat.sampleRate);
const size_t targetFrames = std::max<size_t>(1, static_cast<size_t>(std::llround(sourceFrames * rateRatio)));
destination.assign(targetFrames * targetFormat.blockAlign, 0);
for (size_t targetFrame = 0; targetFrame < targetFrames; ++targetFrame) {
const double sourcePosition = static_cast<double>(targetFrame) / rateRatio;
const size_t sourceFrame = std::min(
sourceFrames - 1,
static_cast<size_t>(std::llround(sourcePosition)));
for (UINT32 channel = 0; channel < targetFormat.channels; ++channel) {
const double sample = readMappedChannel(
source,
sourceFormat,
sourceFrame,
channel,
targetFormat.channels);
writeSampleFromDouble(destination.data(), targetFormat, targetFrame, channel, sample * gain);
}
}
}
void mixAudioInPlace(
std::vector<BYTE>& destination,
const BYTE* source,
DWORD byteCount,
const AudioInputFormat& format) {
if (!source || byteCount == 0 || destination.empty()) {
return;
}
const size_t mixByteCount = std::min(destination.size(), static_cast<size_t>(byteCount));
if (isFloatFormat(format)) {
auto* output = reinterpret_cast<float*>(destination.data());
const auto* input = reinterpret_cast<const float*>(source);
const size_t sampleCount = mixByteCount / sizeof(float);
for (size_t index = 0; index < sampleCount; index += 1) {
output[index] = static_cast<float>(std::clamp(output[index] + input[index], -1.0f, 1.0f));
}
return;
}
if (isPcmFormat(format, 16)) {
auto* output = reinterpret_cast<int16_t*>(destination.data());
const auto* input = reinterpret_cast<const int16_t*>(source);
const size_t sampleCount = mixByteCount / sizeof(int16_t);
for (size_t index = 0; index < sampleCount; index += 1) {
output[index] = clampTo<int16_t>(
static_cast<double>(output[index]) + static_cast<double>(input[index]));
}
return;
}
if (isPcmFormat(format, 32)) {
auto* output = reinterpret_cast<int32_t*>(destination.data());
const auto* input = reinterpret_cast<const int32_t*>(source);
const size_t sampleCount = mixByteCount / sizeof(int32_t);
for (size_t index = 0; index < sampleCount; index += 1) {
output[index] = clampTo<int32_t>(
static_cast<double>(output[index]) + static_cast<double>(input[index]));
}
}
}
AudioMixer::AudioMixer(
const AudioInputFormat& format,
const AudioInputFormat& systemFormat,
const AudioInputFormat& microphoneFormat,
bool includeSystem,
bool includeMicrophone,
double microphoneGain,
OutputCallback output)
: format_(format),
systemFormat_(systemFormat),
microphoneFormat_(microphoneFormat),
includeSystem_(includeSystem),
includeMicrophone_(includeMicrophone),
microphoneGain_(microphoneGain),
output_(std::move(output)) {}
AudioMixer::~AudioMixer() {
stop();
}
bool AudioMixer::start() {
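// Assigning to a joinable std::thread calls std::terminate, so refuse to start twice.
if (thread_.joinable() || !output_ || format_.sampleRate == 0 || format_.blockAlign == 0) {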
return false;
}
stopRequested_ = false;
emittedFrames_ = 0;
timelineStarted_ = false;
paused_ = false;
thread_ = std::thread([this] {
mixLoop();
});
return true;
}
void AudioMixer::beginTimeline() {
{
std::scoped_lock lock(mutex_);
systemQueue_.clear();
microphoneQueue_.clear();
emittedFrames_ = 0;
timelineStarted_ = true;
}
cv_.notify_all();
}
void AudioMixer::setPaused(bool paused) {
{
std::scoped_lock lock(mutex_);
paused_ = paused;
if (paused_) {
systemQueue_.clear();
microphoneQueue_.clear();
}
}
cv_.notify_all();
}
void AudioMixer::stop() {
stopRequested_ = true;
cv_.notify_all();
if (thread_.joinable()) {
thread_.join();
}
}
void AudioMixer::pushSystem(const BYTE* data, DWORD byteCount) {
if (!includeSystem_ || stopRequested_) {
return;
}
{
std::scoped_lock lock(mutex_);
if (paused_) {
return;
}
append(systemQueue_, data, byteCount, systemFormat_, 1.0);
}
cv_.notify_all();
}
void AudioMixer::pushMicrophone(const BYTE* data, DWORD byteCount) {
if (!includeMicrophone_ || stopRequested_) {
return;
}
{
std::scoped_lock lock(mutex_);
if (paused_) {
return;
}
append(microphoneQueue_, data, byteCount, microphoneFormat_, microphoneGain_);
}
cv_.notify_all();
}
void AudioMixer::append(
std::vector<BYTE>& queue,
const BYTE* data,
DWORD byteCount,
const AudioInputFormat& sourceFormat,
double gain) {
if (!data || byteCount == 0) {
return;
}
convertAudioWithGain(data, byteCount, sourceFormat, format_, gain, gainBuffer_);
queue.insert(queue.end(), gainBuffer_.begin(), gainBuffer_.end());
}
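// Moves up to `byteCount` bytes from the front of `queue` into `chunk`, which is
// always sized to `byteCount` and zero-padded when the queue runs short.
// Returns false only when the queue was empty.
bool AudioMixer::pop(std::vector<BYTE>& queue, std::vector<BYTE>& chunk, size_t byteCount) {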
if (queue.empty()) {
chunk.assign(byteCount, 0);
return false;
}
chunk.assign(byteCount, 0);
const size_t copiedBytes = std::min(byteCount, queue.size());
std::memcpy(chunk.data(), queue.data(), copiedBytes);
queue.erase(queue.begin(), queue.begin() + static_cast<std::ptrdiff_t>(copiedBytes));
return copiedBytes > 0;
}
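// Mixer thread body. Once the timeline has started and the mixer is not paused,
// each iteration pops one 10 ms chunk (sampleRate / 100 frames) from every
// enabled queue, mixes them, and passes the result to output_ with a timestamp
// derived from the number of frames emitted so far. The loop then sleeps until
// the wall-clock deadline for those frames.
void AudioMixer::mixLoop() {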
const uint32_t chunkFrames = std::max<uint32_t>(1, format_.sampleRate / 100);
const size_t chunkBytes = static_cast<size_t>(chunkFrames) * format_.blockAlign;
std::vector<BYTE> mixedChunk;
std::vector<BYTE> sourceChunk;
std::chrono::steady_clock::time_point audioClockStart;
bool audioClockStarted = false;
while (true) {
{
std::unique_lock lock(mutex_);
cv_.wait_for(lock, std::chrono::milliseconds(20), [&] {
const bool hasSystem = !includeSystem_ || systemQueue_.size() >= chunkBytes;
const bool hasMicrophone = !includeMicrophone_ || microphoneQueue_.size() >= chunkBytes;
const bool hasAnySource = !systemQueue_.empty() || !microphoneQueue_.empty();
return stopRequested_.load() ||
(timelineStarted_ && !paused_ && (hasSystem || hasMicrophone) && hasAnySource);
});
if (stopRequested_) {
break;
}
if (!timelineStarted_ || paused_) {
continue;
}
const bool hasAnyQueuedAudio = !systemQueue_.empty() || !microphoneQueue_.empty();
if (!hasAnyQueuedAudio) {
continue;
}
mixedChunk.assign(chunkBytes, 0);
if (includeSystem_) {
pop(systemQueue_, sourceChunk, chunkBytes);
mixAudioInPlace(mixedChunk, sourceChunk.data(), static_cast<DWORD>(sourceChunk.size()), format_);
}
if (includeMicrophone_) {
pop(microphoneQueue_, sourceChunk, chunkBytes);
mixAudioInPlace(mixedChunk, sourceChunk.data(), static_cast<DWORD>(sourceChunk.size()), format_);
}
}
if (!audioClockStarted) {
audioClockStart = std::chrono::steady_clock::now();
audioClockStarted = true;
}
const int64_t timestampHns =
static_cast<int64_t>((emittedFrames_ * HnsPerSecond) / format_.sampleRate);
const int64_t durationHns =
static_cast<int64_t>((static_cast<uint64_t>(chunkFrames) * HnsPerSecond) / format_.sampleRate);
if (!output_(mixedChunk.data(), static_cast<DWORD>(mixedChunk.size()), timestampHns, durationHns)) {
stopRequested_ = true;
break;
}
emittedFrames_ += chunkFrames;
const auto nextDeadline = audioClockStart +
std::chrono::duration_cast<std::chrono::steady_clock::duration>(
std::chrono::duration<double>(static_cast<double>(emittedFrames_) / format_.sampleRate));
std::this_thread::sleep_until(nextDeadline);
}
}
@@ -0,0 +1,87 @@
#pragma once
#include "mf_encoder.h"
#include <Windows.h>
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>
bool sameAudioFormatForMixing(const AudioInputFormat& left, const AudioInputFormat& right);
AudioInputFormat makeAacCompatibleAudioFormat(const AudioInputFormat& source);
void copyAudioWithGain(
const BYTE* source,
DWORD byteCount,
const AudioInputFormat& format,
double gain,
std::vector<BYTE>& destination);
void convertAudioWithGain(
const BYTE* source,
DWORD byteCount,
const AudioInputFormat& sourceFormat,
const AudioInputFormat& targetFormat,
double gain,
std::vector<BYTE>& destination);
void mixAudioInPlace(
std::vector<BYTE>& destination,
const BYTE* source,
DWORD byteCount,
const AudioInputFormat& format);
class AudioMixer {
public:
using OutputCallback = std::function<bool(const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns)>;
AudioMixer(
const AudioInputFormat& format,
const AudioInputFormat& systemFormat,
const AudioInputFormat& microphoneFormat,
bool includeSystem,
bool includeMicrophone,
double microphoneGain,
OutputCallback output);
~AudioMixer();
AudioMixer(const AudioMixer&) = delete;
AudioMixer& operator=(const AudioMixer&) = delete;
bool start();
void beginTimeline();
void setPaused(bool paused);
void stop();
void pushSystem(const BYTE* data, DWORD byteCount);
void pushMicrophone(const BYTE* data, DWORD byteCount);
private:
void append(
std::vector<BYTE>& queue,
const BYTE* data,
DWORD byteCount,
const AudioInputFormat& sourceFormat,
double gain);
bool pop(std::vector<BYTE>& queue, std::vector<BYTE>& chunk, size_t byteCount);
void mixLoop();
AudioInputFormat format_{};
AudioInputFormat systemFormat_{};
AudioInputFormat microphoneFormat_{};
bool includeSystem_ = false;
bool includeMicrophone_ = false;
double microphoneGain_ = 1.0;
OutputCallback output_;
std::mutex mutex_;
std::condition_variable cv_;
std::vector<BYTE> systemQueue_;
std::vector<BYTE> microphoneQueue_;
std::vector<BYTE> gainBuffer_;
std::thread thread_;
std::atomic<bool> stopRequested_ = false;
bool timelineStarted_ = false;
bool paused_ = false;
uint64_t emittedFrames_ = 0;
};
@@ -0,0 +1,482 @@
#include <windows.h>
#include <gdiplus.h>
#include <objbase.h>
#include <atomic>
#include <algorithm>
#include <chrono>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
// ─────────────────────────────────────────────────────────────────────────────
// Global mouse-hook state
// ─────────────────────────────────────────────────────────────────────────────
static HHOOK g_mouseHook = nullptr;
static DWORD g_mainThreadId = 0;
static std::atomic<int> g_leftDownCount{0};
static std::atomic<int> g_leftUpCount{0};
static std::atomic<bool> g_stop{false};
static std::mutex g_stdoutMtx;
static LRESULT CALLBACK LowLevelMouseProc(int nCode, WPARAM wParam, LPARAM lParam) {
if (nCode >= 0) {
if (wParam == WM_LBUTTONDOWN) g_leftDownCount.fetch_add(1, std::memory_order_relaxed);
else if (wParam == WM_LBUTTONUP) g_leftUpCount.fetch_add(1, std::memory_order_relaxed);
}
return CallNextHookEx(g_mouseHook, nCode, wParam, lParam);
}
// ─────────────────────────────────────────────────────────────────────────────
// Utilities
// ─────────────────────────────────────────────────────────────────────────────
static int64_t nowMs() {
return static_cast<int64_t>(
std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch())
.count());
}
static void writeJsonLine(const std::string& json) {
std::lock_guard<std::mutex> lock(g_stdoutMtx);
std::cout << json << '\n';
std::cout.flush();
}
static std::string jsonEscape(const std::string& s) {
std::string r;
r.reserve(s.size());
for (unsigned char c : s) {
switch (c) {
case '"': r += "\\\""; break;
case '\\': r += "\\\\"; break;
case '\n': r += "\\n"; break;
case '\r': r += "\\r"; break;
case '\t': r += "\\t"; break;
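default:
if (c < 0x20) {
// JSON forbids raw control characters inside strings; emit \u00XX.
char buf[7];
std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
r += buf;
} else {
r.push_back(static_cast<char>(c));
}
break;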
}
}
return r;
}
static const char kBase64Chars[] =
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static std::string base64Encode(const uint8_t* data, size_t len) {
std::string out;
out.reserve(((len + 2) / 3) * 4);
for (size_t i = 0; i < len; i += 3) {
const uint32_t b =
(static_cast<uint32_t>(data[i]) << 16) |
(i + 1 < len ? static_cast<uint32_t>(data[i + 1]) << 8 : 0u) |
(i + 2 < len ? static_cast<uint32_t>(data[i + 2]) : 0u);
out.push_back(kBase64Chars[(b >> 18) & 0x3F]);
out.push_back(kBase64Chars[(b >> 12) & 0x3F]);
out.push_back(i + 1 < len ? kBase64Chars[(b >> 6) & 0x3F] : '=');
out.push_back(i + 2 < len ? kBase64Chars[(b ) & 0x3F] : '=');
}
return out;
}
// ─────────────────────────────────────────────────────────────────────────────
// GDI+ PNG encoder CLSID
// ─────────────────────────────────────────────────────────────────────────────
static bool getPngClsid(CLSID& out) {
UINT num = 0, sz = 0;
if (Gdiplus::GetImageEncodersSize(&num, &sz) != Gdiplus::Ok || sz == 0) return false;
std::vector<uint8_t> buf(sz);
auto* enc = reinterpret_cast<Gdiplus::ImageCodecInfo*>(buf.data());
if (Gdiplus::GetImageEncoders(num, sz, enc) != Gdiplus::Ok) return false;
for (UINT i = 0; i < num; ++i) {
if (std::wstring(enc[i].MimeType) == L"image/png") {
out = enc[i].Clsid;
return true;
}
}
return false;
}
// ─────────────────────────────────────────────────────────────────────────────
// Standard cursor-type lookup
// ─────────────────────────────────────────────────────────────────────────────
static const char* standardCursorType(HCURSOR hc) {
if (!hc) return nullptr;
static const struct { WORD id; const char* name; } kMap[] = {
{32512, "arrow"},
{32513, "text"},
{32514, "wait"},
{32515, "crosshair"},
{32516, "up-arrow"},
{32642, "resize-nwse"},
{32643, "resize-nesw"},
{32644, "resize-ew"},
{32645, "resize-ns"},
{32646, "move"},
{32648, "not-allowed"},
{32649, "pointer"},
{32650, "app-starting"},
{32651, "help"},
};
static constexpr int N = static_cast<int>(sizeof(kMap) / sizeof(kMap[0]));
static HCURSOR g_handles[N] = {};
static bool g_init = false;
if (!g_init) {
for (int i = 0; i < N; ++i)
g_handles[i] = LoadCursor(nullptr, MAKEINTRESOURCE(kMap[i].id));
g_init = true;
}
for (int i = 0; i < N; ++i)
if (g_handles[i] && g_handles[i] == hc) return kMap[i].name;
return nullptr;
}
// ─────────────────────────────────────────────────────────────────────────────
// Custom cursor-type detection (replicates the PowerShell heuristic)
// ─────────────────────────────────────────────────────────────────────────────
static const char* detectCustomCursorType(
const uint32_t* pixels, int w, int h, int hotX, int hotY)
{
if (w < 24 || h < 24 || w > 64 || h > 64) return nullptr;
if (hotX < w * 0.25 || hotX > w * 0.75) return nullptr;
if (hotY < h * 0.15 || hotY > h * 0.55) return nullptr;
int opaque = 0, topHalf = 0;
int left = w, top = h, right = -1, bottom = -1;
for (int y = 0; y < h; ++y) {
for (int x = 0; x < w; ++x) {
const uint8_t a = static_cast<uint8_t>(pixels[y * w + x] >> 24);
if (a <= 32) continue;
++opaque;
if (y < h / 2) ++topHalf;
if (x < left) left = x;
if (x > right) right = x;
if (y < top) top = y;
if (y > bottom) bottom = y;
}
}
if (opaque < 90 || right < left || bottom < top) return nullptr;
const int ow = right - left + 1;
const int oh = bottom - top + 1;
if (ow < w * 0.35 || ow > w * 0.9) return nullptr;
if (oh < h * 0.45 || oh > static_cast<double>(h)) return nullptr;
if (top > h * 0.45 || bottom < h * 0.65) return nullptr;
return topHalf > opaque * 0.55 ? "closed-hand" : "open-hand";
}
// ─────────────────────────────────────────────────────────────────────────────
// Build asset JSON for the given cursor (returns empty string on failure)
//
// Renders the cursor via GDI DrawIconEx onto a 32-bpp transparent DIB section
// and then encodes to PNG — matching the PowerShell approach of
// Graphics.Clear(Transparent) + Graphics.DrawIcon(). This correctly preserves
// per-pixel alpha for 32-bit cursors, unlike Gdiplus::Bitmap::FromHICON which
// can produce incorrect alpha for cursor handles.
// ─────────────────────────────────────────────────────────────────────────────
static std::string buildAssetJson(
HCURSOR hCursor,
const std::string& handleStr,
const CLSID& pngClsid,
const char** outCustomType)
{
*outCustomType = nullptr;
// Get hotspot and cursor dimensions from the icon info.
// For color cursors hbmColor gives the size; for monochrome cursors the
// mask bitmap is twice the cursor height (AND mask stacked on XOR mask).
ICONINFO ii{};
if (!GetIconInfo(hCursor, &ii)) return {};
const int hotX = static_cast<int>(ii.xHotspot);
const int hotY = static_cast<int>(ii.yHotspot);
int w = 0, h = 0;
if (ii.hbmColor) {
BITMAP bm{};
if (GetObject(ii.hbmColor, sizeof(bm), &bm)) { w = bm.bmWidth; h = bm.bmHeight; }
}
if (ii.hbmMask && (w == 0 || h == 0)) {
BITMAP bm{};
if (GetObject(ii.hbmMask, sizeof(bm), &bm)) {
w = bm.bmWidth;
h = ii.hbmColor ? bm.bmHeight : bm.bmHeight / 2;
}
}
if (ii.hbmMask) DeleteObject(ii.hbmMask);
if (ii.hbmColor) DeleteObject(ii.hbmColor);
if (w <= 0 || h <= 0) return {};
// Copy the cursor handle so DrawIconEx cannot affect the live system cursor.
const HICON hCopy = CopyIcon(hCursor);
if (!hCopy) return {};
// Allocate a 32-bpp top-down DIB section and clear it to transparent black,
// then draw the cursor with DI_NORMAL. For 32-bit alpha cursors Windows
// writes correct per-pixel alpha into the high byte of each BGRA pixel.
const int stride = w * 4;
BITMAPINFOHEADER bih{};
bih.biSize = sizeof(bih);
bih.biWidth = w;
bih.biHeight = -h; // negative = top-down scanline order
bih.biPlanes = 1;
bih.biBitCount = 32;
bih.biCompression = BI_RGB;
void* pBits = nullptr;
HDC hDC = CreateCompatibleDC(nullptr);
HBITMAP hBmp = hDC ? CreateDIBSection(hDC,
reinterpret_cast<const BITMAPINFO*>(&bih),
DIB_RGB_COLORS, &pBits, nullptr, 0)
: nullptr;
if (!hBmp || !pBits) {
if (hBmp) DeleteObject(hBmp);
if (hDC) DeleteDC(hDC);
DestroyIcon(hCopy);
return {};
}
HGDIOBJ hOld = SelectObject(hDC, hBmp);
std::memset(pBits, 0, static_cast<size_t>(stride) * static_cast<size_t>(h)); // transparent black; widen before multiplying
DrawIconEx(hDC, 0, 0, hCopy, w, h, 0, nullptr, DI_NORMAL);
GdiFlush();
SelectObject(hDC, hOld);
DeleteDC(hDC);
DestroyIcon(hCopy);
// GDI's 32-bit DIB stores pixels as BGRA in memory. GDI+'s
// PixelFormat32bppARGB interprets each 32-bit word as 0xAARRGGBB which is
// identical to BGRA on little-endian, so the alpha byte is always >> 24.
{
const auto* px = static_cast<const uint32_t*>(pBits);
*outCustomType = detectCustomCursorType(px, w, h, hotX, hotY);
}
// Wrap the DIB pixels in a GDI+ Bitmap (zero-copy) and save to PNG.
// Keep hBmp alive until after gBmp is destroyed so pBits remains valid.
std::vector<uint8_t> pngData;
{
Gdiplus::Bitmap gBmp(w, h, stride, PixelFormat32bppARGB,
static_cast<BYTE*>(pBits));
if (gBmp.GetLastStatus() == Gdiplus::Ok) {
IStream* pStream = nullptr;
if (SUCCEEDED(CreateStreamOnHGlobal(nullptr, TRUE, &pStream))) {
if (gBmp.Save(pStream, &pngClsid) == Gdiplus::Ok) {
ULARGE_INTEGER sz{};
LARGE_INTEGER zero{};
pStream->Seek(zero, STREAM_SEEK_END, &sz);
pStream->Seek(zero, STREAM_SEEK_SET, nullptr);
pngData.resize(static_cast<size_t>(sz.QuadPart));
ULONG n = 0;
pStream->Read(pngData.data(), static_cast<ULONG>(pngData.size()), &n);
pngData.resize(n);
}
pStream->Release();
}
}
} // gBmp destroyed here; pBits (owned by hBmp) still valid
DeleteObject(hBmp);
if (pngData.empty()) return {};
const std::string dataUrl =
"data:image/png;base64," + base64Encode(pngData.data(), pngData.size());
std::string json;
json.reserve(dataUrl.size() + 128);
json = "{\"id\":\"" + handleStr + "\"";
json += ",\"imageDataUrl\":\"" + jsonEscape(dataUrl) + "\"";
json += ",\"width\":" + std::to_string(w);
json += ",\"height\":" + std::to_string(h);
json += ",\"hotspotX\":" + std::to_string(hotX);
json += ",\"hotspotY\":" + std::to_string(hotY);
if (*outCustomType) {
json += ",\"cursorType\":\"";
json += *outCustomType;
json += "\"";
} else {
json += ",\"cursorType\":null";
}
json += "}";
return json;
}
// ─────────────────────────────────────────────────────────────────────────────
// Sampling loop (background thread)
// ─────────────────────────────────────────────────────────────────────────────
static void runSamplingLoop(int intervalMs, HWND targetWindow, const CLSID& pngClsid) {
HCURSOR lastCursor = nullptr;
while (!g_stop.load(std::memory_order_relaxed)) {
const int downCount = g_leftDownCount.exchange(0, std::memory_order_relaxed);
const int upCount = g_leftUpCount.exchange(0, std::memory_order_relaxed);
CURSORINFO ci{};
ci.cbSize = sizeof(ci);
if (!GetCursorInfo(&ci)) {
char buf[160];
std::snprintf(buf, sizeof(buf),
"{\"type\":\"error\",\"timestampMs\":%" PRId64 ",\"message\":\"GetCursorInfo failed\"}",
nowMs());
writeJsonLine(buf);
std::this_thread::sleep_for(std::chrono::milliseconds(intervalMs));
continue;
}
const bool visible = (ci.flags & CURSOR_SHOWING) != 0;
const HCURSOR hc = ci.hCursor;
// Handle string ("0xHEX" or empty for null cursor)
char handleBuf[32] = {};
if (hc)
std::snprintf(handleBuf, sizeof(handleBuf),
"0x%" PRIX64, static_cast<uint64_t>(reinterpret_cast<uintptr_t>(hc)));
const std::string handleStr = hc ? handleBuf : "";
// Standard cursor type
const char* cursorType = standardCursorType(hc);
// Mouse button state
const SHORT ks = GetAsyncKeyState(VK_LBUTTON);
const bool leftDown = (ks & 0x8000) != 0;
const bool leftPressed = downCount > 0 || (ks & 0x0001) != 0;
const bool leftReleased = upCount > 0;
// Asset — only when the cursor handle changes
std::string assetJson;
if (visible && hc && hc != lastCursor) {
const char* customType = nullptr;
assetJson = buildAssetJson(hc, handleStr, pngClsid, &customType);
if (!assetJson.empty() && !cursorType && customType)
cursorType = customType;
lastCursor = hc;
}
// Window bounds
std::string boundsJson = "null";
if (targetWindow && IsWindow(targetWindow)) {
RECT r{};
if (GetWindowRect(targetWindow, &r)) {
const int bw = r.right - r.left;
const int bh = r.bottom - r.top;
if (bw > 0 && bh > 0) {
char buf[128];
std::snprintf(buf, sizeof(buf),
"{\"x\":%ld,\"y\":%ld,\"width\":%d,\"height\":%d}",
r.left, r.top, bw, bh);
boundsJson = buf;
}
}
}
// Emit sample JSON
std::string out;
out.reserve(256);
out += "{\"type\":\"sample\"";
out += ",\"timestampMs\":"; out += std::to_string(nowMs());
out += ",\"x\":"; out += std::to_string(ci.ptScreenPos.x);
out += ",\"y\":"; out += std::to_string(ci.ptScreenPos.y);
out += ",\"visible\":"; out += visible ? "true" : "false";
out += ",\"handle\":"; out += hc ? ("\"" + handleStr + "\"") : "null";
out += ",\"cursorType\":"; out += cursorType ? ("\"" + std::string(cursorType) + "\"") : "null";
out += ",\"leftButtonDown\":"; out += leftDown ? "true" : "false";
out += ",\"leftButtonPressed\":"; out += leftPressed ? "true" : "false";
out += ",\"leftButtonReleased\":"; out += leftReleased ? "true" : "false";
out += ",\"bounds\":"; out += boundsJson;
out += ",\"asset\":"; out += assetJson.empty() ? "null" : assetJson;
out += "}";
writeJsonLine(out);
// Exit if stdout pipe is broken (parent process died)
if (std::cout.fail()) {
PostThreadMessage(g_mainThreadId, WM_QUIT, 0, 0);
break;
}
std::this_thread::sleep_for(std::chrono::milliseconds(intervalMs));
}
}
// ─────────────────────────────────────────────────────────────────────────────
// main
// ─────────────────────────────────────────────────────────────────────────────
int main(int argc, char* argv[]) {
if (argc < 2) {
std::cerr << "Usage: cursor-sampler <intervalMs> [windowHandle]" << std::endl;
return 1;
}
const int intervalMs = std::max(1, std::atoi(argv[1]));
HWND targetWindow = nullptr;
if (argc >= 3) {
const std::string arg = argv[2];
if (!arg.empty() && arg != "null") {
try {
const int base = (arg.rfind("0x", 0) == 0 || arg.rfind("0X", 0) == 0) ? 16 : 10;
const uint64_t v = std::stoull(arg, nullptr, base);
if (v) targetWindow = reinterpret_cast<HWND>(static_cast<uintptr_t>(v));
} catch (...) {}
}
}
// Initialize GDI+
Gdiplus::GdiplusStartupInput gdipInput{};
ULONG_PTR gdipToken = 0;
if (Gdiplus::GdiplusStartup(&gdipToken, &gdipInput, nullptr) != Gdiplus::Ok) {
std::cerr << "GDI+ init failed" << std::endl;
return 1;
}
CLSID pngClsid{};
if (!getPngClsid(pngClsid)) {
std::cerr << "PNG encoder not found" << std::endl;
Gdiplus::GdiplusShutdown(gdipToken);
return 1;
}
// Install global low-level mouse hook on this thread
g_mouseHook = SetWindowsHookEx(WH_MOUSE_LL, LowLevelMouseProc, GetModuleHandle(nullptr), 0);
if (!g_mouseHook) {
std::cerr << "SetWindowsHookEx failed" << std::endl;
Gdiplus::GdiplusShutdown(gdipToken);
return 1;
}
// Prime GetAsyncKeyState so the first poll doesn't return stale "since-last-call" bits
GetAsyncKeyState(VK_LBUTTON);
// Signal readiness
g_mainThreadId = GetCurrentThreadId();
{
char buf[80];
std::snprintf(buf, sizeof(buf),
"{\"type\":\"ready\",\"timestampMs\":%" PRId64 "}", nowMs());
writeJsonLine(buf);
}
// Start sampling on a background thread
std::thread sampler(runSamplingLoop, intervalMs, targetWindow, std::cref(pngClsid));
// Run the message pump on the main thread — required for WH_MOUSE_LL callbacks
MSG msg;
while (GetMessage(&msg, nullptr, 0, 0) > 0) {
TranslateMessage(&msg);
DispatchMessage(&msg);
}
g_stop.store(true, std::memory_order_relaxed);
if (sampler.joinable()) sampler.join();
UnhookWindowsHookEx(g_mouseHook);
Gdiplus::GdiplusShutdown(gdipToken);
return 0;
}
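The optional window-handle argument in `main` above is parsed leniently: `std::stoull` stops at the first invalid character, so an argument with trailing garbage is still accepted. The following is a minimal, portable sketch of a stricter parse. The name `parseHandleArg` is hypothetical and not part of this commit, and it returns a plain integer (0 meaning "no window") rather than an `HWND` so that it compiles off Windows.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not in the commit): parse a decimal or 0x-prefixed
// hexadecimal handle value. Returns 0 for empty input, "null", negative
// numbers, trailing garbage, or anything std::stoull cannot parse.
std::uint64_t parseHandleArg(const std::string& text) {
    // Require a leading digit so inputs such as "-1" or " 42" are rejected.
    if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0]))) {
        return 0;
    }
    const bool isHex = text.rfind("0x", 0) == 0 || text.rfind("0X", 0) == 0;
    try {
        std::size_t parsed = 0;
        const std::uint64_t value = std::stoull(text, &parsed, isHex ? 16 : 10);
        // The whole string must be consumed for the value to count as valid.
        return parsed == text.size() ? value : 0;
    } catch (...) {
        // std::stoull throws on unparsable or out-of-range input.
        return 0;
    }
}
```

On Windows the result could then be converted the same way the sampler already does, via `reinterpret_cast<HWND>(static_cast<uintptr_t>(value))`, with 0 leaving `targetWindow` as `nullptr`.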
@@ -0,0 +1,427 @@
#include "dshow_webcam_capture.h"
#include <initguid.h>
#include <dshow.h>
#include <wrl/client.h>
#include <algorithm>
#include <array>
#include <chrono>
#include <cstdlib>
#include <exception>
#include <iomanip>
#include <iostream>
#include <sstream>
namespace {
const CLSID CLSID_SampleGrabberLocal = {0xC1F400A0, 0x3F08, 0x11D3, {0x9F, 0x0B, 0x00, 0x60, 0x08, 0x03, 0x9E, 0x37}};
const CLSID CLSID_NullRendererLocal = {0xC1F400A4, 0x3F08, 0x11D3, {0x9F, 0x0B, 0x00, 0x60, 0x08, 0x03, 0x9E, 0x37}};
MIDL_INTERFACE("6B652FFF-11FE-4FCE-92AD-0266B5D7C78F")
ISampleGrabber : public IUnknown {
public:
virtual HRESULT STDMETHODCALLTYPE SetOneShot(BOOL oneShot) = 0;
virtual HRESULT STDMETHODCALLTYPE SetMediaType(const AM_MEDIA_TYPE* type) = 0;
virtual HRESULT STDMETHODCALLTYPE GetConnectedMediaType(AM_MEDIA_TYPE* type) = 0;
virtual HRESULT STDMETHODCALLTYPE SetBufferSamples(BOOL bufferThem) = 0;
virtual HRESULT STDMETHODCALLTYPE GetCurrentBuffer(long* bufferSize, long* buffer) = 0;
virtual HRESULT STDMETHODCALLTYPE GetCurrentSample(IMediaSample** sample) = 0;
virtual HRESULT STDMETHODCALLTYPE SetCallback(IUnknown* callback, long whichMethodToCallback) = 0;
};
bool succeeded(HRESULT hr, const char* label) {
if (SUCCEEDED(hr)) {
return true;
}
std::cerr << "ERROR: " << label << " failed (hr=0x" << std::hex << hr << std::dec << ")"
<< std::endl;
return false;
}
std::string guidToString(const GUID& guid) {
if (guid == MEDIASUBTYPE_RGB32) {
return "RGB32";
}
if (guid == MEDIASUBTYPE_YUY2) {
return "YUY2";
}
if (guid == MEDIASUBTYPE_NV12) {
return "NV12";
}
std::ostringstream stream;
stream << std::hex << std::setfill('0')
<< '{' << std::setw(8) << guid.Data1
<< '-' << std::setw(4) << guid.Data2
<< '-' << std::setw(4) << guid.Data3
<< '-';
for (int index = 0; index < 2; index += 1) {
stream << std::setw(2) << static_cast<int>(guid.Data4[index]);
}
stream << '-';
for (int index = 2; index < 8; index += 1) {
stream << std::setw(2) << static_cast<int>(guid.Data4[index]);
}
stream << '}';
return stream.str();
}
void freeMediaType(AM_MEDIA_TYPE& type) {
if (type.cbFormat != 0) {
CoTaskMemFree(type.pbFormat);
type.cbFormat = 0;
type.pbFormat = nullptr;
}
if (type.pUnk) {
type.pUnk->Release();
type.pUnk = nullptr;
}
}
BYTE clampToByte(int value) {
return static_cast<BYTE>(std::clamp(value, 0, 255));
}
std::array<BYTE, 3> yuvToBgr(int y, int u, int v) {
const int c = y - 16;
const int d = u - 128;
const int e = v - 128;
const int blue = (298 * c + 516 * d + 128) >> 8;
const int green = (298 * c - 100 * d - 208 * e + 128) >> 8;
const int red = (298 * c + 409 * e + 128) >> 8;
return {clampToByte(blue), clampToByte(green), clampToByte(red)};
}
} // namespace
struct DirectShowWebcamCapture::Impl {
Microsoft::WRL::ComPtr<IGraphBuilder> graph;
Microsoft::WRL::ComPtr<ICaptureGraphBuilder2> captureGraph;
Microsoft::WRL::ComPtr<IBaseFilter> captureFilter;
Microsoft::WRL::ComPtr<IBaseFilter> sampleGrabberFilter;
Microsoft::WRL::ComPtr<ISampleGrabber> sampleGrabber;
Microsoft::WRL::ComPtr<IBaseFilter> nullRenderer;
Microsoft::WRL::ComPtr<IMediaControl> mediaControl;
bool comInitialized = false;
bool running = false;
};
DirectShowWebcamCapture::~DirectShowWebcamCapture() {
stop();
delete impl_;
}
bool DirectShowWebcamCapture::initialize(
const std::wstring& deviceId,
const std::wstring& deviceName,
const std::wstring& directShowClsid,
int requestedWidth,
int requestedHeight,
int requestedFps) {
(void)deviceId;
stop();
delete impl_;
impl_ = nullptr;
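impl_ = new Impl();
// Clear format state left over from a previous initialize() call so stale values are never reused.
width_ = 0;
height_ = 0;
sourceStride_ = 0;
sourceTopDown_ = false;
pixelFormat_ = PixelFormat::Bgra;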
fps_ = std::clamp(requestedFps > 0 ? requestedFps : 30, 1, 60);
HRESULT hr = CoInitializeEx(nullptr, COINIT_MULTITHREADED);
if (SUCCEEDED(hr)) {
impl_->comInitialized = true;
} else if (hr != RPC_E_CHANGED_MODE) {
return succeeded(hr, "CoInitializeEx(DirectShow webcam)");
}
if (directShowClsid.empty()) {
std::cerr << "ERROR: DirectShow webcam fallback requires a resolved filter CLSID" << std::endl;
return false;
}
CLSID selectedClsid{};
if (FAILED(CLSIDFromString(directShowClsid.c_str(), &selectedClsid))) {
std::cerr << "ERROR: DirectShow webcam fallback received an invalid filter CLSID" << std::endl;
return false;
}
selectedDeviceName_ = deviceName.empty() ? directShowClsid : deviceName;
if (!succeeded(CoCreateInstance(selectedClsid, nullptr, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&impl_->captureFilter)),
"CoCreateInstance(DirectShow webcam filter)")) {
return false;
}
if (!succeeded(CoCreateInstance(CLSID_FilterGraph, nullptr, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&impl_->graph)),
"CoCreateInstance(FilterGraph)")) {
return false;
}
if (!succeeded(CoCreateInstance(CLSID_CaptureGraphBuilder2, nullptr, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&impl_->captureGraph)),
"CoCreateInstance(CaptureGraphBuilder2)")) {
return false;
}
if (!succeeded(impl_->captureGraph->SetFiltergraph(impl_->graph.Get()), "SetFiltergraph(DirectShow webcam)")) {
return false;
}
if (!succeeded(impl_->graph->AddFilter(impl_->captureFilter.Get(), L"OpenScreen Webcam Source"),
"AddFilter(DirectShow webcam source)")) {
return false;
}
if (!succeeded(CoCreateInstance(CLSID_SampleGrabberLocal, nullptr, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&impl_->sampleGrabberFilter)),
"CoCreateInstance(SampleGrabber)")) {
return false;
}
if (!succeeded(impl_->sampleGrabberFilter.As(&impl_->sampleGrabber), "QueryInterface(ISampleGrabber)")) {
return false;
}
AM_MEDIA_TYPE requestedType{};
requestedType.majortype = MEDIATYPE_Video;
requestedType.formattype = FORMAT_VideoInfo;
if (!succeeded(impl_->sampleGrabber->SetMediaType(&requestedType), "SetMediaType(DirectShow video)")) {
return false;
}
if (!succeeded(impl_->graph->AddFilter(impl_->sampleGrabberFilter.Get(), L"OpenScreen Webcam Sample Grabber"),
"AddFilter(SampleGrabber)")) {
return false;
}
if (!succeeded(CoCreateInstance(CLSID_NullRendererLocal, nullptr, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&impl_->nullRenderer)),
"CoCreateInstance(NullRenderer)")) {
return false;
}
if (!succeeded(impl_->graph->AddFilter(impl_->nullRenderer.Get(), L"OpenScreen Webcam Null Renderer"),
"AddFilter(NullRenderer)")) {
return false;
}
if (!succeeded(impl_->captureGraph->RenderStream(
&PIN_CATEGORY_CAPTURE,
&MEDIATYPE_Video,
impl_->captureFilter.Get(),
impl_->sampleGrabberFilter.Get(),
impl_->nullRenderer.Get()),
"RenderStream(DirectShow webcam)")) {
return false;
}
AM_MEDIA_TYPE connectedType{};
if (!succeeded(impl_->sampleGrabber->GetConnectedMediaType(&connectedType), "GetConnectedMediaType(DirectShow webcam)")) {
return false;
}
if (connectedType.subtype == MEDIASUBTYPE_YUY2) {
pixelFormat_ = PixelFormat::Yuy2;
} else if (connectedType.subtype == MEDIASUBTYPE_NV12) {
pixelFormat_ = PixelFormat::Nv12;
} else if (connectedType.subtype == MEDIASUBTYPE_RGB32) {
pixelFormat_ = PixelFormat::Bgra;
} else {
std::cerr << "ERROR: Unsupported DirectShow webcam media subtype "
<< guidToString(connectedType.subtype) << std::endl;
freeMediaType(connectedType);
return false;
}
if (connectedType.formattype == FORMAT_VideoInfo && connectedType.pbFormat) {
const auto* videoInfo = reinterpret_cast<VIDEOINFOHEADER*>(connectedType.pbFormat);
width_ = std::abs(videoInfo->bmiHeader.biWidth);
height_ = std::abs(videoInfo->bmiHeader.biHeight);
const int bitsPerPixel = videoInfo->bmiHeader.biBitCount > 0 ? videoInfo->bmiHeader.biBitCount : 16;
if (pixelFormat_ == PixelFormat::Nv12) {
sourceStride_ = ((width_ + 3) / 4) * 4;
} else {
sourceStride_ = ((width_ * bitsPerPixel + 31) / 32) * 4;
}
sourceTopDown_ = pixelFormat_ != PixelFormat::Bgra || videoInfo->bmiHeader.biHeight < 0;
}
std::cerr << "INFO: DirectShow webcam connected subtype " << guidToString(connectedType.subtype)
<< " " << width_ << "x" << height_ << " stride=" << sourceStride_ << std::endl;
freeMediaType(connectedType);
if (width_ <= 0 || height_ <= 0) {
width_ = requestedWidth > 0 ? requestedWidth : 1280;
height_ = requestedHeight > 0 ? requestedHeight : 720;
}
if (sourceStride_ <= 0) {
sourceStride_ = pixelFormat_ == PixelFormat::Bgra ? width_ * 4 : ((width_ + 3) / 4) * 4;
}
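// Without sample buffering GetCurrentBuffer can never return a frame, so treat failure as fatal.
if (!succeeded(impl_->sampleGrabber->SetBufferSamples(TRUE), "SetBufferSamples(DirectShow webcam)")) {
return false;
}
if (!succeeded(impl_->sampleGrabber->SetOneShot(FALSE), "SetOneShot(DirectShow webcam)")) {
return false;
}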
if (!succeeded(impl_->graph.As(&impl_->mediaControl), "QueryInterface(IMediaControl)")) {
return false;
}
return true;
}
bool DirectShowWebcamCapture::start() {
if (!impl_ || !impl_->mediaControl || impl_->running) {
return false;
}
HRESULT hr = impl_->mediaControl->Run();
if (!succeeded(hr, "Run(DirectShow webcam)")) {
return false;
}
stopRequested_ = false;
try {
thread_ = std::thread(&DirectShowWebcamCapture::captureLoop, this);
} catch (const std::exception& error) {
stopRequested_ = true;
impl_->mediaControl->Stop();
std::cerr << "ERROR: Failed to start DirectShow webcam capture thread: " << error.what() << std::endl;
return false;
} catch (...) {
stopRequested_ = true;
impl_->mediaControl->Stop();
std::cerr << "ERROR: Failed to start DirectShow webcam capture thread" << std::endl;
return false;
}
impl_->running = true;
return true;
}
void DirectShowWebcamCapture::stop() {
stopRequested_ = true;
if (thread_.joinable()) {
thread_.join();
}
if (!impl_) {
return;
}
if (impl_->mediaControl && impl_->running) {
impl_->mediaControl->Stop();
}
impl_->running = false;
impl_->mediaControl.Reset();
impl_->nullRenderer.Reset();
impl_->sampleGrabber.Reset();
impl_->sampleGrabberFilter.Reset();
impl_->captureFilter.Reset();
impl_->captureGraph.Reset();
impl_->graph.Reset();
if (impl_->comInitialized) {
CoUninitialize();
impl_->comInitialized = false;
}
}
void DirectShowWebcamCapture::captureLoop() {
const HRESULT coinitHr = CoInitializeEx(nullptr, COINIT_MULTITHREADED);
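// Poll the grabber's most recent buffered sample roughly once per frame interval.
while (!stopRequested_ && impl_ && impl_->sampleGrabber) {
long bufferSize = 0;
// The first call passes a null buffer and only queries the required size.
HRESULT hr = impl_->sampleGrabber->GetCurrentBuffer(&bufferSize, nullptr);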
if (SUCCEEDED(hr) && bufferSize > 0) {
std::vector<BYTE> buffer(static_cast<size_t>(bufferSize));
hr = impl_->sampleGrabber->GetCurrentBuffer(&bufferSize, reinterpret_cast<long*>(buffer.data()));
if (SUCCEEDED(hr)) {
storeFrame(buffer.data(), bufferSize);
}
}
std::this_thread::sleep_for(std::chrono::milliseconds(1000 / std::max(1, fps_)));
}
if (SUCCEEDED(coinitHr)) {
CoUninitialize();
}
}
void DirectShowWebcamCapture::storeFrame(const BYTE* buffer, long length) {
const int destinationStride = width_ * 4;
const int sourceStride = sourceStride_ > 0 ? sourceStride_ : destinationStride;
const int expectedLength = pixelFormat_ == PixelFormat::Nv12
? sourceStride * height_ + sourceStride * ((height_ + 1) / 2)
: sourceStride * height_;
if (!buffer || length < expectedLength || width_ <= 0 || height_ <= 0) {
return;
}
std::vector<BYTE> frame(static_cast<size_t>(destinationStride * height_));
for (int y = 0; y < height_; y += 1) {
const int sourceY = sourceTopDown_ ? y : height_ - 1 - y;
const BYTE* source = buffer + sourceY * sourceStride;
BYTE* destination = frame.data() + y * destinationStride;
if (pixelFormat_ == PixelFormat::Bgra) {
std::copy(source, source + destinationStride, destination);
for (int x = 0; x < width_; x += 1) {
destination[x * 4 + 3] = 255;
}
continue;
}
if (pixelFormat_ == PixelFormat::Nv12) {
// NV12: a full-resolution Y plane followed by one interleaved U/V row per two image rows.
const BYTE* yPlane = buffer + sourceY * sourceStride;
const BYTE* uvPlane = buffer + sourceStride * height_ + (sourceY / 2) * sourceStride;
for (int x = 0; x < width_; x += 1) {
const int uvX = (x / 2) * 2;
const auto color = yuvToBgr(yPlane[x], uvPlane[uvX], uvPlane[uvX + 1]);
BYTE* pixel = destination + x * 4;
pixel[0] = color[0];
pixel[1] = color[1];
pixel[2] = color[2];
pixel[3] = 255;
}
continue;
}
// YUY2: each 4-byte group is Y0 U Y1 V; both pixels of the pair share U and V.
for (int x = 0; x + 1 < width_; x += 2) {
const BYTE y0 = source[x * 2];
const BYTE u = source[x * 2 + 1];
const BYTE y1 = source[x * 2 + 2];
const BYTE v = source[x * 2 + 3];
const auto first = yuvToBgr(y0, u, v);
const auto second = yuvToBgr(y1, u, v);
BYTE* firstPixel = destination + x * 4;
BYTE* secondPixel = firstPixel + 4;
firstPixel[0] = first[0];
firstPixel[1] = first[1];
firstPixel[2] = first[2];
firstPixel[3] = 255;
secondPixel[0] = second[0];
secondPixel[1] = second[1];
secondPixel[2] = second[2];
secondPixel[3] = 255;
}
if (width_ % 2 == 1) {
const int x = width_ - 1;
const int previousPairStart = ((x - 1) / 2) * 4;
const BYTE y = source[x * 2];
const BYTE u = source[previousPairStart + 1];
const BYTE v = source[previousPairStart + 3];
const auto color = yuvToBgr(y, u, v);
BYTE* pixel = destination + x * 4;
pixel[0] = color[0];
pixel[1] = color[1];
pixel[2] = color[2];
pixel[3] = 255;
}
}
std::scoped_lock lock(frameMutex_);
latestFrame_ = std::move(frame);
latestFrameSequence_ += 1;
}
bool DirectShowWebcamCapture::copyLatestFrame(WebcamFrameSnapshot& destination) {
std::scoped_lock lock(frameMutex_);
if (latestFrame_.empty() || width_ <= 0 || height_ <= 0) {
return false;
}
destination.data = latestFrame_;
destination.width = width_;
destination.height = height_;
destination.sequence = latestFrameSequence_;
return true;
}
int DirectShowWebcamCapture::width() const {
return width_;
}
int DirectShowWebcamCapture::height() const {
return height_;
}
int DirectShowWebcamCapture::fps() const {
return fps_;
}
const std::wstring& DirectShowWebcamCapture::selectedDeviceName() const {
return selectedDeviceName_;
}
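As a quick sanity check of the YUY2 path in `storeFrame`, the sketch below reproduces the same fixed-point conversion as `yuvToBgr` and expands one packed row into BGRA. The helper `yuy2RowToBgra` is a hypothetical name introduced only for illustration, and it assumes an even width for brevity (the commit handles the odd trailing pixel separately).

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same integer approximation as yuvToBgr in the file above: Y is offset by 16,
// U and V are centred on 128, and results are clamped to 0..255.
std::array<std::uint8_t, 3> yuvToBgr(int y, int u, int v) {
    const int c = y - 16;
    const int d = u - 128;
    const int e = v - 128;
    const auto clampToByte = [](int value) {
        return static_cast<std::uint8_t>(std::min(255, std::max(0, value)));
    };
    return {clampToByte((298 * c + 516 * d + 128) >> 8),
            clampToByte((298 * c - 100 * d - 208 * e + 128) >> 8),
            clampToByte((298 * c + 409 * e + 128) >> 8)};
}

// Hypothetical helper: expand one YUY2 row (Y0 U Y1 V per pixel pair) to BGRA.
std::vector<std::uint8_t> yuy2RowToBgra(const std::uint8_t* row, int width) {
    std::vector<std::uint8_t> out(static_cast<std::size_t>(width) * 4);
    for (int x = 0; x + 1 < width; x += 2) {
        const std::uint8_t* pair = row + x * 2;
        const auto first = yuvToBgr(pair[0], pair[1], pair[3]);
        const auto second = yuvToBgr(pair[2], pair[1], pair[3]);
        std::uint8_t* dst = out.data() + x * 4;
        std::copy(first.begin(), first.end(), dst);
        dst[3] = 255;  // opaque alpha, as in storeFrame
        std::copy(second.begin(), second.end(), dst + 4);
        dst[7] = 255;
    }
    return out;
}
```

With neutral chroma (U = V = 128), a luma of 235 maps to white and a luma of 16 maps to black, which makes a two-pixel row a convenient smoke test for the byte order.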
@@ -0,0 +1,67 @@
#pragma once
#include <Windows.h>
#include <atomic>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
struct WebcamFrameSnapshot {
std::vector<BYTE> data;
int width = 0;
int height = 0;
uint64_t sequence = 0;
};
class DirectShowWebcamCapture {
public:
DirectShowWebcamCapture() = default;
~DirectShowWebcamCapture();
DirectShowWebcamCapture(const DirectShowWebcamCapture&) = delete;
DirectShowWebcamCapture& operator=(const DirectShowWebcamCapture&) = delete;
bool initialize(
const std::wstring& deviceId,
const std::wstring& deviceName,
const std::wstring& directShowClsid,
int requestedWidth,
int requestedHeight,
int requestedFps);
bool start();
void stop();
bool copyLatestFrame(WebcamFrameSnapshot& destination);
int width() const;
int height() const;
int fps() const;
const std::wstring& selectedDeviceName() const;
void storeFrame(const BYTE* buffer, long length);
private:
enum class PixelFormat {
Bgra,
Nv12,
Yuy2,
};
struct Impl;
void captureLoop();
Impl* impl_ = nullptr;
std::thread thread_;
std::atomic<bool> stopRequested_ = false;
std::mutex frameMutex_;
std::vector<BYTE> latestFrame_;
uint64_t latestFrameSequence_ = 0;
int width_ = 0;
int height_ = 0;
int fps_ = 30;
int sourceStride_ = 0;
bool sourceTopDown_ = false;
PixelFormat pixelFormat_ = PixelFormat::Bgra;
std::wstring selectedDeviceName_;
};
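The header pairs `latestFrame_` with `latestFrameSequence_`, which lets a consumer of `copyLatestFrame` tell whether it received a new frame or the same one again. A portable sketch of that pattern is shown below; `FrameSlot`, `Snapshot`, `publish`, and `copyLatest` are hypothetical names used only for illustration and do not appear in the commit.

```cpp
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

struct Snapshot {
    std::vector<std::uint8_t> data;
    std::uint64_t sequence = 0;
};

class FrameSlot {
public:
    // Producer side: replace the stored frame and bump the sequence number.
    void publish(std::vector<std::uint8_t> frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        latest_ = std::move(frame);
        sequence_ += 1;
    }

    // Consumer side: copy the frame out under the lock.
    // Returns false until at least one frame has been published.
    bool copyLatest(Snapshot& destination) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (latest_.empty()) {
            return false;
        }
        destination.data = latest_;
        destination.sequence = sequence_;
        return true;
    }

private:
    std::mutex mutex_;
    std::vector<std::uint8_t> latest_;
    std::uint64_t sequence_ = 0;
};
```

A consumer that remembers the last `sequence` it processed can skip re-encoding when the value has not changed, which matters when the encoder runs faster than the webcam delivers frames.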
@@ -0,0 +1,859 @@
#include "audio_sample_utils.h"
#include "mf_encoder.h"
#include "monitor_utils.h"
#include "wasapi_loopback_capture.h"
#include "webcam_capture.h"
#include "wgc_session.h"
#include <winrt/Windows.Foundation.h>
#include <algorithm>
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cctype>
#include <cstdint>
#include <functional>
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
namespace {
struct CaptureConfig {
int schemaVersion = 1;
int64_t displayId = 0;
int64_t recordingId = 0;
std::string sourceType = "display";
std::string sourceId;
std::string windowHandle;
std::string outputPath;
std::string webcamOutputPath;
int fps = 60;
int width = 0;
int height = 0;
MonitorBounds bounds{};
bool hasDisplayBounds = false;
bool captureSystemAudio = false;
bool captureMic = false;
bool captureCursor = false;
bool webcamEnabled = false;
std::string microphoneDeviceId;
std::string microphoneDeviceName;
double microphoneGain = 1.0;
std::string webcamDeviceId;
std::string webcamDeviceName;
std::string webcamDirectShowClsid;
int webcamWidth = 0;
int webcamHeight = 0;
int webcamFps = 0;
};
struct CaptureControl {
std::atomic<bool> stopRequested = false;
std::atomic<bool> paused = false;
std::mutex mutex;
std::condition_variable cv;
std::chrono::steady_clock::time_point pauseStartedAt;
std::chrono::steady_clock::duration totalPausedDuration{};
int64_t pausedDurationHns() {
std::scoped_lock lock(mutex);
auto total = totalPausedDuration;
if (paused.load()) {
total += std::chrono::steady_clock::now() - pauseStartedAt;
}
return std::chrono::duration_cast<std::chrono::nanoseconds>(total).count() / 100;
}
void setPaused(bool nextPaused) {
std::scoped_lock lock(mutex);
if (nextPaused == paused.load()) {
return;
}
if (nextPaused) {
pauseStartedAt = std::chrono::steady_clock::now();
} else {
totalPausedDuration += std::chrono::steady_clock::now() - pauseStartedAt;
}
paused = nextPaused;
}
};
std::wstring utf8ToWide(const std::string& value) {
if (value.empty()) {
return {};
}
const int size = MultiByteToWideChar(CP_UTF8, 0, value.data(), static_cast<int>(value.size()), nullptr, 0);
std::wstring result(static_cast<size_t>(size), L'\0');
MultiByteToWideChar(CP_UTF8, 0, value.data(), static_cast<int>(value.size()), result.data(), size);
return result;
}
std::string wideToUtf8(const std::wstring& value) {
if (value.empty()) {
return {};
}
const int size = WideCharToMultiByte(CP_UTF8, 0, value.data(), static_cast<int>(value.size()), nullptr, 0, nullptr, nullptr);
std::string result(static_cast<size_t>(size), '\0');
WideCharToMultiByte(CP_UTF8, 0, value.data(), static_cast<int>(value.size()), result.data(), size, nullptr, nullptr);
return result;
}
std::string jsonEscape(const std::string& value) {
std::string result;
result.reserve(value.size());
for (const char c : value) {
switch (c) {
case '\\':
result += "\\\\";
break;
case '"':
result += "\\\"";
break;
case '\n':
result += "\\n";
break;
case '\r':
result += "\\r";
break;
case '\t':
result += "\\t";
break;
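default:
if (static_cast<unsigned char>(c) < 0x20) {
// Remaining control characters must be \u-escaped to keep the output valid JSON.
constexpr char hexDigits[] = "0123456789abcdef";
result += "\\u00";
result.push_back(hexDigits[(c >> 4) & 0xF]);
result.push_back(hexDigits[c & 0xF]);
} else {
result.push_back(c);
}
break;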
}
}
return result;
}
bool hasVisibleBgraContent(const std::vector<BYTE>& frame) {
if (frame.size() < 4) {
return false;
}
uint64_t lumaTotal = 0;
BYTE maxLuma = 0;
const size_t pixelCount = frame.size() / 4;
const size_t step = std::max<size_t>(1, pixelCount / 4096);
size_t sampledPixels = 0;
for (size_t pixel = 0; pixel < pixelCount; pixel += step) {
const size_t offset = pixel * 4;
const BYTE b = frame[offset + 0];
const BYTE g = frame[offset + 1];
const BYTE r = frame[offset + 2];
const BYTE luma = static_cast<BYTE>((static_cast<uint16_t>(r) * 54 + static_cast<uint16_t>(g) * 183 + static_cast<uint16_t>(b) * 19) >> 8);
lumaTotal += luma;
maxLuma = std::max(maxLuma, luma);
sampledPixels += 1;
}
const uint64_t averageLuma = sampledPixels > 0 ? lumaTotal / sampledPixels : 0;
return maxLuma > 24 || averageLuma > 4;
}
bool findBool(const std::string& json, const std::string& key, bool fallback) {
auto pos = json.find("\"" + key + "\"");
if (pos == std::string::npos) {
return fallback;
}
pos = json.find(':', pos);
if (pos == std::string::npos) {
return fallback;
}
pos += 1;
while (pos < json.size() && std::isspace(static_cast<unsigned char>(json[pos]))) {
pos += 1;
}
if (json.compare(pos, 4, "true") == 0) {
return true;
}
if (json.compare(pos, 5, "false") == 0) {
return false;
}
return fallback;
}
int64_t findInt64(const std::string& json, const std::string& key, int64_t fallback) {
auto pos = json.find("\"" + key + "\"");
if (pos == std::string::npos) {
return fallback;
}
pos = json.find(':', pos);
if (pos == std::string::npos) {
return fallback;
}
pos += 1;
while (pos < json.size() && std::isspace(static_cast<unsigned char>(json[pos]))) {
pos += 1;
}
try {
return std::stoll(json.substr(pos));
} catch (...) {
return fallback;
}
}
int findInt(const std::string& json, const std::string& key, int fallback) {
return static_cast<int>(findInt64(json, key, fallback));
}
double findDouble(const std::string& json, const std::string& key, double fallback) {
auto pos = json.find("\"" + key + "\"");
if (pos == std::string::npos) {
return fallback;
}
pos = json.find(':', pos);
if (pos == std::string::npos) {
return fallback;
}
pos += 1;
while (pos < json.size() && std::isspace(static_cast<unsigned char>(json[pos]))) {
pos += 1;
}
try {
return std::stod(json.substr(pos));
} catch (...) {
return fallback;
}
}
std::string findString(const std::string& json, const std::string& key) {
auto pos = json.find("\"" + key + "\"");
if (pos == std::string::npos) {
return {};
}
pos = json.find(':', pos);
if (pos == std::string::npos) {
return {};
}
pos += 1;
while (pos < json.size() && std::isspace(static_cast<unsigned char>(json[pos]))) {
pos += 1;
}
if (pos >= json.size() || json[pos] != '"') {
return {};
}
pos += 1;
std::string result;
while (pos < json.size()) {
const char c = json[pos++];
if (c == '"') {
break;
}
if (c == '\\' && pos < json.size()) {
const char escaped = json[pos++];
switch (escaped) {
case '\\':
case '"':
case '/':
result.push_back(escaped);
break;
case 'n':
result.push_back('\n');
break;
case 'r':
result.push_back('\r');
break;
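case 't':
result.push_back('\t');
break;
case 'b':
result.push_back('\b');
break;
case 'f':
result.push_back('\f');
break;
default:
// \uXXXX escapes are not decoded; the character after the backslash is kept as-is.
result.push_back(escaped);
break;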
}
continue;
}
result.push_back(c);
}
return result;
}
std::string parseWindowHandleFromSourceId(const std::string& sourceId) {
constexpr char prefix[] = "window:";
if (sourceId.rfind(prefix, 0) != 0) {
return {};
}
const size_t start = sizeof(prefix) - 1;
const size_t end = sourceId.find(':', start);
const std::string handle = sourceId.substr(start, end == std::string::npos ? std::string::npos : end - start);
return handle;
}
HWND parseWindowHandle(const std::string& value) {
if (value.empty()) {
return nullptr;
}
try {
size_t parsed = 0;
const int base = value.rfind("0x", 0) == 0 || value.rfind("0X", 0) == 0 ? 16 : 10;
const uint64_t handleValue = std::stoull(value, &parsed, base);
if (parsed != value.size() || handleValue == 0) {
return nullptr;
}
return reinterpret_cast<HWND>(static_cast<uintptr_t>(handleValue));
} catch (...) {
return nullptr;
}
}
bool parseConfig(const std::string& json, CaptureConfig& config) {
config.schemaVersion = findInt(json, "schemaVersion", 1);
config.outputPath = findString(json, "screenPath");
if (config.outputPath.empty()) {
config.outputPath = findString(json, "outputPath");
}
if (config.outputPath.empty()) {
return false;
}
config.recordingId = findInt64(json, "recordingId", 0);
config.sourceType = findString(json, "sourceType");
if (config.sourceType.empty()) {
config.sourceType = "display";
}
config.sourceId = findString(json, "sourceId");
config.windowHandle = findString(json, "windowHandle");
if (config.windowHandle.empty()) {
config.windowHandle = parseWindowHandleFromSourceId(config.sourceId);
}
config.displayId = findInt64(json, "displayId", 0);
config.fps = std::clamp(findInt(json, "fps", 60), 1, 120);
config.width = findInt(json, "videoWidth", findInt(json, "width", 0));
config.height = findInt(json, "videoHeight", findInt(json, "height", 0));
config.bounds.x = findInt(json, "displayX", 0);
config.bounds.y = findInt(json, "displayY", 0);
config.bounds.width = findInt(json, "displayW", 0);
config.bounds.height = findInt(json, "displayH", 0);
config.hasDisplayBounds = findBool(json, "hasDisplayBounds", false);
config.captureSystemAudio = findBool(json, "captureSystemAudio", false);
config.captureMic = findBool(json, "captureMic", false);
config.captureCursor = findBool(json, "captureCursor", false);
config.webcamEnabled = findBool(json, "webcamEnabled", false);
config.microphoneDeviceId = findString(json, "microphoneDeviceId");
config.microphoneDeviceName = findString(json, "microphoneDeviceName");
config.microphoneGain = findDouble(json, "microphoneGain", 1.0);
config.webcamDeviceId = findString(json, "webcamDeviceId");
config.webcamDeviceName = findString(json, "webcamDeviceName");
config.webcamDirectShowClsid = findString(json, "webcamDirectShowClsid");
config.webcamOutputPath = findString(json, "webcamPath");
config.webcamWidth = findInt(json, "webcamWidth", 0);
config.webcamHeight = findInt(json, "webcamHeight", 0);
config.webcamFps = findInt(json, "webcamFps", 0);
return true;
}
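// Reads newline-delimited commands from stdin. "stop", "q", and "quit" end the
// session; "pause" and "resume" toggle the paused state and emit a JSON event.
// Unknown lines are ignored, and end of input (closed stdin) is treated as stop.
void readCaptureCommands(CaptureControl& control, const std::function<void(bool)>& onPauseChanged) {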
std::string line;
while (std::getline(std::cin, line)) {
if (line == "stop" || line == "q" || line == "quit") {
control.stopRequested = true;
control.cv.notify_all();
return;
}
if (line == "pause") {
control.setPaused(true);
onPauseChanged(true);
std::cout << "{\"event\":\"recording-paused\",\"schemaVersion\":2}" << std::endl;
control.cv.notify_all();
continue;
}
if (line == "resume") {
control.setPaused(false);
onPauseChanged(false);
std::cout << "{\"event\":\"recording-resumed\",\"schemaVersion\":2}" << std::endl;
control.cv.notify_all();
continue;
}
}
control.stopRequested = true;
control.cv.notify_all();
}
} // namespace
int main(int argc, char* argv[]) {
if (argc < 2) {
std::cerr << "ERROR: Missing JSON config argument" << std::endl;
return 1;
}
winrt::init_apartment(winrt::apartment_type::multi_threaded);
CaptureConfig config;
if (!parseConfig(argv[1], config)) {
std::cerr << "ERROR: Failed to parse config JSON" << std::endl;
return 1;
}
std::cout << "{\"event\":\"ready\",\"schemaVersion\":2}" << std::endl;
WgcSession session;
if (config.sourceType == "display") {
HMONITOR monitor = findMonitorForCapture(
config.displayId,
config.hasDisplayBounds ? &config.bounds : nullptr);
if (!monitor) {
std::cerr << "ERROR: Could not resolve monitor" << std::endl;
return 1;
}
if (!session.initialize(monitor, config.fps, config.captureCursor)) {
std::cerr << "ERROR: Failed to initialize WGC display session" << std::endl;
return 1;
}
} else if (config.sourceType == "window") {
HWND window = parseWindowHandle(config.windowHandle);
if (!window || !IsWindow(window)) {
std::cerr << "ERROR: Native window capture requires a valid HWND" << std::endl;
return 1;
}
if (!session.initialize(window, config.fps, config.captureCursor)) {
std::cerr << "ERROR: Failed to initialize WGC window session" << std::endl;
return 1;
}
} else {
std::cerr << "ERROR: Unsupported native capture source type: " << config.sourceType << std::endl;
return 1;
}
// WGC owns the captured texture size. Encoding uses that size, rounded down to
// even dimensions, until a dedicated GPU scaling pass is introduced;
// CopyResource requires matching resource dimensions.
int width = session.captureWidth();
int height = session.captureHeight();
width = (std::max(2, width) / 2) * 2;
height = (std::max(2, height) / 2) * 2;
const int pixels = width * height;
const int bitrate = pixels >= 3840 * 2160 ? 45'000'000 : pixels >= 2560 * 1440 ? 28'000'000 : 18'000'000;
WebcamCapture webcamCapture;
bool webcamActive = false;
bool writeSeparateWebcam = false;
if (config.webcamEnabled) {
if (!webcamCapture.initialize(
utf8ToWide(config.webcamDeviceId),
utf8ToWide(config.webcamDeviceName),
utf8ToWide(config.webcamDirectShowClsid),
config.webcamWidth,
config.webcamHeight,
config.webcamFps > 0 ? config.webcamFps : config.fps)) {
std::cerr << "ERROR: Failed to initialize native webcam capture" << std::endl;
return 1;
}
std::cout << "{\"event\":\"webcam-format\",\"schemaVersion\":2,\"width\":" << webcamCapture.width()
<< ",\"height\":" << webcamCapture.height()
<< ",\"fps\":" << webcamCapture.fps()
<< ",\"deviceName\":\"" << jsonEscape(wideToUtf8(webcamCapture.selectedDeviceName()))
<< "\"}" << std::endl;
writeSeparateWebcam = !config.webcamOutputPath.empty();
}
WasapiLoopbackCapture loopbackCapture;
WasapiLoopbackCapture microphoneCapture;
const AudioInputFormat* audioFormat = nullptr;
AudioInputFormat encoderAudioFormat{};
AudioInputFormat systemAudioFormat{};
AudioInputFormat microphoneAudioFormat{};
if (config.captureSystemAudio) {
if (!loopbackCapture.initializeSystemLoopback()) {
std::cerr << "ERROR: Failed to initialize WASAPI loopback capture" << std::endl;
return 1;
}
systemAudioFormat = loopbackCapture.inputFormat();
audioFormat = &loopbackCapture.inputFormat();
}
if (config.captureMic) {
if (!microphoneCapture.initializeMicrophone(
utf8ToWide(config.microphoneDeviceId),
utf8ToWide(config.microphoneDeviceName))) {
std::cerr << "ERROR: Failed to initialize WASAPI microphone capture" << std::endl;
return 1;
}
microphoneAudioFormat = microphoneCapture.inputFormat();
if (!audioFormat) {
audioFormat = &microphoneCapture.inputFormat();
}
}
if (audioFormat) {
std::cout << "{\"event\":\"audio-format\",\"schemaVersion\":2,\"sampleRate\":" << audioFormat->sampleRate
<< ",\"channels\":" << audioFormat->channels
<< ",\"bitsPerSample\":" << audioFormat->bitsPerSample
<< ",\"system\":" << (config.captureSystemAudio ? "true" : "false")
<< ",\"microphone\":" << (config.captureMic ? "true" : "false");
if (config.captureMic) {
std::cout << ",\"microphoneDeviceName\":\""
<< jsonEscape(wideToUtf8(microphoneCapture.selectedDeviceName())) << "\"";
}
std::cout << "}" << std::endl;
encoderAudioFormat = makeAacCompatibleAudioFormat(*audioFormat);
std::cout << "{\"event\":\"encoder-audio-format\",\"schemaVersion\":2,\"sampleRate\":"
<< encoderAudioFormat.sampleRate
<< ",\"channels\":" << encoderAudioFormat.channels
<< ",\"bitsPerSample\":" << encoderAudioFormat.bitsPerSample
<< "}" << std::endl;
}
MFEncoder encoder;
if (!encoder.initialize(
utf8ToWide(config.outputPath),
width,
height,
config.fps,
bitrate,
session.device(),
session.context(),
audioFormat ? &encoderAudioFormat : nullptr)) {
std::cerr << "ERROR: Failed to initialize Media Foundation encoder" << std::endl;
return 1;
}
MFEncoder webcamEncoder;
if (writeSeparateWebcam) {
const int webcamPixels = std::max(1, webcamCapture.width()) * std::max(1, webcamCapture.height());
const int webcamBitrate = webcamPixels >= 1280 * 720 ? 8'000'000 : 4'000'000;
if (!webcamEncoder.initialize(
utf8ToWide(config.webcamOutputPath),
webcamCapture.width(),
webcamCapture.height(),
webcamCapture.fps(),
webcamBitrate,
session.device(),
session.context(),
nullptr)) {
std::cerr << "ERROR: Failed to initialize native webcam encoder" << std::endl;
return 1;
}
}
std::mutex mutex;
CaptureControl control;
std::atomic<bool> firstFrameWritten = false;
std::atomic<bool> encodeFailed = false;
Microsoft::WRL::ComPtr<ID3D11Texture2D> latestFrameTexture;
int64_t latestFrameTimestampHns = 0;
int64_t firstFrameTimestampHns = -1;
std::vector<BYTE> latestWebcamFrame;
int latestWebcamWidth = 0;
int latestWebcamHeight = 0;
uint64_t latestWebcamSequence = 0;
bool hasVisibleWebcamFrame = false;
session.setFrameCallback([&](ID3D11Texture2D* texture, int64_t timestampHns) {
if (control.stopRequested || control.paused) {
return;
}
std::scoped_lock lock(mutex);
if (!latestFrameTexture) {
D3D11_TEXTURE2D_DESC desc{};
texture->GetDesc(&desc);
desc.BindFlags = 0;
desc.CPUAccessFlags = 0;
desc.MiscFlags = 0;
if (FAILED(session.device()->CreateTexture2D(&desc, nullptr, &latestFrameTexture))) {
encodeFailed = true;
control.stopRequested = true;
control.cv.notify_all();
return;
}
}
session.context()->CopyResource(latestFrameTexture.Get(), texture);
latestFrameTimestampHns = timestampHns;
if (!firstFrameWritten.exchange(true)) {
control.cv.notify_all();
}
});
auto writeVideoFrames = [&]() {
const auto frameDuration = std::chrono::duration_cast<std::chrono::steady_clock::duration>(
std::chrono::duration<double>(1.0 / config.fps));
uint64_t frameIndex = 0;
uint64_t lastWrittenWebcamSequence = 0;
uint64_t webcamOutputFrameIndex = 0;
int64_t lastEncodedVideoTimestampHns = -1;
while (!control.stopRequested && !encodeFailed) {
{
std::unique_lock lock(mutex);
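// Commands from stdin notify without holding this mutex, so a wake-up can be
// missed while paused. Re-check the predicate on a short timeout instead of
// blocking indefinitely.
while (!control.cv.wait_for(lock, std::chrono::milliseconds(100), [&] {
return control.stopRequested.load() ||
encodeFailed.load() ||
(!control.paused.load() && latestFrameTexture);
})) {
}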
if (control.stopRequested || encodeFailed) {
break;
}
if (webcamActive) {
WebcamFrameSnapshot candidateWebcamFrame;
if (webcamCapture.copyLatestFrame(candidateWebcamFrame) &&
candidateWebcamFrame.sequence != latestWebcamSequence &&
hasVisibleBgraContent(candidateWebcamFrame.data)) {
latestWebcamFrame = std::move(candidateWebcamFrame.data);
latestWebcamWidth = candidateWebcamFrame.width;
latestWebcamHeight = candidateWebcamFrame.height;
latestWebcamSequence = candidateWebcamFrame.sequence;
hasVisibleWebcamFrame = true;
}
}
const BgraFrameView webcamFrame{
hasVisibleWebcamFrame && !latestWebcamFrame.empty() ? latestWebcamFrame.data() : nullptr,
latestWebcamWidth,
latestWebcamHeight,
};
const int64_t syntheticTimestampHns =
static_cast<int64_t>((frameIndex * 10'000'000ULL) / config.fps);
const int64_t sourceTimestampHns =
latestFrameTimestampHns > 0 ? latestFrameTimestampHns : syntheticTimestampHns;
if (firstFrameTimestampHns < 0) {
firstFrameTimestampHns = sourceTimestampHns;
}
int64_t frameTimestampHns =
std::max<int64_t>(
0,
sourceTimestampHns - firstFrameTimestampHns - control.pausedDurationHns());
if (lastEncodedVideoTimestampHns >= 0 &&
frameTimestampHns <= lastEncodedVideoTimestampHns) {
frameTimestampHns =
lastEncodedVideoTimestampHns + static_cast<int64_t>(10'000'000ULL / config.fps);
}
if (writeSeparateWebcam && webcamFrame.data &&
latestWebcamSequence != lastWrittenWebcamSequence) {
const int64_t webcamTimestampHns = static_cast<int64_t>(
(webcamOutputFrameIndex * 10'000'000ULL) / std::max(1, webcamCapture.fps()));
if (!webcamEncoder.writeBgraFrame(webcamFrame, webcamTimestampHns)) {
encodeFailed = true;
control.stopRequested = true;
control.cv.notify_all();
return;
}
lastWrittenWebcamSequence = latestWebcamSequence;
webcamOutputFrameIndex += 1;
}
if (latestFrameTexture && !encoder.writeFrame(
latestFrameTexture.Get(),
frameTimestampHns,
!writeSeparateWebcam && webcamFrame.data ? &webcamFrame : nullptr)) {
encodeFailed = true;
control.stopRequested = true;
control.cv.notify_all();
return;
}
if (latestFrameTexture) {
lastEncodedVideoTimestampHns = frameTimestampHns;
}
}
frameIndex += 1;
std::this_thread::sleep_for(frameDuration);
}
};
std::thread videoWriterThread;
auto stopVideoWriter = [&]() {
if (videoWriterThread.joinable()) {
videoWriterThread.join();
}
};
auto startVideoWriter = [&]() {
videoWriterThread = std::thread(writeVideoFrames);
};
std::unique_ptr<AudioMixer> audioMixer;
auto startAudioCaptures = [&]() -> bool {
if (!audioFormat) {
return true;
}
audioMixer = std::make_unique<AudioMixer>(
encoderAudioFormat,
config.captureSystemAudio ? systemAudioFormat : encoderAudioFormat,
config.captureMic ? microphoneAudioFormat : encoderAudioFormat,
config.captureSystemAudio,
config.captureMic,
config.microphoneGain,
[&](const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns) {
if (!encoder.writeAudio(data, byteCount, timestampHns, durationHns)) {
encodeFailed = true;
control.stopRequested = true;
control.cv.notify_all();
return false;
}
return true;
});
if (!audioMixer->start()) {
std::cerr << "ERROR: Failed to start native audio mixer" << std::endl;
return false;
}
if (config.captureMic) {
if (!microphoneCapture.start([&](const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns) {
(void)timestampHns;
(void)durationHns;
if (control.stopRequested || !audioMixer) {
return;
}
audioMixer->pushMicrophone(data, byteCount);
})) {
std::cerr << "ERROR: Failed to start WASAPI microphone capture" << std::endl;
audioMixer->stop();
return false;
}
}
if (config.captureSystemAudio) {
if (!loopbackCapture.start([&](const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns) {
(void)timestampHns;
(void)durationHns;
if (control.stopRequested || !audioMixer) {
return;
}
audioMixer->pushSystem(data, byteCount);
})) {
std::cerr << "ERROR: Failed to start WASAPI loopback capture" << std::endl;
microphoneCapture.stop();
audioMixer->stop();
return false;
}
}
return true;
};
if (!startAudioCaptures()) {
return 1;
}
if (config.webcamEnabled) {
if (!webcamCapture.start()) {
microphoneCapture.stop();
loopbackCapture.stop();
if (audioMixer) {
audioMixer->stop();
}
std::cerr << "ERROR: Failed to start native webcam capture" << std::endl;
return 1;
}
webcamActive = true;
const auto webcamDeadline = std::chrono::steady_clock::now() + std::chrono::seconds(3);
while (std::chrono::steady_clock::now() < webcamDeadline && !hasVisibleWebcamFrame) {
WebcamFrameSnapshot candidateWebcamFrame;
if (webcamCapture.copyLatestFrame(candidateWebcamFrame) &&
hasVisibleBgraContent(candidateWebcamFrame.data)) {
latestWebcamFrame = std::move(candidateWebcamFrame.data);
latestWebcamWidth = candidateWebcamFrame.width;
latestWebcamHeight = candidateWebcamFrame.height;
latestWebcamSequence = candidateWebcamFrame.sequence;
hasVisibleWebcamFrame = true;
break;
}
std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
if (!hasVisibleWebcamFrame) {
std::cerr << "WARNING: Native webcam started but no visible frame was available before screen capture"
<< std::endl;
}
}
if (!session.start()) {
webcamCapture.stop();
microphoneCapture.stop();
loopbackCapture.stop();
if (audioMixer) {
audioMixer->stop();
}
std::cerr << "ERROR: Failed to start WGC session" << std::endl;
return 1;
}
std::thread stdinThread(readCaptureCommands, std::ref(control), [&](bool isPaused) {
if (audioMixer) {
audioMixer->setPaused(isPaused);
}
});
{
std::unique_lock lock(mutex);
const bool started = control.cv.wait_for(lock, std::chrono::seconds(10), [&] {
return firstFrameWritten.load() || control.stopRequested.load();
});
if (!started || !firstFrameWritten) {
control.stopRequested = true;
control.cv.notify_all();
if (stdinThread.joinable()) {
stdinThread.detach();
}
microphoneCapture.stop();
loopbackCapture.stop();
webcamCapture.stop();
if (audioMixer) {
audioMixer->stop();
}
session.stop();
std::cerr << "ERROR: Timed out waiting for first WGC frame" << std::endl;
return 1;
}
}
if (audioMixer) {
audioMixer->beginTimeline();
}
startVideoWriter();
std::cout << "{\"event\":\"recording-started\",\"schemaVersion\":2}" << std::endl;
std::cout << "Recording started" << std::endl;
{
std::unique_lock lock(mutex);
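// The stdin thread sets stopRequested and notifies without holding this mutex,
// so a notification can be missed between the predicate check and the wait.
// Re-check on a short timeout instead of blocking indefinitely.
while (!control.cv.wait_for(lock, std::chrono::milliseconds(100), [&] {
return control.stopRequested.load();
})) {
}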
}
microphoneCapture.stop();
loopbackCapture.stop();
webcamCapture.stop();
if (audioMixer) {
audioMixer->stop();
}
stopVideoWriter();
session.stop();
{
std::scoped_lock lock(mutex);
encoder.finalize();
if (writeSeparateWebcam) {
webcamEncoder.finalize();
}
}
if (stdinThread.joinable()) {
stdinThread.detach();
}
if (encodeFailed) {
std::cerr << "ERROR: Failed to encode WGC frame" << std::endl;
return 1;
}
std::cout << "{\"event\":\"recording-stopped\",\"schemaVersion\":2,\"screenPath\":\""
<< jsonEscape(config.outputPath) << "\"";
if (writeSeparateWebcam) {
std::cout << ",\"webcamPath\":\"" << jsonEscape(config.webcamOutputPath) << "\"";
}
std::cout << "}" << std::endl;
std::cout << "Recording stopped. Output path: " << config.outputPath << std::endl;
return 0;
}
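// The timestamp handling in writeVideoFrames can be summarised as a pure function.
// The sketch below assumes a hypothetical normalizeFrameTimestampHns helper; it is
// not a function in this helper, only a restatement of the arithmetic: rebase on the
// first frame, subtract paused time, clamp at zero, and keep timestamps increasing.

```cpp
#include <algorithm>
#include <cstdint>

constexpr std::int64_t kHnsPerSecond = 10'000'000;  // 100 ns units per second.

// Hypothetical helper. lastEncodedHns is -1 until a frame has been encoded.
inline std::int64_t normalizeFrameTimestampHns(
    std::int64_t sourceHns,
    std::int64_t firstHns,
    std::int64_t pausedHns,
    std::int64_t lastEncodedHns,
    int fps) {
    const std::int64_t frameStepHns = kHnsPerSecond / std::max(1, fps);
    // Rebase onto the recording timeline and remove time spent paused.
    std::int64_t timestampHns =
        std::max<std::int64_t>(0, sourceHns - firstHns - pausedHns);
    // A repeated or rewound source timestamp is pushed one frame step forward.
    if (lastEncodedHns >= 0 && timestampHns <= lastEncodedHns) {
        timestampHns = lastEncodedHns + frameStepHns;
    }
    return timestampHns;
}
```

// Keeping the rule in one function would make it straightforward to unit test the
// pause and duplicate-frame cases separately from the capture loop.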
@@ -0,0 +1,450 @@
#include "mf_encoder.h"
#include "audio_sample_utils.h"
#include <mfapi.h>
#include <mferror.h>
#include <propvarutil.h>
#include <algorithm>
#include <cstring>
#include <iostream>
namespace {
bool succeeded(HRESULT hr, const char* label) {
if (SUCCEEDED(hr)) {
return true;
}
std::cerr << "ERROR: " << label << " failed (hr=0x" << std::hex << hr << std::dec << ")"
<< std::endl;
return false;
}
void setFrameSize(IMFMediaType* type, UINT32 width, UINT32 height) {
MFSetAttributeSize(type, MF_MT_FRAME_SIZE, width, height);
}
void setFrameRate(IMFMediaType* type, UINT32 fps) {
MFSetAttributeRatio(type, MF_MT_FRAME_RATE, fps, 1);
}
void setPixelAspectRatio(IMFMediaType* type) {
MFSetAttributeRatio(type, MF_MT_PIXEL_ASPECT_RATIO, 1, 1);
}
void setAudioFormat(IMFMediaType* type, UINT32 channels, UINT32 sampleRate, UINT32 bitsPerSample) {
type->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, channels);
type->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, sampleRate);
type->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, bitsPerSample);
}
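// Draws the webcam frame as an opaque picture-in-picture overlay in the
// bottom-right corner of a tightly packed BGRA destination buffer, using
// nearest-neighbour scaling. The overlay is at most a quarter of the frame width
// and a third of its height, and it keeps the webcam aspect ratio.
void compositeWebcam(BYTE* destination, int width, int height, const BgraFrameView& webcamFrame) {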
if (!webcamFrame.data || webcamFrame.width <= 0 || webcamFrame.height <= 0 || width <= 0 || height <= 0) {
return;
}
const int margin = std::max(16, std::min(width, height) / 60);
const int maxOverlayWidth = std::max(2, width / 4);
int overlayWidth = maxOverlayWidth;
int overlayHeight = static_cast<int>(
(static_cast<int64_t>(overlayWidth) * webcamFrame.height) / std::max(1, webcamFrame.width));
const int maxOverlayHeight = std::max(2, height / 3);
if (overlayHeight > maxOverlayHeight) {
overlayHeight = maxOverlayHeight;
overlayWidth = static_cast<int>(
(static_cast<int64_t>(overlayHeight) * webcamFrame.width) / std::max(1, webcamFrame.height));
}
overlayWidth = std::max(2, std::min(overlayWidth, width - margin * 2));
overlayHeight = std::max(2, std::min(overlayHeight, height - margin * 2));
const int originX = std::max(0, width - overlayWidth - margin);
const int originY = std::max(0, height - overlayHeight - margin);
for (int y = 0; y < overlayHeight; y += 1) {
const int sourceY = static_cast<int>((static_cast<int64_t>(y) * webcamFrame.height) / overlayHeight);
BYTE* destinationRow = destination + ((originY + y) * width + originX) * 4;
for (int x = 0; x < overlayWidth; x += 1) {
const int sourceX = static_cast<int>((static_cast<int64_t>(x) * webcamFrame.width) / overlayWidth);
const BYTE* source = webcamFrame.data + (sourceY * webcamFrame.width + sourceX) * 4;
BYTE* target = destinationRow + x * 4;
target[0] = source[0];
target[1] = source[1];
target[2] = source[2];
target[3] = 255;
}
}
}
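// The overlay placement in compositeWebcam can be checked without any pixel data.
// This sketch assumes a hypothetical OverlayRect type and computeOverlayRect helper
// that repeat only the layout arithmetic from the function above.

```cpp
#include <algorithm>
#include <cstdint>

struct OverlayRect {
    int x;
    int y;
    int width;
    int height;
};

// Hypothetical helper: returns the bottom-right overlay rectangle for a webcam
// frame of cameraWidth x cameraHeight drawn onto a width x height video frame.
inline OverlayRect computeOverlayRect(int width, int height, int cameraWidth, int cameraHeight) {
    const int margin = std::max(16, std::min(width, height) / 60);
    int overlayWidth = std::max(2, width / 4);
    int overlayHeight = static_cast<int>(
        (static_cast<std::int64_t>(overlayWidth) * cameraHeight) / std::max(1, cameraWidth));
    const int maxOverlayHeight = std::max(2, height / 3);
    if (overlayHeight > maxOverlayHeight) {
        // Too tall: fix the height and derive the width from the aspect ratio.
        overlayHeight = maxOverlayHeight;
        overlayWidth = static_cast<int>(
            (static_cast<std::int64_t>(overlayHeight) * cameraWidth) / std::max(1, cameraHeight));
    }
    overlayWidth = std::max(2, std::min(overlayWidth, width - margin * 2));
    overlayHeight = std::max(2, std::min(overlayHeight, height - margin * 2));
    return OverlayRect{
        std::max(0, width - overlayWidth - margin),
        std::max(0, height - overlayHeight - margin),
        overlayWidth,
        overlayHeight,
    };
}
```

// For a 1920x1080 frame the margin is 18 pixels, so a 16:9 webcam is limited by the
// width rule while a portrait webcam is limited by the one-third height rule.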
} // namespace
MFEncoder::~MFEncoder() {
finalize();
}
bool MFEncoder::initialize(
const std::wstring& outputPath,
int width,
int height,
int fps,
int bitrate,
ID3D11Device* device,
ID3D11DeviceContext* context,
const AudioInputFormat* audioFormat) {
width_ = (std::max(2, width) / 2) * 2;
height_ = (std::max(2, height) / 2) * 2;
fps_ = std::max(1, fps);
device_ = device;
context_ = context;
if (!succeeded(MFStartup(MF_VERSION), "MFStartup")) {
return false;
}
Microsoft::WRL::ComPtr<IMFMediaType> outputType;
if (!succeeded(MFCreateMediaType(&outputType), "MFCreateMediaType(output)")) {
return false;
}
outputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
outputType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264);
outputType->SetUINT32(MF_MT_AVG_BITRATE, static_cast<UINT32>(std::max(1, bitrate)));
outputType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
setFrameSize(outputType.Get(), static_cast<UINT32>(width_), static_cast<UINT32>(height_));
setFrameRate(outputType.Get(), static_cast<UINT32>(fps_));
setPixelAspectRatio(outputType.Get());
if (!succeeded(MFCreateSinkWriterFromURL(outputPath.c_str(), nullptr, nullptr, &sinkWriter_),
"MFCreateSinkWriterFromURL")) {
return false;
}
if (!succeeded(sinkWriter_->AddStream(outputType.Get(), &videoStreamIndex_), "AddStream")) {
return false;
}
if (audioFormat && !configureAudioStream(*audioFormat)) {
return false;
}
Microsoft::WRL::ComPtr<IMFMediaType> inputType;
if (!succeeded(MFCreateMediaType(&inputType), "MFCreateMediaType(input)")) {
return false;
}
inputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
inputType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);
inputType->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);
inputType->SetUINT32(MF_MT_DEFAULT_STRIDE, static_cast<UINT32>(width_ * 4));
setFrameSize(inputType.Get(), static_cast<UINT32>(width_), static_cast<UINT32>(height_));
setFrameRate(inputType.Get(), static_cast<UINT32>(fps_));
setPixelAspectRatio(inputType.Get());
if (!succeeded(sinkWriter_->SetInputMediaType(videoStreamIndex_, inputType.Get(), nullptr),
"SetInputMediaType")) {
return false;
}
if (!succeeded(sinkWriter_->BeginWriting(), "BeginWriting")) {
return false;
}
return true;
}
bool MFEncoder::configureAudioStream(const AudioInputFormat& audioFormat) {
if (!sinkWriter_) {
return false;
}
if (audioFormat.sampleRate == 0 || audioFormat.channels == 0 || audioFormat.blockAlign == 0) {
std::cerr << "ERROR: Invalid audio input format" << std::endl;
return false;
}
const AudioInputFormat encoderFormat = makeAacCompatibleAudioFormat(audioFormat);
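// 24,000 bytes per second is 192 kbit/s, the highest of the fixed rates the
// Media Foundation AAC encoder accepts (12000, 16000, 20000, 24000).
const UINT32 aacBytesPerSecond = 24'000;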
Microsoft::WRL::ComPtr<IMFMediaType> outputType;
if (!succeeded(MFCreateMediaType(&outputType), "MFCreateMediaType(audio output)")) {
return false;
}
outputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
outputType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_AAC);
setAudioFormat(outputType.Get(), encoderFormat.channels, encoderFormat.sampleRate, 16);
outputType->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, aacBytesPerSecond);
outputType->SetUINT32(MF_MT_AAC_PAYLOAD_TYPE, 0);
if (!succeeded(sinkWriter_->AddStream(outputType.Get(), &audioStreamIndex_), "AddStream(audio)")) {
return false;
}
Microsoft::WRL::ComPtr<IMFMediaType> inputType;
if (!succeeded(MFCreateMediaType(&inputType), "MFCreateMediaType(audio input)")) {
return false;
}
inputType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
inputType->SetGUID(MF_MT_SUBTYPE, encoderFormat.subtype);
setAudioFormat(inputType.Get(), encoderFormat.channels, encoderFormat.sampleRate, encoderFormat.bitsPerSample);
inputType->SetUINT32(MF_MT_AUDIO_BLOCK_ALIGNMENT, encoderFormat.blockAlign);
inputType->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, encoderFormat.avgBytesPerSec);
inputType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);
if (!succeeded(sinkWriter_->SetInputMediaType(audioStreamIndex_, inputType.Get(), nullptr),
"SetInputMediaType(audio)")) {
return false;
}
hasAudioStream_ = true;
return true;
}
bool MFEncoder::ensureStagingTexture(ID3D11Texture2D* texture) {
if (stagingTexture_) {
return true;
}
D3D11_TEXTURE2D_DESC desc{};
texture->GetDesc(&desc);
desc.Width = static_cast<UINT>(width_);
desc.Height = static_cast<UINT>(height_);
desc.MipLevels = 1;
desc.ArraySize = 1;
desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.SampleDesc.Quality = 0;
desc.Usage = D3D11_USAGE_STAGING;
desc.BindFlags = 0;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
desc.MiscFlags = 0;
return succeeded(device_->CreateTexture2D(&desc, nullptr, &stagingTexture_),
"CreateTexture2D(staging)");
}
bool MFEncoder::copyFrameToBuffer(
ID3D11Texture2D* texture,
BYTE* destination,
DWORD destinationSize,
const BgraFrameView* webcamFrame) {
if (!ensureStagingTexture(texture)) {
return false;
}
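// The staging texture uses the even-rounded encoder size, which can be one pixel
// smaller than the captured texture. CopyResource requires identical dimensions,
// so copy only the encoder-sized region.
const D3D11_BOX sourceRegion{0, 0, 0, static_cast<UINT>(width_), static_cast<UINT>(height_), 1};
context_->CopySubresourceRegion(stagingTexture_.Get(), 0, 0, 0, 0, texture, 0, &sourceRegion);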
D3D11_MAPPED_SUBRESOURCE mapped{};
if (!succeeded(context_->Map(stagingTexture_.Get(), 0, D3D11_MAP_READ, 0, &mapped), "Map")) {
return false;
}
const DWORD rowBytes = static_cast<DWORD>(width_ * 4);
const DWORD requiredBytes = rowBytes * static_cast<DWORD>(height_);
if (destinationSize < requiredBytes) {
context_->Unmap(stagingTexture_.Get(), 0);
std::cerr << "ERROR: Media Foundation buffer is too small" << std::endl;
return false;
}
auto* source = static_cast<const BYTE*>(mapped.pData);
for (int y = 0; y < height_; y += 1) {
std::memcpy(destination + rowBytes * y, source + mapped.RowPitch * y, rowBytes);
}
if (webcamFrame) {
compositeWebcam(destination, width_, height_, *webcamFrame);
}
context_->Unmap(stagingTexture_.Get(), 0);
return true;
}
bool MFEncoder::copyBgraFrameToBuffer(const BgraFrameView& frame, BYTE* destination, DWORD destinationSize) {
if (!frame.data || frame.width <= 0 || frame.height <= 0) {
return false;
}
const DWORD rowBytes = static_cast<DWORD>(width_ * 4);
const DWORD requiredBytes = rowBytes * static_cast<DWORD>(height_);
if (destinationSize < requiredBytes) {
std::cerr << "ERROR: Media Foundation webcam buffer is too small" << std::endl;
return false;
}
if (frame.width == width_ && frame.height == height_) {
for (DWORD i = 0; i < requiredBytes; i += 4) {
destination[i] = frame.data[i];
destination[i + 1] = frame.data[i + 1];
destination[i + 2] = frame.data[i + 2];
destination[i + 3] = 255;
}
return true;
}
for (int y = 0; y < height_; y += 1) {
const int sourceY = static_cast<int>((static_cast<int64_t>(y) * frame.height) / height_);
BYTE* destinationRow = destination + rowBytes * y;
for (int x = 0; x < width_; x += 1) {
const int sourceX = static_cast<int>((static_cast<int64_t>(x) * frame.width) / width_);
const BYTE* source = frame.data + (sourceY * frame.width + sourceX) * 4;
BYTE* target = destinationRow + x * 4;
target[0] = source[0];
target[1] = source[1];
target[2] = source[2];
target[3] = 255;
}
}
return true;
}
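// The resize path in copyBgraFrameToBuffer is plain nearest-neighbour sampling. The
// sketch below restates it over std::vector so it can run without Media Foundation;
// scaleBgraNearest is a hypothetical name and, like the original, it assumes tightly
// packed 4-byte BGRA pixels and forces the alpha channel to 255.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns an empty vector when the arguments are invalid.
inline std::vector<std::uint8_t> scaleBgraNearest(
    const std::vector<std::uint8_t>& source,
    int sourceWidth,
    int sourceHeight,
    int targetWidth,
    int targetHeight) {
    if (sourceWidth <= 0 || sourceHeight <= 0 || targetWidth <= 0 || targetHeight <= 0) {
        return {};
    }
    const std::size_t sourceBytes = static_cast<std::size_t>(sourceWidth) * sourceHeight * 4;
    if (source.size() < sourceBytes) {
        return {};
    }
    std::vector<std::uint8_t> target(static_cast<std::size_t>(targetWidth) * targetHeight * 4);
    for (int y = 0; y < targetHeight; ++y) {
        // Map each target row and column back to the nearest source pixel.
        const int sourceY = static_cast<int>(
            (static_cast<std::int64_t>(y) * sourceHeight) / targetHeight);
        for (int x = 0; x < targetWidth; ++x) {
            const int sourceX = static_cast<int>(
                (static_cast<std::int64_t>(x) * sourceWidth) / targetWidth);
            const std::size_t from =
                (static_cast<std::size_t>(sourceY) * sourceWidth + sourceX) * 4;
            const std::size_t to = (static_cast<std::size_t>(y) * targetWidth + x) * 4;
            target[to] = source[from];
            target[to + 1] = source[from + 1];
            target[to + 2] = source[from + 2];
            target[to + 3] = 255;  // Always opaque.
        }
    }
    return target;
}
```

// Nearest-neighbour sampling is cheap but produces blocky output when upscaling; a
// filtered resize would need a different sampling step.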
bool MFEncoder::writeFrame(ID3D11Texture2D* texture, int64_t timestampHns, const BgraFrameView* webcamFrame) {
std::scoped_lock writerLock(writerMutex_);
if (!sinkWriter_ || finalized_) {
return false;
}
if (firstTimestampHns_ < 0) {
firstTimestampHns_ = timestampHns;
}
int64_t sampleTime = timestampHns - firstTimestampHns_;
if (sampleTime <= lastTimestampHns_) {
sampleTime = lastTimestampHns_ + (10'000'000LL / fps_);
}
const int64_t sampleDuration = 10'000'000LL / fps_;
lastTimestampHns_ = sampleTime;
Microsoft::WRL::ComPtr<IMFMediaBuffer> buffer;
const DWORD frameBytes = static_cast<DWORD>(width_ * height_ * 4);
if (!succeeded(MFCreateMemoryBuffer(frameBytes, &buffer), "MFCreateMemoryBuffer")) {
return false;
}
BYTE* data = nullptr;
DWORD maxLength = 0;
DWORD currentLength = 0;
if (!succeeded(buffer->Lock(&data, &maxLength, &currentLength), "IMFMediaBuffer::Lock")) {
return false;
}
const bool copied = copyFrameToBuffer(texture, data, maxLength, webcamFrame);
buffer->Unlock();
if (!copied) {
return false;
}
buffer->SetCurrentLength(frameBytes);
Microsoft::WRL::ComPtr<IMFSample> sample;
if (!succeeded(MFCreateSample(&sample), "MFCreateSample")) {
return false;
}
sample->AddBuffer(buffer.Get());
sample->SetSampleTime(sampleTime);
sample->SetSampleDuration(sampleDuration);
return succeeded(sinkWriter_->WriteSample(videoStreamIndex_, sample.Get()), "WriteSample");
}
bool MFEncoder::writeBgraFrame(const BgraFrameView& frame, int64_t timestampHns) {
std::scoped_lock writerLock(writerMutex_);
if (!sinkWriter_ || finalized_) {
return false;
}
if (firstTimestampHns_ < 0) {
firstTimestampHns_ = timestampHns;
}
int64_t sampleTime = timestampHns - firstTimestampHns_;
if (sampleTime <= lastTimestampHns_) {
sampleTime = lastTimestampHns_ + (10'000'000LL / fps_);
}
const int64_t sampleDuration = 10'000'000LL / fps_;
lastTimestampHns_ = sampleTime;
Microsoft::WRL::ComPtr<IMFMediaBuffer> buffer;
const DWORD frameBytes = static_cast<DWORD>(width_ * height_ * 4);
if (!succeeded(MFCreateMemoryBuffer(frameBytes, &buffer), "MFCreateMemoryBuffer(webcam)")) {
return false;
}
BYTE* data = nullptr;
DWORD maxLength = 0;
DWORD currentLength = 0;
if (!succeeded(buffer->Lock(&data, &maxLength, &currentLength), "IMFMediaBuffer::Lock(webcam)")) {
return false;
}
const bool copied = copyBgraFrameToBuffer(frame, data, maxLength);
buffer->Unlock();
if (!copied) {
return false;
}
buffer->SetCurrentLength(frameBytes);
Microsoft::WRL::ComPtr<IMFSample> sample;
if (!succeeded(MFCreateSample(&sample), "MFCreateSample(webcam)")) {
return false;
}
sample->AddBuffer(buffer.Get());
sample->SetSampleTime(sampleTime);
sample->SetSampleDuration(sampleDuration);
return succeeded(sinkWriter_->WriteSample(videoStreamIndex_, sample.Get()), "WriteSample(webcam)");
}
bool MFEncoder::writeAudio(const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns) {
std::scoped_lock writerLock(writerMutex_);
if (!sinkWriter_ || finalized_ || !hasAudioStream_) {
return false;
}
if (!data || byteCount == 0 || durationHns <= 0) {
return true;
}
Microsoft::WRL::ComPtr<IMFMediaBuffer> buffer;
if (!succeeded(MFCreateMemoryBuffer(byteCount, &buffer), "MFCreateMemoryBuffer(audio)")) {
return false;
}
BYTE* destination = nullptr;
DWORD maxLength = 0;
DWORD currentLength = 0;
if (!succeeded(buffer->Lock(&destination, &maxLength, &currentLength),
"IMFMediaBuffer::Lock(audio)")) {
return false;
}
if (maxLength < byteCount) {
buffer->Unlock();
std::cerr << "ERROR: Media Foundation audio buffer is too small" << std::endl;
return false;
}
std::memcpy(destination, data, byteCount);
buffer->Unlock();
buffer->SetCurrentLength(byteCount);
Microsoft::WRL::ComPtr<IMFSample> sample;
if (!succeeded(MFCreateSample(&sample), "MFCreateSample(audio)")) {
return false;
}
sample->AddBuffer(buffer.Get());
sample->SetSampleTime(std::max<int64_t>(0, timestampHns));
sample->SetSampleDuration(durationHns);
return succeeded(sinkWriter_->WriteSample(audioStreamIndex_, sample.Get()), "WriteSample(audio)");
}
bool MFEncoder::finalize() {
std::scoped_lock writerLock(writerMutex_);
if (finalized_) {
return true;
}
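finalized_ = true;
// initialize() stores the device before it calls MFStartup, so a null device_
// means startup was never attempted (for example, a webcam encoder that stays
// unused). Skip MFShutdown in that case to keep MFStartup/MFShutdown balanced.
const bool startupAttempted = device_.Get() != nullptr;
bool ok = true;
if (sinkWriter_) {
ok = succeeded(sinkWriter_->Finalize(), "SinkWriter::Finalize");
sinkWriter_.Reset();
}
stagingTexture_.Reset();
context_.Reset();
device_.Reset();
if (startupAttempted) {
MFShutdown();
}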
return ok;
}
@@ -0,0 +1,75 @@
#pragma once
#include <Windows.h>
#include <d3d11.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <wrl/client.h>
#include <cstdint>
#include <mutex>
#include <string>
struct BgraFrameView {
const BYTE* data = nullptr;
int width = 0;
int height = 0;
};
struct AudioInputFormat {
GUID subtype = MFAudioFormat_PCM;
UINT32 sampleRate = 0;
UINT32 channels = 0;
UINT32 bitsPerSample = 0;
UINT32 blockAlign = 0;
UINT32 avgBytesPerSec = 0;
};
class MFEncoder {
public:
MFEncoder() = default;
~MFEncoder();
MFEncoder(const MFEncoder&) = delete;
MFEncoder& operator=(const MFEncoder&) = delete;
bool initialize(
const std::wstring& outputPath,
int width,
int height,
int fps,
int bitrate,
ID3D11Device* device,
ID3D11DeviceContext* context,
const AudioInputFormat* audioFormat = nullptr);
bool writeFrame(ID3D11Texture2D* texture, int64_t timestampHns, const BgraFrameView* webcamFrame = nullptr);
bool writeBgraFrame(const BgraFrameView& frame, int64_t timestampHns);
bool writeAudio(const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns);
bool finalize();
private:
bool ensureStagingTexture(ID3D11Texture2D* texture);
bool copyFrameToBuffer(
ID3D11Texture2D* texture,
BYTE* destination,
DWORD destinationSize,
const BgraFrameView* webcamFrame);
bool copyBgraFrameToBuffer(const BgraFrameView& frame, BYTE* destination, DWORD destinationSize);
bool configureAudioStream(const AudioInputFormat& audioFormat);
Microsoft::WRL::ComPtr<IMFSinkWriter> sinkWriter_;
Microsoft::WRL::ComPtr<ID3D11Device> device_;
Microsoft::WRL::ComPtr<ID3D11DeviceContext> context_;
Microsoft::WRL::ComPtr<ID3D11Texture2D> stagingTexture_;
std::mutex writerMutex_;
DWORD videoStreamIndex_ = 0;
DWORD audioStreamIndex_ = 0;
bool hasAudioStream_ = false;
int width_ = 0;
int height_ = 0;
int fps_ = 60;
int64_t firstTimestampHns_ = -1;
int64_t lastTimestampHns_ = -1;
bool finalized_ = false;
};
@@ -0,0 +1,88 @@
#include "monitor_utils.h"
#include <algorithm>
#include <cmath>
#include <vector>
namespace {
struct MonitorCandidate {
HMONITOR monitor = nullptr;
RECT rect{};
};
std::vector<MonitorCandidate> enumerateMonitors() {
std::vector<MonitorCandidate> monitors;
EnumDisplayMonitors(
nullptr,
nullptr,
[](HMONITOR monitor, HDC, LPRECT rect, LPARAM userData) -> BOOL {
auto* result = reinterpret_cast<std::vector<MonitorCandidate>*>(userData);
result->push_back({monitor, *rect});
return TRUE;
},
reinterpret_cast<LPARAM>(&monitors));
return monitors;
}
bool rectMatchesBounds(const RECT& rect, const MonitorBounds& bounds) {
return rect.left == bounds.x &&
rect.top == bounds.y &&
(rect.right - rect.left) == bounds.width &&
(rect.bottom - rect.top) == bounds.height;
}
int64_t overlapArea(const RECT& rect, const MonitorBounds& bounds) {
const LONG left = std::max<LONG>(rect.left, bounds.x);
const LONG top = std::max<LONG>(rect.top, bounds.y);
const LONG right = std::min<LONG>(rect.right, bounds.x + bounds.width);
const LONG bottom = std::min<LONG>(rect.bottom, bounds.y + bounds.height);
if (right <= left || bottom <= top) {
return 0;
}
return static_cast<int64_t>(right - left) * static_cast<int64_t>(bottom - top);
}
} // namespace
HMONITOR findMonitorForCapture(int64_t displayId, const MonitorBounds* bounds) {
const auto monitors = enumerateMonitors();
if (monitors.empty()) {
return MonitorFromPoint({0, 0}, MONITOR_DEFAULTTOPRIMARY);
}
// Electron's display_id is not stable across all Windows capture backends.
// Bounds are the most reliable contract because they come from Electron's
// selected display and match the WGC monitor coordinate space.
if (bounds && bounds->width > 0 && bounds->height > 0) {
for (const auto& candidate : monitors) {
if (rectMatchesBounds(candidate.rect, *bounds)) {
return candidate.monitor;
}
}
HMONITOR bestMonitor = nullptr;
int64_t bestArea = 0;
for (const auto& candidate : monitors) {
const int64_t area = overlapArea(candidate.rect, *bounds);
if (area > bestArea) {
bestArea = area;
bestMonitor = candidate.monitor;
}
}
if (bestMonitor) {
return bestMonitor;
}
}
// Best-effort fallback for helpers invoked without bounds. Some callers pass
// zero-based ids while Win32 monitor handles are pointer values, so only use
// this when it exactly matches the HMONITOR value.
for (const auto& candidate : monitors) {
if (reinterpret_cast<int64_t>(candidate.monitor) == displayId) {
return candidate.monitor;
}
}
return MonitorFromPoint({0, 0}, MONITOR_DEFAULTTOPRIMARY);
}
@@ -0,0 +1,14 @@
#pragma once
#include <Windows.h>
#include <cstdint>
struct MonitorBounds {
int x = 0;
int y = 0;
int width = 0;
int height = 0;
};
HMONITOR findMonitorForCapture(int64_t displayId, const MonitorBounds* bounds);
@@ -0,0 +1,411 @@
#include "wasapi_loopback_capture.h"
#include <Functiondiscoverykeys_devpkey.h>
#include <ksmedia.h>
#include <propvarutil.h>
#include <algorithm>
#include <chrono>
#include <cwctype>
#include <iostream>
namespace {
// Requested WASAPI buffer duration in 100-nanosecond units (hns); 10'000'000 hns is one second.
constexpr REFERENCE_TIME BufferDurationHns = 10'000'000;
constexpr int64_t HnsPerSecond = 10'000'000;
bool succeeded(HRESULT hr, const char* label) {
if (SUCCEEDED(hr)) {
return true;
}
std::cerr << "ERROR: " << label << " failed (hr=0x" << std::hex << hr << std::dec << ")"
<< std::endl;
return false;
}
GUID audioSubtypeFromFormat(WAVEFORMATEX* format) {
if (format->wFormatTag == WAVE_FORMAT_IEEE_FLOAT) {
return MFAudioFormat_Float;
}
if (format->wFormatTag == WAVE_FORMAT_PCM) {
return MFAudioFormat_PCM;
}
if (format->wFormatTag == WAVE_FORMAT_EXTENSIBLE &&
format->cbSize >= sizeof(WAVEFORMATEXTENSIBLE) - sizeof(WAVEFORMATEX)) {
auto* extensible = reinterpret_cast<WAVEFORMATEXTENSIBLE*>(format);
if (extensible->SubFormat == KSDATAFORMAT_SUBTYPE_IEEE_FLOAT) {
return MFAudioFormat_Float;
}
if (extensible->SubFormat == KSDATAFORMAT_SUBTYPE_PCM) {
return MFAudioFormat_PCM;
}
}
return GUID_NULL;
}
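// Lowercases alphanumeric characters and collapses every run of other characters
// into a single space, with no leading or trailing space.
std::wstring normalizeDeviceName(const std::wstring& value) {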
std::wstring result;
result.reserve(value.size());
bool lastWasSpace = true;
for (const wchar_t c : value) {
if (std::iswalnum(c)) {
result.push_back(static_cast<wchar_t>(std::towlower(c)));
lastWasSpace = false;
} else if (!lastWasSpace) {
result.push_back(L' ');
lastWasSpace = true;
}
}
if (!result.empty() && result.back() == L' ') {
result.pop_back();
}
return result;
}
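// Scores how well a capture endpoint matches the requested device name:
// 1000 for an exact normalized name match, 900 when one name contains the other,
// 800 when the endpoint id and the requested name contain one another, otherwise
// 100 per requested word found in the name (50 if found only in the id).
// Generic words such as "microphone" or "audio" are ignored; 0 means no match.
int scoreDeviceName(const std::wstring& candidateName, const std::wstring& candidateId, const std::wstring& requestedName) {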
const std::wstring candidate = normalizeDeviceName(candidateName);
const std::wstring id = normalizeDeviceName(candidateId);
const std::wstring requested = normalizeDeviceName(requestedName);
if (requested.empty()) {
return 0;
}
if (candidate == requested) {
return 1000;
}
if (!candidate.empty() && (candidate.find(requested) != std::wstring::npos || requested.find(candidate) != std::wstring::npos)) {
return 900;
}
if (!id.empty() && (id.find(requested) != std::wstring::npos || requested.find(id) != std::wstring::npos)) {
return 800;
}
int score = 0;
size_t pos = 0;
while (pos < requested.size()) {
const size_t end = requested.find(L' ', pos);
const std::wstring word = requested.substr(pos, end == std::wstring::npos ? std::wstring::npos : end - pos);
if (word.size() > 1 && word != L"microphone" && word != L"mic" && word != L"audio" && word != L"input") {
if (candidate.find(word) != std::wstring::npos) {
score += 100;
} else if (id.find(word) != std::wstring::npos) {
score += 50;
}
}
if (end == std::wstring::npos) {
break;
}
pos = end + 1;
}
return score;
}
std::wstring getDeviceFriendlyName(IMMDevice* device) {
if (!device) {
return {};
}
Microsoft::WRL::ComPtr<IPropertyStore> properties;
HRESULT hr = device->OpenPropertyStore(STGM_READ, &properties);
if (FAILED(hr) || !properties) {
return {};
}
PROPVARIANT value;
PropVariantInit(&value);
hr = properties->GetValue(PKEY_Device_FriendlyName, &value);
std::wstring name;
if (SUCCEEDED(hr) && value.vt == VT_LPWSTR && value.pwszVal) {
name = value.pwszVal;
}
PropVariantClear(&value);
return name;
}
} // namespace
WasapiLoopbackCapture::~WasapiLoopbackCapture() {
stop();
if (mixFormat_) {
CoTaskMemFree(mixFormat_);
mixFormat_ = nullptr;
}
}
bool WasapiLoopbackCapture::initializeSystemLoopback() {
return initialize(WasapiCaptureEndpoint::SystemLoopback, {}, {});
}
bool WasapiLoopbackCapture::initializeMicrophone(const std::wstring& deviceId, const std::wstring& deviceName) {
return initialize(WasapiCaptureEndpoint::Microphone, deviceId, deviceName);
}
bool WasapiLoopbackCapture::initialize(WasapiCaptureEndpoint endpoint, const std::wstring& deviceId, const std::wstring& deviceName) {
HRESULT hr = CoCreateInstance(
__uuidof(MMDeviceEnumerator),
nullptr,
CLSCTX_ALL,
IID_PPV_ARGS(&deviceEnumerator_));
if (!succeeded(hr, "CoCreateInstance(MMDeviceEnumerator)")) {
return false;
}
if (endpoint == WasapiCaptureEndpoint::Microphone && !deviceId.empty() && deviceId != L"default") {
hr = deviceEnumerator_->GetDevice(deviceId.c_str(), &device_);
if (FAILED(hr)) {
std::wcerr << L"WARNING: Could not resolve microphone device id directly"
<< std::endl;
device_.Reset();
}
}
if (endpoint == WasapiCaptureEndpoint::Microphone && !device_ && !deviceName.empty()) {
if (!resolveMicrophoneByName(deviceName)) {
std::wcerr << L"WARNING: Could not resolve microphone by name; using default capture endpoint"
<< std::endl;
}
}
if (!device_) {
const EDataFlow flow =
endpoint == WasapiCaptureEndpoint::SystemLoopback ? eRender : eCapture;
hr = deviceEnumerator_->GetDefaultAudioEndpoint(flow, eConsole, &device_);
if (!succeeded(hr, "GetDefaultAudioEndpoint")) {
return false;
}
}
selectedDeviceName_ = getDeviceFriendlyName(device_.Get());
hr = device_->Activate(__uuidof(IAudioClient), CLSCTX_ALL, nullptr, &audioClient_);
if (!succeeded(hr, "IMMDevice::Activate(IAudioClient)")) {
return false;
}
hr = audioClient_->GetMixFormat(&mixFormat_);
if (!succeeded(hr, "IAudioClient::GetMixFormat") || !mixFormat_) {
return false;
}
if (!resolveInputFormat(mixFormat_)) {
// This path serves both the loopback and the microphone endpoint.
std::cerr << "ERROR: Unsupported WASAPI mix format" << std::endl;
return false;
}
const DWORD streamFlags =
endpoint == WasapiCaptureEndpoint::SystemLoopback ? AUDCLNT_STREAMFLAGS_LOOPBACK : 0;
hr = audioClient_->Initialize(
AUDCLNT_SHAREMODE_SHARED,
streamFlags,
BufferDurationHns,
0,
mixFormat_,
nullptr);
if (!succeeded(hr, "IAudioClient::Initialize")) {
return false;
}
hr = audioClient_->GetService(IID_PPV_ARGS(&captureClient_));
if (!succeeded(hr, "IAudioClient::GetService(IAudioCaptureClient)")) {
return false;
}
return true;
}
bool WasapiLoopbackCapture::resolveMicrophoneByName(const std::wstring& deviceName) {
if (!deviceEnumerator_ || deviceName.empty()) {
return false;
}
Microsoft::WRL::ComPtr<IMMDeviceCollection> devices;
HRESULT hr = deviceEnumerator_->EnumAudioEndpoints(eCapture, DEVICE_STATE_ACTIVE, &devices);
if (!succeeded(hr, "IMMDeviceEnumerator::EnumAudioEndpoints(eCapture)")) {
return false;
}
UINT count = 0;
hr = devices->GetCount(&count);
if (!succeeded(hr, "IMMDeviceCollection::GetCount")) {
return false;
}
Microsoft::WRL::ComPtr<IMMDevice> bestDevice;
std::wstring bestId;
std::wstring bestName;
int bestScore = 0;
for (UINT i = 0; i < count; ++i) {
Microsoft::WRL::ComPtr<IMMDevice> candidate;
hr = devices->Item(i, &candidate);
if (FAILED(hr) || !candidate) {
continue;
}
LPWSTR rawId = nullptr;
std::wstring candidateId;
if (SUCCEEDED(candidate->GetId(&rawId)) && rawId) {
candidateId = rawId;
CoTaskMemFree(rawId);
}
const std::wstring candidateName = getDeviceFriendlyName(candidate.Get());
const int score = scoreDeviceName(candidateName, candidateId, deviceName);
std::wcerr << L"Native microphone candidate: " << candidateName << L" score=" << score << std::endl;
if (score > bestScore) {
bestScore = score;
bestDevice = candidate;
bestId = candidateId;
bestName = candidateName;
}
}
if (!bestDevice || bestScore <= 0) {
return false;
}
device_ = bestDevice;
std::wcerr << L"Selected native microphone endpoint: " << bestName << L" id=" << bestId << std::endl;
return true;
}
bool WasapiLoopbackCapture::resolveInputFormat(WAVEFORMATEX* mixFormat) {
const GUID subtype = audioSubtypeFromFormat(mixFormat);
if (subtype == GUID_NULL) {
return false;
}
inputFormat_.subtype = subtype;
inputFormat_.sampleRate = mixFormat->nSamplesPerSec;
inputFormat_.channels = mixFormat->nChannels;
inputFormat_.bitsPerSample = mixFormat->wBitsPerSample;
inputFormat_.blockAlign = mixFormat->nBlockAlign;
inputFormat_.avgBytesPerSec = mixFormat->nAvgBytesPerSec;
return inputFormat_.sampleRate > 0 && inputFormat_.channels > 0 && inputFormat_.blockAlign > 0;
}
bool WasapiLoopbackCapture::start(AudioCallback callback) {
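// Reject a second start while the capture thread is still running; assigning
// to a joinable std::thread would call std::terminate.
if (!audioClient_ || !captureClient_ || !callback || thread_.joinable()) {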
return false;
}
callback_ = std::move(callback);
stopRequested_ = false;
writtenFrames_ = 0;
lastDevicePositionEnd_ = 0;
hasLastDevicePosition_ = false;
HRESULT hr = audioClient_->Start();
if (!succeeded(hr, "IAudioClient::Start")) {
return false;
}
thread_ = std::thread([this] {
captureLoop();
});
return true;
}
void WasapiLoopbackCapture::stop() {
stopRequested_ = true;
if (thread_.joinable()) {
thread_.join();
}
if (audioClient_) {
audioClient_->Stop();
}
}
const AudioInputFormat& WasapiLoopbackCapture::inputFormat() const {
return inputFormat_;
}
const std::wstring& WasapiLoopbackCapture::selectedDeviceName() const {
return selectedDeviceName_;
}
void WasapiLoopbackCapture::captureLoop() {
auto emitSilenceFrames = [&](uint64_t frames, int64_t timestampHns) {
constexpr uint64_t MaxSilenceChunkFrames = 4800;
uint64_t remainingFrames = frames;
int64_t currentTimestampHns = timestampHns;
while (remainingFrames > 0 && !stopRequested_) {
const uint64_t chunkFrames = std::min<uint64_t>(remainingFrames, MaxSilenceChunkFrames);
const DWORD chunkBytes = static_cast<DWORD>(chunkFrames * inputFormat_.blockAlign);
const int64_t chunkDurationHns =
static_cast<int64_t>((chunkFrames * HnsPerSecond) / inputFormat_.sampleRate);
silenceBuffer_.assign(chunkBytes, 0);
callback_(silenceBuffer_.data(), chunkBytes, currentTimestampHns, chunkDurationHns);
remainingFrames -= chunkFrames;
currentTimestampHns += chunkDurationHns;
}
};
while (!stopRequested_) {
UINT32 packetFrames = 0;
HRESULT hr = captureClient_->GetNextPacketSize(&packetFrames);
if (FAILED(hr)) {
std::cerr << "ERROR: IAudioCaptureClient::GetNextPacketSize failed (hr=0x" << std::hex
<< hr << std::dec << ")" << std::endl;
break;
}
while (packetFrames > 0 && !stopRequested_) {
BYTE* data = nullptr;
UINT32 framesAvailable = 0;
DWORD flags = 0;
UINT64 devicePosition = 0;
UINT64 qpcPosition = 0;
hr = captureClient_->GetBuffer(&data, &framesAvailable, &flags, &devicePosition, &qpcPosition);
if (FAILED(hr)) {
std::cerr << "ERROR: IAudioCaptureClient::GetBuffer failed (hr=0x" << std::hex
<< hr << std::dec << ")" << std::endl;
break;
}
(void)qpcPosition;
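// A device position ahead of the last delivered frame means audio was dropped.
// When the gap is flagged as a discontinuity or exceeds the current packet,
// fill it with silence so the timeline handed to the callback stays continuous.
if (hasLastDevicePosition_ && devicePosition > lastDevicePositionEnd_) {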
const uint64_t gapFrames = devicePosition - lastDevicePositionEnd_;
if ((flags & AUDCLNT_BUFFERFLAGS_DATA_DISCONTINUITY) != 0 || gapFrames > framesAvailable) {
const int64_t gapTimestampHns =
static_cast<int64_t>((lastDevicePositionEnd_ * HnsPerSecond) / inputFormat_.sampleRate);
emitSilenceFrames(gapFrames, gapTimestampHns);
}
}
const DWORD byteCount = framesAvailable * inputFormat_.blockAlign;
const int64_t timestampHns =
static_cast<int64_t>((devicePosition * HnsPerSecond) / inputFormat_.sampleRate);
const int64_t durationHns =
static_cast<int64_t>((static_cast<uint64_t>(framesAvailable) * HnsPerSecond) /
inputFormat_.sampleRate);
if (byteCount > 0) {
if ((flags & AUDCLNT_BUFFERFLAGS_SILENT) != 0 || !data) {
silenceBuffer_.assign(byteCount, 0);
callback_(silenceBuffer_.data(), byteCount, timestampHns, durationHns);
} else {
callback_(data, byteCount, timestampHns, durationHns);
}
}
writtenFrames_ += framesAvailable;
lastDevicePositionEnd_ = devicePosition + framesAvailable;
hasLastDevicePosition_ = true;
captureClient_->ReleaseBuffer(framesAvailable);
hr = captureClient_->GetNextPacketSize(&packetFrames);
if (FAILED(hr)) {
std::cerr << "ERROR: IAudioCaptureClient::GetNextPacketSize failed (hr=0x"
<< std::hex << hr << std::dec << ")" << std::endl;
packetFrames = 0;
break;
}
}
std::this_thread::sleep_for(std::chrono::milliseconds(5));
}
}
@@ -0,0 +1,60 @@
#pragma once
#include "mf_encoder.h"
#include <Windows.h>
#include <audioclient.h>
#include <mmdeviceapi.h>
#include <wrl/client.h>
#include <atomic>
#include <cstdint>
#include <functional>
#include <string>
#include <thread>
#include <vector>
enum class WasapiCaptureEndpoint {
SystemLoopback,
Microphone,
};
class WasapiLoopbackCapture {
public:
using AudioCallback = std::function<void(const BYTE* data, DWORD byteCount, int64_t timestampHns, int64_t durationHns)>;
WasapiLoopbackCapture() = default;
~WasapiLoopbackCapture();
WasapiLoopbackCapture(const WasapiLoopbackCapture&) = delete;
WasapiLoopbackCapture& operator=(const WasapiLoopbackCapture&) = delete;
bool initializeSystemLoopback();
bool initializeMicrophone(const std::wstring& deviceId, const std::wstring& deviceName);
bool start(AudioCallback callback);
void stop();
const AudioInputFormat& inputFormat() const;
const std::wstring& selectedDeviceName() const;
private:
bool initialize(WasapiCaptureEndpoint endpoint, const std::wstring& deviceId, const std::wstring& deviceName);
bool resolveMicrophoneByName(const std::wstring& deviceName);
void captureLoop();
bool resolveInputFormat(WAVEFORMATEX* mixFormat);
Microsoft::WRL::ComPtr<IMMDeviceEnumerator> deviceEnumerator_;
Microsoft::WRL::ComPtr<IMMDevice> device_;
Microsoft::WRL::ComPtr<IAudioClient> audioClient_;
Microsoft::WRL::ComPtr<IAudioCaptureClient> captureClient_;
WAVEFORMATEX* mixFormat_ = nullptr;
AudioInputFormat inputFormat_{};
std::wstring selectedDeviceName_;
AudioCallback callback_;
std::thread thread_;
std::atomic<bool> stopRequested_ = false;
std::vector<BYTE> silenceBuffer_;
uint64_t writtenFrames_ = 0;
uint64_t lastDevicePositionEnd_ = 0;
bool hasLastDevicePosition_ = false;
};
@@ -0,0 +1,419 @@
#include "webcam_capture.h"
#include <mfapi.h>
#include <mferror.h>
#include <propvarutil.h>
#include <algorithm>
#include <chrono>
#include <cwctype>
#include <iostream>
namespace {
bool succeeded(HRESULT hr, const char* label) {
if (SUCCEEDED(hr)) {
return true;
}
std::cerr << "ERROR: " << label << " failed (hr=0x" << std::hex << hr << std::dec << ")"
<< std::endl;
return false;
}
std::wstring readAllocatedString(IMFActivate* activate, REFGUID key) {
WCHAR* value = nullptr;
UINT32 length = 0;
if (FAILED(activate->GetAllocatedString(key, &value, &length)) || !value) {
return {};
}
std::wstring result(value, value + length);
CoTaskMemFree(value);
return result;
}
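// Case-insensitive containment check in either direction: true when the haystack
// contains the needle or the needle contains the haystack. Empty inputs never match.
bool containsInsensitive(const std::wstring& haystack, const std::wstring& needle) {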
if (haystack.empty() || needle.empty()) {
return false;
}
std::wstring lowerHaystack = haystack;
std::wstring lowerNeedle = needle;
std::transform(lowerHaystack.begin(), lowerHaystack.end(), lowerHaystack.begin(), ::towlower);
std::transform(lowerNeedle.begin(), lowerNeedle.end(), lowerNeedle.begin(), ::towlower);
return lowerHaystack.find(lowerNeedle) != std::wstring::npos ||
lowerNeedle.find(lowerHaystack) != std::wstring::npos;
}
std::wstring normalizeDeviceName(const std::wstring& value) {
std::wstring normalized;
normalized.reserve(value.size());
bool lastWasSpace = true;
for (const wchar_t ch : value) {
if (std::iswalnum(ch)) {
normalized.push_back(static_cast<wchar_t>(std::towlower(ch)));
lastWasSpace = false;
continue;
}
if (!lastWasSpace) {
normalized.push_back(L' ');
lastWasSpace = true;
}
}
while (!normalized.empty() && normalized.back() == L' ') {
normalized.pop_back();
}
return normalized;
}
std::vector<std::wstring> splitWords(const std::wstring& value) {
std::vector<std::wstring> words;
size_t start = 0;
while (start < value.size()) {
const size_t end = value.find(L' ', start);
const auto word = value.substr(start, end == std::wstring::npos ? std::wstring::npos : end - start);
if (word.size() > 1 && word != L"camera" && word != L"webcam" && word != L"video" && word != L"input") {
words.push_back(word);
}
if (end == std::wstring::npos) {
break;
}
start = end + 1;
}
return words;
}
int deviceMatchScore(
const std::wstring& candidateName,
const std::wstring& candidateLink,
const std::wstring& requestedName,
const std::wstring& requestedId) {
int score = 0;
const auto normalizedName = normalizeDeviceName(candidateName);
const auto normalizedLink = normalizeDeviceName(candidateLink);
const auto normalizedRequestedName = normalizeDeviceName(requestedName);
const auto normalizedRequestedId = normalizeDeviceName(requestedId);
if (!normalizedRequestedName.empty()) {
if (normalizedName == normalizedRequestedName) {
score = std::max(score, 1000);
}
if (containsInsensitive(normalizedName, normalizedRequestedName)) {
score = std::max(score, 900);
}
if (containsInsensitive(normalizedLink, normalizedRequestedName)) {
score = std::max(score, 800);
}
int wordScore = 0;
for (const auto& word : splitWords(normalizedRequestedName)) {
if (normalizedName.find(word) != std::wstring::npos) {
wordScore += 100;
} else if (normalizedLink.find(word) != std::wstring::npos) {
wordScore += 50;
}
}
score = std::max(score, wordScore);
}
if (!normalizedRequestedId.empty()) {
if (containsInsensitive(normalizedLink, normalizedRequestedId)) {
score = std::max(score, 700);
}
if (containsInsensitive(normalizedName, normalizedRequestedId)) {
score = std::max(score, 600);
}
}
return score;
}
} // namespace
WebcamCapture::~WebcamCapture() {
stop();
}
bool WebcamCapture::initialize(
const std::wstring& deviceId,
const std::wstring& deviceName,
const std::wstring& directShowClsid,
int requestedWidth,
int requestedHeight,
int requestedFps) {
fps_ = std::clamp(requestedFps > 0 ? requestedFps : 30, 1, 60);
usingDirectShow_ = false;
selectedMatchScore_ = 0;
if (!succeeded(MFStartup(MF_VERSION), "MFStartup(webcam)")) {
if (directShowCapture_.initialize(deviceId, deviceName, directShowClsid, requestedWidth, requestedHeight, fps_)) {
usingDirectShow_ = true;
return true;
}
return false;
}
mfStarted_ = true;
if (!selectDevice(deviceId, deviceName)) {
if (mfStarted_) {
MFShutdown();
mfStarted_ = false;
}
if (directShowCapture_.initialize(deviceId, deviceName, directShowClsid, requestedWidth, requestedHeight, fps_)) {
usingDirectShow_ = true;
return true;
}
return false;
}
if ((!deviceId.empty() || !deviceName.empty()) && selectedMatchScore_ <= 0) {
if (mediaSource_) {
mediaSource_->Shutdown();
}
sourceReader_.Reset();
mediaSource_.Reset();
if (mfStarted_) {
MFShutdown();
mfStarted_ = false;
}
if (directShowCapture_.initialize(deviceId, deviceName, directShowClsid, requestedWidth, requestedHeight, fps_)) {
usingDirectShow_ = true;
return true;
}
std::cerr << "ERROR: Requested webcam device was not found by native Windows webcam providers"
<< std::endl;
return false;
}
return configureReader(requestedWidth, requestedHeight, fps_);
}
bool WebcamCapture::selectDevice(const std::wstring& deviceId, const std::wstring& deviceName) {
Microsoft::WRL::ComPtr<IMFAttributes> attributes;
if (!succeeded(MFCreateAttributes(&attributes, 1), "MFCreateAttributes(webcam enumeration)")) {
return false;
}
if (!succeeded(attributes->SetGUID(
MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID),
"SetGUID(webcam source type)")) {
return false;
}
IMFActivate** devices = nullptr;
UINT32 deviceCount = 0;
HRESULT hr = MFEnumDeviceSources(attributes.Get(), &devices, &deviceCount);
if (!succeeded(hr, "MFEnumDeviceSources") || deviceCount == 0) {
if (devices) {
CoTaskMemFree(devices);
}
std::cerr << "ERROR: No native Windows webcam devices were found" << std::endl;
return false;
}
UINT32 selectedIndex = 0;
int bestScore = 0;
for (UINT32 index = 0; index < deviceCount; index += 1) {
const std::wstring name = readAllocatedString(devices[index], MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME);
const std::wstring symbolicLink = readAllocatedString(devices[index], MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_SYMBOLIC_LINK);
const int score = deviceMatchScore(name, symbolicLink, deviceName, deviceId);
std::wcerr << L"INFO: Native webcam candidate [" << index << L"] name=\"" << name << L"\" score=" << score << std::endl;
if (score > bestScore) {
selectedIndex = index;
bestScore = score;
}
}
if ((!deviceId.empty() || !deviceName.empty()) && bestScore <= 0) {
std::cerr << "WARNING: Requested webcam device was not found by Media Foundation; trying DirectShow"
<< std::endl;
}
selectedMatchScore_ = bestScore;
selectedDeviceName_ = readAllocatedString(devices[selectedIndex], MF_DEVSOURCE_ATTRIBUTE_FRIENDLY_NAME);
hr = devices[selectedIndex]->ActivateObject(IID_PPV_ARGS(&mediaSource_));
for (UINT32 index = 0; index < deviceCount; index += 1) {
devices[index]->Release();
}
CoTaskMemFree(devices);
return succeeded(hr, "ActivateObject(webcam)");
}
bool WebcamCapture::configureReader(int requestedWidth, int requestedHeight, int requestedFps) {
Microsoft::WRL::ComPtr<IMFAttributes> attributes;
if (!succeeded(MFCreateAttributes(&attributes, 2), "MFCreateAttributes(webcam reader)")) {
return false;
}
attributes->SetUINT32(MF_SOURCE_READER_ENABLE_VIDEO_PROCESSING, TRUE);
attributes->SetUINT32(MF_READWRITE_DISABLE_CONVERTERS, FALSE);
if (!succeeded(MFCreateSourceReaderFromMediaSource(mediaSource_.Get(), attributes.Get(), &sourceReader_),
"MFCreateSourceReaderFromMediaSource(webcam)")) {
return false;
}
Microsoft::WRL::ComPtr<IMFMediaType> mediaType;
if (!succeeded(MFCreateMediaType(&mediaType), "MFCreateMediaType(webcam output)")) {
return false;
}
mediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
mediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);
if (requestedWidth > 0 && requestedHeight > 0) {
MFSetAttributeSize(mediaType.Get(), MF_MT_FRAME_SIZE, static_cast<UINT32>(requestedWidth), static_cast<UINT32>(requestedHeight));
}
MFSetAttributeRatio(mediaType.Get(), MF_MT_FRAME_RATE, static_cast<UINT32>(std::max(1, requestedFps)), 1);
if (!succeeded(sourceReader_->SetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, mediaType.Get()),
"SetCurrentMediaType(webcam RGB32)")) {
return false;
}
sourceReader_->SetStreamSelection(MF_SOURCE_READER_ALL_STREAMS, FALSE);
sourceReader_->SetStreamSelection(MF_SOURCE_READER_FIRST_VIDEO_STREAM, TRUE);
Microsoft::WRL::ComPtr<IMFMediaType> currentType;
if (!succeeded(sourceReader_->GetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, &currentType),
"GetCurrentMediaType(webcam)")) {
return false;
}
UINT32 width = 0;
UINT32 height = 0;
if (FAILED(MFGetAttributeSize(currentType.Get(), MF_MT_FRAME_SIZE, &width, &height)) || width == 0 || height == 0) {
width = static_cast<UINT32>(requestedWidth > 0 ? requestedWidth : 1280);
height = static_cast<UINT32>(requestedHeight > 0 ? requestedHeight : 720);
}
width_ = static_cast<int>(width);
height_ = static_cast<int>(height);
return true;
}
bool WebcamCapture::start() {
if (usingDirectShow_) {
return directShowCapture_.start();
}
if (!sourceReader_ || thread_.joinable()) {
return false;
}
stopRequested_ = false;
thread_ = std::thread(&WebcamCapture::captureLoop, this);
return true;
}
void WebcamCapture::stop() {
directShowCapture_.stop();
stopRequested_ = true;
if (thread_.joinable()) {
thread_.join();
}
if (mediaSource_) {
mediaSource_->Shutdown();
}
sourceReader_.Reset();
mediaSource_.Reset();
if (mfStarted_) {
MFShutdown();
mfStarted_ = false;
}
}
void WebcamCapture::captureLoop() {
CoInitializeEx(nullptr, COINIT_MULTITHREADED);
while (!stopRequested_) {
DWORD streamIndex = 0;
DWORD flags = 0;
LONGLONG timestamp = 0;
Microsoft::WRL::ComPtr<IMFSample> sample;
HRESULT hr = sourceReader_->ReadSample(
MF_SOURCE_READER_FIRST_VIDEO_STREAM,
0,
&streamIndex,
&flags,
&timestamp,
&sample);
(void)streamIndex;
(void)timestamp;
if (FAILED(hr)) {
std::cerr << "WARNING: Failed to read webcam sample (hr=0x" << std::hex << hr << std::dec << ")"
<< std::endl;
std::this_thread::sleep_for(std::chrono::milliseconds(20));
continue;
}
if ((flags & MF_SOURCE_READERF_ENDOFSTREAM) != 0) {
break;
}
if (!sample) {
continue;
}
Microsoft::WRL::ComPtr<IMFMediaBuffer> buffer;
if (FAILED(sample->ConvertToContiguousBuffer(&buffer)) || !buffer) {
continue;
}
BYTE* data = nullptr;
DWORD maxLength = 0;
DWORD currentLength = 0;
if (FAILED(buffer->Lock(&data, &maxLength, &currentLength)) || !data) {
continue;
}
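// RGB32 is 4 bytes per pixel; only buffers that hold a full frame of the
// negotiated size are published as the latest frame.
const DWORD expectedLength = static_cast<DWORD>(std::max(0, width_) * std::max(0, height_) * 4);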
if (currentLength >= expectedLength && expectedLength > 0) {
std::scoped_lock lock(frameMutex_);
latestFrame_.assign(data, data + expectedLength);
latestFrameSequence_ += 1;
}
buffer->Unlock();
}
CoUninitialize();
}
bool WebcamCapture::copyLatestFrame(WebcamFrameSnapshot& destination) {
if (usingDirectShow_) {
return directShowCapture_.copyLatestFrame(destination);
}
std::scoped_lock lock(frameMutex_);
if (latestFrame_.empty() || width_ <= 0 || height_ <= 0) {
return false;
}
destination.data = latestFrame_;
destination.width = width_;
destination.height = height_;
destination.sequence = latestFrameSequence_;
return true;
}
int WebcamCapture::width() const {
if (usingDirectShow_) {
return directShowCapture_.width();
}
return width_;
}
int WebcamCapture::height() const {
if (usingDirectShow_) {
return directShowCapture_.height();
}
return height_;
}
int WebcamCapture::fps() const {
if (usingDirectShow_) {
return directShowCapture_.fps();
}
return fps_;
}
const std::wstring& WebcamCapture::selectedDeviceName() const {
if (usingDirectShow_) {
return directShowCapture_.selectedDeviceName();
}
return selectedDeviceName_;
}
@@ -0,0 +1,61 @@
#pragma once
#include "dshow_webcam_capture.h"
#include <Windows.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <wrl/client.h>
#include <atomic>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <vector>
class WebcamCapture {
public:
WebcamCapture() = default;
~WebcamCapture();
WebcamCapture(const WebcamCapture&) = delete;
WebcamCapture& operator=(const WebcamCapture&) = delete;
bool initialize(
const std::wstring& deviceId,
const std::wstring& deviceName,
const std::wstring& directShowClsid,
int requestedWidth,
int requestedHeight,
int requestedFps);
bool start();
void stop();
bool copyLatestFrame(WebcamFrameSnapshot& destination);
int width() const;
int height() const;
int fps() const;
const std::wstring& selectedDeviceName() const;
private:
bool selectDevice(const std::wstring& deviceId, const std::wstring& deviceName);
bool configureReader(int requestedWidth, int requestedHeight, int requestedFps);
void captureLoop();
Microsoft::WRL::ComPtr<IMFMediaSource> mediaSource_;
Microsoft::WRL::ComPtr<IMFSourceReader> sourceReader_;
DirectShowWebcamCapture directShowCapture_;
std::thread thread_;
std::atomic<bool> stopRequested_ = false;
std::mutex frameMutex_;
std::vector<BYTE> latestFrame_;
uint64_t latestFrameSequence_ = 0;
int width_ = 0;
int height_ = 0;
int fps_ = 30;
bool mfStarted_ = false;
bool usingDirectShow_ = false;
int selectedMatchScore_ = 0;
std::wstring selectedDeviceName_;
};
@@ -0,0 +1,315 @@
#include "wgc_session.h"
#include <Windows.Graphics.Capture.Interop.h>
#include <dxgi1_2.h>
#include <inspectable.h>
#include <winrt/base.h>
#include <iostream>
namespace wf = winrt::Windows::Foundation;
namespace wgcap = winrt::Windows::Graphics::Capture;
namespace wgdx = winrt::Windows::Graphics::DirectX;
namespace wgd3d = winrt::Windows::Graphics::DirectX::Direct3D11;
extern "C" HRESULT __stdcall CreateDirect3D11DeviceFromDXGIDevice(
::IDXGIDevice* dxgiDevice,
::IInspectable** graphicsDevice);
namespace {
bool succeeded(HRESULT hr, const char* label) {
if (SUCCEEDED(hr)) {
return true;
}
std::cerr << "ERROR: " << label << " failed (hr=0x" << std::hex << hr << std::dec << ")"
<< std::endl;
return false;
}
int64_t timeSpanToHns(wf::TimeSpan const& value) {
return value.count();
}
} // namespace
WgcSession::~WgcSession() {
stop();
}
bool WgcSession::createD3DDevice() {
UINT flags = D3D11_CREATE_DEVICE_BGRA_SUPPORT;
#if defined(_DEBUG)
flags |= D3D11_CREATE_DEVICE_DEBUG;
#endif
D3D_FEATURE_LEVEL featureLevels[] = {
D3D_FEATURE_LEVEL_11_1,
D3D_FEATURE_LEVEL_11_0,
D3D_FEATURE_LEVEL_10_1,
D3D_FEATURE_LEVEL_10_0,
};
D3D_FEATURE_LEVEL featureLevel{};
HRESULT hr = D3D11CreateDevice(
nullptr,
D3D_DRIVER_TYPE_HARDWARE,
nullptr,
flags,
featureLevels,
ARRAYSIZE(featureLevels),
D3D11_SDK_VERSION,
&d3dDevice_,
&featureLevel,
&d3dContext_);
#if defined(_DEBUG)
if (FAILED(hr)) {
flags &= ~D3D11_CREATE_DEVICE_DEBUG;
hr = D3D11CreateDevice(
nullptr,
D3D_DRIVER_TYPE_HARDWARE,
nullptr,
flags,
featureLevels,
ARRAYSIZE(featureLevels),
D3D11_SDK_VERSION,
&d3dDevice_,
&featureLevel,
&d3dContext_);
}
#endif
if (!succeeded(hr, "D3D11CreateDevice")) {
return false;
}
Microsoft::WRL::ComPtr<IDXGIDevice> dxgiDevice;
if (!succeeded(d3dDevice_.As(&dxgiDevice), "Query IDXGIDevice")) {
return false;
}
winrt::com_ptr<::IInspectable> inspectableDevice;
if (!succeeded(CreateDirect3D11DeviceFromDXGIDevice(dxgiDevice.Get(), inspectableDevice.put()),
"CreateDirect3D11DeviceFromDXGIDevice")) {
return false;
}
winrtDevice_ = inspectableDevice.as<wgd3d::IDirect3DDevice>();
return true;
}
bool WgcSession::createCaptureItem(HMONITOR monitor) {
auto factory = winrt::get_activation_factory<wgcap::GraphicsCaptureItem>();
auto interop = factory.as<IGraphicsCaptureItemInterop>();
wgcap::GraphicsCaptureItem item{nullptr};
HRESULT hr = interop->CreateForMonitor(
monitor,
winrt::guid_of<wgcap::GraphicsCaptureItem>(),
reinterpret_cast<void**>(winrt::put_abi(item)));
if (!succeeded(hr, "CreateForMonitor")) {
return false;
}
item_ = item;
const auto size = item_.Size();
width_ = static_cast<int>(size.Width);
height_ = static_cast<int>(size.Height);
return width_ > 0 && height_ > 0;
}
bool WgcSession::createCaptureItem(HWND window) {
auto factory = winrt::get_activation_factory<wgcap::GraphicsCaptureItem>();
auto interop = factory.as<IGraphicsCaptureItemInterop>();
wgcap::GraphicsCaptureItem item{nullptr};
HRESULT hr = interop->CreateForWindow(
window,
winrt::guid_of<wgcap::GraphicsCaptureItem>(),
reinterpret_cast<void**>(winrt::put_abi(item)));
if (!succeeded(hr, "CreateForWindow")) {
return false;
}
item_ = item;
const auto size = item_.Size();
width_ = static_cast<int>(size.Width);
height_ = static_cast<int>(size.Height);
return width_ > 0 && height_ > 0;
}
bool WgcSession::applySessionOptions(bool captureCursor) {
captureCursor_ = captureCursor;
try {
auto session2 = session_.try_as<wgcap::IGraphicsCaptureSession2>();
if (!session2) {
if (!captureCursor) {
std::cerr << "ERROR: WGC cursor suppression is not supported by this Windows runtime"
<< std::endl;
return false;
}
} else {
session2.IsCursorCaptureEnabled(captureCursor);
const bool appliedCursorCapture = session2.IsCursorCaptureEnabled();
std::cout << "{\"event\":\"cursor-capture\",\"schemaVersion\":2,\"requested\":"
<< (captureCursor ? "true" : "false")
<< ",\"applied\":" << (appliedCursorCapture ? "true" : "false") << "}"
<< std::endl;
if (appliedCursorCapture != captureCursor) {
std::cerr << "ERROR: WGC cursor capture setting did not apply" << std::endl;
return false;
}
}
} catch (winrt::hresult_error const& error) {
std::cerr << "ERROR: Failed to configure WGC cursor capture (hr=0x" << std::hex
<< static_cast<uint32_t>(error.code()) << std::dec << ")" << std::endl;
if (!captureCursor) {
return false;
}
} catch (...) {
std::cerr << "ERROR: Failed to configure WGC cursor capture" << std::endl;
if (!captureCursor) {
return false;
}
}
try {
session_.IsBorderRequired(false);
} catch (...) {
// IsBorderRequired is Windows 11-only. Ignore it on older builds.
}
return true;
}
bool WgcSession::initialize(HMONITOR monitor, int fps, bool captureCursor) {
fps_ = fps > 0 ? fps : 60;
if (!createD3DDevice()) {
return false;
}
if (!createCaptureItem(monitor)) {
return false;
}
framePool_ = wgcap::Direct3D11CaptureFramePool::CreateFreeThreaded(
winrtDevice_,
wgdx::DirectXPixelFormat::B8G8R8A8UIntNormalized,
2,
item_.Size());
session_ = framePool_.CreateCaptureSession(item_);
if (!applySessionOptions(captureCursor)) {
return false;
}
frameArrivedToken_ = framePool_.FrameArrived({this, &WgcSession::onFrameArrived});
return true;
}
bool WgcSession::initialize(HWND window, int fps, bool captureCursor) {
fps_ = fps > 0 ? fps : 60;
if (!createD3DDevice()) {
return false;
}
if (!createCaptureItem(window)) {
return false;
}
framePool_ = wgcap::Direct3D11CaptureFramePool::CreateFreeThreaded(
winrtDevice_,
wgdx::DirectXPixelFormat::B8G8R8A8UIntNormalized,
2,
item_.Size());
session_ = framePool_.CreateCaptureSession(item_);
if (!applySessionOptions(captureCursor)) {
return false;
}
frameArrivedToken_ = framePool_.FrameArrived({this, &WgcSession::onFrameArrived});
return true;
}
void WgcSession::setFrameCallback(FrameCallback callback) {
std::scoped_lock lock(callbackMutex_);
frameCallback_ = std::move(callback);
}
bool WgcSession::start() {
if (!session_) {
return false;
}
if (!applySessionOptions(captureCursor_)) {
return false;
}
session_.StartCapture();
started_ = true;
return true;
}
void WgcSession::stop() {
if (framePool_) {
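// C++/WinRT: calling the event accessor with a token revokes the handler
// that initialize() registered, so no callback fires during teardown.
framePool_.FrameArrived(frameArrivedToken_);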
}
if (session_) {
session_.Close();
session_ = nullptr;
}
if (framePool_) {
framePool_.Close();
framePool_ = nullptr;
}
item_ = nullptr;
winrtDevice_ = nullptr;
d3dContext_.Reset();
d3dDevice_.Reset();
started_ = false;
}
void WgcSession::onFrameArrived(
wgcap::Direct3D11CaptureFramePool const& sender,
wf::IInspectable const&) {
auto frame = sender.TryGetNextFrame();
if (!frame) {
return;
}
auto surface = frame.Surface();
auto access = surface.as<::Windows::Graphics::DirectX::Direct3D11::IDirect3DDxgiInterfaceAccess>();
Microsoft::WRL::ComPtr<ID3D11Texture2D> texture;
HRESULT hr = access->GetInterface(__uuidof(ID3D11Texture2D), reinterpret_cast<void**>(texture.GetAddressOf()));
if (FAILED(hr) || !texture) {
return;
}
FrameCallback callback;
{
std::scoped_lock lock(callbackMutex_);
callback = frameCallback_;
}
if (callback) {
callback(texture.Get(), timeSpanToHns(frame.SystemRelativeTime()));
}
frame.Close();
}
int WgcSession::captureWidth() const {
return width_;
}
int WgcSession::captureHeight() const {
return height_;
}
ID3D11Device* WgcSession::device() const {
return d3dDevice_.Get();
}
ID3D11DeviceContext* WgcSession::context() const {
return d3dContext_.Get();
}
@@ -0,0 +1,59 @@
#pragma once
#include <Windows.h>
#include <d3d11.h>
#include <windows.graphics.capture.h>
#include <windows.graphics.directx.direct3d11.interop.h>
#include <winrt/Windows.Foundation.h>
#include <winrt/Windows.Graphics.Capture.h>
#include <winrt/Windows.Graphics.DirectX.Direct3D11.h>
#include <wrl/client.h>
#include <functional>
#include <mutex>
class WgcSession {
public:
using FrameCallback = std::function<void(ID3D11Texture2D*, int64_t)>;
WgcSession() = default;
~WgcSession();
WgcSession(const WgcSession&) = delete;
WgcSession& operator=(const WgcSession&) = delete;
bool initialize(HMONITOR monitor, int fps, bool captureCursor);
bool initialize(HWND window, int fps, bool captureCursor);
void setFrameCallback(FrameCallback callback);
bool start();
void stop();
int captureWidth() const;
int captureHeight() const;
ID3D11Device* device() const;
ID3D11DeviceContext* context() const;
private:
bool createD3DDevice();
bool createCaptureItem(HMONITOR monitor);
bool createCaptureItem(HWND window);
bool applySessionOptions(bool captureCursor);
void onFrameArrived(
winrt::Windows::Graphics::Capture::Direct3D11CaptureFramePool const& sender,
winrt::Windows::Foundation::IInspectable const&);
Microsoft::WRL::ComPtr<ID3D11Device> d3dDevice_;
Microsoft::WRL::ComPtr<ID3D11DeviceContext> d3dContext_;
winrt::Windows::Graphics::DirectX::Direct3D11::IDirect3DDevice winrtDevice_{nullptr};
winrt::Windows::Graphics::Capture::GraphicsCaptureItem item_{nullptr};
winrt::Windows::Graphics::Capture::Direct3D11CaptureFramePool framePool_{nullptr};
winrt::Windows::Graphics::Capture::GraphicsCaptureSession session_{nullptr};
winrt::event_token frameArrivedToken_{};
FrameCallback frameCallback_;
std::mutex callbackMutex_;
int width_ = 0;
int height_ = 0;
int fps_ = 60;
bool captureCursor_ = false;
bool started_ = false;
};
@@ -0,0 +1,308 @@
import { contextBridge, ipcRenderer } from "electron";
import type {
AddGuideMarkerInput,
DiscardGuideSessionInput,
ExportGuideInput,
FinalizeGuideEventsInput,
GenerateGuideDraftInput,
RunGuideOcrInput,
SaveGuideAiSettingsInput,
SaveGuideInput,
WriteGuideSnapshotInput,
} from "../src/guide/contracts";
import type { NativeMacRecordingRequest } from "../src/lib/nativeMacRecording";
import type { NativeWindowsRecordingRequest } from "../src/lib/nativeWindowsRecording";
import type { RecordingSession, StoreRecordedSessionInput } from "../src/lib/recordingSession";
import { NATIVE_BRIDGE_CHANNEL, type NativeBridgeRequest } from "../src/native/contracts";
// Asset base URL is passed from the main process via webPreferences.additionalArguments
// (see windows.ts). Sandboxed preloads cannot import node:path / node:url, so we
// can't compute it here.
const ASSET_BASE_URL_ARG_PREFIX = "--asset-base-url=";
const assetBaseUrlArg = process.argv.find((arg) => arg.startsWith(ASSET_BASE_URL_ARG_PREFIX));
const assetBaseUrl = assetBaseUrlArg ? assetBaseUrlArg.slice(ASSET_BASE_URL_ARG_PREFIX.length) : "";
contextBridge.exposeInMainWorld("electronAPI", {
assetBaseUrl,
invokeNativeBridge: <TData>(request: NativeBridgeRequest) => {
return ipcRenderer.invoke(NATIVE_BRIDGE_CHANNEL, request) as Promise<TData>;
},
guide: {
startSession: (recordingId: string | number) => {
return ipcRenderer.invoke("guide:start-session", recordingId);
},
readSession: (recordingId: string | number) => {
return ipcRenderer.invoke("guide:read-session", recordingId);
},
addMarker: (input: AddGuideMarkerInput) => {
return ipcRenderer.invoke("guide:add-marker", input);
},
finalizeEvents: (input: FinalizeGuideEventsInput) => {
return ipcRenderer.invoke("guide:finalize-events", input);
},
writeSnapshot: (input: WriteGuideSnapshotInput) => {
return ipcRenderer.invoke("guide:write-snapshot", input);
},
runOcr: (input: RunGuideOcrInput) => {
return ipcRenderer.invoke("guide:run-ocr", input);
},
generateDraft: (input: GenerateGuideDraftInput) => {
return ipcRenderer.invoke("guide:generate-draft", input);
},
getAiSettings: () => {
return ipcRenderer.invoke("guide:get-ai-settings");
},
saveAiSettings: (input: SaveGuideAiSettingsInput) => {
return ipcRenderer.invoke("guide:save-ai-settings", input);
},
saveGuide: (input: SaveGuideInput) => {
return ipcRenderer.invoke("guide:save-guide", input);
},
exportMarkdown: (input: ExportGuideInput) => {
return ipcRenderer.invoke("guide:export-markdown", input);
},
exportHtml: (input: ExportGuideInput) => {
return ipcRenderer.invoke("guide:export-html", input);
},
discardSession: (input: DiscardGuideSessionInput) => {
return ipcRenderer.invoke("guide:discard-session", input);
},
},
hudOverlayHide: () => {
ipcRenderer.send("hud-overlay-hide");
},
hudOverlayClose: () => {
ipcRenderer.send("hud-overlay-close");
},
setHudOverlayIgnoreMouseEvents: (ignore: boolean) => {
ipcRenderer.send("hud-overlay-ignore-mouse-events", ignore);
},
moveHudOverlayBy: (deltaX: number, deltaY: number) => {
ipcRenderer.send("hud-overlay-move-by", deltaX, deltaY);
},
getSources: async (opts: Electron.SourcesOptions) => {
return await ipcRenderer.invoke("get-sources", opts);
},
switchToEditor: () => {
return ipcRenderer.invoke("switch-to-editor");
},
switchToHud: () => {
return ipcRenderer.invoke("switch-to-hud");
},
startNewRecording: () => {
return ipcRenderer.invoke("start-new-recording");
},
openSourceSelector: () => {
return ipcRenderer.invoke("open-source-selector");
},
selectSource: (source: ProcessedDesktopSource) => {
return ipcRenderer.invoke("select-source", source);
},
getSelectedSource: () => {
return ipcRenderer.invoke("get-selected-source");
},
requestCameraAccess: () => {
return ipcRenderer.invoke("request-camera-access");
},
requestScreenAccess: () => {
return ipcRenderer.invoke("request-screen-access");
},
requestNativeMacCursorAccess: () => {
return ipcRenderer.invoke("request-native-mac-cursor-access");
},
storeRecordedVideo: (videoData: ArrayBuffer, fileName: string) => {
return ipcRenderer.invoke("store-recorded-video", videoData, fileName);
},
storeRecordedSession: (payload: StoreRecordedSessionInput) => {
return ipcRenderer.invoke("store-recorded-session", payload);
},
openRecordingStream: (fileName: string) => {
return ipcRenderer.invoke("open-recording-stream", fileName);
},
appendRecordingChunk: (fileName: string, chunk: ArrayBuffer) => {
return ipcRenderer.invoke("append-recording-chunk", fileName, chunk);
},
closeRecordingStream: (fileName: string) => {
return ipcRenderer.invoke("close-recording-stream", fileName);
},
getRecordedVideoPath: () => {
return ipcRenderer.invoke("get-recorded-video-path");
},
setRecordingState: (
recording: boolean,
recordingId?: number,
cursorCaptureMode?: import("../src/lib/recordingSession").CursorCaptureMode,
) => {
return ipcRenderer.invoke("set-recording-state", recording, recordingId, cursorCaptureMode);
},
isNativeWindowsCaptureAvailable: () => {
return ipcRenderer.invoke("is-native-windows-capture-available");
},
isNativeMacCaptureAvailable: () => {
return ipcRenderer.invoke("is-native-mac-capture-available");
},
startNativeWindowsRecording: (request: NativeWindowsRecordingRequest) => {
return ipcRenderer.invoke("start-native-windows-recording", request);
},
stopNativeWindowsRecording: (discard?: boolean) => {
return ipcRenderer.invoke("stop-native-windows-recording", discard);
},
pauseNativeWindowsRecording: () => {
return ipcRenderer.invoke("pause-native-windows-recording");
},
resumeNativeWindowsRecording: () => {
return ipcRenderer.invoke("resume-native-windows-recording");
},
startNativeMacRecording: (request: NativeMacRecordingRequest) => {
return ipcRenderer.invoke("start-native-mac-recording", request);
},
pauseNativeMacRecording: () => {
return ipcRenderer.invoke("pause-native-mac-recording");
},
resumeNativeMacRecording: () => {
return ipcRenderer.invoke("resume-native-mac-recording");
},
stopNativeMacRecording: (discard?: boolean) => {
return ipcRenderer.invoke("stop-native-mac-recording", discard);
},
attachNativeMacWebcamRecording: (payload: {
screenVideoPath: string;
recordingId: number;
webcam: { fileName: string; videoData: ArrayBuffer };
cursorCaptureMode?: import("../src/lib/recordingSession").CursorCaptureMode;
}) => {
return ipcRenderer.invoke("attach-native-mac-webcam-recording", payload);
},
getCursorTelemetry: (videoPath?: string) => {
return ipcRenderer.invoke("get-cursor-telemetry", videoPath);
},
discardCursorTelemetry: (recordingId: number) => {
return ipcRenderer.invoke("discard-cursor-telemetry", recordingId);
},
onStopRecordingFromTray: (callback: () => void) => {
const listener = () => callback();
ipcRenderer.on("stop-recording-from-tray", listener);
return () => ipcRenderer.removeListener("stop-recording-from-tray", listener);
},
openExternalUrl: (url: string) => {
return ipcRenderer.invoke("open-external-url", url);
},
pickExportSavePath: (fileName: string, exportFolder?: string) => {
return ipcRenderer.invoke("pick-export-save-path", fileName, exportFolder);
},
writeExportToPath: (videoData: ArrayBuffer, filePath: string) => {
return ipcRenderer.invoke("write-export-to-path", videoData, filePath);
},
openVideoFilePicker: () => {
return ipcRenderer.invoke("open-video-file-picker");
},
setCurrentVideoPath: (path: string) => {
return ipcRenderer.invoke("set-current-video-path", path);
},
setCurrentRecordingSession: (session: RecordingSession | null) => {
return ipcRenderer.invoke("set-current-recording-session", session);
},
getCurrentVideoPath: () => {
return ipcRenderer.invoke("get-current-video-path");
},
getCurrentRecordingSession: () => {
return ipcRenderer.invoke("get-current-recording-session");
},
readBinaryFile: (filePath: string) => {
return ipcRenderer.invoke("read-binary-file", filePath);
},
preparePreviewAudioTrack: (filePath: string) => {
return ipcRenderer.invoke("prepare-preview-audio-track", filePath);
},
clearCurrentVideoPath: () => {
return ipcRenderer.invoke("clear-current-video-path");
},
saveProjectFile: (projectData: unknown, suggestedName?: string, existingProjectPath?: string) => {
return ipcRenderer.invoke("save-project-file", projectData, suggestedName, existingProjectPath);
},
loadProjectFile: () => {
return ipcRenderer.invoke("load-project-file");
},
loadCurrentProjectFile: () => {
return ipcRenderer.invoke("load-current-project-file");
},
onMenuLoadProject: (callback: () => void) => {
const listener = () => callback();
ipcRenderer.on("menu-load-project", listener);
return () => ipcRenderer.removeListener("menu-load-project", listener);
},
onMenuSaveProject: (callback: () => void) => {
const listener = () => callback();
ipcRenderer.on("menu-save-project", listener);
return () => ipcRenderer.removeListener("menu-save-project", listener);
},
onMenuSaveProjectAs: (callback: () => void) => {
const listener = () => callback();
ipcRenderer.on("menu-save-project-as", listener);
return () => ipcRenderer.removeListener("menu-save-project-as", listener);
},
getPlatform: () => {
return ipcRenderer.invoke("get-platform");
},
revealInFolder: (filePath: string) => {
return ipcRenderer.invoke("reveal-in-folder", filePath);
},
getShortcuts: () => {
return ipcRenderer.invoke("get-shortcuts");
},
saveShortcuts: (shortcuts: unknown) => {
return ipcRenderer.invoke("save-shortcuts", shortcuts);
},
setLocale: (locale: string) => {
return ipcRenderer.invoke("set-locale", locale);
},
saveDiagnostic: (payload: {
error: string;
stack?: string;
projectState: unknown;
logs: string[];
}) => {
return ipcRenderer.invoke("save-diagnostic", payload);
},
setMicrophoneExpanded: (expanded: boolean) => {
ipcRenderer.send("hud:setMicrophoneExpanded", expanded);
},
setHasUnsavedChanges: (hasChanges: boolean) => {
ipcRenderer.send("set-has-unsaved-changes", hasChanges);
},
showCountdownOverlay: (value: number, runId: number) => {
return ipcRenderer.invoke("countdown-overlay-show", value, runId);
},
setCountdownOverlayValue: (value: number, runId: number) => {
return ipcRenderer.invoke("countdown-overlay-set-value", value, runId);
},
hideCountdownOverlay: (runId: number) => {
return ipcRenderer.invoke("countdown-overlay-hide", runId);
},
onCountdownOverlayValue: (callback: (value: number | null) => void) => {
const listener = (_event: unknown, value: number | null) => callback(value);
ipcRenderer.on("countdown-overlay-value", listener);
return () => ipcRenderer.removeListener("countdown-overlay-value", listener);
},
onRequestSaveBeforeClose: (callback: () => Promise<boolean> | boolean) => {
const listener = async () => {
try {
const shouldClose = await callback();
ipcRenderer.send("save-before-close-done", shouldClose);
} catch {
ipcRenderer.send("save-before-close-done", false);
}
};
ipcRenderer.on("request-save-before-close", listener);
return () => ipcRenderer.removeListener("request-save-before-close", listener);
},
onRequestCloseConfirm: (callback: () => void) => {
const listener = () => callback();
ipcRenderer.on("request-close-confirm", listener);
return () => ipcRenderer.removeListener("request-close-confirm", listener);
},
sendCloseConfirmResponse: (choice: "save" | "discard" | "cancel") => {
ipcRenderer.send("close-confirm-response", choice);
},
});
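The asset-base-URL handoff described at the top of this preload can be exercised in isolation. The following is a minimal sketch, not the project's code: `readAssetBaseUrl` is a hypothetical helper name, and it takes the argument list as a parameter (instead of reading `process.argv`) so it can run outside Electron.

```typescript
// Hypothetical helper for illustration; mirrors the prefix used in the preload.
const ASSET_BASE_URL_PREFIX = "--asset-base-url=";

// Returns the URL that the main process appended to the renderer's argv,
// or an empty string when the argument is absent.
function readAssetBaseUrl(argv: readonly string[]): string {
  const arg = argv.find((candidate) => candidate.startsWith(ASSET_BASE_URL_PREFIX));
  return arg ? arg.slice(ASSET_BASE_URL_PREFIX.length) : "";
}

// Example argv; the file URL below is made up for the example.
const exampleArgv = ["electron", "--type=renderer", "--asset-base-url=file:///opt/app/resources/"];
console.log(readAssetBaseUrl(exampleArgv));
```

Falling back to an empty string keeps the exposed `assetBaseUrl` a plain string in every case, so renderer code does not need a null check before concatenating asset paths.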
@@ -0,0 +1,97 @@
import fs from "node:fs/promises";
import { fixParsedWebmDuration } from "@fix-webm-duration/fix";
import { WebmFile } from "@fix-webm-duration/parser";
export type DurationPatchResult =
| { patched: true }
| { patched: false; reason: "no-section" | "already-valid" | "io-error" | "internal" };
/**
* Patch the WebM Duration header on a finalized recording file.
*
* Browser MediaRecorder writes WebM with no Duration EBML element. With the
* streaming-to-disk path the renderer never holds the blob, so the historical
* `fixWebmDuration(blob, durationMs)` call can't run. Patching on disk after
* `WriteStream.end()` produces an equivalent result: the editor's seek bar and
* timeline read a real duration instead of `N/A`.
*
* Atomic by design: writes the patched bytes to `<filePath>.duration-patch.tmp`
* and renames in place. If the process crashes mid-rewrite, the original file
* survives intact, so the user never loses their recording to a partial write.
*
* Best-effort by intent: any failure (read, parse, write) logs and returns a
* non-`patched` result rather than throwing. The file is still playable without
* the patch (decoders walk frames sequentially); the only cost is that the
* editor's seek bar and timeline break until it is patched.
*
* Memory: reads the whole file into a main-process Buffer, the same footprint
* as the pre-streaming renderer path, just on the side without V8's heap cap.
*/
export async function patchWebmDurationOnDisk(
filePath: string,
durationMs: number,
): Promise<DurationPatchResult> {
try {
const fileBytes = await fs.readFile(filePath);
const webm = new WebmFile(new Uint8Array(fileBytes));
const patched = fixParsedWebmDuration(webm, durationMs, { logger: false });
if (!patched) {
// fixParsedWebmDuration returns false for: missing Segment, missing
// Info, or a Duration that is already valid. The first two mean a
// malformed (most likely truncated) file; the third is a no-op.
const reason = inferUnpatchedReason(webm);
if (reason === "no-section") {
console.warn(
`[webm-duration] no Segment/Info section in ${filePath}; file may be truncated`,
);
}
return { patched: false, reason };
}
if (!webm.source) {
console.error(`[webm-duration] patched but source missing for ${filePath}`);
return { patched: false, reason: "internal" };
}
const tmpPath = `${filePath}.duration-patch.tmp`;
const patchedBytes = Buffer.from(
webm.source.buffer,
webm.source.byteOffset,
webm.source.byteLength,
);
try {
await fs.writeFile(tmpPath, patchedBytes);
await fs.rename(tmpPath, filePath);
return { patched: true };
} catch (writeError) {
console.error(`[webm-duration] failed to write patched ${filePath}:`, writeError);
// Best-effort cleanup of the temp file; if unlink also fails, leave it.
// The original recording is untouched because the rename never ran.
await fs.unlink(tmpPath).catch(() => undefined);
return { patched: false, reason: "io-error" };
}
} catch (error) {
console.error(`[webm-duration] failed to patch ${filePath}:`, error);
return { patched: false, reason: "io-error" };
}
}
/**
* Distinguish "no Segment/Info section" (malformed/truncated file) from "Info
* present but Duration already valid" (patch unnecessary).
*
* The IDs are the length-descriptor-stripped form that @fix-webm-duration/parser
* uses as its lookup keys (Segment `0x8538067`, Info `0x549a966`), verified
* against the parser's `src/lib/sections.js`. They are not the canonical
* 4-byte EBML IDs (`0x18538067` / `0x1549A966`), which this parser's
* `getSectionById` would never match.
*/
function inferUnpatchedReason(webm: WebmFile): "no-section" | "already-valid" {
const segment = webm.getSectionById?.(0x8538067);
if (!segment) return "no-section";
const info = (
segment as unknown as { getSectionById?: (id: number) => unknown }
).getSectionById?.(0x549a966);
return info ? "already-valid" : "no-section";
}
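The relation between the two ID forms quoted in the comment above can be checked with a few lines of arithmetic. This is a sketch under an assumption: that the "length-descriptor-stripped" form is the canonical EBML ID with its leading marker bit cleared. That reading reproduces both pairs quoted in the comment, but whether the parser derives its keys this way internally is not confirmed here. `stripLengthDescriptor` is a hypothetical name, not a parser API.

```typescript
// Hypothetical helper for illustration; not part of @fix-webm-duration/parser.
// Clears the highest set bit of a canonical EBML ID (the length marker).
function stripLengthDescriptor(canonicalId: number): number {
  // EBML element IDs are at most 4 bytes, so they fit below 0x20000000.
  if (!Number.isInteger(canonicalId) || canonicalId <= 0 || canonicalId > 0x1fffffff) {
    throw new RangeError(`not a valid EBML element ID: ${canonicalId}`);
  }
  const markerBit = 31 - Math.clz32(canonicalId);
  return canonicalId & ~(1 << markerBit);
}

console.log(stripLengthDescriptor(0x18538067).toString(16)); // Segment
console.log(stripLengthDescriptor(0x1549a966).toString(16)); // Info
```

Under that assumption the two calls yield `8538067` and `549a966`, the keys passed to `getSectionById` in `inferUnpatchedReason`.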
@@ -0,0 +1,263 @@
import path from "node:path";
import { fileURLToPath, pathToFileURL } from "node:url";
import { BrowserWindow, ipcMain, screen } from "electron";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
const APP_ROOT = path.join(__dirname, "..");
const VITE_DEV_SERVER_URL = process.env["VITE_DEV_SERVER_URL"];
const RENDERER_DIST = path.join(APP_ROOT, "dist");
const HEADLESS = process.env["HEADLESS"] === "true";
// Asset base URL for renderer (wallpapers, etc.). Packaged: extraResources copies
// public/wallpapers -> resources/wallpapers. Unpackaged: <appRoot>/public/.
const ASSET_BASE_DIR = process.defaultApp
? path.join(__dirname, "..", "public")
: process.resourcesPath;
const ASSET_BASE_URL_ARG = `--asset-base-url=${pathToFileURL(`${ASSET_BASE_DIR}${path.sep}`).toString()}`;
let hudOverlayWindow: BrowserWindow | null = null;
ipcMain.on("hud-overlay-hide", () => {
if (hudOverlayWindow && !hudOverlayWindow.isDestroyed()) {
hudOverlayWindow.minimize();
}
});
ipcMain.on("hud-overlay-ignore-mouse-events", (_event, ignore: boolean) => {
if (hudOverlayWindow && !hudOverlayWindow.isDestroyed()) {
hudOverlayWindow.setIgnoreMouseEvents(ignore, { forward: true });
}
});
ipcMain.on("hud-overlay-move-by", (_event, deltaX: number, deltaY: number) => {
if (
!hudOverlayWindow ||
hudOverlayWindow.isDestroyed() ||
!Number.isFinite(deltaX) ||
!Number.isFinite(deltaY)
) {
return;
}
const [x, y] = hudOverlayWindow.getPosition();
hudOverlayWindow.setPosition(Math.round(x + deltaX), Math.round(y + deltaY), false);
});
/**
* Creates the always-on-top HUD overlay window centred at the bottom of the
* primary display. The window is frameless, transparent, and follows the user
* across macOS Spaces so it is never lost when switching virtual desktops.
*/
export function createHudOverlayWindow(): BrowserWindow {
const primaryDisplay = screen.getPrimaryDisplay();
const { workArea } = primaryDisplay;
const windowWidth = 600;
const windowHeight = 160;
const x = Math.floor(workArea.x + (workArea.width - windowWidth) / 2);
const y = Math.floor(workArea.y + workArea.height - windowHeight - 5);
const win = new BrowserWindow({
width: windowWidth,
height: windowHeight,
minWidth: 600,
maxWidth: 600,
minHeight: 160,
maxHeight: 160,
x: x,
y: y,
frame: false,
transparent: true,
resizable: false,
alwaysOnTop: true,
skipTaskbar: true,
hasShadow: false,
show: !HEADLESS,
webPreferences: {
preload: path.join(__dirname, "preload.mjs"),
additionalArguments: [ASSET_BASE_URL_ARG],
nodeIntegration: false,
contextIsolation: true,
backgroundThrottling: false,
},
});
win.setIgnoreMouseEvents(true, { forward: true });
// Follow the user across macOS Spaces (virtual desktops).
// Without this the HUD stays pinned to the Space it was first opened on.
if (process.platform === "darwin") {
win.setVisibleOnAllWorkspaces(true, { visibleOnFullScreen: true });
}
win.webContents.on("did-finish-load", () => {
win?.webContents.send("main-process-message", new Date().toLocaleString());
});
hudOverlayWindow = win;
win.on("closed", () => {
if (hudOverlayWindow === win) {
hudOverlayWindow = null;
}
});
if (VITE_DEV_SERVER_URL) {
win.loadURL(VITE_DEV_SERVER_URL + "?windowType=hud-overlay");
} else {
win.loadFile(path.join(RENDERER_DIST, "index.html"), {
query: { windowType: "hud-overlay" },
});
}
return win;
}
/**
* Creates the main editor window. Starts maximised with a hidden title bar on
* macOS. This window is not always-on-top and appears in the taskbar/dock.
*/
export function createEditorWindow(): BrowserWindow {
const isMac = process.platform === "darwin";
const win = new BrowserWindow({
width: 1200,
height: 800,
minWidth: 800,
minHeight: 600,
...(isMac && {
titleBarStyle: "hiddenInset",
trafficLightPosition: { x: 12, y: 12 },
}),
transparent: false,
resizable: true,
alwaysOnTop: false,
skipTaskbar: false,
title: "OpenScreen",
backgroundColor: "#000000",
show: !HEADLESS,
webPreferences: {
preload: path.join(__dirname, "preload.mjs"),
additionalArguments: [ASSET_BASE_URL_ARG],
nodeIntegration: false,
contextIsolation: true,
webSecurity: false,
backgroundThrottling: false,
},
});
// Maximize the window by default
win.maximize();
win.webContents.on("did-finish-load", () => {
win?.webContents.send("main-process-message", new Date().toLocaleString());
});
if (VITE_DEV_SERVER_URL) {
win.loadURL(VITE_DEV_SERVER_URL + "?windowType=editor");
} else {
win.loadFile(path.join(RENDERER_DIST, "index.html"), {
query: { windowType: "editor" },
});
}
return win;
}
/**
* Creates the floating source-selector window used to pick a screen or window
* to record. Frameless, transparent, and follows the user across macOS Spaces.
*/
export function createSourceSelectorWindow(): BrowserWindow {
const { width, height } = screen.getPrimaryDisplay().workAreaSize;
const win = new BrowserWindow({
width: 620,
height: 420,
minHeight: 350,
maxHeight: 500,
x: Math.round((width - 620) / 2),
y: Math.round((height - 420) / 2),
frame: false,
resizable: false,
alwaysOnTop: true,
transparent: true,
backgroundColor: "#00000000",
webPreferences: {
preload: path.join(__dirname, "preload.mjs"),
additionalArguments: [ASSET_BASE_URL_ARG],
nodeIntegration: false,
contextIsolation: true,
},
});
// Follow the user across macOS Spaces so the selector appears on the
// active desktop regardless of where the HUD was originally opened.
if (process.platform === "darwin") {
win.setVisibleOnAllWorkspaces(true, { visibleOnFullScreen: true });
}
if (VITE_DEV_SERVER_URL) {
win.loadURL(VITE_DEV_SERVER_URL + "?windowType=source-selector");
} else {
win.loadFile(path.join(RENDERER_DIST, "index.html"), {
query: { windowType: "source-selector" },
});
}
return win;
}
/**
* Creates a centered transparent countdown overlay window that sits above the
* HUD while recording pre-roll is running.
*/
export function createCountdownOverlayWindow(): BrowserWindow {
const { workArea } = screen.getPrimaryDisplay();
const overlayWidth = 420;
const overlayHeight = 260;
const win = new BrowserWindow({
width: overlayWidth,
height: overlayHeight,
minWidth: overlayWidth,
maxWidth: overlayWidth,
minHeight: overlayHeight,
maxHeight: overlayHeight,
x: Math.round(workArea.x + (workArea.width - overlayWidth) / 2),
y: Math.round(workArea.y + (workArea.height - overlayHeight) / 2),
frame: false,
resizable: false,
alwaysOnTop: true,
skipTaskbar: true,
focusable: false,
transparent: true,
backgroundColor: "#00000000",
hasShadow: false,
show: false,
webPreferences: {
preload: path.join(__dirname, "preload.mjs"),
additionalArguments: [ASSET_BASE_URL_ARG],
nodeIntegration: false,
contextIsolation: true,
backgroundThrottling: false,
},
});
win.setIgnoreMouseEvents(true);
if (process.platform === "darwin") {
win.setVisibleOnAllWorkspaces(true, { visibleOnFullScreen: true });
}
if (VITE_DEV_SERVER_URL) {
win.loadURL(VITE_DEV_SERVER_URL + "?windowType=countdown-overlay");
} else {
win.loadFile(path.join(RENDERER_DIST, "index.html"), {
query: { windowType: "countdown-overlay" },
});
}
return win;
}
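The HUD placement arithmetic in `createHudOverlayWindow` can be read as a small pure function. The sketch below is illustrative only: `bottomCentered` and the `Rect` type are hypothetical names, and the 5 px margin is copied from the literal in the window code above.

```typescript
// Hypothetical types and helper for illustration; not part of the diff.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Centers a window horizontally in the work area and pins it to the bottom
// edge, leaving `margin` pixels of space beneath it.
function bottomCentered(
  workArea: Rect,
  windowWidth: number,
  windowHeight: number,
  margin = 5,
): { x: number; y: number } {
  return {
    x: Math.floor(workArea.x + (workArea.width - windowWidth) / 2),
    y: Math.floor(workArea.y + workArea.height - windowHeight - margin),
  };
}

// Example: a 1920x1055 work area that starts 25 px below a top menu bar.
const position = bottomCentered({ x: 0, y: 25, width: 1920, height: 1055 }, 600, 160);
console.log(position); // { x: 660, y: 915 }
```

Adding `workArea.x` and `workArea.y` matters when the work area does not start at the screen origin, for example with a top menu bar or a left-docked taskbar.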
@@ -0,0 +1,27 @@
{
"nodes": {
"nixpkgs": {
"locked": {
"lastModified": 1775710090,
"narHash": "sha256-ar3rofg+awPB8QXDaFJhJ2jJhu+KqN/PRCXeyuXR76E=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "4c1018dae018162ec878d42fec712642d214fdfa",
"type": "github"
},
"original": {
"owner": "NixOS",
"ref": "nixos-unstable",
"repo": "nixpkgs",
"type": "github"
}
},
"root": {
"inputs": {
"nixpkgs": "nixpkgs"
}
}
},
"root": "root",
"version": 7
}
@@ -0,0 +1,122 @@
{
description = "OpenScreen desktop screen recorder with built-in editor";
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
};
outputs =
{ self, nixpkgs }:
let
systems = [
"x86_64-linux"
"aarch64-linux"
];
forAllSystems = f: nixpkgs.lib.genAttrs systems (system: f nixpkgs.legacyPackages.${system});
in
{
# -- Per-system outputs (packages, dev shells) --
packages = forAllSystems (pkgs: {
openscreen = pkgs.callPackage ./nix/package.nix { };
default = self.packages.${pkgs.stdenv.hostPlatform.system}.openscreen;
});
devShells = forAllSystems (
pkgs:
let
electron = pkgs.electron;
# Libraries Electron needs at runtime on Linux
runtimeLibs = with pkgs; [
# X11
libx11
libxcomposite
libxdamage
libxext
libxfixes
libxrandr
libxtst
libxcb
libxshmfence
# Wayland
wayland
# GTK / UI toolkit
gtk3
glib
pango
cairo
gdk-pixbuf
atk
at-spi2-atk
at-spi2-core
# Graphics
mesa
libGL
libdrm
vulkan-loader
# Networking / crypto (NSS for Chromium)
nss
nspr
# Audio
alsa-lib
pipewire
pulseaudio
# System
dbus
cups
expat
libnotify
libsecret
util-linux # libuuid
];
in
{
default = pkgs.mkShell {
packages = with pkgs; [
nodejs_22
electron
# Native module compilation
python3
pkg-config
gcc
# Playwright browser tests
playwright-driver.browsers
];
# Electron's prebuilt binary needs these at runtime
LD_LIBRARY_PATH = pkgs.lib.makeLibraryPath runtimeLibs;
# Tell the npm `electron` package to use the Nix-provided binary
# instead of downloading its own. vite-plugin-electron respects this.
ELECTRON_OVERRIDE_DIST_PATH = "${electron}/libexec/electron";
# Playwright browser path for test:browser / test:e2e
PLAYWRIGHT_BROWSERS_PATH = "${pkgs.playwright-driver.browsers}";
PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD = "1";
shellHook = ''
echo "OpenScreen dev shell: node $(node --version), electron v$(electron --version 2>/dev/null | tr -d 'v')"
'';
};
}
);
# -- System-wide outputs (modules, overlay) --
overlays.default = final: _prev: {
openscreen = self.packages.${final.stdenv.hostPlatform.system}.openscreen;
};
nixosModules.default = import ./nix/module.nix self;
homeManagerModules.default = import ./nix/hm-module.nix self;
};
}
Binary file not shown.
Some files were not shown because too many files have changed in this diff.