Tue Mar 03 2026

Interview Coder

Electron

TypeScript

OpenAI

Gemini

Claude

Ollama

Cross-Platform

Screenshot

AI-powered coding interview assistant with multi-provider vision (OpenAI, Gemini, Claude, Ollama) and a transparent always-on-top window.

Overview

An Electron application that uses AI vision models to analyze screenshots and assist with coding interviews. It ships as two iterations:

interview-coder — Electron v35, multi-provider screenshot solver with markdown-rendered responses in 8 languages
interview-assistant — Electron 29 evolution adding live speech transcription, dynamic model catalog via models.dev, and an invisible window that bypasses most screen-capture methods

Key Features

📸 Smart Screenshot Capture — global shortcuts, window/area capture, multi-page mode
🤖 Multi-Provider AI — OpenAI (GPT-4 Vision / GPT-5), Google Gemini, Anthropic, Azure OpenAI, OpenRouter, and local Ollama
🗣️ Speech Recognition — Whisper + Gemini Audio for live interview transcription (Interview Assistant)
💬 Answer Suggestions — contextual answer suggestions during live interviews
🌐 8 Languages — English, Vietnamese, Spanish, French, German, Japanese, Korean, Chinese
👻 Invisible Window — transparent, frameless, always-on-top; opt-in screen-capture bypass
⌨️ Keyboard-First UX — move, resize, opacity, zoom — everything via shortcuts

Technical Stack

Electron 29–35 with a strict two-process boundary
React 18 + TypeScript + Vite 6 + Tailwind 3 + Radix UI + React Query 5
Vercel AI SDK for unified streaming across providers
electron-store for persistent config, behind a config layer that extends EventEmitter so changes propagate live
models.dev catalog for 3,000+ models with pricing and context windows
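
To illustrate the live-update idea behind the config layer, here is a minimal sketch of a config helper that extends EventEmitter. The `ConfigHelper` class, the `AppConfig` shape, and the `config-updated` event name are assumptions made for illustration; a real implementation would persist through electron-store instead of holding the config in memory.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical config shape; the real app's fields are not listed in this write-up.
interface AppConfig {
  provider: string;
  opacity: number;
}

// In-memory stand-in for a config layer backed by electron-store.
class ConfigHelper extends EventEmitter {
  private config: AppConfig;

  constructor(initial: AppConfig) {
    super();
    this.config = initial;
  }

  // Returns a copy so callers cannot mutate the stored config directly.
  get(): AppConfig {
    return { ...this.config };
  }

  // Merges a partial update and notifies listeners with the new snapshot.
  update(patch: Partial<AppConfig>): AppConfig {
    this.config = { ...this.config, ...patch };
    const snapshot = this.get();
    this.emit("config-updated", snapshot); // hypothetical event name
    return snapshot;
  }
}
```

A listener registered with `helper.on("config-updated", ...)` receives the merged config each time `update` is called, which is one way the main process could push changes to open windows.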

Architecture

Process	Directory	Runtime	Purpose
Main	`electron/`	Node.js	Screenshots, AI inference, config, transcription, global shortcuts
Renderer	`src/`	Browser	React UI with Vite, Tailwind, React Query, Radix primitives
Shared	`shared/`	Both	AI model configuration (single source of truth)

Dynamic Model Selection (3-tier fallback)

Provider API — real-time model list from the user's account (requires API key)
models.dev catalog — open-source database of 3,000+ models with pricing and context windows (no key needed)
Static fallback — hardcoded defaults in shared/aiModels.ts
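
The three tiers above can be sketched as an ordered list of sources tried in turn. This is a minimal sketch under stated assumptions: `ModelInfo`, `ModelSource`, `resolveModels`, and the stubbed tiers are hypothetical names, and the real loaders (provider API call, models.dev fetch) are replaced with stand-ins.

```typescript
// Hypothetical types; only shared/aiModels.ts is named in the project description.
type ModelInfo = { id: string; contextWindow?: number };
type ModelSource = { name: string; load: () => Promise<ModelInfo[]> };

// Try each tier in order; the first one that yields a non-empty list wins.
async function resolveModels(
  tiers: ModelSource[],
  staticFallback: ModelInfo[],
): Promise<{ source: string; models: ModelInfo[] }> {
  for (const tier of tiers) {
    try {
      const models = await tier.load();
      if (models.length > 0) return { source: tier.name, models };
    } catch {
      // Missing API key, network error, or bad payload: fall through to the next tier.
    }
  }
  return { source: "static", models: staticFallback };
}

// Stubbed tiers standing in for the real loaders.
const providerApi: ModelSource = {
  name: "provider-api",
  load: async () => {
    throw new Error("no API key configured");
  },
};
const catalog: ModelSource = {
  name: "models.dev",
  load: async () => [{ id: "example-model", contextWindow: 128_000 }],
};
const STATIC_DEFAULTS: ModelInfo[] = [{ id: "default-model" }];
```

Calling `resolveModels([providerApi, catalog], STATIC_DEFAULTS)` with these stubs resolves from the catalog tier, because the provider tier throws; with no working tier at all, the static defaults are returned.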

Keyboard Shortcuts

Action	Shortcut
Toggle visibility	`Ctrl/Cmd + B`
Take screenshot	`Ctrl/Cmd + H`
Process screenshots	`Ctrl/Cmd + Enter`
Start/stop recording	`Ctrl/Cmd + M`
Move window	`Ctrl/Cmd + Arrow Keys`
Opacity / zoom	`Ctrl/Cmd + [` / `]` (opacity), `Ctrl/Cmd + -` / `=` / `0` (zoom)
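
As a sketch of how the first four rows of the table could map onto Electron accelerator strings, the example below keeps the shortcut map in one place and takes the registration function as a parameter. The action names and `registerAll` are hypothetical; in the main process the `register` argument would be Electron's `globalShortcut.register`, which takes an accelerator and a callback and returns whether registration succeeded.

```typescript
// Action names are hypothetical; the key combinations come from the table above.
// "CommandOrControl" is Electron's cross-platform modifier for Cmd (macOS) / Ctrl (others).
const SHORTCUTS = {
  toggleVisibility: "CommandOrControl+B",
  takeScreenshot: "CommandOrControl+H",
  processScreenshots: "CommandOrControl+Enter",
  toggleRecording: "CommandOrControl+M",
} as const;

type Action = keyof typeof SHORTCUTS;
type Register = (accelerator: string, callback: () => void) => boolean;

// Registers every shortcut and returns the actions whose registration failed,
// for example because another application already owns the combination.
function registerAll(
  register: Register,
  handlers: Record<Action, () => void>,
): Action[] {
  const failed: Action[] = [];
  for (const action of Object.keys(SHORTCUTS) as Action[]) {
    if (!register(SHORTCUTS[action], handlers[action])) failed.push(action);
  }
  return failed;
}
```

Injecting `register` keeps the mapping testable outside Electron and makes failed registrations visible instead of silently ignored.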

Implementation Highlights

Provider abstraction: a single ProviderClientFactory returns a unified client for OpenAI / Gemini / Anthropic / Azure / OpenRouter, so the UI never branches on provider
Privacy-first: API keys are stored locally in ~/Library/Application Support/interview-assistant/config.json (or under %APPDATA% on Windows); data leaves the machine only when it is sent to the configured provider
Invisibility: transparent frameless BrowserWindow with setContentProtection so most screen recorders capture an empty rectangle
Streaming: real-time markdown rendering with Whisper-driven transcripts piped through an answer-assistant prompt
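
The provider abstraction can be illustrated with a small registry-style factory. This is a sketch under stated assumptions, not the project's actual code: apart from the `ProviderClientFactory` name, every type, method, and the stub client are hypothetical, and the stub returns a string where a real client would call the provider's SDK and stream the response.

```typescript
// All names other than ProviderClientFactory are hypothetical.
type ProviderId = "openai" | "gemini" | "anthropic" | "azure" | "openrouter";

// The one interface the UI depends on, regardless of provider.
interface VisionClient {
  readonly provider: ProviderId;
  solve(prompt: string, imagesBase64: string[]): Promise<string>;
}

type ClientBuilder = (apiKey: string) => VisionClient;

class ProviderClientFactory {
  private builders = new Map<ProviderId, ClientBuilder>();

  register(id: ProviderId, build: ClientBuilder): this {
    this.builders.set(id, build);
    return this;
  }

  create(id: ProviderId, apiKey: string): VisionClient {
    const build = this.builders.get(id);
    if (!build) throw new Error(`No client registered for provider "${id}"`);
    return build(apiKey);
  }
}

// Stub builder standing in for a real SDK-backed client.
function stubClient(provider: ProviderId): ClientBuilder {
  return (_apiKey: string): VisionClient => ({
    provider,
    solve: async (prompt: string, imagesBase64: string[]) =>
      `[${provider}] ${prompt} (${imagesBase64.length} image(s))`,
  });
}

const factory = new ProviderClientFactory()
  .register("openai", stubClient("openai"))
  .register("gemini", stubClient("gemini"));
```

Callers ask for `factory.create(providerId, apiKey)` and work only with `VisionClient`, so adding a provider means registering one more builder instead of touching UI code.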

Role & Responsibilities

Built both iterations end-to-end (Electron main, IPC, React renderer, shortcuts)
Implemented the multi-provider abstraction and the models.dev integration
Designed the invisible-window UX and global shortcut layer
Owned the screenshot pipeline, vision prompts, and the transcription / answer-suggestion loop