minhvo.vercel.app
Tue Mar 03 2026

Interview Coder
Electron
TypeScript
AI
OpenAI
Gemini
Claude
Ollama
Cross-Platform
Screenshot
AI-powered coding interview assistant with multi-provider vision (OpenAI, Gemini, Claude, Ollama) and a transparent always-on-top window.
Overview
An Electron application that uses AI vision models to analyze screenshots and assist with coding interviews. It ships as two iterations:
- interview-coder — Electron v35, multi-provider screenshot solver with markdown-rendered responses in 8 languages
- interview-assistant — Electron 29 evolution adding live speech transcription, dynamic model catalog via models.dev, and an invisible window that bypasses most screen-capture methods
Key Features
- 📸 Smart Screenshot Capture — global shortcuts, window/area capture, multi-page mode
- 🤖 Multi-Provider AI — OpenAI (GPT-4 Vision / GPT-5), Google Gemini, Anthropic, Azure OpenAI, OpenRouter, and local Ollama
- 🗣️ Speech Recognition — Whisper + Gemini Audio for live interview transcription (Interview Assistant)
- 💬 Answer Suggestions — contextual answer suggestions during live interviews
- 🌐 8 Languages — English, Vietnamese, Spanish, French, German, Japanese, Korean, Chinese
- 👻 Invisible Window — transparent, frameless, always-on-top; opt-in screen-capture bypass
- ⌨️ Keyboard-First UX — move, resize, opacity, zoom — everything via shortcuts
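As a rough illustration of the Invisible Window feature, the sketch below builds the window options and applies the opt-in capture bypass. `transparent`, `frame`, `alwaysOnTop`, `setContentProtection`, and `setAlwaysOnTop` are real Electron `BrowserWindow` options and methods; the `WindowLike` interface, the function names, and the `stealth` flag are assumptions made so the example runs without Electron installed, not the project's actual code.

```typescript
// Structural subset of Electron's BrowserWindow used by this sketch, so it
// type-checks and runs without Electron installed (assumption, not app code).
interface WindowLike {
  setContentProtection(enable: boolean): void;
  setAlwaysOnTop(flag: boolean): void;
}

// Constructor options for a transparent, frameless, always-on-top window.
// The keys are real BrowserWindow options; the helper name is hypothetical.
function overlayWindowOptions() {
  return {
    transparent: true,
    frame: false,
    alwaysOnTop: true,
  };
}

// Apply the opt-in screen-capture bypass. `stealth` stands in for a
// hypothetical user setting; when false, the window stays capturable.
function applyStealth(win: WindowLike, stealth: boolean): void {
  win.setAlwaysOnTop(true);
  win.setContentProtection(stealth);
}
```

In an Electron main process this would presumably be wired up as `const win = new BrowserWindow(overlayWindowOptions())` followed by `applyStealth(win, userOptedIn)`.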
Technical Stack
- Electron 29–35 with a strict two-process boundary
- React 18 + TypeScript + Vite 6 + Tailwind 3 + Radix UI + React Query 5
- Vercel AI SDK for unified streaming across providers
- electron-store for persistent config (extends `EventEmitter` for live updates)
- models.dev catalog for 3,000+ models with pricing and context windows
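The "config that extends `EventEmitter` for live updates" idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `AppConfig` shape, the `ConfigStore` class, and the `config-updated` event name are hypothetical, and an in-memory object stands in for the electron-store persistence layer.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical config shape; the real app's keys are not given in the text.
interface AppConfig {
  provider: string;
  model: string;
  opacity: number;
}

// In-memory stand-in for the persisted config (electron-store in the app).
class ConfigStore extends EventEmitter {
  private data: AppConfig;

  constructor(defaults: AppConfig) {
    super();
    this.data = { ...defaults };
  }

  get<K extends keyof AppConfig>(key: K): AppConfig[K] {
    return this.data[key];
  }

  // Merge a partial update and notify listeners only if a value changed.
  update(patch: Partial<AppConfig>): void {
    const next = { ...this.data, ...patch };
    const changed = (Object.keys(next) as (keyof AppConfig)[]).some(
      (k) => next[k] !== this.data[k],
    );
    if (!changed) return;
    this.data = next;
    // Listeners receive a copy so they cannot mutate the stored config.
    this.emit("config-updated", { ...next });
  }
}
```

A listener such as `store.on("config-updated", cfg => ...)` could then forward the new values to the renderer without polling.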
Architecture
| Process | Directory | Runtime | Purpose |
|---|---|---|---|
| Main | electron/ | Node.js | Screenshots, AI inference, config, transcription, global shortcuts |
| Renderer | src/ | Browser | React UI with Vite, Tailwind, React Query, Radix primitives |
| Shared | shared/ | Both | AI model configuration (single source of truth) |
Dynamic Model Selection (3-tier fallback)
- Provider API — real-time model list from the user's account (requires API key)
- models.dev catalog — open-source database of 3,000+ models with pricing and context windows (no key needed)
- Static fallback: hardcoded defaults in `shared/aiModels.ts`
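The three tiers above can be sketched as an ordered list of loaders tried in turn. This is an illustrative sketch, not the project's code: `ModelInfo`, `ModelSource`, and `resolveModels` are hypothetical names, and the real loaders (provider API call, models.dev fetch) are left to the caller.

```typescript
// Hypothetical model record; field names are illustrative only.
interface ModelInfo {
  id: string;
  contextWindow?: number;
}

// One tier of the fallback chain: a label plus an async loader.
interface ModelSource {
  name: string;
  load: () => Promise<ModelInfo[]>;
}

// Try each source in order and return the first non-empty list.
// If every source fails or is empty, use the static defaults.
async function resolveModels(
  sources: ModelSource[],
  staticFallback: ModelInfo[],
): Promise<{ source: string; models: ModelInfo[] }> {
  for (const s of sources) {
    try {
      const models = await s.load();
      if (models.length > 0) return { source: s.name, models };
    } catch {
      // Missing API key, network error, bad payload: fall through to the next tier.
    }
  }
  return { source: "static", models: staticFallback };
}
```

Passing the provider loader first and the catalog loader second reproduces the order described above; the returned `source` label lets the UI show where the list came from.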
Keyboard Shortcuts
| Action | Shortcut |
|---|---|
| Toggle visibility | Ctrl/Cmd + B |
| Take screenshot | Ctrl/Cmd + H |
| Process screenshots | Ctrl/Cmd + Enter |
| Start/stop recording | Ctrl/Cmd + M |
| Move window | Ctrl/Cmd + Arrow Keys |
| Opacity / zoom | Ctrl/Cmd + [ ] / - = / 0 |
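A few of the shortcuts in the table could be registered along these lines. Electron's `globalShortcut.register(accelerator, callback)` returns a boolean and accepts `CommandOrControl` accelerators; everything else here (the `Action` names, `ACCELERATORS`, `registerShortcuts`, and the `ShortcutRegistrar` interface) is a hypothetical sketch that covers only four of the listed actions.

```typescript
// Structural subset of Electron's `globalShortcut` module, so the sketch
// runs without Electron installed (assumption, not app code).
interface ShortcutRegistrar {
  register(accelerator: string, callback: () => void): boolean;
}

type Action =
  | "toggleVisibility"
  | "takeScreenshot"
  | "processScreenshots"
  | "toggleRecording";

// Ctrl/Cmd maps to Electron's cross-platform "CommandOrControl" modifier.
const ACCELERATORS: Record<Action, string> = {
  toggleVisibility: "CommandOrControl+B",
  takeScreenshot: "CommandOrControl+H",
  processScreenshots: "CommandOrControl+Enter",
  toggleRecording: "CommandOrControl+M",
};

// Register every handler and return the actions whose accelerator could not
// be claimed (for example, because another application already holds it).
function registerShortcuts(
  reg: ShortcutRegistrar,
  handlers: Record<Action, () => void>,
): Action[] {
  const failed: Action[] = [];
  for (const action of Object.keys(ACCELERATORS) as Action[]) {
    if (!reg.register(ACCELERATORS[action], handlers[action])) {
      failed.push(action);
    }
  }
  return failed;
}
```

In the main process the registrar would be Electron's `globalShortcut`, called once the app is ready; reporting the failed actions lets the UI warn about conflicts.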
Implementation Highlights
- Provider abstraction: a single `ProviderClientFactory` returns a unified client for OpenAI / Gemini / Anthropic / Azure / OpenRouter, so the UI never branches on provider
- Privacy-first: API keys are stored locally in `~/Library/Application Support/interview-assistant/config.json` (or `%APPDATA%` on Windows); data leaves the machine only when sent to the configured provider
- Invisibility: transparent frameless `BrowserWindow` with `setContentProtection`, so most screen recorders capture an empty rectangle
- Streaming: real-time markdown rendering, with Whisper-driven transcripts piped through an answer-assistant prompt
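The provider abstraction might look roughly like the sketch below. Only the name `ProviderClientFactory` comes from the text; the `ProviderClient` interface, the `streamAnswer` method, the registration API, and the `echoClient` stub (which stands in for a real SDK call) are assumptions for illustration.

```typescript
type ProviderId = "openai" | "gemini" | "anthropic" | "azure" | "openrouter";

// The single interface the UI sees; `streamAnswer` is a hypothetical method.
interface ProviderClient {
  readonly provider: ProviderId;
  streamAnswer(prompt: string, onChunk: (text: string) => void): Promise<void>;
}

type ClientBuilder = (apiKey: string) => ProviderClient;

class ProviderClientFactory {
  private builders = new Map<ProviderId, ClientBuilder>();

  // Each provider registers how to build its client once, at startup.
  register(id: ProviderId, build: ClientBuilder): this {
    this.builders.set(id, build);
    return this;
  }

  // Callers pick a provider by id and never touch provider-specific code.
  create(id: ProviderId, apiKey: string): ProviderClient {
    const build = this.builders.get(id);
    if (!build) throw new Error(`No client registered for provider "${id}"`);
    return build(apiKey);
  }
}

// Stub builder standing in for a real SDK: it "streams" the prompt back
// one word at a time instead of calling a remote model.
function echoClient(provider: ProviderId): ClientBuilder {
  return () => ({
    provider,
    async streamAnswer(prompt, onChunk) {
      for (const word of prompt.split(" ")) onChunk(word);
    },
  });
}
```

Because the renderer only depends on `ProviderClient`, adding a provider means registering one more builder rather than adding a branch in the UI.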
Role & Responsibilities
- Built both iterations end-to-end (Electron main, IPC, React renderer, shortcuts)
- Implemented the multi-provider abstraction and the models.dev integration
- Designed the invisible-window UX and global shortcut layer
- Owned the screenshot pipeline, vision prompts, and the transcription / answer-suggestion loop