minhvo.vercel.app

Tue Mar 03 2026

Interview Coder

AI-powered coding interview assistant with multi-provider vision (OpenAI, Gemini, Claude, Ollama) and a transparent always-on-top window.

Overview

An advanced Electron application that leverages AI vision models to analyze screenshots and assist with coding interviews. It ships as two iterations:

  • interview-coder — Electron 35, multi-provider screenshot solver with markdown-rendered responses in 8 languages
  • interview-assistant — Electron 29 iteration that adds live speech transcription, a dynamic model catalog via models.dev, and an invisible window that bypasses most screen-capture methods

Key Features

  • 📸 Smart Screenshot Capture — global shortcuts, window/area capture, multi-page mode
  • 🤖 Multi-Provider AI — OpenAI (GPT-4 Vision / GPT-5), Google Gemini, Anthropic, Azure OpenAI, OpenRouter, and local Ollama
  • 🗣️ Speech Recognition — Whisper + Gemini Audio for live interview transcription (Interview Assistant)
  • 💬 Answer Suggestions — contextual answer suggestions during live interviews
  • 🌐 8 Languages — English, Vietnamese, Spanish, French, German, Japanese, Korean, Chinese
  • 👻 Invisible Window — transparent, frameless, always-on-top; opt-in screen-capture bypass
  • ⌨️ Keyboard-First UX — move, resize, opacity, zoom — everything via shortcuts

Technical Stack

  • Electron 29–35 with a strict two-process boundary
  • React 18 + TypeScript + Vite 6 + Tailwind 3 + Radix UI + React Query 5
  • Vercel AI SDK for unified streaming across providers
  • electron-store for persistent config (extends EventEmitter for live updates)
  • models.dev catalog for 3,000+ models with pricing and context windows

Architecture

| Process  | Directory | Runtime | Purpose |
|----------|-----------|---------|---------|
| Main     | electron/ | Node.js | Screenshots, AI inference, config, transcription, global shortcuts |
| Renderer | src/      | Browser | React UI with Vite, Tailwind, React Query, Radix primitives |
| Shared   | shared/   | Both    | AI model configuration (single source of truth) |

Dynamic Model Selection (3-tier fallback)

  1. Provider API — real-time model list from the user's account (requires API key)
  2. models.dev catalog — open-source database of 3,000+ models with pricing and context windows (no key needed)
  3. Static fallback — hardcoded defaults in shared/aiModels.ts
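
The three tiers above can be read as an ordered list of sources where the first one that yields models wins. The following is a minimal sketch of that idea, not the project's actual code: `ModelInfo`, `ModelSource`, `resolveModels`, and the contents of `STATIC_FALLBACK` are hypothetical names and values chosen for illustration.

```typescript
// Shape of a model entry; an assumption for this sketch.
interface ModelInfo {
  id: string;
  source: "provider" | "catalog" | "static";
}

// A tier either resolves to a model list or throws when it is unavailable.
type ModelSource = () => Promise<ModelInfo[]>;

// Hypothetical stand-in for the hardcoded defaults in shared/aiModels.ts.
const STATIC_FALLBACK: ModelInfo[] = [
  { id: "default-vision-model", source: "static" },
];

// Try each tier in order; the first non-empty result is returned.
async function resolveModels(tiers: ModelSource[]): Promise<ModelInfo[]> {
  for (const tier of tiers) {
    try {
      const models = await tier();
      if (models.length > 0) return models;
    } catch {
      // Tier unavailable (missing API key, network error): try the next one.
    }
  }
  // Every dynamic tier failed or was empty, so use the static defaults.
  return STATIC_FALLBACK;
}
```

In this arrangement a missing API key is not an error the UI has to handle: the provider tier simply throws and the catalog tier is consulted next.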

Keyboard Shortcuts

| Action              | Shortcut                     |
|---------------------|------------------------------|
| Toggle visibility   | Ctrl/Cmd + B                 |
| Take screenshot     | Ctrl/Cmd + H                 |
| Process screenshots | Ctrl/Cmd + Enter             |
| Start/stop recording| Ctrl/Cmd + M                 |
| Move window         | Ctrl/Cmd + Arrow Keys        |
| Opacity / zoom      | Ctrl/Cmd + [ ] / - = / 0     |
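
As an illustration of how such a shortcut table might be wired up, here is a small sketch that maps Electron accelerator strings to actions. The action names and `registerShortcuts` are hypothetical, the registration function is injected so the mapping can be exercised outside Electron, and the opacity and zoom keys are left out because the table does not say which key does what.

```typescript
// Action identifiers are hypothetical; they mirror the rows of the shortcut table.
type Action =
  | "toggleVisibility"
  | "takeScreenshot"
  | "processScreenshots"
  | "toggleRecording"
  | "moveUp"
  | "moveDown"
  | "moveLeft"
  | "moveRight";

// "CommandOrControl" is Electron's accelerator for Cmd on macOS and Ctrl elsewhere.
const SHORTCUTS: ReadonlyArray<readonly [string, Action]> = [
  ["CommandOrControl+B", "toggleVisibility"],
  ["CommandOrControl+H", "takeScreenshot"],
  ["CommandOrControl+Enter", "processScreenshots"],
  ["CommandOrControl+M", "toggleRecording"],
  ["CommandOrControl+Up", "moveUp"],
  ["CommandOrControl+Down", "moveDown"],
  ["CommandOrControl+Left", "moveLeft"],
  ["CommandOrControl+Right", "moveRight"],
];

// Same call shape as Electron's globalShortcut.register(accelerator, callback),
// which returns false when the accelerator could not be registered.
type Register = (accelerator: string, callback: () => void) => boolean;

// Register every shortcut and return the accelerators that failed.
function registerShortcuts(
  register: Register,
  dispatch: (action: Action) => void,
): string[] {
  const failed: string[] = [];
  for (const [accelerator, action] of SHORTCUTS) {
    const ok = register(accelerator, () => dispatch(action));
    if (!ok) failed.push(accelerator);
  }
  return failed;
}
```

In the main process this could be called once the app is ready, passing `globalShortcut.register.bind(globalShortcut)` as `register`. Returning the failed accelerators lets the app report conflicts with shortcuts already claimed by other programs.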

Implementation Highlights

  • Provider abstraction: a single ProviderClientFactory returns a unified client for OpenAI / Gemini / Anthropic / Azure / OpenRouter, so the UI never branches on provider
  • Privacy-first: API keys are stored locally in ~/Library/Application Support/interview-assistant/config.json on macOS (or under %APPDATA% on Windows); data leaves the machine only when it is sent to the configured provider
  • Invisibility: transparent frameless BrowserWindow with setContentProtection so most screen recorders capture an empty rectangle
  • Streaming: real-time markdown rendering with Whisper-driven transcripts piped through an answer-assistant prompt
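
To make the provider abstraction concrete, the sketch below shows one way a factory can hide provider differences behind a single interface. Only the name `ProviderClientFactory` comes from the description above; the `ProviderClient` interface, the `VisionRequest` shape, and the `register` / `create` methods are assumptions made for illustration, and the real clients would call each provider's SDK rather than a stub.

```typescript
type ProviderName = "openai" | "gemini" | "anthropic" | "azure" | "openrouter";

// Hypothetical request shape: a prompt plus base64-encoded screenshots.
interface VisionRequest {
  prompt: string;
  imagesBase64: string[];
}

// The unified surface the UI would talk to, regardless of provider (assumed).
interface ProviderClient {
  readonly provider: ProviderName;
  analyze(request: VisionRequest): Promise<string>;
}

type ClientBuilder = (apiKey: string) => ProviderClient;

class ProviderClientFactory {
  // One builder per provider; providers without a builder are unsupported.
  private builders: Partial<Record<ProviderName, ClientBuilder>> = {};

  register(name: ProviderName, builder: ClientBuilder): void {
    this.builders[name] = builder;
  }

  create(name: ProviderName, apiKey: string): ProviderClient {
    const builder = this.builders[name];
    if (!builder) {
      throw new Error(`No client registered for provider "${name}"`);
    }
    return builder(apiKey);
  }
}

// Example wiring with a stub client standing in for a real SDK call.
const factory = new ProviderClientFactory();
factory.register("openai", (_apiKey) => ({
  provider: "openai",
  analyze: async (request) =>
    `${request.imagesBase64.length} image(s): ${request.prompt}`,
}));
```

Because callers only ever see `ProviderClient`, adding a provider means registering one more builder; nothing that consumes the client needs to change.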

Role & Responsibilities

  • Built both iterations end-to-end (Electron main, IPC, React renderer, shortcuts)
  • Implemented the multi-provider abstraction and the models.dev integration
  • Designed the invisible-window UX and global shortcut layer
  • Owned the screenshot pipeline, vision prompts, and the transcription / answer-suggestion loop