Let AI control your stream - 118 tools for OBS and Twitch integration
Note: This feature is experimental. It works reasonably well for dialogue-heavy retro games but may have issues with certain fonts, image quality, or fast-paced text. Your mileage may vary.
Claude can watch your screen and translate Japanese text automatically, displaying English subtitles on your stream.
How it works:
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐
│   OBS Screen    │────▶│ Claude Vision│────▶│   Translation   │
│  (screenshot)   │     │  (OCR + AI)  │     │ Overlay on OBS  │
└─────────────────┘     └──────────────┘     └─────────────────┘
The automatic service runs this pipeline every 2 seconds, but only calls the API when the on-screen text actually changes.
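The check-before-call behavior can be sketched as a simple poll loop. This is a minimal illustration, not the actual implementation; `capture`, `fingerprint`, and `translate` are hypothetical stand-ins for the real screenshot, perceptual-hash, and API steps:

```python
import time

def run_service(capture, fingerprint, translate, interval=2.0, ticks=None):
    """Poll the screen; translate only when the frame's fingerprint changes.

    capture/fingerprint/translate are placeholders for the real screenshot,
    perceptual-hash, and API-call steps. ticks=None runs forever, as the
    service does.
    """
    last = None
    n = 0
    while ticks is None or n < ticks:
        frame = capture()
        fp = fingerprint(frame)
        if fp != last:              # text changed: one API call
            translate(frame)
            last = fp
        n += 1
        if ticks is None or n < ticks:
            time.sleep(interval)    # wait for the next poll tick
```

An unchanged frame costs a screenshot and a hash comparison, but no API call.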
Edit your setenv.sh:
export ANTHROPIC_API_KEY="sk-ant-api03-your-key-here"
Restart Claude Code after saving.
With a Japanese game on screen in OBS, try:
"Translate what's on screen"
You should see the Japanese text and English translation appear.
Best for: Testing, occasional translations, non-dialogue text
"Translate what's on screen"
"Translate the menu"
"What does that item description say?"
Each command takes one screenshot and translates it.
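Under the hood, a one-shot translation amounts to sending the screenshot to the vision model. A sketch of building such a request, using the Anthropic Messages API image format; the model name and prompt wording here are illustrative placeholders, not necessarily what this project uses:

```python
import base64

def build_translate_request(png_bytes, model="claude-3-5-sonnet-20241022"):
    """Build a one-shot vision request: one screenshot, one translation.

    The content shape follows the Anthropic Messages API. The model name
    and prompt text are placeholder assumptions for illustration.
    """
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64",
                            "media_type": "image/png",
                            "data": base64.b64encode(png_bytes).decode()}},
                {"type": "text",
                 "text": "Transcribe the Japanese text in this screenshot "
                         "and translate it to English."},
            ],
        }],
    }
```

The resulting dict is what you would pass to the Messages API; each manual command costs exactly one such call.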
Best for: Streaming story-heavy games, continuous dialogue
"Start the translation service"
This starts continuous monitoring: the service screenshots the OBS scene every 2 seconds, detects the dialogue region, and displays English translations on the overlay as new dialogue appears.
To stop:
"Stop the translation service"
"Start translating"
"Start translating with 3 second intervals"
"Start the translation service, check every 1.5 seconds"
"How's the translation service doing?"
"Translation status"
Shows whether the service is running, how many API calls it has made, how many checks were skipped, the efficiency percentage, and the average latency.
"Make translation check faster - every 1 second"
"Set translation threshold to 15%" (how different text must be to re-translate)
The English text appears at the bottom of your screen by default.
"Move translation to the top"
"Put translation overlay at 100, 800" (x, y coordinates)
"Make translation text bigger"
"Set translation font size to 100"
"Clear the translation"
"Hide translation overlay"
The service doesn’t call the API for every frame. Here’s how it saves your API budget:
On first run, it detects where dialogue typically appears in your game. This region is cached.
Game Screen:
┌────────────────────────────┐
│                            │
│        (game area)         │
│                            │
├────────────────────────────┤
│   [Dialogue text here]     │ ← Only this region is analyzed
└────────────────────────────┘
The service uses perceptual hashing to compare consecutive frames, and only re-translates when the difference exceeds the configured threshold.
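A minimal pure-Python sketch of the idea, assuming an average-hash comparison (real implementations typically downscale the frame to something like 8×8 first; the function names here are illustrative):

```python
def average_hash(pixels):
    """Perceptual average hash of a grayscale image (rows of 0-255 values).

    Each pixel becomes one bit: 1 if brighter than the image mean, else 0.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def percent_changed(hash_a, hash_b):
    """Hamming distance between two hashes, as a percentage of bits."""
    diff = sum(a != b for a, b in zip(hash_a, hash_b))
    return 100.0 * diff / len(hash_a)

def needs_retranslate(hash_a, hash_b, threshold=15.0):
    """Call the API only when frames differ by more than the threshold
    (compare the 15% threshold setting described above)."""
    return percent_changed(hash_a, hash_b) > threshold
```

Because the hash reflects coarse brightness structure rather than exact pixels, minor noise (cursor blinks, slight compression artifacts) stays under the threshold while new dialogue text pushes past it.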
"Translation status"
Service: Running
API calls: 45
Skipped (no change): 180
Efficiency: 80%
Avg latency: 320ms
That means 80% of checks didn’t need an API call!
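The efficiency figure is just the share of poll checks that resolved without an API call:

```python
def efficiency(api_calls, skipped):
    """Percentage of poll checks that needed no API call."""
    return 100.0 * skipped / (api_calls + skipped)
```

With the numbers above, `efficiency(45, 180)` gives 80.0.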
Automatic mode works well here; the dialogue box is usually detected without any configuration.
"Start translating"
Text-heavy games may need a longer poll interval:
"Start translating with 3 second intervals"
Use manual mode for menus and item descriptions:
"Translate that"
"What does the shop say?"
If auto-detection picks the wrong area:
"Stop translation"
"Start translation with auto-detect disabled"
Then manually translate key moments.
"Reset translation service"
"Start translation"
This clears the cached dialogue box region.
Claude Vision API calls cost money. The smart detection helps a lot:
| Mode | Calls per Hour | Approximate Cost |
|---|---|---|
| Manual (occasional) | 10-20 | ~$0.02-0.04 |
| Auto (2s, with smart detect) | 50-200 | ~$0.10-0.40 |
| Auto (without smart detect) | 1800 | ~$3.60 |
Smart detection typically saves 60-80% of calls.
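These estimates follow from simple arithmetic. A sketch, assuming roughly $0.002 per vision call (back-calculated from the 1800-call row in the table; actual pricing varies by model and image size):

```python
def calls_per_hour(poll_seconds, hit_rate=1.0):
    """Polls per hour times the fraction that actually hit the API.

    hit_rate models smart detection: 1.0 means every poll calls the API,
    0.2 means 80% of polls are skipped as unchanged.
    """
    return 3600 / poll_seconds * hit_rate

def hourly_cost(calls, cost_per_call=0.002):
    """cost_per_call is an assumed ~$0.002 per vision call, derived from
    the table above (1800 calls ~ $3.60); check current pricing."""
    return calls * cost_per_call
```

For example, polling every 2 seconds with no smart detection is 1800 calls/hour; an 80% skip rate brings that down to 360.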
You: "I'm about to play Penta Dragon, start translating"
Claude: Starting translation service with smart dialogue detection.
Monitoring every 2 seconds. I'll show translations as
new dialogue appears.
[Translation service started]
--- 30 minutes later ---
You: "How many API calls have we made?"
Claude: Translation service status:
- Running: Yes
- API calls: 89
- Skipped (unchanged): 412
- Efficiency: 82%
- Average latency: 285ms
We've saved about 82% of potential API calls through
change detection!
You: "Great, stop translating - I'm going to grind for a bit"
Claude: [Stops translation service]
Translation service stopped. Made 89 API calls total.
Overlay cleared.