Building an LLM Robot with My Son — EP 8. My Son Gave the AI Robot Its First Real Command

EP 6 connected the LLM server. EP 7 migrated to Pi. This episode: camera joins. Qwen2.5-VL-7B is now on the LLM server: the multimodal variant that accepts image input alongside text. Camera frames from the robot get sent with each request, and the model decides what to do based on what it sees. Camera + sensors + LLM + robot, all connected at once for the first time.

Switching to Qwen2.5-VL

From text-only Qwen2.5-7B to Qwen2.5-VL-7B. Same family, so the harness barely changed. Three things were different:

New section added to CLAUDE.md:

## Vision Input
- Camera resolution: 640×480
- Transmission format: JPEG (quality 70)
- Frame timing: sent only at command request time (not continuous streaming)
- Image + sensor data sent together

## LLM input format (vision mode)
{ "image": "<base64 encoded JPEG>", "sensor": "dist:45", "instruct...
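A minimal sketch of how a request body in the vision-mode format above might be assembled. This is not the harness code from the post. The function name `build_vision_payload` and its parameters are hypothetical, and the frame is assumed to be JPEG-encoded already (640×480, quality 70) by whatever camera library the robot uses. Only the "image" and "sensor" keys are filled in, because the rest of the format is cut off in the excerpt.

```python
import base64
import json


def build_vision_payload(jpeg_bytes: bytes, dist: int) -> dict:
    """Bundle one camera frame and one distance reading into a request body.

    Only the "image" and "sensor" keys come from the post; the remaining
    keys are cut off in the source and are deliberately left out here.
    """
    # JSON cannot carry raw bytes, so the JPEG is base64-encoded to ASCII text.
    image_b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "image": image_b64,
        "sensor": f"dist:{dist}",  # same "dist:<value>" shape as the example
    }


if __name__ == "__main__":
    # Placeholder bytes (JPEG start/end markers only), not a real camera frame.
    fake_frame = b"\xff\xd8\xff\xd9"
    print(json.dumps(build_vision_payload(fake_frame, 45)))
```

Because frames are sent only at command request time, a function like this would be called once per command rather than inside a streaming loop.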