Project Description
I need a self-contained Python pipeline that runs on a Raspberry Pi, grabs an image from the Pi Camera, extracts any text it finds in the picture, sends that text to a lightweight AI function, then shows the AI’s short answer on a small SPI/I²C OLED display.
Scope of work
• Capture: Trigger the Pi Camera (libcamera or picamera2) and save a frame.
• OCR: Use Tesseract, EasyOCR, or another open-source library to read text from the image. The input is always “text from images,” never scanned PDFs or live handwriting.
• AI logic: Pass the recognised string to a simple question-answer routine (OpenAI API, llama.cpp, or any efficient model you prefer) and return a concise, single-sentence reply. No lengthy explanations are needed.
• Display: Push that reply to a 128×64 (or similar) monochrome OLED over SPI/I²C using the common SSD1306 driver. Text must be readable and auto-wrap if longer than the screen width.
• Loop: Repeat the cycle until interrupted, then exit cleanly on Ctrl-C.
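As a rough illustration of the flow above, the loop could be organised roughly like this. It is only a sketch: run_pipeline and its parameters (capture, ocr, ask, show, interval_s, max_cycles) are hypothetical names, and the four stage functions are passed in so the loop can be exercised without hardware. On the Pi they would presumably wrap picamera2, an OCR library such as pytesseract, the chosen AI backend, and an SSD1306 driver library.

```python
import time
from typing import Callable, Optional


def run_pipeline(
    capture: Callable[[], str],   # hypothetical: saves a frame, returns its file path
    ocr: Callable[[str], str],    # hypothetical: image path -> recognised text
    ask: Callable[[str], str],    # hypothetical: recognised text -> short AI reply
    show: Callable[[str], None],  # hypothetical: draws the reply on the OLED
    interval_s: float = 2.0,
    max_cycles: Optional[int] = None,
) -> int:
    """Run capture -> OCR -> AI -> display until Ctrl-C or max_cycles is reached.

    Returns the number of completed cycles.
    """
    cycles = 0
    try:
        while max_cycles is None or cycles < max_cycles:
            image_path = capture()
            print(f"[debug] captured {image_path}")

            text = ocr(image_path).strip()
            print(f"[debug] OCR text: {text!r}")

            if text:
                reply = ask(text)
                print(f"[debug] AI reply: {reply!r}")
                show(reply)
            else:
                # Nothing readable in the frame, so skip the AI call entirely.
                print("[debug] no text found, skipping AI call")

            cycles += 1
            time.sleep(interval_s)
    except KeyboardInterrupt:
        # Ctrl-C lands here, so the loop ends without a traceback.
        print("[debug] interrupted, shutting down")
    return cycles
```

Leaving max_cycles as None keeps the loop running until Ctrl-C; setting it to 1 gives a single-shot run, which may be handy while checking the wiring.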
Acceptance criteria
1. A single CLI command (e.g., python3 main.py) initiates the whole flow and prints any debug messages to the terminal.
2. Accuracy: OCR reliably captures printed text that is at least 10 pt and well lit.
3. AI response arrives in <3 s on a stable network connection and never exceeds one sentence.
4. OLED shows the answer without truncation or flicker.
5. Include requirements.txt, commented source code, and a brief README covering wiring, environment setup, and API key placement.
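Criteria 3 and 4 could be enforced in software along these lines. This is a minimal sketch under stated assumptions: first_sentence and paginate are hypothetical helper names, and the 21-column by 8-row grid assumes a 6×8-pixel font on a 128×64 panel, so the numbers would need adjusting for whichever font is actually used. Splitting a long reply into pages is one possible way to avoid truncation, not the only one.

```python
import re
import textwrap
from typing import List

# Lazily match up to the first '.', '!' or '?' that is followed by whitespace
# or the end of the string. Abbreviations such as "e.g." will cut the reply
# early, which is accepted here for the sake of a short sketch.
_SENTENCE_END = re.compile(r"(.+?[.!?])(?:\s|$)", re.S)


def first_sentence(reply: str) -> str:
    """Collapse whitespace and keep only the first sentence of the AI reply."""
    reply = " ".join(reply.split())
    match = _SENTENCE_END.match(reply)
    return match.group(1) if match else reply


def paginate(reply: str, cols: int = 21, rows: int = 8) -> List[List[str]]:
    """Wrap text to `cols` characters and group the lines into pages of `rows`.

    cols=21 and rows=8 are assumptions for a 6x8 font on a 128x64 display.
    """
    lines = textwrap.wrap(reply, width=cols)
    return [lines[i:i + rows] for i in range(0, len(lines), rows)]


if __name__ == "__main__":
    answer = first_sentence("Paris is the capital of France. It is also the largest city.")
    for page in paginate(answer):
        print("\n".join(page))
```

Each page can be drawn as one full frame and held for a moment before the next, so the display is redrawn once per page instead of scrolling.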
I already have the Pi, camera, and OLED wired; I just need the software and clear instructions so I can run it today.