We are building an AI-powered system to help teachers create high-quality educational videos that explain complex scientific concepts through visual storytelling.
Teachers already have strong:
• explanations
• analogies
• teaching intuition
However, they lack the production tools to translate these into clear, visual, animated content.
Our goal is to build an AI agent pipeline that converts structured teaching inputs into:
This is not a simple prompt engineering task. We are looking for someone who can design and implement a robust multi-step AI workflow.
Project Goal
Build an AI agent (or system) that can reliably generate accurate, structured, and editable scene outputs, and convert them into educational videos.
Focus areas:
• Scene generation accuracy (critical)
• Human-in-the-loop validation
• Video generation pipeline integration
Current Workflow (Target System)
We are aiming for the following pipeline:
1. Teacher generates script (via GPT / Gemini)
2. AI generates structured scene breakdowns
   • visuals
   • setting
   • concept mapping
   • ~6–10 scenes per minute
3. Teacher reviews & corrects scene outputs (critical step)
4. AI generates video from validated scenes
5. Text-to-speech + sync
6. Final teacher revision
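To make the review gate between steps 3 and 4 concrete, here is a minimal Python sketch. The `Scene` fields, the `approved` flag, and both function names are assumptions made for illustration, not an existing design. It estimates how many scenes a script should yield at roughly 6–10 scenes per minute, and blocks video generation until the teacher has approved every scene.

```python
from dataclasses import dataclass
import math


@dataclass
class Scene:
    """One scene from step 2; field names are illustrative assumptions."""
    visuals: str
    setting: str
    concept: str
    approved: bool = False  # set by the teacher during step 3


def scene_count_range(narration_seconds: float) -> tuple[int, int]:
    """Expected (min, max) scene count at roughly 6 to 10 scenes per minute."""
    minutes = narration_seconds / 60
    return math.floor(6 * minutes), math.ceil(10 * minutes)


def ready_for_video(scenes: list[Scene]) -> bool:
    """Step 4 should only run once every scene has been approved."""
    return bool(scenes) and all(scene.approved for scene in scenes)
```

With this sketch, a 90-second narration would be expected to produce between 9 and 15 scenes, and a scene list with even one unapproved entry would not proceed to video generation.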
Current Challenges
We have tested tools like OpenArt and Gemini, but:
•
•
•
•
Example attempt:
What We Need Help With
We are looking for someone to design and/or build:
Core Focus (Priority)
• Step 2: Structured scene generation system
• Step 3: Human-in-the-loop validation workflow
• Step 4: Video generation pipeline integration
Expected Output
A working system or prototype that can:
• Convert script → structured scene JSON (or similar format)
• Allow teachers to review/edit scenes easily
• Generate consistent visual outputs from scenes
• Produce video outputs (using existing APIs or tools)
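As one possible illustration of the "script → structured scene JSON" output, the Python sketch below checks a model's raw JSON response before it reaches the teacher. The required field names (`visuals`, `setting`, `concept`) are hypothetical, taken from the scene breakdown items listed in the workflow. The actual schema is still to be designed.

```python
import json

# Hypothetical schema: each scene is an object with these non-empty string fields.
REQUIRED_FIELDS = {"visuals": str, "setting": str, "concept": str}


def validate_scenes(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the payload is usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return ["top level must be a list of scenes"]

    problems = []
    for i, scene in enumerate(data):
        if not isinstance(scene, dict):
            problems.append(f"scene {i}: not an object")
            continue
        for name, kind in REQUIRED_FIELDS.items():
            value = scene.get(name)
            if not isinstance(value, kind) or not value.strip():
                problems.append(f"scene {i}: missing or empty '{name}'")
    return problems
```

Returning a list of problems instead of raising on the first error lets the pipeline either re-prompt the model with all issues at once or show them to the teacher in the review step.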
Preferred Technical Approaches
We are open, but examples include:
• LLM orchestration (GPT, Gemini, Claude)
• Agent frameworks (LangChain, CrewAI, etc.)
• Image/video generation APIs (Runway, Pika, Stable Diffusion, etc.)
• Node-based workflows (n8n is a plus)
• Custom pipeline design (Python / JS)
Ideal Candidate
• Experience building AI pipelines / agents (not just prompts)
• Strong understanding of multimodal systems (text → image → video)
• Ability to structure outputs (JSON / schema design)
• Experience with human-in-the-loop systems
• Bonus:
   ◦ EdTech experience
   ◦ Scientific/technical content familiarity
   ◦ Experience with video generation tools

