Autonomous AI Agent for Conversational Commerce
A production-grade WhatsApp ordering system powered by frontier LLMs. Processing real customer orders 24/7 with natural language understanding, multi-step workflows, and integrated payments.
System Architecture
End-to-end AI pipeline from WhatsApp message to POS terminal, fully autonomous with zero human intervention.
WhatsApp Channel
Receives customer messages via WhatsApp Web protocol. Supports text, images, and location sharing. 500ms debounce for multi-message batching.
LLM Agent Runtime
OpenClaw agent framework with GPT-5.4 primary model. 578 lines of behavioral rules, multi-model fallback chain, per-customer session memory.
Order Evaluator
State machine validates order completeness. Handles draft/final queue, entity normalization, fulfillment routing (delivery, pickup, dine-in).
Payment + POS
Doku QRIS generation with auto-verification polling. Pawoon POS sync for kitchen display. Supabase for persistent storage and real-time dashboard.
578 Lines of Production Agent Rules
Comprehensive behavioral specification including: 7-step ordering flow, one-shot order detection, alias parsing for 130+ menu items, ambiguity resolution, upsell logic, idle conversation management, prompt injection defense, and time-based personality adaptation. The agent reads customer profiles, order history, and live menu schema on every interaction.
Order Processing Pipeline
From natural language to kitchen display in under 30 seconds.
WhatsApp Message
Customer sends natural language text
LLM Processing
Intent detection + entity extraction
Order Validation
State machine + queue evaluation
QRIS Payment
Auto-generated QR + verification
POS + Kitchen
Pawoon sync + courier dispatch
Live System in Action
Real conversation with the AI agent processing an actual delivery order end-to-end.
Model Migration to MiMo
Replacing GPT-5.4 with Xiaomi MiMo as the primary inference model for all conversational AI processing.
GPT-5.4
OpenAI Codex provider with Claude Opus 4.6 fallback. High cost, rate-limited, no local inference option.
Xiaomi MiMo V2.5
Flagship reasoning model via MiMo API Platform. Direct integration with OpenClaw agent runtime. Lower latency, competitive pricing.
AI Capabilities in Production
Every capability listed here is actively running in production, processing real customer orders daily.
Natural Language Understanding
Processes casual Bahasa Indonesia including slang, abbreviations, and typos. "kopsu 2 less sugar" is parsed correctly.
Multi-Intent Detection
Identifies ordering, browsing, complaints, reservations, and info queries from a single message.
Entity Extraction
Extracts items, quantities, names, table numbers, fulfillment method, and payment preference from free text.
Session + Profile Memory
Maintains multi-turn conversation state and remembers returning customer preferences across sessions.
Function Calling
Executes backend scripts for distance calculation, payment generation, order history lookup, and menu queries.
Prompt Injection Defense
Explicit rejection rules for manipulation attempts. Restricted file access and command execution boundaries.
Projected Token Consumption
Estimated MiMo API usage based on current production traffic patterns.
Tokens per Order
Average context bootstrap (system prompt + menu schema + customer profile + conversation history) per interaction.
Max Output Tokens
Configured maximum output for complex multi-tool responses including order state writes and payment generation.
Turns per Order
Average conversation length from greeting to payment confirmation. One-shot orders complete in 3 turns.
Context Window
Full session context including conversation history, tool outputs, and agent memory for complex multi-turn flows.
Technology Stack
Ready for MiMo Integration
This system is architecturally prepared for model swap. The OpenClaw runtime supports any OpenAI-compatible API endpoint.
Try the Live System