A team of researchers from Sungkyunkwan University (SKKU) has introduced FCGraft (Functional Cache Grafting), a framework that speeds up how large language models generate control policies for embodied agents. The paper has been accepted at ICML 2026, one of the top conferences in machine learning.
Code-writing LLMs (CodeLLMs) face two fundamental challenges in robotics. First, generating policies for open-domain environments requires lengthy prompts, and repeating the prefill computation over those prompts slows decoding. Second, fully generative decoding often produces API mismatches, missing safety guards, and unstable control logic.
FCGraft addresses both limitations with a cache-grafting architecture. The framework maintains a library of validated, function-level code skeletons, each stored with its associated prompt-level Transformer key-value (KV) cache. Given a new task, FCGraft retrieves the relevant functions and grafts their KV caches into a composite policy through two mechanisms: stitching (composing cached segments) and patching (locally adapting code regions).
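The retrieve, stitch, and patch flow described above can be illustrated with a toy sketch. This is not the authors' implementation: the names `CachedFunction`, `fake_prefill`, `retrieve`, `stitch`, and `patch` are hypothetical, keyword overlap stands in for whatever retrieval FCGraft actually uses, and each "KV entry" is simply a whitespace token rather than a real attention tensor. It also ignores details a real system would need to handle, such as positional encoding when cached segments are moved.

```python
from dataclasses import dataclass


def fake_prefill(text):
    """Stand-in for Transformer prefill: one toy 'KV entry' per whitespace token."""
    return tuple(text.split())


@dataclass(frozen=True)
class CachedFunction:
    name: str
    keywords: frozenset  # hypothetical retrieval keys
    skeleton: str        # validated code with "{slot}" placeholders
    kv_cache: tuple      # toy KV cache, computed once when the entry is stored


def make_entry(name, keywords, skeleton):
    return CachedFunction(name, frozenset(keywords), skeleton, fake_prefill(skeleton))


LIBRARY = [
    make_entry("move_to", {"move", "go", "navigate"},
               "def move_to(robot, target):\n"
               "    if robot.path_clear(target):\n"
               "        robot.goto(target, speed={speed})\n"),
    make_entry("grasp", {"pick", "grasp", "grab"},
               "def grasp(robot, obj):\n"
               "    if robot.gripper_ok():\n"
               "        robot.close_gripper(force={force})\n"),
    make_entry("release", {"place", "drop", "release"},
               "def release(robot):\n"
               "    robot.open_gripper()\n"),
]


def retrieve(task, library):
    """Return library entries whose keywords overlap the task description."""
    words = set(task.lower().split())
    return [entry for entry in library if entry.keywords & words]


def stitch(entries):
    """Compose cached segments: concatenate skeletons and their KV caches."""
    code = "".join(entry.skeleton for entry in entries)
    kv = tuple(item for entry in entries for item in entry.kv_cache)
    return code, kv


def patch(code, kv, params):
    """Fill placeholders and recompute only the KV entries whose token changed.

    Toy assumption: parameter values contain no whitespace, so the token
    count stays the same before and after patching.
    """
    new_code = code
    for key, value in params.items():
        new_code = new_code.replace("{" + key + "}", str(value))
    old_tokens, new_tokens = code.split(), new_code.split()
    assert len(old_tokens) == len(new_tokens)
    new_kv = list(kv)
    recomputed = 0
    for i, (old, new) in enumerate(zip(old_tokens, new_tokens)):
        if old != new:
            new_kv[i] = new  # only this entry is "re-prefilled"
            recomputed += 1
    return new_code, tuple(new_kv), recomputed


entries = retrieve("pick up the cup and move to the table", LIBRARY)
stitched_code, stitched_kv = stitch(entries)
policy, policy_kv, recomputed = patch(stitched_code, stitched_kv,
                                      {"speed": 0.2, "force": 5})
print(policy)
print(f"recomputed {recomputed} of {len(policy_kv)} toy KV entries")
```

In this toy run, only the two tokens that contained placeholders are recomputed; every other cache entry is reused from the library, and the guard conditions in the skeletons carry over unchanged into the composite policy.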
By eliminating redundant prefill computation across similar tasks, FCGraft reduces generation latency. More importantly, reusing validated control structures improves robustness. In experiments, FCGraft achieved an 18.31% higher task success rate than prompt-level caching methods such as RAGCache, while synthesizing policies 2.3x faster.
FCGraft represents a significant step toward making CodeLLM-based policy generation practical for real-world robotic applications where both speed and safety are critical.
