Consolidated References


[1] Huang, W. et al., "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents," arXiv:2201.07207, 2022.
[2] Ahn, M. et al., "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances," arXiv:2204.01691, 2022.
[3] Liang, J. et al., "Code as Policies: Language Model Programs for Embodied Control," arXiv:2209.07753, 2022.
[4] Driess, D. et al., "PaLM-E: An Embodied Multimodal Language Model," arXiv:2303.03378, 2023.
[5] Brohan, A. et al., "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control," arXiv:2307.15818, 2023.
[6] Chi, C. et al., "Diffusion Policy: Visuomotor Policy Learning via Action Diffusion," arXiv:2303.04137, 2023.
[7] Liu, Z. et al., "REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction," arXiv:2306.15724, 2023.
[8] Kim, M. J. et al., "OpenVLA: An Open-Source Vision-Language-Action Model," arXiv:2406.09246, 2024.
[9] Ghosh, D. et al., "Octo: An Open-Source Generalist Robot Policy," arXiv:2405.12213, 2024.
[10] Black, K. et al., "π0: A Vision-Language-Action Flow Model for General Robot Control," arXiv:2410.24164, 2024.
[11] Li, X. et al., "Evaluating Real-World Robot Manipulation Policies in Simulation (SIMPLER)," arXiv:2405.05941, 2024.
[12] Brohan, A. et al., "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents," arXiv:2401.12963, 2024.
[13] Shah, M. et al., "BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation," arXiv:2410.06237, 2024.
[14] Wang, Z. et al., "KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems," arXiv:2409.14908, 2024.
[15] Rana, K. et al., "SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning," arXiv:2307.06135, 2023.
[16] Fu, M. et al., "CaP-X: A Framework for Benchmarking and Improving Coding Agents for Robot Manipulation," arXiv:2603.22435, 2026.
[17] Xie, Q. et al., "Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation," arXiv:2409.18313, 2024.
[18] Ekpo, D. et al., "VeriGraph: Scene Graphs for Execution Verifiable Robot Planning," arXiv:2411.10446, 2024.
[19] Chen, Y. et al., "AutoTAMP: Autoregressive Task and Motion Planning with LLMs as Translators and Checkers," arXiv:2306.06531, 2023.
[20] "A Survey on Large Language Models for Automated Planning," arXiv:2502.12435, 2025.
[21] Chen, Y. et al., "Code-as-Symbolic-Planner: Foundation Model-Based Robot Planning via Symbolic Code Generation," arXiv:2503.01700, 2025.
[22] "RL-GPT: Integrating Reinforcement Learning and Code-as-Policy," arXiv:2402.19299, 2024.
[23] Mikami, Y. et al., "Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs," arXiv:2403.13801, 2024.
[24] Lang4Sim2Real, "Natural Language Can Help Bridge the Sim2Real Gap," arXiv:2405.10020, 2024.
[25] Huang, W. et al., "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents," arXiv:2201.07207, 2022.
[26] Open X-Embodiment Collaboration, "Open X-Embodiment: Robotic Learning Datasets and RT-X Models," arXiv:2310.08864, 2023.
[27] Physical Intelligence, "π0.5: A Vision-Language-Action Model with Open-World Generalization," arXiv:2504.16054, 2025.
[28] NVIDIA, "GR00T N1: An Open Foundation Model for Generalist Humanoid Robots," arXiv:2503.14734, 2025.
[29] "FAST: Efficient Action Tokenization for Vision-Language-Action Models," arXiv:2501.09747, 2025.
[30] "TinyVLA: Towards Fast and Data-Efficient Vision-Language-Action Models," arXiv:2409.12514, 2024.
[31] Khazatsky, A. et al., "DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset," arXiv:2403.12945, 2024.
[32] "What Matters in Building Vision-Language-Action Models," arXiv:2412.14058, 2024.
[33] Belkhale, S. et al., "RT-H: Action Hierarchies Using Language," arXiv:2403.01823, 2024.
[34] Shi, L. X. et al., "Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models," arXiv:2502.19417, 2025.
[35] Li, J. et al., "HAMSTER: Hierarchical Action Models for Open-World Robot Manipulation," arXiv:2502.05485, 2025.
[36] Ke, T. et al., "3D Diffuser Actor: Policy Diffusion with 3D Scene Representations," arXiv:2402.10885, 2024.
[37] Jiang, H. et al., "RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation," arXiv:2402.15487, 2024.
[38] MoMa-LLM, "Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation," arXiv:2403.08605, 2024.
[39] "3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning," arXiv:2411.17735, 2024.
[40] "RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Interactive Environmental Learning in Physical Embodied Systems," arXiv:2508.01415, 2025.
[41] PragmaBot, "A Pragmatist Robot: Learning to Plan Tasks by Experiencing the Real World," arXiv:2507.16713, 2025.
[42] Yardi, Y. et al., "Bridging the Sim2Real Gap: Vision Encoder Pre-Training for Visuomotor Policy Transfer," 2025.
[43] Chen, Y. et al., "Code-as-Symbolic-Planner: Foundation Model-Based Robot Planning via Symbolic Code Generation," arXiv:2503.01700, 2025.
[44] Shah, M. et al., "BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation," arXiv:2410.06237, 2024.
[45] The Register, "Claude Code's innards revealed as source code leaked online," theregister.com, April 2026.
[46] MindStudio, "Claude Code Source Leak: The Three-Layer Memory Architecture and What It Means for Builders," mindstudio.ai/blog, 2026.
[47] Rajiv Pant, "How Claude's Memory Actually Works (And Why CLAUDE.md Matters)," rajiv.com/blog, December 2025.
[48] Penligent, "Inside Claude Code: The Architecture Behind Tools, Memory, Hooks, and MCP," penligent.ai, 2025.
[49] VentureBeat, "Claude Code's source code appears to have leaked: here's what we know," venturebeat.com, 2026.
[50] Anthropic, "Claude Code Best Practices," anthropic.com/engineering, 2025.
[51] OpenAI, "Introducing Codex," openai.com/index/introducing-codex, May 2025.
[52] OpenAI, "Introducing the Codex App," openai.com/index/introducing-the-codex-app, February 2026.
[53] OpenAI, "Introducing upgrades to Codex," openai.com/index/introducing-upgrades-to-codex, 2026.
[54] Wikipedia, "OpenAI Codex (AI agent)," en.wikipedia.org, 2026.
[55] Morphllm, "Claude Code as Orchestrator: Inter-Agent Communication Protocols," morphllm.com, 2026.
[56] Morphllm, "Claude Code Subagents: How They Work, What They See & When to Use Them," morphllm.com, 2026.
[57] Paddo.dev, "Claude Code Auto-Fix: The PR That Fixes Itself," paddo.dev/blog, 2026.
[58] Springer, "Agentic AI: A Comprehensive Survey of Architectures, Applications, and Future Directions," Artificial Intelligence Review, 2025.
[59] Anthropic, "2026 Agentic Coding Trends Report," resources.anthropic.com, 2026.
[60] Claude Code Docs, "Create custom subagents," code.claude.com/docs/en/sub-agents, 2026.
[61] Claude Code Docs, "How Claude remembers your project," code.claude.com/docs/en/memory, 2026.
[62] Dbreunig, "How Claude Code Builds a System Prompt," dbreunig.com, April 2026.

Acknowledgments

This book is a survey tracing the research arc from LLM-based robot planning to agentic robotics. By analyzing the fundamental differences between agentic coding and agentic robotics, it charts future directions for Physical AI.

This survey is indebted to Prof. Sungjoon Choi of Korea University and Ph.D. student Chanwoo Kim. It was inspired by Chanwoo Kim's seminar presentation and builds on the papers referenced in his seminar.

This project was produced using the Harness skill created by 황민호.

AI tools were used in producing this work. Claude (Opus 4.6) was used for literature research, content generation, and manuscript drafting.