Daimon Robotics and China Mobile have announced a landmark partnership to build a distributed embodied intelligence data collection network, leveraging China Mobile's vast network of hundreds of thousands of offline stores nationwide.
The first pilot base has been established in Chenzhou, Hunan province — positioned as the world's first "5S store" for embodied data collection. Set to begin regular operations on July 15, the facility combines exhibition, data collection training, equipment supply, pre-sales and after-sales service, and data-model-scenario collaboration under one roof.
The collaboration targets the most critical bottleneck holding back embodied intelligence today: the extreme scarcity of high-quality real-machine data. Industry estimates put the current stock of usable real-world data in embodied intelligence at only about 500,000 hours, yet bringing a single skill to a delivery-ready standard requires 2,000 to 5,000 hours of training data. At that rate, the entire existing stock would cover only about 100 to 250 skills.
Under the plan, ordinary citizens can undergo short-term training and then don two-finger grippers, tactile gloves, and head-mounted cameras to become data collectors across five major scenarios including home, logistics, and manufacturing environments. The project will initially deploy 1,000 sets of equipment, with annual output expected to reach 1 million hours of real-world scenario data at full capacity.
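To put the quoted figures in proportion, the short Python sketch below works through the arithmetic. It uses only the numbers cited in this article; the assumptions that every equipment set contributes equally and that collection runs 365 days a year are illustrative simplifications, not details from the announcement.

```python
# Figures quoted in the article.
EXISTING_STOCK_HOURS = 500_000          # estimated usable real-world data today
HOURS_PER_SKILL_RANGE = (2_000, 5_000)  # hours needed per delivery-ready skill
EQUIPMENT_SETS = 1_000                  # initial deployment
ANNUAL_OUTPUT_HOURS = 1_000_000         # expected yearly output at full capacity


def skills_covered(total_hours: int, hours_per_skill: int) -> int:
    """Number of whole skills a given pool of data could cover."""
    return total_hours // hours_per_skill


low_cost, high_cost = HOURS_PER_SKILL_RANGE

# Range of skills covered: (pessimistic, optimistic).
skills_from_stock = (
    skills_covered(EXISTING_STOCK_HOURS, high_cost),
    skills_covered(EXISTING_STOCK_HOURS, low_cost),
)
skills_from_annual_output = (
    skills_covered(ANNUAL_OUTPUT_HOURS, high_cost),
    skills_covered(ANNUAL_OUTPUT_HOURS, low_cost),
)

# Assumption (not from the article): all sets are used equally, 365 days a year.
hours_per_set_per_year = ANNUAL_OUTPUT_HOURS / EQUIPMENT_SETS
hours_per_set_per_day = hours_per_set_per_year / 365

print(f"Existing stock covers {skills_from_stock[0]}-{skills_from_stock[1]} skills")
print(f"One year of output covers {skills_from_annual_output[0]}-{skills_from_annual_output[1]} skills")
print(f"Implied usage: {hours_per_set_per_year:.0f} h per set per year, "
      f"about {hours_per_set_per_day:.1f} h per day")
```

Under these assumptions, a single year at full capacity would yield twice the estimated existing stock, and each equipment set would need to record roughly 1,000 hours a year.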
This crowdsourcing model addresses the data crisis from three angles. It sharply reduces collection costs compared with professional teleoperation, which can cost hundreds of dollars per hour. It provides unmatched scenario diversity, since 10,000 collectors mean 10,000 different room layouts and lighting conditions. And it creates a sustainable commercial loop in which collectors get paid, tech giants gain valuable data, and the industry advances.
The Daimon-China Mobile initiative follows a similar crowdsourcing push by JD.com, which in March announced plans to build the world's largest embodied intelligence data collection center in Suqian, mobilizing over 600,000 people and targeting 5 million hours of real-world human video data within one year.
However, industry experts caution that crowdsourcing alone cannot solve all data challenges. Inconsistent data quality, precision ceilings imposed by human physiological limits, and the lack of unified data standards across companies remain significant hurdles. The ultimate vision is a "data pyramid" architecture in which crowdsourced data forms the broad base, with higher-precision teleoperation and real-machine data occupying the upper layers.
