How We Built an Agentic AI Autonomous Drone Fleet, Never Stopping to Ask If We Should
The LLM Evolution
AI has evolved since we first wrote about fledgling efforts to use an LLM to interface with a robot. Open-source LLMs now come in a range of sizes, available for fine-tuning and for hosting on a robot with enough brains. What hardware qualifies as enough robot brains? As of this writing, both a Raspberry Pi and a Jetson Orin can host a small LLM (an SLM, if you will). We sell reference hardware with both.
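For a sense of scale, a small quantized model can run on that class of hardware with a library like llama-cpp-python. The sketch below is a minimal illustration, assuming a GGUF model has already been copied onto the board; the model path and generation settings are made up, not a recommendation.

```python
# Minimal sketch of on-board inference with llama-cpp-python.
# The model path, context size, and thread count below are illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="/opt/models/slm-3b-q4.gguf",  # hypothetical path to a quantized SLM
    n_ctx=2048,                               # keep the context modest on a Pi / Orin
    n_threads=4,                              # roughly match the board's CPU cores
)

result = llm(
    "You are a drone assistant. Summarize this battery telemetry: 14.2V, 63%, 21C.",
    max_tokens=64,
    temperature=0.2,
)
print(result["choices"][0]["text"])
```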
Agents are a relatively new field of research, and useful in some cases. For example, you ask an LLM to write you a computer program, but when you test the output, it doesn’t work. You have to tell it what problems you saw during testing, and it tries again, sometimes several times, until it succeeds. What if instead you set up several programs – agents – that work together to do that for you: a programmer agent, a compiler agent that makes sure the program compiles and runs without error, and a testing agent that checks the output? They may use the same LLM or different ones, but each agent has its own prompt, and sometimes its own memory. Research has shown that – again, in some cases – agents working together can outperform a single LLM in quality, although they often take longer due to the extra iterations. The mission needs only a small human effort; the automatons do the rest.
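Here is a minimal sketch of that three-agent loop, assuming a `call_llm` helper wired to whatever model client you prefer (local SLM or hosted API); the prompts and round limit are illustrative.

```python
import subprocess
import tempfile

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for any LLM client (local SLM or hosted API)."""
    raise NotImplementedError("wire this to your model of choice")

def programmer_agent(task: str, feedback: str = "") -> str:
    # Writes (or rewrites) the program, given feedback from the other agents.
    return call_llm(
        "You write complete, runnable Python programs. Output code only.",
        f"Task: {task}\nPrevious feedback: {feedback or 'none'}",
    )

def compiler_agent(code: str) -> str:
    """Runs the candidate program and returns any error output (empty string means success)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run(["python", path], capture_output=True, text=True, timeout=30)
    return "" if proc.returncode == 0 else proc.stderr

def tester_agent(task: str, code: str) -> str:
    # Judges whether the output satisfies the task; replies PASS or describes the problem.
    return call_llm(
        "You review a program against the task. Reply PASS or describe the problem.",
        f"Task: {task}\nCode:\n{code}",
    )

def solve(task: str, max_rounds: int = 5) -> str:
    feedback = ""
    for _ in range(max_rounds):
        code = programmer_agent(task, feedback)
        error = compiler_agent(code)
        if error:
            feedback = f"Runtime error: {error}"
            continue
        verdict = tester_agent(task, code)
        if verdict.strip().startswith("PASS"):
            return code
        feedback = verdict
    return code  # best effort after max_rounds
```

Note the compiler agent here is plain Python rather than another LLM call: actually running the program is cheaper and more reliable than asking a model whether it would run.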
Agents are an Area of Active Research
arXiv, the website where you can go to read (or publish) papers when you don’t feel like attending a conference, is full of activity on AI agents: 2,645 papers as of this writing. This is because while AI agents show promising results, no one is sure yet how best to utilize them. The number of papers published about AI agent drones is 8, which means either we got in early on something that’s going to be really big or we’re barking up the wrong tree. Several of the current AI agent research areas that can be directly applied to drones include:
- Multi-Agent Collaboration: Robots can be developed to work as a team, sharing tasks and coordinating in dynamic environments, such as in manufacturing, autonomous delivery, and swarm robotics for search and rescue operations.
- Reinforcement Learning with Real-World Constraints: This research is essential for robotic applications that need to adapt to varying physical environments, enabling robots to learn how to perform tasks like object manipulation, walking, or navigation more effectively.
- Adaptive AI Agents: Applying this to robots allows them to change their behavior or strategies as they encounter new or shifting environments, making them more versatile for outdoor exploration or rapidly changing indoor environments.
- Explainability and Transparency: For robots deployed in sensitive areas like healthcare or autonomous vehicles, having the ability to explain their actions and decisions is crucial for trust and safety.
These are some of the areas that we are working to apply to our platform. Current tools assume the developer will set up a static set of agents. Deciding how to configure the agents feels like the hyperparameter tuning of deep neural networks writ large, except that instead of testing accuracy, precision, and so on against holdout datasets, we evaluate one or more drones’ ability to achieve mission objectives. Maybe it’s time to publish on arXiv.
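A rough sketch of that evaluation loop, with a placeholder `run_mission` standing in for a simulator or field trial and made-up configuration fields:

```python
import itertools
import random
from dataclasses import dataclass

@dataclass
class AgentConfig:
    num_agents: int       # how many cooperating agents to spin up
    planner_model: str    # which model backs the planner agent
    max_iterations: int   # how many refine-and-retry rounds each agent gets

def run_mission(config: AgentConfig, seed: int) -> bool:
    """Placeholder for a simulated or field trial; returns True if objectives were met."""
    random.seed(hash((config.num_agents, config.planner_model, config.max_iterations, seed)))
    return random.random() > 0.5  # stand-in for a real success signal from telemetry

def evaluate(config: AgentConfig, trials: int = 10) -> float:
    # Success rate over repeated trials, analogous to accuracy on a holdout set.
    return sum(run_mission(config, seed) for seed in range(trials)) / trials

search_space = [
    AgentConfig(n, model, it)
    for n, model, it in itertools.product([1, 2, 4], ["slm-3b", "slm-7b"], [3, 5])
]
best = max(search_space, key=evaluate)
print(f"Best configuration so far: {best}")
```

In practice the success signal comes from mission telemetry rather than a random stub, but the sweep-and-score structure is the same.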
Drone Agent Options
There are several ways to utilize LLM agents for a fleet of autonomous drones, and it’s an area of active research, but just as with the fine-tune vs. RAG debate, there is no verdict yet. Current options include:
- Task Execution and Planning: LLMs can be integrated to generate plans for drones in real time. The TypeFly system, for instance, uses an LLM to write plans in a custom language called MiniSpec, allowing drones to handle tasks in dynamic environments. This method avoids some of the challenges of traditional LLM planning, such as verbose or error-prone output, by focusing on more efficient code generation.
- Collaborative Fleets: One of the most promising areas involves coordinating multiple drones in collaborative tasks. Research highlights that autonomous drone fleets can share data via mesh wireless communication networks, enabling them to complete complex inspections or monitoring tasks over large areas without direct human intervention. This approach allows drones to communicate and adjust to environmental factors in real time, improving mission success rates.
- LLM-Based Command and Control: LLMs can be used as high-level decision-makers for drone fleets. By processing human language input, they interpret complex instructions, adapt to unexpected situations, and modify their strategies accordingly. This integration enables drones to execute commands more flexibly and respond to both environmental changes and operator requests (a sketch of this pattern appears after the list).
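To make the command-and-control option concrete, here is a minimal sketch assuming an LLM client that can be asked for JSON and a placeholder `send_command` uplink; nothing here reflects a specific product API.

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for any completions client (cloud or on-board)."""
    raise NotImplementedError("wire this to your model of choice")

# Illustrative output schema the LLM is asked to follow.
COMMAND_SCHEMA = """Respond with JSON only:
{"commands": [{"drone_id": "string", "action": "goto|scan|return_home",
               "lat": 0.0, "lon": 0.0, "alt_m": 0.0}]}"""

def plan_fleet_actions(operator_request: str, fleet_state: dict) -> list[dict]:
    # Turn a natural-language operator request plus fleet telemetry into structured commands.
    prompt = (
        f"{COMMAND_SCHEMA}\n"
        f"Fleet state: {json.dumps(fleet_state)}\n"
        f"Operator request: {operator_request}"
    )
    reply = call_llm(prompt)
    return json.loads(reply)["commands"]

def send_command(command: dict) -> None:
    """Placeholder for the actual uplink (MAVLink, mesh radio, etc.)."""
    print("dispatching", command)

# Example usage:
# for command in plan_fleet_actions("Survey the north field, then send drone 2 home.",
#                                   current_fleet_state):
#     send_command(command)
```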
Astral’s cloud service offers various AI features, including multimodal agentic LLM reasoning for your fleet. And Astral’s Docker image includes an LLM for localized inference, which is useful when a drone does not have internet connectivity. When the fleet is heterogeneous, one or more “motherships” can do the thinking for the lesser drones nearby, all connected to the mothership’s WiFi network, carrying out the fleet operator’s wishes until the desired outcome is achieved, with no need for pilots or constant guidance. This is real collaboration between humans and robots, and it brings a collective tear to our eyes.
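As a rough illustration of that mothership pattern (not Astral’s actual API), a drone-side client could try the cloud first and fall back to a nearby mothership’s local inference endpoint over WiFi; both URLs below are made up.

```python
import requests

CLOUD_URL = "https://example-cloud.invalid/v1/reason"    # hypothetical cloud endpoint
MOTHERSHIP_URL = "http://192.168.4.1:8080/v1/reason"     # hypothetical mothership on local WiFi

def ask_for_plan(observation: dict) -> dict:
    """Try the cloud first, then fall back to the mothership's local LLM."""
    for url in (CLOUD_URL, MOTHERSHIP_URL):
        try:
            response = requests.post(url, json=observation, timeout=5)
            response.raise_for_status()
            return response.json()
        except requests.RequestException:
            continue  # no connectivity or endpoint down; try the next option
    return {"action": "hold_position"}  # safe default when nothing is reachable
```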