Distributed Colony-Level Algorithm Switching for Robot Swarm Foraging

Swarm robotics utilizes a large number of simple robots to accomplish a task, instead of a single complex robot. Communications constraints often force these systems to be distributed and leaderless, placing restrictions on the types of algorithms which can be executed by the swarm. The performance of a swarm algorithm is a(cid:27)ected by the environment in which the swarm operates. Di(cid:27)erent environments may call for di(cid:27)erent algorithms to be chosen, but often no single robot has enough information to make this decision. In this paper, we focus on foraging as a multi-robot task and present two distributed foraging algorithms, each of which performs best for di(cid:27)er-ent food locations. We then present a third adaptive algorithm in which the swarm as a whole is able to choose the best algorithm for the given situation by combining individual-level and distributed colony-level algorithm switching. We show that this adaptive method combines the bene(cid:28)ts of the other methods, and yields the best overall performance.


Introduction
The performance of a robot swarm algorithm is aected by the environment in which it operates.In a search task for example, the location and number of search targets and presence or absence of obstacles could aect the eciency of the swarm.One algorithm may be fast but fail in the presence of obstacles, and a slower one may be more resilient.Because the specics of the environment are generally not known before hand, we would like the swarm itself to be able to intelligently change its own algorithm based on the environment.
Harvard University {nhoff, rjwood, rad}@eecs.harvard.eduAn interesting challenge is whether a robot swarm, as a whole, can assess the success or failure of an algorithm and, as a whole, switch algorithms to increase its success.
Colony-level algorithm switching is dicult for two reasons.First, the information on which the colony will base its decision is distributed throughout the environment and not detectable by any one robot.Therefore, the environment detection must be distributed.Second, if a swarm algorithm relies on strong coordination, then switches must be nearly unanimous and synchronized.Changing algorithms will not help unless all individuals change to the same algorithm at the same time.Both of these actionsenvironment sensing and algorithm switchingmust truly take place on the colony level, as opposed to the individual level.
In this work, we focus on the problem of foraging for robot swarms.Foraging algorithms enable a collection of robots to search a space for a goal (the `food'), then return it incrementally to the nest.We assume robots with simple sensing and communication capabilities that can exchange simple messages (a few bits) directionally with other robots within a short range.Since the robots do not have global localization or odometry, coordination within the swarm is essential for performing the task eciently.
We present three distributed foraging algorithms.All of the algorithms are based on a few core sub-algorithms which the swarm intelligently switches between in various combinations and at various times.The rst is a gradient based method in which robots form a stationary beacon eld around the nest to create gradients to the nest and food.The second is an areasweeping algorithm (called sweeper) in which robots use virtual forces to coordinate and form a line which sweeps the world.Finally, in the third algorithm (adaptive), the colony cooperates to detect when one algorithms is failing and switches to another, thus increasing performance.
To evaluate the algorithms, we place food at varying distances from the nest.We nd that the gradient algorithm works fast but only in a short range, and the sweeper algorithms is slower but has a larger range.We show that the adaptive algorithm is capable of high-level switching, and that the swarm is able to choose the algorithm best suited to the current food location.

Related Work
Gradient-based Foraging Gradient-based foraging methods create a gradient leading to the goal, using the sensing capabilities of the robot, such as chemical sensing or communication.Algorithms using many dierent types of sensing capabilities have been studied.Algorithms exist for robots with global positioning and global communication [18], robots that use physical marks to leave a trail [9,11], robots that use a pre-deployed sensor network [10], and robots that use deployable beacons [6].Our work focuses on robots with directional communication.Payton et.al.have developed an algorithm in which each robot can receive messages in a small radius, and use this to create a virtual pheromone.Our gradient algorithm [8] is similar to [5], [19], and [21] in that directional communication is used to transmit relative position information to establish the gradient.Networking researchers would also recognize the gradient algorithm as being very similar to hop-count routing [22].
Area-Sweeping Foraging In this algorithm, inter-robot virtual forces are used to make movement decisions which cause the swarm to adapt a line shape.Spears studied physics-based control of vehicle swarms, with attractive and repulsive forces forming a lattice of vehicles [20].In the eld of sensor networks, Howard et.al.have used potential elds to achieve dispersion of nodes [7].The closest work to our application is Balch's notion of social potentials [3].Social potentials involve robots navigating to a goal while remaining in a formation, feeling virtual forces based on the position of the goal and the relative positions of other robots.These algorithms are focused on maintaining a formation.We use a similar concept in which robots feel virtual forces, and we show how to use the formation to search the region.
We use this to develop a virtual forces-based foraging method.
Algorithm Switching Multiple behaviors within a single algorithm are wellknown inside the swarm algorithm community [16].Matari¢ and Arkin have worked extensively in behavior-based robotics [2,4].It is common for individual robots to switch between the behaviors of food collecting, obstacle avoidance, and resting, for example.Parker has studied distributed consensus in a swarm setting, using it to enable the swarm to move between subtasks in an overall task [17].We focus on foraging, and do not use formal distributed consensus (or assume that our robots are `well-stirred ').McLurkin has developed a large range of robot swarm behaviors [14] as well as dynamic task assignment methods for individual robots within a swarm [13].These methods focus on individuals, whereas we need a method for the swarm as a whole to switch algorithms.
The central contribution of this work is an adaptive foraging method in which the swarm makes colony-level decisions based on distributed information, choosing the algorithm best suited to the given food location.

Robot and Task Model
For our robots, we use a simple model inspired by recent swarm robot hardware, such as the E-Puck [15] (shown in gure 1b), the RBZ communication board (an E-Puck extension described in [1]), McLurkin's SwarmBots [12], and Payton's pherobots [5].We assume a simple non-holonomic robot that moves and turns in continuous space.Each robot has sensors for nest, food, and obstacles in direct proximity to the robot.The sweeper algorithm also requires two of the robots to have compasses.Each robot can communi- For the foraging task, we assume a world with a nest in the middle and one unlimited food source placed randomly.The swarm must nd the food, then begin returning food units to the nest.Robots can pick up / drop food units when in direct proximity to the food / nest.Because the robots have no direct position measurement system, they must coordinate in order to maintain and share information about their own position and the food position.
To test the algorithms, we developed a continuous-world multi-robot simulator.The simulator models robots, food, and the nest, along with proper movement and interactions (collisions, communication).We chose a continuous world model over a gridded world environment so that the algorithms would face real problems such as collisions and congestion.A snapshot of the simulator is shown in gure 1c.

Algorithm Description
In this section, we will describe the three algorithms and the means by which algorithm switches are made at the individual-and colony-level.Figure 2 diagrams the relationship between the algorithms and their parts.

Gradient Algorithm
Robots need a way to navigate to the food and the nest.The gradient algorithm provides this information using two gradientsone leading to the nest, and a second leading to the food once it is found.To implement these gradients, some robots decide to stop their normal food searching and become xed  beacons.These beacons transmit two numbers (one for each gradient), which the remaining robots can use to navigate.As the swarm expands outward from the nest, the beacon network expands and creates the nest gradient.
Once the food is found, the food gradient is developed.At that point, robots can navigate to either location to eciently return food.(See also [8].)

Local Description
Beacon: Robots acting as beacons are stationary, and broadcast two numbers, called nestGradient and foodGradient.They listen for all other beacons in their communication range and record the nestGradient and food-Gradient of each one.Beacons nd the minimum of all nestGradient values they have received, increment that by one, and take that as their own nest-Gradient.An analogous procedure is used to calculate foodGradient.These new values are then broadcast by the beacon.Any beacon directly next to the nest/food broadcasts a 1 for its nestGradient/foodGradient.
If a beacon has no information about its distance from the food (as happens early in the run, before the food has been found), it broadcasts 0. The value of 0 is treated speciallywhen a beacon hears a 0, it does not include it in its normal `minimum plus one' calculation.
Walker: Walker robots always attempt to navigate either to the food or the nest, depending on whether they have food.In either case, a walker measures the bearing to the minimum gradient value toward the target of interest, and moves in that direction.If a walker has no information about where it should go (it can only hear 0), it does a random walk.
Beacon to Walker transition: If a beacon robot can detect more than 4 other beacons, it will become a walker robot with a 20% chance.This Walker to beacon transition: A walker robot will decide to become a beacon if it can only detect 1 or 2 other beacons.

Global Behavior
All robots start as walkers clustered around the nest.Some of them will decide to become beacons immediately because initially, there are no beacons.The remaining walkers will begin searching for the food.There will be no food-Gradient (it will all be 0), so they will random walk.As they wander away from the nest, some will decide to become beacons, and the beacon eld will expand away from the nest.Eventually one of the robots could stumble across the food, and would then begin transmitting 1 for its foodGradient, causing the food gradient to form.At this point, any walker can listen to the beacons near it and know how far it is from the food and how to get there.All the walkers immediately start moving directly toward the food.As walkers pick up food, they use the gradient eld to bring it to the nest.Figure 3a illustrates an example snapshot of the gradient algorithm.

Sweeper Algorithm
A dierent strategy for search or foraging involves individuals forming a search front and systematically sweeping an area to nd an object.Here we describe an algorithm that uses virtual forces to form a line of robots extending from the nest that sweeps the world like the hand of a clock.When the line nds food, some fraction of the robots remain as beacons while others act as walkers to return the food.
Fundamentally, this strategy creates a 1D structure of robots (roughly a line), as opposed to the gradient strategy which creates a 2D structure of robots (roughly a circle).The 1D structure is expected to be able to sweep a larger area than could be `lled in' with the same number of robots

Local Description
Normal: In the sweeper algorithm, all robots are always transmitting.Each robot measures the range and bearing to all the other robots in its communication range.Based on the position of each other robot, it calculates a virtual force on itself.For each robot detected at relative position − → r , this force is where r c is the communication range of the robot and a and b are empirically chosen constants.In other words, the force is similar to what would be experienced if there were a virtual spring between the robots.The robot sums the virtual forces from each other robot in its communication range, and moves in that direction by an amount proportional to the magnitude of the force.
There is one special case: two robots directly next to the nest never move regardless of the virtual forces on them.
While the robots are calculating forces and moving, they are simultaneously using communication to establish a gradient eld similar to the one described in the gradient algorithm.This does not require extra communication.The data content of the signal encodes the two gradient values, and range and bearing to the transmitter are used to calculate the virtual forces.
The robot treats the nestGradient exactly as before, updating it using the min + 1 algorithm.foodGradient is treated slightly dierently.Any time a robot sees a non-zero foodGradient, it temporarily stops executing the sweeper algorithm and switches to gradient.If the foodGradient returns to zero, the robot returns to executing the sweeper algorithm.
Puller: Two robots are pre-determined to be puller robots.These participate in the virtual forces system described above, but they also feel one additional force.These two robots must use their compasses to measure the relative bearing to north (the unit vector N ), then put a virtual `clock' force on themselves equal to where R (θ) is simply the 2D rotation matrix, R (θ) = cos (θ) − sin (θ) sin (θ) cos (θ) − → F c is a force which simply rotates around like the hand of a clock as time t increases.The parameter c is an empirically chosen magnitude and T sets the period of the rotation.T is determined by the puller, and can be changed at any time based on the puller's nestGradient value.

Global Behavior
When the sweeper algorithm begins, all robots calculate forces and begin moving appropriately.Initially, repulsive forces cause the swarm to expand into a tight clump around the nest.The pullers will be forced to the edge of the pack and a line of robots will form extending from the nest to the pullers.This line of robots will rotate as the pullers pull it around, sweeping around the world like the hand of a clock.When the line encounters food, it stops moving and the swarm returns the food using the gradient algorithm, with walkers moving along the line of robots already established.When the food source is exhausted, the forces resume and the line keeps sweeping.
One can imagine that longer lines (with more robots) would need to rotate slower than shorter lines.Hard-coding the period T would require knowing the number of robots in the swarm, which is not a scalable solution.Instead, the pullers set T based on their nestGradient.A higher value for nest-Gradient indicates a long line, so the puller will choose a larger T .Figure 3b illustrates an example snapshot of the sweeper algorithm.

Adaptive Algorithm
The rst two algorithms, gradient and sweeper, each have strengths and weaknesses.Gradient operates in a short range but is fast, while sweeper has a longer range but is much slower.(This is quantied in section 4.) The adaptive algorithm combines the benets of these two algorithms, along with random walk, by trying each one in sequence and choosing the best one for a given situation.It rst tries gradient, which would work well if the food is near enough to use it.If not, it switches to sweeper to get food further away.If it still doesn't nd the food, it switches to the last resort random walk.
Switches between these three algorithms are made in a distributed manner at the colony level.To accomplish this, a third gradient is included.

Local Description
Robots begin with an algorithm very similar to the gradient algorithm above.
They are split between walker and beacon robots as before, but they maintain three gradients as opposed to two.The third one measures how far each beacon is from any walker robot.This requires all walker robots to transmit a single bit of information indicating their presence and identity as a walker.
The beacons then transmit a 1 for the walkerGradient if they can see a walker, and min + 1 if they can not see a walker.Other than this, they execute the gradient algorithm exactly as described above.
To implement adaptive foraging, robots need to explicitly detect when to change algorithms.If a robot does not see any (non-zero) walkerGradient for several time steps in a row, it will completely switch algorithms from gradient to sweeper.Thereafter, if the line of robots has swept the world twice and still has not found food (as evaluated by the pullers), the pullers send a signal through the beacon network causing every robot to again switch algorithms to random walk.

Global Behavior
The swarm will begin executing the gradient algorithm.There are many walkers at the beginning of the execution, so the values for walkerGradient are all fairly low.If the swarm nds the food, it returns it as usual, and the walkerGradient remains irrelevant.If, however, the swarm expands to the point that all robots have become beacons and the swarm has still not found the food, then there will be no walkers left.When the last walker becomes a beacon, suddenly no beacon anywhere in the swarm has information on which to broadcast a walkerGradient, so all walkerGradient values revert to 0. A short time later, they all decide nearly-simultaneously to switch algorithms and begin the sweeper algorithm.From this point on, the swarm proceeds as normal as if it had just begun the sweeper algorithm, pulling out a line of robots and sweeping the world.Once this line has swept around the world twice without nding food, the swarm switches to random walk, which is the only option left.Random walk does not involve coordination, which relieves the robots of the requirement of staying near each other.This is the only way to get food so far away.

Performance
These algorithms were tested in a continuous-world multi-agent simulator (screenshot in gure 1c).An unlimited food source was placed at varying distances from the nest, with a swarm of 20 robots trying to nd and retrieve it (as diagrammed in gure 4a).Since we are focusing on algorithm switching and the environmental impact, we chose to experiment with a xed number of robots.In the future, we will study scalability of the individual algorithms more closely.
We assessed the performance of the algorithms using three simple metrics: (1) whether or not the swarm found the food, (2) how quickly it found the Outside r 3 , even the sweeper algorithm fails because the line of robots can not reach that far.In r 4 , the adaptive algorithm switches to random walk.This works poorly (∼ 20% success rate, slow to locate and return food), but beyond about 3m, it is the only method capable of nding any food at all.
In every region, the adaptive algorithm is able to choose the most appropriate foraging method.In r 2 it runs the gradient method, in r 3 it runs the sweeper method, and in r 4 it runs random walk.
Overall Assessment As an overall assessment of each algorithm, we can place food in an unknown random location, and measure the performance.
In the table below, r 1 , r 2 , or r 3 indicates that the food is placed randomly anywhere in those three regions, and whole world indicates that the food is randomly placed anywhere.The numbers are averages over all placements.
For example, if the food is randomly placed anywhere and the sweeper algorithm is running, the swarm can be expected to nd it 32% of the time, after an average of 7300 time steps with a standard deviation of 3500 time steps.
(One time step roughly corresponds to one second.)In r 1 , r 2 , or r 3 , the adaptive algorithm nds the food almost as often as the sweeper algorithm, but does so faster.This is because it is able to take advantage of the speed of the gradient method when the food is nearby.When very distant food locations are included (whole world), all algorithms suer lower success, but the adaptive algorithm is able to use random walk to at least achieve some success in a very long time.

Algorithm Switching Generalizations
Algorithms other than those presented in this paper could potentially be combined into a single adaptive algorithm using the same method.The critical requirement is the connectedness of the communication network.This section will discuss several generalizations we can draw about colony-level algorithm switching, beyond the specic cases discussed in this paper.
There are two kinds of algorithm switches: individual switches and colony switches (see gure 2), with two main dierences between the types.First, individual switches are made by a particular robot based on the information it can perceive, whereas colony-level switches require information which is distributed throughout the environment and not directly perceivable by every robot making the decision.Second, colony-level switches must be nearly synchronized, whereas individual switches need not be explicitly coordinated.5a shows the success rate of each algorithm.Figure 5b shows the time at which food was rst found.Figure 5c shows the rate at which food is returned to the nest once it is found.Each point represents an average of 100 runs, and the error bars indicate one standard deviation.Note the log scales in the second two plots.
To achieve a colony-level algorithm switch, information must be shared throughout the swarm, because each robot requires global-level knowledge in order to decide to switch algorithms.For example, when the adaptive algorithm switches from gradient to sweeper, each robot must be aware that no walkers are left, but no single robot is capable of perceiving this.This global information is detected and shared through the beacon network.Because global information is required for colony-level switches, maintaining the connectedness of the beacon network is critical for these switches.
We can distinguish three types of information: information type example from this paper directly perceivable sense a beacon ahead directly perceivable by another robot someone found food only perceivable by swarm as a whole swarm has expanded to max size Individual switches can be made solely based on information of the rst type.Colony switches require the second and third types, which requires a beacon network.There is no robot in the swarm with a sensor capable of detecting the third type of information; it can only be detected by the swarm as a whole through cooperation.A connected network is critical for detecting and transmitting this information.

Conclusion
We have presented two distributed foraging algorithms which perform best under dierent food locations, and a third method in which the swarm as a whole can choose the best algorithm for the given situation.The gradient algorithm can return nearby food quickly, and the sweeper algorithm can nd food further away but is much slower.The adaptive algorithm uses the gradient, sweeper, and random walk methods, detecting in a distributed manner if one has failed and switching to the next.For food in any region of the world, the adaptive method is able to choose the most appropriate algorithm.
Colonylevel algorithm switching requires communication, but can combine benets of multiple algorithms and improve overall performance.
There are several possible improvements and expansions planed for the future, both in hardware and software.Although these algorithms are designed to be scalable, scalability will be tested experimentally.Second, we will consider methods by which swarms could switch algorithms depending on dynamic environments (which could require them to switch back to previously tried algorithms).Finally, we will conduct a hardware study on the eect of sensor / communication capability on algorithm possibilities and performance.This study will use the E-Puck robots and IR communication rings described earlier.

Fig. 1 :
Fig. 1: The communication and sensing capabilities of the robots (1a), the E-Puck robot with communication ring (1b), and a simulation snapshot with a blowup for detail (1c).

Fig. 2 :
Fig. 2: This gure describes the relationships between the gradient, sweeper, and adaptive algorithms, and their switches.The start arrows indicate how each algorithm begins.Bold arrows indicate colony-level switches and dotted arrows indicate individual-level switches.

Fig. 3 :
Fig. 3: Example situations in the gradient algorithm (a) and the sweeper algorithm (b).Walkers in (a) are not shown to reduce clutter.The gray contours roughly depict the gradient to the nest.
Fig.5: Figure5ashows the success rate of each algorithm.Figure5b shows 2 , or r 3