The tools module contains the operators for evolutionary algorithms. They are used to modify, select and move the individuals in their environment. The set of operators it contains are readily usable in the Toolbox. In addition to the basic operators this module also contains utility tools to enhance the basic algorithms with Statistics, HallOfFame, Checkpoint, and History.
The operator set does the minimum job for transforming or selecting individuals. This means, for example, that providing two individuals to the crossover will transform those individuals in-place. The responsibility of making offspring(s) independent of their parent(s) and invalidating the fitness is left to the user and is generally fulfilled in the algorithms by calling toolbox.clone() on an individuals to duplicate it and del on the values attribute of the individual’s fitness to invalidate to invalidate this last one.
Here is a list of the implemented operators in DEAP.
Call the function container with a generator function corresponding to the calling n times the function func.
This helper function can can be used in conjunction with a Toolbox to register a generator of filled containers, as individuals or population.
>>> initRepeat(list, random.random, 2)
...
[0.4761..., 0.6302...]
Call the function container with an iterable as its only argument. The iterable must be returned by the method or the object generator.
This helper function can can be used in conjunction with a Toolbox to register a generator of filled containers, as individuals or population.
>>> from random import sample
>>> from functools import partial
>>> gen_idx = partial(sample, range(10), 10)
>>> initIterate(list, gen_idx)
[4, 5, 3, 6, 0, 9, 2, 7, 1, 8]
Call the function container with a generator function corresponding to the calling n times the functions present in seq_func.
This helper function can can be used in conjunction with a Toolbox to register a generator of filled containers, as individuals or population.
>>> func_seq = [lambda:1 , lambda:'a', lambda:3]
>>> initCycle(list, func_seq, 2)
[1, 'a', 3, 1, 'a', 3]
Generate an expression where each leaf has a the same depth between min and max.
Generate an expression where each leaf might have a different depth between min and max.
Execute a two points crossover on the input individuals. The two individuals are modified in place. This operation apply on an individual composed of a list of attributes and act as follow
>>> ind1 = [A(1), ..., A(i), ..., A(j), ..., A(m)]
>>> ind2 = [B(1), ..., B(i), ..., B(j), ..., B(k)]
>>> # Crossover with mating points 1 < i < j <= min(m, k) + 1
>>> cxTwoPoints(ind1, ind2)
>>> print ind1, len(ind1)
[A(1), ..., B(i), ..., B(j-1), A(j), ..., A(m)], m
>>> print ind2, len(ind2)
[B(1), ..., A(i), ..., A(j-1), B(j), ..., B(k)], k
This function use the randint() function from the python base random module.
Execute a one point crossover on the input individuals. The two individuals are modified in place. This operation apply on an individual composed of a list of attributes and act as follow
>>> ind1 = [A(1), ..., A(n), ..., A(m)]
>>> ind2 = [B(1), ..., B(n), ..., B(k)]
>>> # Crossover with mating point i, 1 < i <= min(m, k)
>>> cxOnePoint(ind1, ind2)
>>> print ind1, len(ind1)
[A(1), ..., B(i), ..., B(k)], k
>>> print ind2, len(ind2)
[B(1), ..., A(i), ..., A(m)], m
This function use the randint() function from the python base random module.
Execute a uniform crossover that modify in place the two individuals. The genes are swapped according to the indpb probability.
This function use the random() function from the python base random module.
Execute a partially matched crossover (PMX) on the input individuals. The two individuals are modified in place. This crossover expect iterable individuals of indices, the result for any other type of individuals is unpredictable.
Moreover, this crossover consists of generating two children by matching pairs of values in a certain range of the two parents and swapping the values of those indexes. For more details see Goldberg and Lingel, “Alleles, loci, and the traveling salesman problem”, 1985.
For example, the following parents will produce the two following children when mated with crossover points a = 2 and b = 4.
>>> ind1 = [0, 1, 2, 3, 4]
>>> ind2 = [1, 2, 3, 4, 0]
>>> cxPartialyMatched(ind1, ind2)
>>> print ind1
[0, 2, 3, 1, 4]
>>> print ind2
[2, 3, 1, 4, 0]
This function use the randint() function from the python base random module.
Execute a uniform partially matched crossover (UPMX) on the input individuals. The two individuals are modified in place. This crossover expect iterable individuals of indices, the result for any other type of individuals is unpredictable.
Moreover, this crossover consists of generating two children by matching pairs of values chosen at random with a probability of indpb in the two parents and swapping the values of those indexes. For more details see Cicirello and Smith, “Modeling GA performance for control parameter optimization”, 2000.
For example, the following parents will produce the two following children when mated with the chosen points [0, 1, 0, 0, 1].
>>> ind1 = [0, 1, 2, 3, 4]
>>> ind2 = [1, 2, 3, 4, 0]
>>> cxUniformPartialyMatched(ind1, ind2)
>>> print ind1
[4, 2, 1, 3, 0]
>>> print ind2
[2, 1, 3, 0, 4]
This function use the random() and randint() functions from the python base random module.
Executes a blend crossover that modify in-place the input individuals. The blend crossover expect individuals formed of a list of floating point numbers.
This function use the random() function from the python base random module.
Execute a blend crossover on both, the individual and the strategy. The individuals must have a strategy attribute. Adjustement of the minimal strategy shall be done after the call to this function using a decorator, for example
def checkStrategy(minstrategy):
def decMinStrategy(func):
def wrapMinStrategy(*args, **kargs):
children = func(*args, **kargs)
for child in children:
if child.strategy < minstrategy:
child.strategy = minstrategy
return children
return wrapMinStrategy
return decMinStrategy
toolbox.register("mate", tools.cxEsBlend, alpha=ALPHA)
toolbox.decorate("mate", checkStrategy(minstrategy=0.01))
Execute a classical two points crossover on both the individual and its strategy. The crossover points for the individual and the strategy are the same.
Executes a simulated binary crossover that modify in-place the input individuals. The simulated binary crossover expect individuals formed of a list of floating point numbers.
This function use the random() function from the python base random module.
Execute a one point crossover that will in most cases change the individuals size. This operation apply on an individual composed of a list of attributes and act as follow
>>> ind1 = [A(1), ..., A(i), ..., A(m)]
>>> ind2 = [B(1), ..., B(j), ..., B(n)]
>>> # Crossover with mating points i, j, 1 <= i <= m, 1 <= j <= n
>>> cxMessyOnePoint(ind1, ind2)
>>> print ind1, len(ind1)
[A(1), ..., A(i - 1), B(j), ..., B(n)], n + j - i
>>> print ind2, len(ind2)
[B(1), ..., B(j - 1), A(i), ..., A(m)], m + i - j
This function use the randint() function from the python base random module.
Randomly select in each individual and exchange each subtree with the point as root between each individual.
Randomly select in each individual and exchange each subtree with the point as root between each individual. Since the node are strongly typed, the operator then make sure the type of second node correspond to the type of the first node. If it doesn’t, it randomly selects another point in the second individual and try again. It tries up to 5 times before giving up on the crossover.
Note
This crossover is subject to change for a more effective method of selecting the crossover points.
Randomly select crossover point in each individual and exchange each subtree with the point as root between each individual.
This operator takes another parameter cxtermpb, which set the probability to choose between a terminal or non-terminal crossover point. For instance, as defined by Koza, non-terminal primitives are selected for 90% of the crossover points, and terminals for 10%, so cxtermpb should be set to 0.1.
Randomly select crossover point in each individual and exchange each subtree with the point as root between each individual. Since the node are strongly typed, the operator then make sure the type of second node correspond to the type of the first node. If it doesn’t, it randomly selects another point in the second individual and try again. It tries up to 5 times before giving up on the crossover.
This operator takes another parameter cxtermpb, which set the probability to choose between a terminal or non-terminal crossover point. For instance, as defined by Koza, non-terminal primitives are selected for 90% of the crossover points, and terminals for 10%, so cxtermpb should be set to 0.1.
Note
This crossover is subject to change for a more effective method of selecting the crossover points.
This function applies a gaussian mutation of mean mu and standard deviation sigma on the input individual and returns the mutant. The individual is left intact and the mutant is an independant copy. This mutation expects an iterable individual composed of real valued attributes. The mutIndxPb argument is the probability of each attribute to be mutated.
Note
The mutation is not responsible for constraints checking, because there is too many possibilities for resetting the values. Which way is closer to the representation used is up to you.
One easy way to add constraint checking to an operator is to use the function decoration in the toolbox. See the multi-objective example (moga_kursawefct.py) for an explicit example.
This function uses the random() and gauss() functions from the python base random module.
Shuffle the attributes of the input individual and return the mutant. The individual is left intact and the mutant is an independent copy. The individual is expected to be iterable. The shuffleIndxPb argument is the probability of each attribute to be moved.
This function uses the random() and randint() functions from the python base random module.
Flip the value of the attributes of the input individual and return the mutant. The individual is left intact and the mutant is an independent copy. The individual is expected to be iterable and the values of the attributes shall stay valid after the not operator is called on them. The flipIndxPb argument is the probability of each attribute to be flipped.
This function uses the random() function from the python base random module.
Mutate an evolution strategy according to its strategy attribute as described in Beyer and Schwefel, 2002, Evolution strategies - A Comprehensive Introduction. The individual is first mutated by a normal distribution of mean 0 and standard deviation of then the strategy is mutated according to an extended log normal rule, , with and . A recommended choice is when using a evolution strategy (Beyer and Schwefel, 2002).
The strategy shall be the same size as the individual. Each index (strategy and attribute) is mutated with probability indpb. In order to limit the strategy, use a decorator as shown in the cxESBlend() function.
Randomly select a point in the Tree, then replace the subtree with the point as a root by a randomly generated expression. The expression is generated using the method expr.
The mutation of strongly typed GP expression is pretty easy. First, it finds a subtree. Second, it has to identify the return type of the root of this subtree. Third, it generates a new subtree with a root’s type corresponding to the original subtree root’s type. Finally, the old subtree is replaced by the new subtree.
This operator mutates the individual individual that are subjected to it. The operator randomly chooses a primitive in the individual and replaces it with a randomly selected primitive in pset that takes the same number of arguments.
This operator works on strongly typed trees as on normal GP trees.
This operator works on the constants of the tree ind. In mode "one", it will change the value of one of the individual ephemeral constants by calling its generator function. In mode "all", it will change the value of all the ephemeral constants.
This operator works on strongly typed trees as on normal GP trees.
This operator shrinks the individual individual that are subjected to it. The operator randomly chooses a branch in the individual and replaces it with one of the branch’s arguments (also randomly chosen).
This operator is not usable with STGP.
This operator mutate the GP tree of the individual passed and the primitive set expr, by inserting a new branch at a random position in a tree, using the original subtree at this position as one argument, and if necessary randomly selecting terminal primitives to complete the arguments of the inserted node. Note that the original subtree will become one of the children of the new primitive inserted, but not perforce the first (its position is randomly selected if the new primitive has more than one child)
This operator works on strongly typed trees as on normal GP trees.
Select k individuals from the input individuals using k tournaments of tournSize individuals. The list returned contains references to the input individuals.
This function uses the choice() function from the python base random module.
Select k individuals from the input individuals using k spins of a roulette. The selection is made by looking only at the first objective of each individual. The list returned contains references to the input individuals.
This function uses the random() function from the python base random module.
Warning
The roulette selection by definition cannot be used for minimization or when the fitness can be smaller or equal to 0.
Apply NSGA-II selection operator on the individuals. Usually, the size of individuals will be larger than k because any individual present in individuals will appear in the returned list at most once. Having the size of individuals equals to n will have no effect other than sorting the population according to a non-domination scheme. The list returned contains references to the input individuals.
For more details on the NSGA-II operator see Deb, Pratab, Agarwal, and Meyarivan, “A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II”, 2002.
Apply SPEA-II selection operator on the individuals. Usually, the size of individuals will be larger than n because any individual present in individuals will appear in the returned list at most once. Having the size of individuals equals to n will have no effect other than sorting the population according to a strength Pareto scheme. The list returned contains references to the input individuals.
For more details on the SPEA-II operator see Zitzler, Laumanns and Thiele, “SPEA 2: Improving the strength Pareto evolutionary algorithm”, 2001.
Select k individuals at random from the input individuals with replacement. The list returned contains references to the input individuals.
This function uses the choice() function from the python base random module.
Select the k best individuals among the input individuals. The list returned contains references to the input individuals.
Select the k worst individuals among the input individuals. The list returned contains references to the input individuals.
Perform a ring migration between the populations. The migration first select k emigrants from each population using the specified selection operator and then replace k individuals from the associated population in the migarray by the emigrants. If no replacement operator is specified, the immigrants will replace the emigrants of the population, otherwise, the immigrants will replace the individuals selected by the replacement operator. The migration array, if provided, shall contain each population’s index once and only once. If no migration array is provided, it defaults to a serial ring migration (1 – 2 – ... – n – 1). Selection and replacement function are called using the signature selection(populations[i], k) and replacement(populations[i], k). It is important to note that the replacement strategy must select k different individuals. For example, using a traditional tournament for replacement strategy will thus give undesirable effects, two individuals will most likely try to enter the same slot.
A statistics object that holds the required data for as long as it exists. When created the statistics object receives a key argument that is used to get the required data, if not provided the key argument defaults to the identity function. A statistics object can be represented as a 4 dimensional matrix. Along the first axis (wich length is given by the n argument) are independent statistics objects that are used on different collections given this index in the update() method. The second axis is the function it-self, each element along the second axis (indexed by their name) will represent a different function. The third axis is the accumulator of statistics. each time the update function is called the new statistics are added using the registered functions at the end of this axis. The fourth axis is used when the entered data is an iterable (for example a multiobjective fitness).
Data can be retrieved by different means in a statistics object. One can use directly the registered function name with an index argument that represent the first axis of the matrix. This method returns the last entered data.
>>> s = Statistics(n=2)
>>> s.register("Mean", mean)
>>> s.update([1, 2, 3, 4], index=0)
>>> s.update([5, 6, 7, 8], index=1)
>>> s.Mean(0)
[2.5]
>>> s.Mean(1)
[6.5]
An other way to obtain the statistics is to use directly the []. In that case all dimensions must be specified. This is how stats that have been registered earlier in the process can be retrieved.
>>> s.update([10, 20, 30, 40], index=0)
>>> s.update([50, 60, 70, 80], index=1)
>>> s[0]["Mean"][0]
[2.5]
>>> s[1]["Mean"][0]
[6.5]
>>> s[0]["Mean"][1]
[25]
>>> s[1]["Mean"][1]
[65]
Finally, the fourth dimension is used when stats are needed on lists of lists. The stats are computed on the matching indices of each list.
>>> s = Statistics()
>>> s.register("Mean", mean)
>>> s.update([[1, 2, 3], [4, 5, 6]])
>>> s.Mean()
[2.5, 3.5, 4.5]
>>> s[0]["Mean"][-1][0]
2.5
Register a function function that will be apply on the sequence each time update() is called. The function result will be accessible by using the string given by the argument name as a function of the statistics object.
>>> s = Statistics()
>>> s.register("Mean", mean)
>>> s.update([1,2,3,4,5,6,7])
>>> s.Mean()
[4.0]
Apply to the input sequence seq each registered function and store each result in a list specific to the function and the data index index.
>>> s = Statistics()
>>> s.register("Mean", mean)
>>> s.register("Max", max)
>>> s.update([4,5,6,7,8])
>>> s.Max()
[8]
>>> s.Mean()
[6.0]
>>> s.update([1,2,3])
>>> s.Max()
[3]
>>> s[0]["Max"]
[[8], [3]]
>>> s[0]["Mean"]
[[6.0], [2.0]]
Returns the arithmetic mean of the sequence seq = as .
Returns the median of seq - the numeric value separating the higher half of a sample from the lower half. If there is an even number of elements in seq, it returns the mean of the two middle values.
Returns the variance of seq = as , where is the arithmetic mean of seq.
Returns the square root of the variance of seq.
The hall of fame contains the best individual that ever lived in the population during the evolution. It is sorted at all time so that the first element of the hall of fame is the individual that has the best first fitness value ever seen, according to the weights provided to the fitness at creation time.
The class HallOfFame provides an interface similar to a list (without being one completely). It is possible to retrieve its length, to iterate on it forward and backward and to get an item or a slice from it.
Update the hall of fame with the population by replacing the worst individuals in it by the best individuals present in population (if they are better). The size of the hall of fame is kept constant.
Insert a new individual in the hall of fame using the bisect_right() function. The inserted individual is inserted on the right side of an equal individual. Inserting a new individual in the hall of fame also preserve the hall of fame’s order. This method does not check for the size of the hall of fame, in a way that inserting a new individual in a full hall of fame will not remove the worst individual to maintain a constant size.
Remove the specified index from the hall of fame.
Clear the hall of fame.
The Pareto front hall of fame contains all the non-dominated individuals that ever lived in the population. That means that the Pareto front hall of fame can contain an infinity of different individuals.
The size of the front may become very large if it is used for example on a continuous function with a continuous domain. In order to limit the number of individuals, it is possible to specify a similarity function that will return True if the genotype of two individuals are similar. In that case only one of the two individuals will be added to the hall of fame. By default the similarity function is operator.__eq__().
Since, the Pareto front hall of fame inherits from the HallOfFame, it is sorted lexicographically at every moment.
Update the Pareto front hall of fame with the population by adding the individuals from the population that are not dominated by the hall of fame. If any individual in the hall of fame is dominated it is removed.
A checkpoint is a file containing the state of any object that has been hooked. While initializing a checkpoint, add the objects that you want to be dumped by appending keyword arguments to the initializer or using the add(). By default the checkpoint tries to use the YAML format which is human readable, if PyYAML is not installed, it uses pickling which is not readable. You can force the use of pickling by setting the argument yaml to False.
In order to use efficiently this module, you must understand properly the assignment principles in Python. This module use the pointers you passed to dump the object, for example the following won’t work as desired
>>> my_object = [1, 2, 3]
>>> cp = Checkpoint(obj=my_object)
>>> my_object = [3, 5, 6]
>>> cp.dump("example")
>>> cp.load("example.ems")
>>> cp["obj"]
[1, 2, 3]
In order to dump the new value of my_object it is needed to change its internal values directly and not touch the label, as in the following
>>> my_object = [1, 2, 3]
>>> cp = Checkpoint(obj=my_object)
>>> my_object[:] = [3, 5, 6]
>>> cp.dump("example")
>>> cp.load("example.ems")
>>> cp["obj"]
[3, 5, 6]
Dump the current registered objects in a file named prefix.ecp, the randomizer state is always added to the file and available under the "randomizer_state" tag.
Load a checkpoint file retrieving the dumped objects, it is not safe to load a checkpoint file in a checkpoint object that contains references as all conflicting names will be updated with the new values.
Add objects to the list of objects to be dumped. The object is added under the name specified by the argument’s name. Keyword arguments are mandatory in this function.
Remove objects with the specified name from the list of objects to be dumped.
The History class helps to build a genealogy of all the individuals produced in the evolution. It contains two attributes, the genealogy_tree that is a dictionary of lists indexed by individual, the list contain the indices of the parents. The second attribute genealogy_history contains every individual indexed by their individual number as in the genealogy tree.
The produced genealogy tree is compatible with NetworkX, here is how to plot the genealogy tree
hist = History()
# Do some evolution and fill the history
import matplotlib.pyplot as plt
import networkx as nx
g = nx.DiGraph(hist.genealogy_tree)
nx.draw_springs(g)
plt.show()
Note
The genealogy tree might get very big if your population and/or the number of generation is large.
Populate the history with the initial individuals. An attribute history_index is added to every individual, this index will help to track the parents and the children through evolution. This index will be modified by the update() method when a child is produced. Modifying the internal genealogy_index of the history or the history_index of an individual may lead to unpredictable results and corruption of the history.
Update the history with the new individuals. The index present in their history_index attribute will be used to locate their parents and modified to a unique one to keep track of those new individuals.
Property that returns an appropriate decorator to enhance the operators of the toolbox. The returned decorator assumes that the individuals are returned by the operator. First the decorator calls the underlying operation and then calls the update function with what has been returned by the operator as argument. Finally, it returns the individuals with their history parameters modified according to the update function.
Decorate a function preserving its signature. There is two way of using this function, first as a decorator passing the decorator to use as argument, for example
@decorate(a_decorator)
def myFunc(arg1, arg2, arg3="default"):
do_some_work()
return "some_result"
Or as a decorator
@decorate
def myDecorator(func):
def wrapFunc(*args, **kargs):
decoration_work()
return func(*args, **kargs)
return wrapFunc
@myDecorator
def myFunc(arg1, arg2, arg3="default"):
do_some_work()
return "some_result"
Using the inspect module, we can retrieve the signature of the decorated function, what is not possible when not using this method.
print inspect.getargspec(myFunc)
It shall return something like
(["arg1", "arg2", "arg3"], None, None, ("default",))
This function is a simpler version of the decorator module (version 3.2.0) from Michele Simionato available at http://pypi.python.org/pypi/decorator.