This page is for those seeking information on the use and implementation of the HyperNEAT neuroevolution method, an extension of the NEAT method. The information herein aims to address common questions about HyperNEAT and to provide knowledge to those who wish to apply or extend the method.
If you haven't heard of HyperNEAT, it is a neuroevolution method, which means it evolves artificial neural networks through an evolutionary algorithm. It is extended from a prior neuroevolution algorithm called NeuroEvolution of Augmenting Topologies (NEAT), which also has its own NEAT Users Page. The HyperNEAT publications (link at left) offer a complete introduction to the method and its underlying theory of representation. This section briefly explains the general idea behind it.
In short, HyperNEAT is based on a theory of representation that hypothesizes that a good representation for an artificial neural network should be able to describe its pattern of connectivity compactly. This kind of description is called an encoding. The encoding in HyperNEAT, called compositional pattern producing networks, is designed to represent patterns with regularities such as symmetry, repetition, and repetition with variation. (Click here for an example of CPPN-generated patterns.) Thus HyperNEAT is able to evolve neural networks with these properties. The main implication of this capability is that HyperNEAT can efficiently evolve very large neural networks that look more like neural connectivity patterns in the brain (which are repetitious with many regularities, in addition to some irregularities) and that are generally much larger than what prior approaches to neural learning could produce.
The other unique and important facet of HyperNEAT is that it actually sees the geometry of the problem domain. It is strange to consider, but most neuroevolution algorithms (and most neural learning algorithms in general) are completely blind to domain geometry. For example, when a checkers board position is input into an artificial neural network, it has no idea which piece is next to which piece. If it ever comes to understand the board geometry, it must figure it out for itself. In contrast, when humans play checkers, we know right away the geometry of the board; we do not have to infer it from hundreds of examples of gameplay. HyperNEAT has the same capability. It actually sees the geometry of its inputs (and outputs) and can exploit that geometry to significantly enhance learning. To put it more technically, HyperNEAT computes the connectivity of its neural networks as a function of their geometry.
One implication of HyperNEAT's ability to exploit geometry is that it gives the user a completely new kind of influence over neural network learning. The user can now describe the geometry of the domain to HyperNEAT, which means there is room to be creative. If someone believes that a domain can be described best in a different geometry, it can be tested with HyperNEAT. Thus HyperNEAT opens up a new kind of research direction for artificial neural networks. This geometric layout is called a substrate, which is depicted in the images above.
Thus one way to express what HyperNEAT does is to say it evolves the connectivity pattern for a neural network with a particular substrate geometry.
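As a rough illustration of computing connectivity as a function of geometry, the sketch below queries a CPPN once for every pair of substrate nodes. The function toy_cppn is a hypothetical hand-written stand-in for an evolved CPPN, and the helper name build_weights and the 3-input, 3-output layout are assumptions made only for this example.

```python
import math
from itertools import product


def toy_cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for an evolved CPPN (not from any HyperNEAT package).

    Maps the coordinates of a source node (x1, y1) and a target node (x2, y2)
    to a value in [-1, 1].
    """
    # Gaussian of the horizontal offset, modulated by a sine of the vertical offset.
    return math.exp(-(x1 - x2) ** 2) * math.sin(math.pi * (y2 - y1) / 4.0)


def build_weights(sources, targets, cppn):
    """Query the CPPN once per (source, target) pair of substrate nodes."""
    return {(s, t): cppn(s[0], s[1], t[0], t[1]) for s, t in product(sources, targets)}


# Assumed layout: three inputs along the bottom edge, three outputs along the top.
inputs = [(x, -1.0) for x in (-1.0, 0.0, 1.0)]
outputs = [(x, 1.0) for x in (-1.0, 0.0, 1.0)]
weights = build_weights(inputs, outputs, toy_cppn)
```

Because this toy function depends only on the offsets between nodes, the resulting weight pattern is left-right symmetric, which is the kind of regularity a CPPN can express compactly.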
If you are interested in sharing your own version of HyperNEAT, we are happy to link to it from the catalog. Please email kstanley@eecs.ucf.edu with information on your implementation. Note that HyperNEAT can be built on top of an existing NEAT package, so you may want to start from a NEAT package rather than from a HyperNEAT one.
The question for many people first coming to HyperNEAT is: which package is right for me?
The experiments included in various packages at The HyperNEAT Software Catalog are different. HyperSharpNEAT provides a multi-agent predator-prey experiment, while HyperNEAT C++ includes a visual discrimination task (the "boxes" task) and a checkers experiment. Colin Green's version of HyperNEAT also includes the boxes domain, but unlike Jason Gauci's it is written in C#. Phillip Verbancsics' version includes Keepaway and implements a Bird's Eye View (BEV) substrate. Oliver Coleman's Java package offers experiments focusing on full visual fields. If you are planning a new experiment, it may be helpful to look at the code for similar experiments.
Your best option will be based on some combination of the above considerations. Of course, if you want HyperNEAT for platform X or language Y and such an implementation is not available, you may want to write your own version of HyperNEAT.
HyperNEAT extends the NEAT method. (NEAT evolves the CPPNs that generate networks in HyperNEAT.) Much information is available on NEAT and many implementations are supported. NEAT stands for NeuroEvolution of Augmenting Topologies. It is a method for evolving artificial neural networks with an evolutionary algorithm. NEAT implements the idea that it is most effective to start evolution with small, simple networks and allow them to become increasingly complex over generations. That way, just as organisms in nature increased in complexity since the first cell, so do neural networks in NEAT. This process of continual elaboration allows finding highly sophisticated and complex neural networks.
For more information about NEAT and NEAT software, please visit the NEAT Users Page.
Derek James created a NEAT Users Group on Yahoo! to encourage the discussion of ideas, questions, and variations of NEAT. The community of HyperNEAT users and those interested in HyperNEAT can benefit greatly from the availability of this forum. Please feel free to join the discussion!
What HyperNEAT software and source is available?
Please see the software section.
Why is it called HyperNEAT?
The "Hyper" in "HyperNEAT" comes from the word "Hypercube." The complete name of the approach, which is a mouthful, is "Hypercube-based NeuroEvolution of Augmenting Topologies." The word "Hypercube" describes the approach because a CPPN that describes a connectivity pattern is at least four-dimensional (i.e. it takes inputs x1, y1, x2, and y2). In some cases it may be more than four-dimensional, such as when the encoded connectivity pattern is in three dimensions; in that case the CPPN is six-dimensional (i.e. it takes inputs x1, y1, z1, x2, y2, and z2). These multi-dimensional spaces are usually sampled within the bounds of a hypercube that begins at -1 and ends at 1 on each dimension. Each point within the hypercube represents a connection weight. Thus HyperNEAT is in effect painting a pattern on the inside of a hypercube. That pattern is then interpreted as the connectivity pattern of a neural network. The rest of the name, NEAT, comes from the NeuroEvolution of Augmenting Topologies method, which in HyperNEAT evolves the topology and weights (and activation functions) of the CPPN, which in turn encodes the weights of the neural network substrate.
For what problems does HyperNEAT provide an advantage?
HyperNEAT is not necessarily the best choice for all problems. Although HyperNEAT should perform at least as well as other neuroevolution methods (such as NEAT) on most problems, it may not always be worth the effort of setting up a substrate if it will not provide a significant advantage. Therefore, it is useful to know when HyperNEAT provides the greatest advantage. There are several types of problem where you can expect a potential advantage:
What is the right way to set up a CPPN to encode a substrate with hidden nodes?
First, there is no single right way. Instead, there are a couple of main possibilities.
The first possibility is to simply place hidden nodes on the substrate and query connections that connect to them with the CPPN. For example, if the substrate is two-dimensional, then the inputs might be at y=-1 (the bottom of the substrate) and the outputs might be at y=1 (the top of the substrate). Then hidden nodes can be placed anywhere between y=-1 and y=1. For example, there could be a row of hidden nodes at y=0. Then connections from or to nodes at y=0 would connect from or to hidden nodes. If the substrate is three-dimensional (meaning that each node exists at an x,y,z coordinate), then the z dimension might be used to represent hidden nodes. That way, z=-1 might be inputs, z=1 outputs, and perhaps z=0 hidden nodes. In this 3-D setup, z=0 is a plane rather than a single line (as y=0 is in the 2-D case). Of course, hidden nodes can be anywhere and do not have to form a single layer. For example, some could be at z=0 and some could be at z=0.5. The placement is up to the user.
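The 2-D layout described above can be sketched as follows. The helper names make_row and candidate_connections are hypothetical, the node counts are arbitrary, and restricting queries to consecutive rows (a feed-forward net) is an assumption of this sketch rather than a requirement of HyperNEAT.

```python
def make_row(y, n):
    """Place n nodes evenly across x in [-1, 1] at height y."""
    if n == 1:
        return [(0.0, y)]
    return [(-1.0 + 2.0 * i / (n - 1), y) for i in range(n)]


# Inputs at the bottom, one hidden row in the middle, outputs at the top.
inputs = make_row(-1.0, 3)
hidden = make_row(0.0, 5)
outputs = make_row(1.0, 2)


def candidate_connections(layers):
    """List (source, target) coordinate pairs between consecutive rows."""
    pairs = []
    for lower, upper in zip(layers, layers[1:]):
        pairs.extend((s, t) for s in lower for t in upper)
    return pairs


# Each tuple (x1, y1, x2, y2) is one point in the hypercube at which the
# single CPPN weight output would be queried.
queries = [
    (x1, y1, x2, y2)
    for (x1, y1), (x2, y2) in candidate_connections([inputs, hidden, outputs])
]
```

Any query whose y1 or y2 equals 0 involves a hidden node, so the same CPPN output covers both the input-to-hidden and the hidden-to-output connections.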
The second possibility for the CPPN to encode connections with hidden nodes is to represent different layers of the net with different outputs. The checkers substrate in this paper works this way. The idea is that the first output is read for connections from the input layer to the hidden layer, while the second output is read for connections from the hidden layer to the output layer. That way, a separate CPPN output node corresponds to each layer of connections. Of course, even with this method, a decision still has to be made as to how to represent the nodes within each layer. For example, they could be represented in 1-D, 2-D, 3-D, or even in higher dimensionality. This paper compares these options and finds that the choice of dimensionality can make a significant difference. This paper provides new evidence of the benefit of using multiple outputs, especially for multimodal problems.
These two ways of querying substrates with hidden nodes have different implications. In the case wherein different CPPN outputs represent different hidden weights (the second possibility above), there is less likelihood that the geometric pattern of weights in the first layer will be highly related to the pattern in the second layer because the different CPPN outputs can compute their patterns semi-independently. On the other hand, if all connections are assigned weights from a single CPPN output, there is more likelihood that a global pattern can be observed across all weights and layers. Of course, these are only biases and not guarantees. Also, whether geometric correlation across layers is desirable will be domain dependent, so one choice is not clearly better than the other.
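A minimal sketch of the second possibility, in which each layer of connections reads its own CPPN output, might look like the following. The two-output toy_cppn is a hypothetical hand-written function, and the 2x2 sheets and the helper name layer_weights are assumptions for illustration only.

```python
import math


def toy_cppn(x1, y1, x2, y2):
    """Hypothetical two-output CPPN: (input-to-hidden, hidden-to-output)."""
    input_to_hidden = math.cos(math.pi * (x1 - x2))
    hidden_to_output = math.tanh(x1 * x2 + y1 * y2)
    return input_to_hidden, hidden_to_output


def layer_weights(sources, targets, cppn, output_index):
    """Build a weight matrix for one layer, reading only one CPPN output."""
    return [
        [cppn(sx, sy, tx, ty)[output_index] for (tx, ty) in targets]
        for (sx, sy) in sources
    ]


# Each layer is its own 2-D sheet, so coordinates are reused across layers.
sheet = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]
single_output = [(0.0, 0.0)]

w_input_hidden = layer_weights(sheet, sheet, toy_cppn, output_index=0)
w_hidden_output = layer_weights(sheet, single_output, toy_cppn, output_index=1)
```

Since the two outputs are computed by different expressions here, the two weight matrices need not share a pattern, which mirrors the semi-independence described above.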
Note also that a new enhancement to HyperNEAT called evolvable substrate HyperNEAT (ES-HyperNEAT) makes it possible for HyperNEAT to evolve the placement and density of hidden nodes itself, as the next question discusses.
Is there a way to make HyperNEAT decide on the placement and density of hidden neurons in the substrate on its own?
Yes, this subject is an active area of research. It turns out that there is indeed a way for HyperNEAT to decide on the placement and density of hidden neurons without any additional representation beyond the traditional CPPN. This approach is called evolvable substrate HyperNEAT. The main idea is that ES-HyperNEAT searches through the pattern in the hypercube painted by the CPPN to find areas of high information, from which it chooses connections. The nodes that these connections connect are then naturally also chosen. Thus the philosophy is that density should follow information: Where there is more information in the CPPN-encoded pattern, there should be higher density within the substrate to capture it. By following this approach, there is no need for the user to decide anything about hidden node placement or density. To learn more, see An Enhanced Hypercube-Based Encoding for Evolving the Placement, Density and Connectivity of Neurons.
What is the best way to give the neural network substrate a bias?
First, just as with any type of neural network, the bias can be important. If your experiment is not working, it may be because you did not include a bias. Of course, it is important to note that just because the CPPN has a bias does not mean that the substrate has one, and because the substrate is the actual neural network, it will not function like a normal neural network with a bias unless you give it one.
There are two ways to give the substrate a bias:
First option: You can simply place a node on the substrate that will serve as a bias, which means that it will take a constant input (usually 1.0 or 0.5). The CPPN will then connect the bias to the rest of the network just like any other node. Outside of HyperNEAT, this idea would be straightforward. For example, in NEAT, adding a bias is as easy as including one more input. However, in HyperNEAT it is not so simple because the bias must be given a coordinate, and it is often not obvious where in the neural topography it should be. Thus adding a bias in this traditional way may not be natural in HyperNEAT. Similarly, to the extent that real brains have "bias," it is probably not a single node sitting at some arbitrary location in the brain. Therefore, with HyperNEAT, the second option may be preferable.
Second option: A new output can be added to the CPPN that outputs the bias for a particular location in space. It is important to note that this new output is in addition to the usual output that determines weights. So the CPPN will now determine weights from one output, and biases from another. Recall that nodes in the substrate exist at locations. Thus the CPPN can be queried for each location to return a bias. This bias is a constant that is added to the total activation of the node. So it is really analogous to the incoming weight from a bias node being multiplied by a constant term (e.g. 1.0). However, in this case, there is no actual bias node and instead the bias weights are computed separately from other weights.
One problem you may notice with the second option is that it is overspecified. Because the CPPN is used to determine the weights of connections, they have more coordinates than necessary to query node-centric values such as a bias. For example, in four dimensions, a connection exists at (x1,y1,x2,y2), but a node only exists at (x,y). Thus the question is how you ask a CPPN with four inputs for the value of a location that only requires two. The solution is to establish a convention. For example, by convention, we can say that biases are only queried for every (x1,y1,0,0). In this convention (x,y)=(x1,y1) and x2 and y2 are simply set to 0. Geometrically, the set of points (x1,y1,0,0) is a 2-D cross-section of the 4-D hypercube. So we are saying that all the biases lie on that cross-sectional plane. It could be done other ways, but this particular convention has worked in the past.
Recall that the bias output node of the CPPN is different from the weight output. Thus the weight output continues to be queried for all (x1,y1,x2,y2) coordinates in the substrate. Yet the bias output is only queried when x2=0 and y2=0. This way, you will get a pattern of bias values for your substrate.
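The bias convention above can be sketched in a few lines. The two-output toy_cppn is a hypothetical hand-written function, the helper names node_bias and node_activation are invented for this example, and the tanh activation is an assumption rather than something prescribed by HyperNEAT.

```python
import math


def toy_cppn(x1, y1, x2, y2):
    """Hypothetical CPPN with two outputs: (weight, bias)."""
    weight = math.sin(x1 * x2 + y1 * y2)
    bias = math.tanh(x1 - y1 + x2 + y2)
    return weight, bias


def node_bias(cppn, x, y):
    """Convention from the text: query (x, y, 0, 0) and read the bias output."""
    return cppn(x, y, 0.0, 0.0)[1]


def node_activation(cppn, node, incoming):
    """Activate one substrate node.

    incoming is a list of ((x, y), activation) pairs for the source nodes.
    The bias is added to the weighted sum before the activation function.
    """
    total = node_bias(cppn, node[0], node[1])
    for (sx, sy), activation in incoming:
        # Weights are still queried at the full (x1, y1, x2, y2) coordinate.
        total += cppn(sx, sy, node[0], node[1])[0] * activation
    return math.tanh(total)
```

For example, node_activation(toy_cppn, (1.0, 0.0), []) reduces to the activation of the bias alone, since no incoming connections contribute.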
Why is the range of the CPPN output cut off and normalized to a different range for HyperNEAT?
In HyperNEAT it is conventional not to express a connection whose weight magnitude (output by the CPPN) is below some threshold. For example, the threshold might be 0.2, which would mean no connection is expressed when the CPPN output falls within [-0.2..0.2]. For any connection that is above this magnitude (and therefore expressed), its weight is scaled to a range. For example, a reasonable range is [-3..3]. The question is why this cutting and scaling are done.
The general idea is to make it possible for the CPPN to suggest that some connections not be expressed, thereby allowing for arbitrary topologies. For this purpose the threshold magnitude is chosen below which the connection is not expressed (e.g. 0.2). However, that leaves the output ranges [0.2..1] and [-0.2..-1] as expressed weights, because the CPPN only outputs numbers between -1 and 1. So we renormalize the range to [-3..3]. That is, a value of 0.2 or -0.2 would map to 0 and a value of 1 would map to 3 (and -1 to -3). The reason for the number 3 is based on empirical experience. It happens to be a good maximum weight. One reason for that is the particular shape of the sigmoid function, which after a certain point becomes very close to its maximum or minimum and thus no longer sensitive. The weight range is chosen to calibrate well with the range of sensitivity on the sigmoid.
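The cutting and scaling described above amounts to a small piecewise-linear mapping, sketched below with the example values from the text (threshold 0.2, maximum weight 3). The function name express_weight and the choice to return None for an unexpressed connection are assumptions of this sketch.

```python
def express_weight(cppn_output, threshold=0.2, max_weight=3.0):
    """Map a CPPN output in [-1, 1] to a substrate weight, or None if unexpressed.

    Outputs with magnitude below the threshold yield no connection. The
    remaining magnitudes in [threshold, 1] are rescaled linearly onto
    [0, max_weight], and the sign of the CPPN output is preserved.
    """
    magnitude = abs(cppn_output)
    if magnitude < threshold:
        return None  # connection is not expressed
    scaled = (magnitude - threshold) / (1.0 - threshold) * max_weight
    return scaled if cppn_output > 0 else -scaled
```

With these defaults, an output of 1 maps to 3, an output of -1 maps to -3, and an output of exactly 0.2 maps to 0, matching the description above.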
Is there any theoretical work on the indirect encoding of artificial neural networks?
Yes, in fact Juergen Schmidhuber showed that a compressed encoding of the weights in a neural network makes it possible to solve certain types of problems that would otherwise be too difficult without such compression (empirical results are also presented supporting the theoretical analysis):
Where can I learn more about prior work on indirect encodings in general?
A survey of this area is provided in:
HyperNEAT was originally invented at the Evolutionary Complexity Research Group (EPlex) at the University of Central Florida.
Significant research on HyperNEAT and CPPNs is ongoing at a number of research groups around the world:
Cornell Computational Synthesis Laboratory at Cornell University
Jonathan D. Hiller and Hod Lipson, Evolving Amorphous Robots, in Proceedings of the 12th International Conference on Artificial Life (Alife XII), 2010. (pdf)
Clune J, Lipson H, Evolving three-dimensional objects with a generative encoding inspired by developmental biology, in Proceedings of the European Conference on Artificial Life, pages 144-148, 2011. (pdf)
Yosinski J, Clune J, Hidalgo D, Nguyen S, Cristobal Zagal J, Lipson H, Evolving robot gaits in hardware: the HyperNEAT generative encoding vs. parameter optimization, in Proceedings of the European Conference on Artificial Life, pages 890-897, 2011. (pdf)
Lee S, Yosinski J, Glette K, Lipson H, Clune J, Evolving gaits for physical robots with the HyperNEAT generative encoding: the benefits of simulation, in Applications of Evolutionary Computing, Springer, 2013. (pdf) (video)
Computational Intelligence (CI) group at Vrije Universiteit (VU) Amsterdam
Evert Haasdijk, Andrei A. Rusu, and A.E. Eiben, HyperNEAT for Locomotion Control in Modular Robots, in Proceedings of the 9th International Conference on Evolvable Systems (ICES 2010), 2010. (pdf)
Morphology, Evolution & Cognition Laboratory at the University of Vermont
Auerbach, J. E., Bongard, J. C., Evolving CPPNs to Grow Three Dimensional Structures, in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2010. (pdf)
Auerbach, J. E., Bongard, J. C., Dynamic Resolution in the Co-Evolution of Morphology and Control, in 12th International Conference on the Synthesis and Simulation of Living Systems (ALife XII), 2010. (pdf)
University of New South Wales School of Engineering and Computer Science
Oliver Johan Coleman, Evolving Neural Networks for Visual Processing, Undergraduate Honours Thesis (Bachelor of Computer Science), 2010. (pdf)
Computational Intelligence Group at Czech Technical University in Prague
Drchal J., Koutník J., Šnorek M., HyperNEAT Controlled Robots Learn to Drive on Roads in Simulated Environment, to appear in CEC 2009 (pdf)
Digital Evolution Lab at Michigan State University
Clune J, Stanley KO, Pennock RT, and Ofria C.
On the performance of indirect encoding across the continuum of regularity.
IEEE Transactions on Evolutionary Computation, 2011 (to appear). (pdf)
Clune J, Beckmann BE, McKinley PK, and Ofria C.
Investigating whether HyperNEAT produces modular neural networks.
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2010. (pdf)
David B. Knoester, Heather J. Goldsby, and Philip K. McKinley
Neuroevolution of Mobile Ad Hoc Networks.
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2010. (pdf)
Clune J, Beckmann BE, Pennock RT, and Ofria C.
HybrID: A Hybridization of Indirect and Direct Encodings for Evolutionary Computation.
Proceedings of the European Conference on Artificial Life (ECAL), 2009. Budapest, Hungary. (pdf)
Clune J, Pennock RT, and Ofria C.
The sensitivity of HyperNEAT to different geometric representations of a problem.
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), 2009. Montreal, Canada. (pdf)
Clune J, Beckmann BE, Ofria C, and Pennock RT.
Evolving coordinated gaits with the HyperNEAT generative encoding.
Proceedings of the IEEE Congress on Evolutionary Computing Special Section on Evolutionary Robotics, 2009. Trondheim, Norway. (pdf)
Clune J, Ofria C, and Pennock RT.
How a generative encoding fares as problem-regularity decreases.
Proceedings of the 10th International Conference on Parallel Problem Solving From Nature. pp 358-367. Dortmund, Germany. 2008. (pdf)
Institute of Computing Science at Poznan University of Technology
NeuroHunter, a HyperNEAT neural net evolved by this team, won the GECCO'2008 Balanced Diet contest.
Neural Networks Research Group at University of Texas at Austin
Erkin Bahceci and Risto Miikkulainen, Transfer of Evolved Pattern-Based Heuristics in Games, in Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG), 2008. (pdf)