The Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) Users Page


This page is for those seeking information on the use and implementation of the HyperNEAT neuroevolution method, an extension of the NEAT method.  The information herein aims to address common questions about HyperNEAT and to provide knowledge to those who wish to apply or extend the method. 

Please direct inquiries to kstanley@eecs.ucf.edu

[Images: a CPPN-generated substrate example and a checkers substrate example]

Introduction / What is HyperNEAT?

If you haven't heard of HyperNEAT, it is a neuroevolution method, which means it evolves artificial neural networks through an evolutionary algorithm. It extends a prior neuroevolution algorithm called NeuroEvolution of Augmenting Topologies (NEAT), which also has its own NEAT Users Page. The HyperNEAT publications (link at left) offer a complete introduction to the method and its underlying theory of representation. This section briefly explains the general idea behind it.

In short, HyperNEAT is based on a theory of representation that hypothesizes that a good representation for an artificial neural network should be able to describe its pattern of connectivity compactly. This kind of description is called an encoding. The encoding in HyperNEAT, called compositional pattern producing networks (CPPNs), is designed to represent patterns with regularities such as symmetry, repetition, and repetition with variation. (Click here for an example of CPPN-generated patterns.) Thus HyperNEAT is able to evolve neural networks with these properties. The main implication of this capability is that HyperNEAT can efficiently evolve very large neural networks that look more like neural connectivity patterns in the brain (which are repetitious with many regularities, in addition to some irregularities) and that are generally much larger than what prior approaches to neural learning could produce.

The other unique and important facet of HyperNEAT is that it actually sees the geometry of the problem domain. It is strange to consider, but most neuroevolution algorithms (and most neural learning algorithms in general) are completely blind to domain geometry. For example, when a checkers board position is input into an artificial neural network, it has no idea which piece is next to which piece. If it ever comes to understand the board geometry, it must figure it out for itself. In contrast, when humans play checkers, we know right away the geometry of the board; we do not have to infer it from hundreds of examples of gameplay. HyperNEAT has the same capability. It actually sees the geometry of its inputs (and outputs) and can exploit that geometry to significantly enhance learning. To put it more technically, HyperNEAT computes the connectivity of its neural networks as a function of their geometry.
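
To make the idea concrete, here is a minimal Python sketch (not taken from any particular HyperNEAT package) of how connection weights are computed as a function of node geometry: every candidate connection is assigned the value the CPPN returns for the coordinates of its two endpoint nodes. The function example_cppn below is a hypothetical stand-in for an evolved CPPN, and the node layout is an arbitrary example.

    import math

    def example_cppn(x1, y1, x2, y2):
        # Stand-in for an evolved CPPN: symmetric in x, so the resulting
        # connectivity pattern is left/right symmetric, like a game board.
        return math.tanh(math.sin(3.0 * abs(x1 - x2)) + math.cos(3.0 * (y2 - y1)))

    # A row of three inputs at y = -1 and a row of three outputs at y = 1.
    inputs  = [(-1.0, -1.0), (0.0, -1.0), (1.0, -1.0)]
    outputs = [(-1.0,  1.0), (0.0,  1.0), (1.0,  1.0)]

    # The connectivity is literally a function of the nodes' coordinates.
    weights = {}
    for (x1, y1) in inputs:
        for (x2, y2) in outputs:
            weights[((x1, y1), (x2, y2))] = example_cppn(x1, y1, x2, y2)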

One implication of HyperNEAT's ability to exploit geometry is that it gives the user a completely new kind of influence over neural network learning. The user can now describe the geometry of the domain to HyperNEAT, which means there is room to be creative. If someone believes that a domain can be described best in a different geometry, it can be tested with HyperNEAT. Thus HyperNEAT opens up a new kind of research direction for artificial neural networks. This geometric layout is called a substrate, which is depicted in the images above.

Thus one way to express what HyperNEAT does is to say it evolves the connectivity pattern for a neural network with a particular substrate geometry.


Software Packages

See the HyperNEAT Software Catalog for a full selection of packages.

If you are interested in sharing your own version of HyperNEAT, we are happy to link to it from the catalog. Please email kstanley@eecs.ucf.edu with information on your implementation. Note that it is possible to build HyperNEAT from an existing NEAT package, so you may want to start with an existing NEAT package rather than a HyperNEAT package.

The question for many people first coming to HyperNEAT is: which package is right for me?

The experiments included in various packages at The HyperNEAT Software Catalog are different. HyperSharpNEAT provides a multi-agent predator-prey experiment, while HyperNEAT C++ includes a visual discrimination task (the "boxes" task) and a checkers experiment. Colin Green's version of HyperNEAT also includes the boxes domain, but unlike Jason Gauci's it is written in C#. Phillip Verbancsics' version includes Keepaway and implements a Bird's Eye View (BEV) substrate. Oliver Coleman's Java package offers experiments focusing on full visual fields. If you are planning a new experiment, it may be helpful to look at the code for similar experiments.

Your best option will be based on some combination of the above considerations. Of course, if you want HyperNEAT for platform X or language Y and such an implementation is not available, you may want to write your own version of HyperNEAT.


NEAT

HyperNEAT extends the NEAT method. (NEAT evolves the CPPNs that generate networks in HyperNEAT.) Much information is available on NEAT and many implementations are supported. NEAT stands for NeuroEvolution of Augmenting Topologies. It is a method for evolving artificial neural networks with an evolutionary algorithm. NEAT implements the idea that it is most effective to start evolution with small, simple networks and allow them to become increasingly complex over generations. That way, just as organisms in nature have increased in complexity since the first cell, so do the neural networks in NEAT. This process of continual elaboration allows NEAT to discover highly sophisticated and complex neural networks.

For more information about NEAT and NEAT software, please visit the NEAT Users Page.


NEAT and HyperNEAT Users Group

Derek James created a NEAT Users Group on Yahoo! to encourage the discussion of ideas, questions, and variations of NEAT. The community of HyperNEAT users and those interested in HyperNEAT can benefit greatly from the availability of this forum. Please feel free to join the discussion! 


HyperNEAT Software FAQ

This space is reserved for frequently asked questions about software implementation.

What HyperNEAT software and source is available?

Please see the software section.

HyperNEAT Methodology FAQ

This space is reserved for frequently asked questions about HyperNEAT methodology and the theory behind HyperNEAT.

Why is it called HyperNEAT?

The "Hyper" in "HyperNEAT" comes from the word "Hypercube." The complete name of the approach, which is a mouthful, is "Hypercube-based NeuroEvolution of Augmenting Topologies." The reason the word "Hypercube" describes the approach is because a CPPN that describes a connectivity pattern is at least four-dimensional (i.e. from taking x1,y1,x2, and y2). In some cases it may be more than four-dimensional, such as when the encoded connectivity pattern is in three dimensions; in that case the CPPN is six-dimensional (i.e. from inputs x1,y1,z1,x2,y2, and z2). These multi-dimensional spaces are usually sampled within the bounds of a hypercube that begins at -1 and ends 1 on each dimension. Each point within the hypercube represents a connection weight. Thus HyperNEAT really is in effect painting a pattern on the inside of a hypercube. That pattern is then interpreted as the connectivity pattern of a neural network. The rest of the name - NEAT - comes from the NeuroEvolution of Augmenting Topologies method that in HyperNEAT evolves the topology and weights (and activation functions) of the CPPN, which in turn encodes the weights of the neural network substrate.

For what problems does HyperNEAT provide an advantage?

HyperNEAT is not necessarily the best choice for all problems. Although HyperNEAT should perform at least as well as other neuroevolution methods (such as NEAT) on most problems, it may not always be worth the effort of setting up a substrate if it will not provide a significant advantage. Therefore, it is useful to know when HyperNEAT provides the greatest advantage. There are several types of problem where you can expect a potential advantage:

What is the right way to set up a CPPN to encode a substrate with hidden nodes?

First, there is no single right way. Instead, there are a couple of main possibilities.

The first possibility is to simply place hidden nodes on the substrate and query the CPPN for the connections to and from them. For example, if the substrate is two-dimensional, then the inputs might be at y=-1 (the bottom of the substrate) and the outputs might be at y=1 (the top of the substrate). Then hidden nodes can be placed anywhere between y=-1 and y=1. For example, there could be a row of hidden nodes at y=0. Then connections from or to nodes at y=0 would connect from or to hidden nodes. If the substrate is three-dimensional (meaning that each node exists at an x,y,z coordinate), then the z dimension might be used to represent hidden nodes. That way, z=-1 might be inputs, z=1 outputs, and perhaps z=0 hidden nodes. In this 3-D setup, z=0 is a plane rather than a single line (as y=0 is in the 2-D case). Of course, hidden nodes can be anywhere and do not have to form a single layer. For example, some could be at z=0 and some could be at z=0.5. The placement is up to the user.
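
As a rough illustration of this first possibility, the Python sketch below lays out input, hidden, and output rows at y = -1, 0, and 1 and queries every input-to-hidden and hidden-to-output connection with a single CPPN output. The function query_cppn and the node spacing are hypothetical choices made only for the example.

    import math

    def query_cppn(x1, y1, x2, y2):
        # Hypothetical stand-in for an evolved single-output CPPN.
        return math.tanh(math.cos(2.0 * x1 * x2) - (y2 - y1))

    input_layer  = [(x / 2.0, -1.0) for x in range(-2, 3)]  # row at y = -1
    hidden_layer = [(x / 2.0,  0.0) for x in range(-2, 3)]  # row at y =  0
    output_layer = [(x / 2.0,  1.0) for x in range(-2, 3)]  # row at y =  1

    # Hidden nodes are queried exactly like any other nodes on the substrate.
    connections = []
    for src, dst in [(input_layer, hidden_layer), (hidden_layer, output_layer)]:
        for (x1, y1) in src:
            for (x2, y2) in dst:
                connections.append(((x1, y1), (x2, y2), query_cppn(x1, y1, x2, y2)))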

The second possibility for the CPPN to encode connections with hidden nodes is to represent different layers of the net with different CPPN outputs. The checkers substrate in this paper works this way. The idea is that the first output is read for connections from the input layer to the hidden layer, while the second output is read for connections from the hidden layer to the output layer. That way, a separate CPPN output node corresponds to each layer of connections. Of course, even with this method, a decision still has to be made as to how to represent the nodes within each layer. For example, they could be represented in 1-D, 2-D, 3-D, or even in higher dimensionality. This paper compares these options and finds that the choice of dimensionality can make a significant difference. This paper provides new evidence of the benefit of using multiple outputs, especially for multimodal problems.
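
A hedged sketch of this second possibility follows. The two-output CPPN two_output_cppn is a hypothetical stand-in for an evolved CPPN: it returns one value per layer-to-layer mapping, and the caller selects which output to read depending on the source layer.

    import math

    def two_output_cppn(x1, y1, x2, y2):
        # Hypothetical evolved CPPN with two outputs, one per connection layer.
        out_in_to_hid  = math.tanh(math.sin(3.0 * x1) * x2)   # read for input -> hidden
        out_hid_to_out = math.tanh(math.cos(3.0 * x2) - x1)   # read for hidden -> output
        return out_in_to_hid, out_hid_to_out

    def weight(src_layer_index, x1, y1, x2, y2):
        # src_layer_index 0: input -> hidden, 1: hidden -> output
        outputs = two_output_cppn(x1, y1, x2, y2)
        return outputs[src_layer_index]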

These two ways of querying substrates with hidden nodes have different implications. In the case wherein different CPPN outputs represent the weights of different layers (the second possibility above), it is less likely that the geometric pattern of weights in the first layer will be highly related to the pattern in the second layer, because the different CPPN outputs can compute their patterns semi-independently. On the other hand, if all connections are assigned weights from a single CPPN output, it is more likely that a global pattern can be observed across all weights and layers. Of course, these are only biases and not guarantees. Also, whether geometric correlation across layers is desirable is domain-dependent, so one choice is not clearly better than the other.

Note also that a new enhancement to HyperNEAT called evolvable-substrate HyperNEAT (ES-HyperNEAT) makes it possible for HyperNEAT to evolve the placement and density of hidden nodes itself, as the next question discusses.

Is there a way to make HyperNEAT decide on the placement and density of hidden neurons in the substrate on its own?

Yes, this subject is an active area of research. It turns out that there is indeed a way for HyperNEAT to decide on the placement and density of hidden neurons without any additional representation beyond the traditional CPPN. This approach is called evolvable-substrate HyperNEAT (ES-HyperNEAT). The main idea is that ES-HyperNEAT searches through the pattern in the hypercube painted by the CPPN to find areas of high information, from which it chooses connections. The nodes that these connections connect are then naturally also chosen. Thus the philosophy is that density should follow information: where there is more information in the CPPN-encoded pattern, there should be higher density within the substrate to capture it. By following this approach, there is no need for the user to decide anything about hidden node placement or density. To learn more, see An Enhanced Hypercube-Based Encoding for Evolving the Placement, Density and Connectivity of Neurons.
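
The Python sketch below is a greatly simplified illustration of the density-follows-information principle only; it is not the published ES-HyperNEAT algorithm, which operates on connections using a quadtree together with banding and pruning steps. Here a 2-D region of the substrate is recursively subdivided wherever the variance of a (hypothetical) CPPN pattern is high, so more candidate hidden nodes appear where there is more information.

    import math, statistics

    def cppn_value(x, y):
        # Hypothetical stand-in for the CPPN-painted pattern at (x, y).
        return math.sin(5.0 * x) * math.tanh(3.0 * y)

    def candidate_nodes(x, y, size, depth, variance_threshold=0.03, max_depth=4):
        # Sample the four quadrant centers of the current square region.
        half = size / 2.0
        quads = [(x - half / 2, y - half / 2), (x + half / 2, y - half / 2),
                 (x - half / 2, y + half / 2), (x + half / 2, y + half / 2)]
        values = [cppn_value(qx, qy) for qx, qy in quads]
        if depth < max_depth and statistics.pvariance(values) > variance_threshold:
            nodes = []
            for qx, qy in quads:  # high variance: refine this region further
                nodes += candidate_nodes(qx, qy, half, depth + 1,
                                         variance_threshold, max_depth)
            return nodes
        return [(x, y)]  # low variance (or max depth reached): one node suffices

    hidden_nodes = candidate_nodes(0.0, 0.0, 2.0, depth=0)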

What is the best way to give the neural network substrate a bias?

First, just as with any type of neural network, the bias can be important. If your experiment is not working, it may be because you did not include a bias. Note, however, that just because the CPPN has a bias does not mean that the substrate has one; since the substrate is the actual neural network, it will not function like a normal neural network with a bias unless you give it one.

There are two ways to give the substrate a bias:

First option: You can simply place a node on the substrate that will serve as a bias, which means that it will take a constant input (usually 1.0 or 0.5). The CPPN will then connect the bias to the rest of the network just like any other node. Outside of HyperNEAT, this idea would be straightforward. For example, in NEAT, adding a bias is as easy as including one more input. However, in HyperNEAT it is not so simple because the bias must be given a coordinate, and it is often not obvious where in the neural topography it should be. Thus adding a bias in this traditional way may not be natural in HyperNEAT. Similarly, to the extent that real brains have "bias," it is probably not a single node sitting at some arbitrary location in the brain. Therefore, with HyperNEAT, the second option may be preferable.

Second option: A new output can be added to the CPPN that outputs the bias for a particular location in space. It is important to note that this new output is in addition to the usual output that determines weights. So the CPPN will now determine weights from one output, and biases from another. Recall that nodes in the substrate exist at locations. Thus the CPPN can be queried for each location to return a bias. This bias is a constant that is added to the total activation of the node. So it is really analogous to the incoming weight from a bias node being multiplied by a constant term (e.g. 1.0). However, in this case, there is no actual bias node and instead the bias weights are computed separately from other weights.

One problem you may notice with the second option is that it is overspecified. Because the CPPN is used to determine the weights of connections, its inputs provide more coordinates than are necessary to query node-centric values such as a bias. For example, in four dimensions, a connection exists at (x1,y1,x2,y2), but a node only exists at (x,y). Thus the question is how you ask a CPPN with four inputs for the value of a location that only requires two. The solution is to establish a convention. For example, by convention, we can say that biases are only queried at (x1,y1,0,0). In this convention (x,y)=(x1,y1) and x2 and y2 are simply set to 0. Geometrically, the set of points (x1,y1,0,0) is a 2-D cross-section of the 4-D hypercube. So we are saying that all the biases lie on that cross-sectional plane. It could be done other ways, but this particular convention has worked in the past.

Recall that the bias output node of the CPPN is different from the weight output. Thus the weight output continues to be queried for all (x1,y1,x2,y2) coordinates in the substrate. Yet the bias output is only queried when x2=0 and y2=0. This way, you will get a pattern of bias values for your substrate.
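
A minimal sketch of this second option is shown below, assuming a hypothetical two-output CPPN named cppn_weight_and_bias: the weight output is queried at full (x1,y1,x2,y2) coordinates, while the bias output is queried only on the (x1,y1,0,0) cross-section.

    import math

    def cppn_weight_and_bias(x1, y1, x2, y2):
        # Hypothetical evolved CPPN with a weight output and a bias output.
        weight_out = math.tanh(math.sin(2.0 * x1 * x2) + (y2 - y1))
        bias_out   = math.tanh(0.5 * math.cos(3.0 * x1) * y1)
        return weight_out, bias_out

    def connection_weight(x1, y1, x2, y2):
        return cppn_weight_and_bias(x1, y1, x2, y2)[0]

    def node_bias(x, y):
        # Convention: biases live on the (x1, y1, 0, 0) cross-section.
        return cppn_weight_and_bias(x, y, 0.0, 0.0)[1]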

Why is the range of the CPPN output cut off and normalized to a different range for HyperNEAT?

In HyperNEAT it is conventional not to express a connection whose weight magnitude (output by the CPPN) is below some threshold. For example, the threshold might be 0.2, which would mean no connection is expressed with a weight between [-0.2..0.2]. For any connection that is above this magnitude (and therefore expressed), its weight is scaled to a range. For example, a reasonable range is [-3..3]. The question is why this cutting and scaling are done.

The general idea is to make it possible for the CPPN to suggest that some connections not be expressed, thereby allowing for arbitrary topologies. For this purpose the threshold magnitude is chosen below which the connection is not expressed (e.g. 0.2). However, that leaves the output ranges [0.2..1] and [-0.2..-1] as expressed weights, because the CPPN only outputs numbers between -1 and 1. So we renormalize the range to [-3..3]. That is, a value of 0.2 or -0.2 would map to 0 and a value of 1 would map to 3 (and -1 to -3). The reason for the number 3 is based on empirical experience. It happens to be a good maximum weight. One reason for that is the particular shape of the sigmoid function, which after a certain point becomes very close to its maximum or minimum and thus no longer sensitive. The weight range is chosen to calibrate well with the range of sensitivity on the sigmoid.
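
The sketch below illustrates this conventional cut-and-rescale step using the example values from this answer (threshold 0.2, maximum weight 3.0); other implementations may use different constants or scaling conventions.

    def express_weight(cppn_output, threshold=0.2, max_weight=3.0):
        if abs(cppn_output) < threshold:
            return None  # connection is not expressed at all
        # Rescale [threshold, 1] to [0, max_weight], preserving sign:
        # a magnitude of 0.2 maps to 0 and a magnitude of 1.0 maps to 3.0.
        sign = 1.0 if cppn_output > 0 else -1.0
        return sign * (abs(cppn_output) - threshold) / (1.0 - threshold) * max_weight

    print(express_weight(0.1))   # None (below threshold, not expressed)
    print(express_weight(0.2))   # 0.0
    print(express_weight(1.0))   # 3.0
    print(express_weight(-1.0))  # -3.0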

Is there any theoretical work on the indirect encoding of artificial neural networks?

Yes, in fact Juergen Schmidhuber showed that a compressed encoding of the weights in a neural network makes it possible to solve certain types of problems that would be too difficult without such compression (empirical results supporting the theoretical analysis are also presented):
J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5): 857-873, 1997

Where can I learn more about prior work on indirect encodings in general?

A survey of this area is provided in:
Kenneth O. Stanley and Risto Miikkulainen. A Taxonomy for Artificial Embryogeny. Artificial Life 9(2): 93-130, 2003

HyperNEAT Publications from EPlex

HyperNEAT was originally invented at the Evolutionary Complexity Research Group (EPlex) at the University of Central Florida.

The following two papers are good introductions:

All HyperNEAT-related publications from EPlex are listed below:


Enhancing ES-HyperNEAT to Evolve More Complex Regular Neural Networks

Sebastian Risi and Kenneth O. Stanley

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2011). New York, NY: ACM, 2011 (8 pages)


Constraining Connectivity to Encourage Modularity in HyperNEAT

Phillip Verbancsics and Kenneth O. Stanley

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2011). New York, NY: ACM, 2011 (8 pages)


Indirect Encoding of Neural Networks for Scalable Go

Jason Gauci and Kenneth O. Stanley

In: Proceedings of the 11th International Conference on Parallel Problem Solving From Nature (PPSN-2010). New York, NY: Springer, 2010 (10 pages)


Evolving a Single Scalable Controller for an Octopus Arm with a Variable Number of Segments

Brian G. Woolley and Kenneth O. Stanley

In: Proceedings of the 11th International Conference on Parallel Problem Solving From Nature (PPSN-2010). New York, NY: Springer, 2010 (10 pages)


Evolving Static Representations for Task Transfer

Phillip Verbancsics and Kenneth O. Stanley

In: Journal of Machine Learning Research 11: pages 1737-1769. Brookline, MA: Microtome Publishing, 2010 (33 pages)


Indirectly Encoding Neural Plasticity as a Pattern of Local Rules

Sebastian Risi and Kenneth O. Stanley

In: Proceedings of the 11th International Conference on Simulation of Adaptive Behavior (SAB 2010). New York, NY: Springer, 2010 (11 pages)


Evolving the Placement and Density of Neurons in the HyperNEAT Substrate

Sebastian Risi, Joel Lehman, and Kenneth O. Stanley

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2010). New York, NY: ACM, 2010 (8 pages)


Transfer Learning through Indirect Encoding

Phillip Verbancsics and Kenneth O. Stanley

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2010). New York, NY: ACM, 2010 (8 pages)


Autonomous Evolution of Topographic Regularities in Artificial Neural Networks

Jason Gauci and Kenneth O. Stanley

To Appear in: Neural Computation journal. Cambridge, MA: MIT Press, 2010 (Manuscript 38 pages)


Evolving Policy Geometry for Scalable Multiagent Learning

David B. D'Ambrosio, Joel Lehman, Sebastian Risi, and Kenneth O. Stanley

To Appear in: Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2010). 2010 (8 pages)


A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks

Kenneth O. Stanley, David B. D'Ambrosio, and Jason Gauci

In: Artificial Life journal. Cambridge, MA: MIT Press, 2009 (Manuscript 39 pages)


Generative Encoding for Multiagent Learning

David B. D'Ambrosio and Kenneth O. Stanley

Note: This paper is accompanied by a set of videos at http://eplex.cs.ucf.edu/multiagenthyperneat

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2008). New York, NY: ACM, 2008 (8 pages)


A Case Study on the Critical Role of Geometric Regularity in Machine Learning

Jason Gauci and Kenneth O. Stanley

Note: This paper is accompanied by version 2.0 of the HyperNEAT software found here.

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI-2008). Menlo Park, CA: AAAI Press, 2008 (6 pages)


Generating Large-Scale Neural Networks Through Discovering Geometric Regularities

Jason J. Gauci and Kenneth O. Stanley

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2007). New York, NY: ACM, 2007 (8 pages)


A Novel Generative Encoding for Exploiting Neural Network Sensor and Output Geometry

David B. D'Ambrosio and Kenneth O. Stanley

In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2007). New York, NY: ACM, 2007 (8 pages)


Compositional Pattern Producing Networks: A Novel Abstraction of Development

Kenneth O. Stanley

In: Genetic Programming and Evolvable Machines Special Issue on Developmental Systems 8(2): 131-162. New York, NY: Springer, 2007 (36 pages)

Springer link to article in publication format (requires subscription to Springer): http://www.springerlink.com/content/804411v3703ph210


Exploiting Regularity Without Development

Kenneth O. Stanley

In: Proceedings of the AAAI Fall Symposium on Developmental Systems. Menlo Park, CA: AAAI Press, 2006 (8 pages)


Comparing Artificial Phenotypes with Natural Biological Patterns

Kenneth O. Stanley

Proceedings of the Genetic and Evolutionary Computation Conference (GECCO) Workshop Program. New York, NY: ACM Press, 2006 (2 pages)


HyperNEAT Publications and Projects from Outside EPlex

Significant research on HyperNEAT and CPPNs is ongoing at a number of research groups around the world:


Updates:
4/17/09: First public version of page completed.
11/10/09: Links to outside papers from CIG Group in Prague updated.
11/27/09: Added link to Schmidhuber's work in Methodology FAQ section.
11/28/09: Added link to "A Taxonomy for Artificial Embryogeny" in the Methodology FAQ section.
12/17/09: Added link to new HyperNEAT journal paper in Neural Computation journal by Gauci and Stanley.
1/8/10: Added answer to question on adding a bias to the substrate in the HyperNEAT Methodology FAQ.
1/17/10: Added answer to question on weight range cutting and normalizing in HyperNEAT, in the HyperNEAT Methodology FAQ.
1/23/10: Added answer to question on including hidden nodes in the substrate in the HyperNEAT Methodology FAQ.
1/27/10: Added link to D'Ambrosio AAMAS 2010 paper.
2/7/10: Added link to Clune paper inside FAQ answer.
3/21/10: Added Verbancsics and Stanley 2010 GECCO paper; added pointer to HyperNEAT introductory papers at front of EPlex papers section.
4/16/10: Added Risi and Stanley 2010 SAB paper.
4/18/10: Improved FAQ answer on option 2 for adding bias to substrate.
4/30/10: Added outside publications from Michigan State and Vermont.
6/4/10: Added outside publications from Vermont and Vrije (Amsterdam).
6/7/10: Added Verbancsics and Stanley 2010 JMLR paper.
6/19/10: Added links to Woolley and Stanley; and Gauci and Stanley 2010 PPSN papers.
6/21/10: Links to Colin Green's new HyperNEAT implementation added.
7/17/10: Added link to Knoester, Goldsby, and McKinley paper from Michigan State University.
8/19/10: Added new question to FAQ: "For what problems does HyperNEAT provide an advantage?"
9/8/10: Added link to new HyperSharpNEAT-based multiagent simulator.
12/3/10: Added new AHNI software package (by Oliver Coleman) to software section.
12/4/10: Added outside publication from Oliver Coleman at New South Wales.
12/19/10: Added outside publication from Jeff Clune et al. (including Ken Stanley) at Michigan State.
1/21/11: Added outside publication from Hiller and Lipson at Cornell University.
5/30/11: Added Risi and Stanley & Verbancsics and Stanley GECCO publications from 2011.
5/30/11: Added question to FAQ on evolving the placement and density of hidden nodes (i.e. ES-HyperNEAT).
9/19/12: Added link to Peter Chervenski's MultiNEAT software package.
10/2/12: Updated link to Oliver Coleman's AHNI software package.
4/6/13: Added links to recent publications from Clune while at Cornell.
9/30/13: Updated links in the FAQ.
5/6/15: Replaced list of software packages with link to the HyperNEAT Software Catalog.