**Abstract:**

This paper derives a population sizing relationship for genetic programming (GP). Following the population-sizing derivation for genetic algorithms in Goldberg, Deb, and Clark (1992), it considers building block decision making as a key facet. The analysis yields a GP-unique relationship because it has to account for bloat and for the fact that GP solutions often use subsolutions multiple times. The population-sizing relationship depends upon tree size, solution complexity, problem difficulty and building block expression probability. The relationship is used to analyze and empirically investigate population sizing for three model GP problems named ORDER, ON-OFF and LOUD. These problems exhibit bloat to differing extents and differ in whether their solutions require the use of a building block multiple times.

**Abstract:**

This paper describes a probabilistic model building genetic programming (PMBGP) algorithm based on the extended compact genetic algorithm (eCGA). Unlike traditional genetic programming, which uses fixed recombination operators, the proposed PMBGP adapts linkages. The proposed algorithm, called extended compact genetic programming (eCGP), adaptively identifies and exchanges non-overlapping building blocks by constructing and sampling probabilistic models of promising solutions. The results show that eCGP scales up polynomially with the problem size (the number of functionals and terminals) on both a GP-easy problem and a boundedly difficult GP-hard problem.

**Abstract:**

This paper analyzes building block supply in the initial population for genetic programming. Facetwise models for the supply of a single schema as well as for the supply of all schemas in a partition are developed. An estimate for the population size, given the size (or size distribution) of trees, that ensures the presence of all raw building blocks with a given error is derived using these facetwise models. The facetwise models and the population sizing estimate are verified with empirical results.

- In this paper, we see whether chance discovery in the form of KeyGraphs can be used to reveal deep building blocks to competent genetic algorithms, thereby speeding innovation in particularly difficult problems. On an intellectual level, showing the connection between KeyGraphs and genetic algorithms as related pieces of the innovation puzzle is both scientifically and computationally interesting. GAs represent that aspect of human innovation that tries to innovate through the exchange or cross-fertilization of notions contained in different ideas; the KeyGraph procedure represents that portion of human innovation that pays special attention to and interprets salient fortuitous events. The paper goes beyond mere conjecture and performs pilot studies that show how KeyGraphs and competent GAs can work together to solve the problem of deep building blocks; the work is promising and steps toward a practical computational combine of the two procedures are suggested.