>First, the above-average shorter schemata are the ones discovered early on. The reasoning is similar to that given for the early discovery of less specific conditions in the default hierarchy. A schema must have one of the three letters {1,0,#} at each of its defining positions. Thus if we select a particular set of k defining positions, there are 3^k variants possible. For k = 4 (defining positions) there are 3^4 or 81 distinct schemata to be tested. Even a rather small population can, in a short time, have produced a useful number of trials of all of these alternatives.

>The number of “defining positions” for a schema is at most one more than its length, so short schemata have fewer variants. These variants will be tested rather quickly, and if some are above average they will be quickly exploited, like the early exploitation of general rules in a default hierarchy.
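
The counting argument in the quoted passage can be checked with a minimal sketch. It assumes the passage's own convention that each of the k chosen positions holds one letter from {1,0,#}; the names `ALPHABET` and `variants` are made up for illustration and do not come from the quoted text.

```python
from itertools import product

# Assumption: the three letters named in the quoted passage.
ALPHABET = "10#"

def variants(k):
    """Return every assignment of a letter from {1,0,#} to k chosen positions."""
    return ["".join(letters) for letters in product(ALPHABET, repeat=k)]

# The count grows as 3^k, so a small k leaves few alternatives to test.
for k in range(1, 5):
    print(k, len(variants(k)))
```

For k = 4 the list has 81 entries, matching the figure in the quote. Each extra position triples the count, which is why the short schemata can be sampled adequately by a small population while longer ones cannot.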