## Description of procedure for deriving the Decomposition table

(Brien and Bailey, 2009)

The following diagram illustrates the procedure for deriving the decomposition table. The algorithm for producing a table without expected mean squares has been implemented in the AMTIER procedure (Brien and Payne, 2006). A description of the procedure follows the diagram, or you can go to the description for a particular rectangle by clicking on it.

**Sets of objects and observational unit**

Firstly, the sets of objects involved in the randomizations in the experiment are identified. Then the set of objects that are the observational units is identified. Federer (1975) defines the observational unit to be 'the smallest unit on which an observation is made'. An advantage of using the observational unit rather than the experimental unit is that, for each response variable, there is only one type of observational unit in an experiment, whereas it is clear from Federer (1975) that there might be several different types of experimental unit. Thus, it should be easier to identify the observational unit.

**Tiers**

The crucial feature of the procedure is that the factors are divided into sets, or tiers, as described by Brien (1983, 1989 and 1992) and Brien and Bailey (2006), according to their status in the randomization that was performed in designing the experiment. For multitiered experiments there will be at least three tiers. It is vital for this that all factors involved in the experiment are identified.

**Intratier formulae**

One then uses each tier to form an intratier formula by determining the nesting and crossing relationships between the factors in the tier. It may also be necessary to include pseudofactors in some formulae and to indicate that some factors are independent of others. The notation we use in the formulae is that of Wilkinson and Rogers (1973) and Genstat (Payne et al., 1993). A*B indicates that the factors A and B are crossed, A/B indicates that the factor B is nested within the factor A, A+B indicates that the factors are independent and A//B indicates that B is a pseudofactor to A. The randomization diagrams are useful in formulating these.

**Analysis formulae**

These are obtained from the intratier formulae by considering, for each one, whether crossing or nesting relationships between the factors in the current intratier formula and those in other formulae are appropriate.

Often, but not always, there is a one-to-one correspondence between the analysis formulae and the tiers. There is not when some factors occur in more than one structure formula, because factors can occur in only one tier. Also, sometimes there are fewer and sometimes more structure formulae than tiers.

**Decomposition table**

Now form the decomposition table by going around the loop shown in the above figure. The process begins with the analysis formula involving only factors whose levels are intrinsically associated with the observational units; the terms involving these factors make up the initial decomposition table. This table is extended by incorporating the terms from the second structure formula into it as described below. The extended decomposition table is further extended by incorporating the terms from each of the other structure formulae until the terms from all structure formulae have been incorporated into the decomposition table.

For each analysis formula in turn, a circuit of the loop to extend the decomposition table proceeds as follows:

**Derive the terms and sources from the current formula**

Expand the structure formula using rules such as those given in Wilkinson and Rogers (1973) or Heiberger (1989); Monod and Bailey (1992) give details on the handling of pseudofactors. If the factors A and B are crossed (A*B in a formula), these rules lead to the terms A, B and A^B being included in the analysis, where A^B represents the generalized factor formed from the factors A and B. If factor B is nested within factor A (A/B in a formula), the standard rules lead to the terms A and A^B.

More generally, for formulae L and M:

L / M = L + gf(L)^M

L * M = L + M + L^M

where gf(L) is the generalized factor formed from the factors in L and L^M is the sum of products of all pairs of terms in L and M.
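The two general rules can be written as a few lines of code. The following is a minimal Python sketch, not the AMTIER implementation: the representation of a term as a tuple of factor names and the helper names (`factor`, `products`, `gf`, `cross`, `nest`, `show`) are assumptions made for illustration, and pseudofactors are not handled.

```python
from itertools import product

# Assumed representation: a term is a tuple of factor names,
# and a formula is a list of terms.

def _merge(*terms):
    """Generalized factor of several terms: their factors, without repeats."""
    seen = []
    for term in terms:
        for name in term:
            if name not in seen:
                seen.append(name)
    return tuple(seen)

def _unique(terms):
    """Drop repeated terms, ignoring the order of factors within a term."""
    keys, out = set(), []
    for term in terms:
        key = frozenset(term)
        if key not in keys:
            keys.add(key)
            out.append(term)
    return out

def factor(name):
    """Formula consisting of a single factor."""
    return [(name,)]

def products(L, M):
    """L^M: products of all pairs of terms, one from L and one from M."""
    return _unique([_merge(l, m) for l, m in product(L, M)])

def gf(L):
    """gf(L): generalized factor formed from all the factors in L."""
    return _merge(*L)

def cross(L, M):
    """L * M = L + M + L^M"""
    return _unique(L + M + products(L, M))

def nest(L, M):
    """L / M = L + gf(L)^M"""
    return _unique(L + products([gf(L)], M))

def show(formula):
    """Write a formula as a sum of terms, e.g. 'A + B + A^B'."""
    return " + ".join("^".join(term) for term in formula)

A, B = factor("A"), factor("B")
print(show(cross(A, B)))  # A + B + A^B
print(show(nest(A, B)))   # A + A^B
```

Because `cross` and `nest` both take and return formulae, they can be composed to expand more complicated formulae one operator at a time.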

As an example of using the rules for a more complicated formula we expand (A*B)/(C*D):

(A*B)/(C*D) = (A*B) + A^B^(C*D)

= (A + B + A^B) + A^B^(C + D + C^D)

= A + B + A^B + A^B^C + A^B^D + A^B^C^D

The source for each term is derived as follows:

- Form the generalized factor from those factors in the term that nest at least one of the other factors in the term.
- List all the factors that are not in the generalized factor of the nesting factors, each separated by ‘#’. Then add to the end of the list the generalized factor of the nesting factors, placing it between square brackets.

For the example, this gives the sources:

A + B + A#B + C[A^B] + D[A^B] + C#D[A^B]
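The two steps for deriving a source from a term can be sketched in Python. This is an illustrative sketch only: the function name `source_label` and the `nests` dictionary, which records for each factor the factors nested within it, are assumptions and not part of the published procedure.

```python
def source_label(term, nests):
    """Derive the source for a term.

    term  : tuple of factor names, e.g. ("A", "B", "C")
    nests : dict mapping a factor to the set of factors nested within it
            (hypothetical input format)
    """
    # Step 1: factors in the term that nest at least one other factor in the term.
    nesting = [f for f in term
               if any(g in nests.get(f, set()) for g in term if g != f)]
    # Step 2: the remaining factors, separated by '#', followed by the
    # generalized factor of the nesting factors in square brackets.
    others = [f for f in term if f not in nesting]
    label = "#".join(others)
    if nesting:
        label += "[" + "^".join(nesting) + "]"
    return label

# Terms from the expansion of (A*B)/(C*D); there, A and B each nest C and D.
terms = [("A",), ("B",), ("A", "B"),
         ("A", "B", "C"), ("A", "B", "D"), ("A", "B", "C", "D")]
nests = {"A": {"C", "D"}, "B": {"C", "D"}}
sources = [source_label(t, nests) for t in terms]
print(" + ".join(sources))
```

Under these assumptions the sketch returns the six sources A, B, A#B, C[A^B], D[A^B] and C#D[A^B].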

In this set of terms, the term C#D[A^B] stands for the interaction between C and D nested within each combination of the levels of A and B; that is, [A^B] represents the combinations of A and B.

**Incorporate current sources and their degrees of freedom into the decomposition table**

Add a major column to the decomposition table consisting of columns for the sources, degrees of freedom and, for nonorthogonal experiments, efficiency factors for the current analysis formula. If the current formula is the first formula, which contains only unrandomized factors, the column will consist of a row for each source from that formula. When incorporating sources from other than the first formula, place them in the new major column alongside the sources already in the decomposition table with which they are confounded. This amounts to determining the experimental units for the generalized factor corresponding to a source. All sources from the same formula that are confounded with a particular term will be listed one under the other, with the row for the term with which they are confounded expanded to fit them. Also, if there are Residual degrees of freedom, a Residual source will need to be added under the list of terms from the current formula. The number of Residual degrees of freedom is equal to the difference between those of the original source and the sum of the degrees of freedom of the sources incorporated under it.
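The Residual degrees of freedom rule is simple arithmetic, sketched below in Python. The function name `residual_df` and the design used to exercise it are hypothetical: a randomized complete block design with 4 blocks of 5 plots and 5 treatments is assumed, so that the source Plots[Blocks] has 4 × (5 − 1) = 16 degrees of freedom and Treatments, confounded with it, has 4.

```python
def residual_df(original_df, incorporated_dfs):
    """Residual df = df of the original source minus the sum of the df
    of the sources incorporated under it."""
    residual = original_df - sum(incorporated_dfs)
    if residual < 0:
        raise ValueError("incorporated sources exceed the original source's df")
    return residual

# Hypothetical example: Treatments (4 df) confounded with Plots[Blocks] (16 df).
plots_within_blocks_df = 4 * (5 - 1)
treatments_df = 5 - 1
print(residual_df(plots_within_blocks_df, [treatments_df]))  # 12
```

When the function returns zero, no Residual source is needed under that original source.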

Model sources that arise in two consecutive formulae will not have a line entered for the formula incorporated last. When two model sources are totally aliased, such as can occur with fractional factorial experiments, one will be omitted from the analysis and a note of it made under the table.

**Categorize terms as fixed or random**

- One possible categorization of the terms is that all are classified as random, except those terms that have only ever been randomized. This would lead to an analysis that is equivalent to a randomization analysis.
- Another possibility is that each factor could be categorized as fixed or random. Then a term is fixed provided that it involves only fixed factors, or random if it involves a random factor.
- Otherwise, one could independently categorize each term as fixed or random. In the end, fixed terms are ones that allow for arbitrary differences between the effects, whereas random terms require that the effects conform to a probability distribution, usually normal.
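The second option, classifying terms from a classification of the factors, can be sketched as follows. The function name `classify_term` and the factor names Blocks, Plots and Treatments are hypothetical and used only for illustration.

```python
def classify_term(term, random_factors):
    """A term is random if it involves any random factor;
    it is fixed only if all of its factors are fixed."""
    return "random" if any(f in random_factors for f in term) else "fixed"

# Hypothetical classification of factors: Blocks and Plots random, Treatments fixed.
random_factors = {"Blocks", "Plots"}
terms = [("Blocks",), ("Blocks", "Plots"), ("Treatments",), ("Blocks", "Treatments")]
classification = {"^".join(t): classify_term(t, random_factors) for t in terms}
for name, status in classification.items():
    print(name, status)
```

The third option needs no rule at all: the dictionary of term classifications would simply be supplied directly instead of being computed from the factors.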

**Derive the expected mean squares and add to table**

For orthogonal experiments, for each random term derived from a formula there will be a stratum variance component (‘ξ’) that is equal to a linear combination of the variance components for the terms derived from the formula. (We use ‘φ’ rather than ‘σ²’ to avoid the use of both a subscript and a superscript.) For a particular row in the table, the expected mean square is a linear combination of the ξs for the terms corresponding to the sources in the row; the coefficient of each ξ is the number of replicates of each object amongst the observational units. The variance components making up a stratum variance component are all those corresponding to terms that are marginal to the term (i.e. a subset of the factors in the term) corresponding to the stratum variance component. The coefficient of a variance component in this linear combination is the replication of the generalized factor amongst the set of objects indexed by the factors in the generalized factor.

For each source corresponding to a fixed term, a *q*-function is included in the expected mean squares; a *q*-function is a quadratic form for a source that is the same quadratic form as for the sum of squares for that source, except that the response variable is replaced by its expectation.