Dealing with Intermediate Computations

In complex spreadsheets, it’s not unusual to find cells whose values are computed from complicated combinations of subexpressions that contain many functions and operators. Trying to build formulas in these cells to produce the desired results can be very difficult.

The reasons for this are three-fold. First, if a formula is complicated, then it may contain more characters than will fit into Excel’s formula bar. This makes it very hard to see the entire formula in a glance. Second, Excel depends on parentheses to specify the order of computations. Deeply nested parentheses also make it difficult for the user to comprehend the structure of a formula. Finally, Excel represents a formula as a single line of text. Virtually all software text editors make code much easier to read by using vertical space, tabs, and colors to reveal its structure.

Because of these issues, many Excel formula developers choose not to develop single monolithic formulas. Instead, they change what would be a complicated single formula into a simpler master formula that refers to other cells that compute intermediate values. These cells, in turn, may refer to yet other cells that compute intermediate values for them, and so on. The simpler formulas in each of these intermediate cells are essentially subexpressions of the complicated formula that the formula developer chose not to build in a single cell.

The image below shows a spreadsheet that contains three ranges.

  • A1:B5 contains input values for a computation. It has values in cells B2:B5 and labels A, B, X, and Y for these values in cells A2:A5. The spreadsheet defines these labels as names for the cells containing their values, so that the labels can be used in formulas such as (A+B)*(X+Y).
  • D1:E4 contains the computation of (A+B)*(X+Y) broken down into intermediate computations of A+B in cell E2 and X+Y in cell E3. The computation of (A+B)*(X+Y) in cell E4 refers to the intermediate computations in cells E2 and E3.
  • G1:H2 shows a single formula performing the same computation in cell H2 without the need for intermediate computations.

Intermediate Computations - Intermediate Values

The image below shows the same spreadsheet but with formulas shown.

Intermediate Computations - Intermediate Values as Formulas

One can see the tradeoffs between using intermediate values and having a single-cell formula that refers only to the starting input values. The intermediate value approach has a simpler formula in the cell that computes the final value but has several disadvantages. First, the resulting spreadsheet has many more cells than necessary. These extra cells may cause the sheet(s) containing the intermediate values to become too large to fit on small screens, such as those on laptops or mobile devices. Second, the extra cells clutter up the spreadsheet, making it hard for the reader to distinguish between cells holding values of interest and cells holding intermediate values that have no reason to exist other than to provide feeder values for other cells. One can tell whether a cell contains a feeder value for another cell by using Excel’s auditing tools to look for dependents, but doing this does not distinguish between cells that have independent interest (e.g., row totals) and those that do not. Furthermore, using Excel’s auditing tools can be problematic if a dependent is on a different sheet than the intermediate-value cell: Excel does not point to specific cells on other sheets, but simply provides a graphical representation that the cell is on another sheet. A final disadvantage of the intermediate value approach is that it does not work for array formulas, which cannot be broken up into intermediate computations.

The single-cell formula approach allows for spreadsheets without extraneous cells, but it makes it harder to construct and debug formulas. FormulaForge solves this problem. Specifically, it allows Excel users to build and debug a complicated formula in a single cell without the disadvantages of using intermediate cells. It corrects the problems that occur when using Excel alone by

  • providing a way to develop a formula that will eventually be a single text line without having to work exclusively in a single text line,
  • providing a way other than parentheses to specify computational order,
  • providing a way to represent a formula’s structure using both vertical and horizontal space, and
  • making it easy to see intermediate values without using additional cells.

Formula Forge’s Tree View overcomes all the of the disadvantages of intermediate cells described above while at the same time maintaining the intermediate cell advantage of showing intermediate values. The image below shows the Tree View for our sample formula.

Intermediate Computations - Formula Forge Tree View

The example in this blog entry is very simple but illustrates the value of being able to construct a complicated formula in a single cell. As the complexity of a formula increases, the value of FormulaForge becomes more and more evident.