The gen on Metaware High C/C++'s iterator-driven for.

You may be familiar with other programming languages that have a mechanism where one can use an iterator in a for(each) loop, that uses a yield function/keyword to drive the loop. C♯ has such a capability. So, too, has Python. The C++ language, as standard, does not. However, it is not the case that no implementation of the C++ language has ever had such a mechanism.

MetaWare High C/C++ supported an extension to both the C and C++ languages that MetaWare called iterator-driven for loops. The extension is documented in Appendix C of the High C/C++ Language Reference. I have also written up what it would look like in the C++ Standard.

Using iterator-driven for

Iterator-driven for provides a way of factoring out non-trivial data structure traversal code that would otherwise be repeated in every for loop.

Simple unselective iteration from begin to end

The simplest example is, of course, the same as that targetted by Standard C++'s range-based for, namely a non-selective iteration from begin to end of some container class object:

std::list<Employee> employees;

for ( std::list<Employee>::iterator i = employees.begin();
      i != employees.end();
      ++i
) {
      i->fire();
}

As an iterator-driven for loop, this would be:

void all ( const std::list <Employee> & list ) -> ( Employee & ) {
      for ( std::list<Employee>::iterator i = list.begin();
            i != list.end();
            ++i
      ) {
            yield(*i);
      }
}

std::list<Employee> employees;

for employee <- all(employees) {
      employee.fire();
}

But iterator-driven for goes well beyond this, in four respects: the ability to yield more than just the object, the ability to have non-trivial iterators with internal state, improved code maintainability, and a simple mechanism for halting partway through.

MetaWare's iterator-driven for pre-dates the invention of the range-based for mechanism by at least a decade, and 1990s versions of High C/C++ don't have range-based for at all. It is used in the examples in the remainder of this article for the sake of brevity.

Complex iterations with nested loops and recursion

One common data structure is a list of things that have lists as data members. It's not possible to iterate over all elements of all a single range-based for. One needs a separate for statement for each level of sub-list. However, with iterator-based for one uses just one for statement. The internal complexity of navigating the data structures, that otherwise would be repeated everywhere in the code that one needs to iterate, is only present in one place — the iterator function. If one is going to St Ives, this can be a significant code reduction, or at the very least a significant reduction in indentation of the interesting bits of the logic:

void all_kits ( Man & man ) -> ( Kit & )
{
    for ( Wife & wife : man.wives ) {
        for ( Sack & sack : wife.sacks ) {
            for ( Cat & cat : sack.cats ) {
                for ( Kit & kit : cat.kits ) {
                    yield(kit);
                }
            }
        }
    }
}

void processing ( Man & man ) {
    for kit <- all_kits(man)
        kit.hello();  // This is not already hitting the right-hand edge of the window …
    for kit <- all_kits(man)
        kit.bonsai(); // … and this doesn't repeat the same four for statements.
}

Not only can iterator functions comprise more than one level of internal loop, they can also perform less linear traversals. MetaWare's own example is that of traversing a binary tree, using an iterator function that calls itself recursively:

void all_nodes ( Node & current ) -> ( Node & )
{
	if (Node * right = current.right_child()) all_nodes(*right);
	if (Node * left = current.left_child()) all_nodes(*left);
	yield(current);
}

void processing ( Node & root ) {
    for node <- all_nodes(root)
        node.do_something();
}

Yielding more than one thing

Iterator functions are not limited to just yielding the element from the innermost loop. Since yield is effectively a function call, it can have multiple parameters:

void all ( Inventory & inventory ) -> ( District &, Warehouse &, unsigned, unsigned, unsigned, Item & )
{
    for ( District & district : inventory.districts ) {
        for ( Warehouse & warehouse : district.warehouses ) {
            for ( unsigned x = 0 ; x < warehouse.aisles; ++x ) {
                for ( unsigned y = 0 ; y < warehouse.rows; ++y ) {
                    for ( unsigned z = 0 ; z < warehouse.shelves; ++y ) {
                        yield(district, warehouse, x, y, z, warehouse.content_at(x,y,z));
                    }
                }
            }
        }
    }
}

void print_insurance_claim ( Inventory & inventory ) {
    for district,warehouse,aisle,row,shelf,item <- all(inventory)
        std::cout
            << "In " << district.name()
            << ", in our warehouse at " << warehouse.address()
            << ", the " << item.description()
            << " on shelf " << shelf
            << " row " << row
            << " aisle " << aisle
            << ", that we insured for " << item.price() * 5000U
            << ", was recently lost in a totally accidental fire whilst we ate in a restaurant in front of witnesses.\n";
}

Internal state

Like functors, iterator functions have internal state — the automatic storage duration variables within the function. In preceding sections we've seen how that state can be used to store a "path" to the current data element when an iteration involves nested loops. It can also be used to do other kinds of processing:

void expensive ( const std::list <Employee> & employees, unsigned max_salary ) -> ( Employee & )
{
      for ( Employee & e : employees ) {
            if (e.salary > max_salary)
                  yield(e);
      }
}

void first ( const std::list <Employee> & employees, unsigned total ) -> ( Employee & )
{
      unsigned count = 0U;
      for ( Employee & e : employees ) {
            if (count++ > total) return;
            yield(e);
      }
}

std::list<Employee> employees;

if (whim)
      for employee <- first(employees, 50) {
            employee.fire();
      }
else
      for employee <- expensive(employees, MINIMUM_WAGE) {
            employee.fire();
      }

A break, goto, or return within the for statement body will — of course — correctly call the destructors of all automatic storage duration variables within the iterator function. So the internal state can be used for things such as open file streams, opened at the start of the iterator function and closed when their objects are destroyed.

External state

Unlike the case with iterators built with callback functions, with iterator-driven for the body of the for loop has the normal access, for for statements, to external state — i.e. everything that is in scope at the time. No special contortions are required for accessing automatic storage duration variables outwith the for loop, for example.

void decimate ( Army & army ) {
    unsigned counter = 0;
    for man <- all(army) {
        ++counter;
        if (10 == counter) {
        	kill(man);
        	counter = 0;
        }
    }
}

© Copyright 2011 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.