arch detail

Tuesday, February 20, 2007

In Response to Berkeley's "Landscape of Parallel Computing Research"

I recently read The Landscape of Parallel Computing Research: A View from Berkeley. It's a great article, it's freely available, and if you're interested in the future of high-performance computing or operating systems (or really any computing that will be affected by the "manycore" movement in processor design), I strongly recommend it.

In the article, they settle on thirteen major computational kernels that they expect to underlie most future workloads, essentially setting up the SPEC of the future. Twelve of the thirteen are heavily parallel.

The thirteenth problem is modeling finite-state automata (though they use the term "finite-state machine," or FSM). If you don't know what FSAs are, I just posted a brief summary on this blog, which should be just enough to help this entry make sense.

So now, back to the article. The Berkeley paper mentions the following in passing:

"Perhaps FSMs will prove to be embarrassingly sequential just as [another problem] is embarrassingly parallel." (page 14)

I found this statement immensely interesting, and I think it's a major oversimplification. You can see why computing the state of a deterministic finite automaton (DFA) doesn't lend itself to much parallelism: at any given time we are in exactly one state, and after one input we move to exactly one new state.
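
To make that concrete, here's a tiny Python sketch of DFA simulation (the states and the transition table are made up purely for illustration, not taken from the paper):

# Hypothetical DFA: a transition table mapping (state, input) -> next state.
dfa = {
    ("S0", "A"): "S1",
    ("S1", "A"): "S1",
    ("S1", "B"): "S2",
}

def run_dfa(start, inputs):
    state = start
    for symbol in inputs:
        # Each step depends on the result of the previous step,
        # so this loop is a chain of serial dependencies.
        state = dfa[(state, symbol)]
    return state

print(run_dfa("S0", "AAB"))  # prints S2

You can't compute step N of that loop until you know the result of step N-1, which is the whole problem.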

And if FSAs are hopelessly sequential, then we have a whole family of problems that can't be solved any faster by an 80-core processor in 2010 than by a single-core processor in 2002.

However, this does not factor in the possibility that our FSA is not a DFA. There is also a whole family of FSAs called non-deterministic finite automata, or NDFAs. These machines are generally somewhat difficult to conceptualize, because at any given time we need to imagine the machine as being in a set of one or more states. We might have a rule like this:

S0: A -> S1
    A -> S2
    B -> S3

i.e., when in state S0, if given the input A, we transition to both S1 and S2.
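
Here's a rough Python sketch of that rule, with the machine's configuration represented as a set of active states (the table only covers the example above; everything else is my own scaffolding):

# Non-deterministic transition table: (state, input) -> set of next states.
nfa = {
    ("S0", "A"): {"S1", "S2"},
    ("S0", "B"): {"S3"},
}

def nfa_step(current_states, symbol):
    # One input symbol fans out from every active state; the results are unioned.
    next_states = set()
    for state in current_states:
        next_states |= nfa.get((state, symbol), set())
    return next_states

print(nfa_step({"S0"}, "A"))  # prints {'S1', 'S2'}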

It is a known fact that any NDFA can be mechanically converted back to a DFA (via the standard subset construction), which is just wonderful for sequential programming. But a well-designed NDFA can be much simpler (far fewer states, fewer steps to completion) than an equivalent DFA. And in fact, NDFAs are often more intuitive for a human to design, and can more easily model things like lambda calculus, which requires non-deterministic parsing.
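
For the curious, here's a sketch of that conversion, reusing nfa and nfa_step from the sketch above (the alphabet argument and the breadth-first loop are just my way of writing the subset construction, in which each set of NDFA states becomes a single DFA state):

from collections import deque

def nfa_to_dfa(start_state, alphabet):
    # Each DFA state is a frozenset of NDFA states.
    start = frozenset({start_state})
    dfa_table = {}      # (frozenset of states, symbol) -> frozenset of states
    queue = deque([start])
    seen = {start}
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            target = frozenset(nfa_step(subset, symbol))
            dfa_table[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_table

print(nfa_to_dfa("S0", ["A", "B"]))

The catch is that the resulting DFA can have exponentially more states than the NDFA in the worst case, which is exactly why a well-designed NDFA can be so much more compact.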

However, NDFAs aren't very popular, because:
  • NDFAs can't do anything a DFA can't (every NDFA has an equivalent DFA),
  • there are very simple, well-researched methods for turning an NDFA into a DFA,
  • there are simple, well-known rules and methods for manipulating DFAs automatically, and
  • DFAs get along very nicely with sequential programming.
So historically, since DFAs are simple and get along quite well with sequential programming, NDFAs have (in my past education, at least) been treated as sort of a footnote. I'm not even sure whether there are well-known methods for converting a complex DFA into a simpler NDFA (though I'd imagine there are).

In conclusion, I dispute Berkeley's suggestion that FSAs are embarrassingly sequential and thus problematic for the future. What we need is a lot more research into a tricky and complex topic: in particular, methods for creating simpler NDFAs (this might be very tricky, to the point of being an AI problem) and tools that allow us to utilize multiple processors to model NDFAs (this should be pretty simple; see the sketch below).
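
As a very rough proof of concept for that last point, here's one way the set-of-active-states idea could be spread across workers (the thread pool is my own choice, the nfa table comes from the earlier sketch, and a toy table this small obviously won't show any real speedup):

from concurrent.futures import ThreadPoolExecutor

def parallel_nfa_run(transitions, start_states, inputs):
    active = set(start_states)
    with ThreadPoolExecutor() as pool:
        for symbol in inputs:
            # Fan out: each active state can be advanced independently of the others.
            successor_sets = pool.map(
                lambda state, sym=symbol: transitions.get((state, sym), set()),
                active)
            # Fan in: union the successor sets into the new active set.
            active = set().union(*successor_sets)
    return active

print(parallel_nfa_run(nfa, {"S0"}, "A"))  # prints {'S1', 'S2'}

(In CPython the global interpreter lock means a thread pool won't actually speed up these pure-Python lookups; the point of the sketch is just the fan-out/fan-in structure, which maps naturally onto multiple cores.)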

More on this in a later post.
