Thursday, May 5, 2011

GUI Programming and MVC

This post was inspired by this post on William Cook's blog.

I've been thinking a lot about GUIs lately, and I think model-view-controller is a very good idea -- but one that existing languages do a poor job supporting. Smalltalk deserves credit for making these thoughts possible for Trygve Reenskaug to express, but it's been over 30 years; we really ought to have made it easy to express by now.

Now, the point of a controller is to take low-level events and turn them into high-level events. Note that "low-level" and "high-level" ought to be relative notions -- the toolkit might give us mouse and keyboard events, which we turn into click events by writing button widgets, which a calculator button in turn converts into application events (numbers and operations).
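To make the layering concrete, here is a minimal sketch in Python. Everything in it (the event classes, `ButtonController`, `CalculatorController`) is a name I made up for illustration, not the API of any real toolkit. Each controller consumes events at one level and emits events at the next level up.

```python
from dataclasses import dataclass

# Hypothetical event types, invented for this illustration.
@dataclass(frozen=True)
class MouseDown:
    x: int
    y: int

@dataclass(frozen=True)
class MouseUp:
    x: int
    y: int

@dataclass(frozen=True)
class Click:
    button_id: str

@dataclass(frozen=True)
class Digit:
    value: int

@dataclass(frozen=True)
class Operator:
    symbol: str


class ButtonController:
    """Turns raw mouse events into Click events for one rectangular button."""

    def __init__(self, button_id, x, y, width, height, on_click):
        self.button_id = button_id
        self.bounds = (x, y, x + width, y + height)
        self.on_click = on_click
        self.armed = False  # True between a press inside and the next release

    def _inside(self, x, y):
        left, top, right, bottom = self.bounds
        return left <= x < right and top <= y < bottom

    def handle(self, event):
        if isinstance(event, MouseDown):
            self.armed = self._inside(event.x, event.y)
        elif isinstance(event, MouseUp):
            # A click is a press and a release that both land on the button.
            if self.armed and self._inside(event.x, event.y):
                self.on_click(Click(self.button_id))
            self.armed = False


class CalculatorController:
    """Turns button clicks into application-level calculator events."""

    def __init__(self, on_event):
        self.on_event = on_event

    def handle(self, click):
        label = click.button_id
        if label.isdigit():
            self.on_event(Digit(int(label)))
        else:
            self.on_event(Operator(label))


def demo():
    app_events = []
    calculator = CalculatorController(app_events.append)
    buttons = [
        ButtonController("7", 0, 0, 10, 10, calculator.handle),
        ButtonController("+", 10, 0, 10, 10, calculator.handle),
    ]
    raw = [
        MouseDown(3, 3), MouseUp(4, 4),      # click on "7"
        MouseDown(12, 5), MouseUp(30, 30),   # press "+", release outside: no click
        MouseDown(15, 5), MouseUp(15, 6),    # click on "+"
    ]
    for event in raw:
        for button in buttons:
            button.handle(event)
    return app_events
```

Calling `demo()` yields `[Digit(7), Operator("+")]`: six raw mouse events come in, two application events come out, and the application never sees a coordinate.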

However, in reality new widgets tend to be extremely painful to write, since writing one usually involves mucking around with the guts of the event loop, and so people tend to stick with what the toolkit gives them. This way, they don't need to understand the deep implementation invariants of the GUI toolkit. As a result, they don't really build GUI abstractions, and so they don't need separate controllers. (A good example would be to think about how hard it would be to write a new text entry widget in your toolkit of choice.)

So I see the move from MVC to MV as a symptom of a problem. The C is there to let you build abstractions, but since building them is really hard, we don't. As a result new frameworks drop the C, and just grow by accretion -- the toolkit designers add some new widgets with each release, and everybody just uses them. I find this a little bit sad, honestly.

As you might guess from this blog, I find this quite an interesting problem. I think functional reactive programming offers a relatively convenient model (stream transducers) to write event processors with, but we have the problem that FRP systems tend towards the unusably inefficient. In some sense this is the opposite problem of MVC, which can be rather efficient, but can require very involved reasoning to get correct.
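As a rough illustration of the stream-transducer view, here is a sketch using Python generators. It is only an approximation of the idea, not any particular FRP system, and the function names are made up. Each transducer reads one input per tick and emits one output per tick, so the output at tick n depends only on inputs up to tick n.

```python
from typing import Iterable, Iterator


def rising_edges(pressed: Iterable[bool]) -> Iterator[bool]:
    """Mouse-button state per tick -> True only on ticks where a press begins."""
    previous = False
    for now in pressed:
        yield now and not previous
        previous = now


def running_count(events: Iterable[bool]) -> Iterator[int]:
    """Event flags per tick -> number of events seen so far."""
    total = 0
    for fired in events:
        if fired:
            total += 1
        yield total


def labels(counts: Iterable[int]) -> Iterator[str]:
    """Counts per tick -> the text a label widget should show."""
    for n in counts:
        yield f"clicks: {n}"


def click_counter(pressed: Iterable[bool]) -> Iterator[str]:
    """Compose the three transducers into one event processor."""
    return labels(running_count(rising_edges(pressed)))
```

For example, `list(click_counter([False, True, True, False, True, False]))` gives `["clicks: 0", "clicks: 1", "clicks: 1", "clicks: 1", "clicks: 2", "clicks: 2"]`. The composition is just function composition, which is the convenience I have in mind; the cost is that every stage runs on every tick whether or not anything changed.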

This basically leads to my current research program: can we compile functional reactive programming into model-view-controller code? Then you could combine the ease of use of FRP with the relative efficiency[*] of the MVC design.

IMO, the system in our LICS paper is a good first step towards fixing this problem, but only a first step. It's quite efficient for many programs, but it's a bit too expressive: it's possible to write programs which leak rather a lot of memory without realizing it. Basically, the problem is that promoting streams across time requires buffering them, and it's possible to accidentally write programs which repeatedly buffer a stream at each tick, leading to unbounded memory use.
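Here is a toy model of that leak in Python. It is not the calculus from the paper; `Capture`, `leaky` and `bounded` are names invented for this sketch, and the only assumption is that a stream captured at one tick for use at a later tick must remember every value that arrives in between.

```python
from collections import deque


class Capture:
    """A stream captured at some tick for use at a later one.

    Until it is consumed, it has to buffer every value that arrives.
    """

    def __init__(self):
        self.buffer = deque()

    def push(self, value):
        self.buffer.append(value)

    def pop(self):
        return self.buffer.popleft()


def leaky(ticks):
    """Accidental pattern: capture the stream afresh at every tick, never consume."""
    captures = []
    for t in range(ticks):
        captures.append(Capture())
        for capture in captures:
            capture.push(t)  # every live capture buffers the new value
    return sum(len(capture.buffer) for capture in captures)


def bounded(ticks):
    """Capture once and consume each value on the tick it arrives."""
    capture = Capture()
    for t in range(ticks):
        capture.push(t)
        capture.pop()
    return len(capture.buffer)
```

In this model `leaky(100)` retains 5050 buffered values (it grows quadratically in the number of ticks), while `bounded(100)` retains none. Nothing in the leaky version looks obviously wrong at any single tick, which is what makes this kind of leak easy to write by accident.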

[*] As usual, "efficiency" is relative to the program architecture. MVC is a retained mode design, and if the UI is constantly changing a lot, you lose. For things like games, immediate mode GUIs seem like a better design to me. In the longer term, I'd like to try synthesizing these two designs. For example, even a live-document app like a spreadsheet or web page (the ideal cases for MVC) may want to embed video or 3D animations (which work better in immediate mode).
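The tradeoff in the footnote can be sketched by counting draw calls. This is a deliberately crude model with made-up names (`RetainedLabel`, `retained_frames`, `immediate_frames`), not a real toolkit: the retained widget persists across frames and redraws only when its text changes, while the immediate-mode loop redraws from the model on every frame.

```python
class RetainedLabel:
    """Retained mode: the widget persists and redraws only when it is dirty."""

    def __init__(self, text):
        self.text = text
        self.dirty = True  # needs an initial draw
        self.draw_calls = 0

    def set_text(self, text):
        if text != self.text:
            self.text = text
            self.dirty = True

    def render_frame(self):
        if self.dirty:
            self.draw_calls += 1  # a real toolkit would paint here
            self.dirty = False


def retained_frames(texts):
    """Feed one model value per frame to a persistent widget; count draws."""
    label = RetainedLabel(texts[0])
    for text in texts:
        label.set_text(text)
        label.render_frame()
    return label.draw_calls


def immediate_frames(texts):
    """Immediate mode: no persistent widget, every frame redraws from the model."""
    draw_calls = 0
    for _text in texts:
        draw_calls += 1  # a real toolkit would paint here
    return draw_calls


static_ui = ["total: 42"] * 60
animated_ui = [f"frame {i}" for i in range(60)]
```

With `static_ui`, the retained version draws once and the immediate version draws 60 times. With `animated_ui`, both draw 60 times, and the retained version has paid for change tracking that bought it nothing.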


  1. This comment has been removed by the author.

  2. I agree that low-level events must often be translated into higher-level events. I just think that this is naturally part of the responsibility of a view. So it's really just a question of packaging and modularity. I'm arguing for MV where the full view V could be decomposed into a small output-only view v and event controller c. You might want to look more closely at the old COM-based MFC library, where it was fairly common to define new widget types, although they tended to be wrapped versions of some more primitive views.

    I looked at your ICFP submission. The abstract is fascinating but would probably sound like Jabberwocky to most people:

    "We give a denotational model for graphical user interface (GUI) programming in terms of the cartesian closed category of ultrametric spaces. The metric structure allows us to capture natural restrictions on reactive systems, such as causality, while still allowing recursively defined values. We capture the arbitrariness of user input (e.g., a user gets to decide the stream of clicks she sends to a program) by making use of the fact that the closed subsets of a metric space themselves form a metric space under the Hausdorff metric, allowing us to interpret nondeterminism with a 'powerspace' monad on ultrametric spaces."

    What really scares me is that I actually understand the paper to some degree. But I have to admit that I don't think it is going to help me understand the kinds of problems that I care about in writing GUIs. I'm more interested in how GUIs are constructed at a high level, not how they are programmed at this micro-event level.

  3. I think that most people would find talk of continuations and coalgebras to be weird, too, but they use exceptions and objects all the time. :)

    More seriously, the problem we're starting (though not finished, by a long shot) to solve is how to build abstractions over event-driven programs. All the semantic stuff was forced on us basically as we worked out how to specify a kind of function abstraction which interacts nicely with event streams, which is a property that existing functional and OO libraries lack (basically because their existing procedure mechanism is just a bit too general).

    If you look at the calculator example in our paper, you can see how this starts to pay off even in small examples. Being able to treat event streams as their own kind of data, together with the ability to easily abstract, means that it pays off to introduce new widgets at a very fine grain. Even more importantly, the code ends up structured in the same way that I'd want to explain it to someone.

    I do plan to try translating larger programs (eg, JHotDraw, the E. coli of GUI programs), but there are a number of implementation deficiencies that I need to address first. If I'm lucky, I'll discover some semantic deficiencies too.

  4. I hate to fork this discussion further, especially since I do not allow comments on my blog, but my thoughts exceeded the length allowed here:

  5. NeelK,

    Having slept on your paper and blog post, I think I'd like to update my blog post further and give you a series of challenge problems to test your current and future ideas against.

    Also, I still don't understand what you mean here by letting programmers write code in FRP and then compile it to MVC. The nice thing about MVC is that it is easy to sequence events relative to an observer. FRP is very different in this regard, due to its continuous semantics, since it flips the traditional way MVC frameworks deal with change: the changed object does NOT have to know all of its sources of change!

    Finally, (imperative) data binding is really just a generalized Observer pattern where the Subject and Observer use a Registrar to completely decouple from one another and maintain anonymity. The Observer doesn't care who the Subject is, and only cares about being able to write distributed, ad-hoc queries (à la Linda and Concurrent Prolog). Of course, these distributed, ad-hoc queries can be hard to see as such in some data binding examples, such as the WPF one, since the Binding class allows developers to use open recursion to create conditional binding rules. A good example is Paul Stovall's DelayBinding class, which allows an Observer to specify a distributed, ad-hoc query demanding that a Subject provide some data after some time span has elapsed. The motivating problem is delayed search, such as the Google Search textbox feature that recommends auto-completions and results as the user types, but does not update so often as to give users seizures :)

    In this regard, (imperative) data binding shares the same characteristic as MVC, since, again, the changed object DOES NOT have to know all of its sources of change. Also, thanks to open recursion, different policies for subscribing to updates can be 'mixed in'.

    Am I speaking heresy? There are other issues I plan to point out, but one at a time...

  6. William Cook said: "I'm arguing for MV where the full view V could be decomposed into a small output-only view v and event controller c. You might want to look more closely at the old COM-based MFC library, where it was fairly common to define new widget types, although they tended to be wrapped versions of some more primitive views."

    This is called Document-View architecture in 'Softie speak.

    The major reason you don't want to use Document-View architecture is that it doesn't allow for complex View update rules. Creating new widgets in MFC was not as common as you suggest, and certainly nowhere near as common as the explosion of custom controls in WPF and Silverlight.

    As an example of complex View update rules, how would you design a spreadsheet or data grid control using Document-View architecture? As I posted on my blog a few weeks ago, Alan Kay's group developed spreadsheet software in which every spreadsheet cell had its own MVC. When the user asked the spreadsheet software to graph a range of cells, the graph would ask each cell where it should be drawn, what color (the cell would ask itself, "Am I positive? Then I'm Black, too"), etc.

    The cell passed forward its capabilities to the graph, potentially allowing for rich interactions like drilling down into a point on the graph or tagging the point on the graph with a note. For example, a graph showing each US President's approval rating might tag points in the timeline with things like "September 11th Twin Towers Terrorist Attack" to demonstrate the sudden jump in George Bush's approval ratings. In this regard, the graph doesn't know what it is rendering; it has only one responsibility: make sure things look good. This is an appealing way to design graphical components, especially when you consider known hard problems like labeling pie slices in pie charts. The responsibility of the pie chart in this scenario is to provide a heuristic that gives close to optimal labeling.

    I think NeelK's point remains: Document-View architecture is really just a lame heuristic due to weak languages. Smalltalk-80 made MVC possible, though still harder than necessary, and we should do better than just relying on UML Package Diagrams aka "architectural patterns" like Presentation-Abstraction-Control, J2EE Model-2, Document-View and other heuristics. In composite applications like most real world enterprise software and most developer tools (IDEs), there are third-party plug-ins. Most of these composite applications would probably benefit from a formal theory of communication like the Pi Calculus, rather than CRAP like Microsoft Patterns & Practices' Prism framework (aka WPF Application Block) which just shoves most important plug-in communication into a global store with no invariants, which from the 1,000-foot perspective is no better than things like tcc and Daedelus which express transitions on a global store.