Proposal: remote prover connectivity for Isabelle/PIDE

(See also general notes on proposals.)

“Cloud computing” is one of these buzzwords without any particular meaning, but the idea to run heavy-duty computations remotely is rather old: some “big-iron” in the background provides the CPU and memory resources for substantial applications, while the user interacts with the system via some small local terminal. Already in the classic days of Proof General (around 1999) it was common-place to run Emacs locally on a workstation and the prover process remotely on a server (via rsh). Alternatively it was possible to run both the editor and the prover remotely and use X11 as display protocol, which was especially important for the rather heavy XEmacs of that time.

This normal mode of distributed computing was almost forgotten, when the performance of local laptops and remote servers were approaching the same order of magnitude (due to the demands of the gaming industry). This was only an episode over a single decade, though, and we are already back to the traditional situation where local and remote machines can differ significantly. In 2014, typical mobile devices were limited to 2–8 CPU cores and 2–8 GB RAM. This is very little compared to low-end workstations or high-end servers, with something like 8–36 CPU cores and 32–512 GB RAM, or more.

Note that some big Isabelle applications already go beyond the possibilities of small machines with only 4–8 GB RAM, but for more memory Poly/ML process needs to be switched from 32-bit to 64-bit mode, which also doubles the memory demands. Thus there is a discontinuity here: stepping out of the “small device” category means to go for 16–32 GB RAM minimum.

This motivates the demand for remote prover connectivity for Isabelle and its Prover IDE (PIDE). The most basic approach is to run the internal socket connection for the PIDE protocol between ML and Scala over ssh. This should be sufficient for fast and reliable local networks. For non-local networks, there are the usual questions about bandwidth, latency, and reliability of the connection. The PIDE protocol requires relatively high bandwidth (which is easily provided by common DSL connections), but can afford high latency due to its asynchronous nature. Lack of reliability might turn out a real problem, though: resetting a lost TCP/IP connection naively means to restart the prover process and recheck the whole session from start, which could take minutes or hours.

Thus a more advanced approach would keep both the ML and Scala side of PIDE together on the server. Remote access then works via a separate PIDE display protocol, which is postulated here and still needs to be defined and implemented. Depending on active buffers and open text areas in the editor, the remote side would provide continuous access to incoming PIDE document markup, without demanding persistent management of the whole PIDE state locally. Loosing the connection would merely mean to reconnect the IDE to the remote Isabelle/Scala/ML component, which keeps running indefinitely.

Thus the mode of operation becomes more like the re-connection facility of VNC or RDP (but not X11). Of course it is already possible today with Isabelle2014 to use VNC or RDP for a completely remote ML/Scala/IDE process, but remote ML/Scala and local IDE would make this more comfortable for the user, with better graphics performance and reactivity.

Taking this perspective of remote PIDE sessions one step further could mean to support low-bandwidth, high-latency, unreliable connections of mobile networks: sitting on a train with a laptop and local IDE, while re-connecting to a remote PIDE session on a big server, would really count as cloud computing. We should think here of editing whole libraries like AFP on the spot, with immediate feedback. A bit more efforts will be required to get there, though.

In summary, the following stages are possible, depending on the amount of resources spent on this subject:

  1. Simple remote PIDE socket connection via ssh, usable for fast and reliable local networks. (The jEdit text editor already provides some means to manage ssh, so this merely requires the usual study of sources with subsequent tinkering and polishing to make it work smoothly.)
  2. Separate PIDE display protocol where the editor is local and the Isabelle/Scala/ML session is remote. This should be usable for fast DSL network connections.
  3. Support for smooth disconnection and re-connection for mobile networks.
  4. Development of a completely different PIDE front-end that works on tablets or smart-phones (Android or iOS).

The last point is speculative: it merely sketches to horizon of what could eventually be targeted, if there were lots of resources and several enthusiastic people working on it.