|
Advanced Redex Trails:
fully-fledged tracing technology for functional programs (project proposal) Colin Runciman Department of Computer Science, University of York
This is a revised version of the proposal submitted
to EPSRC, the research council that is funding the project.
Some minor sections giving personal or financial details,
or addressing issues of EPSRC policy, have been omitted. Summary:In many fields, high expectations of Information Technology are limited in practice mainly by current methods of developing computer software. Declarative programming systems in general, and functional languages in particular, have an important part to play in an improved software technology. These languages free programmers from the need to express specific sequences of calculation. They also provide powerful ways of directly combining component functions. They are inherently safer than programming languages now in widespread use, and dramatically more concise.However, the very high-level nature of functional languages poses two big problems: (1) how to turn programs into efficient computations; (2) how to trace programming errors from the faults they cause. Since the mid 1980s, problem (1) has been a popular target of research, and excellent progress has been made with it: there are now optimising compilers for functional languages. But problem (2) has received less attention, and in practical terms it remains wide open. The lack of tracing and debugging tools is a long-standing gap in functional-language technology, deterring potential users. I therefore propose a decisive attack on the tracing problem for functional programs. My aim is to advance a successful but limited prototype, recently developed with ROPA funding, to the stage of a convincing tool for practical application. Several research problems must be solved to achieve this aim: to give just two examples, efficient low-level counterparts must be found for high-level combinators with subtle properties of abstraction, and new algorithms are needed for incremental compression of heap structures to file without disturbing lazy evaluation. Key phrases:declarative languages; functional programming; software development; fault-tracing tools; programming environments.1 Context1.1 Past ProjectsThe broad aim of my work in recent years has been to advance the state of the art in functional programming systems, enabling wider and more effective use to be made of this technology. I have run a series of research projects in this connection, with a mixture of government and industrial funding. The most important of these for this proposal have been:FLARE (1991--93, IEATP, GR/F 98857 C2117)This project aimed to demonstrate that functional programming systems could be effectively applied to a range of demanding applications. It was a collaboration between industrial and academic partners, each of whom developed applications in their own field of expertise. British Telecom managed the project. I was the technical co-ordinator and co-edited a book based on the project [runcimanwakeling95]. One important conclusion from FLARE was that the availability of profiling and tracing tools can be a critical factor in the successful use of functional programming. At York we developed novel methods for profiling heap memory [runcimanwakeling93a] and parallel evaluation [RuncimanWakeling94]. There was not time to address the more complex tracing problem, but it was put high on our list of topics for future research.Embedding Functional Programs (1994--, Canon Research Europe)Towards the end of FLARE, a separate PhD project at York aimed to adapt a functional language implementation for the special requirements of embedded systems, including demands for memory efficiency and a richer treatment of I/O [wallacerunciman95b]. CRE has funded a continuation of this work. The current techniques used for embedding software into devices such as typewriters and copiers are a long way from purely functional programming. But there are prototype software systems developed on large workstations in declarative languages that Canon might one day wish to embed in such office products. We have been researching extensions to functional languages and their implementations to make such embedding possible. A significant part of this work has been a novel treatment of compressed binary data [WallaceRunciman98]: this could be a useful part of our tracing technology, where one of the key problems is to reduce memory demands for trace structures representing entire computational histories.Selective tracing of functional computations using graph reduction with redex trails (1996--97, ROPA, GR/K64334)This 18-month project aimed to demonstrate the feasibility of a particular approach to the tracing problem by building a prototype implementation. We extended a compiled graph-reduction implementation of the functional language Haskell, transforming programs so that at run-time they could build trails of contracted redexes (intermediate expressions, re-written by appeal to definitions in the program) attaching a comprehensive derivation tree to every value [sparudruncimanplilp97]. We developed a special purpose browser for the examination of trails starting from outputs or run-time faults; we experimented with partial trails and pruning rules to increase the size of applications that could be handled [SparudRuncimanIFL97]. This ROPA project convinced us that redex trails can be made the basis of an effective tracing system, useful in real applications. The goal of the project now proposed is to fulfil this potential.1.2 InstitutionThe Computer Science department at York has internationally known research groups and attracted the highest possible rating (5*) in the most recent national Research Assessment Exercise. Software technology, including compilers and related tools, is a long-standing theme in the department's work.2 Proposed Research2.1 BackgroundThe tracing problemFunctional programming systems offer significant advantages over more conventional alternatives, particularly for complex symbolic applications. The abstracting power and declarative nature of functional languages, largely free from low-level concerns such as sequence of evaluation and memory management, enable computational ideas to be expressed very directly and concisely, with fewer possibilities for error.However, this property of being abstract is two-edged. Though there are fewer classes of possible mistakes in functional programs, errors do still occur, and programmers need to trace their causes. Time and again the lack of tools for tracing and debugging has deterred software developers from using functional languages [Wadler98]. This is not simple neglect on the part of implementors; a high level of abstraction separating language from machine makes tools genuinely hard to build. A computation of a conventional imperative program is comparatively easy to trace, step by step, but a functional computation performed using a sophisticated evaluation strategy such as normal order graph reduction involves a sequence of events not explicit in the program, and often baffling to the intuition. Hence the research challenges: What is an appropriate form of trace for functional computations? How best can such traces be constructed and applied in practice? Approaches to the problemIn the 1980s, little work was done on profiling and tracing functional computations. In the 1990s, work at York and elsewhere has delivered effective profiling techniques, leading to marked reductions in the time and space costs of many applications [RuncimanRojemoSchool96, sansompeytonjonestoplas97]. Success with the tracing problem has not come so easily: a review in a recent Australian thesis [watsonphd] notes some twenty different attempts at a solution, none of them conclusive!Approaches broadly divide into those that base traces on a linear sequence of events, and those that represent computational history using some sort of dependency graph. Methods based on a sequence of events have had some success for `mostly functional' languages with a strictly applicative evaluation sequence; Tolmach and Appel's system for the ML language is a notable example [tolmachappeljfp95]. The benefits of the sequential approach are that a trace can be displayed as computation proceeds, and that it is easy to trade time for space: given a few check-point states stored at well-chosen intervals, any past state of computational history can be reached by re-execution. However, the more ideal purely functional languages permit the definition of non-strict functions, implemented using the lazy evaluation strategy of call-by-need. Despite some valiant attempts [watsonphd], the evidence to date is that for such programs linear traces of the sequence of reductions are ineffective. Rather, to track faults in realistic computations, it is necessary to record a non-linear structure of dependencies between expressions. Redex trails prototypeIn our work, a network of redex trails [sparudruncimanplilp97, SparudRuncimanIFL97] represents these dependencies. Functional expressions can be viewed as trees or --- since sub-expressions can be shared and may recursively reference enclosing expressions --- as graphs. Evaluation repeatedly replaces one subgraph by another, where the reduction rules used to define replacements are derived from equations that constitute the program. At each reduction step, a redex matching the left-hand side of an equation is replaced by the corresponding instance of the right-hand side. To construct redex trails, at each reduction during the computation we arrange that from every node n of the resulting subgraph there is a link to the parent redex whose reduction caused n to be introduced. Using a special-purpose browser which reconstructs expressions in source form, the programmer can proceed backwards along the relevant trails from a fragment of output, from a run-time error, or from a point at which an apparently unproductive computation has been interrupted. Our ROPA project has shown the viability of this approach, as outlined in §1; but despite its promising results, our ROPA prototype is limited in a variety of ways, each prompting lines of further research:
Other alternativesThough we have developed redex trails as the basis for tracing, we are aware of alternatives. Perhaps the most advanced of these is Nilsson's Freja system, based on Evaluation Dependence Trees (EDTs) [NilssonPhd98]. His EDT nodes record equivalences between expressions, where each parent equivalence is a direct consequence of the child equivalences and an equation in the program. The starting point in an EDT is not a final reduction leading to an output or point of error but the root equivalence between an original main expression and its entire computed result. Two drawbacks of the EDT approach in comparison with redex trails are that incremental archiving to file is even more difficult (Nilsson rules it out in favour of repeated execution to obtain new fragments of the EDT) and that the user has to reason about larger displayed expressions.2.2 Programme and MethodologyOverall aim:a solution to the problem of tracing higher-order lazy functional computations that is effective in practice, based on tools for the construction and examination of redex trails.Primary objectives
Secondary objectives
Timeliness and noveltyThe need for a tracing tool that can cope with lazy higher-order functional programs has been recognised for almost twenty years. The lack of a tracer has been a key reason for the surprisingly limited use of a programming technology with so many advantages [Wadler98]. There are now optimising compilers for languages like Haskell, and tools for the integration of functional components into larger systems. A solution to the tracing problem is overdue. It is a difficult problem to tackle comprehensively, but the redex trails approach is distinctive, and with the results already obtained in the ROPA project we are well-placed to succeed.MethodologyA lot of work has gone into our ROPA prototype, with promising results. We should advance from the given position, not restart from scratch. The principles of the compile-time transformation, the functional representation of redex trails, and the message-passing architecture of the tracer (browser communicating with functional computation) should all be preserved. We shall also continue to work with the nhc compiler, but reducing dependencies on its internal world to increase portability. Java was used for the prototype browser, and proved quite satisfactory. Though we plan to re-think the browser design, implementation will again be in Java --- a browser that works well over the internet could prove useful when we want to support external users.So far as practical evaluation is concerned, we do not intend to set up rigorous laboratory experiments scrutinising tracing `micro-tasks'. Such experiments would absorb too much effort. Results from actual use in practice are preferable for our purpose, but general feedback from casual use might be too vague or sparse. One kind of evaluation exercise we intend to use requires two programmers. A explains to B a correctly working program of moderate size (around 500 lines) that A wrote some time ago. B now secretly introduces several deliberate errors into the program, of a kind undetected by the compiler. Given the faulty program, A uses the tracer to locate and fix all the errors, thinking aloud as they do so. B takes notes throughout, and afterwards A and B review the exercise, recording the main issues arising. Programme of workFigure 1 summarises the workplan in diagrammatic form. From the start, the two researchers will engage in joint tasks combining their skills as well as solo tasks demanding their individual expertise. Brief descriptions follow for each activity.
Project managementA weekly project meeting will provide a regular occasion to review current progress and confirm immediate plans. With a small team, working closely together, these meetings will naturally be informal. However, written records will be kept to avoid misunderstandings or things being forgotten. Interested visitors may be invited to join these meetings, and regular meetings of the Programming Languages and Systems research group will give further opportunity for brief reports and discussion.Once a quarter we shall review progress on the project as a whole, revising plans if necessary to secure the most important objectives. We shall also issue a short quarterly bulletin to our academic and industrial contacts, and any other interested parties. 2.3 Dissemination and exploitationWe shall seek in various ways to encourage both academic and industrial Haskell users to try our new tracer, and to participate in its further development. The plan is to release an experimental version of the tracer at the end of the first year, as the common platform for a concerted period of trial use. We shall make suitable advance announcements on the internet, seek opportunities to present our work on the tracer at recognised conferences, and also hold a one-day workshop at York. Within the UK, we may visit users to get them started. Besides establishing a mailing list for general feedback and discussion, we shall publish details of suggested forms of tracing experiment on the web.There are several archival FTP sites for public domain implementations of functional languages. We plan to make the final results of our work freely available by placing copies of software at these sites, in addition to publishing papers. 2.4 ResourcesThe appropriate scale for this project has been determined by comparison with the 18-month ROPA project during which the prototype redex-trail system was developed.PeopleIf we are to push this tracing technology far enough to make it really effective in practice, there is a lot of work to be done on several fronts. Hence the proposal to hire a team of two post-doctoral RAs for two years.Infrastructure and equipmentThe department of Computer Science at York provides computing facilities including networking and file-service, with technician support.Each of the RAs will need a workstation. The two most common platforms for developers and users of research languages are Suns and PCs. Of these, PCs offer better performance for the price --- and we have access to an UltraSparc compute server if a Sun platform is needed. At the time of writing £2000 buys a PC system including a 400MHz Pentium II processor, 128Mb RAM, 8.4Gb disk, 17"--19" monitor, CDROM, network and graphics cards. Suitable components will be chosen and purchased separately, and assembled by our departmental technicians. TravelWe request funds to participate in relevant conferences. ICFP is the premier international conference in functional programming; INTERACT is an international conference with user interface design as a central interest. IFL is an established European workshop emphasising new implementation methods and practical experiments in functional programming. The costing assumes that we all attend ICFP but only two of us attend each of IFL and INTERACT.We hope to maintain close working links with researchers in Sweden. Jan Sparud (Gothenburg) worked on the ROPA project, and we value continuing collaboration with him. Henrik Nilsson (Linköping) has developed a rival approach to functional tracing, based on Evaluation Dependence Trees. We propose to visit Sweden twice a year. Within the UK, we have links with relevant research groups at several other universities, from Exeter to St Andrews, and with industrial R&D sites, mainly in the South. We request funds for visits to discuss the development of the tracer, and to promote its experimental use. References
This document was translated from LATEX by HEVEA. |