Proc. of the 36th Annual Meeting of the GI Working Group

(1)

Tagungsband des 36ten Jahrestreffens der GI-Fachgruppe “Programmiersprachen und Rechenkonzepte”

Proc. of the 36th Annual Meeting of the GI Working Group

“Programming Languages and Computing Concepts”

Jens Knoop, Martin Steffen and Baltasar Trancón y Widemann (Eds.) Research report 488, September 2019

ISBN 978-82-7368-453-0

ISSN 0806-3036

(2)

(3)

36tes Jahrestreffen der GI-Fachgruppe

“Programmiersprachen und Rechenkonzepte”

Bad Honnef, 6.–8. Mai 2019

Tagungsband

(4)

Vorwort

Seit 1984 veranstaltet dieGI-Fachgruppe “Programmiersprachen und Rechen- konzepte” regelm¨aßig im Fr¨uhjahr ein Arbeitstreffen im Physikzentrum Bad Honnef. Das Treffen dient in erster Linie dem gegenseitigen Kennenlernen, dem Erfahrungsaustausch, der Diskussion und der Vertiefung gegenseitiger Kontakte.

In diesem j¨ahrlichen Forum werden Vortr¨age und Demonstrationen zu sowohl bereits abgeschlossenen als auch noch laufenden Arbeiten vorgestellt, unter anderem zu Themen wie

– Sprachen, Sprachparadigmen

– Korrektheit von Entwurf und Implementierung – Werkzeuge

– Software-/Hardware-Architekturen – Spezifikation, Entwurf

– Validierung, Verifikation – Implementierung, Integration – Sicherheit (Safety und Security) – eingebettete Systeme

– hardware-nahe Programmierung

Dieser Technische Bericht ist dem Treffen des Jahres 2019 gewidmet, dem 36ten Workshop aus der Reihe, welcher vom 6. bis 8. Mai stattfand. Daneben wurden einzelne Beitr¨age aus vorangegangenen Jahren mitaufgenommen. Diese Jahr gab es speziell Sitzungen mit Vortr¨agen zu den Themen Logikprogrammiering, Typsysteme, Erweiterung objectkorientierter Sprachen, Softwaretechnik und formale Modelle. Daneben stand auch eine Sitzung mit Demos auf dem Programm.

Allen Teilnehmenden gilt Dank, daß sie durch ihre Vorträge, Papiere und Diskussion den jährlichen Workshop zu einem interessanten und anregenden Ereignis machen. Besonderer Dank auch den Mitarbeitern und Mitarbeiterin- nen des Physikzentrums Bad Honnef, die, wie immer, durch ihre umfassende Betreuung für eine angenehme und anregende Atmosphäre gesorgt haben.

Jens Knoop, Martin Steffen und Baltasar Tranc´on y Widemann, September 2019

(5)

(6)

Foreword

TheGI Working Group“Programming Languages and Computing Concepts” ar- ranges since 1984 each spring a workshop in the the “Physikzentrum Bad Honnef”

(a convention centre of the German Physical Society). The meeting serves to foster exchange of ideas and experience, discussions, and to build up new connections or cultivate existent ones. The technical program of the workshop offers a forum for finished work as well as work-in-progress, in the form of presentations and demonstrations. The topics include items from the following (non-exclusive) list:

– languages, language paradigms

– correctness of design and implementation – tools

– software- and hardware architectures – specification and design

– validation and verification – implementation and integration – safety and security

– embedded systems

– hardware-close programming

This report collects contributions or presentation abstracts from the meeting in 2019, the 36th workshop in the series, which took place 6th–8th. May. Besides that, a few papers presented in earlier years are additionally included in these proceedings. This year, there were presentation sessions covering the topics such as logic programming, type systems, software technology, object-oriented languages and their extensions, and formal models, as well as one session reserved for demos.

Thanks to all participants for their presentations, papers, and discussions, that make the annual workshop an interesting and inspiring event. Special thanks also and in particular to the staff and employees at the hosting convention centre, thePhysikzentrums Bad Honnef, who took care of the arrangment and, with their support and assistance, contributed, as always, to an inspiring and enjoyable atmosphere.

Jens Knoop, Martin Steffen, and Baltasar Tranc´on y Widemann, September 2019

(7)

Inhaltsverzeichnis

Jan C. Dagef¨orde, Sanda Dylus, Jan Christiansen, Finn Teegen und Jan Rasmus Tikovsky

Strukturierte Traversierung des Ausf¨uhrungsbaums von

Muli-Programmen . . . 1 Robert Gl¨uck

Report on an Experiment in Protocol Verification by Pushdown

Automata . . . 7 Ralf Jung, Jacques Henri Jourdan, Robert Krebbers und Derek Dreyer

RustBelt: ein solides Fundament f¨ur Rust . . . 8 Herbert Kuchen und Andreas Fuchs

Testfallerzeugung f¨ur Informationssysteme . . . 9 Marcus Lepper, Michael Oehler, Hartmuth Kinzler und Baltasar

Tranc´on y Widemann

Diminuendo al bottom — clarifying the semantics of music notation

by re-modeling . . . 10 Martin Pl¨umicke

Java-TX: The Language . . . 11 Marcus Lepper und Baltasar Tranc´on y Widemann

Combinator 2: Judgement day . . . 24 Michael Uhl

Eclipse-Plug-in for developing in Java-TX . . . 25 Marcus Lepper und Baltasar Tranc´on y Widemann

D2d — XML for Authors . . . 26 Baltasar Tranc´on y Widemann und Marcus Lepper

Simple and Effective Relation-Based Approaches to XPath and XSLT

Type Checking . . . 36

(8)

(9)

Strukturierte Traversierung des

Ausführungsbaums von Muli-Programmen

Jan C. Dageförde¹, Sandra Dylus², Jan Christiansen³, Finn Teegen² und Jan Rasmus Tikovsky²

1 WWU Münster

2 CAU Kiel

3 Hochschule Flensburg

Zusammenfassung. Bei Constraint-Logic Object-Oriented Programm- ing, z. B. mit Muli, werden zur Lösung von Suchproblemen Suchbereiche definiert, die nichtdeterministisch zur Ausführung kommen. Nichtdeter- ministische Verzweigungen führen zur Erzeugung von Choice Points, welche die möglichen Alternativen verwalten und nachhalten, welche Alternativen bereits betrachtet wurden. Dadurch entsteht zur Laufzeit implizit ein Ausführungsbaum. Vor diesem Hintergrund führen wir eine explizite, dauerhafte Repräsentation des Ausführungsbaums für Muli ein. Darin stellen die inneren Knoten Choice Points dar, wohingegen Ausführungsergebnisse an den Blättern des Baums abgebildet werden.

Unsere Repräsentation ist an dieST-Suchbaumrepräsentation von Curry angelehnt. Sie kapselt aber insbesondere Seiteneffekte, die bei der Aus- führung von objektorientierten Programmen notwendigerweise auftreten, in Trail-Datenstrukturen. Wir implementieren Tiefen- und Breitensuche auf Basis dieser Repräsentation.⁴

1 Nichtdeterministische Ausführung imperativer Programme

Constraint-Logic Object-Oriented Programming vereint die Flexibilität imperativer und objektorientierter Programmierung mit Features logischer Programmie- rung. Dies ist insbesondere für Anwendungen relevant, deren Geschäftslogik zu großen Teilen aus objektorientiertem Programmcode besteht, die aber zeitweise versuchen, Suchprobleme zu lösen [DK19a]. Die Münster Logic-Imperative Lan- guage (Muli) ist eine Programmiersprache, die Java um Features des Constraint Programming ergänzt und für solche Anwendungen geeignet ist.

Um Lösungen für Suchprobleme zu finden, bringt die Muli-Ausführungs- umgebungSuchbereiche, d. h. Teile des imperativen Programmcodes, nichtdeterministisch zur Ausführung. Ein solcher Suchbereich besteht aus einem spe- ziellen Methodenaufruf mit logischen Variablen als Parametern, der von der Ausführungsumgebung symbolisch ausgeführt wird. Verzweigungen führen zur

4Extended Abstract des Beitrags, der am 06.05.2019 auf dem 36. Workshop der GI-Fachgruppe Programmiersprachen und Rechenkonzepte vorgestellt wurde.

(10)

Erzeugung vonChoice Points, welche die möglichen Alternativen verwalten und nachhalten, welche Alternativen bereits betrachtet wurden. Dadurch entsteht zur Laufzeit implizit ein Ausführungsbaum. Die bisherige Implementierung der Muli-Ausführungsumgebung verwaltete anstelle des Baums einen einzigen Stack mit noch aktiven Choice Points, also mit denjenigen, die Äste enthalten, die noch nicht betrachtet wurden. Anstelle des Ausführungsbaums war der Ausführungs- umgebung also stets nur der aktuell ausgeführte Pfad bekannt; sobald Teilbäume vollständig abgearbeitet waren, blieben keine Informationen über deren Struktur zurück. Dieses Vorgehen ist für reine Tiefensuche ausreichend und darüber hinaus speichereffizient, aber für die Implementierung anderer Traversierungsstrategien sowie für das Debugging von nichtdeterministischer Ausführung in Suchbereichen ungünstig.

Vor diesem Hintergrund führen wir eine explizite, dauerhafte Repräsentation des Ausführungsbaums für Muli ein, mit dessen Hilfe ein Suchbereich strukturiert traversiert werden kann (s. Abschnitt 2). Darin stellen innere Knoten Choice Points dar, wohingegen Ausführungsergebnisse als Blätter des Baums abgebildet werden. Unsere Repräsentation ist an dieSearchTree- und ST-Suchbaumreprä- sentationen [HPR12,BHH04] von Curry angelehnt, kapselt aber insbesondere Seiteneffekte, die bei der Ausführung von objektorientierten Programmen notwendigerweise auftreten, in einemTrail. In einem Prototyp implementieren wir Tiefen- und Breitensuche unter Verwendung dieser Repräsentation (s. Abschnitt 3).

2 Suchbaumrepräsentation für Constraint-Logic Object-Oriented Programming

Die frühere Implementierung der Suche in Muli, die bloß den aktuellen Aus- führungspfad samt Choice Points verwaltet, kann nur für die Implementierung von Tiefensuche verwendet werden. Auf Grundlage einer expliziten Suchbaum- repräsentation können hingegen weitere Traversierungsstrategien implementiert werden, die abseits eines einzelnen Pfads suchen, wie z. B. Breitensuche.

Wir schlagen eine Suchbaumrepräsentation vor, deren innere Knoten Choice Points sind, die während der Ausführung aufgesammelt werden. Die Blätter stehen für Lösungen von Suchbereichen, die in Constraint-Logic Object-Oriented Programming entweder konkrete Rückgabewerte (mittels return) oder Ex- ceptions (mittelsthrow) sind [DK19a]. Zusätzlich kann die Auswertung eines Asts fehlschlagen, entweder in Folge der Auswertung eines expliziten Aufrufs von Muli.fail() oder weil das zugehörige Constraint-System unlösbar ist.

Dadurch wird ein potentielles Ergebnis des betreffenden Astes von der Lösungs- menge ausgenommen, was ebenfalls im Baum repräsentiert wird. Aus diesen Überlegungen resultiert folgende abstrakte Suchbaumstruktur:

data ST a = Value a | Exception | Fail

| Choice (ST a) (ST a)

Die Ausführung von Muli-Programmen resultiert in Seiteneffekten, die den Ausführungszustand verändern. Dadurch entsteht jeder Knoten des Suchbaums in einem bestimmten Ausführungszustand, der u. a. anhand von Frame-Stack,

2 Jan C. Dagef¨orde et.al.

(11)

Operandenstack und Heap beschrieben wird. Bevor die Ausführung von einem Knoten ausgehend fortgesetzt wird, sollte die Ausführungsumgebung in genau den Zustand des Knotens versetzt werden. Zu diesem Zweck werden in jedem Choice-Knoten zusätzlich zwei korrelierte Trails (vgl. [DK19b]) hinterlegt, welche Informationen über Änderungen am Ausführungszustand speichern. Einer der Trails (Rückwärts-Trail) wird verwendet, um von einem Choice-Knoten in Rich- tung der Wurzel aufzusteigen, d. h. um alle Effekte rückgängig zu machen, die auf dem Weg zum Choice-Knoten stattfanden. Der andere Trail (auch Vorwärts-Trail) wird hingegen verwendet, um von der Wurzel ausgehend zu einem Choice-Knotens abzusteigen. Durch die Speicherung dieser Trails kann frei im Suchbaum navigiert werden.

Der komplette Suchbaum eines Suchbereichs ist erst nach vollständiger Aus- führung des Suchbereichs bekannt. Darüber hinaus gibt es Suchbereiche, deren Suchbäume – gewollt oder ungewollt – unendlich groß sind, z. B. durch Rekursion ohne Abbruchbedingung. Aus diesen Gründen wird die Repräsentation des Such- baums nicht-strikt implementiert, sodass Teilbäume zur Laufzeit unausgewertet bleiben können, bis die darin enthaltenen Lösungen angefordert werden. Zu diesem Zweck werden unausgewertete Suchbäume mittels des Stellvertreter-Musters in der KlasseSTProxyabgebildet, welche auf Anforderung die Auswertung be- ginnen. Nach Abschluss der Auswertung wird das Ergebnis (d. h. der nachfolgende resultierende Knoten) im Attributevaluatedhinterlegt, sodass ein Pfad nicht mehrfach ausgewertet werden muss. Die entsprechende Klassenstruktur ist in Abbildung 1 dargestellt.

ST A

Fail

Value - solution: A

A Exception

- solution: Throwable Choice

- st1: ST<A>

- ce1: Constraint - st2: ST<A>

- ce2: Constraint

- trail1: Stack<TrailElement>

- trail2: Stack<TrailElement>

A

STProxy - evaluated: ST<A>

A

Abb. 1.Klassenstruktur der Suchbaumrepräsentation

Strukturierte Traversierung des Ausf¨uhrungsbaums von Muli-Programmen 3

(12)

3 Strukturierte Traversierung

Die explizite, nicht-strikte Suchbaumrepräsentation dient als Grundlage für die Implementierung unterschiedlicher Traversierungsmethoden. Suchbaumrepräsen- tation und Traversierungsmethoden werden zunächst als Proof-of-Concept in einem Prototyp umgesetzt, der zur Evaluation dient, bevor die eigentliche Muli- Ausführungsumgebung verändert wird.

Der Prototyp simuliert die Ausführung von Bytecode-Instruktionen. Instruk- tionen, die nichtdeterministisch verzweigen, geben einChoice-Objekt zurück.

Instruktionen, die Lösungen erzeugen, gebenValue- bzw.Exception-Objekte zurück. Gleichermaßen wird im Fall eines Fehlschlags einFail-Objekt instantiiert und zurückgegeben.

Strikte Tiefensuche In strikter Tiefensuche wird der Suchbaum durch rekur- siven Abstieg traversiert. Dazu wird eine MethodestrictDFSimplementiert, die einen (i. d. R.) unausgewerteten Teilbaum annimmt und eine Liste von Lö- sungen zurückgibt. FallsstrictDFSauf ein Blatt trifft, wird für Fehlschläge oder Lösungen eine leere bzw. einelementige Liste zurückgegeben. Anderenfalls, d. h. bei Erreichen einesChoice-Knotens, wirdstrictDFSfür jeden Teilbaum rekursiv aufgerufen. Die zurückgegebenen Lösungslisten werden verkettet und wiederum zurückgegeben. Diese Vorgehensweise ist einfach umzusetzen, allerdings ist sie aufgrund der Striktheit ungünstig, falls der Suchbaum des auszuwertenden Suchbereichs unendlich wächst. In diesem Fall kann keine der gefundenn Lösungen an die aufrufende Anwendung zurückgegeben werden.

Nicht-strikte Tiefensuche Die Java Stream API bietet eine Schnittstelle an, die für nicht-strikte Lösungslisten genutzt werden kann. Vor diesem Hinter- grund wird eine KlasseTreeDFSIteratorimplementiert, welche die Schnitt- stellejava.util.Spliteratorimplementiert.TreeDFSIteratorwird von java.util.stream.StreamSupportgenutzt, um einen Stream zu konstru- ieren, welcher Ergebnisse einzeln auf Anforderung auswertet. Bei der Erzeugung einesTreeDFSIterator-Objekts wird der initial unausgewertete Baum überge- ben und in einem Feld gespeichert. Zusätzlich speichertTreeDFSIteratorein Feldnodes, welches einen Stack noch zu betrachtender Teilbäume enthält. Initial ist der unausgewertete Baum das einzige Element des Stacks. Gemäß der Schnitt- stelle vonjava.util.Spliteratorwird eine MethodetryAdvanceimple- mentiert, welcher einConsumerübergeben wird. Beim Aufruf vontryAdvance wird ein unausgewerteter Baum vomnodes-Stack geholt und ausgewertet. Wird ein Lösungsblatt entdeckt, wird der übergebeneConsumermit der gefundenen Lösung aufgerufen undtryAdvanceterminiert. Im Falle eines Fehlschlags wird rekursiv tryAdvance aufgerufen, um unmittelbar zur Suche nach der nach- folgenden Lösung überzugehen. WenntryAdvanceauf einenChoice-Knoten trifft, werden die gefundenen Teilbäume demnodes-Stack hinzugefügt. Anschlie- ßend wird ebenfallstryAdvancerekursiv aufgerufen, um die Suche nach einer Lösung fortzusetzen.

(13)

Nicht-strikte Breitensuche Nicht-strikte Breitensuche wird analog zur nicht- strikten Tiefensuche implementiert. Um über das Anfordern einzelner Lösungen hinweg den Zustand der Traversierung aufrechtzuerhalten, kommt anstelle eines Stacks auszuwertender Teilbäume eine FIFO-Warteschlange zum Einsatz.

Eine HilfsmethodeprintDFSkann verwendet werden, um Suchbäume zu inspi- zieren. Sie überführt den übergebenen (teilausgewerteten) Suchbaum in einen menschenlesbaren String. WiestrictDFSist sie rekursiv implementiert, wertet allerdings unausgewerteteSTProxy-Teilbäume nicht aus, sondern stellt sie als unausgewerteten Teilbaum dar. Dadurch kannprintDFSauch vor der vollstän- digen Auswertung eines Suchbaums verwendet werden, um den Fortschritt der Auswertung zu untersuchen.

Über die hier vorgestellten Strategien hinaus sind weitere Suchstrategien denkbar. Beispielsweise liegt eine Implementierung von Iterative-Deepening- Tiefensuche nahe; weiterhin wäre eine interaktive Suchstrategie denkbar, bei der Nutzer*innen zur Laufzeit entscheiden, in welche Teilbäume abgestiegen werden soll. Weitere Suchstrategien sollen in späteren Arbeiten untersucht werden.

4 Zusammenfassung und Ausblick

Die hier vorgestellte Suchbaumrepräsentation ist für die Strukturierung nichtdeterministischer Ausführung imperativer Programme zur Laufzeit nützlich. Sie kann als Grundlage für die strukturierte Traversierung von Suchbereichen verwendet werden. Dadurch eröffnet sich u. a. die Möglichkeit, Breitensuche über die Ausfüh- rung imperativer Programme mit Seiteneffekten zu implementieren. Desweiteren kann die explizite Repräsentation beim Debugging während der Entwicklung von Muli-Anwendungen behilflich sein, da der Suchbaum händisch untersucht werden kann, z. B. während die Ausführung durch Breakpoints unterbrochen ist.

Unsere prototypische Implementierung steht auf GitHub zur Verfügung.⁵ Sie kann nicht als vollwertige Muli-Ausführungsumgebung eingesetzt werden, demonstriert aber die für die Suchbaumrepräsentation relevanten Aspekte, insb.

Verzweigung und Traversierung. In späteren Arbeiten werden die gesammelten Erkenntnisse in die tatsächliche Muli-Ausführungsumgebung übertragen, um die dort stattfindende nichtdeterministische Ausführung besser zu strukturieren.

Dabei sollen insbesondere Breitensuche und Iterative Deepening als für imperative Programmierung (und somit auch für Constraint-Logic Object-Oriented Programming) neuartige Suchstrategien implementiert und evaluiert werden.

Danksagung Die hier vorgestellten Inhalte entstanden während eines For- schungsaufenthalts des Erstautors an der CAU Kiel. Vielen Dank, insbesondere an Sandra Dylus, für die Organisation des Aufenthalts, sowie an alle Beteiligten für die produktiven Diskussionen.

5https://github.com/Dagefoerde/muli-ST.

Strukturierte Traversierung des Ausf¨uhrungsbaums von Muli-Programmen 5

(14)

Literatur

BHH04. Braßel, Bernd ; Hanus, Michael ; Huch, Frank: Encapsulating Non- Determinism in Functional Logic Computations. In:Journal of Functional and Logic Programming (2004), S. 28

DK19a. Dageförde, Jan C. ;Kuchen, Herbert: A Compiler and Virtual Machine for Constraint-logic Object-oriented Programming with Muli. In:Journal of Computer Languages 53 (2019), S. 63–78. http://dx.doi.org/10.1016/

j.cola.2019.05.001

DK19b. Dageförde, Jan C. ;Kuchen, Herbert: Retrieval of Individual Solutions from Encapsulated Search with a Potentially Infinite Search Space. In:Proceedings of the 34th ACM/SIGAPP Symposium On Applied Computing. Limassol, Cyprus, 2019, S. 1552–1561. http://dx.doi.org/10.1145/3297280.3298912 HPR12. Hanus, Michael ;Peemöller, Björn ;Reck, Fabian: Search Strategies for

Functional Logic Programming. In:Proc. of the 5th Working Conference on Programming Languages (ATPS’12), GI LNI 199, 2012, S. 61–74

(15)

Report on an Experiment in Protocol Verification by Pushdown Automata

Robert Gl¨uck

DIKU, University of Copenhagen

We discuss an experiment to verify the security of ping-pong protocols (Dolev-Yao intruder model) using nondeterministic two-way pushdown automata (2NPDA) [2]. The pushdown automaton checks whether the intersection of a regular language (the protocol to verify) and a Dyck language containing all canceling words is empty. If the two languages have a word in common, an intruder can reveal a secret message sent between trusted users. The verification takes at most cubic time on a memoizing 2NPDA-simulator [1]. The talk also explores the use of nondeterministic programming for solving polynominal-time problems without the typical exponential-time penalty of push-down programs.

References

1. R. Gl¨uck. A practical simulation result for two-way pushdown automata. In Y.-S.

Han and K. Salomaa, editors,Implementation and Application of Automata, volume 9705 ofLNCS, 2016.

2. R. Gl¨uck. An experiment in ping-pong protocol verification by nondeterministic pushdown automata. In J. P. Gallagher et al, editor,Models for Formal Analysis of Real Systems, Verification and Program Transformation, volume 268 ofETPTCS, 2018.

(16)

RustBelt: ein solides Fundament f¨ur Rust

Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, Derek Dreyer

Rust ist eine junge systemnahe Programmiersprache die verspricht, dem scheinbar grundlegenden Trade-off zwischen starken Garantien (wie Typsi- cherheit) und starker Kontrolle (von Ressourcen wie dem Speicher) zu ent- kommen. Rust verwendet ein starkes Typsystem basierend auf dem Kon- zept von exklusivem Eigentum, erlaubt es jedoch Bibliotheken, die Aus- drucksf¨ahigkeit dieses Typsystem durch Verwendung unsicherer Programm- konstrukte zu erweitern. Die Korrektheit der Garantien h¨angt also nicht nur vom Typsystem selber ab, sondern auch von Bibliotheken die behaupten, unsichere Funktion sicher hinter einer Abstraktion zu verbergen.

In diesem Vortrag stelle ich unseren Korrektheitsbeweis im Rahmen des RustBelt-Projekts vor. Dies ist der erste formelle (und maschinenüberprüfte) Beweis eines realistischen Teils von Rust. Unser Beweis ist erweiterbar, in dem Sinne dass er für jede neue Bibliothek, welche unsichere Funktiona- litäten verwendet, festlegt, welche formellen Kriterien sie erfüllen muss, um als korrekte Erweiterung der Sprache zu gelten. Wir haben diese Kriteri- en für einige der wichtigsten Komponenten der Standardbibliothek formell bewiesen.

Das Papier zum Vortrag [1] steht im Web unter https://plv.mpi-sws.

org/rustbelt/popl18/ zur Verf¨ugung.

Literatur

[1] Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers und Derek Dreyer.

” RustBelt: securing the foundations of the rust programming language“.

In: PACMPL 2.POPL (2018), 66:1–66:34. doi: 10.1145/3158154.

(17)

Testfallerzeugung für Informationssysteme

Herbert Kuchen und Andreas Fuchs

Westfälische Wilhelms-Universität Münster, ERCIS [email protected], [email protected]

Abstract

In einem Informationssystem hängen die möglichen Kontroll- und Datenflüsse oft auch von Datenbankinhalten oder genutzten (Web-)Services ab. Um alle Kontroll- und/oder Datenflüsse testen zu können, müssen daher geeignete Datenbankinhalte und Mock-Services mit geeignetem Verhalten generiert werden. Auf dieser Grundlage getestete Komponenten eines Informationssystems werden mit hoher Wahrscheinlichkeit auch dann noch ordnungsgemäß funktionieren, wenn sich Datenbankinhalte oder eingebundene Services ändern. Unser Ansatz basiert auf einer Erweiterung des Testfall-Generators Muggl [3]. Muggl seinerseits verwendet eine symbolischen Auswertung von Java-Bytecode auf einer eigens hierfür entwickelten Symbolischen Java Virtual Machine (SJVM), die neben den aus der JVM bekannten Komponenten wie Frame-Stack und Trail auch über Komponenten verfügt, die aus der von der logischen Programmiersprache Prolog bekannten Warren Abstract Machine stammen, wie Choice-Point-Stack und Trail. Weiterhin wird ein Constraint-Solver eingebunden. Details zu unserem Ansatz findet man in [1] und [2].

Literatur

[1] Andreas Fuchs, Herbert Kuchen: Test-Case Generation for Web-Service Clients.

Proceedings of 33

^rd

ACM Symposium on Applied Computing (SAC), S. 1518-1527, Pau, France, 2018.

[2] Andreas Fuchs, Herbert Kuchen: Unit Testing of Database-Driven Java Enterprise Edition Applications. Proceedings of the 11th International Conference on Tests and Proofs (TAP), S.

59-76, Marburg, Deutschland, 2017.

[3] Tim A. Majchrzak, Herbert Kuchen: Automated Test-Case Generation Based on Coverage

Analysis. Proceedings of the 3rd IEEE International Symposium on Theoretical Aspects of

Software Engineering (TASE), S. 259-266 Tianjin, China, 2009.

(18)

Diminuendo al bottom — clarifying the semantics of music notation by re-modeling

Markus Lepper, Michael Oehler, Hartmuth Kinzler, and Baltasar Tranc´on y Widemann

Abstract. Die Fixierung musikalischer Aktivit¨aten und deren klanglicher Resultate in Partituren ist ein zentrales Moment der Kulturgeschichte seit mindestens zweitausend Jahren. Das heute dominierende System der

”Commen Western Notation” hat sich in den letzten dreihundert Jahren entwickelt, und wird ständig erweitert, um neue Produktionstechniken wiedergeben zu können. Dabei fehlt ihm jedoch völlig jede präzise Spezi- fikation. Vielmehr werden Bedeutungs- und Verwendungsregelen nur in- formell, im Lehrer-Schüler-Verhältnis, weitergegeben. Im Zuge zunehmen- der Automatisierung ist eine Diskussion auf exakteren Grundlagen zwin- gend notwendig.

Wir verwenden dafür (1) einen grundlegenden Trick der ”Strukturthe- orie” in den Naturwissenschaften: Modelle werden unabhängig von der Verwendung entworfen, ohne jeden erklärenden Anspruch, um dann a posteriori von den Fach-Experten verwendet zu werden, oder auch nicht.

(2) Die Methoden des Designs und der Spezifikation von ”Konstruktion- ssprachen” = ”Computersprachen”, besonders die Unterscheidung von Syntax, Semantik und den vermittelnden Funktionen werden zu einer Kette von Transformationen erweitert.

(19)

Java-TX: The Language

Martin Plümicke

Duale Hochschule Baden-Württemberg, Campus Horb Department of Computer Science

Florianstraße 15, D–72160 Horb [email protected]

Abstract. Java-TXis an extension ofJava. The main new features are global type inference and real function types for lambda expressions.

These extentions lead to a new more powerful overloading mechanism, which means that the principal type of a method is an intersection of method types.

1 Introduction

The programming language Java is extended since version 1.5 by many features from functional programming languages. In version 1.5 generics are introduced. Generics are known as parametric polymorphism in functional programming languages. In contrast to functional programming languages asHaskellor SMLobject-oriented languages likeJavaallows subtyping and states of objects.

Therefore the variance of type arguments has to be defined. InPIZZA, the first approach of parametric polymorphism in Java-like languages, the arguments were declared as invariant, which means forVector<T>≤^∗1Collection<T>and Integer≤^∗Objectholds Vector<Integer>≤^∗Collection<Integer> but neither

Vector<Integer>≤^∗ Collection<Object> (covariance) nor

Vector<Object> ≤^∗ Collection<Integer>(contravariance)

is correct. Invariance of type arguments is a hard restriction. Therefore inJava 5.0 so-called wildcards were introduced. Wildcards allow in some cases covariance and contravariance, respectively.

InJava 8lambda expressions were introduced, but no function types. The types of lambda expressions are defined as target types, which are functional interfaces (essentially interfaces with one method).

Local type inference was introduced in the versions five, seven, and nine. In Java 5.0the automatic determination of parameter instance was introduced. In Java 7the diamond operator was introduced. InJava 9, finally, the var keyword for types of local variables was introduced.

The features global type inference (no type type declarations are necessary without lossing static typing), real function types, and pattern-matching are not addressed inJavasince now.

1 ≤^∗ stands for the subtyping relation.

(20)

InJava-TXwe address global type inference and real function types.

The paper is organized as follows: First we describe in Section 2 the global type inference in Java-TX. After that in Section 3 we describe the introduction of real function types. In Section 4 we describe two additional features, a powerful overloading mechanism and the inference of generics, which follow both from the type inference algorithm. In the last section we conclude and give an outlook.

2 Globale type inference for Java

Global type inference allows to leave out all type annotations. Similar as in functional programming languages likeHaskellthe compiler determines a principal typing, such thatJava-TXis statically typed as originalJava[Plü14].

Let us consider first a simple example.

Example 1. The following program is given:

1 import java . lang . Integer ; 2

3 public c l a s s Fac { 4

5 getFac (n){

6 var res = 1;

7 var i = 1;

8 while(i <= n) {

9 res = res * i;

10 i ++;

11 }

12 return res ;

13 }

14 }

It is the simple iterativ implementation of the faculty function. The return and the argument type of getFacare leaved out. The type inference algorithm has to infer the types. The types are determined by the declaration in line 6 and the overloaded operator* in line 9. To reduce the complexity of the type inference algorithm for overloaded operator just as overloaded methods only types are inferred which are explicitly imported by the keyword import. Therefore for getFacthe typing

java . lang . Integer getFac ( java . lang . Integer n) is inferred.

Let us consider a second more complex example.

Example 2. The following program is given:

12 Martin Pl¨umicke

(21)

import java . util . Vector ;

c l a s s Matrix extends Vector<Vector<Integer>> { mul (m) {

var ret = new Matrix ();

var i = 0;

while(i < size ()) {

var v1 = this.elementAt(i );

var v2 = new Vector < Integer >();

var j = 0;

while(j < v1 . size ()) { var erg = 0;

var k = 0;

while(k < v1 . size ()) {

erg = erg + v1 .elementAt(k) *

m.elementAt(k ).elementAt(j );

k ++; }

v2 .addElement(new Integer ( erg ));

j ++; }

ret .addElement( v2 );

i ++; } return ret ; } }

The classMatrixis implemented as an extention ofVector<Vector<Integer>>.

The methodmulimplements the multiplication of two matrices.

The obvious typing would be Matrix mul ( Matrix m)

The question is, if this typing is the only possible typing. If not then it is the question if it is thebest typing.

It is simple to see that there are other correct typings. E.g.

Matrix mul ( Vector < Vector < Integer >> m) and

Vector < Vector < Integer >> mul ( Vector < Vector < Integer >> m) is correct. But

Matrix mul ( Vector <? extends Vector <? extends Integer >> m) is also correct.

If we summarize all correct typings of multo an intersection the type of mulis given in Fig. 1.

We remember the usual subtyping defintion for function types.

Definition 1 (Subtyping relation ≤^∗ on function types). For two given functions types(τ1, . . . , τn)→τ and(θ1, . . . , θn)→θ0 holds

(τ1, . . . , τn)→τ≤^∗(θ1, . . . , θn)→θ0

ifθi≤^∗τi andτ≤^∗θ.

Java-TX: The Language 13

(22)

mul : Matrix→Matrix

&Vector<Vector<Integer>>→Matrix

&Vector<Vector<Integer>>→Vector<Vector<Integer>>

&Vector<Vector<? extends Integer>>→Matrix

&Vector<Vector<? extends Integer>>

→Vector<Vector<Integer>>

&Vector<? extends Vector<Integer>>→Matrix

&Vector<? extends Vector<Integer>>

&Vector<? extends Vector<? extends Integer>>→Matrix

&Vector<? extends Vector<? extends Integer>>

Fig. 1.Intersection type of mul

Regarding this definition we should define the intersection type of all maximal elements as thebest type.

In the above example Vector<? extends Vector<? extends Integer>> → Matrix is the supertype of all other types. Therefore this should be the best type of mul.

In addition, we consider the declaration of aprincipal type for functional programs [DM82]: A type-scheme for a declaration is a principal type-scheme, if any other type-scheme for the declaration is a generic instance of it.

Finally we combine the ideas of best types in Javawith the principal type of functional programs and give a definition for a principal type ofJavamethods:

An intersection type-scheme with minimal number of elements for a declaration is a principal type-scheme, if any other type-scheme for the declaration is a subtype of generic instance of one element of the intersection type-scheme.

In the following we give the formal definition.

Definition 2 (Principal types ofJavamethods). An intersection type of a methodm, which contains no overloaded call (for overloading cp. Section 4.1)

m: ((θ1,1, . . . , θ1,n)→θ1)

&. . .&

((θm,1, . . . , θm,n,)→θm)

is called principalif for any correct type annotated method declaration rty m(ty1 a1, . . . ,tyn an){ . . .}

there is an element

((θi,1, . . . , θi,n,)→θi)

of the intersection type and there is a substitutionσ, such that 14 Martin Pl¨umicke

(23)

σ(θi)≤^∗rty,ty1≤^∗σ(θi,1), . . . ,tyn≤^∗σ(θi,n) and the number of elements of the intersection type is minimal.

The corresponding type inference algorithm wie gave in [Plü14] and the under- laying unification is given in [Plü09,SP18].

3 Real function types

IndeedJavaallows lambda expression, but there are no function types. Instead there are functional interface as target types of lambda expressions. There are many disadvantages because of the missing of function types.Java-TXsolves this disadvantages by introducing function types similar as inScalawithout loosing the convenience of functional interfaces as target type of lambda expressions [PS17].

InJava 8function types are simulated in the package:java.util.function:

public interface Function <T ,R > { R apply (T t );

}

public interface BiFunction <T ,U ,R > { R apply (T t , U u );

}

There are some inconveniences.

Subtyping

Although for subtypes holds (cp. Def. 1)

(T₁⁰, . . . , T_N⁰ )→T0 ≤^∗ (T1, . . . , TN)→T₀⁰, iffTi≤^∗T_i⁰ for the functional interfaceFunction

Function<T1⁰, T0> ≤^∗ Function<T1, T₀⁰>, forTi^∗T_i⁰,

is not correct, asJavahas use-side variance. Therefore arguments of types without wildcards are invariant.

Example 3. ForInteger ≤^∗ Number ≤^∗ Objectholds:

Number→Number ≤^∗ Integer→Object but

Function<Number,Number> f_NumNum= . . .

Function<Integer,Object> f_IntObj = f_NumNum iswrong!, as

(24)

Function<Number,Number> 6≤^∗ Function<Integer,Object>

This problem could by solved by wildcards. It holds

Function<T1⁰, T0> ≤^∗ Function<?super T1,?extends T⁰0>, forTi^∗T_i⁰. This means

Function<?super Integer,?extends Object> f_IntObj = f_NumNum is correct.

Direct application of lambda expressions

In theλ-calculusβ-conversion (direct application of a lambda expression to its arguments) is possible:

(λ x.E)arg=E[x/arg].

InJava 8this lambda term would have the following form:

(x -> h(x)).apply(arg);

Such expressions are not permitted. As the lambda expression has no explicite type it is not obvious if the methodapplyexists, at all. This problem could be solved by introducing a type-cast.

((Function<T,R>)x -> h(x)).apply(arg);

Summary

The Drawbacks of missing function types are all solved inJava 8:

Missing function types: The function types are replaced by the functional interfacesBi/Functionin the packagejava.util.function.

Subtyping problem: The problem that the functional interface’s behaviour differ from the usual definition of subtyping is solved by using wildcards.

Impossibility of direct application of lambda expressions: The impossibility to apply an lambda expression directly to its arguments is solved by using type-casts.

This means all problems are solvable, but the solution are not pretty. Therefore we introduced real function types inJava-TX. We extendedJavaby two sets of special functional interfaces

interface FunN$$ <-T1 , ... , -TN,+R > { R apply ( T1 arg1 , ... , TN argN);

}

interface FunVoidN$$ <-T1 , ... , -TN> { void apply ( T1 arg1 , ... , TN argN);

}

(25)

where

– FunN$$<T⁰1, . . . ,T⁰_N,T0>≤^∗FunN$$<T1, . . . ,TN,T⁰0>iffTi≤^∗T⁰_i – InFunN$$ no wildcards are allowed.

The Lambda–expressions are explicitly typed by FunN$$–types.

Example 4. let us considering the following changed matrix program:

public c l a s s MatrixOP extends Vector < Vector < Integer >> { mul = (m1 , m2 ) -> {

var ret = new MatrixOP ();

var i = 0;

while(i < m1 . size ()) { var v1 = m1 . elementAt (i );

var v2 = new Vector < Integer >();

var j = 0;

while(j < v1 . size ()) { var erg = 0;

var k = 0;

while(k < v1 . size ()) {

erg = erg + v1 . elementAt (k) * m2 . elementAt (k ). elementAt (j );

k ++; }

v2 . addElement ( erg );

j ++; }

ret . addElement ( v2 );

i ++;}

return ret ;}

InJava 8there are two possibilities to type the fieldmul:

– Usingjava.util.function.*:

mul: BiFunction<? super Vector<? extends Vector<? extends Integer>>,

? super Vector<? extends Vector<? extends Integer>>,

? extends MatrixOP>

This type declaration is less readable. Especially the mixture of super- und extends-wildcards in the second argument are very curious.

– Defining an own functional interface MatrixOperation:

interface MatrixOperation { MatrixOP

apply ( Vector <? extends Vector <? extends Integer >> arg1 , Vector <? extends Vector <? extends Integer >> arg2 );

}

mul: MatrixOperation

This type declaration is very short, but the typeMatrixOperationhides the most informations.

(26)

In contrast,Java-TXinfers the function type:

mul: Fun2$$<Vector<? extends Vector<? extends Integer>>, Vector<? extends Vector<? extends Integer>>, MatrixOP>

This type is complex, too. But the arguments of the function typeFun2$$have no wildcards. This reduces the confusion.

4 Additional features

There are some positiv side-effects from the type inference. We will consider two of them in this section. The first is a powerful overloading mechanism and the second is the possibility to generalize type variables from the result of the type inference algorithm.

4.1 Overloading

The idea The following example shows the overloading mechanism.

Example 5. Let the classesOLandOLMainbe given.

c l a s s OL {

m(x) { return x + x; } m(x) { return x || x; } }

c l a s s OLMain { main(x) {

var ol = new OL ();

return ol .m(x );

} }

The type of the first methodmis:

m : Integer→Integer

&Double→Double

&String→String,

as+is an overloaded operation symbol. The second methodmhas the type m:Boolean→Boolean.

In classOLMainan instance of OLis created and on the instance the overloaded methodmis called. This means thatmainhas all four types of the both methods m:

main : Integer→Integer

&Double→Double

&String→String

&Boolean→Boolean 18 Martin Pl¨umicke

(27)

Extended principal type definition In Definition 2 principal types are defined restricted to non overloaded calls. For this let us consider the following example.

Example 6. Let the followingJavaprogram be given:

import java.util.Vector;

import java.util.Stack;

class Put {

<T> putElement(T ele, Vector<T> v) { v.addElement(ele);

}

<T> putElement(T ele, Stack<T> s) { s.push(ele);

}

main(ele, x) { putElement(ele, x);

} }

The inferred intersection type of mainis

main:T×Vector<T>→void& T×Stack<T>→void.

With Def. 2 the principal would beT×Vector<T>→void. This would not be the principal type as the stack’s application would disappear. Therefore we have extend the definition.

Definition 3 (Principal type of Java methods with overloading). An intersection type of a methodm

m: ((θ1,1, . . . , θ1,n)→θ1)

&. . .&

((θm,1, . . . , θm,n,)→θm)

is called principal if the number of elements of the intersection is minimal and for any correct type annotated method declaration

rty m(ty1 a1, . . . ,tyn an){ . . .}

there is an element((θi,1, . . . , θi,n,)→θi)of the intersection type and there is a substitutionσ, such that

σ(θi)≤^∗rty,ty1≤^∗σ(θi,1), . . . ,tyn≤^∗σ(θi,n)

and the call-graphs of(θi,1, . . . , θi,n,)→θi and(ty1, . . . ,tyn)→rtyare equal.

(28)

Example 7. Continuing Example 6 the principal type ofmainis main:T×Vector<T>→void& T×Stack<T>→void,

as the call graphs ofT×Vector<T>→voidandT×Stack<T>→voiddiffer.

A detailed regard to call graphs and the corresponding algorithm can be found in [Plü08].

Heterogenous translation

Example 8. Let us consider the followingJavaprogram:

c l a s s OLFun { m(f , x) {

x = f. apply (x+x );

return x;

} }

We get for the arguments of mthe following typings:

Fun1$$<Double, Double>×Double→Double&

Fun1$$<Integer, Integer>×Integer→Integer&

Fun1$$<String, String>×String→String.

InJava–bytecode the arguments of the generic types are not considered (type- erasure). Indeed in bytecode the arguments are contained, but they are used only for the typecheck. TheJVM considers only the descriptions. For the class OLFunthe method headers in bytecode looks like this:

public java . lang . Double m( Fun1$$ < java . lang . Double , java . lang . Double >, java . lang . Double );

descriptor: (LFun1$$;Ljava/lang/Double;)Ljava/lang/Double;

public java . lang . Integer m( Fun1$$ < java . lang . Integer , java . lang . Integer >, java . lang . Integer );

descriptor: (LFun1$$;Ljava/lang/Integer;)Ljava/lang/Integer;

public java . lang . String m( Fun1$$ < java . lang . String , java . lang . String >, java . lang . String );

descriptor: (LFun1$$;Ljava/lang/String;)Ljava/lang/String;

This overloading is similar as in Example 5 no problem, as the method call can be resolved by the second argument.

But if we erases the second argument c l a s s OLFun {

x;

m(f) {

x = f. apply (x+x );

returen x;

} }

(29)

the type-erasure is a problem, as the method headers in bytecode are:

public java . lang . Double m( Fun1$$ < java . lang . Double , java . lang . Double >);

descriptor: (LFun1$$;)Ljava/lang/Double;

public java . lang . Integer m( Fun1$$ < java . lang . Integer , java . lang . Integer >);

descriptor: (LFun1$$;)Ljava/lang/Integer;

public java . lang . String m( Fun1$$ < java . lang . String , java . lang . String >);

descriptor: (LFun1$$;)Ljava/lang/String;

}

Now no method resolving is possible as all three methods have the same argument Fun1$$.

This problem could only be solved by heterogeneous translations, which pre- serves the arguments in the descriptors. In [ORW00] an idea for heterogeneous translation was given.

4.2 Generalized type variables

Similar as in type inference of functional programming languages, free type variables, which are not instanced by other types after type inference are generalized to generics. In comparison to functional programming languages inJavasubtyp- ing leads to a more powerful generalizations mechanism.

Example 9. Let us consider the identity function.

c l a s s Id {

id (x) { return x; } }

What is the type ofid? In an approach without subtyping the result would be

<T> T id (T x)

The result of our type inference algorithm is: For the following typing by fresh type variables (type-placeholders)

c l a s s Id { K id (L b )({ return (b ):: L; }):: M}

the result is{ {(L<K)} }. This means that the following class is generated in bytecode:

c l a s s Id {

<L extends K , K > K id (L x) { return x; } }

Let us consider an example with a field declared as a lambda expression.

Example 10. Let the identity function be given as a lambda expression assigned to a field of a class:

c l a s s lambdaId { lambdaId = x -> x;

}

(30)

The result of the type inference algorithm is given as c l a s s lambdaId {

K lambdaId = (( x )::CIC -> (( x )::CIC)::L)::P;

} with

{{(K=Fun1$$<CIC,CID>),(P=Fun1$$<L,BPP>), (CIC<L),(L<BPP),(BPP<CID)}}

In this example the free type variables are not restricted to a method as in Example 8. The type variables are valid in the whole class, as the fieldlambdaId is valid in the whole class, too. Therefore the free type variables become generics of the class. This means the following class is generated in bytecode:

c l a s s lambdaId <CIC extends CID, CID> { Fun1$$CIC, CID> lambdaId = x -> x;

}

5 Implementation

A prototypical compiler for Java-TX has been implemented. The compiler is written inJava, itself. Additionally, an eclipse plugin is presented, which allows to use the ful convenience ofJavatype inference.

(https://www.hb.dhbw-stuttgart.de/javatx)

6 Summary and Outlook

Javahas been developed in the last year such that concepts from the functional programming languages have been introduced. In the paper we presented an extension of Java, called Java-TX. Java-TX continues the range of introducing functional programming language features intoJava. We added the feature of global type inference. Global type inference means thatJavaprograms can be written without any type annotation.Java-TXpreserves static typing by a type inference algorithm, which infers a principal type for all fields and methods.

We discussed the principal type property inJavawith and without overloading.

The second main extension ofJava-TXis the introduction of real function types.

We gave some examples which shows that the lack of real function is very harm- ful. We introducedScala–like function types. For lambda expressions we defined these function types as explicit types. At once we preserved the concept of target typing for functional interfaces.

Finally we presented two side-effects from the extensions. First, we showed that one function declaration can be overloaded by different types. Second, we presented the generalization concept for free type variables, which is more powerful than in functional programming languages.

(31)

We showed that the overloading mechanism has some restrictions, caused by the type-erasure (erasure of argument types in parameterized types during compi- lation). Therefore we plan to realize an approach of heterogeneous translation.

There are some approaches. One approach is given inPIZZA[ORW00]. A further approach is used beC#[C#17]. There are possibilities following the ideas of [UTO13].

Another feature from functional programming languages which had not been introduced intoJavais pattern matching. There is an approach inPIZZA[OW97], too. This approach is restricted to switch-case statements. In combination with type inference more flexible approches could be possible.

References

C#17. C# language specification, 2017.

DM82. Luis Damas and Robin Milner. Principal type-schemes for functional programs.Proc. 9th Symposium on Principles of Programming Languages, 1982.

ORW00. Martin Odersky, Enno Runne, and Philip Wadler. Two Ways to Bake Your Pizza – Translating Parameterised Types into Java.Proceedings of a Dagstuhl Seminar, Springer Lecture Notes in Computer Science, 1766:114–132, 2000.

OW97. Martin Odersky and Philip Wadler. Pizza into Java: Translating theory into practice. InProceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’97, pages 146–159, New York, NY, USA, 1997. ACM.

Plü08. Martin Plümicke. Intersection Types inJava. In Luís Veiga, Vasco Amaral, Nigel Horspool, and Giacomo Cabri, editors,6th International Conference on Principles and Practices of Programming in Java, volume 347 of ACM International Conference Proceeding Series, pages 181–188, September 2008.

Plü09. Martin Plümicke. Java type unification with wildcards. In Dietmar Seipel, Michael Hanus, and Armin Wolf, editors,17th International Confer- ence, INAP 2007, and 21st Workshop on Logic Programming, WLP 2007, Würzburg, Germany, October 4-6, 2007, Revised Selected Papers, volume 5437 of Lecture Notes in Artificial Intelligence, pages 223–240. Springer- Verlag Heidelberg, 2009.

Plü14. Martin Plümicke. More type inference in Java 8. InPerspectives of Systems Informatics - 9th International Andrei Ershov Memorial Conference, PSI 2014, St. Petersburg, Russia, pages 168–174, 2014. Preliminary Proceedings.

PS17. Martin Plümicke and Andreas Stadelmeier. Introducing Scala-like function types into Java-TX. InProceedings of the 14th International Conference on Managed Languages and Runtimes, ManLang 2017, pages 23–34, New York, NY, USA, 2017. ACM.

SP18. Florian Steurer and Martin Plümicke. Erweiterung und Neuimplemen- tierung der Java Typunifikation. In Jens Knoop, Martin Steffen, and Bal- tasar Trancón y Widemann, editors,Proceedings of the 35th Annual Meeting of the GI Working Group Programming Languages and Computing Concepts, number 482, pages 134–149. Faculty of mathematics and natural sciences, UNIVERSITY OF OSLO, 2018. ISBN 978-82-7368-447-9, (in german).

UTO13. Vlad Ureche, Cristian Talau, and Martin Odersky. Miniboxing: improving the speed to code size tradeoff in parametric polymorphism translations. In OOPSLA, pages 73–92, 2013.

(32)

Combinator 2: Judgement Day

On the Design and Evaluation of an Expressive Parser Combinator Language Embedded in Java

Baltasar Tranc´on y Widemann and Markus Lepper

<semantics/>GmbH, Berlin, DE

Abstract

The theory of syntax analysis of computer languages allows tools to construct parsers for certain classes of context-free grammars completely automatically, based on global analysis. While these are algorithmically elegant and efficient, the input formats for such tools remain remarkably primitive;

support for abstractions and extensions of the feature set is at best rudimentary.

In the field of declarative programming languages in particular, the radically different approach of parser combinators enjoys some popularity. There no grammars exist, and therefore neither do global analyses. Constrastingly, each grammar expression is already a full-fledged parser on the one hand, and a first-class citizen of the host language on the other. Complex parsers are constructed from simpler ones by libraries of perfectly normal operations of the host language. Thus parser combinators as an embedded domain-specific language inherit the full expressiveness of their programming environment.

The projectRamus[1] investigates several wider aspects of syntax definition from the perspective of modern software engineering. In this scope, parser combinators are employed for direct programming, as well as in the role of a backend language for code generation from grammar-based formats.

Ramuscombinators are implemented in Java and support powerful functionality: non-determinism, pervasive diagnostics, transparent semantic actions, a procedural sublanguage for building abstract syntax models, type-safe evaluation, local self-optimization heuristics.

We discuss the design of theRamuscombinator language and evaluate its use pragmatically in the light of several realistic applications.

References

[1] Baltasar Tranc´on y Widemann and Markus Lepper. “Syntaxanalyse auf Wiedervorlage”. In:Proceedings 17. Kolloquium Programmiersprachen (KPS 2013). Ed. by R. Picht and W. Zimmermann. Technical Report 2014/02. Institute of Computer Science, Martin-Luther-Universit¨at Halle-Wittenberg, 2013, pp. 151–154.

(33)

Eclipse-Plug-in for developing in Java-TX

Michael Uhl, Duale Hochschule Baden-Württemberg

1. Scope

The scope of the contribution was to extend an existing Java-TX plug-in. The plug-in can call the Java-TX compiler to generate Java bytecode from Java-TX source, shows errors in the code and does type inserts. A type insert is to choose a possible type for a given token (class attribute, method parameter, method return value, local variable).

2. Structure and automatic build

The first task was to create a maven build for the plug-in. The maven plug-in „Tycho“ was used for this. To apply the build the structure of the project was adjusted to conform the

recommendations.

3. Type-Insert

The further development was to avoid calling the Java-TX compiler after a type insert was made.

An executed type insert only narrows the possibilities for further type inserts. Therefore the type insert possibilities can be reduced after a type insert was made.

Abb. 1: The Eclipse-Plugin for Java-TX

(34)

D2d — XML for Authors

(Technical Report – Bad Honnef 2013)

Markus Lepper¹and Baltasar Tranc´on y Widemann¹

<semantics/>GmbH, Berlin [email protected]

1 Introduction, Principles of d2d

XML is a de facto standard for the encoding of semi-structured text corpora. Its practical application ranges from mere technical configuration data to web sites with entertaining contents.

Both notions, “XML” and “text”, stand here for very different things: On one side the organization of the internal computable text model as a tree structure with its standardized update and retrieval methods (“W3C DOM”), and the family of tools operating on these (implementing “XSLT”, “XQuery”, etc.). On the other side its external representations: unicode text files containing a lot of

“angled braces”, their decoding ruled by a historically grown collection of hardly understandable, non-compositional quoting rules.

All this hinders the creation of XML encoded texts in the creative flow of authoring. Neither are syntax controlled editors, which support tagging by menu driven selections, auto completion, automated coloring and indenting, a solution for all those authors which are used to “writing” as a creative, flowing, intuitive and intimate process, indirect contact with that mysterious thing called text.

So far XML appears inappropriate for this kind of authoring situation. Nev- ertheless often its use is highly desirable: Technical documentations, cookbooks, screen plays, song lyrics, scientific analyses, even multi-volume fantasy novels can profit extraordinarily by only little interspersed mark-up.

This is the starting point of the “d2d” project. It stands for “directly to document” or “direct document denotation”, hence pronounced “triple dee”, and it tries to close this gap. It is both, a text format which realizes XML mark- up in a very unobtrusive way, and it is a software system which implements parsing, translating, parser definitions, documentation, user guidance, etc.

It is based on a simple idea, the realization of which turned out to be sur- prisingly complex, and has been driven on by the authors for now more than ten years.¹The main characteristics are:

– Simple way of writing and good readability by humans (without the need for any dedicated tool) as well as by machines.

– All tags marked with one single, user-defined character.

– Inference of (nearly all) closing tags.

1A rather early version is described in [4]; full documentation can be found at [6].

(35)

– Inference also of opening tags by a second, character-based level of parsers, used for small, highly structured data entities, interspersed in flow text.

– Support of standard text format definition formats (e.g. W3C DTD).

– Own language for text format definition (required at least for the character level parser definition). It employs free rewriting for parametrization and re-use of modules, and multi-lingual user documentation.

Sod2dis a concept, a format and a software system which addresses domain experts and enables them to write XML compatible texts and potentially opens to them the whole world of XML based processing. D2d has been sucessfully employed in the very diverse fields of technical documentation, book keeping, web content creation, interactive music theory, etc.

2 The d2d Parsing Process

2.1 Principles of thed2d Parsing Process

The process of reading a text file, interpreting it as conforming to particular text format definition in thed2dformat, and constructing the corresponding internal model, is called “D2d parsing”. This model can be written out as an external representation according to the XML standard [1]. The implementation of the d2dtool also allows to process this model directly, e.g. using a collection of XSLT rules to derive other models to be written out. The parsing process is controlled by the chosen root element. Element definitions can have tagged or character content models:

In the tagged case, the resulting sub-tree of the model is constructed according to the tags appearing in the input text. Tags are marked by a single user-defined lead-in character, which defaults to “#”. Closing tags can in most cases be omitted, because the parser uses simply LL(1) strategy: Whenever an opening tag can be accepted on some currently open higher level in the result tree, all intervening closing tags are supplied automatically. Nevertheless explicit closing tags may be added to resolve ambiguities, to close more than one stack level, or to increase the readability of the source.

Character based parsers accept plain character data. All tagging is added automatically as defined in the applicable parser rules. The basic strategy is a non-deterministic longest match. Thus this mechanism performs well for short input data, e.g. some ten lines of MathML. In practice this covers most instances of structured entities, interspersed in flow text. Besides, non-deterministic rules are much easier definable by computer language laymen.

2.2 File Sections

Every file to be processed by thed2d tool may start with sections containing local definitions. These have the form

α #d2d 2.0 module µ

D2d — XML for Authors 27

(36)

C=// allowed input characters I=// allowed identifiers D::=chars(C^?) | comment(C^?) | eof

| error(C^?) | warning(C^?)

| OPENi:I | CLOSEi:I | CLOSE^F_{i:(I∪{ })} | EMPTYi:I | EMPTY^F_i:I tokenize:C^?→D^?

: (D^?×C^?×(C×I)^?)→(D^?×C^?×(C×I)^?) (hi, α,hi) (δ,hi,hi)

tokenize(α) =δ Parenths : C 9→C

Parenths = {’(’7→’)’,’<’7→’>’,’[’7→’]’,’{’7→’}’,’.’7→’.’’!’7→’!’,

’\’7→’\’,’:’7→’:’,’$’7→’$’,’↑’7→’↑’}

α, α⁰⁰∈C^? α⁰⁰6=hi δ∈D^? γ∈(C\ {^NL})^? γ⁰6= a∗(a κ∈dom Parenths ∧ κ⁰=Parenths(κ)

ζ∈(0|1|. . .|9),(0|1|. . .|9|a|. . .|f)^? ∧ ζ⁰=char(parseInt(ζ)) α⁰∈(C\ {#, κ⁰})⁺ ∧ α⁰6= a((a ∧ α⁰6= a(∗a (δ,#(#| |^NL)⁺aα, π) (δ, #Iα, π)

(δ,( (γ^N_L Iα, π) (δJcomment(γ), ^N_L Iα, π) (δ,(∗γ⁰∗(Iα, π) (δJcomment(γ⁰), α, π) (δ,#someTag( |^NL)Iα, π) (δJOPENsomeTag, α, π)

(δ,#someTagκIα, π) (δJOPENsomeTag, α,(κ⁰,someTag)Iπ) (δ, κ⁰Iα,(κ⁰,someTag)Iπ) (δJCLOSEsomeTag, α, π)

(δ,#someTag/Iα, π) (δJEMPTYsomeTag, α, π) (δ,#someTag///Iα, π) (δJEMPTY^F_someTag, α, π) (δ,#someTagIα, π) (δJOPENsomeTag, α, π)

(δ,#ζ/Iα, π) (δJchars(ζ⁰), α, π)

(δ,#ζIα, π) (δJchars(ζ⁰), α, π)

(δ,#/someTag( |^NL)Iα, π) (δJCLOSEsomeTag, α, π) (δ,#///someTag( |^NL)Iα, π) (δJCLOSE^FsomeTag, α, π) (δ,#/( |^NL)Iα, π) (δJCLOSE, α, π) (δ,#/someTagIα, π) (δJCLOSEsomeTag, α, π) (δ,#///someTagIα, π) (δJCLOSE^FsomeTag, α, π)

(δ,#/Iα, π) (δJCLOSE, α, π)

(δ, α⁰Iα,(κ⁰, )Iπ) (δJchars(α⁰), α,(κ⁰, )Iπ) (δ,#eofIα, π6=hi) (δJwarning(“pending parentheses”),

#eofIα,hi)

(δ,#eofIα⁰⁰,hi) (δJwarning(“discarding trailing characters”),

#eof,hi)

(δ,#eof,hi) (δJeof, hi,hi)

(δ,hi, π) (δJerror(“premature end of file”),hi,hi) Table 1.Basic Data and Tokenization

28 Marcus Lepper und Baltasar Tranc´on y Widemann

(37)

(In this paragraph “ ” stands for non-empty sequences of whitespace and new- lines.)αis a prefix not containing such a section, and will be discarded totally.

This allowsd2dinput to be contained in arbitrary documents, like e-mails etc.

µmust be a valid module definition in the ddf format (see Section3). It will be parsed and the contained definitions can be used immediately in the following text corpus.

Zero to many such local definition sections can be contained in an input file.

Finally it has to follow either

α #d2d 2.0 text using m: t

In this casemis the name of a module, andtis the name of a tag parser definition from m. This is used as the topmost element for the document structure to be parsed and thus defines the initial state of the parsing process. The other possibility is

α #d2d 2.0 xslt text producing m:t

In this case an XSLT source will be parsed, and the module and tag do identify the top-level element of the output to be generated by the XSLT code.

In both cases, the rest of the file immediately after the “t” is the text corpus input, fed to thed2dparser, up to a final explicit “#eof”.

2.3 Tokenization

The function tokenize in Table 1 defines the next step for processing the text corpus data (not the local module definitions), namely converting the stream of characters into a stream of tokens. The comment lead-in character “(” and the command character “#” can be re-defined by the user, and default to “/”

and “#”, resp. The tokenization process is defined by applying alongest prefix matchdiscipline to the transformation rules given for “ ”. The closing and empty tags with three slashes mark those elements which are intentionally left incomplete by the user. The reaction of the tools in these cases is configurable.

The tokenization level supports a limited set of one character parentheses, as known from sed’s “s%...%...%” and L^ATEX’s “\verb%...%” syntax. It is unre- lated to the parser level, which can cause funny effects, but which nevertheless has turned out to be the cleanest way to define.

2.4 Tag Based Parsing

The second step, parsing, is to convert a sequences of tokens fromD^?to a single node from N, as defined by the function text2tree() in Tables 2 and 3.² This node represents the top-most element of the resulting document model, which can later be shipped out to standard XML file format, or further processed by XSLT, as soon as completed.

Parsing always starts in tag mode, i.e. looking for explicit tags “OPENi” in the token stream. Character data is treated as if tagged with an implicit pseudo- tag “#chars”. The tag parsing process is a stack-controlled recursive descent

2 The formulas in this tables have been published in [3].

D2d — XML for Authors 29