High-Performance Cross-Language Interoperability in a Multi-language Runtime

Matthias Grimmer

Johannes Kepler University Linz,

Austria

[email protected]

Chris Seaton

Oracle Labs, United Kingdom

chris.seaton@oracle.com

Roland Schatz

Oracle Labs, Austria

roland.schatz@oracle.com

Thomas Würthinger

Oracle Labs, Switzerland

thomas.wuerthinger@oracle.com

Hanspeter Mössenböck

Johannes Kepler University Linz, Austria

hanspeter.moessenbo[email protected]

Abstract

Programmers combine different programming languages because it allows them to use the most suitable language for a given problem, to gradually migrate existing projects from one language to another, or to reuse existing source code. However, existing cross-language mechanisms suffer from complex interfaces, insufficient flexibility, or poor performance.

We present the TruffleVM, a multi-language runtime that allows composing different language implementations in a seamless way. It reduces the amount of required boilerplate code to a minimum by allowing programmers to access foreign functions or objects by using the notation of the host language. We compose language implementations that translate source code to an intermediate representation (IR), which is executed on top of a shared runtime system. Language implementations use language-independent messages that the runtime resolves at their first execution by transforming them to efficient foreign-language-specific operations. The TruffleVM avoids conversion or marshaling of foreign objects at the language boundary and allows the dynamic compiler to perform its optimizations across language boundaries, which guarantees high performance. This paper presents an implementation of our ideas based on the Truffle system and its guest language implementations JavaScript, Ruby, and C.


DLS ’15, October 25-30, 2015, Pittsburgh, PA, USA.

 2015 ACM 978-1-4503-3690-1/15/10. . . $15.00.

http://dx.doi.org/10.1145/2816707.2816714

Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors - Run-time environments, Code generation, Interpreters, Compilers, Optimization

Keywords cross-language; language interoperability; virtual machine; optimization; language implementation

1. Introduction

The likelihood that a program is entirely written in a single language is lower than ever [8]. Composition of languages is important for three main reasons: programmers can use the most suitable language for a given problem, can gradually migrate existing projects from one language to another, and can reuse existing source code.

No programming language is best for all kinds of problems [2, 8]. High-level languages allow representing a subset of algorithms efficiently but sacrifice low-level features such as pointer arithmetic and raw memory accesses. A typical example is business logic written in a high-level language such as JavaScript that uses a database driver written in a low-level language such as C. Cross-language interoperability allows programmers to pick the most suitable language for a given part of a problem. Existing approaches cater primarily to composing two specific languages, rather than arbitrary languages. These pairwise efforts restrict the flexibility of programmers because they have to select languages based on the cross-language interfaces that happen to be available.

Cross-language interoperability reduces the risks when migrating software from one language to another. For example, programmers can gradually port existing C code to Ruby, rather than having to rewrite the whole project at once. However, existing approaches (e.g., Ruby’s C extension mechanism) require the programmer to write wrapper code to integrate foreign code in a project, which adds a maintenance burden and also distracts the programmer from the actual task at hand.

Finally, cross-language interoperability allows reusing existing source code. Due to the large body of existing code, it is not feasible to rewrite existing libraries in a different language. A more realistic approach is to use a cross-language interoperability mechanism that allows reusing this existing code. However, existing solutions convert data at the language border [34], use generic data representations that cannot be optimized for an individual language [35], or cannot widen the compilation scope across languages [24], each of which introduces a runtime overhead. Programmers have to sacrifice performance if components written in different languages are tightly coupled.

In this paper we present the TruffleVM, a multi-language runtime that composes individual language implementations that run on top of the same virtual machine and share the same style of IR. Multi-language applications can access foreign objects and can call foreign functions by simply using the operators of the host language, which makes writing multi-language applications easy. In our system, a multi-language program just uses different files for different programming languages. The TruffleVM makes language boundaries mostly invisible because the runtime implicitly bridges them. Hence, we can reduce the amount of required boilerplate code to a minimum. For example, the JavaScript statement in Figure 1 can access the C structure obj as if it were a regular JavaScript object. Programmers should need to fall back to an API, and thus to an explicit foreign object access, only if the semantics of the languages differ fundamentally.

We are convinced that a well-designed cross-language interoperability mechanism should not only target a fixed set of languages but should be general enough to support interoperability between arbitrary languages. The TruffleVM uses the dynamic access, an interoperability mechanism that allows combining arbitrary languages by composing their implementations on top of the TruffleVM. The dynamic access is independent of programming languages and their implementations for the TruffleVM. It is possible to add new languages to the TruffleVM without affecting the existing language implementations.

Also, crossing language boundaries does not introduce runtime overhead. First, we do not require a common representation of data for all languages (e.g., in contrast to the CLR [6]), but each language can use the data structures that meet the requirements of this individual language best. For example, a C implementation can allocate raw memory on the native heap while a JavaScript implementation can use dynamic objects on a managed heap. Second, each language can access foreign objects or invoke foreign functions without introducing run-time overhead. The dynamic access enables dynamic compilers to optimize a foreign object access like any regular object access and also to widen the compilation scope across language boundaries and thus to perform cross-language optimizations. For example, we can inline a JavaScript function into a caller that is written in C.

C Code:
    struct S {
        int value;
    };
    struct S *obj;

JavaScript Code:
    var a = obj.value;

Figure 1: JavaScript can access C data structures.

As a case study, we compose the languages JavaScript, Ruby, and C. This allows us to discuss the following core aspects:

1. The host language implementation maps language-specific operations to messages, which are used to access objects in a language-independent way. For example, the JavaScript implementation maps the property access of Figure 1 to a message.

2. The foreign language maps these messages to foreign-language-specific accesses. The TruffleVM uses this mapping to replace messages by efficient foreign-language-specific operations at their first execution. For example, the TruffleVM replaces the message to read the member value by a structure access.

3. Finally, we discuss how we can bridge differences between JavaScript, Ruby, and C. These differences are: object-oriented vs. non-object-oriented; dynamically typed vs. statically typed; explicit memory management vs. automatic memory management; safe memory access vs. unsafe memory access.

We present a performance evaluation using multi-language benchmarks. We simulate using a new language in an existing code base, or updating a legacy code base gradually, by translating parts of well-established benchmarks from language A to language B. We use the languages JavaScript, Ruby, and C and show that language boundaries do not cause a performance overhead. In summary, this paper contributes the following:

• We describe a multi-language runtime that composes different language implementations. Rather than composing a pair of languages (e.g., in contrast to a foreign function interface (FFI)) we support interoperability between arbitrary languages. We show how language implementations map language-dependent operations to language-independent messages and vice versa.

• We list the different semantics of JavaScript, Ruby, and C and explain how we bridge these differences.

• We show the simplicity and extensibility of the TruffleVM. Also, we evaluate the performance of the TruffleVM using non-trivial multi-language benchmarks.

2. System Overview

The TruffleVM targets language implementations that are hosted by a general-purpose VM. In the context of this paper, a language implementation (LI) translates the source code into an IR. We identified the following requirements:

Compatible IR: LIs have to translate the applications’ source code to compatible (but not necessarily equal) IRs.

Rewriting capability: LIs need a rewriting capability that allows them to replace IR snippets with different IR snippets at run time.

Sharable data: The data structures used by LIs to represent the data of an application need to be accessible by all LIs.

Dynamic compilation: The dynamic compiler of the host VM compiles the IR of a program to highly efficient machine code at run time.

These requirements are met by bytecode-based VMs (e.g., a JVM): they use a common IR, which is bytecode. Techniques such as bytecode quickening [7] provide the rewriting capability, and all languages share a common heap for their data.

The same requirements are also met by Truffle [39], a platform for implementing high-performance LIs in Java. We use Truffle for our case study because many LIs are available for it, including JavaScript, Ruby, and C. Truffle LIs are abstract syntax tree (AST) interpreters, running on top of a Java Virtual Machine. Source code is compiled to an AST, which is then dynamically executed by the Truffle framework. Every node has a method that evaluates the node. By calling these methods recursively, the whole AST is evaluated. All nodes extend a common base class Node.

Truffle ASTs are self-optimizing in the sense that AST nodes can speculatively rewrite themselves with specialized variants [38] at run time, e.g., based on profile information obtained during execution such as type information. If these speculative assumptions turn out to be wrong, the specialized tree can be reverted to a more generic version that provides functionality for all possible cases. Truffle guest languages use self-optimization via tree rewriting as a general mechanism for dynamically optimizing code at run time.
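The speculate-then-generalize behavior described above can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: it is not Truffle's Java API, the names `Const`, `AddNode`, and `generic_add` are hypothetical, and the node records its specialization in a `state` field instead of replacing itself in the tree.

```python
class Const:
    """Leaf node that evaluates to a fixed value."""
    def __init__(self, value):
        self.value = value

    def execute(self):
        return self.value


def generic_add(a, b):
    # Hypothetical generic semantics: strings concatenate, numbers add.
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


class AddNode:
    """Addition node that specializes on the operand types it observes."""
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.state = "uninitialized"  # later "int" or "generic"

    def execute(self):
        a, b = self.left.execute(), self.right.execute()
        both_int = type(a) is int and type(b) is int
        if self.state == "uninitialized":
            # First execution: speculate based on the observed types.
            self.state = "int" if both_int else "generic"
        elif self.state == "int" and not both_int:
            # The speculation failed: revert to the generic version.
            self.state = "generic"
        if self.state == "int":
            return a + b              # specialized fast path
        return generic_add(a, b)      # covers all supported cases
```

Executing the node with two integers leaves it in the `int` state; once a string operand shows up, it switches to `generic` and stays there.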

After an AST has become stable (i.e., no more rewritings occur) and when the execution frequency has exceeded a predefined threshold, Truffle dynamically compiles the AST to highly optimized machine code. The Truffle framework uses the Graal compiler [27] (which is part of the Graal VM) as its dynamic compiler. The compiler inlines node execution methods of a hot AST into a single method and performs aggressive optimizations over the whole tree. It also inserts deoptimization points [20] in the machine code where the speculative assumptions are checked. If they turn out to be wrong, control is transferred back from compiled code to the interpreted AST, where specialized nodes can be reverted to a more generic version. The Graal VM [27] is a minor modification of the HotSpot VM: it adds the Graal compiler, but reuses all other parts (including the garbage collector and the interpreter) from the HotSpot VM. Figure 2 shows the layers of a Truffle-based system.

Figure 2: Layers of a Truffle-based system: TruffleJS, TruffleRuby, and TruffleC are hosted by the Truffle framework on top of the Graal VM. (Layers, top to bottom: application source code (*.js, *.rb, *.c); language implementations TruffleJS, TruffleRuby, and TruffleC with a compatible IR (AST); shared runtime consisting of Truffle and the Graal VM, i.e., the HotSpot runtime with interpreter, GC, and the Graal compiler.)

In this paper we use three LIs on top of Trufﬂe:

TruffleJS is a state-of-the-art JavaScript engine that is fully compatible with the JavaScript standard. TruffleJS’s speedup varies between 0.3× and 3× (average: 0.66×) compared to Google’s V8; between 0.4× and 1.3× (average: 0.65×) compared to Mozilla’s SpiderMonkey; and between 0.9× and 27× (average: 5.8×) compared to Nashorn as included in JDK 8u5 (results taken from [36]).

TruffleRuby is a Ruby implementation, which is an experimental option of JRuby. TruffleRuby performs well compared to existing Ruby implementations. Its speedup varies between 2.7× and 38.2× (average: 12.7×) compared to MRI; between 1.3× and 39.8× (average: 6.2×) compared to Rubinius; between 2.7× and 17× (average: 6.6×) compared to JRuby; and between 1× and 7.6× (average: 3.5×) compared to Topaz (results taken from [36]).

TruffleC is a C implementation on top of Truffle [14, 16] and can dynamically execute C code. TruffleC’s speedup varies between 0.6× and 1.1× (average: 0.81×) compared to the best performance of the GNU C Compiler (results taken from [16]). TruffleC does not yet support the full C standard. However, there are no conceptual limitations and future work will address completeness issues.

Foreign Object Access: In the context of this paper, an object is a non-primitive entity of a user program, which we want to share across different LIs. Examples include data (such as JavaScript objects, Ruby objects, or C pointers), as well as functions, classes, or code blocks. If the Ruby implementation accesses a Ruby object, the object is considered a regular object. If Ruby (the host language, LI_Host) accesses a C structure, the C structure is considered a foreign object (we call C the foreign language, LI_Foreign). Object accesses are operations that an LI_Host can perform on objects, e.g., method calls, property accesses, or field reads. We base our work on the dynamic access, which is a message-based approach to access foreign objects [13, 17].

Figure 3: The dynamic access accesses a JavaScript object using messages; message resolution replaces the Read message by a direct access. (Panels: "Using messages to access a JS object" with a Read node over obj and value, for the object var obj = {value: 42}; and "JS object access after message resolution" with an "is JS?" guard in front of the direct access.)

Truffle LIs use different layouts for objects. For example, the JavaScript implementation allocates data on the Java heap, whereas the C implementation allocates data on the native heap. Hence, each LI uses language-specific nodes to access regular objects. To access a foreign object, an LI_Host can use the dynamic access: the dynamic access defines a set of language-independent messages, used to access foreign objects. The left part of Figure 3 shows a Truffle AST snippet that reads the value property of a JavaScript object obj using a Read message.

Message resolution: The LI_Host uses a message to access a foreign object, which belongs to exactly one language. The TruffleVM uses this foreign language LI_Foreign to resolve the message to a foreign-language-specific AST snippet (message resolution) upon first execution. The LI_Foreign provides an AST snippet that can be inserted into the host AST as a replacement for the message. This snippet contains foreign-language-specific operations that directly access the receiver. Thus, message resolution replaces language-independent messages by language-specific operations. In order to notice an access to an object of a previously unseen foreign language, message resolution inserts a guard into the AST that checks the receiver’s language before it is accessed. As can be seen in Figure 3, message resolution inserts a JSReadProperty node (a JavaScript-specific AST node that reads the property of a JavaScript object; in Figure 3 we use the abbreviated label "." for this node). Before the AST accesses obj, it checks if obj is a JavaScript object (is JS?). If obj turns out to be an object of a different language, the execution falls back to sending a Read message again, which will then be resolved to a new AST snippet for this language. An object access is language polymorphic if it has varying receivers originating from different languages. In the language polymorphic case, the TruffleVM embeds the different language-specific AST snippets in a chain like an inline cache [19] and therefore avoids a loss in performance.
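The resolution step, the language guard, and the inline-cache-like chain can be modeled in a few lines of Python. This is a hedged sketch, not the TruffleVM implementation: `JSObject`, `CStruct`, `RESOLVERS`, and `ReadMessage` are hypothetical stand-ins, and a list of (guard, access) pairs takes the place of rewritten AST snippets.

```python
class JSObject(dict):
    """Stand-in for a JavaScript object (properties stored by name)."""


class CStruct:
    """Stand-in for a C struct: raw slots plus a field-to-slot layout."""
    def __init__(self, layout, slots):
        self.layout, self.slots = layout, slots


# Each language contributes one (guard, access) pair; all names are assumed.
RESOLVERS = [
    (lambda o: isinstance(o, JSObject), lambda o, name: o[name]),
    (lambda o: isinstance(o, CStruct), lambda o, name: o.slots[o.layout[name]]),
]


class ReadMessage:
    """Language-independent Read that resolves itself on first execution."""
    def __init__(self):
        self.chain = []  # resolved (guard, access) pairs, like an inline cache

    def execute(self, receiver, name):
        for guard, access in self.chain:
            if guard(receiver):        # check the receiver's language first
                return access(receiver, name)
        # Guard miss: resolve for the receiver's language, extend the chain.
        for guard, access in RESOLVERS:
            if guard(receiver):
                self.chain.append((guard, access))
                return access(receiver, name)
        raise TypeError("receiver belongs to no known language")
```

A site that only ever sees JavaScript objects keeps a chain of length one; the chain grows only when a receiver from a new language appears, which corresponds to the language polymorphic case.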

Primitive types: Besides objects, values with a primitive type can also be shared across languages. The work of [17] defines a set of shared primitive types. Truffle languages map language-specific primitive types from and to this set, which allows exchanging primitive values.

In [17], we describe how this approach is used to compose the C and Ruby implementations in order to support C extensions for Ruby. The runtime that we present in this paper composes arbitrary languages rather than a pair of languages (Ruby and C). Section 6 discusses in detail how our approach differs from traditional FFIs on top of Truffle.

3. Multi-Language Composition

The TruffleVM can execute programs that are written in multiple languages. Programmers use different files for different programming languages. For example, if parts of a program are written in JavaScript and C, these parts are in different files. Distinct files for each programming language allow us to reuse the existing parsers of each LI without modification. Syntactical and grammatical combination is out of scope for this work.

Programmers can export data and functions to a multi-language scope and also import data and functions from this scope. This allows programmers to explicitly share data with other languages. JavaScript, Ruby, and C provide built-ins that allow exporting and importing data to and from the multi-language scope.

3.1 Implicit Foreign Object Accesses

The TruffleVM allows programmers to access foreign objects transparently. The VM maps host-language-specific operations to language-independent messages, which are then mapped back to foreign-language-specific operations.

Truffle LIs compile source code to a tree of nodes, i.e., an AST. $N^{A}$ and $N^{B}$ define finite sets of nodes of LIs A and B. Each node has $r : N^{A} \to \mathbb{N}$ children, where $\mathbb{N}$ denotes the set of natural numbers. If $n \in N^{A}$ is a node, then $r(n)$ is the number of its children. We call nodes with $r = 0$ leaf nodes. An AST $t \in T_{N^{A}}$ is a tree of nodes $n \in N^{A}$. By $n(t_{1}, \ldots, t_{k})$ we denote a tree with root node $n \in N^{A}$ and $k$ sub-trees $t_{1}, \ldots, t_{k} \in T_{N^{A}}$, where $k = r(n)$.
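To make the tree notation concrete, the following Python sketch encodes a small node set with its arity function and enforces k = r(n) when a tree is built. The node names in `ARITY` are assumptions chosen for illustration, not the node classes of any Truffle LI.

```python
from dataclasses import dataclass

# Hypothetical node set N^A with its arity function r: N^A -> natural numbers.
ARITY = {"JSReadProperty": 2, "JSCall": 3, "obj": 0, "value": 0}


@dataclass(frozen=True)
class Tree:
    """A tree n(t_1, ..., t_k) with root node n and k = r(n) sub-trees."""
    node: str
    children: tuple = ()

    def __post_init__(self):
        # Reject trees whose number of sub-trees differs from r(n).
        if len(self.children) != ARITY[self.node]:
            raise ValueError(f"{self.node} expects {ARITY[self.node]} children")


def is_leaf(tree):
    """Leaf nodes are the nodes with r(n) = 0."""
    return ARITY[tree.node] == 0


def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in tree.children)
```

For example, `Tree("JSReadProperty", (Tree("obj"), Tree("value")))` is the tree JSReadProperty(obj, value) with two leaf sub-trees.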

The dynamic access defines a set of messages, which are modeled as language-independent nodes $N^{Msg}$:

$$N^{Msg} = \{\text{Read}, \text{Write}, \text{Execute}, \text{Unbox}, \text{IsNull}\} \tag{1}$$

If the LI_Host A uses messages to access a foreign object, the tree $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$ consists of language-specific nodes $N^{A}$ and language-independent nodes $N^{Msg}$.

To compose JavaScript, Ruby, and C we use the messages $n \in N^{Msg}$, where the sub-trees $t_{1}, \ldots, t_{k} \in T_{N^{A} \cup N^{Msg}}$ of $n(t_{1}, \ldots, t_{k})$ evaluate to the arguments of the message:

Read: Truffle LIs use the Read message to access a field of an object or an element of an array. It can also be used to access methods of classes or objects, i.e., to look up executable methods from classes and objects.

$$\text{Read}(t_{rec}, t_{id}) \in T_{N^{A} \cup N^{Msg}} \tag{2}$$

The first subtree $t_{rec}$ evaluates to the receiver of the Read message, the second subtree $t_{id}$ to a name or an index.

Write: An LI uses the Write message to set the field of an object or the element of an array. It can also be used to add or change the methods of classes and objects.

$$\text{Write}(t_{rec}, t_{id}, t_{val}) \in T_{N^{A} \cup N^{Msg}} \tag{3}$$

The first subtree $t_{rec}$ evaluates to the receiver of the Write message, the second subtree $t_{id}$ to a name or an index, and the third subtree $t_{val}$ to a value.

Execute: LIs execute methods or functions using an Execute message.

$$\text{Execute}(t_{f}, t_{1}, \ldots, t_{i}) \in T_{N^{A} \cup N^{Msg}} \tag{4}$$

The first subtree $t_{f}$ evaluates to the function/method itself, the other subtrees $t_{1}, \ldots, t_{i}$ to arguments.

Unbox: Programmers often use an object type to wrap a value of a primitive type in order to make it look like a real object. An Unbox message unwraps such a wrapper object and produces a primitive value. LIs use this message to unbox a boxed value whenever a primitive value is required.

$$\text{Unbox}(t_{rec}) \in T_{N^{A} \cup N^{Msg}} \tag{5}$$

The subtree $t_{rec}$ evaluates to the receiver object.

IsNull: Many programming languages use null/nil for an undefined, uninitialized, empty, or meaningless value. The IsNull message allows the LI to do a language-independent null check.

$$\text{IsNull}(t_{rec}) \in T_{N^{A} \cup N^{Msg}} \tag{6}$$

The subtree $t_{rec}$ evaluates to the receiver object.
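As a rough illustration of what the five messages mean, the sketch below interprets each of them on plain Python stand-in objects. It is an assumption-laden model: in the system described here the messages are AST nodes that get resolved to language-specific operations, whereas the hypothetical `send` function below simply dispatches on a message name, and `Boxed` is an invented wrapper type.

```python
class Boxed:
    """Hypothetical wrapper object around a primitive value."""
    def __init__(self, value):
        self.value = value


def send(message, receiver, *args):
    """Interpret one of the five messages on plain Python stand-in objects."""
    if message == "Read":       # Read(rec, id): field or element lookup
        (key,) = args
        return receiver[key]
    if message == "Write":      # Write(rec, id, val): field or element update
        key, value = args
        receiver[key] = value
        return value
    if message == "Execute":    # Execute(f, a_1, ..., a_i): call f
        return receiver(*args)
    if message == "Unbox":      # Unbox(rec): wrapper object -> primitive
        return receiver.value
    if message == "IsNull":     # IsNull(rec): language-independent null check
        return receiver is None
    raise ValueError(f"unknown message: {message}")
```

The argument counts mirror equations (2) to (6): Read takes a receiver and an identifier, Write adds a value, Execute takes the callee followed by its arguments, and Unbox and IsNull take only a receiver.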

3.1.1 Mapping language-specific operations to messages

If language A encounters a foreign object at run time and the regular object access operations cannot be used, then language A uses the dynamic access. The LI of language A maps an AST with a language-specific object access $t_{a} \in T_{N^{A}}$ to an AST with a language-independent access $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$ using the function $f_{A}$:

$$f_{A} : T_{N^{A}} \to T_{N^{A} \cup N^{Msg}} \tag{7}$$

The function $f_{A}$ replaces the language-specific access and inserts a language-independent access instead. The other parts of the AST $t_{a}$ remain unchanged. An LI_Host that accesses foreign objects has to define this function $f_{A}$.

Consider the example in Figure 4 (showing a property access in JavaScript: obj.value): $f_{JS}$ replaces the JavaScript-specific object access (JSReadProperty, node "." in the AST) with a language-independent Read message (see left part of Figure 4).

$$\text{JSReadProperty}(t_{obj}, t_{value}) \xmapsto{f_{JS}} \text{Read}(t_{obj}, t_{value}) \tag{8}$$

Rather than using a JSReadProperty node to access the value property of the receiver obj, the JavaScript implementation uses a Read message.
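The mapping of equation (8) can be sketched as a tree rewrite in Python. The encoding is an assumption made for this example: trees are nested tuples of the form (node name, children...), leaves are plain strings, and `JSVarWrite` is a hypothetical node for the surrounding assignment. For brevity the sketch rewrites every property access, whereas the text above applies the mapping only when a foreign object is encountered.

```python
def f_js(tree):
    """Map a JavaScript AST to one that uses Read messages.

    Trees are (node_name, child, ...) tuples; leaves are plain strings.
    """
    if isinstance(tree, str):
        return tree
    name, *children = tree
    mapped = tuple(f_js(child) for child in children)
    if name == "JSReadProperty":
        return ("Read",) + mapped   # language-independent access
    return (name,) + mapped         # all other parts stay unchanged
```

Applied to `("JSVarWrite", "a", ("JSReadProperty", "obj", "value"))`, the function returns the same tree with the inner node replaced by `Read`, leaving the rest of the AST untouched.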

3.1.2 Mapping messages to language-specific operations

The TruffleVM uses the LI_Foreign B to map the host AST with a language-independent access $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$ to an AST with a foreign-language-specific access $t_{a,b} \in T_{N^{A} \cup N^{B}}$ using the function $g_{B}$:

$$g_{B} : T_{N^{A} \cup N^{Msg}} \to T_{N^{A} \cup N^{B}} \tag{9}$$

Message resolution removes the language-independent access and inserts a language-specific access instead, which produces an AST that consists of nodes $N^{A} \cup N^{B}$. The other parts of $t_{a,m}$ remain unchanged. With respect to the example in Figure 4, the TruffleVM replaces the Read message with a C-specific access operation upon its first execution. More specifically, it uses $g_{C}$ to replace the Read message with a CMemberRead node (node "->" in the AST):

$$\text{Read}(t_{obj}, t_{value}) \xmapsto{g_{C}} \text{CMemberRead}(t_{obj}, t_{value}) \tag{10}$$

The result is a JavaScript AST that embeds a C access operation: $t_{JS,C} \in T_{N^{JS} \cup N^{C}}$.

Language implementers have to define the functions $f_{A}$ and $g_{B}$; the TruffleVM then creates the pairwise combination automatically by composing these functions at run time:

$$g_{B} \circ f_{A} : T_{N^{A}} \to T_{N^{A} \cup N^{B}} \tag{11}$$

When accessing foreign objects, the TruffleVM automatically creates an AST $t_{a,b} \in T_{N^{A} \cup N^{B}}$ where the main part is specific to language A and the foreign object access is specific to language B.

For further reference, we provide a detailed table that lists all mappings from language-specific operations to messages and vice versa (http://ssw.jku.at/General/Staff/Grimmer/TruffleVM_table.pdf).

Figure 4: Accessing a C structure from JavaScript; message resolution inserts a C struct access into a JavaScript AST. (Panels: regular object access; language-independent object access using a Read message; foreign object access after message resolution, where obj is a C struct.)

3.1.3 Limitation

Consider a function $f_{A}$ that maps an AST with a language-specific object access $t_{a} \in T_{N^{A}}$ to an AST with a language-independent access $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$ and a function $g_{B}$ that maps $t_{a,m}$ to an AST with a foreign-language-specific object access $t_{a,b} \in T_{N^{A} \cup N^{B}}$:

$$f_{A} : T_{N^{A}} \to T_{N^{A} \cup N^{Msg}}, \qquad g_{B} : T_{N^{A} \cup N^{Msg}} \to T_{N^{A} \cup N^{B}} \tag{12}$$

When composing $f_{A}$ and $g_{B}$, three different cases can occur:

1. If $g_{B}$ is defined for $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$, a foreign object can be accessed implicitly. The TruffleVM can replace the language-independent object access with a B-specific access.

2. If $g_{B}$ is not defined for $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$, the foreign object access is not supported by the foreign language, and we report a runtime error with a high-level diagnostic message. For example, if JavaScript accesses the length property of a C array, we report an error because C cannot provide length information for arrays.

3. A foreign object access might not be expressible in A, i.e., one wants to create $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$ but language A does not provide syntax for this access. For example, a C programmer cannot access the length property of a JavaScript array. In this case one has to fall back to an explicit foreign object access.
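The first two cases can be sketched in Python by treating the foreign mapping as a partial function. Everything named here is hypothetical: `C_STRUCT_FIELDS` stands in for the type information that the C implementation would consult, and `ForeignAccessError` stands in for the high-level diagnostic. Case 3 has no counterpart in the sketch because it concerns missing host-language syntax.

```python
class ForeignAccessError(Exception):
    """Raised when the foreign language cannot resolve a message (case 2)."""


# Hypothetical knowledge of the C side: members a struct access can resolve.
C_STRUCT_FIELDS = {"value"}


def f_js(access):
    """Map a JavaScript property access to a Read message."""
    _, receiver, member = access
    return ("Read", receiver, member)


def g_c(message):
    """Resolve a message to a C-specific operation, if it is defined."""
    kind, receiver, member = message
    if kind == "Read" and member in C_STRUCT_FIELDS:
        return ("CMemberRead", receiver, member)   # case 1: resolvable
    raise ForeignAccessError(f"C cannot resolve {kind} of '{member}'")


def resolve(access):
    """The composition of g_c after f_js."""
    return g_c(f_js(access))
```

Reading `value` resolves to a `CMemberRead`, whereas reading `length` raises the error, which mirrors the C array example in case 2.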

3.2 Explicit Foreign Object Accesses

A host language might not provide syntax for a specific foreign object access. Consider the JavaScript array arr of Figure 5, which is used in a C program: C does not provide syntax for accessing the length property of an array.

We overcome these issues by exposing the dynamic access to the programmer. Using this interface, the programmer can fall back to an explicit foreign object access. The programmer sends a message directly in order to access a foreign object. In other words, this interface allows programmers to handcraft the foreign object access of $t_{a,m} \in T_{N^{A} \cup N^{Msg}}$.

Every LI on top of Truffle has an API for explicit message sending. For example, to access the length property of a JavaScript array from C (see Figure 5), the programmer uses the built-in C function Read. The C implementation substitutes this Read invocation by a Read message.

JavaScript Code:
    var arr = new Array(5);

C Code:
    int arr = // …
    int length = Read(arr, "length");

Figure 5: C accessing the length property of a JavaScript array.

4. Different Language Paradigms and Features

In this section we describe how we map the different paradigms and features of the languages JavaScript, Ruby, and C to messages. It is not possible to cover all features and paradigms of the vast number of languages that are available. Hence, we focus on JavaScript, Ruby, and C and show how we map the concepts of these fundamentally different languages to the dynamic access, which demonstrates the feasibility of the TruffleVM. We explain how we deal with dynamic typing versus static typing, object-oriented versus non-object-oriented semantics, explicit versus automatic memory management, as well as safe versus unsafe memory accesses.

4.1 Dynamic Typing vs. Static Typing

In [37], Wrigstad et al. describe an approach called like types: they show how dynamic objects can be used in a statically typed language by binding them to like-type variables. Occurrences of like-type variables are checked statically, but their usage is checked dynamically. The TruffleVM is similar, except that in our case any pointer variable in C can be bound to a foreign object:

We bind foreign dynamic objects to pointer variables that are associated with static type information. If a pointer is bound to a dynamically typed value, we check the usage dynamically, i.e., upon each access we check whether the operation on the foreign object is possible. We report a runtime error otherwise.

Figure 6 shows a C program, which uses a JavaScript object jsObject. The C code associates jsObject with the static type struct JsObject*, which is defined by the programmer (Figure 6, Label 1). When the C code accesses the JavaScript object (Label 3), we check whether bar exists and report an error otherwise.
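The per-access check can be modeled as follows. This Python sketch rests on assumptions: `ForeignObject` is a hypothetical stand-in for a dynamically typed object that has been bound to a C pointer variable, and `checked_read` plays the role of the member access performed through that pointer.

```python
class ForeignObject:
    """Stand-in for a dynamically typed object bound to a C pointer variable."""
    def __init__(self, **members):
        self.members = dict(members)


def checked_read(pointer, member):
    """Read `member` through the pointer, checking the access at run time."""
    if not isinstance(pointer, ForeignObject):
        raise TypeError("this sketch only models pointers bound to foreign objects")
    if member not in pointer.members:
        # The static C type promised this member, but the object lacks it.
        raise RuntimeError(f"foreign object has no member '{member}'")
    return pointer.members[member]
```

The static type in the C source only describes what the programmer expects; whether the member actually exists is decided when the access executes.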

4.2 Object-Oriented vs. Non-Object-Oriented Semantics

The object-oriented programming paradigm allows programmers to create objects that contain both data and code, known as fields and methods. Also, objects can extend each other (e.g., class-based inheritance or prototype-based inheritance): when accessing fields or methods, the object does a lookup and provides a field value or a method.

The TruffleVM uses the dynamic access, which retains this mechanism. Consider the method invocation (from C to JavaScript) at Label 3 in Figure 6. TruffleC maps this access to the following messages:

$$\text{CCall}(\text{CMemberRead}(t_{jsObject}, t_{bar}), t_{jsObject}, t_{arg}) \xmapsto{f_{C}} \text{Execute}(\text{Read}(t_{jsObject}, t_{bar}), t_{jsObject}, t_{arg}) \tag{13}$$

TruffleJS resolves this access to an AST snippet that does the lookup of method bar and executes it:

$$\text{Execute}(\text{Read}(t_{jsObject}, t_{bar}), t_{jsObject}, t_{arg}) \xmapsto{g_{JS}} \text{JSCall}(\text{JSReadProperty}(t_{jsObject}, t_{bar}), t_{jsObject}, t_{arg}) \tag{14}$$

A method call in an object-oriented language passes the this object as an implicit argument. Non-object-oriented languages that invoke methods therefore need to explicitly pass the this object. For example, the JavaScript function bar (see Figure 6) expects the this object as the first argument. Hence, the first argument of the method call in C (Label 3) is the this object jsObject.

Conversely, the signature of a non-object-oriented function needs to contain the this argument if the caller is an object-oriented language. For example, if JavaScript calls the C function foo (Label 4), JavaScript passes the this object as the first argument. The signature of the C function foo (Label 5) explicitly contains the this object. A wrong number of arguments causes a runtime error.
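A small Python model shows the explicit this argument and the argument-count check. The names are assumptions for illustration: `JSObject`, `js_bar`, `read`, and `execute` are hypothetical stand-ins for a JavaScript object, its method bar, and the Read and Execute messages.

```python
class JSObject:
    """Stand-in for a JavaScript object with named properties."""
    def __init__(self, **props):
        self.props = props


def js_bar(this, x):
    # Hypothetical method body; JavaScript would receive `this` implicitly.
    return this.props["factor"] * x


def read(receiver, name):
    """Model of the Read message: look up a member of the receiver."""
    return receiver.props[name]


def execute(function, *args):
    """Model of the Execute message: arguments are passed exactly as given."""
    expected = function.__code__.co_argcount
    if len(args) != expected:
        raise RuntimeError(f"expected {expected} arguments, got {len(args)}")
    return function(*args)
```

A caller without implicit receivers writes `execute(read(obj, "bar"), obj, 21)`, passing `obj` twice: once as the receiver of the lookup and once as the this argument. Omitting it triggers the argument-count error.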

Future work: The TruffleVM currently does not support cross-language inheritance, i.e., class-based inheritance or prototype-based inheritance is only possible with objects that originate from the same language. We are convinced that the TruffleVM is extensible in this respect and therefore our future research will focus on inheritance across language boundaries.

4.3 Explicit vs. Automatic Memory Management

Truffle LIs run on a shared runtime and can exchange data, regardless of whether it is managed or unmanaged:

Unmanaged allocations: Trufﬂe LIs keep unmanaged al-

locations on the native heap, which is not garbage collected.

For example, TrufﬂeC allocates data on the native heap.

TrufﬂeC represents all pointers (pointers to values, arrays,

structures, and functions) as managed Java objects of type

CAddress that wrap a 64-bit address value [15] and attach

type information to the address value. TrufﬂeC uses this type

information to resolve messages and provide AST snippets

that can access the data stored on the native heap. When accessing

a CAddress object via a dynamic access, the access

resolves to a raw memory access. The dynamic access

allows accessing unmanaged data from a language that oth-

erwise only uses managed data.
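The idea of a typed address wrapper can be sketched as follows. This is a model under stated assumptions, not TruffleC's implementation: an `ArrayBuffer` stands in for the native heap, the `CAddress` class here is a simplified stand-in for the wrapper named in the text, and `C_TYPES`, `read`, and `write` are hypothetical names.

```javascript
// Model only: an ArrayBuffer stands in for the native heap, and CAddress is a
// simplified, hypothetical version of the wrapper described in the text.
const nativeHeap = new DataView(new ArrayBuffer(64));

// Size in bytes and raw accessors for each modelled C type.
const C_TYPES = {
  int: {
    size: 4,
    load: (addr) => nativeHeap.getInt32(addr, true),
    store: (addr, v) => nativeHeap.setInt32(addr, v, true),
  },
  double: {
    size: 8,
    load: (addr) => nativeHeap.getFloat64(addr, true),
    store: (addr, v) => nativeHeap.setFloat64(addr, v, true),
  },
};

class CAddress {
  constructor(address, typeName) {
    this.address = address;        // offset into the modelled native heap
    this.type = C_TYPES[typeName]; // attached type information
  }
  // A Read message resolves to a raw load at address + index * sizeof(type).
  read(index) {
    return this.type.load(this.address + index * this.type.size);
  }
  // A Write message resolves to a raw store at the same location.
  write(index, value) {
    this.type.store(this.address + index * this.type.size, value);
  }
}

const ints = new CAddress(16, "int"); // like an int* pointing at byte offset 16
ints.write(0, 7);
ints.write(2, -3);
```

One difference from real native memory is worth noting: a `DataView` throws a `RangeError` for offsets outside its buffer, whereas a raw memory access performs no such check.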

Managed allocations: The JavaScript and Ruby imple-

mentations allocate objects on the Java heap. If an applica-

tion binds a managed object to a C variable, then TrufﬂeC

keeps this variable as a Java object of type Object. Thus,

the Java garbage collector can trace managed objects even if

they are referenced from unmanaged languages.

Trade-offs: If a pointer variable of an unmanaged lan-

guage references an object of a managed language, opera-

tions are restricted. First, pointer arithmetic on foreign ob-

jects is only allowed as an alternative to array indexing. For

example, C programmers can access a JavaScript array either

with indexing (e.g. jsArray[1]) or by pointer arithmetic

(*(jsArray + 1)). However, it is not allowed to manipulate

a pointer variable that references a managed object in any

other way (e.g. jsArray = jsArray + 1).

Second, pointer variables referencing managed objects

cannot be cast to primitive values (such as long or int).

Unlike raw memory addresses, references to the Java heap cannot

be represented as primitive values. We

report a high-level error message in that case.
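The two restrictions can be summarized in a small model. The class `ManagedPointer` and its method names are hypothetical and only mirror the rules stated above: element access through indexing or through pointer arithmetic is allowed, while moving the pointer or casting it to a primitive value is rejected.

```javascript
// Hypothetical model of a C pointer variable that refers to a managed object.
class ManagedPointer {
  constructor(target) {
    this.target = target; // reference to an object on the managed heap
  }
  // jsArray[i] becomes an index-based read.
  index(i) {
    return this.target[i];
  }
  // *(jsArray + i) is accepted as an alternative spelling of jsArray[i].
  derefPlus(i) {
    return this.index(i);
  }
  // jsArray = jsArray + 1 would need an address to offset; there is none.
  advance(_n) {
    throw new TypeError("pointer arithmetic on a managed object is only allowed for element access");
  }
  // (long) jsArray would need a numeric address; a managed reference has none.
  toLong() {
    throw new TypeError("a pointer to a managed object cannot be cast to a primitive value");
  }
}

const managedArray = new ManagedPointer([10, 20, 30]);
```

In this model both `managedArray.index(1)` and `managedArray.derefPlus(1)` read the same element, while `advance` and `toLong` always raise an error.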

4.4 Safe vs. Unsafe Memory Accesses

C is an unsafe language and does not check memory ac-

cesses at runtime, i.e., there are no runtime checks that en-

sure that pointers are only dereferenced if they point to a

valid memory region and that pointers are not used after the

referenced object has been deallocated. TrufﬂeC allocates

data on the native heap and uses raw memory operations to

access it, which is unsafe. This has the following implica-

tions on multi-language applications:

Unsafe accesses: If an unsafe language (such as C) shares

data with a safe language (such as JavaScript), all access

operations are unsafe. For example, accessing a C array in

JavaScript is unsafe. If the index is out of bounds, the access

has undefined behavior (as defined by the C specification).

However, accessing a C array directly is more efficient

than accessing a dynamic JavaScript array because fewer

runtime checks are required.

    /* C */
    struct JsObject {
      void (*bar)(void *receiver, int b);
    };
    struct JsObject *jsObject = // …
    jsObject->bar(jsObject, 84);      /* Label 3 */
    void foo(void *receiver, int a);  /* Label 5 */

    // JavaScript
    var jsObject = {
      bar: function (b) {
        // …
      }
      // …
    };
    foo(42);                          // Label 4

Figure 6: Foreign object definition in C and an object-oriented object access.

Safe accesses: Accessing data structures of a safe language (such as JavaScript) from an unsafe language (such as C) is safe. For example, accessing a JavaScript array in C is safe. TruffleC implements the access by a Read or Write message, which TruffleJS resolves with operations that check if the index is within the array bounds and grow the array in case the access was out of bounds.
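The checked behavior that such an access keeps is ordinary JavaScript array semantics. The sketch below shows it in plain JavaScript; `readElement` and `writeElement` are hypothetical helper names and do not correspond to actual TruffleJS operations.

```javascript
// Plain JavaScript array semantics, which a resolved Read or Write preserves.
function readElement(array, index) {
  return array[index]; // bounds-checked: an index out of range yields undefined
}

function writeElement(array, index, value) {
  array[index] = value; // bounds-checked: an index out of range grows the array
  return array.length;
}

const numbers = [1, 2, 3];
const outOfBoundsRead = readElement(numbers, 10); // no memory error occurs
const newLength = writeElement(numbers, 5, 42);   // the array grows to fit index 5
```

An out-of-bounds read therefore produces `undefined` instead of touching unrelated memory, and an out-of-bounds write extends the array, leaving the skipped positions empty.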

5. Evaluation

In this section we discuss why we claim that the TrufﬂeVM

improves the current state-of-the-art in cross-language in-

teroperability. We focus on simplicity of foreign object ac-

cesses, as well as on extensibility of the TrufﬂeVM with re-

spect to new languages. We also evaluate the performance

of a multi-language application.

5.1 Simplicity

Accessing foreign objects with the operators of the host language improves simplicity. Programmers are not forced to write boilerplate code as long as an object access can be mapped from language A to language B ($t^{a} \mapsto t^{a,m} \mapsto t^{a,b}$). We make the mapping of language operations to messages largely the task of language implementers rather than the task of application programmers.

If not otherwise possible, programmers can also handcraft

accesses to foreign objects. All languages expose an API

that allows programmers to explicitly access foreign objects

using the dynamic access.

We modiﬁed single-language benchmarks such that parts

of them were written in a different language. The other parts

did not have to be changed, because accesses to foreign-

language objects can simply be written in the language of the

host. The only extra code that we needed was for importing

and exporting objects from and to the multi-language scope.
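The import and export step can be pictured with a small registry. This is a sketch under assumptions: `MultiLanguageScope`, `exportSymbol`, and `importSymbol` are hypothetical names chosen for illustration and are not the API that the language implementations expose.

```javascript
// Hypothetical model of a multi-language scope shared by all languages.
class MultiLanguageScope {
  constructor() {
    this.symbols = new Map();
  }
  // Make an object visible to other languages under a global name.
  exportSymbol(name, value) {
    this.symbols.set(name, value);
  }
  // Look up an exported object; the object itself is shared, not copied.
  importSymbol(name) {
    if (!this.symbols.has(name)) {
      throw new ReferenceError("no exported symbol named " + name);
    }
    return this.symbols.get(name);
  }
}

const scope = new MultiLanguageScope();

// "Foreign" part: a factory function that allocates the data structure.
scope.exportSymbol("createVector", (n) => new Array(n).fill(0));

// "Host" part: apart from the import, the code uses ordinary host operators.
const createVector = scope.importSymbol("createVector");
const vector = createVector(3);
vector[1] = 5;
```

Only the two lines that export and import `createVector` are specific to the multi-language setting; the remaining code is what a single-language version would contain.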

5.2 Extensibility

Many existing cross-language mechanisms cannot be ex-

tended to other languages. For example, FFIs are designed

for a set of two languages and it is hard to extend them to

include other languages. We can add new languages to the

TrufﬂeVM if they support the following:

Language-independent access: The LI_Host has to map an AST with language-specific accesses to an AST with language-independent accesses, i.e., an LI_Host has to define $T^{A} \rightarrow T^{A \cup N^{Msg}}$. If the semantics of a new language do not allow the mapping of certain operations to messages, language implementers can still provide an API to support an explicit foreign object access (see Section 3.2), which limits the implicit foreign object access but still guarantees good performance.

Resolve a language-independent access: An LI_Foreign has to define a mapping from an AST with language-independent accesses to an AST with foreign-language-specific accesses, $T^{A \cup N^{Msg}} \rightarrow T^{A \cup N^{B}}$. This mapping allows the language implementer to decide how other languages can access objects.

Multi-language scope: The LI has to provide infrastructure for the application programmer to export and import objects to and from the multi-language scope.

Implementing these requirements for an existing Truffle language

requires little effort: a single programmer was able to implement

the dynamic access for TruffleRuby within one week,

and we could add it to the TruffleVM.
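The first two requirements can be sketched as a host-side node that carries a language-independent Read message and is resolved on first execution by the language that owns the receiver. This is a simplified model: `ReadMessageNode`, `resolveRead`, and the two language objects are hypothetical, and the node re-resolves when it sees a receiver of another language instead of building a polymorphic cache.

```javascript
// Model only. Each language implementation (LI) knows how to resolve a
// language-independent Read message into its own access operation.
const jsLanguage = {
  name: "JS",
  // Resolved operation: dynamic property lookup.
  resolveRead: () => (obj, key) => obj.fields[key],
};
const cLanguage = {
  name: "C",
  // Resolved operation: struct member access through a fixed slot.
  resolveRead: () => (obj, key) => obj.slots[obj.layout.indexOf(key)],
};

let resolutions = 0; // counts how often message resolution runs

// Host-side AST node that stands for a language-independent Read message.
class ReadMessageNode {
  constructor() {
    this.resolved = null; // language-specific operation, once known
    this.language = null; // language the node was resolved for
  }
  execute(receiver, key) {
    if (this.language !== receiver.language) {
      // First execution (or a receiver of another language): resolve.
      resolutions++;
      this.resolved = receiver.language.resolveRead();
      this.language = receiver.language;
    }
    return this.resolved(receiver, key);
  }
}

const jsPoint = { language: jsLanguage, fields: { x: 1, y: 2 } };
const cPoint = { language: cLanguage, layout: ["x", "y"], slots: [3, 4] };

const readNode = new ReadMessageNode();
const first = readNode.execute(jsPoint, "x");  // resolves, then reads
const second = readNode.execute(jsPoint, "y"); // reuses the resolved operation
```

After the first execution the node runs the foreign language's own access operation directly, which is the property that lets a compiler treat the access like any other host operation.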

5.3 High Performance

We evaluated the TrufﬂeVM with a number of benchmarks

that show how multi-language applications perform com-

pared to single-language applications.

Benchmarks: For this evaluation we use benchmarks that heavily access objects and arrays. The benchmarks are taken from the SciMark benchmarks (http://math.nist.gov/scimark2/index.html) and from the Computer Language Benchmarks Game (http://benchmarksgame.alioth.debian.org/). They compute a Fast Fourier Transformation (FFT), a Jacobi successive over-relaxation (SOR), a Monte Carlo integration (MC), a sparse matrix multiplication (SM), a dense LU matrix factorization (LU), sort an array using a tree data structure (TS), generate and write random DNA sequences (Fasta), and solve the towers of Hanoi problem (Tower).

Experimental Setup: The benchmarks were executed on an Intel Core i7-4770 quad-core 3.4 GHz CPU running 64-bit Debian 7 (Linux 3.2.0-4-amd64) with 16 GB of memory. We base the TruffleVM on Graal revision bf586af6fa0c from the official OpenJDK Graal repository (http://openjdk.java.net/projects/graal/). In this evaluation we show the score for each benchmark and its configuration, which is the ratio of the execution count of the benchmark to the time needed (executions/second).

For this evaluation we are interested in peak performance

of long running applications. Hence, we executed every

benchmark 10 times with the same parameters after an ini-

tial warm-up of 50 iterations to arrive at a stable peak per-

formance and calculated the averages for each conﬁguration

using the arithmetic mean.
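The measurement procedure can be sketched as a small harness. This is not the harness used for the evaluation: `measureScore` and its option names are hypothetical, and the timer `process.hrtime.bigint()` assumes that the sketch runs on Node.js.

```javascript
// Hypothetical harness; the function and option names are not from the paper.
function measureScore(benchmark, { warmup = 50, runs = 10, executionsPerRun = 1 } = {}) {
  // Warm-up iterations let the runtime reach a stable peak performance.
  for (let i = 0; i < warmup; i++) benchmark();

  const scores = [];
  for (let r = 0; r < runs; r++) {
    const start = process.hrtime.bigint();
    for (let e = 0; e < executionsPerRun; e++) benchmark();
    const elapsedNs = Number(process.hrtime.bigint() - start);
    // Guard against a zero reading for extremely short runs.
    const seconds = Math.max(elapsedNs / 1e9, 1e-9);
    scores.push(executionsPerRun / seconds); // score in executions per second
  }

  // Average of the measured runs using the arithmetic mean.
  const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
  return { scores, mean };
}
```

With the default options the harness performs 50 warm-up executions followed by 10 measured runs, matching the numbers given in the text.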

Using foreign objects has no inﬂuence on compile time.

We compile ASTs where messages are already resolved, i.e.,

for the compiler there is no difference between a foreign and

a regular object access. Also, message resolution happens

at the ﬁrst execution and happens only once. Compared to

the single-language implementations, the warm-up time did

not change. A general evaluation of warm-up performance

of Trufﬂe LIs is out of scope for this work.

The error bars in the charts of this section show the standard deviation. The x-axis of the charts in Figures 7, 8, and 9 shows the different benchmarks; the y-axis shows the average scores (higher is better) of the benchmarks. Where we summarize across different benchmarks, we report a geometric mean [10].
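The summary statistics mentioned above can be written down directly. The helper names are hypothetical, and the use of the sample standard deviation (division by n - 1) is an assumption, since the text does not say which variant is reported.

```javascript
// Arithmetic mean of the repeated runs of one configuration.
function arithmeticMean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Sample standard deviation (assumed variant), used for error bars.
function standardDeviation(xs) {
  const m = arithmeticMean(xs);
  const variance = xs.reduce((a, x) => a + (x - m) ** 2, 0) / (xs.length - 1);
  return Math.sqrt(variance);
}

// Geometric mean, used to summarize normalized scores across benchmarks.
function geometricMean(xs) {
  const logSum = xs.reduce((a, x) => a + Math.log(x), 0);
  return Math.exp(logSum / xs.length);
}
```

A reason to prefer the geometric mean for normalized scores is that it is consistent under a change of baseline: the geometric mean of the reciprocal scores is the reciprocal of the geometric mean, which does not hold for the arithmetic mean.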

5.3.1 Results of single-language benchmarks

We compare the performance of the individual languages on

our benchmarks. The results in Figure 7 are normalized to

the C performance. This evaluation shows that JavaScript

code is on average 37% slower and Ruby code on average

66% slower than C code. We explain the differences as

follows:

C data accesses do not require runtime checks (such as

array bounds checks); instead, the memory is accessed directly.

This efﬁcient data access makes C the fastest language for

most benchmarks.

However, if a program allocates data in a frequently exe-

cuted part of the program, the managed languages (JavaScript

and Ruby) can outperform C. Allocations in TrufﬂeC (using

calloc) are more expensive than the instantiation of a new

object on the Java heap. TrufﬂeC does a native call to exe-

cute the calloc function of the underlying OS. TrufﬂeJS or

TrufﬂeRuby allocate a new object on the Java heap using se-

quential allocation in thread-local allocation buffers, which

explains why JavaScript and Ruby perform better than C on

Treesort. The Ruby semantics require that Ruby objects are

accessed via getter or setter methods. TrufﬂeRuby uses a

dispatch mechanism to access these methods. This dispatch

mechanism introduces additional runtime checks, which ex-

plains why Ruby is in general slower than JavaScript or C.

5.3.2 Results of multi-language benchmarks

We modiﬁed the benchmarks by extracting all array and

object allocations into factory functions. We then replaced

these factory functions with implementations in different

languages, making the benchmarks multi-language applica-

tions.

[Figure 7: bar chart. Legend: C (baseline), JS, Ruby. x-axis: benchmarks; y-axis: score normalized to C.]

Figure 7: Performance of individual languages on our benchmarks (normalized to C performance; higher is better).

[Figure 8: bar charts, one per panel. x-axis: benchmarks; y-axis: score relative to the baseline of the panel.
(a) Main part in C. Legend: C (baseline), C w. JS, C w. Ruby.
(b) Main part in JavaScript. Legend: JS w. C, JS (baseline), JS w. Ruby.
(c) Main part in Ruby. Legend: Ruby w. C, Ruby w. JS, Ruby (baseline).]

Figure 8: Performance of multi-language applications (higher is better).

We grouped our evaluations (Figure 8) such that their

main part was either written in C, in JavaScript, or in Ruby.

For each group we used the single-language implementation

as the baseline. We then replaced the factory functions by

implementations in a different language and compared the

multi-language conﬁgurations to the baseline of each group.

These multi-language applications heavily access foreign

arrays/objects and call foreign functions, which makes them

good candidates for our evaluation.

C Objects: C data structures are unsafe; access operations

are not checked at runtime, which makes them efﬁcient

in terms of performance. Hence, using C data structures

in JavaScript or Ruby applications improves the run-

time performance. However, an allocation with calloc

is more expensive than an allocation on the Java heap.

The Treesort benchmark allocates objects in its main

loop, hence, factory functions in JavaScript or Ruby per-

form better than factory functions written in C.

JS Objects: TrufﬂeJS uses a dynamic object implementa-

tion where each access involves run-time checks. Exam-

ples of such checks are array bounds checks to dynam-

ically grow JavaScript arrays or property access checks

to dynamically add properties to an object. These checks

are the reason why JavaScript objects perform worse than

C objects.

Ruby Objects: TrufﬂeRuby’s dispatch mechanism for ac-

cessing objects introduces a performance overhead com-

pared to JavaScript and C, which explains why Ruby ob-

jects are in general slower than JavaScript objects or C

objects.

Our evaluation shows that the performance of a multi-

language program depends on the performance of the indi-

vidual language parts. Using heavy-weight foreign data has a

negative impact on performance: Figures 8a and 8b show that

using heavy-weight Ruby objects in C or JavaScript causes a

slowdown of up to 7×. On average, using Ruby data reduces

the C performance by a factor of 2.5 and the JavaScript

performance by a factor of 1.8. On the other hand, using

efﬁcient foreign data has a positive effect on performance:

For example, Figures 8b and 8c show that using efficient C

data in JavaScript or Ruby can improve performance by up

to a factor of 4.22. On average, using C data improves the

JavaScript performance by a factor of 1.29 and the Ruby

performance by a factor of 1.85.
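The different ways of stating a result (a normalized score, "x% slower", and "a factor of k") are related by simple arithmetic. The sketch below spells this out; the function names are hypothetical, and reading "x% slower" as a score that is x% below the baseline is an assumption about the wording.

```javascript
// Scores are executions per second, so a higher score is better.
function normalize(score, baselineScore) {
  return score / baselineScore;
}

// "x% slower": the score is x% below the baseline (assumed reading).
function percentSlower(normalizedScore) {
  return (1 - normalizedScore) * 100;
}

// "Slowdown by a factor of k": baseline score divided by the measured score.
function slowdownFactor(normalizedScore) {
  return 1 / normalizedScore;
}

// "Improves by a factor of k": the normalized score itself when it exceeds 1.
function speedupFactor(normalizedScore) {
  return normalizedScore;
}
```

Under this reading, being 37% slower corresponds to a normalized score of 0.63, a slowdown by a factor of 2.5 corresponds to a normalized score of 0.4, and an improvement by a factor of 1.29 corresponds to a normalized score of 1.29.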

5.3.3 Removing language boundaries

Message resolution as part of the dynamic access allows

Trufﬂe’s dynamic compiler to apply its optimizations across

language boundaries (cross-language inlining). Message

resolution removes language boundaries by merging differ-

ent language-speciﬁc AST parts, which allows the compiler

to inline method calls even if the callee is a foreign function.

Widening the compilation span across different languages

enables the compiler to apply optimizations to a wider range

of code.

[Figure 9: bar chart. Legend: JS w. C, JS (baseline), JS w. C; no msg res. x-axis: benchmarks; y-axis: score relative to the JS baseline.]

Figure 9: Performance evaluation of multi-language applications without message resolution (higher is better).

Message resolution also allows Truffle's dynamic compiler to apply escape analysis and scalar replacement [31] to foreign objects. Consider a JavaScript program that allocates an object, which is used by the C part of an application. Message resolution ensures that Truffle's escape analysis can analyze the object access, independent of the host language. If the JavaScript object does not escape the compilation scope, scalar replacement can remove the allocation and replace all usages of the object with scalar values.
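The effect of scalar replacement can be illustrated by writing both versions by hand. The sketch below is not compiler output; the function names are hypothetical, and the second function only shows what the first one is equivalent to once the non-escaping object has been replaced by scalar values.

```javascript
// Before: the loop body allocates a temporary object for each element.
function sumOfSquaresWithObject(values) {
  let total = 0;
  for (const v of values) {
    const point = { x: v, y: v }; // never leaves the loop body
    total += point.x * point.y;
  }
  return total;
}

// After scalar replacement (written by hand for illustration): the fields of
// the non-escaping object become local variables and the allocation is gone.
function sumOfSquaresScalarReplaced(values) {
  let total = 0;
  for (const v of values) {
    const pointX = v;
    const pointY = v;
    total += pointX * pointY;
  }
  return total;
}
```

Both functions compute the same result. The transformation is only valid because `point` is not stored anywhere or passed to code that the compiler cannot see, which is why a language boundary that hides the access would prevent it.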

To demonstrate the performance improvement due to message resolution we temporarily disable it. When message resolution is disabled, the LI_Host still uses the dynamic access to access foreign objects, but the TruffleVM does not replace the messages in $t^{a,m} \in T^{A \cup N^{Msg}}$. Instead, it uses LI_Foreign to locally execute the access operation and return the result. We do not introduce additional complexity to the dynamic access. However, LI_Host has to treat LI_Foreign as a black box, which introduces a language boundary. In Figure 9 we show the performance of our JavaScript benchmarks using C data structures with and without message resolution. When message resolution is disabled, every data access as well as every function call crosses the language boundary, which makes the JavaScript and C configuration on average 6× slower than with message resolution. The dynamic compiler cannot perform optimizations across language boundaries, which explains the loss in performance. We expect similar results for the other configurations; however, we have not measured them yet because disabling message resolution for an LI requires a significant engineering effort.

6. Foreign Function Interfaces with Trufﬂe

We consider the TrufﬂeVM to be very different from FFIs.

Usually, an FFI composes a ﬁxed pair of languages, while

the TrufﬂeVM allows interoperability between arbitrary lan-

guages as long as they comply with the requirements de-

scribed in Section 2.

Of course, the dynamic access can also be used for im-

plementing FFIs, which we discussed in previous work:

C data access for JavaScript [15]: In this work we ac-

cessed C data from within JavaScript, which is simi-

lar to using Typed Arrays in JavaScript. The perfor-

mance evaluation shows that using C arrays from within

JavaScript is on average 19% faster than using Typed

Arrays [15]. Performance improves because a C data

access is more efﬁcient than accessing a Typed Array.

C extension support for Ruby [17]: C extensions allow

Ruby programmers to write parts of their application in

C. The Ruby engine MRI exposes a C extension API

that consists of functions to access Ruby objects from C.

In [17] we composed Ruby and C by implementing this

API in TrufﬂeC. TrufﬂeC substitutes every call to this

API with the dynamic access that accesses Ruby data.

We ran existing Ruby modules that used a C extension

for the computationally intense parts of the algorithm,

i.e., Ruby code calls C functions frequently, which in

turn heavily access Ruby data. On average, this setup is

over 3× faster than the MRI implementation of

Ruby running native C extensions.

7. Related Work

7.1 Common Language Runtime

The Microsoft Common Language Infrastructure (CLI) de-

scribes LIs that compile different languages to a com-

mon IR that is executed by the Common Language Run-

time (CLR) [6]. The CLR can execute conventional object-

oriented imperative languages and the functional language

F#. Languages running under the CLR are restricted to a

subset of features that can be mapped to the Common Lan-

guage Speciﬁcation (CLS) of the shared object model, i.e.,

a language that conforms to this CLS can exchange objects

with other conforming languages. CLS-compliant LIs gen-

erate metadata to describe user-deﬁned types. This metadata

contains enough information to enable cross-language oper-

ations and foreign object accesses.

The CLS cannot directly call low-level languages such as

C. Native calls are done via the annotation-based PInvoke

and the FFI-like IJW interface, which uses explicit marshal-

ing and a pinning API.

Microsoft’s approach is different from ours because it

forces CLS-compliant languages to use a predeﬁned repre-

sentation of data on the heap and to use a shared set of op-

erations to access it. The TrufﬂeVM, on the other hand, al-

lows every language to have its own representation of objects

and to deﬁne individual access operations. Object accesses

are done via the dynamic access, i.e., they are mapped to

language-independent messages which are dynamically re-

solved for different languages at runtime. We embed foreign-

language-speciﬁc accesses into the host IR, which creates

a homogeneous IR that the dynamic compiler can optimize

even across language boundaries.

We argue that the TrufﬂeVM is more ﬂexible because co-

operating languages are not limited to languages that com-

ply with a predeﬁned object model and a ﬁxed set of ac-

cess operations. The TrufﬂeVM allows the efﬁcient compo-

sition of managed and unmanaged languages on top of a sin-

gle runtime and the exchange of data as diverse as dynamic

JavaScript objects and unmanaged C structures.

7.2 Foreign Function Interfaces

Most modern VMs expose an FFI such as Java’s JNI [24],

Java’s Native Access, or Java’s Compiled Native Interface.

An FFI deﬁnes a speciﬁc interface between two languages:

a pair of languages can be composed by using an API that

allows accessing foreign objects. The result is rather inﬂex-

ible, i.e., the host language can only interact with a foreign

language by writing code that is speciﬁc to this pair of lan-

guages. Also, FFIs primarily allow integrating C/C++ code,

e.g., Ruby and C (native Ruby extension), R and C (native

R extensions), or Java and C. They hardly allow integrating

code written in a different language than C.

Wrapper generation tools (e.g. the tool Swig [4] or the

tool described by Reppy and Song [28]) use annotations

to generate FFI code from C/C++ interfaces, rather than

requiring users to write FFI code by hand. However, these

tools add a maintenance burden: programmers need to copy

API deﬁnitions and apply annotations outside the original

source code. A similar approach is described in [23], where

existing interfaces are transcribed into a new notation instead

of using annotations.

Compilation barriers at language boundaries have a neg-

ative impact on performance. To widen the compilation span

across multiple languages, Stepanian et al. [32] describe an

approach that allows inlining native functions into a Java application

using a JIT compiler. They show that inlining

substantially reduces the overhead of JNI calls.

Kell et al. [22] describe invisible VMs, which allow a

simple and low-overhead foreign function interfacing and

the direct use of native tools. They implement the Python

language and minimize the FFI overhead.

There are many other approaches that target a ﬁxed pair

of languages [5, 12, 21, 29, 37]. These approaches are tai-

lored towards interoperability between two speciﬁc lan-

guages and cannot be generalized for arbitrary languages

and VMs. In contrast to them, the TrufﬂeVM provides true

cross-language interoperability rather than just pairwise in-

teroperability: we can compose languages without writing

boilerplate code, without targeting a ﬁxed set of languages,

and without introducing a compilation barrier when crossing

language boundaries. The TrufﬂeVM requires LIs to imple-

ment the dynamic access in order to become interoperable

with other languages. Hence, the TrufﬂeVM can be easily

extended with new languages.

7.3 Multi-Language Source Code

Another approach to cross-language interoperability is to

compose languages at a very ﬁne granularity by allowing

the programmer to toggle between syntax and semantics of

languages on the source code level [2, 3, 18]. Jeannie [18] al-

lows toggling between C and Java, hence, the two languages

can be combined more directly than via an FFI. A similar ap-

proach is used by Barrett et al., in which the authors describe

a combination of Python and Prolog called Unipycation [2]

or Python and PHP called PyHyp [3]. Unipycation and PyHyp

compose languages by directly combining their interpreters.

We share the same goal as Barrett et al., namely

to retain the performance of different language parts when

composing them. However, the TruffleVM is not restricted to

a fixed set of languages.

Jeannie, Unipycation, and PyHyp allow a more fine-grained

language composition than the TruffleVM. However,

the code written by programmers consists of multiple

languages, which requires adaptation of source-level tools

(including debuggers).

7.4 Interface Deﬁnition Languages

Interface Definition Languages (IDLs) implement cross-language

interoperability via message-based inter-process

communication between separate runtimes. An IDL allows

the deﬁnition of interfaces that can be mapped to multi-

ple languages. An IDL interface is translated to stubs in

the host language and in the foreign language, which can

then be used for cross-language communication [26, 30, 34].

These per-language stubs marshal data to and from a com-

mon wire representation. However, this approach introduces

a marshalling and copying overhead as well as an addi-

tional maintenance burden (learning and using an IDL and

its toolchain).

Using IDLs in the context of single-process applications

has only been explored in limited ways [9, 35]. Also, these

approaches retain the marshalling overhead and cannot share

objects directly. The TrufﬂeVM avoids copying of objects at

language borders and rather uses the dynamic access. Mes-

sage resolution makes language boundaries transparent to

the dynamic compiler, which can optimize across language

boundaries. Furthermore, by allowing the programmer to

implicitly access foreign objects we make the mapping of

language operations to messages largely the task of language

implementers rather than the task of end programmers.

7.5 Multi-Language Semantics

The semantics of multi-language composability is a well-researched area [1, 11, 12, 25, 33, 37]; however, most of these approaches do not have an efficient implementation. Our work uses some inspiring ideas from existing approaches (such as like types from Wrigstad et al. [37], Section 4.1) and therefore stands to complement such efforts.

8. Conclusion

In this paper we presented a novel approach for compos-

ing language implementations, hosted by a shared runtime.

The TrufﬂeVM allows programmers to directly access for-

eign objects using the operators of the host language. Lan-

guage implementations access foreign objects via the dy-

namic access, which means that language implementations

use language-independent messages that are resolved at their

ﬁrst execution and transformed to efﬁcient language-speciﬁc

operations. We deﬁne a mapping from language-speciﬁc ob-

ject accesses to language-independent messages and vice

versa. This approach makes the mapping of language op-

erations to messages largely the task of the language im-

plementer rather than the task of the end programmer. The

dynamic access allows us to add new languages to our platform

without affecting existing languages.

The TruffleVM leads to excellent performance of multi-language

applications for two reasons: First, message

resolution replaces language-independent messages

with efﬁcient language-speciﬁc operations. Accessing for-

eign objects becomes as efﬁcient as accessing objects of the

host language. Second, the dynamic compiler can perform

optimizations across language borders because these borders

were removed by message resolution.

The work presented in this paper improves the simplicity,

the ﬂexibility, and the performance of multi-language appli-

cations. It is the basis for a wide variety of different areas of

future research on which we will focus. Topics are, for ex-

ample, multi-language concurrency and parallelism, cross-

language inheritance, or cross-language debuggers.

Acknowledgments

We thank all members of the Virtual Machine Research

Group at Oracle Labs and the Institute of System Software

at the Johannes Kepler University Linz for their valuable

feedback on this work and on this paper. We thank Daniele

Bonetta, Stefan Marr, and Christian Wirth for feedback on

this paper. We especially thank Stephen Kell for signiﬁcant

contributions to our literature survey.

Oracle, Java, and HotSpot are trademarks of Oracle and/or

its afﬁliates. Other names may be trademarks of their respec-

tive owners.

References

[1] M. Abadi, L. Cardelli, B. Pierce, and G. Plotkin. Dynamic Typing in a Statically-typed Language. In Proceedings of POPL, 1989. URL http://doi.acm.org/10.1145/75277.75296.

[2] E. Barrett, C. F. Bolz, and L. Tratt. Unipycation: A Case Study in Cross-language Tracing. In Proceedings of the 7th VMIL, 2013. URL http://doi.acm.org/10.1145/2542142.2542146.

[3] E. Barrett, L. Diekmann, and L. Tratt. Fine-grained Language Composition. CoRR, abs/1503.08623, 2015.

[4] D. M. Beazley et al. SWIG: An easy to use tool for integrating scripting languages with C and C++. In Proceedings of USENIX Tcl/Tk workshop, 1996.

[5] M. Blume. No-longer-foreign: Teaching an ML compiler to speak C natively. Electronic Notes in Theoretical Computer Science, 2001.

[6] D. Box and C. Sells. Essential .NET. The Common Language Runtime, 2002.

[7] S. Brunthaler. Efficient Interpretation Using Quickening. In Proceedings of DLS, 2010. URL http://doi.acm.org/10.1145/1869631.1869633.

[8] D. Chisnall. The Challenge of Cross-language Interoperability. Commun. ACM, 2013. URL http://doi.acm.org/10.1145/2534706.2534719.

[9] S. Finne, D. Leijen, E. Meijer, and S. Peyton Jones. Calling Hell from Heaven and Heaven from Hell. In Proceedings of ICFP, 1999. URL http://doi.acm.org/10.1145/317636.317790.

[10] P. J. Fleming and J. J. Wallace. How not to lie with statistics: The correct way to summarize benchmark results. 1986. URL http://doi.acm.org/10.1145/5666.5673.

[11] K. Gray. Safe Cross-Language Inheritance. In ECOOP, 2008. URL http://dx.doi.org/10.1007/978-3-540-70592-5_4.

[12] K. E. Gray, R. B. Findler, and M. Flatt. Fine-grained Interoperability Through Mirrors and Contracts. In Proceedings of OOPSLA, 2005. URL http://doi.acm.org/10.1145/1094811.1094830.

[13] M. Grimmer. High-performance language interoperability in multi-language runtimes. In Proceedings of SPLASH, 2014. URL http://doi.acm.org/10.1145/2660252.2660256.

[14] M. Grimmer, M. Rigger, R. Schatz, L. Stadler, and H. Mössenböck. TruffleC: Dynamic Execution of C on a Java Virtual Machine. In Proceedings of PPPJ, 2014. URL http://dx.doi.org/10.1145/2647508.2647528.

[15] M. Grimmer, T. Würthinger, A. Wöß, and H. Mössenböck. An Efficient Approach for Accessing C Data Structures from JavaScript. In Proceedings of ICOOOLPS, 2014. URL http://dx.doi.org/10.1145/2633301.2633302.

[16] M. Grimmer, R. Schatz, C. Seaton, T. Würthinger, and H. Mössenböck. Memory-safe Execution of C on a Java VM. In Proceedings of PLAS, 2015. URL http://doi.acm.org/10.1145/2786558.2786565.

[17] M. Grimmer, C. Seaton, T. Würthinger, and H. Mössenböck. Dynamically Composing Languages in a Modular Way: Supporting C Extensions for Dynamic Languages. In Proceedings of MODULARITY, 2015. URL http://doi.acm.org/10.1145/2724525.2728790.

[18] M. Hirzel and R. Grimm. Jeannie: Granting Java Native Interface Developers Their Wishes. In Proceedings of OOPSLA, 2007. URL http://doi.acm.org/10.1145/1297027.1297030.

[19] U. Hölzle, C. Chambers, and D. Ungar. Optimizing Dynamically-typed Object-oriented Languages with Polymorphic Inline Caches. In ECOOP, 1991. URL http://dx.doi.org/10.1007/BFb0057013.

[20] U. Hölzle, C. Chambers, and D. Ungar. Debugging Optimized Code with Dynamic Deoptimization. In Proceedings of PLDI, 1992. URL http://doi.acm.org/10.1145/143095.143114.

[21] S. P. Jones, T. Nordin, and A. Reid. Greencard: a foreign-language interface for Haskell. In Proc. Haskell Workshop, 1997.

[22] S. Kell and C. Irwin. Virtual machines should be invisible. In Proceedings of SPLASH, 2011. URL http://doi.acm.org/10.1145/2095050.2095099.

[23] F. Klock II. The layers of Larceny’s foreign function interface. In Scheme and Functional Programming Workshop. Citeseer, 2007.

[24] S. Liang. Java Native Interface: Programmer’s Guide and Reference. Boston, MA, USA, 1st edition, 1999.

[25] J. Matthews and R. B. Findler. Operational semantics for multi-language programs. In Proceedings of POPL, 2007. URL http://doi.acm.org/10.1145/1190216.1190220.

[26] Mozilla Developer Network. XPCOM Specification. https://developer.mozilla.org/en-US/docs/Mozilla/XPCOM, 2014.

[27] Oracle. OpenJDK: Graal project. http://openjdk.java.net/projects/graal/, 2013.

[28] J. Reppy and C. Song. Application-specific Foreign-interface Generation. In Proceedings of GPCE, 2006. URL http://doi.acm.org/10.1145/1173706.1173714.

[29] J. R. Rose and H. Muller. Integrating the Scheme and C languages. In Proceedings of LFP, 1992. URL http://doi.acm.org/10.1145/141471.141559.

[30] M. Slee, A. Agarwal, and M. Kwiatkowski. Thrift: Scalable cross-language services implementation. Facebook White Paper, 2007.

[31] L. Stadler, T. Würthinger, and H. Mössenböck. Partial Escape Analysis and Scalar Replacement for Java. In Proceedings of CGO, 2014. URL http://doi.acm.org/10.1145/2544137.2544157.

[32] L. Stepanian, A. D. Brown, A. Kielstra, G. Koblents, and K. Stoodley. Inlining Java Native Calls at Runtime. In Proceedings of VEE, 2005. URL http://doi.acm.org/10.1145/1064979.1064997.

[33] V. Trifonov and Z. Shao. Safe and principled language interoperation. 1999.

[34] N. Wang, D. C. Schmidt, and C. O’Ryan. Overview of the CORBA Component Model. In Component-Based Software Engineering, 2001.

[35] M. Wegiel and C. Krintz. Cross-language, Type-safe, and Transparent Object Sharing for Co-located Managed Runtimes. In Proceedings of OOPSLA, 2010. URL http://doi.acm.org/10.1145/1869459.1869479.

[36] A. Wöß, C. Wirth, D. Bonetta, C. Seaton, C. Humer, and H. Mössenböck. An Object Storage Model for the Truffle Language Implementation Framework. In Proceedings of PPPJ, 2014. URL http://dx.doi.org/10.1145/2647508.2647517.

[37] T. Wrigstad, F. Z. Nardelli, S. Lebresne, J. Östlund, and J. Vitek. Integrating typed and untyped code in a scripting language. In Proceedings of POPL, 2010. URL http://doi.acm.org/10.1145/1706299.1706343.

[38] T. Würthinger, A. Wöß, L. Stadler, G. Duboscq, D. Simon, and C. Wimmer. Self-optimizing AST interpreters. In Proceedings of DLS, 2012. URL http://doi.acm.org/10.1145/2384577.2384587.

[39] T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to rule them all. In Proceedings of ONWARD!, 2013.