cvm

0.5.0

Reference implementation of the CVM bytecode system.

CVM is an implementation of a Common Lisp evaluator, compiler, file compiler, and FASL loader, all written in portable Common Lisp. It compiles Lisp into a bytecode representation, which is interpreted by a simple virtual machine. The compiler is complete but simple, performing only easy optimizations. This keeps compilation fast and makes CVM well suited to code that does not need to run quickly, such as code that is evaluated just once.

Compilation and bytecode interpretation take place relative to specified environments. Clostrum can be used to build a first-class environment to execute code in, meaning CVM can be used to execute code in an isolated environment, like a sandbox.

Bytecode functions exist as real functions, so they can be called the same as any other functions. Bytecode functions and native functions can coexist without difficulty and call each other.

The FASL format is simple and portable. Lisp source files can be compiled in one implementation and then loaded into another, as long as the compilation and loading environments agree.

Quick start

Load the cvm ASDF system. There is a dependency on Clostrum, which is not available on Quicklisp, so you'll need to set that up yourself.

Before compiling or evaluating code, you need to set the client in order to inform Trucler how to get global definitions. On SBCL you can use the host environment as follows:

(setf cvm.machine:*client* (make-instance 'trucler-native-sbcl:client))

The procedure on CCL is analogous. Alternatively, you can use some other Trucler client and environment, such as Trucler's reference implementation.

Now you can compile code with cvm.compile:compile and disassemble it with cvm.machine:disassemble:

(defvar *f* (cvm.compile:compile '(lambda (x) (let ((y 5)) (print y) #'(lambda () (+ y x))))))
(cvm.machine:disassemble *f*) ; =>
---module---
  check-arg-count-= 1
  bind-required-args 1
  const '5
  set 1
  fdefinition 'PRINT
  ref 1
  call 1
  ref 1
  ref 0
  make-closure '#<CVM.MACHINE:BYTECODE-FUNCTION {100C2D803B}>
  pop
  return
; No value

To actually run code, first set up a stack for the VM with (cvm.vm-native:initialize-vm N), where N is how many objects the stack will be able to hold, say 20000. Then you can simply call the functions returned by compile:

(funcall *f* 5) ; =>
5
#<CVM.MACHINE:BYTECODE-CLOSURE>
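
As a recap, here is a minimal sketch of the whole sequence on SBCL, using only the operators shown above. It assumes the cvm system and its dependencies are already loaded; the variable *square* is a name made up for this sketch. It also illustrates that a bytecode function can be passed to a native function like any other function.

```lisp
;; Assumes SBCL with the cvm system already loaded.
(setf cvm.machine:*client* (make-instance 'trucler-native-sbcl:client))
(cvm.vm-native:initialize-vm 20000)

;; *square* is a hypothetical name used only in this sketch.
(defvar *square* (cvm.compile:compile '(lambda (x) (* x x))))

;; The bytecode function is handed to the host's MAPCAR like any function.
(mapcar *square* '(1 2 3)) ; => (1 4 9)
```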

You can get a running trace of the machine state by binding cvm.vm-native:*trace* to true around a call:

(let ((cvm.vm-native:*trace* t)) (funcall *f* 3)) ; =>

  check-arg-count-= 1 ; bp 1 sp 3 locals #(0 0) stack #()
  bind-required-args 1 ; bp 1 sp 3 locals #(0 0) stack #()
  const '5 ; bp 1 sp 3 locals #(3 0) stack #()
  set 1 ; bp 1 sp 4 locals #(3 0) stack #(5)
  fdefinition 'PRINT ; bp 1 sp 3 locals #(3 5) stack #()
  ref 1 ; bp 1 sp 4 locals #(3 5) stack #(#<FUNCTION PRINT>)
  call 1 ; bp 1 sp 5 locals #(3 5) stack #(#<FUNCTION PRINT> 5)

5
  ref 1 ; bp 1 sp 3 locals #(3 5) stack #()
  ref 0 ; bp 1 sp 4 locals #(3 5) stack #(5)
  make-closure '#<CVM.MACHINE:BYTECODE-FUNCTION NIL> ; bp 1 sp 5 locals #(3 5) stack #(5 3)
  pop ; bp 1 sp 4 locals #(3 5) stack #(#<CVM.MACHINE:BYTECODE-CLOSURE NIL>)
  return ; bp 1 sp 3 locals #(3 5) stack #()

#<CVM.MACHINE:BYTECODE-CLOSURE {100C2D80CB}>
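
To help read the locals and stack columns in the trace, the following toy interpreter may be useful. It is not CVM's implementation: it is a simplified model written in plain Common Lisp, with instruction names borrowed from the trace and semantics inferred from it. The names run-toy and *toy-program*, the list-based instruction encoding, and the "values register" are all assumptions made for illustration.

```lisp
;;; Toy model only, NOT CVM's implementation: a locals vector, a value
;;; stack, and a values register holding the results of the latest call.
(defun run-toy (code args &key (nlocals 2))
  (let ((locals (make-array nlocals :initial-element 0))
        (stack '())                     ; the list head is the top of stack
        (vals '()))                     ; hypothetical values register
    (dolist (instruction code)
      (destructuring-bind (op &optional operand) instruction
        (ecase op
          ;; Copy the first OPERAND arguments into local slots 0..OPERAND-1.
          (bind-required-args (replace locals args :end1 operand))
          ;; Push a constant onto the stack.
          (const (push operand stack))
          ;; Pop the top of the stack into local slot OPERAND.
          (set (setf (aref locals operand) (pop stack)))
          ;; Push the value held in local slot OPERAND.
          (ref (push (aref locals operand) stack))
          ;; Push the function named OPERAND.
          (fdefinition (push (fdefinition operand) stack))
          ;; Pop OPERAND arguments and the function beneath them, call it,
          ;; and keep the results in the values register.
          (call
           (let ((arguments '()))
             (dotimes (i operand) (push (pop stack) arguments))
             (setf vals (multiple-value-list (apply (pop stack) arguments)))))
          ;; Return whatever the values register holds.
          (return (return-from run-toy (values-list vals))))))))

;; A hand-written program roughly equivalent to
;; (lambda (x) (let ((y 5)) (+ y x))).
(defparameter *toy-program*
  '((bind-required-args 1)
    (const 5) (set 1)
    (fdefinition +) (ref 1) (ref 0) (call 2)
    (return)))

(run-toy *toy-program* '(3)) ; => 8
```

With the argument 3, the locals go from #(3 0) to #(3 5) after the set instruction, which matches the progression of the locals column in the trace above.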

First-class environments

The cvm/vm-cross subsystem allows CVM to be used for compiling and running Lisp code in arbitrary first-class environments, in concert with Clostrum. Here is an example:

;;; cvm-cross does not itself load a global environment implementation,
;;; since it can be used with any. Here we use clostrum-basic for that.
;;; We also need clostrum-trucler to be able to compile relative to
;;; a Clostrum environment.
(ql:quickload '(:clostrum-basic :clostrum-trucler))

;;; Set up the client to use cvm-cross, and initialize the VM.
(setf cvm.machine:*client* (make-instance 'cvm.vm-cross:client))
(cvm.vm-cross:initialize-vm 20000)

;;; Construct our environments.
(defvar *rte* (make-instance 'clostrum-basic:run-time-environment))

Subsystems

CVM defines a variety of subsystems that can be loaded independently. It's set up this way so that you can, for example, load one of the VM definitions and run bytecode compiled elsewhere, without needing to load any of the compiler's multitudinous dependencies.

Assuming the compilation and loader environments match (e.g. any function appearing in a macroexpansion is actually available in the load-time environment), there is no problem with compiling code with one VM and loading it with another. Using multiple VMs in the same image also works.
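
As a sketch of loading a single subsystem, the form below asks ASDF for only cvm/vm-cross, the one subsystem named explicitly above. It assumes ASDF can already locate CVM and Clostrum; which other systems this pulls in is not stated here.

```lisp
;; Load just the cvm/vm-cross subsystem rather than the whole cvm system.
;; Assumes CVM and Clostrum are visible to ASDF's source registry.
(asdf:load-system "cvm/vm-cross")
```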

Implementation status

Works. Except:

More TODO
