- Namespace restriction. Encapsulation. Capabilities. Whatever: the code should only be able to do what I say it can do.
- CPU timeslice restriction: Either I should be able to run some code for so long, pause it, and then resume it where it left off, or I should be able to run some code asynchronously, giving it 1/Nth CPU time.
- Memory allocation restrictions: this user's code should only be able to allocate up to N megabytes.
- Efficient at rather high scales: I should be able to run at least 500, maybe 1000 completely isolated functions concurrently. If all they're doing is sleep()ing, then this shouldn't put the host under heavy load.
- Some simple way to expose new, audited functions. This is pretty easy. If it can't run in a Python process and allow Python functions to be exposed to it, then I can at least run it in a separate process with method invocations going over a trivial RPC protocol.
- A decent set of secure built-in operations, but nothing really fancy. People are going to be interfacing with my API and nothing else, so they won't need a hugely rich core language API. It does need to be secure, though: large exponentiation should either be disallowed or interruptible, for example.
- Secure, ok? For example, don't talk to me about Python (as implemented by CPython). I have very little trust for any restricted execution system that's tacked onto an existing complex runtime.
In case you need some context for these requirements to make sense, think LambdaMOO or Second Life. Any user can upload code to run with their rights; it can manipulate the world through an audited "trusted" API. The code isn't allowed to interfere with the host system or other users' code. Another possible application is a wiki which allows people to upload code to be executed when a page is viewed.
Here's the list of things I've considered:
- PLT looks damned close with its sandbox library, but its execution limitation is in terms of wall-clock seconds, not CPU seconds, and it's not continuable.
- Monte could definitely do it some day, if its development remains steady.
- Lua... well, can Lua do it? I'd love it if someone would actually make a point on this.
- Haskell might be able to do it. Haskell can do anything, it seems. (I think I saw some code once which somehow implemented continuations in Haskell, for Haskell. Without being a Haskell runtime or compiler. What? I don't know.) The problem is it's inscrutable to me. I'd love it if someone commented about this.
- PyPy with the sandbox translation option. It's very cool, but I'm not sure it's usable at the 1000 node scale, because I can't see a way to run multiple isolated interpreters within one process. It also misses things like CPU timeslice restriction and memory allocation restriction, as far as I can tell.