Overview

In the last post, I set up a container with a pre-built version of Ruby, ready to debug.

In this post, I’ll talk about how to pass it a Ruby script and watch it execute.

Debug the VM while it executes a script

Continuing from last time, ensure you’re attached to the debug container and that you can set breakpoints and run the program.

We want to create a simple_test.rb file that we can pass into the interpreter, with the following contents:

puts "hello world"

In the launch.json file, add this filename to the args list:

"args": ["simple_test.rb"],

Now when we hit F5 to start debugging, we should see the filename show up in the argument list passed to main, as in the following screenshot:

[Screenshot: args passed into main]

By stepping through the interpreter, we can try to see how it compiles and runs the simple_test.rb script.

At a high level, most interpreters work by parsing a file, building an abstract syntax tree (AST), and then traversing that tree to generate instructions that the VM can execute.
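The same pipeline can be poked at from Ruby itself, which gives us something recognizable to look for while stepping through the C code. The following is a minimal sketch, assuming CRuby 2.6 or later: RubyVM::AbstractSyntaxTree and RubyVM::InstructionSequence are CRuby-specific introspection APIs, and the exact node types and instruction names they print vary between Ruby versions.

```ruby
# Sketch: parse -> AST -> instruction sequence -> execute, seen from Ruby.
# Assumes CRuby 2.6+; these RubyVM APIs are not available on other Rubies.
source = 'puts "hello world"'

# Step 1: parse the source text into an abstract syntax tree.
ast = RubyVM::AbstractSyntaxTree.parse(source)
puts "root node type: #{ast.type}"

# Step 2: compile the source into an instruction sequence (iseq).
iseq = RubyVM::InstructionSequence.compile(source)

# Print the human-readable bytecode listing for the compiled script.
puts iseq.disasm

# Step 3: hand the instruction sequence to the VM to execute.
iseq.eval
```

The disassembly listing is handy to keep next to the debugger, since the instruction names it prints are the ones the VM works through when it runs the script.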

With this in mind, we can start stepping through the code until we see something we recognize. For me, this started happening in the process_options(int, char **, ruby_cmdline_options_t *) function in ruby.c.

It seems that most of the work happens in here: loading the file, parsing it, building an AST, and then traversing it to create an instruction sequence (iseq) that the VM can execute. I’ve simplified the example and removed a lot of surrounding code:

static VALUE
process_options(int argc, char **argv, ruby_cmdline_options_t *opt)
{
    // ...
    parser = rb_parser_new();
    // ...

    // ...
    ast = load_file(parser, opt->script_name, f, 1, opt);
    // ...

    // ...
    iseq = rb_iseq_new_main(&ast->body, opt->script_name, path, vm_block_iseq(base_block));
    // ...

    // ...
    rb_exec_event_hook_script_compiled(ec, iseq, Qnil);
    // ...

    return (VALUE)iseq;
}

However, I still don’t see any output, so the execution of the script must be happening somewhere else. Stepping through some more, we get to the ruby_run_node(void *) function in eval.c:

int
ruby_run_node(void *n)
{
    rb_execution_context_t *ec = GET_EC();
    int status;
    if (!ruby_executable_node(n, &status)) {
        rb_ec_cleanup(ec, 0);
        return status;
    }
    ruby_init_stack((void *)&status);
    return rb_ec_cleanup(ec, rb_ec_exec_node(ec, n));
}

This seems to be our entry point into actually executing the compiled bytecode in the VM. Once the rb_ec_exec_node(ec, n) call runs, we get some output. After that, the VM cleans itself up, and the interpreter exits.
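The split between compiling and executing can also be observed from the Ruby side. The sketch below is an illustration under assumptions, not a mirror of the C code path: capture_stdout is a hypothetical helper written for this example, and RubyVM::InstructionSequence is CRuby-specific. It compiles the script first, then evaluates it, and records what each step writes to standard output.

```ruby
require "stringio"

# Hypothetical helper for this example: runs the block and returns
# everything it wrote to $stdout as a string.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  # Always restore the real stdout, even if the block raises.
  $stdout = original
end

source = 'puts "hello world"'

# Compiling only builds the instruction sequence; nothing runs yet.
iseq = nil
compile_output = capture_stdout do
  iseq = RubyVM::InstructionSequence.compile(source)
end

# Evaluating the instruction sequence is what actually executes the script.
run_output = capture_stdout { iseq.eval }

p compile_output  # => ""
p run_output      # => "hello world\n"
```

This matches what we saw in the debugger: building the iseq produces no output, and the text only appears once the VM executes it.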

We now have a pretty decent starting point for delving into the Ruby interpreter and getting a bottom-up understanding of how it actually works. However, my personal preference at this point is to start getting a top-down overview of the system so that I can reconcile the source code with a high-level mental model.

But I’ll leave that for future posts.