Give Your Agents a REPL

Coding agents are only as good as the feedback they can get from the app they’re working on. Most of the tooling I’ve seen so far is outside-in: run the tests, read the logs, grep the source, hit an HTTP endpoint, look at the response. That works, but it’s a long way from how I actually debug a running Rails app.

When I’m debugging, I drop into rails console, poke at User.find_by(...), inspect ActionController::Base.descendants, call a service object with a weird argument, and watch what blows up. The running process is the source of truth. It holds the loaded classes, the in-memory cache, the monkey-patches, the current ENV, and the middleware stack as it actually got assembled. Source files are just the recipe.

I think agents should be able to do the same thing, so I’m going to show how to expose an IRB-style eval loop over a local socket in development and give an agent a CLI to talk to it. It’s a few minutes of code and it changes what the agent can do.

Why not just `rails runner` or a shell tool?

A few reasons rails runner "puts User.count" isn’t the same thing:

Cold boot per call. Every invocation reloads the Rails environment. On a reasonably sized app that’s 3 to 10 seconds. An agent that iterates 20 times to understand a bug just spent one to three minutes waiting for boot.
No state carries over. The agent can’t set u = User.last and then poke at u across calls. Every question has to be self-contained, which pushes the agent toward writing bigger, more speculative one-shots instead of small probes.
Different process. The running dev server has the state that matters: the request you just made, the Solid Queue job mid-flight, the cache that got warmed. A fresh runner process sees none of it.
No introspection of live objects. ObjectSpace.each_object(ActiveRecord::Base) in a fresh process tells you nothing. In the live process, it lists the records still in memory, including whatever the last request loaded that the GC has not collected yet.

A socket-attached IRB inside the running server fixes all four.

What we’re building

A background thread in the dev server listens on a Unix socket. On connect, it runs a small eval loop whose input and output are that socket. The loop runs in the context of the app, so it sees the same loaded constants, the same DB connection pool, the same everything. It’s development-only: the initializer returns on its first line in any other environment, so nothing is started in production. The socket lives under tmp/ with 0600 permissions so only the owning user can connect.

The client side is even simpler: a CLI that opens the socket, sends an expression, reads the reply until the server closes the connection, and prints the output.

The server: `config/initializers/agent_console.rb`

# config/initializers/agent_console.rb
return unless Rails.env.development?
return if defined?(Rails::Console)        # skip inside `rails console` itself
return if $PROGRAM_NAME.end_with?("rake") # skip rake tasks

require "socket"

module AgentConsole
  SOCKET_PATH = Rails.root.join("tmp", "agent-console.sock").to_s

  def self.start!
    File.unlink(SOCKET_PATH) if File.exist?(SOCKET_PATH)
    server = UNIXServer.new(SOCKET_PATH)
    File.chmod(0o600, SOCKET_PATH)

    Thread.new do
      Thread.current.name = "agent-console"
      loop do
        client = server.accept
        Thread.new(client) { |c| handle(c) }
      end
    end

    at_exit { File.unlink(SOCKET_PATH) if File.exist?(SOCKET_PATH) }
    Rails.logger.info("[agent-console] listening on #{SOCKET_PATH}")
  end

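  def self.handle(client)
    # Each connection gets its own copy of the top-level binding, so locals
    # assigned by one caller do not leak into the next connection.
    session = TOPLEVEL_BINDING.dup
    client.puts "agent-console ready. ruby #{RUBY_VERSION}, rails #{Rails.version}"
    loop do
      client.write("\n>> ")
      line = client.gets
      break if line.nil? # the client closed its write side
      begin
        result = session.eval(line)
        client.puts("=> #{result.inspect}")
      rescue Exception => e # deliberate: SyntaxError is not a StandardError
        client.puts("!! #{e.class}: #{e.message}")
        client.puts(e.backtrace.first(5).join("\n"))
      end
    end
  ensure
    client.close rescue nil
  end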
end

AgentConsole.start!

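A couple of notes on the code above. It evaluates against the top-level binding, so the socket sees the same top-level constants the app does: User, Rails.application, ActiveRecord::Base, and so on. It also explicitly skips loading inside rails console and rake tasks. Any second process that loaded the initializer would unlink the socket file and take it over from the dev server, and you don’t want rake tasks to leak a listening socket. The two guards don’t catch everything launched through bin/rails (rails runner or a migration, for example), so a stricter check such as return unless defined?(Rails::Server) may be worth adding.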

For multi-line input (defining a method, a do…end block), the snippet above is deliberately minimal: one line per eval. A production-quality version should buffer until the parser says the expression is complete. Ripper.sexp(source) returning non-nil is a cheap way to detect that.
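
Here is a minimal sketch of that buffering. The helper names complete_expression? and read_expression, and the max_lines guard, are my own assumptions for illustration and are not part of the initializer above.

```ruby
require "ripper"
require "stringio"

# True when Ripper can parse the buffered source as a complete program.
def complete_expression?(source)
  !Ripper.sexp(source).nil?
end

# Hypothetical helper: read lines from io until they parse as one expression.
# Returns nil at EOF when nothing was buffered. max_lines is an assumed guard
# so a genuine syntax error cannot make the loop buffer forever.
def read_expression(io, max_lines: 200)
  buffer = +""
  while (line = io.gets)
    buffer << line
    return buffer if complete_expression?(buffer)
    break if buffer.lines.size >= max_lines
  end
  buffer.empty? ? nil : buffer
end

# StringIO stands in for the socket here.
input = StringIO.new("def double(x)\n  x * 2\nend\ndouble(21)\n")
first  = read_expression(input) # the whole method definition, three lines
second = read_expression(input) # the single-line call that follows it
```

In handle, the client.gets call would become read_expression(client). Source that never parses is held until EOF or the line limit and then handed to eval, which reports the SyntaxError through the existing rescue.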

The client: a tiny CLI the agent can call

#!/usr/bin/env ruby
# bin/agent-console
require "socket"

SOCK = File.expand_path("../tmp/agent-console.sock", __dir__)
expr = ARGV.join(" ")
abort "usage: bin/agent-console '<ruby expression>'" if expr.empty?

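UNIXSocket.open(SOCK) do |s|
  s.puts(expr)
  s.close_write   # EOF tells the server there is nothing more to eval
  output = s.read # banner, prompts, and the result, up to server close
  # Drop the banner line and the ">> " prompts so only the result is printed.
  lines = output.lines.drop(1).map { |l| l.sub(/\A>> /, "") }
  puts lines.join.strip
end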

Make it executable and the agent has a tool:

$ bin/agent-console 'User.count'
=> 1423

$ bin/agent-console 'ActionController::Base.descendants.map(&:name).sort.first(3)'
=> ["Admin::SessionsController", "Api::V1::BaseController", "ApplicationController"]

$ bin/agent-console 'Rails.application.config.middleware.map(&:inspect)'
=> ["ActionDispatch::HostAuthorization", "Rack::Sendfile", ...]

Each call is a fresh connection, so local variables don’t persist across CLI invocations. State inside the server process does though, which is useful: a Rails.cache.write("probe", 1) from one call is visible to the next.

If you want local variables to carry across calls too, have the server key sessions by a token the client passes on connect and keep a Binding per token. Then the agent can do:

$ bin/agent-console --session=debug-1 'u = User.find(42)'
$ bin/agent-console --session=debug-1 'u.orders.where(state: "pending").count'

The second call reuses u from the first.
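
The server side of that could look roughly like the sketch below. SessionStore is a hypothetical class, and how the token reaches the server (say, as the first line the client sends) is an assumption I’m leaving out. Each token gets its own copy of the top-level binding, so sessions can’t see each other’s locals.

```ruby
# Hypothetical session registry: one Binding per token, created on first use.
class SessionStore
  def initialize
    @bindings = {}
    @lock = Mutex.new # connections are handled on separate threads
  end

  # Returns the Binding for token, creating it on first use.
  def binding_for(token)
    @lock.synchronize { @bindings[token] ||= TOPLEVEL_BINDING.dup }
  end

  # Drops a session so its locals can be garbage collected.
  def forget(token)
    @lock.synchronize { @bindings.delete(token) }
  end
end

sessions = SessionStore.new
sessions.binding_for("debug-1").eval("u = 42")
sessions.binding_for("debug-1").eval("u + 1")               # sees u from the first call
sessions.binding_for("debug-2").local_variable_defined?(:u) # a different session does not
```

Bindings held this way live until the server restarts or forget is called, so anything the agent assigns stays reachable for that long.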

What this unlocks

Once the agent can open an IRB into the live process, a lot of workflows that were clunky become direct:

“What does this controller return for this input?” Call the action with a fake request object, inspect the response, no HTTP round-trip, no restart. Stacktraces come back in-band.
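“Why is this N+1 happening?” Swap ActiveRecord::Base.logger for a Logger that writes to a StringIO, run the query, read back the captured SQL, and restore the original logger. No editing files, no restart. The logger is shared by the whole process, so other requests log to it too while it’s swapped; keep the window short.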
“Is this monkey-patch actually loaded?” User.instance_method(:save).source_location answers it in one line.
“What’s in the cache right now?” Walk Rails.cache directly.
Reproducing a bug from a log line: paste the params into the console, re-invoke the service object, and see the real exception with the real object graph instead of a reconstruction.
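
To make the monkey-patch check concrete, here is a sketch in plain Ruby with no Rails involved. Account and AuditPatch are made-up stand-ins for a model and a patch applied with prepend; in the app you would ask the same questions of User.

```ruby
# Stand-in for a model; in the app this would be User.
class Account
  def save
    :original
  end
end

# Stand-in for a monkey-patch applied with prepend.
module AuditPatch
  def save
    super
    :patched
  end
end
Account.prepend(AuditPatch)

save = Account.instance_method(:save)
save.owner               # AuditPatch: the patch is what actually runs
save.source_location     # [file, line] where the patch is defined
Account.ancestors.take(2) # [AuditPatch, Account]: the original is still underneath
```

The owner and the file path in source_location are usually enough for the agent to tell whether it’s looking at the app’s own method or something a gem or initializer layered on top.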

The agent stops guessing from source and starts asking the process. That’s much closer to how I work, and the quality of the patches the agent produces goes up.

Safety

An IRB socket is remote code execution by design. The rules:

Development only. return unless Rails.env.development? at the top of the initializer. For extra paranoia, also assert that ENV["RAILS_ENV"] matches and that no DATABASE_URL pointing at prod is set.
Unix socket, not TCP. A UNIXServer is reachable only by processes on the same machine. A TCPServer on 127.0.0.1 is reachable by anything on the loopback, including other containers in some Docker setups. Use a Unix socket and chmod 0600 it.
Under tmp/, not a shared dir. /tmp is world-writable and shared across users. Rails.root/tmp is scoped to the checkout.
No production backdoor. Don’t add a “just in case” flag to enable this in staging. Staging runs real data. If you need a REPL against real data, that’s a bastion plus rails console with audit logging, not an always-on socket.
Audit if you want to. Logging every expression the agent evaluates to a file is cheap insurance and makes “what did the agent do” reviewable after the fact.
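
Two of those rules are easy to sketch in code. Everything named here is an assumption for illustration: safe_to_start?, build_audit_logger, the log line format, and especially PROD_HOST_PATTERN, which guesses that production database hosts contain "prod". The example writes to the system temp dir so it runs on its own; in the app the log would more likely live under Rails.root/log.

```ruby
require "logger"
require "time"
require "tmpdir"
require "uri"

# Assumption: production database hosts contain "prod". Adapt to your naming.
PROD_HOST_PATTERN = /prod/i

# Hypothetical guard for the top of the initializer. rails_env would be
# Rails.env.to_s in the app; env defaults to the real ENV.
def safe_to_start?(rails_env:, env: ENV)
  return false unless rails_env == "development"
  return false unless env.fetch("RAILS_ENV", "development") == "development"

  url = env["DATABASE_URL"].to_s
  return true if url.empty?

  !URI.parse(url).host.to_s.match?(PROD_HOST_PATTERN)
rescue URI::InvalidURIError
  false # an unparseable URL is treated as unsafe
end

# Hypothetical audit log: one timestamped line per evaluated expression.
def build_audit_logger(path)
  logger = Logger.new(path)
  logger.formatter = lambda do |_severity, time, _progname, msg|
    "#{time.getutc.iso8601} #{msg}\n"
  end
  logger
end

audit_path = File.join(Dir.tmpdir, "agent-console-audit-#{Process.pid}.log")
audit = build_audit_logger(audit_path)
audit.info("session=debug-1 expr=User.count")
audit.close
```

In handle, the audit call would sit right before the eval, so an expression that crashes or hangs is still on record.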

Why this matters

The bottleneck for coding agents right now is feedback quality, not token count or model size. An agent that can only read source files is reasoning about a static representation of a system that is, at runtime, substantially different. An agent that can open a REPL into the live process is reasoning about the thing itself.

Every language with a good REPL should make this trivial. Ruby does, because IRB is already there, Binding is first-class, and Unix sockets are in the standard library. Node has repl.start with custom input and output. Python has code.InteractiveConsole. Elixir has IEx.pry and remote shells built in.

Give the agent a socket into your app and see how much better it gets at working on it.
