Coding agents are only as good as the feedback they can get from the app they’re working on. Most of the tooling I’ve seen so far is outside-in: run the tests, read the logs, grep the source, hit an HTTP endpoint, look at the response. That works, but it’s a long way from how I actually debug a running Rails app.
When I’m debugging, I drop into rails console, poke at User.find_by(...), inspect ActionController::Base.descendants, call a service object with a weird argument, and watch what blows up. The running process is the source of truth. It holds the loaded classes, the in-memory cache, the monkey-patches, the current ENV, and the middleware stack as it actually got assembled. Source files are just the recipe.
I think agents should be able to do the same thing, so I’m going to show how to expose an IRB session over a local socket in development and give an agent a CLI to talk to it. It’s a few minutes of code and it changes what the agent can do.
Why not just rails runner or a shell tool?
A few reasons rails runner "puts User.count" isn’t the same thing:
- Cold boot per call. Every invocation reloads the Rails environment. On a reasonably sized app that’s 3 to 10 seconds. An agent that iterates 20 times to understand a bug has spent one to three minutes just waiting for boot.
- No state carries over. The agent can’t set u = User.last and then poke at u across calls. Every question has to be self-contained, which pushes the agent toward writing bigger, more speculative one-shots instead of small probes.
- Different process. The running dev server has the state that matters: the request you just made, the Solid Queue job mid-flight, the cache that got warmed. A fresh runner process sees none of it.
- No introspection of live objects. ObjectSpace.each_object(ActiveRecord::Base) in a fresh process tells you nothing. In the live process, it tells you what the last request allocated.
A socket-attached IRB inside the running server fixes all four.
What we’re building
A background thread in the dev server listens on a Unix socket. On connect, it runs a small eval loop whose input and output are that socket. The loop runs in the context of the app, so it sees the same loaded constants, the same DB connection pool, the same everything. It’s development-only, never loaded in production. The socket lives under tmp/ with 0600 permissions so only the owning user can connect.
The client side is even simpler: a CLI that opens the socket, sends an expression, reads the reply until the server closes the connection, and prints the output.
The server: config/initializers/agent_console.rb
# config/initializers/agent_console.rb
return unless Rails.env.development?
return if defined?(Rails::Console)        # skip inside `rails console` itself
return if $PROGRAM_NAME.end_with?("rake") # skip rake tasks

require "socket"

module AgentConsole
  SOCKET_PATH = Rails.root.join("tmp", "agent-console.sock").to_s

  def self.start!
    # Remove a stale socket file left behind by a previous run.
    File.unlink(SOCKET_PATH) if File.exist?(SOCKET_PATH)
    server = UNIXServer.new(SOCKET_PATH)
    File.chmod(0o600, SOCKET_PATH)

    # Accept loop: one handler thread per connection.
    Thread.new do
      Thread.current.name = "agent-console"
      loop do
        client = server.accept
        Thread.new(client) { |c| handle(c) }
      end
    end

    at_exit { File.unlink(SOCKET_PATH) if File.exist?(SOCKET_PATH) }
    Rails.logger.info("[agent-console] listening on #{SOCKET_PATH}")
  end

  def self.handle(client)
    client.puts "agent-console ready. ruby #{RUBY_VERSION}, rails #{Rails.version}"
    loop do
      client.write("\n>> ")
      line = client.gets
      break if line.nil? # client closed its side

      begin
        result = TOPLEVEL_BINDING.eval(line)
        client.puts("=> #{result.inspect}")
      rescue Exception => e # deliberately broad: SyntaxError and `exit` must not kill the server
        client.puts("!! #{e.class}: #{e.message}")
        client.puts(Array(e.backtrace).first(5).join("\n"))
      end
    end
  ensure
    client.close rescue nil
  end
end

AgentConsole.start!
A couple of notes on the code above. It evaluates in TOPLEVEL_BINDING, so the socket sees the same top-level constants the app does: User, Rails.application, ActiveRecord::Base, and so on. It also explicitly skips loading inside rails console and rake tasks. Any second process that ran the initializer would unlink the dev server’s socket and bind its own at the same path, and you don’t want rake tasks to leak a listening socket.
For multi-line input (defining a method, a do…end block), the snippet above is deliberately minimal: one line per eval. A production-quality version should buffer until the parser says the expression is complete. Ripper.sexp(source) returning non-nil is a cheap way to detect that.
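Here’s a rough sketch of what that buffering could look like. The names complete_expression? and buffered_eval_loop are mine, not part of the initializer above, and the loop takes its IO and Binding as arguments so it can be tried outside Rails. Treat it as one possible shape, not the finished thing.

```ruby
require "ripper"

# Hypothetical helper: true once `source` parses cleanly. Ripper.sexp returns
# nil when the parser reports an error, which includes input that simply
# isn't finished yet.
def complete_expression?(source)
  !Ripper.sexp(source).nil?
end

# Variant of the eval loop that accumulates lines until they parse.
# `io` is anything with gets/puts (the client socket); `bind` is a Binding.
def buffered_eval_loop(io, bind)
  buffer = +""
  while (line = io.gets)
    buffer << line
    next unless complete_expression?(buffer) # keep reading continuation lines

    begin
      io.puts("=> #{bind.eval(buffer).inspect}")
    rescue Exception => e # same deliberately broad rescue as the one-line loop
      io.puts("!! #{e.class}: #{e.message}")
    end
    buffer.clear
  end
end
```

One caveat: Ripper.sexp also returns nil for input that is genuinely malformed, so this sketch would keep buffering a bad line forever. A real version would want some way to reset the buffer, such as a blank line or a sentinel command.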
The client: a tiny CLI the agent can call
#!/usr/bin/env ruby
# bin/agent-console
require "socket"

SOCK   = File.expand_path("../tmp/agent-console.sock", __dir__)
PROMPT = "\n>> "

expr = ARGV.join(" ")
abort "usage: bin/agent-console '<ruby expression>'" if expr.empty?

UNIXSocket.open(SOCK) do |s|
  s.gets(PROMPT)                     # block until the banner and first prompt have arrived
  s.puts(expr)
  s.close_write                      # EOF tells the server there is no more input
  print s.read.delete_suffix(PROMPT) # drop the prompt the server writes before it sees EOF
end
Make it executable and the agent has a tool:
$ bin/agent-console 'User.count'
=> 1423
$ bin/agent-console 'ActionController::Base.descendants.map(&:name).sort.first(3)'
=> ["Admin::SessionsController", "Api::V1::BaseController", "ApplicationController"]
$ bin/agent-console 'Rails.application.config.middleware.map(&:inspect)'
=> ["ActionDispatch::HostAuthorization", "Rack::Sendfile", ...]
Each call is a fresh connection, but state inside the server process persists, which is useful: a Rails.cache.write("probe", 1) from one call is visible to the next. Local variables persist too, though more by accident than by design: every connection evaluates in the same TOPLEVEL_BINDING, so a u assigned in one call is visible to every later call from any client.
If you want named sessions whose local variables don’t leak into each other, have the server key sessions by a token the client passes on connect and keep a Binding per token. Then the agent can do:
$ bin/agent-console --session=debug-1 'u = User.find(42)'
$ bin/agent-console --session=debug-1 'u.orders.where(state: "pending").count'
The second call reuses u from the first.
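A minimal sketch of the server-side half of that idea, assuming a hypothetical AgentSessions module that isn’t in the initializer above. How the token travels over the socket (a first line on connect, say) is left open here; this only shows the Binding-per-token bookkeeping.

```ruby
# Hypothetical session store: one Binding per client-supplied token.
module AgentSessions
  BINDINGS = {}
  LOCK = Mutex.new

  # Returns the Binding for `token`, creating it on first use.
  def self.binding_for(token)
    LOCK.synchronize { BINDINGS[token] ||= fresh_binding }
  end

  # Each call runs the block in a new frame, so every session gets its own
  # local-variable scope while `self` stays the top-level `main` object.
  def self.fresh_binding
    TOPLEVEL_BINDING.receiver.instance_eval { binding }
  end

  # Evaluate `source` inside the session identified by `token`.
  def self.eval_in(token, source)
    binding_for(token).eval(source)
  end
end

AgentSessions.eval_in("debug-1", "u = 42")
AgentSessions.eval_in("debug-1", "u + 1") # sees u from the previous call
```

The handler would then call something like AgentSessions.eval_in(token, line) instead of TOPLEVEL_BINDING.eval(line). Sessions live until the server restarts, so a long-running dev server might want to expire old tokens.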
What this unlocks
Once the agent can open an IRB into the live process, a lot of workflows that were clunky become direct:
- “What does this controller return for this input?” Call the action with a fake request object and inspect the response. No HTTP round-trip, no restart. Stack traces come back in-band.
- “Why is this N+1 happening?” Set ActiveRecord::Base.logger = Logger.new(STDOUT) for the next query, run it, turn logging back off. No editing files, no restart, no polluting other requests.
- “Is this monkey-patch actually loaded?” User.instance_method(:save).source_location answers it in one line.
- “What’s in the cache right now?” Walk Rails.cache directly.
- Reproducing a bug from a log line: paste the params into the console, re-invoke the service object, and see the real exception with the real object graph instead of a reconstruction.
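One catch with the N+1 probe: STDOUT belongs to the server process, so those log lines land in the server’s terminal, not on the socket. A small helper that swaps in an in-memory logger gets the output back in-band. This is a sketch with a hypothetical capture_log helper; it assumes only that the target responds to logger and logger=, which ActiveRecord::Base does.

```ruby
require "logger"
require "stringio"

# Hypothetical helper: point `owner.logger` at an in-memory buffer for the
# duration of the block, restore the previous logger, return what was logged.
def capture_log(owner)
  previous = owner.logger
  buffer = StringIO.new
  owner.logger = Logger.new(buffer)
  begin
    yield
  ensure
    owner.logger = previous # always restore, even if the block raises
  end
  buffer.string
end

# In the console this might be called as (assumes a User model with orders):
#   capture_log(ActiveRecord::Base) { User.first.orders.to_a }
```

The swap is process-wide, so a request served by another thread while the block runs would log into the buffer too. For a quick probe on a quiet dev server that is usually fine.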
The agent stops guessing from source and starts asking the process. That’s much closer to how I work, and the quality of the patches the agent produces goes up.
Safety
An IRB socket is remote code execution by design. The rules:
- Development only. return unless Rails.env.development? at the top of the initializer. For extra paranoia, also assert that ENV["RAILS_ENV"] matches and that no DATABASE_URL pointing at prod is set.
- Unix socket, not TCP. A UNIXServer is reachable only by processes on the same machine. A TCPServer on 127.0.0.1 is reachable by anything on the loopback, including other containers in some Docker setups. Use a Unix socket and chmod 0600 it.
- Under tmp/, not a shared dir. /tmp is world-writable and shared across users. Rails.root/tmp is scoped to the checkout.
- No production backdoor. Don’t add a “just in case” flag to enable this in staging. Staging runs real data. If you need a REPL against real data, that’s a bastion plus rails console with audit logging, not an always-on socket.
- Audit if you want to. Logging every expression the agent evaluates to a file is cheap insurance and makes “what did the agent do” reviewable after the fact.
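The audit rule is only a few lines. Here’s a sketch, assuming a hypothetical AgentAudit module and a log path of your choosing (something under log/ would be the natural spot); the line format is my own invention.

```ruby
require "time"

# Hypothetical audit helper: append one timestamped line per evaluated
# expression. The file is created with 0600 so only the owner can read it.
module AgentAudit
  def self.record(path, expression, session: nil)
    File.open(path, "a", 0o600) do |f|
      f.flock(File::LOCK_EX) # handler threads may write concurrently
      f.puts("#{Time.now.utc.iso8601} session=#{session || '-'} #{expression.strip.inspect}")
    end
  end
end
```

The handler would call AgentAudit.record(path, line) just before evaluating each line. Using inspect keeps every entry on a single line even when the expression contains newlines.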
Why this matters
The bottleneck for coding agents right now is feedback quality, not token count or model size. An agent that can only read source files is reasoning about a static representation of a system that is, at runtime, substantially different. An agent that can open a REPL into the live process is reasoning about the thing itself.
Every language with a good REPL should make this trivial. Ruby does, because IRB is already there, Binding is first-class, and Unix sockets are in the standard library. Node has repl.start with custom input and output. Python has code.InteractiveConsole. Elixir has IEx.pry and remote shells built in.
Give the agent a socket into your app and see how much better it gets at working on it.