Recently, I was working on weaponizing a particular bug with a colleague. For reasons unfathomable to me, we’ve been implementing our exploit in Ruby. As part of this, we wrote a rinky-dink port scanner that attempts to find instances of the service on a given host, because the service has a propensity to pick random ports to listen on. This was necessary because one can only really fingerprint the service by sending it specific malformed messages and detecting whether a specific error message is returned. So color me surprised when my colleague messages me and tells me that they have the scanner working, but with the caveat that “when I run it in a Docker container it hangs on read and when I run it in a VM it works.” And so began a day of stark raving incredulity.
require 'socket'

def test_target(host, port)
  begin
    socket = TCPSocket.new host, port
    status = :open
  rescue Errno::ECONNREFUSED
    status = :closed
  rescue Errno::ETIMEDOUT
    status = :filtered
  end

  service = nil
  if status == :open
    socket.write "<REDACTED>"
    begin
      line = socket.read(1024 * 8)
      if line.include? "<REDACTED>"
        service = :thething
      end
    rescue IO::WaitReadable
      IO.select([socket])
      retry
    end
  end

  return [host, port, status, service]
end

puts test_target('localhost', '<port>').inspect
The above code is a simplified version of the already pretty simple scanner. So let’s start with the failing case. We’ll run the code in a Docker container and claim our error.
root@970ccbf76e01:/mnt# ruby scan.rb
Traceback (most recent call last):
	2: from scan.rb:30:in `<main>'
	1: from scan.rb:17:in `test_target'
scan.rb:17:in `read': Connection reset by peer (Errno::ECONNRESET)
So that checks out. Next, we should confirm that the server is actually replying to us. A quick tcpdump will confirm that the server gets our broken message, replies with its own, and then closes the connection.
root@970ccbf76e01:/mnt# tcpdump -i lo -n -X 'port <port>'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
10:12:50.611092 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [S], ...
...
10:12:50.611108 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [S.], ...
...
10:12:50.611120 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [.], ...
...
10:12:50.611241 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [P.], ...
... (<REDACTED>)
10:12:50.611653 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [P.], ...
... (<ERROR MESSAGE>)
10:12:50.611684 IP 127.0.0.1.58562 > 127.0.0.1.PORT: Flags [.], ...
...
10:12:50.611904 IP 127.0.0.1.PORT > 127.0.0.1.58562: Flags [R.], ...
... (RST)
At this point, I’m starting to get suspicious of potential eldritch ruby-isms creeping their way in from the depths of R’lyeh. We know for a fact that the remote side sent us some data and then closed the connection with a RST packet after the client ACK’d the data. So it should not have been possible for an error to have been received on the socket before the data. The Berkeley sockets API is hot garbage, but it’s not that bad. My first thought to attempt to shake out any ruby horrors is Ruby’s TracePoint API.
Using this API, we can put together a fairly simple tracer that executes around a block.
def start_trace
  trace = TracePoint.new(:call, :return, :line, :c_call, :c_return,
                         :b_call, :b_return, :raise,
                         :thread_begin, :thread_end) { |tp|
    if [:call, :c_call].include?(tp.event)
      p ["#{tp.path}:#{tp.lineno}", tp.event,
         "#{tp.defined_class}.#{tp.method_id}",
         tp.self.method(tp.method_id),
         tp.self.method(tp.method_id).parameters,
         tp.parameters]
    else
      p ["#{tp.path}:#{tp.lineno}", tp.event,
         "#{tp.defined_class}.#{tp.method_id}",
         tp.self.method(tp.method_id)]
    end
  }
  trace.enable
  yield
ensure
  trace.disable
end
With this, we can trace through the entirety of the socket.read call, and even hoist the tracer around whatever pieces of code we want.
start_trace { line = socket.read(1024 * 8) }
root@970ccbf76e01:/mnt# ruby scan-tpex.rb
["<internal:prelude>:137", :c_return, "TracePoint.__enable", #<Method: TracePoint#__enable>]
["<internal:prelude>:138", :return, "TracePoint.enable", #<Method: TracePoint#enable>]
["scan-tpex.rb:15", :line, "Object.start_trace", #<Method: main.start_trace>]
["scan-tpex.rb:34", :b_call, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:35", :line, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:35", :c_call, "IO.read", #<Method: TCPSocket(IO)#read>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_call, "SystemCallError.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_call, "Exception.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_return, "Exception.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>]
["scan-tpex.rb:35", :c_return, "SystemCallError.initialize", #<Method: Errno::ECONNRESET(SystemCallError)#initialize>]
["scan-tpex.rb:35", :c_call, "Exception.exception", #<Method: Errno::ECONNRESET(Exception)#exception>, [[:rest]], [[:rest]]]
["scan-tpex.rb:35", :c_return, "Exception.exception", #<Method: Errno::ECONNRESET(Exception)#exception>]
["scan-tpex.rb:35", :c_call, "Exception.backtrace", #<Method: Errno::ECONNRESET(Exception)#backtrace>, [], []]
["scan-tpex.rb:35", :c_return, "Exception.backtrace", #<Method: Errno::ECONNRESET(Exception)#backtrace>]
["scan-tpex.rb:35", :raise, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :c_return, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :b_return, "Object.test_target", #<Method: main.test_target>]
["scan-tpex.rb:17", :line, "Object.start_trace", #<Method: main.start_trace>]
["scan-tpex.rb:17", :c_call, "TracePoint.disable", #<Method: TracePoint#disable>, [], []]
scan-tpex.rb:35:in `read': Connection reset by peer (Errno::ECONNRESET)
	from scan-tpex.rb:35:in `block in test_target'
	from scan-tpex.rb:15:in `start_trace'
	from scan-tpex.rb:34:in `test_target'
	from scan-tpex.rb:50:in `<main>'
The first and last couple of lines are an artifact of the tracer hooking itself: once it’s actually enabled internally, it can see the inner workings of the TracePoint API, from the tail end of the trace.enable call to the beginning of the trace.disable call in start_trace. However, from our perspective, the only interesting lines are the following:
["scan-tpex.rb:35", :c_call, "IO.read", #<Method: TCPSocket(IO)#read>, [[:rest]], [[:rest]]]
:c_call, "SystemCallError.initialize"
to :c_return, "Exception.backtrace"
["scan-tpex.rb:35", :raise, "IO.read", #<Method: TCPSocket(IO)#read>]
["scan-tpex.rb:35", :c_return, "IO.read", #<Method: TCPSocket(IO)#read>]
From this we can tell that once the call heads into the native C function for IO.read, the TracePoint API can only see it start constructing and raising an exception, which then causes it to return from IO.read. This is not an ideal amount of granularity, but at least it gives us a place to start looking. An interesting thing to note here is that calling the read method of a TCPSocket object is a direct call to the read method of TCPSocket’s superclass, IO.
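This is easy to confirm from an irb session (a quick sanity check of my own, not part of the scanner):

require 'socket'

# TCPSocket defines no #read of its own; method lookup lands on IO
TCPSocket.instance_method(:read).owner
# => IO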
Therefore, the place to start looking is in the io.c source file within the main Ruby codebase, where we can find that the IO#read method is registered to the io_read C function. To save readers the trouble of attempting to make sense of this file, the call path we are looking for is io_read -> io_fread -> bufread_call -> io_bufread. This last function has a loop around rb_read_internal, which is a wrapper for internal_read_func, which directly calls read(2).
static long
io_bufread(char *ptr, long len, rb_io_t *fptr)
{
    long offset = 0;
    long n = len;
    long c;

    if (READ_DATA_PENDING(fptr) == 0) {
        while (n > 0) {
          again:
            c = rb_read_internal(fptr->fd, ptr+offset, n);
            if (c == 0) break;
            if (c < 0) {
                if (fptr_wait_readable(fptr))
                    goto again;
                return -1;
            }
            offset += c;
            if ((n -= c) <= 0) break;
        }
        return len - n;
    }

    while (n > 0) {
        c = read_buffered_data(ptr+offset, n, fptr);
        if (c > 0) {
            offset += c;
            if ((n -= c) <= 0) break;
        }
        rb_io_check_closed(fptr);
        if (io_fillbuf(fptr) < 0) {
            break;
        }
    }
    return len - n;
}
The important thing about this code is the first while loop that calls rb_read_internal. Essentially, io_bufread tries to read(2) from the socket until the requested amount has been read, or an EOF is returned. We can quickly confirm that this is the case with strace.
root@970ccbf76e01:/mnt# strace -f -s 200 ruby scan.rb
...
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 5
fcntl(5, F_GETFD) = 0x1 (flags FD_CLOEXEC)
connect(5, {sa_family=AF_INET, sin_port=htons(<port>), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
fstat(5, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
write(5, "<REDACTED>"..., 10) = 10
read(5, "<REDACTED>"..., 8192) = 695
read(5, 0x5587027e4297, 7497) = -1 ECONNRESET (Connection reset by peer)
write(2, "\33[1mTraceback\33[m (most recent call last):\n", 42Traceback (most recent call last):
) = 42
...
Two read(2) syscalls, with the second one performed immediately after the first and attempting to read the requested 1024*8 bytes minus the amount returned by the first syscall (8192 - 695 = 7497).
As the service being scanned closes the connection immediately after returning its response, any subsequent attempt to read on the socket to that service will fail. However, because of the quirky nature of io_bufread, which appears to collapse multiple read(2) calls into a single operation that tries to fill the entire buffer, the second read for the remaining data will fail, which causes the entire TCPSocket#read (really IO#read) operation to fail, even though data was, in fact, read.
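This failure mode is easy to reproduce on Linux without the redacted service. Below is a self-contained sketch of my own (none of these names come from the original scanner) that uses SO_LINGER with a zero timeout so that close(2) sends a RST, mimicking the scanned service:

require 'socket'

server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

Thread.new do
  client = server.accept
  client.write "hello"
  # linger enabled with a zero timeout makes close(2) emit a RST
  # instead of a FIN, just like the scanned service does
  client.setsockopt(Socket::SOL_SOCKET, Socket::SO_LINGER, [1, 0].pack('ii'))
  client.close
end

sock = TCPSocket.new('127.0.0.1', port)
sleep 0.1 # let the reply and the RST arrive

# raises Errno::ECONNRESET: the first internal read(2) returns "hello",
# the second hits the RST, and the data that was read is lost
sock.read(1024 * 8)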
Per the documentation for io_read, this is actually the expected behavior:

    If length is a positive integer, read tries to read length bytes without
    any conversion (binary mode). It returns nil if an EOF is encountered
    before anything can be read. Fewer than length bytes are returned if an
    EOF is encountered during the read.
So what is the right way to write this code? Much like with C, the answer is not to use the generic file IO functions (i.e. read(2) and write(2)) for sockets, but instead the socket-specific functions (i.e. send(2) and recv(2)). Much like everything else in Ruby, there is a bit of indirection around TCPSocket#recv (really BasicSocket#recv, which wraps recvfrom(2)), but it doesn’t exhibit the error-collapsing behavior of IO#read and will do what you would have expected from calling read on a TCPSocket.
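As a minimal sketch of the fix (keeping the redacted probe string and response check from the original scanner), the probing branch becomes:

socket.write "<REDACTED>"
# BasicSocket#recv maps to a single recvfrom(2), so it returns whatever
# data has already arrived (up to 8 KiB) instead of looping until the
# requested length is reached and tripping over the RST
line = socket.recv(1024 * 8)
service = :thething if line.include? "<REDACTED>"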
Truth be told, one of the very first things I tried when my colleague raised this issue was calling recv instead of read and using strace to discern between the two, which very quickly yielded the two read(2) calls. But from there, the journey to find why there was a difference consumed disproportionately more time, given that the TracePoint API ended up being completely useless, not least because it can’t really access method parameters, especially those of native functions. Because of that, most of this analysis was done with a mix of source code trawling and very hacky Frida-based hooking of the few functions within Ruby/libruby that actually have symbols.
// `mod` (the module for the ruby binary/libruby) and `sym_to_addr_map`
// (a map of symbol names to addresses) are built elsewhere in the script
read_interceptor = Interceptor.attach(mod.getExportByName('read'), {
    onEnter: function(args) {
        console.log("    read(" + args[0] + ", ..., " + args[2] + ")")
        console.log("    " + Thread.backtrace(this.context, Backtracer.FUZZY)
            .map(DebugSymbol.fromAddress).join('\n'));
    },
    onLeave: function(retval) {
        console.log("    read() -> " + retval + "\n")
    }
});

io_bufread_interceptor = Interceptor.attach(sym_to_addr_map['io_bufread'].address, {
    onEnter: function(args) {
        console.log("io_bufread(..., " + args[1] + ", ...)")
    },
    onLeave: function(retval) {
        console.log("io_bufread() -> " + retval)
    }
});

rb_io_wait_readable_interceptor = Interceptor.attach(sym_to_addr_map['rb_io_wait_readable'].address, {
    onEnter: function(args) {
        console.log("  rb_io_wait_readable(...)");
    },
    onLeave: function(retval) {
        console.log("  rb_io_wait_readable() -> " + retval)
    }
});

io_fillbuf_interceptor = Interceptor.attach(sym_to_addr_map['io_fillbuf.part.39'].address, {
    onEnter: function(args) {
        console.log("io_fillbuf(...)")
    },
    onLeave: function(retval) {
        console.log("io_fillbuf() -> " + retval + "\n")
    }
});
Along this adventure, I learned that, much like how glibc syscall stubs don’t have a one-to-one mapping to the syscalls you would expect them to call, Ruby’s POSIX syscall-like methods have a mind of their own. I also learned that Ruby’s TracePoint API doesn’t help much once you get to the C parts of (C)Ruby/MRI, and that vanilla syscall tracing remains an invaluable part of the Ruby debugger’s toolkit (though I’m sure one can get more throughput with eBPF-based syscall tracers). As it turns out, TracePoint, Ruby’s own tracing feature, can’t even trace Ruby’s underlying virtual machine opcodes (i.e. those printed by RubyVM::InstructionSequence.disasm).
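For reference, these are the opcodes you see when disassembling a snippet by hand:

# Prints the bytecode VM instructions for a trivial expression; these
# are the opcodes TracePoint has no events for
puts RubyVM::InstructionSequence.compile("a = 1 + 2").disasm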
Admittedly, the nature of this part of Ruby is implementation specific, but the lack of deeper granularity is still felt. In a later post, I’ll discuss the Ruby tracer I’ve been working on, which can trace (C)Ruby/MRI’s internal bytecode VM (and more!).