Software: TORQUE
Affected Versions: All 2.5 releases up to and including 2.5.13
CVE Reference: CVE-2014-0749
Authors: John Fitzpatrick
Severity: High Risk
Vendor: Adaptive Computing
Vendor Response: Incorporated fix into 2.5 development branch, no advisory
Description
A buffer overflow exists in older versions of TORQUE which a remote, unauthenticated attacker can exploit to execute code. This issue is exploitable in all versions of the 2.5 branch, up to and including 2.5.13.
Impact
Successful exploitation allows remote execution of code as root.
Cause
This issue exists as a result of a misplaced bounds check.
Solution
Although still widely used, TORQUE 2.5.x is now end of life and no longer supported by Adaptive. The latest release of the 2.5 branch (2.5.13) is vulnerable to this issue. HPCsec have submitted a fix to the 2.5-dev GitHub repository (which is still active) which resolves this issue. Updating to a version of 2.5-dev later than pull request #171 is strongly recommended.
Code changes in the 4.2.x branch significantly enhance the security posture of TORQUE and so HPCsec would recommend updating to this branch if possible.
Technical Details
TORQUE is a widely used resource manager. There are several branches: 2.x, 3.x and 4.x. The code is open source, but maintained by Adaptive Computing.
Operations such as job submission and querying of job queues within TORQUE are handled by the pbs_server component. The pbs_server did not perform sufficient bounds checking on messages sent to it, so a crafted message could trigger an overflow leading to arbitrary code execution. This could be achieved remotely and without authentication, regardless of whether the source IP address is permitted to submit jobs.
The vulnerability exists because the file disrsi_.c fails to ensure that the count value (which is read from the request packet) is less than dis_umaxd before it is used in a later memcpy(). As a result, a specially crafted request can smuggle through a count value which is then decremented and becomes the ct value in a memcpy() made from within tcp_gets():
memcpy((char *)str, tp->tdis_leadp, ct);
This failure to validate count gives an attacker control over the size of the memcpy() and, as a result, over the amount of data copied from the remainder of the packet. If this value is large, the memcpy() will overwrite the stack, which can be leveraged to gain control over the execution of the program.
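The ordering problem described above can be illustrated with a simplified, self-contained model. This is a sketch under stated assumptions, not the TORQUE source: SCRATCH_SIZE, MAX_DIGITS, struct copy_result and late_check_read() are hypothetical names and values, and the copy is only simulated so the example is safe to run. It shows that a length taken from the packet reaches the copy before it is compared against the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes chosen for illustration; not TORQUE's real values. */
#define SCRATCH_SIZE 64   /* size of the fixed stack buffer */
#define MAX_DIGITS   10   /* stands in for the dis_umaxd limit */

struct copy_result {
    size_t requested;   /* bytes the copy was asked to move (the "ct" value) */
    int    overflowed;  /* 1 if that exceeds the space left in the buffer */
    int    rejected;    /* 1 if the (late) bounds check rejected the count */
};

/*
 * Model of a reader whose bounds check comes after the copy.
 * `count` is attacker-controlled. The first digit is treated as already
 * consumed, so count - 1 bytes would be copied to scratch + 1.
 * The copy is simulated: nothing is written out of bounds.
 */
struct copy_result late_check_read(size_t count)
{
    struct copy_result r = {0, 0, 0};

    /* The model assumes the protocol limit fits inside the buffer. */
    assert(SCRATCH_SIZE > MAX_DIGITS);

    if (count > 1) {
        r.requested  = count - 1;                       /* memcpy length */
        r.overflowed = r.requested > SCRATCH_SIZE - 1;  /* past buffer end? */
    }

    /* The check exists, but it runs only after the copy has happened. */
    if (count > MAX_DIGITS)
        r.rejected = 1;

    return r;
}
```

Calling late_check_read(333) reports a requested copy of 332 bytes, matching the count and ct values in the backtrace below. With the assumed 64-byte buffer the copy is flagged as an overflow even though the late check would go on to reject the value.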
A backtrace showing the flow of execution is shown below:
#0 0x0000003dd4a88b9a in memcpy () from /lib64/libc.so.6
#1 0x00007fa0008cb65b in tcp_gets (fd=11, str=0x7fff8dfce741 '3' <repeats 26 times>, "Ab1Ab2Ab3",
    ct=332) at ../Libifl/tcp_dis.c:567
#2 0x00007fa0008be994 in disrsi_ (stream=11, negate=0x7fff8dfce93c, value=0x7fff8dfce938,
    count=333) at ../Libdis/disrsi_.c:187
#3 0x00007fa0008bea1a in disrsi_ (stream=11, negate=0x7fff8dfce93c, value=0x7fff8dfce938,
    count=<value optimized out>) at ../Libdis/disrsi_.c:216
#4 0x00007fa0008bea1a in disrsi_ (stream=11, negate=0x7fff8dfce93c, value=0x7fff8dfce938,
    count=<value optimized out>) at ../Libdis/disrsi_.c:216
#5 0x00007fa0008bdfab in disrfst (stream=11, achars=33, value=0x27f0b58 "")
    at ../Libdis/disrfst.c:125
#6 0x00007fa0008c13ba in decode_DIS_ReqHdr (sock=11, preq=0x27f0b20, proto_type=0x7fff8dfce9dc,
    proto_ver=0x7fff8dfce9d8) at ../Libifl/dec_ReqHdr.c:141
#7 0x0000000000409ba1 in dis_request_read (sfds=11, request=0x27f0b20) at dis_read.c:137
#8 0x000000000041cb6e in process_request (sfds=11) at process_request.c:355
#9 0x00007fa0008d4899 in wait_request (waittime=<value optimized out>, SState=0x72c258)
    at ../Libnet/net_server.c:508
#10 0x000000000041afeb in main_loop () at pbsd_main.c:1203
#11 0x000000000041bd15 in main (argc=<value optimized out>, argv=<value optimized out>)
    at pbsd_main.c:1760
TORQUE is required to run as root, so successful exploitation leads to code execution as root. HPCsec have created a proof of concept exploit for TORQUE running on 64-bit versions of CentOS which uses return-oriented programming (ROP) gadgets to execute arbitrary code as root. This vulnerability can be exploited reliably and remotely. The vulnerable path of execution can be reached without authentication, regardless of whether the attacker's system is in the acl_hosts list. It is expected that code execution within a 32-bit environment is simpler to achieve.
Whilst the necessary bounds check was found to be missing from all versions of TORQUE reviewed, this issue was only found to be directly exploitable in the 2.5 branch; code changes which have taken place in the 4.x branches prevent the condition required for exploitation from being reached. The vulnerability exists because the necessary check on the size of count occurs too late within the disrsi_.c file. The fix is, therefore, to check the size of count before the value is used. Replacing disrsi_.c with the patched 2.5-dev version (https://github.com/adaptivecomputing/torque/blob/2.5-dev/src/lib/Libdis/disrsi_.c) and recompiling should be sufficient to resolve this issue.
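The general shape of such a fix, validating an attacker-supplied length before any bytes are copied, can be sketched as follows. This is a minimal illustration and not the patch submitted to 2.5-dev: checked_read(), enum read_status and the MAX_DIGITS value are hypothetical, with MAX_DIGITS standing in for the dis_umaxd limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DIGITS 10  /* hypothetical stand-in for the dis_umaxd limit */

enum read_status { READ_OK = 0, READ_OVERFLOW = -1, READ_SHORT = -2 };

/*
 * Copy `count` bytes from `src` into `scratch`, validating the
 * attacker-supplied `count` before any bytes are moved.
 */
enum read_status checked_read(const unsigned char *src, size_t src_len,
                              size_t count,
                              unsigned char *scratch, size_t scratch_len)
{
    assert(src != NULL && scratch != NULL);

    /* Bounds check first: reject counts above the limit or the buffer size. */
    if (count > MAX_DIGITS || count > scratch_len)
        return READ_OVERFLOW;

    /* Never read more than the packet actually contains. */
    if (count > src_len)
        return READ_SHORT;

    memcpy(scratch, src, count);
    return READ_OK;
}
```

In this sketch a count of 333 is rejected with READ_OVERFLOW before memcpy() is reached, so the destination buffer is left untouched.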
Timeline
2012: Vulnerability identified
06/12/2012: Proof of concept developed
22/07/2013: Vulnerability reported to Adaptive Computing
20/08/2013: Requested update from Adaptive
22/08/2013: GitHub pull request submitted with a fix for the issue
21/01/2014: Further communication with Adaptive
13/05/2014: Advisory published