ServerlessGoat Example

Intro

Visualization of the Data-Flow in ServerlessGoat; generated by CodeShield

Like “traditional” web applications, Serverless applications are subject to security vulnerabilities. However, because of their distributed nature and new architecture, these vulnerabilities take different forms in Serverless applications. Indeed, the OWASP Foundation has published an interpretation of the OWASP Top-10 list of Web Application Security Risks for Serverless.

An example application that illustrates eight of the ten most critical Serverless application vulnerabilities is ServerlessGoat, OWASP's Serverless counterpart to its (in-)famous WebGoat application.

Overview

ServerlessGoat implements a service that converts MS-Word .doc files to text. The app accepts a user-supplied URL to an MS-Word document and processes it as follows:

  1. Download the document via the supplied URL using the curl OS command (line 3)
  2. Convert it to text using the Linux catdoc tool (line 3)
  3. Store the resulting text in an S3 bucket (lines 8–14)
  4. Respond with a URL to the generated text in the S3 bucket (lines 16–21)

Users can then access the plain-text conversion of the supplied MS-Word document via the returned S3 URL.

Main Logic of ServerlessGoat

Command Injection (SAS-01)

Vulnerabilities in the Lambda Function “FunctionConvert”.

CodeShield analyzes the project and reports the vulnerabilities it finds directly in its graph view. In the remainder of this article, we explain the command injection vulnerability in detail.

Looking at the implementation, one notices that the user-supplied document_url query string parameter is inserted unchecked into an OS command invocation (line 3), which enables a command injection attack.

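An attacker can craft a document_url query string parameter that causes the piped catdoc invocation to be ignored entirely and an arbitrary OS command to be executed instead, e.g. https://foobar; cat /var/task/index.js # . Here the ; ends the curl command, and the # turns the rest of the line, including the pipe to catdoc, into a shell comment.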

Even worse, the attacker can also read the output of the injected OS command, because that output is stored in the S3 bucket and the URL to it is returned in line 19.
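
To make the injection concrete, the following minimal sketch builds the command string in the same way as the vulnerable code and prints it for a benign URL and for the payload above. It only constructs the string and never executes it. The class name, the helper buildCommand, and the catdoc path /tmp/catdoc are assumptions made for illustration; they are not taken from the ServerlessGoat sources.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class CommandInjectionDemo {

    // Hypothetical location of the catdoc binary (assumption for this sketch).
    static final Path CATDOC = Paths.get("/tmp/catdoc");

    // Mirrors the vulnerable pattern: the URL is pasted into a shell
    // command line without any validation or quoting.
    static String buildCommand(String documentUrl) {
        return String.format(
            "curl --silent -L %s | /lib64/ld-linux-x86-64.so.2 %s -",
            documentUrl, CATDOC.toAbsolutePath().toString());
    }

    public static void main(String[] args) {
        // Benign input: curl output is piped into catdoc as intended.
        System.out.println(buildCommand("https://example.com/report.doc"));
        // Malicious input: ';' starts a second command, '#' comments out the pipe.
        System.out.println(buildCommand("https://foobar; cat /var/task/index.js #"));
    }
}
```

In the second printed line, everything after the # is a comment from the shell's point of view, so /bin/sh would run curl and then cat, and catdoc would never be invoked.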

Investigating the Vulnerability

On the left side, you see an overview of the code contained in the repository. On the right side, you see the AWS resources and the data flows between them. CodeShield generates this visualization based on the SAM/CloudFormation files and source code.

All nodes in the graph are linked to their source code, so you can click any node to jump to its definition.

Command Injection Vulnerability in “FunctionConvert”

To investigate the command injection vulnerability, click the report “OWASP SAS-10: Injection” on the component “FunctionConvert” in the graph. On the left side, the source code of the component opens immediately, with the vulnerable code highlighted.

String command = String.format(
    "curl --silent -L %s | /lib64/ld-linux-x86-64.so.2 %s -",
    documentUrl,
    catdocExecutable.toAbsolutePath().toString());
Process process = new ProcessBuilder("/bin/sh", "-c", command).start();

Also, CodeShield raises a warning explaining the vulnerability.
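
One possible mitigation, sketched here under assumptions and not taken from ServerlessGoat or CodeShield, is to validate the URL and pass it to curl as a separate argument so that no shell parses it. The class and method names below are hypothetical. The validation accepts only syntactically valid http or https URLs that have a host.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.List;

final class SafeDownloadCommand {

    // Accepts only well-formed absolute http(s) URLs with a host component.
    static boolean isAcceptableUrl(String candidate) {
        if (candidate == null) {
            return false;
        }
        try {
            URI uri = new URI(candidate); // rejects spaces and other illegal characters
            String scheme = uri.getScheme();
            boolean httpLike = "http".equalsIgnoreCase(scheme)
                || "https".equalsIgnoreCase(scheme);
            return httpLike && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds an argument vector instead of a shell command line; without a
    // shell, characters such as ';', '|' and '#' have no special meaning.
    static List<String> curlArguments(String documentUrl) {
        if (!isAcceptableUrl(documentUrl)) {
            throw new IllegalArgumentException("Rejected document URL");
        }
        return List.of("curl", "--silent", "-L", documentUrl);
    }

    public static void main(String[] args) {
        System.out.println(isAcceptableUrl("https://example.com/report.doc"));
        System.out.println(isAcceptableUrl("https://foobar; cat /var/task/index.js #"));
    }
}
```

The resulting list could be handed to new ProcessBuilder(List<String>), and the pipe to catdoc could be expressed with ProcessBuilder.startPipeline (Java 9 or later) instead of a shell pipe. Validation of this kind does not address other risks, such as requests to internal network addresses.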

Tracking the Data Flow – Root Cause & Consequence Analysis

CodeShield lets you track and assess whether the vulnerable code is reachable. To this end, it provides a Root Cause analysis and a Consequence analysis.

Command Injection Vulnerability in “FunctionConvert”

The Root Cause analysis helps you trace where the data flowing into the component originates. This helps to locate where a fix (e.g. input sanitization) would be appropriate and to determine whether the vulnerability can be exploited by an outside attacker at all. The Consequence analysis, in contrast, shows which components are potentially affected by the vulnerability. This helps to assess which components and data are at risk of being abused by an attacker.
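
Conceptually, the two analyses can be thought of as backward and forward reachability over the data-flow graph. The sketch below illustrates that idea only; it is not CodeShield's implementation. The node names ApiGateway and S3Bucket, and the class DataFlowGraph, are assumptions chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class DataFlowGraph {
    private final Map<String, Set<String>> successors = new LinkedHashMap<>();
    private final Map<String, Set<String>> predecessors = new LinkedHashMap<>();

    // Records that data flows from one component to another.
    void addFlow(String from, String to) {
        successors.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
        predecessors.computeIfAbsent(to, k -> new LinkedHashSet<>()).add(from);
    }

    // Root-cause direction: components whose data can reach the given one.
    Set<String> rootCauses(String component) {
        return reach(component, predecessors);
    }

    // Consequence direction: components that data from the given one can reach.
    Set<String> consequences(String component) {
        return reach(component, successors);
    }

    // Worklist traversal that collects every node reachable via the given edges.
    private static Set<String> reach(String start, Map<String, Set<String>> edges) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            for (String next : edges.getOrDefault(work.pop(), Set.of())) {
                if (seen.add(next)) {
                    work.push(next);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        DataFlowGraph graph = new DataFlowGraph();
        graph.addFlow("ApiGateway", "FunctionConvert"); // hypothetical node name
        graph.addFlow("FunctionConvert", "S3Bucket");   // hypothetical node name
        System.out.println(graph.rootCauses("FunctionConvert"));
        System.out.println(graph.consequences("FunctionConvert"));
    }
}
```

In this toy graph, the backward traversal from FunctionConvert reaches the externally exposed entry point, which suggests the vulnerability is reachable from outside, and the forward traversal reaches the storage component that an attacker could abuse.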

To trigger either analysis, click the “Analysis” button of the vulnerable component and select the desired analysis. CodeShield then automatically highlights the root or affected components, and the paths the data takes, by marking those components and the edges between them in red.

Last modified November 25, 2020