We’re excited to introduce our new open-source tool for facilitating continuous vulnerability scanning in the cloud: NessAWS. If you’ve read other posts on this blog, you’ll notice that we won’t be talking about the dark web, the latest PII dump, or anything related to threat intelligence or incident response. The infrastructure team at Terbium is responsible for maintaining the security posture for the company. It’s our job to focus on all of information security, including asset inventory, secure configuration, auditing, implementing least-privilege, etc.
While developing our information security strategy, we thought a lot about continuous monitoring – what it means to us, how to implement it in our environment, and how we would use the data to support risk-based decisions. This post describes our dynamic infrastructure environment, our approach to continuous vulnerability monitoring, why we needed a tool, and the design and implementation of NessAWS in brief.
The Challenges of Dynamic Infrastructure
NessAWS, as the name implies, is meant for orchestrating vulnerability scans against multiple Amazon Web Services (AWS) accounts and regions. Almost all our company’s infrastructure runs on AWS, including load balancers, API servers, internal tools, databases, etc. We have multiple AWS accounts for development, staging, and production environments. Did I also mention that Terbium operates a massive dark web crawler? We use EC2 Spot Fleets, which are discounted compute resources (from now on I’ll refer to these as “instances”) that fluctuate in price based on availability. This infrastructure type is perfect for our distributed microservices deployment, since losing an instance in the middle of processing will not interrupt our pipeline. Spot fleets also allow us to define a max price that we are willing to pay for instances so we don’t break the bank. For systems that require high availability, we use EC2 Auto Scaling Groups, which can automatically scale the number of instances up or down based on metrics and thresholds that we define.
While these services provide considerable benefit from an availability and cost perspective, they present challenges from a security perspective. The number of instances we have fluctuates drastically based on the amount of work in our pipeline and the market availability of instances. Instances are constantly spinning up and being terminated, and we want visibility into all of them. We regularly cycle through at least 2,000 unique instances per day, which isn’t a lot compared to big companies like Netflix, but is enough to require automation.
Selecting a Vulnerability Scanner
As the first half of the name implies, NessAWS is integrated with Tenable’s Nessus Vulnerability Scanner. The reasons for choosing to utilize Nessus are varied and biased: we had experience working with this scanner, we had a leftover Professional license from a previous project, and we knew there was a well-documented API that could be used to automate scans. We tried other solutions such as AWS Inspector, but we were unhappy with the results (this blog post lines up well with our experience).
Design and Development
With our decision to use Nessus Professional instead of an agent-based scanning approach, we knew there were a few constraints to consider. For example, you must submit a penetration testing request and wait for AWS to approve it before launching scans. AWS provides a web form that you must fill out while signed into your root account. The form requires you to detail the instance IDs you wish to scan, the IP addresses of the scan source and the destination instances, the timeframe in which you will conduct the scan, and other miscellaneous information. This was an additional process we wanted to automate, but we knew from experience that requests can take 24-72 hours to be approved and sometimes require a second submission. Thus, we needed a flexible way to submit the penetration test request and a separate process to perform the scans.
We also needed a method for explicitly defining which instances to scan. AWS supports associating arbitrary tags with instances. Tags allow you to categorize your resources in a key-value manner. Commonly used tags are name, description, purpose, or environment. We concluded that a tag for NessAWS would need to contain the name of the Nessus scan to execute on the instance and an additional value that could be specified to scan “classes” of instances. These classes could be subgroups of instances that we want to scan, such as webservers, databases, API servers, etc. In our implementation, we took this a step further by defining the risk level associated with each instance. We tag our high-risk instances (publicly exposed, high value) as “daily,” indicating that we want them to be scanned every day. We also have the tags “weekly” and “monthly” for moderate- and low-risk instances. AWS makes it easy to propagate these tags to ephemeral instances using built-in services like Launch Configurations. There are also other free and open-source tools to tag instances, so you should only need to configure these tags once and review them periodically.
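To make the tagging idea concrete, here is a minimal sketch of how tagged instances could be grouped by Nessus scan name for a given class. It is not NessAWS's actual code: the tag key `NessAWS`, the `scan name:class` value layout, and the function names are all assumptions made for illustration. The input is shaped like the `Reservations` list in an EC2 DescribeInstances response, so no AWS credentials are needed to try it.

```python
from collections import defaultdict

# Assumption: a hypothetical tag key and a "scan name:class" value layout.
TAG_KEY = "NessAWS"


def parse_scan_tag(value):
    """Split 'Basic Network Scan:daily' into (scan_name, scan_class)."""
    scan_name, sep, scan_class = value.partition(":")
    return scan_name.strip(), (scan_class.strip() if sep else None)


def group_instances_by_scan(reservations, scan_class, tag_key=TAG_KEY):
    """Return {scan_name: [instance_id, ...]} for instances in scan_class."""
    targets = defaultdict(list)
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            if tag_key not in tags:
                continue  # untagged instances are never scanned
            name, cls = parse_scan_tag(tags[tag_key])
            if cls == scan_class:
                targets[name].append(instance["InstanceId"])
    return dict(targets)


# Made-up sample data in the DescribeInstances "Reservations" shape.
sample_reservations = [
    {"Instances": [
        {"InstanceId": "i-0aaa",
         "Tags": [{"Key": "NessAWS", "Value": "Basic Network Scan:daily"}]},
        {"InstanceId": "i-0bbb",
         "Tags": [{"Key": "NessAWS", "Value": "Basic Network Scan:weekly"}]},
    ]},
    {"Instances": [
        {"InstanceId": "i-0ccc",
         "Tags": [{"Key": "Name", "Value": "crawler"}]},
        {"InstanceId": "i-0ddd",
         "Tags": [{"Key": "NessAWS", "Value": "Web App Scan:daily"}]},
    ]},
]

daily_targets = group_instances_by_scan(sample_reservations, "daily")
print(daily_targets)
```

Running a "daily" pass over the sample data picks up `i-0aaa` and `i-0ddd` under their respective scan names and ignores both the weekly instance and the untagged one.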
With these constraints considered, we developed a concept of operations (CONOPS) regarding how our scanning process would function. Here’s our original CONOPS diagram:
After defining requirements based on this CONOPS, we started designing and building. Without getting into all the gory details of our implementation, we learned a lot of lessons that influenced the final product. For example:
- Split the process into two distinct commands: one to submit the penetration test request and another to perform the scans through Nessus.
  - We originally planned to build a daemon that would run continuously, keeping track of penetration test requests and launching scans based on a pre-defined cadence. We felt this was overkill and more difficult to automate.
  - We believe the point-and-shoot design of this implementation is easier to manage. To automate running NessAWS periodically, we decided to use tried-and-true solutions like cron.
  - Splitting also ensured the safety of scans. Only instances that were included in the request would be scanned, and we could check the current date against the requested dates to ensure compliance with the original request.
- Don’t over-automate.
  - As you can see, we also planned to have a process to update Nessus plugins before running scans. We ended up moving away from this, as it adds unnecessary complexity and there may be cases where we want to run scans multiple times without updating plugins.
  - We also stayed away from trying to implement or verify the networking needed for Nessus scans to complete (such as VPC peering, opening security groups, etc.).
- Send the penetration test request through email.
  - It isn’t well documented, but you can submit a penetration test request through email to firstname.lastname@example.org. The email template is slightly different from the web form, much easier to automate, and doesn’t require logging in to your AWS root account.
  - We also learned after talking to AWS that you can submit a request with “all instances in our VPC” under the instance IDs you wish to scan. Also, the longest period a single request can cover is 3 months. Thus, you may choose to manually submit four penetration testing requests per year instead of using NessAWS to submit these requests.
- Be flexible with configuration options.
  - When we decided that we wanted to open-source this tool, we knew that other cloud environments would not be the same as ours. Thus, we tried to make as much of the tool configurable as possible, even down to the tag key that is used to detect which Nessus scan to run.
  - We also began to realize that our configuration file was getting large and unwieldy, so we tried to provide as many sensible defaults as possible.
- Don’t build a SIEM.
  - Similar to the point above, we understood that every company’s method of ingesting and working with vulnerability data is different. We originally planned for a single Excel spreadsheet as the only output option, but even for us this turned out to be unusable. We elected to narrow our focus and not build an all-in-one vulnerability management solution. Thus, we decided to keep the Excel option for those who will find it useful, but also to allow either no output or the raw CSV files from Nessus in case you plan on ingesting the results into your SIEM of choice.
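The safety check from the first lesson (only scan instances named in the request, and only within the requested dates) can be sketched roughly as follows. This is an illustration, not the tool's real implementation: the `PenTestRequest` record, its field names, and the helper functions are hypothetical, and it assumes the request listed explicit instance IDs rather than “all instances in our VPC”.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PenTestRequest:
    """Hypothetical record of a submitted request (illustrative fields)."""
    instance_ids: frozenset
    start: date
    end: date


def may_scan(instance_id, request, today):
    """Allow a scan only for a requested instance inside the requested window."""
    in_request = instance_id in request.instance_ids
    in_window = request.start <= today <= request.end
    return in_request and in_window


def filter_targets(candidates, request, today):
    """Keep only the candidate instance IDs that are safe to scan today."""
    return [i for i in candidates if may_scan(i, request, today)]


# Made-up example: a request covering two instances for three months.
request = PenTestRequest(
    instance_ids=frozenset({"i-0aaa", "i-0bbb"}),
    start=date(2018, 1, 1),
    end=date(2018, 3, 31),
)

allowed = filter_targets(["i-0aaa", "i-0zzz"], request, date(2018, 2, 15))
print(allowed)  # ['i-0aaa']
```

Because the check is a pure function of the stored request and the current date, it fits the point-and-shoot model: a cron-driven scan command can run it at startup and simply skip anything that falls outside the request.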
We feel that we have successfully implemented a solution for managing and launching vulnerability scans in our cloud environment. Our hope is that if you’re on AWS and use Nessus Professional, you’ll find it useful too. I’d also like to give a shout-out to some other GitHub projects we found while researching this problem that provided us with inspiration:
As for how we manage all of our vulnerability data, we’ll have to save that for another blog post.