Serverless Threat Modelling Part 2 🚀
What are the most common threats I typically see during Serverless Threat Modelling, and how can you bake mitigations for them into your future solution designs as standard? In Part 3 we will look at advanced Serverless security threats with an expert guest blogger!
In the first article we covered what Serverless Threat Modelling is and why you would want to use it, following a fictitious company called ‘LeeJames HR’ to show how we would run through the process using the STRIDE model, and what the tangible benefits were.
“The new era of Serverless on AWS, and how quickly we can chain services together into complex architectures to meet customer needs, also has the added downside of increasing the overall threat landscape compared to more traditional architectures.” - Lee Gilmore
🔵 What is Serverless Threat Modelling?
🔵 Let’s look through the STRIDE model.
🔵 How to approach Serverless Threat Modelling?
🔵 What threats were detected, and how did it affect the architecture?
So what are the usual culprits? 😈
Let’s have a look at some of the usual things I see pop up in Serverless Threat Modelling sessions as standard.
Tampering
How hard is it for an attacker to modify the data they submit to your system? Can they break a trust boundary and modify the code which runs as part of your system?
👿 Event Data Injection
Validating event data is something that I rarely see in teams, but for me it is imperative. With the increased use of EDA services like Amazon EventBridge, how do you know the event data you are consuming has not been compromised in decoupled enterprise systems? For me personally, teams should be validating any data which crosses service boundaries.
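As a rough sketch of what validating at the boundary could look like, the consumer below checks the `source` and the shape of `detail` before doing any work. The event contract, the field names and the producer name are all assumptions made up for illustration.

```python
from typing import Any

# Hypothetical contract for an "EmployeeCreated" event. The field names are
# illustrative assumptions and not taken from a real system.
REQUIRED_DETAIL_FIELDS: dict[str, type] = {
    "employeeId": str,
    "email": str,
    "salary": int,
}
TRUSTED_SOURCES = {"leejames.hr.onboarding"}  # assumed producer name


def validate_event(event: dict[str, Any]) -> list[str]:
    """Return a list of validation errors; an empty list means the event is accepted."""
    errors: list[str] = []
    if event.get("source") not in TRUSTED_SOURCES:
        errors.append(f"untrusted source: {event.get('source')!r}")
    detail = event.get("detail")
    if not isinstance(detail, dict):
        return errors + ["detail must be an object"]
    for field, expected in REQUIRED_DETAIL_FIELDS.items():
        value = detail.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            errors.append(f"detail.{field} must be {expected.__name__}")
    unexpected = set(detail) - set(REQUIRED_DETAIL_FIELDS)
    if unexpected:
        errors.append(f"unexpected fields: {sorted(unexpected)}")
    return errors
```

Rejecting unexpected fields is a deliberate choice here: it stops a compromised producer from smuggling extra properties (say an `isAdmin` flag) through to code that might trust them.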
👿 API Validation
Many development teams are not aware that we can use JSON schema validation at the API Gateway level, and rely solely on validation further down the service execution flow. Adding basic validation at the top level can be an extra layer of security.
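To make this concrete, here is a minimal sketch of a request model expressed as JSON Schema draft-04, which is the dialect API Gateway models use as far as I am aware. The "create employee" payload, its field names and the length limits are assumptions for illustration only.

```python
import json

# JSON Schema (draft-04) for a hypothetical "create employee" request body.
# Field names and limits are illustrative assumptions.
CREATE_EMPLOYEE_MODEL = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "CreateEmployeeRequest",
    "type": "object",
    "required": ["firstName", "lastName", "email"],
    "additionalProperties": False,  # reject fields the API does not expect
    "properties": {
        "firstName": {"type": "string", "minLength": 1, "maxLength": 100},
        "lastName": {"type": "string", "minLength": 1, "maxLength": 100},
        "email": {"type": "string", "maxLength": 254},
    },
}


def model_as_json() -> str:
    """Serialise the model so it can be attached to the API definition."""
    return json.dumps(CREATE_EMPLOYEE_MODEL, indent=2)
```

A model like this only rejects malformed requests early; the Lambda behind it should still validate its input, since the gateway check is an extra layer rather than a replacement.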
👿 MFA Delete on S3 buckets
If we have an S3 bucket storing files which are critical to your company running, for example signed documents which can’t be recreated, then we need to enable MFA Delete on the bucket to protect them.
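For reference, MFA Delete is part of the bucket versioning configuration. The helper below is a small sketch that only builds the arguments for the S3 `PutBucketVersioning` call; the bucket name and MFA device values are placeholders, and to my knowledge the call itself has to be made with the bucket owner's root credentials.

```python
def mfa_delete_request(bucket: str, mfa_serial: str, mfa_code: str) -> dict:
    """Build the arguments for the S3 PutBucketVersioning call that enables MFA Delete."""
    return {
        "Bucket": bucket,
        # S3 expects the device serial (or ARN) and the current code, space separated.
        "MFA": f"{mfa_serial} {mfa_code}",
        # MFA Delete only works on a versioned bucket, so both are enabled together.
        "VersioningConfiguration": {"Status": "Enabled", "MFADelete": "Enabled"},
    }


# Assumed usage with boto3 (not executed here):
#   boto3.client("s3").put_bucket_versioning(**mfa_delete_request(...))
```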
👿 No planned DR!
I often run sessions with teams where disaster recovery has not been thought about at all from the outset! An example would be a DynamoDB table which doesn’t have any backups or PITR (point-in-time recovery). Extrapolate this across many AWS services and we can see how it can be totally missed. (See TACTICAL DD(R))
Disaster recovery options in the cloud
Disaster recovery strategies available to you within AWS can be broadly categorized into four approaches, ranging from…
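One cheap way to catch the DynamoDB example is a periodic audit. The sketch below assumes you have already collected the `DescribeContinuousBackups` response for each table (for example with boto3) and simply reports the tables where PITR is not enabled. The table names in the usage are made up.

```python
def tables_without_pitr(descriptions: dict[str, dict]) -> list[str]:
    """Given DescribeContinuousBackups responses keyed by table name,
    return the tables where point-in-time recovery is not enabled."""
    missing = []
    for table, response in descriptions.items():
        status = (
            response.get("ContinuousBackupsDescription", {})
            .get("PointInTimeRecoveryDescription", {})
            .get("PointInTimeRecoveryStatus", "DISABLED")
        )
        if status != "ENABLED":
            missing.append(table)
    return sorted(missing)
```

For example:

```python
responses = {
    "payslips": {"ContinuousBackupsDescription": {
        "PointInTimeRecoveryDescription": {"PointInTimeRecoveryStatus": "ENABLED"}}},
    "employees": {"ContinuousBackupsDescription": {
        "PointInTimeRecoveryDescription": {"PointInTimeRecoveryStatus": "DISABLED"}}},
}
print(tables_without_pitr(responses))  # ['employees']
```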
👿 Monolambdas
When we go down the route of monolambdas we increase the threat landscape, as the Lambda now needs many more privileges on average because it performs more tasks and interacts with more services (compared to splitting our functionality into many single-purpose Lambdas). This typically means that a compromised Lambda (perhaps via an attached API Gateway) can affect more downstream services through its broader IAM permissions.
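A quick way to spot the broad permissions that monolambdas tend to accumulate is to scan their IAM policy documents for wildcards. This is only a sketch of the idea, and the policy in the usage example is invented.

```python
def overly_broad_statements(policy: dict) -> list[dict]:
    """Return the Allow statements that use a wildcard action or resource."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    findings = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if wildcard_action or "*" in resources:
            findings.append(statement)
    return findings
```

For example, a policy that grants `dynamodb:*` on every resource is flagged, while one scoped to `dynamodb:GetItem` on a single table ARN is not:

```python
policy = {"Statement": [
    {"Effect": "Allow", "Action": "dynamodb:*", "Resource": "*"},
    {"Effect": "Allow", "Action": ["dynamodb:GetItem"],
     "Resource": "arn:aws:dynamodb:eu-west-1:123456789012:table/payslips"},
]}
print(len(overly_broad_statements(policy)))  # 1
```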
👿 Email infiltration
If we are sending out emails directly to end customers via batching from an SQS queue, then what is stopping an internal bad actor (an annoyed employee) from amending messages on the queue? In doing so we run the risk of reputational damage to your company from spurious emails, so security controls around this area are key.
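One possible control, alongside tight IAM permissions on who can send to the queue, is for the producer to sign each message so the consumer can detect tampering before anything is emailed. The sketch below uses an HMAC with a shared key. Where that key lives (ideally a secrets store only the producer and consumer can read) and the message shape are assumptions.

```python
import hashlib
import hmac
import json


def sign_message(body: dict, key: bytes) -> dict:
    """Serialise the body deterministically and attach an HMAC-SHA256 signature."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_message(message: dict, key: bytes) -> dict:
    """Return the original body, or raise if the payload was altered in transit."""
    expected = hmac.new(key, message["payload"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, message["signature"]):
        raise ValueError("message signature mismatch; refusing to send email")
    return json.loads(message["payload"])
```

This does not help against an insider who can also read the signing key, so it complements rather than replaces access controls and auditing on the queue.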
👿 No scanning of file uploads
Quite often I see the uploading of files into S3 buckets through pre-signed URLs, however there is no scanning of the files for malware or threats when they hit the bucket. In the past I have used tools such as ClamAV to prevent any malicious uploads from bad actors.
Virus scan S3 buckets with a serverless ClamAV based CDK construct | Amazon Web Services
Edit: March 10th 2022 - Updated post to use AWS Cloud Development Kit (CDK) v2. Protecting systems from malware is an…
Denial of Service
Can someone break a system so valid users are unable to use it? Denial of service attacks work by flooding, wiping or otherwise breaking a particular service or system.
🔥 DDoS / Denial of Wallet
Do we have protection in place like rate limiting to prevent DDoS and Denial of Wallet attacks?
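As I understand it, API Gateway throttling is based on a token bucket with a steady rate and a burst limit, so it helps to know how that behaves when picking values. The class below is a simplified, in-memory sketch of the algorithm and not how you would enforce limits in production, where the managed throttling, usage plans or AWS WAF would do the work.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.updated = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (in seconds) is allowed."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

The timestamp is passed in rather than read from the clock so the behaviour is easy to test; a caller would typically pass `time.monotonic()`.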
🔥 Flooding downstream services
We need to be aware that if we don’t have queues in place to throttle throughput to downstream services, we can quite easily overwhelm systems that are less scalable than our serverless workloads. If those systems are cornerstones of the organisation, the impact can be felt across the whole enterprise.
🔥 Reserved concurrency
A lack of reserved concurrency on certain Lambdas could take out your full account. An example would be a Lambda in your website which serves a health check endpoint behind API Gateway, is called only once per minute, and has no reserved concurrency in place. What if an attacker exploits missing rate limiting and exhausts all of your account-level Lambda concurrency through that single innocuous Lambda?
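Reserved concurrency also acts as a cap on the function it is set on, so giving the health check a small reservation limits how much of the shared pool it can ever consume. Here is a small planning sketch. The account limit of 1000 is an assumed regional default and your account may differ; the minimum of 100 unreserved executions reflects my reading of the Lambda documentation, and the function names are invented.

```python
ACCOUNT_CONCURRENCY_LIMIT = 1000  # assumed regional limit; check your own account
MIN_UNRESERVED = 100  # Lambda keeps at least this much for functions without a reservation


def remaining_unreserved(reservations: dict[str, int],
                         account_limit: int = ACCOUNT_CONCURRENCY_LIMIT) -> int:
    """Return the concurrency left in the shared pool after all reservations."""
    unreserved = account_limit - sum(reservations.values())
    if unreserved < MIN_UNRESERVED:
        raise ValueError(
            f"reservations leave only {unreserved} unreserved executions; "
            f"at least {MIN_UNRESERVED} are required"
        )
    return unreserved
```

For example, capping a hypothetical `health-check` function at 5 and an `orders-api` function at 200 leaves 795 executions in the shared pool, and the health check can no longer starve everything else.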
🔥 Database connection overload!
Quite often teams are not aware of the issues we can have with Lambdas scaling out against less scalable databases, essentially exhausting either database connections, or memory and CPU. The last thing we want to do is DDoS ourselves! You can read more here:
Serverless DocumentDB Connection Caching Service — Part 1 🚀
How to cache database connections with your Serverless solutions when using Amazon DocumentDB with a dedicated data…
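The simplest first step is to reuse one connection per Lambda execution environment instead of opening a new one on every invocation. The sketch below shows the pattern with a module-level cache. The `connect` callable stands in for whatever database driver you use and is an assumption for illustration.

```python
from typing import Callable, Optional

# Module-level state survives between invocations while the execution
# environment stays warm, so the connection is created once and reused.
_connection: Optional[object] = None


def get_connection(connect: Callable[[], object]) -> object:
    """Return the cached connection, creating it on first use."""
    global _connection
    if _connection is None:
        _connection = connect()
    return _connection
```

This only limits each environment to a single connection. The total still grows with the number of concurrent environments, so it works best when paired with a cap on concurrency or a proxy or caching service in front of the database.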
Repudiation
How hard is it for users to deny performing an action? What evidence does the system collect to help you to prove otherwise?
📔 S3 bucket access logs
Server access logging provides detailed records for the requests that are made to a bucket. Server access logs are useful for many applications. For example, access log information can be useful in security and access audits.
Logging requests using server access logging
Server access logging provides detailed records for the requests that are made to a bucket. Server access logs are…
📔 Lack of auditing
I quite often see that many solutions don’t have any auditing of actors (users) performing operations, for example an employee updating a person’s payslip in an HR system. Without auditing of user activity how can you investigate unusual activity or look at non-repudiation? (See TACTICAL DD(R))
📔 Inadequate logging
Quite often I see teams implement inadequate logging, essentially very basic logs stating Lambda started and Lambda completed. A correct logging strategy should be in place for an organisation or there is limited information for investigation purposes.
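As an illustration of something more useful than "started" and "completed", the sketch below writes structured JSON log lines that carry a correlation ID, the actor and the action. The service name, the event fields and the handler are assumptions made up for the example.

```python
import json
import logging

logger = logging.getLogger("payslip-service")  # hypothetical service name


def build_log_entry(message: str, correlation_id: str, **context) -> str:
    """Build one structured log line; extra keyword arguments become fields."""
    entry = {"message": message, "correlationId": correlation_id, **context}
    return json.dumps(entry, sort_keys=True, default=str)


def handler(event: dict, context: object = None) -> dict:
    correlation_id = event.get("correlationId", "unknown")
    logger.info(build_log_entry(
        "payslip update requested",
        correlation_id,
        actor=event.get("actor"),       # who performed the operation
        action="UpdatePayslip",         # what they did
        payslipId=event.get("payslipId"),
    ))
    return {"statusCode": 200}
```

Logging identifiers rather than payloads keeps the entries searchable for investigations without copying personal data into the logs.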
📔 Lack of monitoring
It can be very easy to build a solution as a team and forget to add any monitoring, tracing or observability; so this is something that I try to pick up on as early as possible in the design phases.
Monitoring and Logging
Communication and collaboration are fundamental in a DevOps philosophy. To facilitate this, feedback is critical. In…
📔 Lack of centralised logging
If we don’t centralise our access logs (such as CloudTrail) into a centralised account which is locked down, then attackers can quite happily cover their tracks for months (if not years).
CloudTrail log files are an audit log of actions taken by a user, role or an AWS service. The integrity, completeness and availability of these logs is crucial for forensic and auditing purposes. By logging to a dedicated and centralised Amazon S3 bucket, you can enforce strict security controls, access, and segregation of duties — AWS
Security best practices in AWS CloudTrail
AWS CloudTrail provides a number of security features to consider as you develop and implement your own security…
Information Disclosure
Can someone view information they are not supposed to have access to? Information disclosure threats involve the exposure or interception of information to unauthorised individuals.
💣 Where is config stored?
Where are we storing sensitive configuration like usernames and passwords, or access tokens? If it is resolved at build time (using SSM and the Serverless Framework perhaps), then we may find that our config is being pushed into Lambda environment variables. This means that anybody with read-only access to the console can now see the secrets.
Building well-architected serverless applications: Implementing application workload security …
This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and…
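An alternative is to resolve secrets at runtime and keep them only in memory. The sketch below is a small cache with a time-to-live around an injected `fetch` function. In practice `fetch` could wrap a call such as Secrets Manager `get_secret_value`, but that wiring, the TTL and the secret name in the usage are assumptions.

```python
import time
from typing import Callable


class SecretCache:
    """Fetch secrets at runtime and hold them in memory for a limited time."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._values: dict[str, tuple[str, float]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        cached = self._values.get(name)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]
        value = self._fetch(name)  # only called on a miss or after expiry
        self._values[name] = (value, now)
        return value
```

Creating the cache outside the handler (for example `secrets = SecretCache(fetch_from_store)`, where `fetch_from_store` is your own function) means warm invocations avoid repeated lookups, while the TTL lets rotated secrets be picked up.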
💣 PII in logs
Are we storing PII (Personally Identifiable Information) in our logs? If we are, then firstly this may breach GDPR, and secondly it can cause a data leak within our organisation if the logs are exposed. (See TACTICAL DD(R))
Detecting and redacting PII using Amazon Comprehend | Amazon Web Services
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning (ML) to find insights and…
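A managed service is one option, but a lot of leakage can be avoided by redacting known sensitive fields before anything is logged. The sketch below is a deliberately simple version of that idea. The list of sensitive keys and the email pattern are assumptions, and a regex like this will not catch every form of PII.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical field names that should never appear in logs.
SENSITIVE_KEYS = {"email", "salary", "nationalInsuranceNumber", "bankAccount"}


def redact(value):
    """Recursively replace sensitive fields and email addresses before logging."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return EMAIL.sub("[REDACTED_EMAIL]", value)
    return value
```

Calling `redact(event)` just before the log statement keeps the original object untouched for processing, because a new structure is returned rather than the input being mutated.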
💣 Exposing private APIs to the internet
A key consideration with APIs is whether or not they should be accessible on the public internet. If they are private to your organisation and have no external consumers, then make them private, which lowers the attack surface considerably.
Serverless Private APIs — Part 2 🚀
How to allow private serverless platform APIs to communicate securely internally using custom domain names within your…
Spoofing
How hard is it for an attacker to pretend to be someone with authority to use the system? Can someone spoof an identity and then abuse its authority?
🤡 Insider threats
Remember: even internal actors are threats! Trust nobody in your organisation. Don’t see threat actors as living outside of your APIs and systems.
🤡 Proposal of hand cranked authentication
Any proposal of hand cranked authentication is a no-no. Use a service like Amazon Cognito, which has been battle tested and created by experts. Hand cranked authentication massively opens up the threat landscape of your services.
Elevation of Privilege
Can an unprivileged user gain more access to the system than they should have? Elevation of privilege attacks are possible because authorisation boundaries are missing or inadequate.
🕵️ API enumeration
Quite often teams authenticate consumers (say a B2B machine-to-machine flow), but fail to think about the fact that competitors may be able to enumerate API resource IDs to access information they shouldn’t have access to. This also goes for end customers, for example an employee accessing another employee’s payslips in an HR system using this method.
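The usual fix is an object-level authorisation check: being authenticated is not enough, the caller must also own (or be allowed to see) the specific resource. Here is a minimal sketch with an in-memory store standing in for the database. The record shape and IDs are made up.

```python
class Forbidden(Exception):
    """Raised when the caller may not access the requested resource."""


# Stand-in for a database table; the records are illustrative only.
PAYSLIPS = {
    "p-100": {"ownerId": "emp-1", "net": 2000},
    "p-101": {"ownerId": "emp-2", "net": 2500},
}


def get_payslip(payslip_id: str, caller_id: str) -> dict:
    """Return the payslip only if it belongs to the authenticated caller."""
    payslip = PAYSLIPS.get(payslip_id)
    # Use the same error for "missing" and "not yours" so callers cannot
    # probe which IDs exist.
    if payslip is None or payslip["ownerId"] != caller_id:
        raise Forbidden("payslip not found")
    return payslip
```

The `caller_id` should come from the verified token rather than from the request body or path, otherwise the check can be bypassed by simply claiming to be someone else.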
🕵️ SSO auth without groups
It’s great using SSO for internal actors, but if you are only checking that they have a valid user account, and not using groups from the returned JWT (such as AD groups when using Microsoft AD SSO) or using the Principal ID of the token as an authorisation lookup, then you are essentially stating that anybody within the company can access your application. (See TACTICAL DD(R))
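As a sketch of the group check, the function below inspects the claims of a token whose signature and expiry have already been verified elsewhere. The claim name `groups` and the group name are assumptions, since identity providers differ in how they expose group membership.

```python
REQUIRED_GROUPS = frozenset({"hr-payroll-admins"})  # hypothetical AD group


def is_authorised(claims: dict, required: frozenset = REQUIRED_GROUPS) -> bool:
    """Return True only if the verified token carries at least one required group."""
    groups = claims.get("groups", [])
    if isinstance(groups, str):  # some providers emit a single group as a string
        groups = [groups]
    return bool(required & set(groups))
```

A valid token without the right group is then treated the same as no access at all, which is the behaviour you want for an internal HR application.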
🕵️ Off-boarding of users
I also quite often see that there are no controls around people moving between teams or leaving the company: they can be demoted, move to another part of the company, or leave entirely, but still have the same access as they had previously. Get a process in place around this!
How can we prevent some of these?
We could start by making your teams aware of TACTICAL DD(R), which will allow them to cover off some of the typical issues above and feed this into future designs:
Serverless TACTICAL DD(R) 🚀
What is TACTICAL DD(R) as a tactical approach to nonfunctional requirements when it comes to Serverless solutions, and…
The second thing would be to have your teams and architects understand the five Serverless Architecture Layers, which will automatically cover a large portion of the issues above through the use of both Platforms and Cross-cutting concerns which feed into reference architectures.
I hope you found that useful as a few things to look out for in your next Serverless Threat Modelling sessions.
Stay tuned for Part 3 which is a deep dive on Advanced Serverless Threats from an expert guest blogger which I am excited for!
Wrapping up 👋
If you enjoyed the posts please follow my profile Lee James Gilmore for further posts/series, and don’t forget to connect and say Hi 👋
I consider myself a serverless advocate with a love of all things AWS, innovation, software architecture and technology.
*** The information provided reflects my own personal views and I accept no responsibility for the use of the information. ***