Uncertain Threats Part #1: Strategies for handling uncertainty in identity investigations

James Fox, Cyber Security Operations Consultant | 11 April 2024

Introduction

Monitoring identity providers for suspicious sign-in activity is the bread and butter of security operations in the cloud. With more than 77% of attacks involving compromised credentials as an initial access method (Sophos, 2024), it is incredibly important that cloud IDPs such as Microsoft Entra ID are monitored closely for signs of account compromise, and that a range of responses are available when compromise is identified. Despite this imperative, security operations analysts can struggle to reach high-confidence determinations on whether a suspicious sign-in warrants action. Often a compromised account only becomes apparent through later-stage threat activity, in part due to an unfortunately necessary reliance on anomaly-based detections when monitoring sign-in activity. These anomaly-based alerts provide little context for an analyst to determine with confidence whether an account has been compromised, and can be exceptionally noisy depending on how “mobile” the organisation is (cough cough BYOD, overseas or remote workers, and 3rd party contractors). The result is wishy-washy alert triage that runs the risk of an over- or under-reaction.

Figure 1: Classic Microsoft Entra ID Protection alerts that wake up the SOC at 3AM

Strategy #1: Relate to a known threat or scenario

Far and away the most efficient strategy to limit uncertainty when investigating anomalous sign-ins is to relate indicators in the activity period to some known threat or scenario, either based on external cyber threat intelligence or incidents faced internally. This aims to give an analyst something concrete to work with, rather than just “weird” or anomalous events.

The general process for performing this is to:

  1. Ahead of time, build a matrix of threats and scenarios alongside what evidence proves or disproves them.
  2. At triage, the analyst will collect all pieces of evidence in the matrix from the available log sources - marking down where there are gaps in telemetry.
  3. Then, each threat is evaluated against the collected evidence for merit, taking into account any identified gaps in telemetry.
  4. If the anomalous sign-in activity matches a known threat with many matching pieces of evidence, then we can say with confidence that it is worth actioning.
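As a minimal sketch, the four steps above could be implemented as a simple evidence-matching pass over a pre-built matrix. The threat names, evidence labels, and scoring below are illustrative only, not a complete matrix or Fortian's actual tooling:

```python
# Sketch of the matrix-based triage process. Evidence labels are
# illustrative shorthand for the indicators described in the matrix below.

THREAT_MATRIX = {
    "AITM phishing": {
        "supporting": {"unseen_ip", "overlapping_signins", "url_click_before_signin"},
        "disconfirming": {"onboarded_device", "legacy_auth"},
    },
    "Device code phishing": {
        "supporting": {"device_code_auth", "auth_broker_app"},
        "disconfirming": {"expected_device_code_usage"},
    },
}

def evaluate(observed: set[str], unavailable: set[str]) -> list[tuple[str, int, int]]:
    """Rank threats by supporting-minus-disconfirming evidence matches.

    `observed` is evidence confirmed in the logs (step 2); `unavailable`
    marks evidence that could not be checked due to telemetry gaps.
    """
    results = []
    for threat, evidence in THREAT_MATRIX.items():
        supporting = len(evidence["supporting"] & observed)
        disconfirming = len(evidence["disconfirming"] & observed)
        gaps = len((evidence["supporting"] | evidence["disconfirming"]) & unavailable)
        results.append((threat, supporting - disconfirming, gaps))
    # Highest-merit threat first; the gap count flags how much to trust the score.
    return sorted(results, key=lambda r: r[1], reverse=True)

ranked = evaluate(
    observed={"unseen_ip", "url_click_before_signin", "overlapping_signins"},
    unavailable={"auth_broker_app"},
)
print(ranked[0])  # ('AITM phishing', 3, 0)
```

A real implementation would weight individual indicators rather than counting them equally, but the shape of the decision (step 3 and 4) is the same.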

This process is similar to an analysis of competing hypotheses (ACH) - a fantastic generalist methodology for dealing with uncertainty in a slew of cybersecurity applications (more here from the CIA). An example matrix of threats that we routinely look out for at Fortian whilst investigating sign-in alerts is given below. It isn’t complete (there are too many to go through here) but provides a good starting point as an illustration.

Threat or scenario: Adversary-in-the-middle (AITM) phishing

Supporting evidence:
  • Sign-in from a previously unseen IP address for the user against the OfficeHome or Microsoft Office application
  • Overlapping sign-in activity by source device
  • Previously unseen country or city
  • Previously unseen browser/app combination for the user
  • URL click in Teams, Exchange, or on a PDF just before the suspicious sign-in
  • Previously unseen residential autonomous system for the user
  • Cloud hosting or VPN provider IP address

Disconfirming evidence:
  • From an onboarded device
  • From an IP address with a high successful sign-in rate across the organisation
  • Legacy authentication
  • Device code authentication

Threat or scenario: Device code phishing

Supporting evidence:
  • Authentication protocol of deviceCode
  • Targeting the Microsoft Authentication Broker application (may indicate PRT phishing)
  • Against a user account
  • Against a non-privileged user
  • Previously unseen country or city
  • Previously unseen browser/app combination for the user
  • URL click in Teams, Exchange, or on a PDF just before the suspicious sign-in
  • Previously unseen residential autonomous system for the user
  • Cloud hosting or VPN provider IP address

Disconfirming evidence:
  • Against a headless device like a TV
  • Teams phone registration
  • Targeted application and user combination is expected to use device code authentication, such as an administrator consenting to CLI tools

Threat or scenario: Helpdesk social engineering

Supporting evidence:
  • MFA method and/or password had been reset in the days leading up to the suspicious sign-in
  • Multiple MFA methods added/removed before and after the suspicious sign-in
  • Awareness that there is a vulnerability in helpdesk authentication reset procedures and verification
  • Targeted user is a high value account or has a public profile
  • Previously unseen country or city
  • Previously unseen browser/app combination for the user
  • Previously unseen residential autonomous system for the user
  • Cloud hosting or VPN provider IP address

Disconfirming evidence:
  • Sign-ins from expected devices between the suspicious activity and the MFA reset (where all MFA methods were reset/removed, and not just a new method added)
  • Strong, known (and tested) verification processes for authentication reset requests
  • From an onboarded device previously known to the user

Figure 2: An example suspicious sign-in triage matrix for mapping anomalies to known threats

The “threats and scenarios” detailed in the matrix don’t necessarily need to be malicious events. In fact, it can be beneficial to have a wide range of hypotheses detailed so that an analyst can come to an equally high-confidence determination of benign as well as malicious activity. For example, an organisation that has a high prevalence of consumer VPN usage may use a scenario called “legitimate VPN usage” and map out indicators that distinguish this activity from malicious scenarios.

Strategy #2: Score the anomaly (AKA how weird is it anyway?)

There can be instances where the above analysis falls flat, namely, when the collected evidence doesn’t convincingly point toward a single known threat, or any at all. In many instances this is compounded by gaps in telemetry. In this case, all we have to work with is the anomaly itself. The question then turns from “what type of threat does this represent?” to “how likely is this to be some unknown threat?”. We can answer this by benchmarking several key properties of the suspicious sign-ins against past organisation and user activity, then using this information to reason about an overall anomaly score enriched with enough context to pull apart “normal weird” from “yikes weird”.

Figure 3: Clearly this account is compromised right?! Nope, 3rd party contractor using an unsanctioned VPN.

Fortian runs several analytics when investigating sign-in anomalies to determine how anomalous they actually are. Typically these are modelled as true or false statements for simplicity’s sake. Once these are defined and run consistently during triage, an analyst has an easy point of comparison against past anomalies and incidents. From this, a natural baseline can be formed and used to pull the trigger on a remediation action with confidence.

We found the following analytics work well in general cases of identity compromise in Entra ID:

Highly anomalous

  • Previously unseen authentication type (legacy, device code, ROPC) for the user.
  • Previously unseen VPN provider for the user.
  • Source IP address belongs to cloud hosting provider either not used by the organisation, or one that is used but in an anomalous region.
  • Weaker multi-factor authentication method used than the user is capable of based on past activity (e.g. a user signed in using a TOTP code rather than a push notification).
  • User has accessed an IT administrative application without justifying business context given the user’s role in the organisation.
  • User has failed an anomalous number of distinct MFA flows in the past month compared with the rest of the organisation.

Moderately anomalous

  • Previously unseen workstation for the user based on user agent browser/OS combination.
  • Previously unseen location (country) for the user in the past 3 days.
  • Previously unseen residential Autonomous System (AS) for the user.
  • The user is known to be targeted by password attacks (e.g. password spraying, credential stuffing, brute-force) based on past incidents or threat intelligence.
  • First time sign-in in the past 60 days and the account is not newly created.
  • Nearby sign-ins contain multiple distinct devices (i.e. overlapping sign-in activity) involving at least one previously unseen device based on OS/browser combination.
  • Access of previously unseen application for the organisation.

Common anomalies

  • Previously unseen mobile or other personal device for the user based on user agent browser/OS combination.
  • Previously unseen location (city) for the user in the past 3 days.
  • Recorded sign-in time is outside of traditional work times with respect to the user’s time zone.
  • Recorded sign-in time is outside of traditional work times with respect to the organisation’s time zone.
  • Previously unseen IP address for the organisation.
  • First time sign-in in the past 30 days and account is not newly created.
  • Access of previously unseen application for the user.

As an example, an analyst triaging a suspicious sign-in has noted that the user usually signs in from their onboarded work laptop from an office IP address in Melbourne. However, the user has now signed in from an offboarded device using a VPN during European work hours, and past threat intelligence shows the user has been targeted by password spraying. From this, the analyst records the following anomalous properties:

  • Previously unseen IP address for the organisation (common anomaly)
  • The user is known to be targeted by password attacks (moderate)
  • Previously unseen VPN provider for the user (high)
  • Previously unseen workstation for the user based on user agent browser/OS combination (moderate)
  • Previously unseen residential Autonomous System (AS) for the user (moderate).
  • Recorded sign-in time is outside of traditional work times with respect to the organisation’s time zone (common).

Total: 1 high, 3 moderate, 2 common

The analyst then compares the computed anomaly score against past incidents and investigations (either from record or based on experience), and notes that benign activity typically produces a larger volume of common anomalies and no high anomalies. The analyst therefore determines that the user’s account is likely to be compromised.
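The tally in the worked example can be sketched as a tiny scoring routine. The property names, tier assignments, and benign baseline below are illustrative, and would be tuned per organisation:

```python
# Sketch of the anomaly tally from the worked example. Tiers and the
# benign baseline are illustrative values, not a universal standard.

TIER = {
    "unseen_org_ip": "common",
    "known_password_attack_target": "moderate",
    "unseen_vpn_provider": "high",
    "unseen_workstation": "moderate",
    "unseen_residential_as": "moderate",
    "outside_org_work_hours": "common",
}

def tally(observed_properties):
    """Count observed anomalous properties per severity tier."""
    counts = {"high": 0, "moderate": 0, "common": 0}
    for prop in observed_properties:
        counts[TIER[prop]] += 1
    return counts

score = tally(TIER.keys())
print(score)  # {'high': 1, 'moderate': 3, 'common': 2}

# Benign activity in past triage showed mostly common anomalies and no
# high ones, so any high-tier anomaly pushes us toward "compromised".
benign_baseline = {"high": 0, "moderate": 1, "common": 4}
likely_compromised = score["high"] > benign_baseline["high"]
print(likely_compromised)  # True
```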

How anomalous each of these features is will naturally vary organisation by organisation - which is more than fine! Adjusting scoring based on business context or adding new properties is good practice so that we don’t repeat the problem we are trying to solve in the first place (e.g. if we consistently get 5 high-scoring anomalies for every suspicious sign-in, and these get confirmed as benign, then we are right back where we started).

The main issue with this strategy is that it is a bit tedious, with a lot of analysis that needs to be performed on often frequently occurring alerts. Thankfully, the brunt of the effort of running the necessary queries can be easily automated away in a SIEM/SOAR platform of choice. We find that defining common investigation patterns such as those above inside of KQL functions works exceptionally well to speed things up and ensure consistency.

Strategy #3: Perform low-impact response actions

So far we’ve gone through two strategies that aim to reduce the uncertainty of determining whether an account has been compromised given some suspicious sign-in activity. We’re still left with situations where uncertainty simply can’t be reduced to a reasonable degree with the data on hand. This is particularly evident in environments where setting baselines for anomalous activity is exceptionally difficult, or where there are gaps in monitoring or log data. For example, maybe we don’t have the context about what working hours are expected, whether consumer VPNs are in use, where users are physically located, etc. In many instances this can be inferred from telemetry, but often not. Instead, we can balance the probability that the account is compromised against the potential business impact of responding to the (uncertain) threat.
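That balancing act can be framed as a simple expected-cost comparison: act when the probability-weighted damage of doing nothing exceeds the cost of the response. All the numbers below are made up purely for illustration:

```python
# Illustrative expected-cost framing of the trade-off described above.
# Probabilities and dollar figures are invented for the example.

def should_act(p_compromised: float, damage_if_compromised: float,
               cost_of_action: float) -> bool:
    """Act when expected damage of inaction exceeds the cost of acting."""
    return p_compromised * damage_if_compromised > cost_of_action

# Revoking sessions is cheap, so even a weak (10%) signal justifies it...
print(should_act(0.10, damage_if_compromised=100_000, cost_of_action=500))    # True
# ...while a disruptive full lockout may not be justified at the same odds.
print(should_act(0.10, damage_if_compromised=100_000, cost_of_action=25_000)) # False
```

The point is not to compute real dollar figures at 3AM, but that cheap actions clear the bar at much lower confidence than destructive ones.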

There are several options available to security operations analysts to deal with threats against Microsoft Entra ID identities that don’t involve going scorched-earth by locking an account, removing MFA methods, nuking tokens, and resetting the password twice. Simple, low-impact actions can be performed without actually needing the full context about what the threat is - just the likelihood that it exists.

  • Revoke the user’s sign-in sessions: This will force the user to re-sign into their account, including completing MFA where enabled. In instances where token theft such as through AiTM phishing is suspected, this is often enough to remediate the threat (often is doing heavy lifting here). This will not revoke device primary refresh tokens as this would be a high-impact action.
  • Reset the user’s password: Up for debate about how disruptive this is and will need to be considered with respect to the potential impact a runaway compromised account could cause. This action is typically performed where there is uncertainty if an account’s password has been stolen in some password or credential harvesting attack (E.G password spraying, simple credential harvesting attachments, credential stuffing, etc).
  • Extend monitoring: As outlined earlier, identity compromise often reveals itself through malicious activity undertaken after the first suspicious sign-ins. Instead of relying on analysts to piece this together after the fact, a suspect account can be added to an extended monitoring list that is used to create higher-fidelity alerts for some given time period. For example, a new device being registered to Entra ID might be an informational-severity alert, but if it relates to a user on the extended monitoring list, it can be bumped up to high/medium severity for immediate actioning. This is easily achieved in Microsoft Sentinel through the use of a Watchlist and a Logic App to re-categorise the severity of incidents that involve monitored entities.
Figure 4: An entry in an example extended monitoring Watchlist. This can be manually updated by analysts using the Sentinel watchlist editor or through an entity Playbook.
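The re-categorisation logic such a Logic App would apply can be sketched in a few lines. The watchlist fields, severity ladder, and user names below are illustrative assumptions, not the Sentinel Watchlist schema or API:

```python
# Sketch of severity-bump logic for incidents involving identities on an
# extended monitoring list. Field names and severities are illustrative.

EXTENDED_MONITORING = {
    # user principal name -> expiry of the extended-monitoring window
    "jane.doe@example.com": "2024-05-01",
}

# How far to bump each severity; already-high incidents are left alone.
BUMP = {"informational": "medium", "low": "medium", "medium": "high"}

def adjust_severity(incident_user: str, severity: str) -> str:
    """Raise incident severity when it involves a monitored identity."""
    if incident_user in EXTENDED_MONITORING:
        return BUMP.get(severity, severity)
    return severity

print(adjust_severity("jane.doe@example.com", "informational"))   # medium
print(adjust_severity("someone.else@example.com", "informational"))  # informational
```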

The above uncertain response actions are not actually guaranteed to remediate the threat - since the true nature of the threat remains unknown to the analyst. Subsequently, it is critical that, when performed, they are recorded and visible for future investigations. Future anomalous sign-in activity against the account could be due to a new threat, or an old uncertain threat that wasn’t fully remediated (again, we’re playing the probabilities at this point). This also means that uncertain actions should absolutely not be relied on as a crutch by analysts, and should only be used after it becomes apparent that uncertainty can’t be appropriately reduced through analysis.

Conclusion

Despite its critical importance, investigating suspicious sign-in activity can be challenging for security operations analysts due to the complex task of reducing uncertainty. There are, however, several simple analytical strategies that can be prepared ahead of time to help analysts come to high-confidence conclusions about the nature of threats against identity:

  1. Relate to a known threat or scenario: Take properties of the sign-in activity and weigh them up against known attacks for merit.
  2. Score the anomaly: Where activity doesn’t definitively point toward a known threat or scenario, we can benchmark it against known user and organisation-wide activity to interrogate the likelihood that it is an unknown threat.
  3. Perform low-impact response actions: Lastly, in the face of unreducible uncertainty, security operations analysts can choose to perform low-impact actions against suspect identities - erring on the side of caution whilst limiting business impact.

In the next part of the series, “Uncertain Threats Part 2”, we will demonstrate how Bayesian Belief Networks (BBNs) can be used to model uncertainty in security operations investigations.

Stay tuned!
