Automated Investigation at Scale: How Prophet AI Protects High-Exposure Partner Environments

Augusto Barros
June 11, 2026

Not all Microsoft 365 tenants present the same detection problem. An enterprise with managed endpoints and a mature Conditional Access policy is one kind of challenge. An organization at the center of a large partner ecosystem, where credentials are spread across dozens of external companies, devices are unmanaged, MFA enforcement is inconsistent, and login patterns are diverse by design, is a different problem entirely.

This post documents what automated investigation looks like in the second kind of environment: four accounts from a recent investigation sample, spanning credential spraying, a confirmed stolen password, AiTM session hijacking against a disabled account, and the long tail of a sustained campaign. Prophet AI investigated each in minutes, with a documented evidence trail behind every determination.

Why partner environments are hard to defend

The environment documented here operates a Microsoft 365 tenant accessed by a broad network of external partners. Those partners do not use corporate-managed devices. They have no endpoint security agents. They log in from residential ISPs, small business networks, and commercial VPN services. Their browsers, operating systems, and login hours reflect genuine global diversity. Some accounts are active; some are legacy accounts that should have been decommissioned but were not.

This creates a detection environment that is simultaneously high-signal and high-noise. The detection tools are doing their jobs: flagging anomalies, firing on suspicious IPs, generating alerts when behavior deviates from baseline. But in this environment, "deviates from baseline" covers a lot of legitimate ground. Partners log in from unexpected countries. Login hours vary. Device types change. Many alerts are real threats, and many require significant investigation to confirm.

The result is a volume problem. Over a short observation window, this environment generated dozens of security investigations across multiple user accounts, each requiring correlation across login history, IP reputation, user agent analysis, application access patterns, and cross-alert context. Done manually, this work would consume analyst teams around the clock. Done poorly, or not at all due to resource constraints, it creates the conditions for confirmed compromises to go undetected while the team is occupied elsewhere.

This is the environment where automated investigation provides the most measurable value.

What the investigation sample shows

The investigations documented here span multiple accounts, multiple attack techniques, and multiple points in the threat lifecycle, from early-stage credential testing to confirmed post-compromise access and, in several cases, continued attacker persistence against already-disabled accounts.

Across these investigations, a consistent pattern emerges: the detection systems generate real signals, but the signals are rarely individually definitive. Comprehensive, automated investigation work converts them into actionable determinations: connecting an anomalous IP to a user's login history, surfacing prior investigations, identifying behavioral fingerprints, and recognizing that a "blocked" attempt still confirms a valid password.

Account 1: A credential spray that broke a clean baseline

The first account in this sample had an unusually clean behavioral baseline: fourteen days of login history originating exclusively from a single residential IP on a US cable provider. No VPN usage. No variation in device type. A consistent, predictable pattern.

One day, that baseline broke completely. An ExpressVPN IP, hosted by a commercial hosting provider and associated with password-spraying infrastructure in a same-day investigation, appeared for the first time. This IP recorded 35 successful logins within approximately two minutes, a burst pattern characteristic of automated credential testing rather than human login behavior. The session also introduced a macOS user agent that had never appeared in the account's history; the account had used Windows exclusively across all prior sessions.
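To make the burst signal concrete, here is a minimal sketch of a sliding-window login counter. It is not Prophet AI's implementation; the function names, the threshold of 10, and the synthetic timestamps are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def max_logins_in_window(timestamps, window=timedelta(minutes=2)):
    """Return the largest number of logins falling inside any single window."""
    ordered = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(ordered):
        # Move the left edge forward until the span fits inside the window.
        while ts - ordered[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def is_burst(timestamps, threshold=10, window=timedelta(minutes=2)):
    """Flag a login series whose densest window reaches the threshold.

    The threshold is a hypothetical value, not one taken from the post.
    """
    return max_logins_in_window(timestamps, window) >= threshold


# Synthetic data: 35 logins spaced 3 seconds apart (102 seconds end to end),
# and a human-looking series of 5 logins spaced 8 hours apart.
base = datetime(2026, 1, 1, 12, 0, 0)
burst_series = [base + timedelta(seconds=3 * i) for i in range(35)]
human_series = [base + timedelta(hours=8 * i) for i in range(5)]
```

A two-pointer window keeps the check linear after sorting, which matters when the same test runs across every account in a tenant.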

Post-authentication, the session accessed an application with no prior access history, consistent with an attacker performing initial reconnaissance after gaining access to a new account.

A second investigation on the same account, performed approximately an hour later, confirmed the infrastructure link: the same IP had already been identified as password-spraying infrastructure in another investigation earlier that day. Prophet AI surfaced the cross-investigation relationship, connecting what appeared to be two separate alerts into a single coherent picture of account compromise.

The determination rested on the combination of signals: a clean residential baseline shattered by a burst of automated logins from spray infrastructure, on a device type never previously seen, accessing an application with no prior history, with a corroborating investigation already on record. Each signal alone is defensible. Together they are unambiguous.

Account 2: A confirmed password, stopped by Conditional Access

This investigation illustrates one of the most underappreciated distinctions in identity security: the difference between a blocked authentication and a failed one.

A sign-in attempt from a confirmed Tor exit node, flagged as malicious by three independent threat intelligence sources, was blocked by Conditional Access policy. On the surface, this looks like a success story: the control worked. But the block was triggered by Error 50053, which indicates the authentication was stopped because of the malicious source IP. The password itself had been entered correctly.

The attacker validated the user's password before being stopped at the network layer. The credential is confirmed compromised.

This account's entire legitimate login history originates from Brazilian ISPs, a specific set of providers consistent with a known user location. The Tor exit node login, from infrastructure with no geographic relationship to Brazil and no legitimate business justification, confirmed that a third party had obtained the account credentials and was attempting access.

Conditional Access succeeded in blocking the session. But "blocked" and "safe" are not synonyms when the password has already been proven valid. Immediate credential remediation, including a forced password reset and session token revocation, was required regardless of whether any access was granted.

Prophet AI's determination documented exactly this distinction: the threat is real and confirmed; the blocking control is holding, but the credential itself requires remediation.
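The blocked-versus-failed distinction can be sketched as a small classifier over sign-in error codes. The mapping below follows this post's reading of Error 50053 and is illustrative only; 50126 (invalid username or password) is added here as a contrast case, and the `SignIn` type and label names are hypothetical.

```python
from dataclasses import dataclass

# Illustrative mapping, not an authoritative reference for Entra ID error codes.
# 50053 follows this post's interpretation: blocked on source IP, password valid.
PASSWORD_VALID_BUT_BLOCKED = {50053}
ACCOUNT_DISABLED = {50057}
BAD_CREDENTIALS = {50126}  # assumption added for contrast; not from the post


@dataclass(frozen=True)
class SignIn:
    user: str
    error_code: int  # 0 indicates a successful sign-in


def classify(event):
    """Map a sign-in event to a coarse outcome label."""
    if event.error_code == 0:
        return "success"
    if event.error_code in PASSWORD_VALID_BUT_BLOCKED:
        return "blocked_credential_confirmed"
    if event.error_code in ACCOUNT_DISABLED:
        return "blocked_account_disabled"
    if event.error_code in BAD_CREDENTIALS:
        return "failed_bad_credentials"
    return "needs_review"


def needs_credential_reset(event):
    """A block that still proves the password is valid requires remediation."""
    return classify(event) == "blocked_credential_confirmed"
```

Keeping "blocked" and "failed" as separate labels is the point of the sketch: a dashboard that collapses both into "unsuccessful" hides the case that needs a password reset.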

Account 3: Session hijacking against a disabled account

This case represents the most technically sophisticated attack documented in this sample, and the most alarming from a defensive standpoint.

The account in question had been disabled in Entra ID for months. It should not have been capable of generating successful authentication events. It did.

Microsoft Defender detected a suspicious sign-in from a Colombian hosting provider IP with zero prior history on this account. The sign-in occurred concurrently with a legitimate-looking session from New York: geographically impossible travel, a strong indicator of credential compromise. The Colombian session followed multiple failed attempts with invalid-credential errors before succeeding, consistent with credential stuffing. The successful sign-in carried Microsoft's maximum risk score of 100.
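An impossible-travel check of the kind described above can be approximated with a great-circle distance and an implied speed. This is a rough sketch, not Defender's detection logic. The coordinates (New York and Bogotá, since the post names only the country), the 900 km/h ceiling, and the 100 km jitter floor are assumptions.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(a, b, max_kmh=900.0, min_km=100.0):
    """a and b are (timestamp, latitude, longitude) tuples.

    Distances under min_km are ignored to tolerate geo-IP jitter.
    Concurrent sessions (zero elapsed time) beyond min_km are always flagged.
    """
    distance = haversine_km(a[1], a[2], b[1], b[2])
    if distance < min_km:
        return False
    hours = abs((b[0] - a[0]).total_seconds()) / 3600.0
    if hours == 0:
        return True
    return distance / hours > max_kmh


# Hypothetical coordinates for illustration only.
t0 = datetime(2026, 1, 1, 9, 0, 0)
new_york = (t0, 40.71, -74.01)
bogota_concurrent = (t0, 4.71, -74.07)
bogota_next_day = (t0 + timedelta(hours=24), 4.71, -74.07)
```

Geo-IP locations for VPN and hosting ranges are often imprecise, so a result like this works best as one input alongside IP reputation and login history rather than as a verdict on its own.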

A separate investigation filed four hours later identified the mechanism: five same-day Defender alerts explicitly named AiTM (Adversary-in-the-Middle) session hijacking and CSRF speedbump indicators. The account's entire two-week login baseline was a single US IP; the Colombian hosting provider had never appeared in it. Successful OAuth2 and CMSI authentication events were recorded despite the account being disabled, because the stolen session tokens had been created before the disablement and remained valid after it.

This is the session token persistence problem in its clearest form. Account disablement in Entra ID does not automatically invalidate OAuth tokens issued before the disablement. An attacker who obtained session tokens through an AiTM proxy during an active session can continue using those tokens for as long as their TTL permits, days or weeks, regardless of what happens to the underlying account. We documented a similar pattern of zombie credentials in subsidiary infrastructure in an earlier investigation writeup.

Prophet AI identified the AiTM pattern, correlated the cross-investigation evidence, and documented the persistence mechanism, giving the response team the context to understand that credential reset alone would not be sufficient and that active token revocation was required.

Account 4: The long tail of a sustained campaign

The final investigation shows what the end of a sustained credential campaign looks like when remediation has succeeded but the attacker has not given up.

This account had accumulated 16 prior alerts over at least two weeks, 15 of them determined Malicious. Prior investigations had documented successful logins from infrastructure spanning multiple countries and providers, confirming genuine credential compromise that had already been remediated through account disablement.

The alert that triggered this investigation was an IPv6 address confirmed as malicious by multiple threat intelligence sources. The attempt failed with Error 50057, user account disabled. No access was granted. No post-authentication activity occurred.

On its own, this is a low-severity, routine blocked attempt against a known-compromised, already-remediated account. Without campaign context, it might be deprioritized or auto-resolved.

Set against the 16 prior related investigations, it becomes something more operationally useful: confirmation that the attacker is still active, still targeting this account, still attempting access from new infrastructure. That persistence has implications beyond this single account. It suggests a threat actor who has invested significant effort in this identity and may be simultaneously targeting related accounts, testing adjacent credentials, or attempting to leverage any previously obtained access (email contents, tokens, application data) to pivot elsewhere in the environment.

Prophet AI's determination documented the full campaign arc in context, enabling the response team to treat the event as an active threat requiring continued monitoring of adjacent identities rather than a closed matter.
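The effect of campaign context on triage can be sketched as a simple rule: the same blocked attempt gets a different disposition depending on the identity's alert history. The `PriorAlert` type, the labels, and the threshold of 3 are hypothetical and are not how Prophet AI scores cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorAlert:
    determination: str  # for example "Malicious" or "Benign"


def triage_blocked_attempt(prior_alerts, malicious_threshold=3):
    """Choose a disposition for a blocked sign-in using the account's history.

    malicious_threshold is an assumed cut-off for illustration.
    """
    malicious = sum(1 for alert in prior_alerts if alert.determination == "Malicious")
    if malicious >= malicious_threshold:
        return "active_campaign_monitor_adjacent_identities"
    return "routine_blocked_attempt"


# History shaped like Account 4: 16 prior alerts, 15 determined Malicious.
account_4_history = [PriorAlert("Malicious")] * 15 + [PriorAlert("Benign")]
```

With an empty history the function returns the routine disposition, which is the outcome the post warns about when campaign context is missing.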

The economics of automated investigation

These four accounts represent a fraction of the investigation volume this environment generates. Each investigation required:

  • Querying login history across 14 to 30 days to establish behavioral baselines
  • Cross-referencing source IPs against multiple threat intelligence feeds
  • Correlating current alerts with prior investigations for the same identity
  • Analyzing user agent strings against historical fingerprints
  • Assessing post-authentication application access against usage history
  • Determining whether blocks represent genuine failures or confirmed-credential blocks
  • Identifying whether access occurred through credentials or session token replay
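Several of the steps above reduce to comparing one login against a historical baseline. The sketch below shows that comparison for source IP, operating system family, and application. The `Login` fields and deviation labels are assumptions, and the IP addresses come from the reserved documentation ranges rather than from the investigation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Login:
    ip: str
    os_family: str
    app: str


def deviations(baseline, candidate):
    """List the ways a candidate login departs from the baseline logins."""
    seen_ips = {login.ip for login in baseline}
    seen_os = {login.os_family for login in baseline}
    seen_apps = {login.app for login in baseline}
    found = []
    if candidate.ip not in seen_ips:
        found.append("new_ip")
    if candidate.os_family not in seen_os:
        found.append("new_os_family")
    if candidate.app not in seen_apps:
        found.append("new_application")
    return found


# Baseline shaped like Account 1: one residential IP, Windows only, one app.
baseline = [Login("198.51.100.7", "Windows", "Mail")] * 14
suspect = Login("203.0.113.50", "macOS", "Admin Portal")
```

Returning the individual deviations, rather than a single score, mirrors the point made about Account 1: each signal is defensible alone, and the determination comes from their combination.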

Done manually, a single thorough investigation of this type can take hours from a skilled analyst. For the volume this environment generates, staffing to manual investigation pace would require a large, dedicated team working continuously, with no capacity for proactive threat hunting, incident response, or any work beyond alert triage.

Prophet AI completed each investigation in this sample in a few minutes, including query execution and cross-investigation correlation. It investigates every alert in this environment with the same depth and documents the full evidence trail behind each determination. What reaches the human team is the set of cases that actually require judgment: confirmed compromises requiring remediation decisions, incidents with ambiguous evidence, and cases where the investigation output surfaces novel attack patterns worth deeper analysis. The shift is from humans triaging volume to humans making decisions. We have written before about why investigation depth, rather than more detection, is what closes this gap.

Defensive lessons from this sample

Two issues appear consistently across this investigation sample and deserve direct attention, because they are structural gaps that amplify the impact of any credential compromise.

Inconsistent MFA enforcement

Across the investigations in this sample, MFA was either absent or unenforced on accounts that were successfully compromised. This is a common pattern in partner environments where Conditional Access policies are designed for internal employees and extended to external partners without consistent enforcement.

In several investigations, successful logins completed without any MFA prompt despite MFA being nominally available. The mechanism varies: some legacy authentication protocols bypass modern MFA challenges; some Conditional Access policies include carve-outs for specific applications or network locations; some partner accounts were provisioned before MFA policies existed and were never updated.

The practical effect is the same in every case: a credential-stuffing attack that a consistent MFA requirement would stop succeeds because the account falls into an unprotected path. Auditing Conditional Access policies specifically for partner and guest accounts, and eliminating exemptions that lack a clear business requirement, is the most effective defensive action available in this environment type. MFA itself is also a target; see our analysis of MFA fatigue attacks.
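A first pass at the audit described above can be run against exported policy definitions. The sketch below assumes the policies are dictionaries shaped like Microsoft Graph `conditionalAccessPolicy` resources; the field names reflect that schema as I understand it and should be checked against the current documentation. The sample policies and their names are invented.

```python
# (section, field) pairs under "conditions" that can carve accounts out of a policy.
EXCLUSION_FIELDS = [
    ("users", "excludeUsers"),
    ("users", "excludeGroups"),
    ("applications", "excludeApplications"),
    ("locations", "excludeLocations"),
]


def audit_mfa_policies(policies):
    """Return {policy name: [reasons]} for MFA policies with potential gaps."""
    findings = {}
    for policy in policies:
        grant = policy.get("grantControls") or {}
        if "mfa" not in (grant.get("builtInControls") or []):
            continue  # Only audit policies that are meant to require MFA.
        reasons = []
        if policy.get("state") != "enabled":
            reasons.append(f"state is {policy.get('state')!r}, not enforced")
        conditions = policy.get("conditions") or {}
        for section, field in EXCLUSION_FIELDS:
            excluded = (conditions.get(section) or {}).get(field) or []
            if excluded:
                reasons.append(f"{section}.{field} has {len(excluded)} exclusion(s)")
        if reasons:
            findings[policy.get("displayName", "<unnamed>")] = reasons
    return findings


# Invented sample data for illustration.
SAMPLE_POLICIES = [
    {
        "displayName": "Employee MFA",
        "state": "enabled",
        "conditions": {"users": {"includeUsers": ["All"], "excludeUsers": []}},
        "grantControls": {"builtInControls": ["mfa"]},
    },
    {
        "displayName": "Partner MFA",
        "state": "enabledForReportingButNotEnforced",
        "conditions": {
            "users": {"excludeGroups": ["legacy-partners-group-id"]},
            "locations": {"excludeLocations": ["trusted-office-id"]},
        },
        "grantControls": {"builtInControls": ["mfa"]},
    },
]
```

The output is a list of exemptions to review, not a list of misconfigurations: each exclusion still needs a human to decide whether it has a clear business requirement.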

Successful authentication after account disablement

Multiple investigations in this sample documented successful authentication events for accounts that had been disabled in Entra ID, in one case for many months.

This occurs through two distinct mechanisms, both present in this sample. First, in hybrid identity environments, Entra ID disablement does not always propagate immediately or completely to all authentication paths, particularly for legacy protocols and certain application integrations. Second, and more significantly, OAuth tokens issued before disablement remain valid until they expire or are explicitly revoked, regardless of subsequent account state changes.

An attacker can obtain valid session tokens through credential compromise, AiTM session hijacking, or other means, then continue using those tokens after the account is disabled for as long as the token TTL permits.

The operational implication: account disablement is a necessary response to confirmed compromise, but it is not sufficient. Forced token revocation (the "Revoke sign-in sessions" action in Entra ID, for example) must accompany disablement to terminate attacker access through existing tokens. Where AiTM session hijacking is suspected, the response must also include review and revocation of any OAuth application grants made during the compromised session window.
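For teams that script their response, the revocation step can also be issued through the Microsoft Graph `revokeSignInSessions` action on a user. The sketch below only builds the HTTP request and does not send it. The helper name, the placeholder user ID, and the placeholder token are hypothetical, and the caller is assumed to already hold a Graph access token with sufficient permissions.

```python
import urllib.request
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def build_revoke_request(user_id, access_token):
    """Build (but do not send) a POST to the revokeSignInSessions action."""
    url = f"{GRAPH_BASE}/users/{quote(user_id, safe='')}/revokeSignInSessions"
    return urllib.request.Request(
        url,
        data=b"",  # the action takes no request body
        method="POST",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values for illustration; not real identifiers or credentials.
request = build_revoke_request(
    "00000000-0000-0000-0000-000000000000", "PLACEHOLDER_TOKEN"
)
```

Passing the request to `urllib.request.urlopen` would execute it. Revoking sessions does not remove OAuth application grants, so the grant review described above remains a separate step.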

What this means for high-exposure environments

The investigations documented here are not exceptional. They are representative of the volume and variety of identity threats that a high-exposure partner environment generates on a daily basis.

What changed is the speed and depth of the investigation. Each determination drew on login history, IP intelligence, cross-investigation correlation, behavioral baselining, and application access analysis, assembled automatically in minutes, with enough context for a human analyst to act immediately rather than spend the next hour reconstructing the picture from raw logs.

That is the practical case for automated investigation in high-volume environments: analyst judgment applied to every case that warrants it, rather than only the fraction a manually paced team can reach before the next alert wave arrives.

If your environment looks like this one, with broad partner access, unmanaged devices, and more identity alerts than your team can investigate, request a demo to see how Prophet AI handles it.
