Phishing Investigation: A Practical Guide for SOC Analysts

Ajmal Kohgadai
June 2, 2026

Phishing investigation is the process by which security operations center (SOC) analysts determine whether a reported message is malicious, identify recipients, assess any interaction, scope the resulting exposure, and contain the compromise. The discipline applies across all phishing variants, including credential-harvest email, business email compromise (BEC), brand impersonation, and abuse of legitimate communication platforms.

A phishing investigation answers four questions in sequence: whether the message is genuinely malicious, who received it, whether anyone interacted with it, and the extent of any compromise that resulted. The questions are the same regardless of the lure used; the artifacts and evidence sources change.

Background

MITRE ATT&CK catalogs phishing as technique T1566, with sub-techniques covering attachments, links, third-party services, and voice. The framework treats phishing as a delivery mechanism rather than a single outcome, since the same message may serve reconnaissance, credential theft, malware delivery, or fraud.

The FBI Internet Crime Complaint Center recorded phishing and spoofing as the most-reported crime category in its 2024 annual report. The Verizon Data Breach Investigations Report (2026) identified the human element as a contributing factor in 62 percent of breaches.

The investigation workflow

A phishing investigation follows the same core workflow regardless of lure: confirm the report, gather sender and message evidence, determine who was exposed and whether anyone interacted, scope the blast radius, and remediate.

Confirmation

The first step validates that a reported message is in fact phishing, deduplicates it against campaigns already under investigation, and assigns a severity that drives prioritization. Most reported messages turn out to be benign or low-impact, but each still requires validation, because catching the rare genuine account compromise is the reason the queue exists. Verizon data places the median click rate on phishing simulations near 1.5 percent even with sustained training, so user clicks remain a constant feature of the workload.
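Deduplication against known campaigns can be sketched as a fingerprint over fields that tend to stay stable within a campaign. This is a minimal illustration, not a production matcher: the choice of fields (sender domain, normalized subject, URL hosts) and the name `campaign_fingerprint` are assumptions, and real campaigns often rotate these values.

```python
import hashlib
import re
from urllib.parse import urlsplit


def campaign_fingerprint(sender: str, subject: str, urls: list[str]) -> str:
    """Return a stable key so near-identical reports group under one case."""
    sender_domain = sender.rsplit("@", 1)[-1].strip().lower()
    # Drop leading reply/forward prefixes, then collapse whitespace and case.
    stripped = re.sub(r"^(?:(?:re|fw|fwd):\s*)+", "", subject.strip(), flags=re.IGNORECASE)
    normalized_subject = re.sub(r"\s+", " ", stripped).lower()
    # Compare hosts only, since paths and query strings often vary per recipient.
    hosts = sorted({(urlsplit(u).hostname or "").lower() for u in urls})
    material = "|".join([sender_domain, normalized_subject, ",".join(hosts)])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()
```

A new report whose fingerprint matches an open case can be attached to that case instead of opening a duplicate investigation.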

Evidence collection

Analysts collect full message headers, sending infrastructure, authentication results (SPF, DKIM, DMARC), embedded URLs, and any attachments. Live URLs and attachments are resolved or detonated in isolation, typically in a sandbox environment, to prevent inadvertent execution during analysis.
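As a rough sketch of the static part of evidence collection, the Python standard library `email` package can pull authentication verdicts, sender details, and embedded URLs out of a raw message. The sample message, the `summarize` and `domain_of` helpers, and the regular expressions are illustrative assumptions; the sketch only parses text and never fetches a URL or opens an attachment.

```python
import re
from email import message_from_string
from email.policy import default
from email.utils import parseaddr

# Fabricated sample message; every domain and address is a placeholder.
RAW_MESSAGE = """\
Authentication-Results: mx.example.test; spf=fail smtp.mailfrom=sender.test; dkim=none; dmarc=fail header.from=example.test
From: "Accounts" <accounts@example.test>
Reply-To: payments@other.test
Subject: Invoice overdue
Content-Type: text/plain

Pay here: https://login.example.test/verify?id=1
"""


def domain_of(header_value: str) -> str:
    """Return the lowercased domain of the address in a header value."""
    address = parseaddr(header_value)[1]
    return address.rsplit("@", 1)[-1].lower() if "@" in address else ""


def summarize(raw: str) -> dict:
    msg = message_from_string(raw, policy=default)
    # Only Authentication-Results headers stamped by the receiving
    # infrastructure should be trusted; this sketch does not check that.
    auth_text = " ".join(str(h) for h in msg.get_all("Authentication-Results", []))
    auth = {"spf": None, "dkim": None, "dmarc": None}
    # If a mechanism appears more than once, the last verdict wins here.
    for mechanism, verdict in re.findall(r"\b(spf|dkim|dmarc)=(\w+)", auth_text, flags=re.IGNORECASE):
        auth[mechanism.lower()] = verdict.lower()
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""
    from_domain = domain_of(str(msg.get("From", "")))
    reply_domain = domain_of(str(msg.get("Reply-To", "")))
    return {
        "auth": auth,
        "from_domain": from_domain,
        "reply_to_mismatch": bool(reply_domain) and reply_domain != from_domain,
        "urls": re.findall(r"https?://[^\s\"'<>]+", text),
    }
```

Calling `summarize(RAW_MESSAGE)` yields failed SPF and DMARC verdicts, a Reply-To domain that differs from the From domain, and one URL to hand to the sandbox.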

Exposure and interaction

Recipients are identified and partitioned by interaction: those who only received the message, those who opened it, those who clicked embedded links or opened attachments, and those who entered credentials or replied. Interaction converts a delivered message into an incident.
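The partitioning described above can be sketched by keeping only the deepest interaction seen for each recipient. The `Interaction` levels mirror the four groups in the paragraph; the event tuples and the `partition` function are hypothetical stand-ins for whatever the mail and proxy logs actually provide.

```python
from collections import defaultdict
from enum import IntEnum


class Interaction(IntEnum):
    DELIVERED = 0   # received only
    OPENED = 1      # opened the message
    CLICKED = 2     # clicked a link or opened an attachment
    SUBMITTED = 3   # entered credentials or replied


def partition(recipients, events):
    """Group recipients by the deepest interaction recorded for each."""
    deepest = {user: Interaction.DELIVERED for user in recipients}
    for user, level in events:
        if user in deepest and level > deepest[user]:
            deepest[user] = level
    buckets = defaultdict(list)
    for user, level in deepest.items():
        buckets[level].append(user)
    return {level: sorted(users) for level, users in buckets.items()}
```

With recipients `["a", "b", "c"]` and events where `a` opened and then clicked while `b` only opened, `a` lands in `CLICKED`, `b` in `OPENED`, and `c` in `DELIVERED`.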

Scoping

For any interacting recipient, the analyst reviews authentication logs, sign-in history, recently created inbox rules, OAuth grants, and outbound mail activity. These second-order indicators reveal whether attacker access has extended beyond the original inbox.
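One way to picture scoping is as a timeline filter: keep the account events recorded at or after the interaction and surface those that could give an attacker persistence. The `AccountEvent` record, the event kind names, and `scope_account` are assumptions made for illustration; real sign-in and audit logs have their own schemas.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AccountEvent:
    kind: str        # hypothetical labels, e.g. "sign_in", "inbox_rule_created"
    when: datetime
    detail: str


# Event kinds treated as possible persistence mechanisms in this sketch.
PERSISTENCE_KINDS = {"inbox_rule_created", "oauth_grant"}


def scope_account(events, interacted_at):
    """Return post-interaction activity and the subset that suggests persistence."""
    timeline = sorted(
        (e for e in events if e.when >= interacted_at),
        key=lambda e: e.when,
    )
    return {
        "timeline": timeline,
        "persistence": [e for e in timeline if e.kind in PERSISTENCE_KINDS],
    }
```

An empty `persistence` list does not clear the account on its own; sign-ins and outbound mail in the timeline still need review.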

Containment and remediation

Containment actions include purging the message across mailboxes, resetting affected credentials, revoking active sessions, blocking malicious infrastructure, and notifying affected users. The actions taken follow from the evidence gathered. Remediation applied before scoping risks leaving persistent access mechanisms such as forwarding rules or OAuth grants intact.

Variants

The five-step workflow is constant across variants; that is the point of a unified method. What varies by lure is where the analyst spends time and which evidence carries the verdict, so the lure indicates which step deserves the most scrutiny.

Each variant has dedicated coverage in the linked guides.

Prioritization across variants typically follows business impact rather than arrival order. A single targeted payment-fraud attempt aimed at finance outranks a high-volume credential-harvest campaign, even when the latter generates more queue activity.
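Impact-first ordering can be sketched as a sort on an impact weight, with arrival time used only to break ties. The weights, variant labels, and `Case` record are placeholders that a team would replace with its own severity model.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical weights; higher means greater business impact.
IMPACT = {"payment_fraud": 3, "credential_harvest": 2, "brand_impersonation": 1}


@dataclass(frozen=True)
class Case:
    case_id: str
    variant: str
    arrived: datetime


def prioritize(cases):
    """Order by business impact first, then by arrival time within a tier."""
    return sorted(cases, key=lambda c: (-IMPACT.get(c.variant, 0), c.arrived))
```

Under these weights, a payment-fraud case that arrives after several credential-harvest reports still moves to the front of the queue.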

Evidence preservation

Evidence is preserved before remediation, since containment actions can destroy the artifacts needed to support post-incident reporting, audit, or law-enforcement disclosure. Standard evidence includes:

  • The original message in its native format, with full headers and any attachments intact.
  • Authentication results (SPF, DKIM, DMARC) and the sending infrastructure recorded at receipt.
  • Embedded URLs and their resolved destinations, captured before takedown or rotation removes them.
  • Attachment hashes and any sandbox detonation reports.
  • Recipient and interaction records: who received, opened, clicked, or replied.
  • For any interacting account, sign-in history, inbox-rule changes, and OAuth grants.

Indicators of compromise extracted from these artifacts (sender infrastructure, lookalike domains, credential-harvest URLs, malicious file hashes) feed two destinations. They close the current case, and they update detection rules so subsequent instances are caught without manual intervention. An investigation that ends without writing back into detection engineering is a case likely to recur.
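A small sketch of turning preserved artifacts into an indicator record follows. The record layout and the helper names (`sha256_hex`, `defang`, `build_ioc_record`) are assumptions; the defanging convention shown (`hxxp` and `[.]`) is one common choice, not a standard.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hash attachment bytes so the file can be matched without being shared."""
    return hashlib.sha256(data).hexdigest()


def defang(url: str) -> str:
    """Rewrite a URL so it is not clickable or auto-linked in a report."""
    return url.replace("http", "hxxp", 1).replace(".", "[.]")


def build_ioc_record(sender_domain: str, urls: list[str], attachments: dict[str, bytes]) -> dict:
    return {
        "sender_domain": sender_domain.lower(),
        # A set removes duplicates before the URLs are written out.
        "urls": sorted({defang(u) for u in urls}),
        "file_hashes": {name: sha256_hex(data) for name, data in attachments.items()},
    }
```

For example, `defang("https://login.example.test/verify")` returns `hxxps://login[.]example[.]test/verify`. The same record can be attached to the case and handed to detection engineering.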

Scale and automation

The investigation workflow does not compress easily. Routine alerts such as retroactive quarantine notices, where a delivered message is purged after a verdict is issued, force the same manual sequence for each instance: identify recipients during the dwell window, check for clicks or interaction, and assess any session or credential exposure. At volume, these low-impact cases consume analyst time that would otherwise go to higher-severity work.

Automation of the workflow takes two forms. Security orchestration, automation, and response (SOAR) platforms execute fixed playbooks to collect headers, run enrichment lookups, and trigger predefined containment actions. AI-based investigation systems apply the workflow to each reported message and produce a documented determination, including the queries run and the evidence considered. Both approaches preserve the core five-step sequence; the difference lies in how each step is executed and how many cases the system can handle without human input.
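The fixed-playbook model can be illustrated as an ordered list of steps that each act on a case and leave an entry in an audit trail. The steps below are stubs invented for this sketch; they do not represent any SOAR product's API, and a real playbook would call mail, identity, and threat-intelligence services.

```python
def collect_headers(case: dict) -> str:
    case["headers_collected"] = True
    return "headers collected"


def enrich_indicators(case: dict) -> str:
    # Stub verdict: treat any URL on the hypothetical blocklist as malicious.
    case["malicious"] = any(u in case.get("blocklist", set()) for u in case.get("urls", []))
    return "malicious" if case["malicious"] else "no known-bad indicators"


def contain_if_malicious(case: dict) -> str:
    if case.get("malicious"):
        case["purged"] = True
        return "message purged"
    return "no containment needed"


def run_playbook(case: dict, steps) -> list[str]:
    """Run each step in order and record what it did."""
    trail = []
    for step in steps:
        trail.append(f"{step.__name__}: {step(case)}")
    return trail


PLAYBOOK = [collect_headers, enrich_indicators, contain_if_malicious]
```

The returned trail is the kind of documented record an analyst can review afterward, whichever style of automation produced it.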

Phishing reports rarely arrive in isolation. Each one is an input into a broader alert triage pipeline that aggregates detections from multiple sources and routes them for analysis, so a phishing workflow has to hold up inside a much larger queue.

What stays fixed is the method. A credential-harvest email, a BEC request, and a notification routed through legitimate infrastructure all resolve through the same five steps; the lure only decides where the scrutiny lands. For most teams the constraint is applying that method consistently at the volume the queue delivers, which is where automation has to carry the depth a manual process cannot sustain. See how Prophet AI investigates phishing on every reported alert.
