The Rise of Big Data | Solving Today’s Challenges with SentinelOne XDR

Extended Detection and Response (XDR) has become a prominent topic amongst security vendors and analysts in recent months. The promise of improved threat detections across a broader range of interconnected hardware and software solutions feels a lot like the early days of SIEM as it expanded beyond simple log management capabilities. As with most legacy SIEM deployments, early XDR customers have struggled to balance investments in tooling against positive business outcomes. In fact, most enterprises have yet to fully consider the cost and complexity associated with the data collection and analytics required by some vendor XDR solutions. Let’s take a look at the challenges in more detail and consider how SentinelOne is revolutionizing the XDR landscape by tackling one of the largest and most complex obstacles threatening successful XDR adoption: data management at scale.

Data Is Growing, Exponentially

IDC predicts that by 2025, the total volume of data stored globally will reach 175ZB! That’s a whopping 5-fold increase from 2018 (33ZB). For those who stopped counting at gigabytes, one zettabyte is equal to one trillion GB. Stored on DVD media, that would be a stack of disks spanning to the moon and back 12 times! But how does this data break down, and how much of it can be used to inform better security decisions to keep enterprises safe from targeted attacks? Let’s dig a little deeper into the data.

Of the predicted 175ZB, roughly 85% is enterprise and/or public cloud data storage. More importantly, IDC predicts that by 2025 as much as 30% of this data will be classified as ‘real-time, sensorized’ telemetry from endpoint and IoT devices. This presents an enormous challenge – as well as opportunity – for enterprises looking to improve their security posture by leveraging this abundant wealth of data.

Source: IDC/Seagate DataAge Study

Remember that data alone is not useful and that more data does not magically become more useful by volume. Data must be contextualized and analyzed to become information. By that same understanding, we know that information only becomes knowledge once we apply meaningful linkages between multiple points of information, assembling the contextualized data into actionable results. Therefore, data without context tends to be superfluous, and our human brains quickly try to expel such unimportant bits of data.

Effective Data Management Requires Context

Most enterprises today generate mountains of telemetry data for each and every entity, including the activity logs from users, devices, applications and sensors. In this ‘age-of-observability’ we can be certain that nothing important happens without a corresponding record of it having occurred. This typically takes the form of a log or event: a transactional message that describes the entity, action, attributes and possibly a response condition. Other forms of telemetry include simple metrics made up of sampled or summarized measurements.
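As a rough illustration, an event of the kind described above might be modeled as follows. This is only a sketch: the `TelemetryEvent` class and its field names are hypothetical, chosen to mirror the entity/action/attributes/response breakdown, and are not drawn from any SentinelOne schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TelemetryEvent:
    """Hypothetical shape of a single log/event record (not a vendor schema)."""
    timestamp: datetime
    entity: str                                      # who or what acted
    action: str                                      # what happened
    attributes: dict = field(default_factory=dict)   # extra context
    response: Optional[str] = None                   # outcome, if reported

    def summary(self) -> str:
        # Fall back to "unknown" when the source gave no response condition.
        outcome = self.response or "unknown"
        return f"{self.entity} {self.action} -> {outcome}"


event = TelemetryEvent(
    timestamp=datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc),
    entity="user:alice",
    action="login",
    attributes={"src_ip": "10.0.0.5"},
    response="success",
)
print(event.summary())
```

Keeping the response optional reflects the point that an event only "possibly" carries a response condition.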

Information security has taught us that even the most innocuous and banal sets of data might somehow be relevant in the scope of an investigation or malicious detection. Frequently, we don’t know what we don’t know until long after a successful breach from a stealthy adversary. While most attacks can be thwarted by an effective endpoint detection and prevention platform before they impact the enterprise, analysis of the breadcrumb trail left behind can be the only effective means to identify the attackers’ TTPs (tactics, techniques and procedures) as well as possible motivations and the scope of an attack.

Singularity ActiveEDR/XDR leverages the unique capabilities of SentinelOne’s patented Storyline technology to stitch together disparate security events into a single timeline and attack visualization, complete with MITRE ATT&CK technique attribution as well as threat actor details where possible.

Rogue Devices / Shadow IT Creates Information Blind Spots

There’s also much to be learned from what is NOT in the sensor data collected by an enterprise. Attackers are opportunistic and will target any and all exposed devices – not just the ones that are known to the security operations team. As the enterprise attack surface expands (thanks to IoT, cloud transformation, containerized workloads and BYOD) so too does the need to expand our sources of telemetry, minimizing or eliminating any blind spots that inevitably exist.

Most organizations struggle to maintain an accurate inventory of connected devices, and fewer yet have the ability to identify when rogue or orphan devices appear on the network that could pose a potential security risk.

By harnessing the existing sensor grid – and the data collected from it – enterprises can more quickly identify gaps in security coverage to protect more of the attack surface. When event volumes from existing sensors change without a justified policy modification, security operations can be notified to ensure a configuration change – whether malicious or benign – hasn’t left the device in a state where logging is disabled or reduced.
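The volume check described above can be sketched in a few lines. This is an illustrative assumption about how such a check might work, not a description of any product feature: the function name `find_volume_anomalies`, the sensor names and the 50% threshold are all hypothetical.

```python
def find_volume_anomalies(baseline, current, threshold=0.5):
    """Return {sensor: relative_change} for sensors whose event count moved
    more than `threshold` (as a fraction of baseline) in either direction."""
    anomalies = {}
    for sensor, expected in baseline.items():
        if expected <= 0:
            continue  # no meaningful baseline to compare against
        observed = current.get(sensor, 0)  # a missing sensor counts as silent
        change = (observed - expected) / expected
        if abs(change) > threshold:
            anomalies[sensor] = round(change, 2)
    return anomalies


# Hypothetical daily event counts per sensor.
baseline = {"fw-01": 1000, "laptop-17": 400, "dns-02": 5000}
current = {"fw-01": 950, "laptop-17": 20, "dns-02": 5200}

flagged = find_volume_anomalies(baseline, current)
print(flagged)
```

Here only `laptop-17` is flagged, since its volume fell by 95% against baseline. A real deployment would also need to suppress alerts for changes that match an approved policy modification.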

Singularity Ranger provides enterprises visibility into the entirety of their device estate, enabling security operations teams to quickly identify unmanaged/at-risk devices, fingerprinting their characteristics and highlighting those without protection capabilities. Ranger Deploy can then perform remote agent installation and policy enforcement of supported systems to reduce the enterprise attack surface and improve an organization’s security posture.

Singularity Ranger
Network Visibility & Control. A cloud-delivered, software-defined network discovery solution designed to add global network visibility and control with minimal friction.

Accessibility Through Integration

The volume of sensor data is not the only significant challenge facing enterprises today. More important is the location and cross-platform accessibility of discrete data silos. In cybersecurity use cases, this has for years been the purview of a Security Information and Event Management (SIEM) platform where logs/events were collected and stored from the most common sources of telemetry, namely firewalls, intrusion detection platforms, legacy antivirus solutions and a short list of critical server assets.

With the advent of Endpoint Detection & Response (EDR), enterprises have access to enormous volumes of high-fidelity, high-value, real-time event data from protected endpoints, but this data typically resides in an entirely separate data repository from SIEM. As more enterprise workloads are moved to PaaS/IaaS solutions, we see yet another disconnected silo of data from a new set of sensors.

Combining these disparate and quite unique sets of endpoint, cloud, network and security data in one location is costly, and the value realized is often difficult if not impossible to justify. As enterprise security architectures become more diverse, it is more important than ever that cross-vendor data analytic models become part of an effective detection and protection arsenal.

The Singularity Marketplace ensures that the growing list of partners in the SentinelOne security ecosystem can be easily integrated into both the data collection pipeline as well as the response and remediation options of a diverse enterprise.

The sheer number of telemetry sources, combined with the unique nature of each data source (different formats, content, context and cardinality) has created a challenging data problem for today’s enterprise. To effectively consume, parse, enrich, normalize, store and analyze this massive set of data is not a cost-effective proposition for most organizations. As a result, most enterprises are faced with the burden of selectively choosing which data sources to process based on the perceived value of each as it relates to business process improvement or greater security efficacy.
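To make the parse-and-normalize problem concrete, here is a minimal sketch that maps two differently formatted sources onto one common record shape. Both input formats, the parser names and the four-field schema are hypothetical, invented for illustration only; real sources are far messier.

```python
import json


def parse_firewall(line):
    """Parse a hypothetical 'key=value' firewall line into the common schema."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    return {
        "source": "firewall",
        "entity": fields.get("src"),
        "action": fields.get("action"),
        "target": fields.get("dst"),
    }


def parse_dns(raw):
    """Parse a hypothetical JSON DNS record into the common schema."""
    record = json.loads(raw)
    return {
        "source": "dns",
        "entity": record.get("client"),
        "action": "query",
        "target": record.get("qname"),
    }


# One parser per source type; every new source needs its own entry.
PARSERS = {"firewall": parse_firewall, "dns": parse_dns}


def normalize(source, raw):
    """Dispatch to the parser registered for this source type."""
    return PARSERS[source](raw)


events = [
    normalize("firewall", "src=10.0.0.5 dst=203.0.113.9 action=deny"),
    normalize("dns", '{"client": "10.0.0.5", "qname": "example.com"}'),
]
for e in events:
    print(e)
```

Even in this toy form, each additional source means another parser to write and maintain, which is where much of the cost comes from.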

Singularity Marketplace
Extend the power of the Singularity XDR Platform with one-click applications for unified prevention, detection, and response.

Not all data is equal in terms of value from a security operations perspective. Sometimes, the easiest data to consume (WMI logs from Windows, for example) can be the least useful in terms of threat detection and security incident triage. More often, the most voluminous logs within an enterprise, like network flow data, email transaction logs, DNS request/response events and authentication alerts, provide greater value, but the low signal-to-noise ratio makes them too cumbersome to collect and process in real-time without an efficient, performant and scalable data management platform.

Data Retention: The Key to Effective Threat Hunting

Another challenge facing enterprise security teams is the cost implication of long-term retention and searchability of collected telemetry. Consuming high-value, high-volume data but being forced to ‘roll over’ after 30 days certainly fails the SecOps use case of historical hunting.

In fact, most vendors tend to cap retention at between 7 and 30 days! As we saw recently with the SolarWinds supply chain attack, it was months before the security community was made aware of the malicious artifacts and adversarial TTPs. This meant that many organizations were unable to perform historical hunting across the relevant time window because those logs had already aged out of the platform or had been moved into offline archives, making it difficult to triage the scope of the attack.

Customers of the SentinelOne Singularity platform can perform real-time threat hunting across a live 365-day retention period, allowing SOC analysts full artifact and adversarial TTP visibility across an entire year of event collection.
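The effect of the retention period on historical hunting is simple date arithmetic. The sketch below estimates what fraction of a hunt window is still searchable for a given retention setting. The `hunt_coverage` helper and the dates used are hypothetical and purely illustrative; they are not taken from any specific incident timeline.

```python
from datetime import date, timedelta


def hunt_coverage(today, retention_days, window_start, window_end):
    """Return the fraction of the hunt window [window_start, window_end]
    (inclusive) that is still within live retention as of `today`."""
    oldest_kept = today - timedelta(days=retention_days)
    total_days = (window_end - window_start).days + 1
    covered_start = max(window_start, oldest_kept)
    covered_days = max(0, (window_end - covered_start).days + 1)
    return covered_days / total_days


# Illustrative scenario: a four-month window of interest, hunted afterwards.
today = date(2021, 1, 1)
window = (date(2020, 9, 1), date(2020, 12, 31))

for days in (7, 30, 365):
    print(days, round(hunt_coverage(today, days, *window), 2))
```

With a 30-day cap, only about a quarter of this example window remains searchable, while a 365-day period covers all of it.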

Automated hunting and alerting rules can be created using SentinelOne’s patented STAR™ (Storyline Active Response) functionality, triggering on data from real-time and historical EDR and 3rd-party telemetry stored in the Deep Visibility data store. Content packs containing relevant adversarial artifacts (IoCs) are published for automated detection of known threat actor campaigns. For even longer-term retention, we will be offering a capability called HindSight, which provides a facility to archive even longer periods of data for limitless retroactive threat hunting across the entire scope and duration of data collected.
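Conceptually, a retroactive IoC sweep is a set-membership test run over stored events. The sketch below shows that idea in generic Python; it is not STAR rule syntax or a Deep Visibility query, and the event fields, indicator values and `sweep_for_iocs` function are hypothetical placeholders.

```python
def sweep_for_iocs(events, iocs):
    """Return stored events whose file hash or domain appears in the IoC set."""
    hits = []
    for event in events:
        # Collect the indicator-like values this event carries, if any.
        observed = {event.get("sha256"), event.get("domain")} - {None}
        matched = observed & iocs
        if matched:
            hits.append({**event, "matched": sorted(matched)})
    return hits


# Placeholder indicators, standing in for a published content pack.
iocs = {"bad-domain.example", "aaaa1111"}

# Placeholder historical events.
stored_events = [
    {"host": "srv-01", "domain": "bad-domain.example"},
    {"host": "wks-02", "sha256": "bbbb2222"},
    {"host": "wks-03", "sha256": "aaaa1111", "domain": "intranet.example"},
]

hits = sweep_for_iocs(stored_events, iocs)
for hit in hits:
    print(hit["host"], hit["matched"])
```

The sweep can only find what is still stored, which is why the retention period bounds how far back such hunting can reach.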

Key Takeaways

The solution to the challenges of data management at scale is a data management strategy that democratizes the data generated, collected and analyzed by an enterprise.

As a general rule:

  • No one application should hold your data hostage
  • Duplication of data in multiple repositories is costly and unmanageable
  • Maintaining disparate data silos leads to missed threat detections and blind spots in security incident triage and scoping efforts
  • Enterprises should never be faced with the necessity to collect/store reduced volumes of highly relevant sensor data to justify the Cost:Value equation

In the next post in this blog series on XDR, we will highlight some of the unique capabilities delivered through the SentinelOne Data Platform (formerly Scalyr Event Data Cloud). Stay tuned for a deeper look into how SentinelOne is transforming the XDR landscape with unparalleled sensor collection and processing capabilities, an improved signal-to-noise ratio, meaningful threat detections that span multiple sources, and prescriptive and actionable response integrations.

SentinelOne Singularity XDR
See how SentinelOne XDR provides end-to-end enterprise visibility, powerful analytics, and automated response across your complete technology stack.