Visibility Events / Indexes / Data Sources

Linux Endpoint Detection and Response (EDR) solutions rely on collecting and analyzing a wide range of visibility events, indexes, and data sources to detect threats, investigate incidents, and monitor system security. These data points are critical for understanding system behavior, identifying anomalies, and responding to security incidents. The key visibility events, indexes, and data sources in a Linux EDR context are explained below:

1. VISIBILITY EVENTS:

Visibility events are specific activities or occurrences on a Linux system that are monitored and logged by the EDR solution. These events provide the raw data needed for threat detection and forensic analysis.

Process Events -> Monitors process creation and termination events on Linux systems:

Process Execution: Logs when a process is started, including the command line, parent process, and user context.
Process Termination: Tracks when a process ends, including the exit code and duration.
Process Tree: Captures the hierarchy of processes (parent-child relationships) to understand the sequence of execution.
Process Injection: Detects when a process injects code into another process, e.g., via ptrace or a preloaded shared object (common in malware).
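
The process-tree idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular EDR agent works: the `ProcessEvent` record, its field names, the `build_tree` and `ancestry` helpers, and the sample events are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessEvent:
    """Hypothetical process-creation record (field names are assumptions)."""
    pid: int
    ppid: int
    cmdline: str
    user: str


def build_tree(events):
    """Map each parent PID to the list of child PIDs it spawned."""
    children = defaultdict(list)
    for ev in events:
        children[ev.ppid].append(ev.pid)
    return dict(children)


def ancestry(events, pid):
    """Return command lines from the oldest known ancestor down to pid."""
    by_pid = {ev.pid: ev for ev in events}
    chain = []
    seen = set()  # guards against loops caused by inconsistent data
    while pid in by_pid and pid not in seen:
        seen.add(pid)
        chain.append(by_pid[pid].cmdline)
        pid = by_pid[pid].ppid
    return list(reversed(chain))


# Invented sample data for illustration only.
EVENTS = [
    ProcessEvent(100, 1, "sshd: alice", "root"),
    ProcessEvent(101, 100, "bash", "alice"),
    ProcessEvent(102, 101, "curl http://example.test/x.sh", "alice"),
    ProcessEvent(103, 101, "sh x.sh", "alice"),
]

if __name__ == "__main__":
    print(build_tree(EVENTS))
    print(" -> ".join(ancestry(EVENTS, 103)))
```

Keying on the PID alone is a simplification: the kernel reuses PIDs, so a real pipeline would likely key on the PID plus the process start time.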

File Events -> Tracks file-level interactions on Linux systems, essential for uncovering persistence mechanisms, staging activity, and payload delivery:

File Creation/Modification/Deletion: Logs when files are created, modified, or deleted, including the file path and user context.
File Access: Tracks when sensitive files are read or executed.
File Integrity Monitoring (FIM): Monitors changes to critical system files (e.g., /etc/passwd, /bin/bash).
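
File integrity monitoring can be approximated by hashing files and comparing the result against a stored baseline. The sketch below assumes SHA-256 and a plain dictionary as the baseline store. The function names are hypothetical, and a production FIM tool would also track ownership, permissions, and timestamps.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def take_baseline(paths):
    """Record a hash for every listed path that currently exists as a file."""
    return {str(p): sha256_of(p) for p in paths if Path(p).is_file()}


def compare(baseline, current):
    """Classify paths as created, deleted, or modified relative to the baseline."""
    created = sorted(set(current) - set(baseline))
    deleted = sorted(set(baseline) - set(current))
    modified = sorted(
        p for p in baseline.keys() & current.keys() if baseline[p] != current[p]
    )
    return {"created": created, "deleted": deleted, "modified": modified}
```

A typical use would be to call `take_baseline` once on a known-good system for paths such as /etc/passwd, store the result somewhere the monitored host cannot modify, and call `compare` on a schedule.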

Script Events -> Provides visibility into script execution on Linux systems, capturing shell scripts, Python, and other scripting languages commonly used in attacks:

Script Creation: Logs when script files are created, modified, or deleted, including the file path and user context.
Script Execution: Captures the content of executed scripts so that it can be inspected.

Network Events -> Tracks network connections and DNS resolution on Linux systems to establish context around external communication and potential command-and-control:

Network Connections: Logs inbound and outbound connections, including IP addresses, ports, and protocols.
DNS Queries: Tracks DNS requests made by processes, which can reveal communication with malicious domains.
Socket Activity: Monitors socket creation, binding, and communication.
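
One simple way to observe TCP sockets on Linux is to read /proc/net/tcp, which lists IPv4 sockets with hex-encoded addresses. The sketch below decodes that format and assumes a little-endian host such as x86-64 or most ARM systems. The sample text is invented, and the function names are hypothetical.

```python
import socket

# Subset of the TCP state codes used in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}


def decode_endpoint(hex_endpoint):
    """Decode 'HEXIP:HEXPORT' (IPv4 only, little-endian host assumed)."""
    hex_ip, hex_port = hex_endpoint.split(":")
    ip = socket.inet_ntoa(int(hex_ip, 16).to_bytes(4, "little"))
    return ip, int(hex_port, 16)


def parse_proc_net_tcp(text):
    """Yield one dict per socket row, skipping the header line."""
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        yield {
            "local": decode_endpoint(fields[1]),
            "remote": decode_endpoint(fields[2]),
            "state": TCP_STATES.get(fields[3], fields[3]),
            "uid": int(fields[7]),
            "inode": int(fields[9]),
        }


# Invented sample in the /proc/net/tcp layout.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0035 00000000:0000 0A 00000000:00000000 00:00000000 00000000   101        0 23456 1 0000000000000000 100 0 0 10 0
   1: 0F02000A:B5F6 0A7100CB:01BB 01 00000000:00000000 00:00000000 00000000  1000        0 34567 1 0000000000000000 20 4 30 10 -1
"""

if __name__ == "__main__":
    for row in parse_proc_net_tcp(SAMPLE):
        print(row)
```

Polling this file misses short-lived connections, and it does not name the owning process; that requires matching the socket inode against the entries under /proc/<pid>/fd. This is one reason event-driven sources such as eBPF or the audit framework are generally preferred for network telemetry.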

User Creation and Authentication Events -> Provides insight into user account management activities on Linux systems, tracking the creation, modification, deletion, and authentication of user accounts:

User/Group Creation: Logs when users are created, modified, or deleted, including the binary and user context.
User Logins/Logouts: Tracks user login sessions, including SSH logins, console logins, and sudo usage.
Failed Login Attempts: Detects brute-force attacks or unauthorized access attempts.
Privilege Escalation: Logs when users escalate privileges (e.g., using sudo or su).
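
Failed-login detection can be illustrated with a small counter over SSH authentication log lines. The sketch assumes the common OpenSSH "Failed password for ... from ... port ..." message. Other authentication methods log different messages. The threshold, the function names, and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# Group 1: user name, group 2: source IP, group 3: source port.
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\S+) port (\d+)"
)


def failed_logins_by_ip(lines):
    """Count failed password attempts per source IP."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def brute_force_suspects(lines, threshold=3):
    """Return source IPs with at least `threshold` failed attempts."""
    counts = failed_logins_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


# Invented sample log lines.
SAMPLE_LOG = [
    "Jan 10 03:12:01 web01 sshd[2211]: Failed password for invalid user admin from 198.51.100.7 port 40122 ssh2",
    "Jan 10 03:12:03 web01 sshd[2211]: Failed password for root from 198.51.100.7 port 40124 ssh2",
    "Jan 10 03:12:05 web01 sshd[2213]: Failed password for root from 198.51.100.7 port 40130 ssh2",
    "Jan 10 03:14:44 web01 sshd[2290]: Failed password for alice from 192.0.2.44 port 51514 ssh2",
    "Jan 10 03:15:02 web01 sshd[2290]: Accepted password for alice from 192.0.2.44 port 51514 ssh2",
]

if __name__ == "__main__":
    print(brute_force_suspects(SAMPLE_LOG))
```

A real detection would also bound the count by a time window, so that three failures over a month do not look the same as three failures in five seconds.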

Scheduled Task Events -> Captures telemetry related to scheduled tasks and cron jobs on Linux systems, which are common persistence mechanisms:

Cron Execution/Modification: Logs when cron jobs run and when cron job definitions are created or changed.

Service / systemctl Events -> Tracks systemd service operations on Linux systems, which are often used for persistence or execution:

Service creation, modification, and deletion events

Kernel / Driver / eBPF Events -> Monitors kernel modules and eBPF events on Linux systems that may affect the stability, security, or integrity of the system.

System Calls: Monitors system calls made by processes, which can reveal malicious activity (e.g., execve, ptrace).
Kernel Module Loading: Detects when kernel modules are loaded or unloaded (common in rootkit attacks).
eBPF Program Loading: Detects when eBPF programs are loaded or unloaded (seen in rootkits and credential stealers).
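
As a rough illustration of kernel module visibility, the sketch below compares two snapshots of /proc/modules, where each line starts with the module name and its size, and reports what was loaded or unloaded in between. The function names and sample snapshots are hypothetical.

```python
def parse_proc_modules(text):
    """Return {module_name: size_in_bytes} from /proc/modules-style text."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            modules[fields[0]] = int(fields[1])
    return modules


def diff_modules(before, after):
    """Report module names that appeared or disappeared between snapshots."""
    return {
        "loaded": sorted(set(after) - set(before)),
        "unloaded": sorted(set(before) - set(after)),
    }


# Invented snapshots in the /proc/modules layout.
BEFORE = (
    "nf_tables 188416 0 - Live 0x0000000000000000\n"
    "loop 40960 0 - Live 0x0000000000000000\n"
)
AFTER = (
    "nf_tables 188416 0 - Live 0x0000000000000000\n"
    "evil_mod 16384 0 - Live 0x0000000000000000 (OE)\n"
)

if __name__ == "__main__":
    print(diff_modules(parse_proc_modules(BEFORE), parse_proc_modules(AFTER)))
```

Snapshot comparison is easy to evade: a rootkit can hide its module from /proc/modules after loading. Capturing the load event itself, for example through the audit framework or an eBPF hook, is therefore the more dependable approach.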

Container and Orchestration Events:

Container Start/Stop: Logs when containers are started or stopped.
Image Integrity: Monitors changes to container images.
Kubernetes API Calls: Tracks interactions with the Kubernetes API server.

Malware and Exploit Indicators:

Memory Anomalies: Detects unusual memory usage or code injection.
Suspicious Script Execution: Logs execution of scripts (e.g., Bash, Python) with unusual patterns.
Exploit Attempts: Detects attempts to exploit vulnerabilities (e.g., buffer overflows, privilege escalation).

Agent Events -> Includes operational telemetry from the EDR agent itself to track its lifecycle and health on Linux systems:

Agent start and stop events
Agent status
Agent logs: warning, notice, and critical messages
Agent connection status to the cloud API service
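
Agent health is often checked on the server side by looking for agents that have stopped reporting. The sketch below assumes that each agent sends a periodic keep-alive and that the backend stores the last timestamp per agent. The 180-second limit and all names are invented.

```python
def stale_agents(last_keepalive, now, max_silence=180.0):
    """Return IDs of agents whose latest keep-alive is older than max_silence seconds."""
    return sorted(
        agent_id
        for agent_id, seen_at in last_keepalive.items()
        if now - seen_at > max_silence
    )


if __name__ == "__main__":
    # Invented timestamps (seconds); host-a and host-c have gone quiet.
    seen = {"host-a": 1000.0, "host-b": 1150.0, "host-c": 700.0}
    print(stale_agents(seen, now=1200.0))
```

A silent agent does not prove tampering, since a reboot or network outage looks the same, so this signal is usually correlated with agent stop and uninstall events.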

2. INDEXES:

Indexes are structured representations of the collected visibility events, enabling efficient searching, correlation, and analysis. Key indexes in a Linux EDR solution include:

Process Index

Tracks all running processes, including their command lines, parent processes, and user contexts.
Enables correlation of processes with other events (e.g., file access, network connections).

File Index

Maintains a record of file activities, including creation, modification, and access.
Helps identify unauthorized changes to critical files.

Network Index

Logs all network connections and DNS queries.
Enables detection of communication with known malicious IPs or domains.

User Index

Tracks user activities, including logins, privilege escalations, and file access.
Helps identify compromised accounts or insider threats.

Threat Intelligence Index

Integrates external threat intelligence feeds to enrich event data with known IOCs (e.g., malicious IPs, file hashes).

Compliance Index

Tracks events related to compliance requirements (e.g., failed login attempts, file modifications).
Generates reports for auditing purposes.
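
To make the indexing idea concrete, here is a toy in-memory inverted index that maps (field, value) pairs to the events containing them. It is a teaching sketch only: the `EventIndex` class, the field names, and the sample events are hypothetical. Real EDR backends typically rely on a dedicated search or columnar store.

```python
from collections import defaultdict


class EventIndex:
    """Toy inverted index: (field, value) -> set of event positions."""

    def __init__(self, indexed_fields):
        self.indexed_fields = tuple(indexed_fields)
        self.events = []
        self.postings = defaultdict(set)

    def add(self, event):
        """Store an event and index it under each configured field it contains."""
        event_id = len(self.events)
        self.events.append(event)
        for field in self.indexed_fields:
            if field in event:
                self.postings[(field, event[field])].add(event_id)
        return event_id

    def search(self, **criteria):
        """Return events matching all field=value criteria, in insertion order."""
        if not criteria:
            return list(self.events)
        id_sets = [self.postings.get(item, set()) for item in criteria.items()]
        matching = set.intersection(*id_sets)
        return [self.events[i] for i in sorted(matching)]


if __name__ == "__main__":
    idx = EventIndex(["type", "pid", "user"])
    idx.add({"type": "process", "pid": 102, "user": "alice", "cmdline": "curl"})
    idx.add({"type": "network", "pid": 102, "user": "alice", "remote": "203.0.113.10:443"})
    idx.add({"type": "file", "pid": 200, "user": "root", "path": "/etc/passwd"})
    # Correlate everything PID 102 did across event types.
    print(idx.search(pid=102))
```

Searching on a shared key such as `pid` is what lets a process event be correlated with the file and network events it caused. Searching on a field that was not indexed returns no results in this sketch.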

3. DATA SOURCES:

Data sources are the underlying systems or components from which the EDR solution collects visibility events. Key data sources in a Linux environment include:

System Logs

Syslog: General system logs that capture a wide range of events, including authentication, kernel messages, and application logs.
Auditd: The Linux audit daemon, which provides detailed logs of system calls, file access, and user activities.
Journald: The systemd journal, which collects structured logs from services and applications.
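
Audit records are written as a header of the form `type=... msg=audit(timestamp:serial):` followed by key=value pairs. The sketch below parses one such line into a dictionary. The sample record is invented, the function name is hypothetical, and hex-encoded values (which auditd uses for untrusted strings) are left undecoded.

```python
import re
import shlex

# Group 1: record type, group 2: epoch timestamp, group 3: event serial, group 4: body.
HEADER_RE = re.compile(r"^type=(\S+) msg=audit\((\d+\.\d+):(\d+)\): ?(.*)$")


def parse_audit_line(line):
    """Parse one audit log line into a dict, or return None if it does not match."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    record = {
        "type": match.group(1),
        "timestamp": float(match.group(2)),
        "serial": int(match.group(3)),
    }
    body = match.group(4)
    try:
        tokens = shlex.split(body)  # strips the quotes around values
    except ValueError:
        tokens = body.split()  # fall back on unbalanced quotes
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            record[key] = value
    return record


# Invented sample record.
SAMPLE = (
    'type=SYSCALL msg=audit(1700000000.123:456): arch=c000003e syscall=59 '
    'success=yes exit=0 pid=1234 uid=1000 comm="curl" exe="/usr/bin/curl" '
    'key="exec_watch"'
)

if __name__ == "__main__":
    print(parse_audit_line(SAMPLE))
```

Records that share the same timestamp and serial belong to one event, so a collector would normally group the parsed lines on that pair before building a process or file event.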

Kernel-Level Data

eBPF (extended Berkeley Packet Filter): A powerful framework for collecting low-level system data, including process execution, network activity, and system calls.
Kernel Audit Framework: Provides detailed logs of kernel-level events, such as process creation and file access.

File System

Inotify: A Linux kernel subsystem that monitors file system events, such as file creation, modification, and deletion.
File Integrity Monitoring (FIM) Tools: Tools like AIDE or Tripwire that track changes to critical files.

Network Data

Packet Captures: Tools like tcpdump, built on the libpcap library, that capture network traffic for analysis.
Netfilter/iptables: Logs network connections and firewall events.

Container and Orchestration Data

Container Runtime Logs: Logs from Docker, containerd, or other container runtimes.
Kubernetes Audit Logs: Logs from the Kubernetes API server, capturing cluster activities.

Memory and Process Data

/proc Filesystem: Provides real-time information about running processes, memory usage, and system state.
Core Dumps: Captures memory snapshots of processes for forensic analysis.

Threat Intelligence Feeds

External sources of IOCs, such as malicious IPs, domains, and file hashes, which enrich the EDR's detection capabilities.

How These Components Work Together

Data Collection: The EDR solution collects visibility events from various data sources (e.g., syslog, auditd, eBPF).
Indexing: The collected events are indexed for efficient searching and correlation.
Analysis: The EDR engine analyzes the indexed data to detect threats, using rules, machine learning, and threat intelligence.
Alerting: When a threat is detected, the EDR generates an alert with detailed context.
Response: Security teams use the EDR's tools to investigate and respond to the threat, leveraging forensic data and automated response capabilities.
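
The analysis and alerting steps can be sketched as a small rule engine that runs over already-collected events. Everything here is an assumption made for illustration: the event fields, the two rules, the `BAD_IPS` set standing in for a threat intelligence feed, and the alert format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """Hypothetical detection rule: a name, a severity, and a predicate."""
    name: str
    severity: str
    matches: Callable[[dict], bool]


BAD_IPS = {"203.0.113.10"}  # stand-in for a threat intelligence feed (invented)

RULES = [
    Rule(
        "shell spawned by web server",
        "high",
        lambda e: e.get("type") == "process"
        and e.get("parent") in {"nginx", "apache2"}
        and e.get("name") in {"sh", "bash"},
    ),
    Rule(
        "connection to known-bad IP",
        "medium",
        lambda e: e.get("type") == "network" and e.get("remote_ip") in BAD_IPS,
    ),
]


def analyze(events, rules=RULES):
    """Run every rule over every event; attach the triggering event to each alert."""
    alerts = []
    for event in events:
        for rule in rules:
            if rule.matches(event):
                alerts.append(
                    {"rule": rule.name, "severity": rule.severity, "event": event}
                )
    return alerts


if __name__ == "__main__":
    sample = [
        {"type": "process", "parent": "nginx", "name": "sh", "pid": 4001},
        {"type": "network", "remote_ip": "203.0.113.10", "pid": 4001},
        {"type": "process", "parent": "sshd", "name": "bash", "pid": 4100},
    ]
    for alert in analyze(sample):
        print(alert["severity"], alert["rule"], alert["event"])
```

Keeping the full event inside the alert gives the analyst the context mentioned in the alerting step. Rules that match single events are the simplest case, and correlation across events would need the indexed data.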

A detailed list of EDR visibility events includes:

Process events:
- Process creation
- Process access
- Process termination
- Process injection / shared object loaded
- Remote thread creation
- Process tampering
File events:
- File creation
- File opened
- File deleted
- File modified
- File renamed
Service events:
- Systemd Service Creation
- Systemd Service Modification
- Systemd Service Deletion
Driver Events:
- LKM Driver Loaded
- LKM Driver Unloaded
- LKM Driver Modification
eBPF events:
- eBPF program Loaded
- eBPF Map Created
- eBPF List Created
Agent Events:
- Agent Start
- Agent Stop
- Agent Install
- Agent Uninstall
- Agent Tampering
- Agent Keep-alive
- Agent Errors
User / Group Account events:
- Local Account Creation
- Local Account Modification
- Local Account Deleted
- Local Group Creation
- Local Group Modification
- Local Group Deleted
- Account Login
- Account Logout
- SSH Key Modification
Network events:
- General network activities:
  - TCP Connection
  - UDP Connection
  - URL
  - File Downloaded
  - network history:
    - first seen
    - last seen
  - bytes sent/received
- TLS fingerprinting:
  - JA3 / JA4
  - JARM
- SSH fingerprinting:
  - HASSH
- HTTP logging:
  - non-HTTP traffic on HTTP port
  - More than one HTTP User-Agent per IP
  - Malicious HTTP Referer
  - Many users per IP
  - Bad HTTP User-Agent(s)
  - Many connections from "suspected countries" (e.g., Russia)
  - Credit card strings in HTTP traffic
  - Strings with high entropy (encoded etc) in HTTP traffic
  - (Malicious) URL patterns
  - HTTP header-based detection of attacks
  - Detection of HTTP C2 Activity
  - Web shells detection
  - Number of requests
  - Number of requests for a domain name
  - Number of requests to URL
- DNS:
  - DNS logging
  - Number of distinct DNS servers in use
  - Number of NXDOMAIN responses by a client
  - Number of DNS requests with given entry type
  - Maximum length of DNS request with given entry type
  - Number of DNS queries for local DNS domain only
  - Number of DNS queries for local DNS domain by a client
  - Number of usages of specific DNS server by a client
  - Number of different ASNs per list (round robin DNS) of resolved IP addresses
  - Reverse DNS (PTR record) domain match resolved IP
  - Domain name Creation Date is taken from registration/whois info
  - Number of Google Results about queried DNS domain name
  - Unusual Time of specific DNS query
  - Domain name First Time Seen / last time seen
  - Maximum size of DNS response no matter what type
  - Long 2nd level domain name
  - Long DNS domain name, with a high number of subdomains
  - Length of the largest meaningful string in queried DNS domain name
  - Percentage of numerical characters per DNS domain name
  - Number of ALL DNS queries per client
  - Short-lived DNS domain queries
  - Ranges of TTL values for DNS responses with a given entry type
  - Domain state
  - Daily similarities in DNS request count per client
  - Number of distinct TTL values in DNS response
  - Malicious, malware, phishing domain name lists → security feeds
  - The sum of bytes of DNS requests with a given entry type
  - The sum of bytes of DNS responses with a given entry type
  - The sum of bytes of all DNS requests
  - The sum of bytes of all DNS responses
  - The sum of bytes of all DNS transmission
  - Number of hosts with non-corporate defined DNS servers
  - High number of DNS responses with NXDOMAIN error code (DGA)
  - High number of potentially malicious types of DNS queries (TXT, MX)
  - The ratio of queried DNS record types per client
  - Top queried DNS record type per client
  - DNS Records with anomalies (like higher entropy, longer than average etc)
  - Possible similar-looking domain name detection → dnstwist integration
  - Signals coming from ET rules related to DNS Query/Responses
  - TOR-based domain names
  - DNS rebinding
  - SRV DNS records
  - Punycode TLD
  - 2 letter subdomains
  - The query for the sub-domain name contains the IP address
  - Domain name based
  - IPs in DNS response (single or round-robin DNS) are in the same class B (/16) range as known C&C servers (from feeds)
  - Low-frequency querying (requires logging all DNS queries)
  - DNS resolve to localhost (loopback) IP
  - DNS tunneling
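
Several of the DNS items above are simple per-name or per-client features. The sketch below computes a few of them: name length, label count, longest label, percentage of numerical characters, character entropy, Punycode labels, and the NXDOMAIN ratio per client. The function names, the input format for responses, and the choice of Shannon entropy over characters are assumptions. No thresholds are suggested, because those depend on the environment.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def dns_name_features(qname):
    """Compute simple lexical features for a queried domain name."""
    name = qname.rstrip(".").lower()
    labels = name.split(".") if name else []
    digits = sum(ch.isdigit() for ch in name)
    return {
        "length": len(name),
        "label_count": len(labels),
        "longest_label": max((len(label) for label in labels), default=0),
        "digit_ratio": digits / len(name) if name else 0.0,
        "entropy": shannon_entropy(name.replace(".", "")),
        "has_punycode": any(label.startswith("xn--") for label in labels),
    }


def nxdomain_ratio(responses):
    """responses: iterable of (client_ip, rcode) pairs. Returns {client_ip: ratio}."""
    total = Counter()
    nxdomain = Counter()
    for client, rcode in responses:
        total[client] += 1
        if rcode == "NXDOMAIN":
            nxdomain[client] += 1
    return {client: nxdomain[client] / total[client] for client in total}


if __name__ == "__main__":
    print(dns_name_features("www.example.com."))
    print(nxdomain_ratio([("10.0.0.5", "NXDOMAIN"), ("10.0.0.5", "NOERROR")]))
```

Long, high-entropy names with many labels are typical of tunneling and encoded payloads, and a high NXDOMAIN ratio for one client is a common DGA signal. Either feature alone produces false positives, for example with CDN host names.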

Enhanced Linux telemetry enables organizations to build more robust detection mechanisms, accelerate incident response, and give defenders the visibility they need to safeguard their most essential systems. In the next steps, you will learn how to validate your telemetry and general OS/network visibility effectively.

Linux EDR solutions rely on a rich set of visibility events, indexes, and data sources to provide comprehensive security monitoring. By collecting and analyzing data from system logs, kernel-level events, file systems, network activity, and containers, EDR tools can detect threats, investigate incidents, and ensure the security of Linux systems in dynamic and complex environments. These components work together to provide the visibility and context needed for effective threat detection and response.