OSQuery for DFIR: Turning Your Linux System Into a Live Forensics Database
- DFIRHive

- Oct 17
- 8 min read

I’ve been using OSQuery a lot recently during live response investigations, and it keeps reminding me why it deserves more attention in digital forensics. It changes how you look at an endpoint.
Instead of jumping between a dozen utilities — ps, netstat, find, grep — OSQuery gives you the same visibility in one structured interface. It lets you query your operating system as if it were a database, and it just answers — no parsing, no scripting, no noise.
“What I love most about OSQuery is that it treats the operating system as a relational database — every process, socket, user, or persistence mechanism is represented as a table.”
"All you do is ask the right question in SQL."
Setting Up:
When performing live response on a compromised system, the first rule is simple: avoid installing anything new. Even a harmless binary can modify volatile evidence or alter timestamps that matter later.
I usually run OSQuery as a static binary, either from a removable drive or a mounted forensic image. This keeps the target system untouched.
If you’re testing in your lab or an internal endpoint, you can install it normally. For this walkthrough, we’ll use the standard install method.
Let’s start with setup.
Step 01: Update the package list
sudo apt update

Step 02: Download the right package
Head to the official OSQuery downloads page and grab the package suitable for your environment. For me, that’s Debian x86_64.
The current stable version (at the time of writing) is 5.19.0.

Step 03: Install OSQuery
Once downloaded, install it using your preferred method.
If you’re using a .deb package:
sudo dpkg -i /home/ubuntu/Downloads/osquery_5.19.0-1.linux_amd64.debAnd, make sure the service is enabled:
sudo apt install osquery
sudo systemctl start osqueryd
sudo systemctl enable osqueryd
Step 04: Launch OSQuery in interactive mode
sudo osqueryi
You’ll see:
Using a virtual database. Need help, type '.help'
osquery>
This shell acts like a SQL interface to your live system — you can query tables and explore data without leaving the terminal.
Exploring OSQuery Interactive Mode:
To understand the tool, run the .help command in the interactive terminal

Meta Commands
All meta commands start with a dot (.). These helps to explore and inspect OSQuery’s structure.
.tables
.tables process
.table users
.tables: To view everything OSQuery can query on your system.
.tables process: If you’re only looking for process-related data.

.table user- To list all the tables with the term user in them

Inspecting Specific Table Structures:
Before querying a table, it’s good practice to see what fields it contains: We can list a table's schema with the following meta-command: .schema table_name
.schema users

Now, let’s query few columns to pull live data:
SQL QUERY SYNTAX: select column1, column2, column3 from table;
Example:
SELECT username, description, uid, shell, directory FROM users;
Schema Documentation(Online)
OSQuery maintains an updated schema reference for every release. It’s available here: https://osquery.io/schema/5.19.0
This documentation shows:
All tables available in the current version (e.g., 156 for Linux).
OS compatibility per table.
Each table’s column names, data types, and descriptions.

Here’s what to focus on when navigating it:
Version Selector– Choose the OSQuery version you’re running (for me, 5.19.0). Each version can have different tables or fields.
Table Details – Shows the table name, description, and the columns it exposes. Each column lists its data type (TEXT, INTEGER, BIGINT, etc.) and meaning.
Platform Filter– One can toggle between Linux, Windows, or macOS to see which tables apply to your environment.
Alphabetical List– The sidebar lists all available tables (156 in this version), matching what you’d see with .tables inside OSQuery.
Description Reference – If a column or table isn’t documented clearly, the page links directly to its source on GitHub .
Running Our First Queries — Asking the System Real Questions
OSQuery supports SQLite syntax, so most of your commands will start with SELECT. We’re not updating or deleting data; we’re querying what already exists.
Let’s start simple.
Example 1: To view installed packages on a Debian-based system:
SELECT name, version, source, size FROM deb_packages LIMIT 1;

This command pulls key details — package name, version, source, and size — helping to identify software versions or potential rogue installations.(LIMIT was used followed by the number to limit the results to display.)
Example 2: Count All Packages
In case, need a quick overview:
SELECT count(*) FROM deb_packages;

This count gives a fast baseline — especially useful when comparing clean vs. infected systems or checking integrity during investigations.
Example 3 — Filtering Data Using WHERE
Use it when we want to focus on a specific user or object instead of pulling the entire table.
SELECT * FROM users WHERE username='ubuntu';

Note: Some tables require a WHERE clause — for instance, the file table — or they’ll throw an error. OSQuery does this to protect system resources from being scanned too broadly.
Example 4 — Joining Tables for Context
Now that we’ve looked at filtering, let’s add more context to our data.
For example, suppose we want to see which processes are running under which user. We can join the processes and users tables using their shared column -uid
SELECT p.pid, p.name, p.path, u.username
FROM processes p
JOIN users u ON u.uid = p.uid
LIMIT 10;

Quick Tip — Using Operators and Wildcards
When filtering results with WHERE, we can also refine searches using simple operators or wildcards.
Common Operators: = (equal), <> (not equal), >, <, >=, <=, BETWEEN, and LIKE (pattern match).Wildcards:
% → matches multiple characters
_ → matches a single character
Using OSQuery for Live Response: Hunting Users, Processes & Footprints
Now that we’ve got the basics of OSQuery and understand how it structures system data, let’s look at a few practical queries that actually help during live response.
Once comfortable with these, one can explore further depending on what kind of data or artifacts you’re after.
Investigating User Accounts
Let's start by checking the users table. It’s a simple table, but it helps establish quick context — especially if there are signs of privilege escalation or lateral movement.
SELECT username, uid, description FROM users;

This lists all user accounts present on the host.
It’s always worth comparing this list against the expected set of system users.
Process Hunting:
It’s basically a structured version of ps aux, which is to list all the running processes,but far easier to filter, sort, and combine with other data.
SELECT pid, name, path, state FROM processes;
pid: The unique process ID
name: The name of the executable
path: The full filesystem path to the executable
state: The current state of the process (Idle / Sleep, etc.)
Finding Suspicious Execution Paths
Malicious Actors often run binaries from writable locations like /tmp or /var/tmp to avoid detection and bypass permissions.
SELECT pid, name, path FROM processes WHERE path LIKE '/tmp/%' OR path LIKE '/var/tmp/%';
If anything appears here, it’s worth checking its parent process, execution time, and which user triggered it.
Detecting Fileless or Memory-Resident Processes
One useful feature in OSQuery is its ability to detect processes that exist only in memory — meaning they’re not tied to a file on disk. This often points to fileless malware or injected code.
SELECT pid, name, path, cmdline, start_time
FROM processes
WHERE on_disk = 0;

If on_disk is 0, OSQuery couldn’t locate the binary on the filesystem.This doesn’t automatically mean it’s malicious — some legitimate processes spawn temporary child processes — but it’s always worth verifying.
Finding Orphan Processes
A normal Linux process tree always has parent-child relationships. Suppose we found a process without a parent, it’s worth checking. Sometimes this happens naturally, but in compromised systems, it can point to persistence or stealth mechanisms.
SELECT pid, name, parent, path FROM processes WHERE parent NOT IN (SELECT pid FROM processes);

Processes Executed from User Directories
Legitimate system binaries usually run from locations like /bin, /usr/bin, or /sbin.
Anything executing from /home or /Users is worth reviewing.
SELECT pid, name, path, cmdline, start_time FROM processes WHERE path LIKE '/home/%' OR path LIKE '/Users/%';

Network Triage: Mapping Active Connections
After reviewing processes, the next l step is checking network activity.
The goal is to connect the dots — which process is communicating over the network, and where.
Listing Active Connections
SELECT pid, family, remote_address, remote_port, local_address, local_port, state
FROM process_open_sockets LIMIT 20;

This query works like netstat, but with process mapping built in. It lets you see which PIDs are tied to which network sockets — a clean way to identify connections by process name.
Identifying Remote Connections
Outbound connections, especially to unknown IPs, can reveal C2 or data exfiltration activity. Let's now filter for any process with a remote connection:
SELECT pid, local_address, remote_address, local_port, remote_port
FROM process_open_sockets
WHERE remote_address IS NOT NULL;

To make it easier to read, join this with the processes table:
SELECT p.name, p.path, s.remote_address, s.remote_port
FROM processes p
JOIN process_open_sockets s
ON p.pid = s.pid
WHERE s.remote_address IS NOT NULL;
That query will give a direct process-to-network map — useful when investigating possible outbound traffic or beaconing.
Listing Interfaces and Listening Ports
To see all configured network interfaces and listening ports:
SELECT * FROM interface_addresses;
SELECT * FROM listening_ports;


These help identify open or bound services. Unexpected listening ports (especially high-numbered ones) might suggest unauthorized services.
Checking DNS Configuration
Sometimes persistence or redirection happens through DNS configuration changes. To identify this:
SELECT * FROM dns_resolvers;

While this table is limited, it can still reveal modified resolvers or unexpected DNS entries — both common in redirected traffic setups.
Disk and File Artifacts
Once processes and network activity are mapped, the next step is looking at what’s happening on disk. Even with stealthy attacks, there’s usually something left behind — temporary files, hidden binaries, or modified configurations.
Listing Open Files
This shows files currently open by active processes.
SELECT pid, fd, path FROM process_open_files;

File accessed from temp directory
To focus on temporary or suspicious areas:
SELECT pid, fd, path FROM process_open_files WHERE path LIKE '/tmp/%';
If you find something odd (e.g. /tmp/.hidden), use its PID to check which process owns it:
SELECT pid, name, path FROM processes WHERE pid = '556';Finding Hidden Files
Attackers commonly hide files by prefixing them with a dot (.). This will examine the root directory.
SELECT filename, path, directory, size, type FROM file WHERE path LIKE '/.%';
If you find executables under /, /etc/, or /usr/bin/ that start with a dot, they’re almost never legitimate.
Note: Extend this query to include /tmp/, /opt/, or /var/log/ depending on what you’re investigating.
Recently Modified Files
Recently changed files can point to when the compromise occurred.
SELECT filename, path, directory, type, size
FROM file
WHERE path LIKE '/etc/%' AND (mtime > (strftime('%s','now') - 86400));
This looks for files modified in the last 24 hours under /etc. This is to to check for edited configuration files, modified SSH keys, or new cron jobs.
Recently Modified Binaries
Attackers sometimes tamper with legitimate binaries to add malicious code.To find binaries changed in the last 24 hours:
SELECT filename, path, directory, mtime
FROM file
WHERE (path LIKE '/opt/%' OR path LIKE '/bin/%')
AND (mtime > (strftime('%s','now') - 86400));Even if this query comes back empty, it’s good for baseline validation during early triage.
Checking Suspicious Packages
In Linux environments, attackers may install packages to maintain access or hide actions. We can identify recently installed or modified packages with:
SELECT name, version, source
FROM deb_packages
WHERE install_time > (strftime('%s','now') - 86400);This highlights any packages added within the last day — a quick way to detect persistence through software additions.
Closing Thoughts
After spending time using OSQuery, it’s clear that it’s more than just a collection tool — it’s a solid framework for endpoint visibility. It fits naturally into forensic workflows because it’s fast, structured, and doesn’t interfere with the system.




Comments