Splunking questions from BOTS v3 dataset – Q215

The idea is pretty simple: let’s pick a few questions from the SOC BOTS v3 dataset and try to find the answers using plain SPL. To make it easier to read, one question at a time.

If you are not familiar with the BOTS game, go ahead and click here before proceeding, and make sure you sign up for the next challenge! The featured header image shows how they carefully treat the winner (Photo credit: Jesse Trucks).

Did I check the answers before engaging in the hunt? Of course not! 🤞 That’s part of the fun. But to be fair, whenever I’m not able to find the answer, I will point that out. Besides sharing some SPL-foo, hopefully this gives Dave and Ryan more inspiration to keep up with their challenging game!

For all queries, the following prefix is used so that the whole dataset is considered, unless I narrow it down by sourcetype, which is IMO the recommended minimal constraint for a query.

index=botsv3 sourcetype=* earliest=0

The instructions to make the dataset ready in your lab are available from GitHub here.

First question, please!

What is the FQDN of the endpoint that is running a different Windows operating system edition than the others?
BOTSv3 question #215, 500 points

Just to give you an idea about the weight of each question, there are questions ranging from 100 to 1000 points in the dataset. Once a wrong answer is provided, the team score is decreased (penalty). So planning the strategy is part of the fun given that time is also limited.

I carefully selected this one to demonstrate a few ‘base query’ tricks that are not well known to some users. Without any previous context, one can assume “windows” and perhaps “operating system version” are strings found in the raw events. How would that translate to SPL for a beginner?

As you can see, that query alone yields more than 3000 events (out of ~2 million) while returning only one sourcetype (Sysmon) out of 100+. That makes it a good small sample to start with (remember the beginner mindset here), but it’s still too many events to sift through.

For experienced Sysmon data analysts that’s an easy one: is the OS version available from the Sysmon events? In this case, we need to check which Event IDs (field EventCode) are available from the resultset:
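A quick way to get that breakdown is a stats count by EventCode. The sketch below is only my assumption of what such a check could look like: the loose search terms are stand-ins (not necessarily the ones behind the numbers quoted above), and the wildcard on the sourcetype name is a guess, so adjust it to whatever the Sysmon sourcetype is called in your lab.

```spl
index=botsv3 sourcetype=*sysmon* earliest=0 windows operating system version
| stats count by sourcetype, EventCode
| sort - count
```

Keep in mind that stats drops events that have no EventCode field, so the counts only cover events where that field is extracted.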

That means only Event ID 1 (Process creation) is returned (more here). After zooming in on one event, that’s easy to confirm: there is no OS version in there (Product only hints that it’s a Windows OS):

By the way, if you want to quickly check or manipulate all field names available from a query via SPL, here’s a command that comes in handy in many other situations (guaranteed!): fieldsummary:

Simply add | fieldsummary to the end of a query to get this quick analysis (and all possible extracted field names)
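Here is a minimal sketch of how that could look against the Sysmon process creation events. The sourcetype wildcard, the head 1000 sampling and the maxvals=5 setting are my own choices for illustration, not something taken from the screenshots.

```spl
index=botsv3 sourcetype=*sysmon* earliest=0 EventCode=1
| head 1000
| fieldsummary maxvals=5
| fields field, count, distinct_count, values
| sort - count
```

Each result row describes one extracted field: how many of the sampled events carry it, how many distinct values it has, and a few sample values.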

The discovery query

So what now? Let’s widen the scope for more possibilities and break it down by sourcetype while returning the latest matching _raw event (actual log) for each sourcetype (that would be my first shot!).

WinHostMon sourcetype provides 204 events, and those hold the OS version

That seems promising! What happens there?

In one screen, you get a good glimpse of what the data is about, while spending only a few secs (~2s on my machine).
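For reference, a sketch of such a discovery query follows. The first line is the base query dissected in the next sections; the stats line is my assumption of one way to get a count plus the latest raw event per sourcetype, and the sample_event field name is made up for this example.

```spl
index=botsv3 sourcetype=* earliest=0 CASE(Windows) (os OR operating) (ver OR version)
| stats count latest(_raw) AS sample_event by sourcetype
| sort - count
```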

Logical operators

Splunk implicitly adds an AND between search parameters. That is, the following query:

index=A sourcetype=B

Is the same as:

index=A AND sourcetype=B

Therefore, the following SPL from the base query (first line) reads as: “Splunk, from all sourcetypes in the botsv3 index, give me the events containing the string ‘Windows’, AND any of the strings ‘os’ OR ‘operating’, AND any of the strings ‘ver’ OR ‘version’.”

index=botsv3 sourcetype=* earliest=0 CASE(Windows) (os OR operating) (ver OR version)

Time Boundaries

The key-value pair earliest=0 sets the start of the search time frame to epoch time 0 (January 1, 1970, UTC). That’s a quick way to scan for all events matching the query constraints, regardless of the event’s _time value.

The end (latest) is also implicit and defaults to “now”. Using earliest=0 is a common practice when dealing with simulation events or datasets for which we don’t know the exact time boundaries upfront. Otherwise, the Time Picker is the way to go.
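If you would rather know the actual boundaries of the dataset before picking a time range, something like the following should do. This is just a sketch under my own assumptions (the first_event and last_event names are made up); it relies on tstats, which reads indexed metadata instead of raw events and is therefore usually quick.

```spl
| tstats min(_time) AS first_event max(_time) AS last_event WHERE index=botsv3 earliest=0
| eval first_event=strftime(first_event, "%Y-%m-%d %H:%M:%S"), last_event=strftime(last_event, "%Y-%m-%d %H:%M:%S")
```

With those two timestamps in hand, you can set earliest and latest explicitly (or use the Time Picker) instead of scanning all time.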

Case Sensitivity

By default, the base query is NOT case sensitive for values (think field=value). Field names, on the other hand, are case sensitive.

That means, the following query is still going to provide hits:

index=boTsV3

While the following is not:

iNdex=boTsV3

The same applies to single/loose strings or terms, which are treated as values (e.g., os, operating).

To speed things up, I simply guessed the OS name would be capitalized. For instance, all AV-based events (endpoint telemetry) I’ve seen so far behave like that (capitalized Windows string).

Splunk provides the CASE() directive to allow case-sensitive matching of terms and field values. Therefore, only events containing Windows with a capital ‘W’ are returned by the query above.
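To see the difference for yourself, compare the two searches below. They are a sketch of mine, not taken from the original screenshots. The first one matches the term in any casing (windows, Windows, WINDOWS):

```spl
index=botsv3 sourcetype=* earliest=0 windows
| stats count by sourcetype
```

The second one only matches the exact casing given inside CASE(), so I would expect equal or lower counts per sourcetype:

```spl
index=botsv3 sourcetype=* earliest=0 CASE(Windows)
| stats count by sourcetype
```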

But that’s not the best part of it.

If you are not familiar with Bloom Filters, please do yourself a favor and read up on them, then make sure you leverage them whenever possible, especially for bucket-heavy queries. They are one of the main search engine features contributing to faster searches against raw events. The following part of the query definitely contributes to a faster search:

(os OR operating) (ver OR version)

Speaking of blooming, life is not a bed of roses. Knowing when to use (loose) search terms in combination with regular field=value deserves its own post, read more about it here.

Finally, the answer!

The previous query results suggest we check for all possible OS and Version tuples available per host, from WinHostMon sourcetype. That translates into a more constrained query:

Bingo! OS and Version were easily guessed as extracted fields from the raw event

So “BSTOLL-L” seems like our candidate here! It’s the only host running a different OS among all hosts.
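A sketch of what that constrained query could look like is shown below. OS and Version are the extracted fields mentioned above; the stats layout and the hosts and host_list names are my own assumptions, chosen so that the odd edition floats to the top.

```spl
index=botsv3 sourcetype=WinHostMon earliest=0 OS=*
| stats dc(host) AS hosts values(host) AS host_list by OS, Version
| sort hosts
```

The row with a single host is the one running a different edition than the rest.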

Wait, the question asks for a FQDN!

The question asks for the FQDN value, that is, the Fully Qualified Domain Name. How to guess that one?

Again, leveraging a bit of experience here. Remember, even if that would take more time, the team must hunt multiple answers in parallel, so you may leave this query running while skipping to the next.

My next query looks like this:

Less than a second for that one to return!

Now, check the contents of the ComputerName field in that event.

I’m basically betting on finding the answer on the first event returned from that query: | head 1. Since a short name has no dots in it, as soon as we spot an event containing a string like BSTOLL-L (dot) something, we know it’s very likely the FQDN or long name attributed to that host.

The eval match() function is perhaps one of the most useful for string matching (hence the name?). This is particularly important for writing reliable correlation searches and such.
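Putting those pieces together, a query along these lines should do the trick. Treat it as a sketch under my assumptions rather than the exact search from the screenshot: the loose BSTOLL-L term narrows the events down first, and the regular expression keeps only ComputerName values that start with the short name followed by a dot.

```spl
index=botsv3 sourcetype=* earliest=0 BSTOLL-L ComputerName=*
| where match(ComputerName, "(?i)^bstoll-l\.")
| head 1
| table sourcetype, ComputerName
```

The (?i) flag makes the regular expression case insensitive, so both BSTOLL-L and bstoll-l prefixes are accepted.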

So that makes it a good guess, no? After checking the answers doc, we can confirm the answer is indeed “bstoll-l.froth.ly”, which corresponds to the FQDN of host BSTOLL-L!

Hope this is useful for beginners or BOTS newcomers while somehow entertaining for long-time Splunkers. Until next question!

Opstune.com