MustLearnKQL Book
Query Language (KQL). If you’d like the 90-second post-commercial recap that seems to
be a standard part of every TV show these days…
The full series index (including code and queries) is located here:
https://aka.ms/MustLearnKQL
This book is updated every time a new part of this series is posted. The most current
edition of this book will always be located at: https://cda.ms/3m1
Reference
Tools
Video
Must Learn KQL Part 4: Search for Fun and Profit
Charts
EXTRA
Format query
Settings
In-UI Reference
Tabs
Where Operator
Hearing that our customers’ largest barrier to using things like Defender,
Microsoft Sentinel and even reporting for Intune is KQL, the query language,
was a wake-up call for me. And, of course, (if you know me) I want to do something
about it. KQL is a beautifully simple query language to learn. And, believe me – if I
can learn it, there’s no question that you can learn it. I feel bad that there’s just not
enough knowledge around it because I’ve taken for granted that everyone already
had the proper resources to become proficient. But, that’s not the case.
Internally, plans are being developed now to make KQL learning a bigger focus and
you’ll see new education around this query language start to take shape in various
areas on the Microsoft properties and elsewhere. So, that’s good news for
everyone.
There are bits and pieces already scattered about the Internet, but they are seemingly
difficult to identify and locate.
So, as a first step in a series that I’ll be writing called “Must Learn KQL”, I want to
supply some good resources that can be used to accomplish the other things I’ll talk
about going forward. Some of these I use every day. Some I use only when the need
arises, but they’re valuable nonetheless. This is a working document, so expect
updates over time. This is not a definitive list by any means, so if you have other
resources not listed here that you find valuable and believe others would benefit,
let me know and I’ll add them in.
Stay tuned as I map out this series. Of course, since my area of forte at Microsoft is
security, the series will be security focused. So, the knowledge you gain will help
you with our security platforms but also anything data centric that utilizes KQL.
One last tidbit of a tip… I use Microsoft Edge’s Collections feature quite a bit. This is
an extremely useful tool for capturing and grouping topics. If you find any of the
links below valuable, I suggest using Edge Collections so you can always come back
to them later.
Reference
The code repository for this series (GitHub)
Kusto Query Language Reference Guide
Practice Environments
Write your first query with Kusto Query Language (Learn module)
Data Explorer – not security focused. Contains things like geographical data and
weather patterns. Exercises for this can be found in the Learn Azure Sentinel book
below.
Actual Books
Learn Azure Sentinel: Integrate Azure security with artificial intelligence to build
secure cloud systems – this book uses Data Explorer (see above) for hands-on
exercises.
Azure Sentinel in Action: Architect, design, implement, and operate Azure Sentinel
as the core of your security solutions – this book is the next edition of the one just
above and also uses Data Explorer for hands-on examples.
Tools
Kusto.Explorer – a rich desktop application that enables you to explore your data
using the Kusto Query Language in an easy-to-use user interface.
Kusto CLI – a command-line utility that is used to send requests to Kusto, and
display the results.
getschema operator – As I noted in Part 5 of this series: this is the Rosetta stone of
KQL operators. When used, getschema displays the Column Name, Column Ordinal,
Data Type, and Column Type for a table. This is important information for filtering
data. Part 5 talks about this.
Kusto King
Video
TeachJing’s KQL Tutorial Series
Azure Sentinel webinar: KQL part 1 of 3 – Learn the KQL you need for Azure
Sentinel
Azure Sentinel webinar: KQL part 3 of 3 – Optimizing Azure Sentinel KQL queries
performance
Matt Zorich’s (the originator of the #365daysofkql Twitter hashtag) KQL queries
To start the journey learning KQL in this Must Learn KQL series, it’s helpful to
understand where the name KQL came from and why the reference makes so
much sense. Once you understand the idea behind the query language, a lightbulb
should go off and prepare you for the rest of the series through an expanded scope
of learning capability.
Plus, not everyone knows about this, so you’ll be the cool kid. And, if you ever
play Trivial Pursuit and this question comes up, you’ll win the pie piece and possibly
the entire game. How can that not be good knowledge?
The question?
Growing up, my family was one-of-those-families that attended church anytime the
church doors were open. As such, the majority of my parents’ friends were at
church. This meant that they would spend time before and after church services
catching up with their friends, sometimes in a local restaurant where they’d all
gather to have pie and coffee. Of course, Facebook didn’t exist then, so in-person
connections were even more important. Well…and there was pie. My mom, in
particular, wanted to catch up with everyone she hadn’t seen in a few days so this
meant that our round-trip from home to church and back could take 3-4 hours.
On Sunday nights this was particularly problematic for me in that I wanted to rush
home to catch TV shows like the Six Million Dollar Man, The Magical World of
Disney, Mutual of Omaha’s Wild Kingdom, and the TV show that’s the topic of our
discussion here: The Undersea World of Jacques Cousteau…
That’s right. KQL is named after the undersea pioneer, Jacques Cousteau.
So, as you can imagine, I tried my dead-level best every Sunday night to rush my
mom along. It didn’t always work and was mostly just annoying, and you can bet I
caught a few groundings from my insistence. But, still, this topic of discovering the
undiscoverable drove me to concoct every type of machination imaginable to get
home sooner on Sunday nights. I can’t tell you the number of times I faked illness
on Sunday afternoon in an attempt to stay home Sunday night. And, as you can
imagine, my mom quickly caught on and instituted a policy that if I stayed home on
Sunday nights, I couldn’t go to school on Monday. Which…at the time…I truly loved
school, so that halted that plan. Give me a few years, and that wouldn’t have
worked. Timing is everything.
So, KQL is named after Jacques Cousteau. Even today, you can find evidence of this
in our own Azure Monitor Docs. If you go to the datatable operator page right now,
you’ll still find a reference to Mr. Cousteau in an example that lists his date of birth,
the date he entered the naval academy, when he published his first book entitled
“The Silent World: A Story of Undersea Discovery and Adventure,” and the date of
his passing.
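For a feel of what such an example looks like, here is a minimal sketch of a datatable built around those same milestones. The column names (Date, Event) and the January 1 placeholder days for the two year-only events are assumptions for illustration; the exact rows on the docs page may differ.

```kql
// Inline table of milestones in Jacques Cousteau's life.
// Column names are illustrative assumptions.
// The 1930 and 1953 rows are known by year only, so January 1 is a placeholder day.
datatable (Date: datetime, Event: string)
[
    datetime(1910-06-11), "Born",
    datetime(1930-01-01), "Enters Ecole Navale",
    datetime(1953-01-01), "Published first book",
    datetime(1997-06-25), "Died"
]
| order by Date asc
```

Because datatable defines its rows inline, a query like this runs without touching any stored table, which makes it handy for practice.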
So, I hope you’re catching on to this. If not, what is it that we are trying to
accomplish when we query data tables for security purposes? What is it that we’re
trying to accomplish through Hunting exercises and operations?
The answer? We are exploring the depths of our data. We are attempting
to surface the critical and necessary security information that will tell us about
potential exposure through simple, powerful queries.
Much like the story of the failed voyage of the Titanic. It wasn’t the beautiful,
pristine, easy-to-see and avoid iceberg mass that existed above the surface of the
ocean that sank the unsinkable ship and sent over 1,500 people to their graves. No,
it was the huge mass under the surface that the captain and crew couldn’t see and
couldn’t swerve to avoid that doomed the luxury passenger liner.
And, like that, it’s the information that exists underneath the viewable rows and
columns of data in our tables that we need to expose to identify threats and
compromise and use to guard the gates. Just the initial rows and columns of
exposed data isn’t enough. We must delve into the depths of the data to find
actionable information. And we need to do it quickly.
As a security person, you know that if a threat exists in the environment, you are on
the clock to discover it, report it, investigate it, and remediate. A poorly performing
query language can be the biggest barrier to that and become a security flaw. I’ve
sat with customers who use other query languages and other SIEM-like tools that
thought it was status quo that query results would take hours or sometimes days.
When I showed that KQL produced those same results in seconds, they were
astonished. So, the technology and infrastructure behind the query language is also
critically important.
In the next post, I’ll talk about the actual structure of a query. Even though the
structure can deviate, understanding a common workflow of a KQL query can have
powerful results and help you develop the logic needed to build your own
workflows when it’s time to create your own queries. In addition to being well-
performing to enhance efficiency, the query language itself is simple to use and
learn which, in turn, makes for more efficiency.
So, while we’re Just Above Sea Level in this post (I hope you now appreciate the
reference), we’ll be using KQL as the sonar and diving bell to search the depths of
our data.
I tell customers all the time that it’s not necessary to be a pro at creating KQL
queries. It’s OK not to be a pro on day 1 and still be able to use tools like Microsoft
Sentinel to monitor security for the environment. If you understand the workflow of
the query and can comprehend it line-by-line, you’ll be fine. Because ultimately, the
query is unimportant. Seriously. What’s important for our efforts as security folks is
the results of the query. The results contain the critical information we need to
understand if a threat exists and then – if it does exist – how that threat occurred
from compromise to impact.
Now, those that go on to develop their own queries and own Sentinel Analytics
Rules after becoming a KQL pro will be much more capable. And that should be
your goal, too. BUT don’t get hung up on that. Again, it’s about the results.
We’ve made it so crazily easy to share KQL queries that it’s quite possible you may
never have to create your own KQL query (aside: I highly doubt it but COULD BE
possible).
In a future post in this series, I’ll go over the actual interface you use to write and
run the KQL queries in-depth but suffice to say that almost every service in Azure
has a Logs blade (option in the Azure portal interface/menu) to accommodate
querying that service’s logs. This area lets you save your queries and also
share them.
Share your queries
Because of this built-in capability, many of our customers regularly share their
creations with each other, other colleagues, to their own blogs and GitHub repos,
and even to the official Microsoft Sentinel GitHub repository
(https://aka.ms/ASGitHub). In Part 1 of this series, I supplied links to these and
more. So, to prove my point…yes, it’s absolutely possible you might not have to
write your own KQL query for a long time.
So, because of that, it becomes even more critical that you at least understand the
workflow. Again, if you can read a query line-by-line and determine that the results
will produce what you are looking for, you’re golden. If, through your newfound
understanding, you find the query can’t meet your requirements, you can modify it line
by line instead of rewriting it wholesale. This should be your first KQL goal: read
queries.
Through this series, I’ll provide queries for you to use and get hands-on experience
because I believe in learning by doing. We’ll be using the links in the Practice
Environments section in Part 1 for the hands-on. But focus initially more on the
structure and logical workflow.
P.S. I’ve enabled image linking in this post so you can click or tap to open the image in a larger view. So,
you can open the image in a new window or new tab to better follow along.
1. The first step is to identify the table we want to query against. This table
will contain the information that we’re looking for. In our example here,
we’re querying the SecurityEvent table. The SecurityEvent table contains
security events collected from Windows machines by Microsoft Defender
for Cloud or Microsoft Sentinel. For a full list of all service tables, see
the Azure Monitor Logs table reference (also available in Part 1).
2. The pipe (|) character (the shifted key above the Enter key on most
keyboards) is used to separate commands issued to the query engine.
You can see here that each command is on its own line. It doesn’t have
to be this way. A KQL query can be all one single line. For our efforts, and
as a recommendation, I prefer each command on its own line. For me,
it’s just neater and more organized which makes it easier to troubleshoot
when a query fails or when I need to adjust the query to produce
different results.
3. Next, we want to filter the data in some way. If I simply entered the table
and ran that as its own, single query, it would run just fine. Doing that
returns all rows and columns (up to a limit – which I believe is now
50,000 rows) of the data stored in the table. But our goal is to get exact
data back. As an analyst looking for threats, we don’t want to have to sift
through 50,000 rows of data. No, we want to look for specific things.
The Where operator is one of the best ways to accomplish this. You can
see here in the example that I’m filtering first by when the occurrence
happened (TimeGenerated) and then (remember the pipe character
– another line, another command) by a common Windows Event ID (4624 –
successful login).
4. The next step in our workflow is to provide data aggregation. What do
we want to do with this filtered data? In our case in the example, we
want to create a count of the Accounts (usernames) that produced a
successful login (EventID 4624) within the filtered time window (TimeGenerated).
5. Next let’s tell the query engine how we want to order the results. Using
the Order operator, I’m telling the query engine that when the results are
displayed, I want it shown in alphabetical order by the Account column.
The ‘asc’ in the query in the Order Data step is what produces this
ordering. If we wanted descending order we’d use ‘desc’. Don’t worry,
we’ll dig deeper into each of these operators as we go along in the
series.
6. Generally, the last thing that I’ll do with this search query is tell the query
engine exactly what data I want displayed. The Project operator is a
powerful command. We’ll dig deeper into this operator later in this
series, but for our step here, I’m telling the query engine that after all my
filtering, data aggregation, and ordering, I only want to display two
columns in my results: Account and SuccessfulLogons.
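The six steps above can be sketched as the query below. Treat it as a reconstruction from the step descriptions rather than the exact text from the repository: the one-hour window and the SuccessfulLogons column name are assumptions, and count_ is the default name KQL gives the output of count().

```kql
SecurityEvent                                   // Step 1: the table to query
| where TimeGenerated > ago(1h)                 // Step 3: time filter (window length is an assumption)
| where EventID == 4624                         // Step 3: successful login events only
| summarize count() by Account                  // Step 4: count the matching events per Account
| order by Account asc                          // Step 5: alphabetical order by Account
| project Account, SuccessfulLogons = count_    // Step 6: show two columns, renaming count_
```

Each pipe (step 2) hands the output of one line to the next, so the query reads top to bottom in the same order as the steps.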
It searched our stored security events in the SecurityEvent table for all Accounts that
had a successful login in the last hour and chose to display only the Account and
number of successful logins per Account in alphabetical order.
See that? The Account column is in alphabetical order ascending and the SuccessfulLogons column
shows how many times each Account successfully logged in.
If you need to, jump back through each step above until you get a good
understanding of the workflow. Again, this is very common, and you’ll see this
structure many times working with Microsoft Sentinel and Defender products.
Remember, it’s about the results. If you can look at this example and get a good feel
that you understand how the results were accomplished, line-by-line, you’re on
your way.
I invite you, though, to take this example and copy/paste it into a Logs environment
to test. You can play with this query in your own Microsoft Sentinel
environment, or using the KQL Playground I provided as a resource in Part 1.
SecurityEvent
This query is also available from the GitHub repository for this blog
series: https://cda.ms/3fS
I’d like to share one extra tidbit with you that you might find helpful as you start
testing this KQL query example in your own, or our, environment.
Every language (scripting, coding, querying) has the capability to add comments or
comment-out code through special characters. When the query, scripting, or
development engine locates these characters, it skips everything after them on that
line. KQL has this same type of character. The character for KQL is the double forward slash, or //.
When you start testing this post’s KQL query example, comment-out a line or two
(put the double forward slash at the beginning of the line) and rerun the query just
to see how eliminating a single line can alter the results. You’ll find that this is an
important technique as you start developing your own KQL queries. I’ll talk about
this more later, too.
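As a minimal sketch of that technique, the query below comments out the Event ID filter. The surrounding lines are an assumed version of this post’s example, so adjust them to match the query you are actually testing.

```kql
SecurityEvent
| where TimeGenerated > ago(1h)
// | where EventID == 4624       // commented out: the engine ignores this whole line
| summarize count() by Account   // now counts every event type per Account, not just logins
| order by Account asc
```

With the filter disabled, the counts per Account should rise because every event in the window is included. Remove the two slashes to restore the original behavior.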
In the next post (Part 4) I’ll talk through another, yet just as powerful, way to search
for information using KQL that is a top pocket tool for Threat Hunters.
And, then I’ll come back for Part 5 and show how to tie together both search
methods to create the full operation of hunting to Analytics Rule. But don’t worry,
that’s not the end. I have no clue how many parts this series will be. A lot of it depends
on you.
Now that we have some understanding of the workflow (from Part 3) under our
belts, I’m going to deviate from that for a brief minute in this post. I’ll bring
it back together in Part 5 and combine Parts 3 and 4 to provide something extra
meaningful that shows you how it all fits together like an unsolved Hardy Boys mystery
novel. Hopefully, you’re starting to see that my efforts here are logical and designed
to build up the knowledge that is necessary to move to the next plane of
understanding.
What I want to do in this post, is give you something you can use today. When I’m
done here, you should be able to take the knowledge and the query snippets to do
your own hunting – or, rather, look inside your own environment to get an
understanding of what is happening that’s worth exposing and investigating.
One of the easiest ways to get started with KQL is the search operator. In Part 3, I
talked through the structure and workflow of a search query. In this post, I’ll talk
about the search operator (or command) and how it could be the most powerful
KQL operator in the universe but will always be the best tool in the toolbelt to start
any search operation.
Search is the first operator I reach for when trying to verify if something exists
within the environment. In fact, our whole goal for using KQL as a security tool is to
answer the following questions:
1. Does it exist?
2. Where does it exist?
3. Why does it exist?
4. BONUS: There’s a final question to this that’s not part of this KQL series, but
one that’s important to the total equation and one that should be part of
your SOC processes. That question is: How do we respond?
If you click or tap the image to open it in a larger view, you’ll see how the power of
the search operator enables you to answer these questions.
It starts with an idea or theory that “something” exists in the environment. You
may have gotten this idea from a dream or nightmare that someone in your
organization is performing nefarious activities. But, most likely, the idea came from
a news report or a post on social media from a trusted source about a nation-state
actor being active with a new kind of ransomware.
Once these reports are available, someone (like Microsoft) will supply the Indicators
of Compromise (IOCs) so you can search your environment to see if they exist. IOCs
could be a number of things including filenames, file hashes, IP addresses, domain
names, and more.
If they don’t exist, you move on. If any of them do exist, you start to dig deeper to
figure out where they exist, so you can, for example, quarantine systems or users,
or block IP addresses or domains.
And, then you need to determine why they exist. Did a specific user click on
something they shouldn’t have clicked on in an email? Or did a threat actor
successfully compromise a Domain Controller through control over a service or
an elevated user account? Could it be that there is more impact on your environment
than you originally thought?
All of this can be exposed through the simple process of search using the search
operator.
Let’s walk through this together with a few simple queries that you can take and
use to test your own environment. (click or tap the image to open the larger version in
a new browser tab to follow along)
In step 1 in the image, I’m performing a simple search for a username. In this case,
it’s an ego search – I’m searching in my own environment for my own activity. This
could be an IOC that you want to search for. Just replace my name with the string of
text you want to expose in the results.
search "rodtrent"
As you can see in the image, my search produced results, telling me that this thing I
searched for does exist in my environment.
Since it does exist, I want to understand where it exists. I do this by making a simple
adjustment to my original query by adding a line that tells the query engine to just
show me the specific tables that my IOC exists in. This will give me a good indication
of what type of activity it was. Step 2 shows…
search "rodtrent"
| distinct $table
Let’s assume that I’m looking for user activity because the reported threat is
malware. I know that user activity is most generally recorded and contained in a
few places including Microsoft Office and Defender for Endpoint.
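A minimal sketch of narrowing the search to one of those places, assuming OfficeActivity was among the tables returned by the previous step:

```kql
// Limit the search to a single table instead of the whole workspace.
search in (OfficeActivity) "rodtrent"
```

Scoping the search this way keeps the same search term but only scans the named table, so the results are easier to sift through.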
Now that I have my results of rodtrent’s activity in the OfficeActivity table, I can
begin sifting through the rows and columns of data to learn more about the
occurrence and to start to tune my query even more.
When we come back for Part 5, I’ll show you how to turn your search query into a
workflow like I talked about in Part 3.
One last thing for this post. I mentioned that user activity is generally reported from
the Microsoft Office and Defender for Endpoint tables. I’ve given you examples for
searching the OfficeActivity table. But Defender for Endpoint is more than one
table. In fact, Defender for Endpoint consists of the following 10 tables:
DeviceEvents, DeviceFileCertificateInfo, DeviceFileEvents, DeviceImageLoadEvents,
DeviceInfo, DeviceLogonEvents, DeviceNetworkEvents, DeviceNetworkInfo,
DeviceProcessEvents, and DeviceRegistryEvents.
Fortunately, the KQL search operator supports the wildcard character. So, you can
search for those IOCs across the entire Defender for Endpoint solution by doing the
following:
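A minimal sketch of that wildcard search, assuming the Defender for Endpoint tables in your workspace all share the Device prefix listed above:

```kql
// Search every table whose name starts with "Device" for the term.
search in (Device*) "rodtrent"
| distinct $table    // optional: list only the tables that contained a match
```

Drop the last line to see the matching rows themselves rather than just the table names.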
And, incidentally, if you have the Defender for 365 Data Connector enabled for
Microsoft Sentinel and you enable the Microsoft Defender for Office 365 logs, the
OfficeActivity table isn’t the only Microsoft Office data you can query. Enabling
these logs gives you access to EmailEvents, EmailUrlInfo, EmailAttachmentInfo, and
EmailPostDeliveryEvents tables which means you can take advantage of the search
operator’s wildcard capability here, too.
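The same pattern can be sketched for the email tables, assuming they are enabled in your workspace and share the Email prefix named above:

```kql
// Search every table whose name starts with "Email" for the term.
search in (Email*) "rodtrent"
```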
All the query code in this post is contained in the series’ GitHub repo
here: https://cda.ms/3gG
Now that we’ve talked about using the Search operator in Part 4 to answer those
three basic SOC analyst questions of 1) Does it exist?, 2) Where does it exist?, and 3)
Why does it exist?, we can take that learning and the results of that type of query
and meld it with the standard search query structure I talked about in Part 3.
In Part 4, I ended with a query to locate activity by a user called “rodtrent”. I found
that this rodtrent person had performed potentially strange activity in the
OfficeActivity table (the table for Office 365 activity) that needs to be checked out.
As shown, the search operator is a powerful tool to find things of interest. The
results of the search operator query were thousands of rows of data. That’s
inefficient.
So, now that we’ve found something interesting, we want to use the structure of the
Search Query to pare down the results and minimize the effort and workload needed to
determine whether that something interesting is notable and worth investigating.
If you need to, open up Part 3 in a new Window or browser Tab to review the
Search Query Workflow as I walk through the next section.
In the following example, note that this is a non-issue situation, but I want to start
with a basic Search query before we start building toward more complex queries in
future posts, so you get a fully rounded understanding of why we do
this. The one below is even simpler than the one discussed in Part 3 where I also
talk about aggregating and ordering data. I’ll come back to those concepts later,
particularly when I get into creating your own in-query visualizations like pie and
bar charts. No, for our efforts in this post, I want to focus on how easy it is to filter
the data. Again, KQL isn’t hard, and some of your most powerful queries may only
be a few lines of code.
Turning your hunting operations into more formal Search structure queries is the
building block for creating your own Analytics Rules in Microsoft Sentinel. Analytics
Rules should be precise logic that lets your operations focus exactly where they
need to focus, because capturing data outside of what was intended is both
inefficient and problematic for isolating actual security events.
The example (available from the series’ GitHub repo at: https://cda.ms/3jd):
New search query
Let’s break this new Search query down together like was done in Part 3. This one,
again, is even a tad bit simpler than when describing the Search workflow, but as
you’ll see, it’s the where operator that is sometimes our biggest, most powerful, and
best workhorse and pal for tuning efficient results.
1. The first step in our workflow is to query the OfficeActivity table. If you
remember, from our time together in Part 4, we’re looking for user
activity (in our case the user “rodtrent“) in Microsoft Office.
2. As per the discussion in Part 3 on workflow, I want to highlight the
importance of the pipe command once again. I won’t rehash the
importance here. If you missed it, jump to Part 3 to catch up.
3. In step 3 of the new Search query, I’m filtering how the query engine
searches. I’m first telling it to only look at data in the last 24 hours
(TimeGenerated), then only looking through a column called UserId for
the string “rodtrent”, then telling the query engine to only capture
Exchange activity from the RecordType data column, and finally
pinpointing the search to only Send operations. So, essentially, I’m
looking for any emails that rodtrent sent in the last 24 hours.
• Filtering the data is the key to everything. <= Read that again.
Filtering the data that is returned produces exact, actionable
data. It also improves the performance of our queries.
Where the search operator may return thousands of rows of
data in 15 seconds (or less), properly filtering the data
returns only the rows we asked for, which greatly improves the
processing time. Where the search operator may have taken
15 seconds, our new Search structure query will take 5
seconds or less. The Where operator is the key to this
operation. Learn it. Know it. Keep the Where operator
reference page handy: https://cda.ms/3jh.
4. Finally, I’m using the project operator to control exactly what is shown in
the results window. In this case, I only want to show the user, the user’s
IP address, and the server where the email originated from.
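The steps above can be sketched as the query below. It is a reconstruction from the step descriptions, so the specifics are assumptions: the "ExchangeItem" value for RecordType, the use of contains for the user match, and the ClientIP and OriginatingServer column names chosen for the IP address and server. The repository link above has the original.

```kql
OfficeActivity                                   // Step 1: the table to query
| where TimeGenerated > ago(24h)                 // Step 3: only the last 24 hours
| where UserId contains "rodtrent"               // Step 3: only rows where UserId includes the string
| where RecordType == "ExchangeItem"             // Step 3: only Exchange activity (value is an assumption)
| where Operation == "Send"                      // Step 3: only Send operations
| project UserId, ClientIP, OriginatingServer    // Step 4: user, IP address, originating server
```

Each where line narrows the rows handed to the next line, so the most selective result set reaches project at the end.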
The results?
As you can plainly see in the query results, this matches exactly what my query
proposed.
EXTRA: We saw in Part 4 with our Search operator, how results from our queries
are in named rows and columns of data. And, you see here in this post, how I’m
constantly filtering against known column names in the tables. Some might wonder
how I come up with those schema names. Of course, it helps that I work with these
tables constantly, but I do have a couple secrets to share. First off, as noted in Part
1, I use the Azure Monitor Logs table reference quite a bit. However, there’s also the
Rosetta stone of KQL operators: getschema
Running a simple…
OfficeActivity
| getschema
…will produce a list of all the named columns of a specific table. The example above
displays all the named columns of the OfficeActivity table. Each of these columns
can be used in your where operator filtering efforts.
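Because getschema returns an ordinary table of results, its output can itself be filtered. A minimal sketch, assuming you only want to see the columns whose names mention an IP address:

```kql
OfficeActivity
| getschema                            // one row per column: ColumnName, ColumnOrdinal, DataType, ColumnType
| where ColumnName contains "IP"       // keep only columns with "IP" in the name
| project ColumnName, ColumnType       // show just the name and the type
```

This is a quick way to find candidate column names for a where filter without scrolling through the full schema.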
In this post, I’ve given you a simple query to practice with. In Part 6, I’ll come back
and dig into the actual interface for developing your own queries (instead of just
running the ones I’ve given).
I preface this post by saying this: everything discussed in this post about the User
Interface (UI) can be done (and should be done, eventually) in the KQL query itself.
When you’re just starting with KQL, the UI can be a blessing. As you get further in
your learning and comfortability with the query language, it can be a crutch –
particularly when you need to find something quickly because of a perceived
security threat and view it in a way that’s most meaningful. Still, understanding the
UI’s capabilities is important.
In this post, I’ll give you a whirlwind tour of the UI, but again with the assumption
that, eventually, I’ll cover how to accomplish every action it provides using KQL
as we get further and further along in this series.
The Logs blade exists in almost every Azure service, allowing you to query the
activity logs for that service. For our purposes for Microsoft Sentinel, since all of
those services’ (and more) logs are consolidated in the Log Analytics workspace for
Microsoft Sentinel, we get to use the UI to query everything. It can be a bit of a
power rush.
For those that already have deep-level experience with the Logs UI in Azure
services, this may not be your favorite part of this series, but you also may learn
something you missed or that’s been updated recently, so make sure not to
overlook anything important. And, please, please, PLEASE – if you’re the expert in
the UI and with KQL, pass this along to someone who needs it.
Like everything in Azure, there’s updates and enhancements constantly, so I’ll try to
keep this part of the series up-to-date continually. My youngest son is the epitome
of FOMO (fear of missing out) and I feel like him sometimes when I’ve been away
from the Azure portal or the Microsoft Sentinel console for even a day. Every day
can be a new adventure. As a customer, you might think, or even become
frustrated that it’s hard to keep up with all the changes going on in the Azure
services and other products. But, believe me, those of us that work at Microsoft are
faced with the exact same scenario and the same difficulties in keeping up-to-date.
So, we can help each other in this respect. See something in this part of the series
that’s slightly off or maybe improved? Or, maybe I’ve chosen not to cover an area or
feature that you need more knowledge about. Let me know and I’ll get it updated
toot sweet.
HANDS-ON: If you’d like to follow along yourself with the UI areas and descriptions
in this post (instead of just reading through them in the text), use the KQL
Playground that is referenced as a Practice Environment in the resources list
of Part 1.
I’ll start this part of the series talking about those areas in the UI that are most
important to our efforts in learning how to manipulate the KQL query data, and
then follow up with the rest of the interface in the Extras section below, so you get
the full intimate affair. And don’t forget to come back for Part 7 for the Schema
Talk (see the TOC) where I’ll finish up covering the UI with those areas of the UI that
pertain to working with the tables.
To focus on a specific column, select the Filter icon, then select values to adjust the
results display.
The example query in the above and following images is located here: https://cda.ms/3mD
Sorting results
To sort the results by a specific column, such as timestamp, click the column title.
One click sorts in ascending order while a second click will sort in descending. An
arrow will display in the column next to the column title to show which direction the
results are sorted.
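As a rough KQL equivalent of clicking a column title, here's a sketch that uses the sort operator. It assumes the OfficeActivity table and its TimeGenerated column; swap in whatever table and column you're working with.

```kql
// Newest records first (like the second click on the column title)
OfficeActivity
| sort by TimeGenerated desc

// Oldest records first (like the first click on the column title)
OfficeActivity
| sort by TimeGenerated asc
```

Run each query on its own, since the blank line separates them in the query window.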
Grouping results
To group the results, first toggle the Group Columns option, then simply click and hold and drag the
column header above the other columns.
Selecting columns to display
To add or remove a displayed column, select the Columns button.
You will notice when you work with this UI feature, there’s a number of columns
that are omitted from the results display. There’s some intelligence built in that
looks at the table data and only shows results that it deems pertinent to the
operation – in our case, that operation is security monitoring. It also locates
columns that contain no data and omits these from the display. All of these
measures are intended functions to help build efficiency and eliminate unnecessary
data, but also to improve query results performance. But, using this feature (and
actual KQL operators like project we’ll talk about later on), you can use the UI to pick
and choose what to review.
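To give a feel for the project operator mentioned above, here's a minimal sketch. The table and the four columns are my own picks for illustration, not a recommended set.

```kql
SecurityEvent
| where TimeGenerated > ago(1h)                      // keep the sample small
| project TimeGenerated, Computer, Account, EventID  // show only these columns, in this order
```

Unlike the Columns button, the column choice here travels with the query, so anyone you share it with sees the same view.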
Charts
To add a chart as a visual format you can select the CHART option just above the
results window at the bottom of the UI. On the right-hand side you have many
options for manipulating the visual aspect of the data.
The example query in the image above is available from here: https://cda.ms/3mF
Note that charting is dependent on tabular data. I’ll talk about this when we get to
the summarize, render, and bin operators in this series. (See the TOC)
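As a preview of how those three operators fit together, here's a hedged sketch. The table, the one-day window, and the one-hour bucket size are assumptions chosen only to produce something chartable.

```kql
SecurityEvent
| where TimeGenerated > ago(1d)                               // look at the last day
| summarize EventCount = count() by bin(TimeGenerated, 1h)    // one row per hour
| render timechart                                            // draw the rows as a time chart
```

The summarize line is what turns raw events into the tabular shape that charting depends on.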
EXTRA
In the previous section, I’ve discussed those areas in the UI that are going to help
you manipulate the results. Again, while those are important areas, I’ll show how to
accomplish each of those using actual KQL query operators, so you don’t have to
rely on the UI.
You might notice I didn’t spend any time talking about the Tables, Queries, and
Functions areas in the Logs blade. I’ll actually come back to those in Part 7 when I
talk about the schema. (See the TOC)
But before closing out this part of the series, I do want to also highlight some other
cool areas of the UI that you might enjoy and have fun with.
You can save your queries to Query Packs and then look them up and use them later. For more
information on Query Packs, see: Query packs in Azure Monitor Logs and How to Save an Azure
Sentinel Query to a Custom Query Pack
Share Queries!
Sharing your fabulous query creations is an important capability for a number of reasons and not
just for an ego boost or pat on the back when bragging to friends and colleagues.
There are four sharing options:
1. Copy link to query: Since the Azure portal and Microsoft Sentinel console is web-
based, you can share the direct URL to the query you created by pasting it somewhere
(email, Teams chat or channel, etc.). When you share the link and someone with proper
access clicks on it, they are taken directly to the Logs blade and the query is run, so they
can review the same results. This is an awesome team activity where you can get an
extra set of eyeballs on a potential situation.
2. Copy query text: This function just copies the query itself so you can send that
somewhere (to a team member, to a GitHub repo, etc.)
3. Copy results: Right now, this function literally does the exact same thing as the Copy
link to query option. So, we’ll put a pin here for when this changes in the future.
4. Share to community: This option is super-fantastic! By utilizing this sharing feature,
the query you’ve created is copied and placed into an email template that is addressed
to the Azure Monitor team at Microsoft. By submitting this after entering the requested
information in the email template fields, your creation will be vetted and published to
the GitHub repository for the Azure Monitor community! Imagine your name in lights,
idolized for your contributions that helped solve the latest global security threat!
And, by the way, you can also submit your KQL creations to the official GitHub
repository for Microsoft Sentinel. See Add in your new or updated contributions
to GitHub for steps on how to accomplish that.
Format query
A super-cool, super-useful tool is the Format button in the UI. This button takes a
badly formatted query and reformats it so it a) works, or b) is in a more uniform,
more readable format.
As I noted in Part 3 about Workflow, because of the power of the pipe (|) command
separator, a KQL query can be a single line of code. But that’s a bit useless if you
want to be able to determine what the query’s intent is or need to debug it. This
option turns it into a better format.
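To illustrate the difference, here's the same query written both ways. The query itself is an assumed example, and the second layout is only roughly what you can expect from the Format button, not its exact output.

```kql
// Single-line form: valid, but hard to read or debug
SecurityEvent | where TimeGenerated > ago(1h) | where EventID == 4624 | take 10

// One operator per line: same query, easier to follow
SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4624
| take 10
```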
Queries Galore
In addition to all the awesome KQL query goodness available from all over the
Internet, there’s a slew of example KQL queries available to access in the Logs blade
itself. Just tap or click the Queries button to gain access.
Exporting Queries
The Export option in the UI gives you the ability to export the query results in a
number of ways.
You can export all data to a csv, export only the data in the displayed results,
generate an M query for use in creating a Power BI dashboard, and export and
open immediately in Microsoft Excel.
You can create rules for either Azure Monitor or Microsoft Sentinel directly from the
Logs UI. This is an awesome feature that allows you to create and tune your query
until it’s perfect and then begin the steps to turn it into a rule to automatically
analyze security for your environment. We’re not quite at that step in this series, so
we’ll come back to this feature in Part 21. (See the TOC)
Pin to Dashboard
Pin to Dashboard is an interesting feature in that you can take the query results
that are formatted as a chart and pin the visualization directly to the standard
Azure portal dashboard. This dashboard can be your own private collection of
visualizations or a collection that is shared among your teammates or even
supplied so your manager has purview into operations.
Settings
I’m not going to dig into each option, but the Settings icon contains configuration
adjustments including things like how double-clicking works, if you want to see
tables that contain no data, how many rows per page should display by default in
the results window, and other things.
In-UI Reference
Lastly, to round out this intimate review of the Logs UI, there’s a very good, very
solid collection of references built into the UI. Some of those I’ve already supplied
as references in Part 1, but, like everything in Azure, this is also updated continually.
So, keep an eye out here for updates.
Tabs
To help keep you organized, much like a web browser the UI also supports tabs.
Tabbing
If you’re a die-hard keyboarding fan like myself, rest easy knowing that you can help
speed up your query development using a couple key combinations. It’s also for us
lazy people who can’t suffer the time to lift our hands from the keyboard to locate
the mouse and click on one of its buttons.
Keyboard shortcuts
Shift + Enter causes the query to run. Ctrl + Enter starts a new command line,
complete with the command (pipe (|)) character.
Much like how addressing an email works, the Logs UI will try everything it can to
use autocomplete to try and figure out what it is you want to accomplish. Just start
typing in the query area and the applicable options will display in a list.
…
Next, in Part 7 (see the TOC), there’s a bit more of the UI to talk about. But that
deserves its own part since we’ll be talking in relation to working with the tables
and the schema.
Must Learn KQL Part 7: Schema Talk
Before jumping directly into talking through some common KQL operators and
providing you example queries for hands-on learning (see the TOC) in the next part
of this series, there’s some lingering discussion from the last post around the UI,
but also how this relates to table schema. I wanted to keep this information
separate from the rest and in its own area because it will help you determine where
things exist in the tables and how to better pinpoint the data. You saw in Part 4 that
it’s easy to find anything in the data. But as you start getting closer and closer to
taking the knowledge to develop your very own Analytics Rules for Microsoft
Sentinel, you want to take the learning from Part 5 and go just a tad bit further. This
is where an understanding of the schema becomes important.
The table schema is important. As with any data storage function or service, data is
collected and stored – most times appropriately – in organized columns. I noted
in Part 5 that the getschema operator for KQL produces the list of all
columns and their types.
Example:
OfficeActivity
| getschema
Sample results:
Results from getschema
As you can see in the results, getschema shows a lot of great information. It shows
the actual column names that are important to know for what types of information
can be found, but also note the DataType and ColumnType results. These tell us
how to query the data – or, rather, the approach we need to take (the type of KQL
operator) to query, extract, and manipulate the data.
Using just the information displayed in the screenshot example, I can see that I can
use Part 5‘s knowledge to show regular Exchange users that sent emails. The
following example shows that.
OfficeActivity
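A minimal sketch of what such a query could look like, assuming the UserType, OfficeWorkload, and Operation columns appear in your getschema results. The values being matched are assumptions too, so confirm them against a data sample first.

```kql
OfficeActivity
| where UserType == "Regular"          // assumed value for regular (non-admin) users
| where OfficeWorkload == "Exchange"   // assumed value for the Exchange workload
| where Operation == "Send"            // assumed value for a sent email
```

Each where line uses a string column, which is why the simple == string predicate works here.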
Note that not everything is as neatly stored and defined as the OfficeActivity table
in the screenshot. I said earlier that most times data is stored neatly and orderly.
There are exceptions and you need to be aware of these. In these cases, you’ll need
to utilize some parsing functions of KQL to extract the data yourself. But let’s not
focus on that here in this post. I promise, I’ll dig into that later in the series (see
the TOC) just before creating your first Analytics Rule.
But fortunately, most times data is stored neatly and orderly. This is where the Data
Connectors come into play in Microsoft Sentinel. The parsing is done for you when
an actual Data Connector is in play. The “parser” is part of the Data Connector or
the Sentinel Solution. For those situations where an official Data Connector does
not exist, you may be called on to create your own parser. Again, I’ll cover this later
in this series, but I do want to call this out, as it’s important. So, for your efforts as
you begin building your KQL knowledge, stick with the tables that are part of a Data
Connector, otherwise you’ll bump off into unknown territory that can get miry fast.
OK…with this knowledge firmly in-hand, let’s jump back to the UI to talk about some
areas in the console that help shortcut some of this activity.
Column Types
As shown in the screenshot example, there are various KQL column types. Again,
knowing these data column types will alter your approach for querying specific
columns. I don’t want to spend a lot of time on this here, so as not to introduce
confusion too early. But I’ll include the list here so I can refer back to it later on in the
series.
• Basic
  • int, long (numerical types)
  • bool: true, false (logical operators)
  • string: “example”, ‘example’
• Time
  • datetime: datetime(2016-11-20 22:30:15.4), now(), ago(4d)
  • timespan: 2d, 20m, time(1.13:20:05.10), 100ms
• Complex
  • dynamic: JSON format
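If you'd like to see these types without touching a real table, here's a small sketch that builds one row of literal values with the print operator and then inspects it with getschema. The column names are hypothetical and exist only for this illustration.

```kql
print
    IntValue = int(42),                                    // int
    LongValue = 42,                                        // whole-number literals are long by default
    BoolValue = true,                                      // bool
    StringValue = "example",                               // string
    DateTimeValue = datetime(2016-11-20 22:30:15.4),       // datetime
    TimeSpanValue = 2d,                                    // timespan
    DynamicValue = dynamic({"key": "value"})               // dynamic (JSON)
| getschema                                                // show the type of each column
```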
For anyone that’s worked with any query language or data format before, these are
not uncommon or new. As I talked about in Part 2, KQL – the query language – was
not designed to be difficult nor revolutionary. The revolutionary part is how it
utilizes the power of the cloud (Azure) to accomplish sifting through mass seas of
data quickly and efficiently. No, KQL – the query language – takes the best pieces of
a lot of existing query languages. For example, anyone that’s worked with SQL
Server, will have an easy time with KQL.
Back to the UI
The UI has an area that aids in organizing and customizing the table/schema view,
but it also has capabilities to enable easier and quicker access to KQL query
creation. In this post, I’m not going to focus heavily on areas 2-4. You should be able
to figure out how to click through and use most of those on your own. And, while I’ll
provide a quick overview of all the areas just now, I’ll circle back and focus on the
Tables area. As you’re getting started learning KQL, this is the important area that
will save you a lot of time learning to create your own queries.
UI Overview:
1. This is the Tables list. This is where you can find all the available tables for which you
can create queries against. We’ll focus on this area just below.
2. This is the Queries list. This tab area contains a slew of pre-made KQL queries that you
can spend hours and days executing, reverse engineering, and all other matters of
query learning importance. These are separated by category types like Applications,
Audit, Azure Monitor, Azure Resources, Containers, Databases, Desktop Analytics, IT &
Management Tools, Network, Security, Virtual Machines, Windows Virtual Desktop,
Workloads, and Others.
3. This is the Functions list. A Function is like a stored procedure in SQL, except in our case
the query code is in KQL. This is a hugely useful component of KQL. I’ll cover this in-
depth later in the series (see the TOC). Did you know that the Watchlist feature of
Microsoft Sentinel relies heavily on a Function? If you access the Function tab in the UI,
you’ll see the _GetWatchlist function.
4. The Filter tab. The Filter tab is absolutely awesome and delivers another shortcut
method of developing your KQL queries. After running a query the Filter tab will contain
a list of empty data columns that you can select to filter out of the query results. Once a
column is selected and applied, you can see in the screenshot that the query is updated
automatically with the where operator to use as the filter mechanism and then the
query is rerun. The isempty() component is used, which, in itself is a powerful tool that
we’ll talk about later in this series.
Filter tab
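To give an idea of the kind of line the Filter tab adds, here's a hedged sketch. The ClientIP column is my own pick, and the exact text the UI generates may differ from this.

```kql
OfficeActivity
| where isempty(ClientIP)   // keep only the rows where ClientIP has no value
```

Swapping isempty() for isnotempty() flips the filter so only rows that do have a value are returned.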
Schema Area Focus
I noted in Part 6 that everything that can be done in the UI we should eventually
accomplish in the KQL query itself. That’s still the case here, but the UI provides
some neat shortcuts that shouldn’t be overlooked.
1. First off, every Table in the list can be expanded to show the schema underneath. So,
instead of always resorting to the getschema operator, you can expand the Table while
you’re creating your queries to have a quick-glance reference list of what you can query
against.
2. Secondly, if you hover your mouse cursor over a Table name, a new pop-up window
displays that provides even more query shortcut value. Also of importance, notice that
the pop-up will display the description of the table.
3. If you click the Use in editor option, the Table name will automatically be placed in the
query window so you can start querying against the table.
4. The Useful links option links directly to the Azure Monitor Logs table reference that I
provided as a resource in Part 1.
5. And, finally, the most excellent, super-cool shortcut is the capability to click and look at
sample results from the table itself. Clicking on this will produce its own window similar
to the following:
Data Sampling
Incidentally, this most excellent, super-cool shortcut is actually a KQL query itself
that uses the take operator that I’ll cover later in the series. In fact, it’s a take 10
similar to the following:
OfficeActivity
| take 10
This tells the query engine to return 10 records as a data sample. Which records
come back is arbitrary rather than guaranteed, so different data may display each
time it runs.
OK, now that we have all the concepts and UI functionality finally out of the way, it’s
time to start building queries using the most common KQL operators. From this
point on in the series, I’ll supply a KQL example based on an operator you can
expect to use and see constantly in Microsoft Sentinel and our other security
platform services. You should make it your intent to make use of the public KQL
Playground I supplied in the Part 1 resources, or your own environment, to get
hands-on with each operator I talk about.
You’ll see as I go along, I’ll take a simple query and start to build on it with each new
part in this series. We’ll begin simple and end up with a pretty interesting, but more
complex query than what we started with.
Must Learn KQL Part 8: The Where Operator
Hands-on Recommendations
Before jumping directly into coverage of the first KQL operator, I want to extend
some recommendations on how to proceed to ensure you get the most out of the
hands-on opportunities through the remainder of this series.
In each new part of this series, I’ll talk about a specific KQL operator, command, or
concept and supply example queries that you can use to get hands-on experience.
The examples will be available here in the text, but also in the Examples folder of
the GitHub repository for this series (https://aka.ms/MustLearnKQL).
Bear with me (and forgive me) while I repeat myself. In Part 5: Turn Search into
Workflow, I said the following…
Filtering the data is the key to everything. <= Read that again. Filtering
the data that is returned produces exact, actionable data. It also
improves the results performance of our queries. Where the search
operator may return thousands of rows of data in 15 seconds (or less),
by properly filtering the data to return exactly what is necessary
returns just the number of rows of data we asked for which greatly
improves the processing time. Where the search operator may have
taken 15 seconds, our new Search structure query will take 5 seconds
or less. The Where operator is the key to this operation. Learn it. Know
it. Keep the Where operator reference page handy: https://cda.ms/3jh.
Rod Trent, circa Part 5 of the Must Learn KQL series
That still holds true. So, based on that, would you agree with me that that makes
this Part 8 one of the most important in the series? You betcha.
The syntax for the where operator will always be the same. Using our knowledge
from Part 3 on workflow, you know that the flow of the query needs to follow a
logical path. We need to tell the query engine the table we want to query against,
then we need to tell it how to filter that data.
TableName
| where predicate
Allowable predicates:
• String predicates: ==, has, contains, startswith, endswith, matches regex, etc
• Numeric/Date predicates: ==, !=, <, >, <=, >=
• Empty predicates: isempty(), notempty(), isnull(), notnull()
Where operator example:
In the following example, I’ve added the commenting character (the double-
forwardslash covered in Part 3) to each line to explain what it is accomplishing.
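Here's a sketch of that example, reconstructed from the description that follows. It assumes EventID 4624 (a successful Windows logon) as the way to identify a successful login.

```kql
SecurityEvent                        // the table to query
| where TimeGenerated > ago(1h)      // only activity in the last hour
| where EventID == 4624              // successful logon (assumed event ID for this sketch)
| where AccountType =~ "user"        // normal users; =~ ignores case
```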
As shown, the example queries the SecurityEvent table, looking for normal users
(non-admins) that had a successful login in the last hour. Can you see that? For
each command line (separated by the pipe character (|) I talked about in Part 3) the
where operator is enacting on the data in a specific way based on the predicate. In
the example, I’ve used the where operator three different times to further filter the
results that will be produced. I can use the where operator ad nauseam, until the
results are exactly what I need them to be.
EXTRA: There is one additional piece of clarification I need to make. In the third (last
line) where statement of the example query there’s an interesting looking predicate
(=~). The tilde (~) character can be used in string predicates to cause the query
engine to ignore case (case insensitivity). So, for our example, I’m telling the query
engine to find every occurrence of the word “user” in the AccountType column no
matter if it’s spelled “User” or “user” or “uSEr”, etc. Otherwise, it’s going to return my
request verbatim which could result in zero results for the AccountType column.
The tilde is an extremely useful tool particularly if there have been data or schema
changes.
EXTRA CREDIT: If you’re hungry for more of the where operator, and just want to
continue building your KQL knowledge until the next part in this series (see the TOC),
take the original query example to the KQL Playground (https://aka.ms/LADemo)
and run it line-by-line to see how each line changes the results. You can insert and
remove the double-forwardslash (//) character at the beginning of each command
line to comment it out or to include it.
For example, the following query will show more data than just the last hour
because, as you can see, the TimeGenerated filter line is commented out with the
double-forwardslash characters.
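A sketch of that variation, assuming the same SecurityEvent query shape described earlier in this part (EventID 4624 is my assumption for a successful logon):

```kql
SecurityEvent                           // the table to query
// | where TimeGenerated > ago(1h)      // commented out, so the one-hour filter no longer applies
| where EventID == 4624                 // successful logon (assumed event ID for this sketch)
| where AccountType =~ "user"           // normal users; =~ ignores case
```

With the TimeGenerated line disabled, the results fall back to whatever time range is selected in the UI.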
Because limit and take are so similar and used for the same purposes, I’m going to
combine those in this part of this series. I’m not going to rehash my hands-on
recommendations here, but please check out the section in Part 8 for those if you
either missed it or have forgotten. In my opinion, the hands-on part of this series is
the most important piece.
Up front – there are no functional differences between limit and take. They’re like
fraternal twins. They have the same origin and similar attributes but have different
names and looks.
In some cases, there are those KQL operators or commands that have similar
functions, but one is better than another in how it reacts with the underlying
technologies. Or, better said, one is better performing in most situations than
another. In fact, we have a living document around this. See the KQL Best Practices
doc for more information. Take special notice of the has and contains operators in
the list in the Best Practices doc since I talked about the String Predicates in Part 8.
That said, since there are no true functional differences between limit and take it
comes down to personal preference.
Tablename
| limit <number>
-or-
Tablename
| take <number>
There are a few things to keep in mind about these fraternal twin operators:
• Sort is not guaranteed to be preserved. This speaks for itself. Don’t expect any
special sorting of columns of data to work.
• Consistent result is not guaranteed. No matter how many times you run the same
query with limit or take, there is no guarantee it will return the same rows. Which
records come back is arbitrary, so treat the results as effectively random.
• Very useful when trying out new queries or performing data sampling. Data
Sampling is a powerful capability of any data scientist or meager KQL query maven. This
is a similar activity for when we used the search operator in Part 4.
• Default limit is 30,000. No matter what number you supply in the query, the results
will never show more than 30,000. That’s a hard limit. And, when you think about it,
since limit and take are part of a data sampling technique, you may want to seriously
rethink your strategy (and use a different operator) if you need more than 1,000 rows of
data returned – and that’s a generous number.
And guess what? I’ve supplied both the limit and take operator versions so you can
start to formulate your favorite.
-or-
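Here's a sketch of the two versions, assuming they extend the Part 8 query shape (SecurityEvent, last hour, successful logons, normal users). The row count of 10 and EventID 4624 are my assumptions for illustration.

```kql
// The limit version
SecurityEvent
| where TimeGenerated > ago(1h)      // only activity in the last hour
| where EventID == 4624              // successful logon (assumed event ID)
| where AccountType =~ "user"        // normal users; =~ ignores case
| limit 10                           // return no more than 10 rows

// The take version
SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4624
| where AccountType =~ "user"
| take 10                            // same behavior as limit 10
```

Run each query separately. Both return up to 10 rows, with no guarantee about which rows those are.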
Also notice that I’m using the same query example from Part 8 – just adding
the limit and take command lines at the end. I’ll use this same query throughout (as
much as possible) to show a standard method of query development that will lead
to creating your very first Analytics Rule for Microsoft Sentinel. Creating an Analytics
Rule for Microsoft Sentinel is a very similar process of starting simple and building
bigger.
Your results for either query example will look like the following. Just remember
that your results will be slightly different because of the random nature of the
operators.
Randomness