Introduction to Splunk
Splunk is a software product used to analyze large volumes of machine data in the business world. It is a powerful and versatile search tool that indexes log data in real time, which eases monitoring and troubleshooting of issues in our applications. Splunk was founded by Michael Baum, Rob Das, and Erik Swan. It was developed in 2003, but demand for it grew after the Splunk 3.0 release in 2008-09.
Splunk works by indexing data and then letting you search and investigate it, add knowledge to it, set up monitors and alerts, build reports and analyses, and prepare dashboards. Splunk collects the data securely and then stores and indexes it at a centralized location with role-based access. So no matter how unstructured or diverse our data may be, we can easily monitor, report on, and analyze it.
Concepts of Splunk:
Splunk adds knowledge to your data with the help of knowledge objects (such as Tags, Fields, Saved Searches, Reports, Dashboards, and Alerts). These knowledge objects can be shared and reused. The main concepts are explained below.
About Splunk Home:
Splunk Home is the main window to the apps and data accessible from this Splunk instance. Splunk Home includes a search bar and three panels: Apps, Data, and Help.
- The search bar is used to run a search query. It is similar to the standard Splunk search bar and includes a time range picker.
- The Data panel is used to add new data and manage existing data. Once Splunk contains data, the panel shows a brief summary: how long ago the data was indexed, the earliest and latest events, and the volume of data.
From the Data panel you can:
- Click Add Data to get new data into Splunk.
- Click Manage Inputs to view and edit existing input definitions.
Uploading Data into Splunk:
A user can upload many types of data into Splunk, such as text files, CSV files, event logs, web logs, or any other machine data. After the upload, Splunk immediately indexes the data and makes it available for searching. A user can then run any type of search on this data and create reports, dashboards, and charts.
Step 1. In Splunk Home, click Add Data.
Step 2. Click From files and directories.
Step 3. There are two options: Preview data before indexing and Skip preview. To preview the data before indexing, select Preview data and browse for the file. Otherwise, select Skip preview and click Continue.
Step 4. Select Upload and index a file and browse for the data file.
Step 5. Open More settings and set the following:
- Under Host, set Set host to “regex on path” and Regular expression to “1”.
- Under Source type, set Set the source type to “Automatic”.
- Under Index, set Set the destination index to “default”.
Step 6. Click Save. Splunk displays a message confirming that the data was indexed successfully.
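The “regex on path” host setting in Step 5 derives the host name from the path of the uploaded file: the regular expression is matched against the path and its first capture group becomes the host. The sketch below imitates that idea in plain Python. It is not Splunk code; the function name, the pattern, and the directory layout are hypothetical and chosen only for illustration.

```python
import re

def host_from_path(path, pattern):
    """Return the first capture group of `pattern` found in `path`, or None.

    The pattern is assumed to contain at least one capture group.
    """
    match = re.search(pattern, path)
    return match.group(1) if match else None

# Hypothetical layout: one directory per machine under /var/log/hosts/
HOST_PATTERN = r"/var/log/hosts/([^/]+)/"

host = host_from_path("/var/log/hosts/web01/access.log", HOST_PATTERN)
print(host)  # web01
```

If the pattern does not match the path, the sketch returns None, which corresponds to falling back to some default host value.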
To start searching, click Start searching.
To see more details about the uploaded data, click Data Summary.
The Data Summary dialog displays three tabs: Hosts, Sources, and Source types.
The host of an event is typically the hostname, IP address, or fully qualified domain name of the network machine.
The source of an event is the file or directory path, network port, or script.
The source type of an event tells you what kind of data it is, usually based on how it is formatted.
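To make the three metadata fields concrete, here is a minimal Python sketch that models events as dictionaries and counts them per host, source, and source type, roughly what the three tabs of the Data Summary dialog show. The sample events, their values, and the function name data_summary are made up for illustration; this is not how Splunk stores or summarizes data internally.

```python
from collections import Counter

# Hypothetical events; each carries the three default metadata fields.
events = [
    {"host": "web01", "source": "/var/log/access.log", "sourcetype": "access_combined"},
    {"host": "web01", "source": "/var/log/access.log", "sourcetype": "access_combined"},
    {"host": "db01", "source": "/var/log/mysql.log", "sourcetype": "mysqld"},
]

def data_summary(events):
    """Count events per host, per source, and per source type."""
    return {
        field: Counter(e[field] for e in events)
        for field in ("host", "source", "sourcetype")
    }

summary = data_summary(events)
for field, counts in summary.items():
    print(field, dict(counts))
```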
Most commonly used commands:
Top/Rare: The top command returns the most common values of the given field, and the rare command returns the least common values.
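As a rough illustration of what top and rare compute, the following Python sketch counts the values of one field and returns the most or least frequent ones. The helper names and the sample data are assumptions made for this example, and the percentage column that Splunk adds to the output is omitted.

```python
from collections import Counter

def top(events, field, limit=10):
    """Most frequent values of `field`, as (value, count) pairs."""
    counts = Counter(e[field] for e in events if field in e)
    return counts.most_common(limit)

def rare(events, field, limit=10):
    """Least frequent values of `field`, as (value, count) pairs."""
    counts = Counter(e[field] for e in events if field in e)
    return sorted(counts.items(), key=lambda kv: kv[1])[:limit]

# Hypothetical flight events
flights = [{"UniqueCarrier": c} for c in ["AA", "AA", "AA", "DL", "DL", "UA"]]

print(top(flights, "UniqueCarrier", limit=2))   # [('AA', 3), ('DL', 2)]
print(rare(flights, "UniqueCarrier", limit=2))  # [('UA', 1), ('DL', 2)]
```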
Stats: The stats command performs statistical calculations over a dataset. It is similar to SQL aggregation. There is more than one command for statistical calculations: the stats, chart, and timechart commands perform the same statistical calculations on your data but return slightly different output.
- sourcetype="csv" | stats dc(Origin)
- sourcetype="csv" | stats values(UniqueCarrier) by Month
Below are the statistical functions that you can use with the stats command.
avg(X): Returns the average of the values of field X.
count(X): Returns the number of occurrences of field X.
dc(X): Returns the count of distinct values of field X.
max(X): Returns the maximum value of field X.
min(X): Returns the minimum value of field X.
sum(X): Returns the sum of the values of field X.
values(X): Returns a list of all distinct values of field X.
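The functions listed above can be mimicked with a few lines of Python, which may help when checking what a stats search should return. This is only a sketch of the semantics, not Splunk's implementation; the field name ArrDelay and the sample events are hypothetical.

```python
def stats(events, field):
    """Compute stats-style aggregates over one numeric field."""
    # Events that do not contain the field are ignored.
    xs = [e[field] for e in events if field in e]
    if not xs:
        return {"count": 0}
    return {
        "avg": sum(xs) / len(xs),
        "count": len(xs),
        "dc": len(set(xs)),         # distinct count
        "max": max(xs),
        "min": min(xs),
        "sum": sum(xs),
        "values": sorted(set(xs)),  # distinct values, sorted
    }

# Hypothetical flight events; the last one has no ArrDelay field.
flights = [
    {"UniqueCarrier": "AA", "ArrDelay": 10},
    {"UniqueCarrier": "DL", "ArrDelay": 20},
    {"UniqueCarrier": "AA", "ArrDelay": 20},
    {"UniqueCarrier": "UA", "ArrDelay": 30},
    {"UniqueCarrier": "UA"},
]

result = stats(flights, "ArrDelay")
print(result)
```

Note that count is 4 rather than 5 here, because the event without the field does not contribute to any of the aggregates.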
Chart: The chart command creates tabular data output suitable for charting. You specify the x-axis variable using over or by.
E.g.: sourcetype="csv" | chart values(UniqueCarrier) by Month
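The grouping behind a search like chart values(UniqueCarrier) by Month can be sketched in Python as follows: events are grouped by the by field, and the distinct values of the other field are collected per group. The function name and sample events are assumptions for illustration only.

```python
from collections import defaultdict

def chart_values_by(events, value_field, by_field):
    """Collect the distinct values of `value_field` for each `by_field` value."""
    groups = defaultdict(set)
    for e in events:
        if value_field in e and by_field in e:
            groups[e[by_field]].add(e[value_field])
    # One row per group, ordered by the grouping value
    return {key: sorted(vals) for key, vals in sorted(groups.items())}

# Hypothetical flight events
flights = [
    {"Month": 1, "UniqueCarrier": "AA"},
    {"Month": 1, "UniqueCarrier": "DL"},
    {"Month": 1, "UniqueCarrier": "AA"},
    {"Month": 2, "UniqueCarrier": "UA"},
]

rows = chart_values_by(flights, "UniqueCarrier", "Month")
print(rows)  # {1: ['AA', 'DL'], 2: ['UA']}
```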
Timechart: The timechart command creates a chart for a statistical aggregation applied to a field against time as the x-axis.
E.g.: sourcetype="csv" | timechart values(UniqueCarrier) by Month
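What sets timechart apart is that events are first placed into time buckets, and the aggregation is then applied per bucket. The Python sketch below shows that bucketing for a simple count with a one-hour span. It assumes each event stores its timestamp in epoch seconds under _time; the function name and sample timestamps are made up, and unlike Splunk the sketch does not emit empty buckets.

```python
from collections import Counter

def timechart_count(events, span_seconds=3600):
    """Count events per time bucket of `span_seconds` seconds."""
    buckets = Counter()
    for e in events:
        # Round the timestamp down to the start of its bucket
        bucket_start = e["_time"] - (e["_time"] % span_seconds)
        buckets[bucket_start] += 1
    return dict(sorted(buckets.items()))

# Hypothetical events with timestamps in epoch seconds
events = [{"_time": 0}, {"_time": 1800}, {"_time": 3600}, {"_time": 7300}]

series = timechart_count(events)
print(series)  # {0: 2, 3600: 1, 7200: 1}
```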
Table: The table command returns a table made up of only the fields named in its argument list, in the order given.
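In other words, table is a projection: every event is reduced to the listed fields. A minimal Python sketch of that behavior, with a hypothetical helper name and sample events, could look like this:

```python
def table(events, *fields):
    """Keep only the named fields of each event, in the given order."""
    # A field that is missing from an event shows up as None (an empty cell).
    return [{f: e.get(f) for f in fields} for e in events]

# Hypothetical flight events
flights = [
    {"Month": 1, "UniqueCarrier": "AA", "Origin": "JFK"},
    {"Month": 2, "UniqueCarrier": "UA"},
]

rows = table(flights, "UniqueCarrier", "Origin")
for row in rows:
    print(row)
```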
Dedup: The dedup command removes redundant data. It keeps only the first event for each unique value (or combination of values) of the specified fields.
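The filtering that dedup performs can be sketched in Python as a single pass that remembers which field values have already been seen. The helper name and sample events are hypothetical. The sketch drops events that lack one of the fields, which mirrors what is understood to be the command's default behavior; treat that detail as an assumption.

```python
def dedup(events, *fields):
    """Keep the first event for each distinct combination of `fields`."""
    seen = set()
    kept = []
    for e in events:
        if any(f not in e for f in fields):
            continue  # assumption: events missing a field are dropped
        key = tuple(e[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(e)
    return kept

# Hypothetical flight events, in search-result order
flights = [
    {"UniqueCarrier": "AA", "Origin": "JFK"},
    {"UniqueCarrier": "AA", "Origin": "LAX"},
    {"UniqueCarrier": "DL", "Origin": "JFK"},
    {"Origin": "SFO"},
]

unique = dedup(flights, "UniqueCarrier")
print(unique)
```

Here only the first AA event and the DL event remain, because the second AA event repeats an already seen carrier and the last event has no carrier at all.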
Charts/Reports: We can create reports and charts for better visualization and understanding of the data. Many kinds of charts can be drawn, for example Pie, Line, Bar, and Area.
Dashboards: Dashboards are the most common type of view. Each dashboard contains one or more panels, each of which can contain visualizations such as charts, tables, event lists, and maps. Basically, a dashboard is a collection of searches and reports.
To create a dashboard, save a chart/report as a dashboard panel.
Mention the dashboard title, description, and panel title and save it.
The dashboard is now created. To view it, click View dashboard.
Conclusion – What Is Splunk
Splunk is a platform used for real-time operations. It is used for application management, security, and performance management. A free version is available, so it is easily accessible. It helps in visualizing data with charts and graphs, and it is easy for beginners to learn. It is also one of the main tools for DevOps and Agile developers.
This has been a guide to Splunk. Here we have discussed some basic concepts of Splunk, the steps for uploading data into Splunk, and the most commonly used commands.