Paying close attention to actual execution plans during the development process is not a guarantee that query plans will stay predictable and efficient for deployed code over time, especially if a variety of parameters are available to your users. In addition, a data warehouse is characteristically a work in progress, which usually spawns a mixture of new, deployed, and ad hoc queries. This can lead to uncertainty in run times, heavy resource utilization, and general application unpredictability. Spikes to database I/O, CPU utilization, and run times are all possible based on a variety of these activities within your data warehouse environment.
Even with a more static T-SQL code base, there are changes within databases, connections, and the server that may alter existing query plans. These include:
Data modifications and the impact on data distribution
New, revised, or updated indexes and statistics
SQL Server configuration changes and SET options
User behavior can also affect query performance over time. A common example is varying parameters used by dynamic queries, stored procedures, and reports. User options can translate into alternate execution paths within your T-SQL code, which may result in extended run times. This situation leads to the need to monitor the queries running in your data warehouse environment so that you can identify, profile, and manage poorly performing code.
SQL Server’s MAXDOP (Maximum Degree of Parallelism) setting can have a significant effect on query performance, especially when working with partitioned tables. For a helpful discussion on this topic see: http://blogs.msdn.com/b/sqlcat/archive/2005/11/30/498415.aspx
Key Query Performance Metrics
Regardless of the method you use for monitoring data warehouse queries, there are three important measures to evaluate when determining query performance:
Database I/O (typically reads for a data warehouse)
Extended run times for a query are usually the first indicator of a performance problem with ad hoc or deployed code. Just as rush hour is not the best indication of how fast your car is able to go, query duration, while important, is not definitive in the sense that it may simply reflect excessive server activity. Conversely, database reads and CPU time are measures that are directly attributed to a SQL database query and not influenced by other processes running concurrently in your data warehouse.
Options for Monitoring T-SQL Code
SQL Server provides several built-in options for monitoring database performance that you can configure to provide the key measures discussed above:
SQL Server Profiler
sp_trace_create and related system stored procedures
Management Data Warehouse
SQL Server Utility
SQL Server Management Studio Reports
Note: for a discussion comparing client-side with server-side traces, see: http://sqlserverpedia.com/wiki/The_Server-side_Trace:_What,_Why,_and_How
SQL Server Profiler
Profiler is an interactive, front-end application that uses SQL Server Management Objects (SMO) and system stored procedures such as sp_trace_create to provide near real-time query metrics. The data captured is based on the occurrence of specified events such as SQL Batch, SQL Statements, RPC completed, and others depending on your configuration.
Profiler is most useful to quickly investigate a performance issue or to filter and spot-check certain queries. However, because Profiler is not the most convenient option for daily or routine monitoring, SQL Server provides other database monitoring options, as Figure 5-12 illustrates.
Figure 5-12: SQL Server 2008 R2 Performance monitoring Features
The sp_trace_create procedure is actually one of a series of system stored procedures that can be used for automating the capture of query performance data. You can specify which events and columns to collect (sp_trace_setevent), set a variety of filters to eliminate unneeded data (sp_trace_setfilter), and set limits on the size and/or length of time for capturing the trace.
A monitoring solution based sp_trace_create involves a fair amount of custom development for data capture, processing, and reporting. For more information about these system stored procedures and how to use them to build a custom monitoring system, see SQL Server Profiler Stored Procedures (Transact-SQL).
Management Data Warehouse and SQL Server Utility
Starting with SQL Server 2008, two new monitoring features and components are available for performance management: Data Collections and the Management Data Warehouse (MDW). SQL Server 2008 R2 further builds upon these features with the new SQL Server Utility.
In the next section, we explore using Data Collections and the MDW for setting up a SQL Server monitoring solution.
Setting Up the Management Data Warehouse and Data Collection Components
The MDW is designed to hold performance-related statistics stored in a SQL Server 2008 database. SQL Server 2008 has a Configure Management Data Warehouse wizard that helps you get started with this newer SQL Server monitoring extension. Figure 5-13 shows how to launch the wizard.
Figure 5-13: Launching the Configure Management Data Warehouse Wizard
The wizard steps you through creating the MDW database and configuring the standard System Data Collections Sets (Disk Usage, Query Activity, and Server Activity). You should start the MDW configuration process on the SQL Server instance that you want to use as the Management Data Warehouse repository. This instance would ideally be separate from the SQL Server instance(s) that are being monitored (i.e., collecting the data).
For step-by-step assistance on using the wizard, see the article “SQL Server 2008 Data Collections and the Management Data Warehouse.”
Note: Any name can be provided for the resulting Management Data Warehouse database. As a matter of practice, MDW is often used, and is assumed in the following examples. Also be aware that once you configure and enable the MDW, there is no way to remove the corresponding SQL Server Agent jobs that are created. You can disable data collection, which will also disable the SQL Agent job, but the jobs will continue to reside in SQL Agent’s list of jobs.
Once you have configured the MDW for your environment, you will want to manually add the “SQL Trace Collection Set”, which is not automatically set up by the wizard. The SQL Trace Collection Set provides key metrics such as Query Duration, Database Reads, and CPU Time that will then be populated within the MDW database.
Note: the SQL Trace Collection Set can run independent of the other collection sets; however the ‘Management Data Warehouse’ must first be configured as referenced above.
Albeit a slight detour, the best way to set up the SQL Trace Collection Set for the MDW is to start Profiler, as Figure 5-14 shows, and create an active trace.
Figure 5-14: Starting SQL Server Profiler
Once in Profiler, click File, New Trace… . After connecting to SQL Server, you can set the desired columns and events to capture. The standard (default) template automatically sets most of the events and columns of interest for query monitoring. To minimize server overhead for data collection, you will likely want to remove all events with the exception of RPC:Completed and SQL:BatchCompleted. You can click the Events Selection tab to specifically choose the items to collect, as Figure 5-15 shows.
Figure 5-15: Setting the events and columns to collect in Profiler
After clicking Run, you can immediately stop the trace because you only need to extract the trace definition. You can export the trace definition script for a SQL Trace Collection Set, as Figure 5-16 shows.
Figure 5-16: Using Profiler to script a SQL Trace Collection Set
The resulting script can then be loaded into SSMS and executed. However, before executing the script, you will want to provide a friendlier name for the collection set, such as SQL Trace, which is shown in the code snippet below:
-- *** with the name you want to use for the collection set.
DECLARE @collection_set_id int;
@name = N'SQL Trace',
@schedule_name = N'CollectorSchedule_Every_15min',
@collection_mode = 0, -- cached mode needed for Trace collections
@logging_level = 0, -- minimum logging
@days_until_expiration = 5,
Note: The parameter option @days_until_expiration = 5 determines how many days the trace data will be stored in the MDW database before it is automatically deleted. This value can be increased for longer data retention, but remember that this also increases the monitoring overhead of the MDW database.
After running this script, you will see a new Data Collection set called SQL Trace (you may need to right-click and refresh the Data Collection container). To start the collection set, right-click the SQL Trace collection and then click Start Data Collection Set, as Figure 5-17 shows.
Figure 5-17: Starting the SQL Trace Collection Set
You can double-click the SQL Trace Data Collection to view its configured properties, which Figure 5-18 shows.
Figure 5-18: SQL Trace Collection Set properties
Be aware of the configurable collection frequency, which in this case is every 900 seconds (15 min), as well as how long the data is retained in the MDW (the default script is set to 5 days). Because the resulting trace table ([snapshots].[trace_data]) can be very large, especially in high volume environments, you may prefer to reduce the number of days for data retention. This, however, lessens the ability to trend reoccurring query patterns.
To retain the ability to trend, you could consider an alternative approach of creating a simple ETL routine to export (and perhaps summarize) only the trace rows of interest. This is especially beneficial when trending stored procedures or query patterns over a long period of time.
Note: You can also reduce the size of the resulting trace table by adding filters to the original trace you created in Profiler to create the collection set.
When you start data collection sets, corresponding SQL Agents jobs are created to schedule and manage the data collection in the MDW. The query in Figure 5-19 will return details of collection sets, including the collection_set_id, which can be used to correlate with the actual SQL Agent jobs.
Figure 5-19: Querying Data Collection Sets
Figure 5-20 shows a comprehensive list of the SQL Agent jobs that are created to collect, upload, aggregate, and purge data stored in the MDW database.
Figure 5-20: SQL Agent jobs used by the MDW
This completes the setup of the MDW and standard data collectors. Next, we’ll look at the standard reports available in the MDW and how to leverage the SQL Trace data collections through queries and customized reports.
Standard Management Data Warehouse Reporting
There are three standard reports that are available out of the box for the MDW. These can be accessed by right-clicking the Data Collection container in SSMS, as in Figure 5-21.
Figure 5-21: Viewing Management Data Warehouse reports
These reports are designed with a clean dashboard presentation style and include many interactive features for sort orders, time frame selections, and other report links. These links provide drill-down as well as drill-through capabilities to reports, some of which are normally accessed via other containers in SSMS. Let’s look at an overview of the details and analysis provided by each report.
Disk Usage Summary
The Disk Usage Summary report, shown in Figure 5-22, lists each database contained in the SQL Server instance and provides a spark line to indicate the database growth trend. The Start Size is determined by the time of the first snapshot (i.e., when the MDW Disk Usage Collection Set was configured to start collecting data).
Figure 5-22: Disk Usage Summary Report
The link beside each database name allows for drill-through to a more familiar SSMS report, shown in Figure 5-23, which provides a breakdown of how the database space is being used.
Figure 5-23: Database Disk Usage Report
These reports provide important insight into database disk consumption and database growth trends. In the context of a data warehouse, this information may be helpful for:
Identifying larger-than-anticipated growth rates
Detecting problems with database maintenance plans
Justifying architectural changes for data archiving and the creation of summary tables
Factoring disk usage into long-term disk storage capacity planning
Server Activity History
The Server Activity History report provides an overview of SQL Server activity and resource utilization and can indicate resource contention. Figure 5-24 shows a partial view of the report.
Figure 5-24: Partial view of Server Activity History Report
This is a highly interactive report that lets you alter time frames and drill into many report components. You can see historical resource trends in different time ranges, and you can go back to specific points in time to correlate resource consumption to specific events. This allows you to both analyze trends and troubleshoot specific problems that have been reported to you after a certain amount of time passed.
A considerable amount of data is gathered from the underlying collection set to make this report possible. The sources include the following dynamic management views:
The report also pulls from more than 75 SQL Server performance counters in the following categories:
In addition, the Server Activity History report includes interactive SQL Server-specific charts, which let you track SQL wait states along with key counters for trending overall SQL activity, as Figure 5-25 shows.
Server Activity History Report (continued)
When viewing the above report, you can click a vertical bar in the SQL Server Waits chart to see the Wait Category table that Figure 5-26 shows.
Figure 5-26: Wait Category Table
From there, you can click a link under the Wait Category column to go to several other reports related to the category selected. For example, the Logging link renders a context-sensitive Wait Category report with a drill-down into the executing queries, as Figure 5-27 shows.
Sampled Waits by Application and Query
The CPU link branches over to the next section that we cover, which details query statistics.
Query Statistics History
The Query Statistics History report pulls its source data primarily from the sys.dm_exec_requests, sys.dm_exec_sessions, and sys.dm_exec_query_stats dynamic management views. The report provides a top 10 listing based on resource utilization of CPU, Duration, Total I/O, Physical Reads, and Logical Writes by second. The queries are ranked and linked to a detail report, as Figure 5-28 shows.
Query Statistics History with drill-through
Querying the SQL Trace Collection in the Management Data Warehouse
The SQL Trace Collection Set provides additional source data that can be used to analyze the resource utilization and performance of SQL queries. Unlike the built-in reports we just saw, which are based on snapshots of performance counters and data management views, a SQL Trace is configurable to capture key metrics at the time a query actually completes. This provides start-to-finish measurements of individual queries and can lead to very specific analysis (e.g., pattern matching) of the collected performance data.
Based on the earlier SQL Trace configuration, we are collecting the following measures: CPU Time, Disk I/O, and Query Duration. These were also set to be recorded based on the completion of the following events:
Based on this configuration
, completed query metrics are placed in the MDW table called [snapshots].[trace_data]. Using this table, you can view the top 10 queries for any of our key metrics. For example, you can retrieve the top 10 queries based on duration by using the following code:
-- Script M-1
SELECT TOP 10 *
ORDER BY [Duration] DESC -- or BY [Reads], or [CPU]
This result of this query should indicate the 10 longest-running queries that completed based on data collected from all the SQL Server instances monitored by this MDW (scroll to the right to see the Server Name column).
You can obtain more information about the snapshots used to gather this trace information by joining to the [core].[snapshots] view on the snapshot_id column:
-- Script M-2
SELECT TOP 10 s.*, t.*
FROM [snapshots].[trace_data] t
JOIN [core].[snapshots] s
ON s.[snapshot_id] = t.[snapshot_id]
ORDER BY [Duration] DESC
You may find it helpful to group by a substring of the TextData column. This may better illustrate recurring queries that perhaps vary within the WHERE clause but are based on the same table joins and columns. An example of this query, using just the first 80 characters of the Text Data column and based on disk reads, is displayed below:
-- Script M-3
SELECT TOP 10
COUNT(*) AS [RunCount]
, SUM([Reads]) AS [SumReads]
, LEFT([TextData],80) AS [TextShort]
GROUP BY LEFT([TextData],80)
ORDER BY [SumReads] DESC
The technique of grouping by the first 80 characters is used here in an attempt to find recurring code. You may need to alter this method to better suit the SQL code patterns in your database environment. The objective is to find recurring code that carries a high disk I/O (read) cost and then to be able to identify the origin/source of the query for subsequent tuning efforts.
Another option is to trend SUM and AVG query metrics over time. The following query again groups by the first 80 characters of the TextData column to spot a recurring query pattern and then uses a secondary GROUP BY on date to show a performance trend. A LIKE predicate is also placed on the TextData column to narrow the queries to those referencing the Sales table:
-- Script M-4
SELECT CONVERT(VARCHAR(10), [EndTime], 120) AS [DisplayDate]
, LEFT([TextData],80) AS [TextShort]
, COUNT(*) AS [TimesRun]
, SUM(Duration) AS [SumDuration]
, SUM(Reads) AS [SumReads]
, SUM(CPU) AS [SumCPU]
, AVG(Duration) AS [AvgDuration]
, AVG(Reads) AS [AvgReads]
, AVG(CPU) AS [AvgCPU]
WHERE [TextData] LIKE '%Sales%'
GROUP BY LEFT([TextData],80), CONVERT(VARCHAR(10), [EndTime], 120)
ORDER BY [TextShort], [DisplayDate]
As noted earlier, due to the typical size of trace tables, it is recommended that the most notable or problematic queries (rows) be moved to a history table. This allows for a broader period of time for trending, while keeping the base trace_data table in the MDW to a minimum size. You may also choose to use an aggregation technique, similar to that in the query above, to further lessen the storage and processing overhead for your historical table(s).
The examples in this section really just scratch the surface of the types of queries that can be used with the MDW trace_data table. For routine use of this information, you may want to create parameter-driven SQL Server Reporting Services (SSRS) reports based on similar queries or maybe use the trace_data table (or a derivation thereof) as a fact table within a PowerPivot solution.