Friday, March 20, 2015

CXPacket Wait in SQL Server



CXPACKET
Occurs with parallel query plans when trying to synchronize the query processor exchange iterator. If waiting is excessive and cannot be reduced by tuning the query (such as adding indexes), consider adjusting the cost threshold for parallelism or lowering the degree of parallelism.

CXPacket wait occurs when we use multiple processors/thread to execute a query.
When a server has multiple CPU, we can configure setting to use multiple cpu/thread, if query is taking
Excess time compare to approximate time of a query.

We can define this setting using “cost threshold for parallelism”, having default value 5 seconds, we can change values to something else based on our requirement
Suppose we have mentioned 100 seconds and if any query is taking more than 100 seconds then it will start parallel processing based on the configuration values of server max degree of parallelism.
According to below screenshot our server has two CPU’s, CPU0 and CPU1 and let us
Change the server configuration values “cost threshold for parallelism” to 100 and max degree of parallelism to 2.

Using this configuration query will start using two cpu/thread, if query takes more than 100 seconds.
According to below Figures A and B, we have two cpu, defined “cost threshold for parallelism” to 100 and max degree of parallelism to 2, highlighted in color yellow and black in figure A and Figure B respectively
Figure A





Figure B






We can change these setting using system defined stored procedure sp_configure.
As per below setting, if any query taking more than 100 seconds will qualify for parallel processing and SQL server will create and execute parallel plans using two thread(CPU0 & CPU1)

EXEC sys.sp_configure N'cost threshold for parallelism', N'100'
GO
EXEC sys.sp_configure N'max degree of parallelism', N'2'
GO
RECONFIGURE WITH OVERRIDE
GO






We can also set parallel thread setting using OPTION (MAXDOP 2) at query level like below
Let us create table CXPacketInfo in tempdb database and insert 100000 rows

USE TEMPDB

CREATE TABLE CXPacketInfo (CX_id INT, CXO_id INT, CX_amount INT, CX_description CHAR(2000))



BEGIN TRAN
DECLARE @i INT
SET @i = 1

WHILE (@i <= 100000 ) /*>*/
BEGIN

INSERT INTO CXPacketInfo VALUES (@i % 1, @i, RAND() * 100000, REPLICATE('a', 2000))
SET @i = @i + 1

END

COMMIT TRAN


SET STATISTICS time ON
GO

/*Execute query with single thread & Clear the wait stats entry in dmv sys.dm_os_wait_stats */


DBCC sqlperf('sys.dm_os_wait_stats', clear)

DECLARE @order_amount INT
SELECT @order_amount= Max(CX_amount)
FROM CXPacketInfo o
OPTION (maxdop 1) /* Execute query with single thread */


SELECT * FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET'



Output:Output of dmv is showing 0 waiting_tasks_count and 0 wait_time_ms

wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
CXPACKET 0 0 0 0


Execution Plan: In execution plan we can see query is only using single thread and this single thread is processing 100000 Rows without parallelism.


Execution Plan A:





/*Execute query with single thread Clear the wait stats entry in dmv sys.dm_os_wait_stats */

DBCC sqlperf('sys.dm_os_wait_stats', clear)

DECLARE @order_amount INT

SELECT @order_amount= Max(CX_amount)
FROM CXPacketInfo o

OPTION (maxdop 2) /* Execute query with two threads */

SELECT * FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET'

Output: Output of dmv is showing 4 waiting_tasks_count and 65 wait_time_ms


wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
CXPACKET 4 65 65 0


Execution Plan: In execution plan we can see query is using two threads and these two threads are processing 100000 Rows.
In Right hand rectangle box following Threads are used



Thread Rows Comments
Thread 0 0 Organizer Thread
Thread 1 55280 First Thread
Thread 2 44720 Second Thread

Thread 0 is organizer thread, when processing threads (Thread 1 & Thread 2) have time differences for processing the rows then organizer thread (Thread 0) has to wait for lacking behind thread, this wait we called CXPacket wait.
In plan we can also see parallelism is used in below execution plan marked in rectangle.




Execution Plan B:



Here's a list of additional considerations:
Make certain statistics are up-to-date; the optimizer thrives—or dies—based on up-to-date statistics.
• Fragmentation of internal structures & external disk must be kept low.
• In no case should max degree of parallelism be set to a value higher than the number of physical cores.
• For most servers max degree of parallelism should be set to no more than eight, even if more than eight cores are available.
• For NUMA-enabled servers, max degree of parallelism should not exceed the number of cores assigned to each NUMA node.
• Hyper-Threading often (not always) compromises SQL Server performance. My recommendation: In the absence of unequivocal supporting evidence that H-T enhances performance in your environment, H-T should be disabled.
• OLTP – parallelism can have a negative impact on performance.
• OLAP/reporting benefits from parallelism.
• Look at your plan cache, which of your queries use parallelism
• Check query plans – do you have missing indexes or out of date statistics
• What other waits are occurring that might be causing CXPACKET
• Consider using MAXDOP to fine tune specific queries.
• Set max degree of parallelism and cost threshold for parallelism to best suit your server’s function.
• Do not adjust max degree of parallelism without proper analysis of your servers workload/queries

No comments :

Post a Comment