Friday, August 21, 2015

Book Mark Lookup Key lookup and RID Lookup


In SQL Server execution plans many times we see Key lookup and RID lookup, let us excavate in details what these lookups are and why they appear how we can remove them from execution plans.
1. RID Lookup: when we SELECT any columns (in HEAP Table), which are not part of any indexes but the columns which are in WHERE clause are part of non clustered index then RID (row identifier) lookup happens.
HEAP Table: A Table without clustered index is called HEAP Table

2. Key Lookup: when we SELECT any columns(on Clustered Table), which are not part of any indexes but the columns which are in WHERE clause are part of non clustered index, then Key lookup happens.
When we create a clustered index on a table then it reorder the way records and leaf level of clustered index stores actual data pages with all column values.

When we create non clustered indexes on clustered table (a table with clustered index) then leaf level of each non clustered index store key value of clustered index and key/non key columns values of non clustered index, using this key value of clustered index it perform key lookup and clustered index scan.

When we create non clustered indexes on heap table then, leaf level of each non clustered index store key/non key columns values of non clustered index and 8 bytes row identifier for each row of each non clustered indexes for RID lookup.

Let us consider/check below scenarios.
/*Note: in scenarios i have given perticular blog statement wise like stmt1START,stmt1End and so on*/

/*stmt1START (CREATE Test lookup table)*/

CREATE TABLE TESTLOOKUP
(
EmpId INT,
EmpName VARCHAR(100),
DeptName VARCHAR(100),
Addr VARCHAR(100),
Phone INT
)
/*stmt1END (CREATE Test lookup table)*/

/*stmt2START (INSERT some test rows to verify scenarios)*/
INSERT INTO TESTLOOKUP
SELECT 1 AS ID,'Dinesh' AS NAME, 'IT' AS DEPT, 'DELHI' AS LOC, 1237891 AS PHONE
UNION ALL
SELECT 2, 'Babu', 'COMPUTER', 'MUMBAI', 1236391
UNION ALL
SELECT 3, 'VERMA', 'MGMT', 'PUNE', 1236471
UNION ALL
SELECT 4, 'ALLEN', 'IT', 'NEWYORK', 1647891
UNION ALL
SELECT 5, 'AMIT', 'SALES', 'HYD', 1254891
UNION ALL
SELECT 6, 'ROHAN', 'RETAIL', 'BANGALORE',1234891
UNION ALL
SELECT 7, 'SOHAN', 'IT', 'DELHI', 1236352
UNION ALL
SELECT 8, 'BRYON', 'MGMT', 'GGN', 3456691
GO

INSERT INTO TESTLOOKUP
SELECT TOP 1 Max(EmpId) + 1, 'Dinesh ', 'IT','MUMBAI',1237891 FROM TESTLOOKUP
go 100

INSERT INTO TESTLOOKUP
SELECT TOP 1 Max(EmpId) + 1,'VERMA', 'ITT','DELHI',1237891
FROM TESTLOOKUP
go 3000

/*stmt2END (INSERT some test rows for testing purpose)*/

/*stmt3START (select and check number of records(3108) */
in execution plan, table scan happens below)*/

SELECT *
FROM TESTLOOKUP

/*Execution Plan1*/
/*stmt3END (select and check number of records in execution plan table scan happens)*/

/*stmt4START (select records from table, a table without any indexes,
in execution plan we see table scan for every select stmt) */

SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'SOHAN'

/*Execution Plan2*/

SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'VERMA'

/*Execution Plan3*/

/*stmt4END(select records from table, a table without any indexes,
in execution plan we see table scan for every select stmt) */


/*stmt5START (create non clustered index )*/
CREATE NONCLUSTERED INDEX idxNonClsEmpName
ON TESTLOOKUP(EmpName)

/*stmt5END (create non clustered index )*/

/*stmt6START (select records from the table based on the least value like SOHAN, in execution plan we see RID lookup)*/
SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'SOHAN'

/*Execution Plan4*/

/*stmt6END(select records from the heap table based on the least value like SOHAN, in execution plan we see RID lookup)*/

/*stmt7START (select records from the heap table based on the large value like VERMA, in execution plan we see TABLE SCAN)*/
SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'VERMA'


/*Execution Plan5*/

/*stmt7END (select records from the table based on the large value like VERMA,
in execution plan we see TABLE SCAN)
*/


/* WHEN LOOKUP & WHEN SCAN?
Now here we have question, like when we search EmpName ='SOHAN' then RID LOOKUP and when we search EMPNAME='VERMA' then table scan, why?
Ans: this is because for SOHAN, selectivity is very high for SOHAN means we have less records like in current data distribution we have only 1 record for SOHAN but for VERMA, selectivity is very less means we have huge records >3000 for VERMA.
when we have very high selectivity optimizer go for LOOKUP but when we have very low selectivity optimizer go for scan and
scan whole table/index rather than searching/lookup a values.
*/



/*stmt8START (create clustered index)*/
CREATE CLUSTERED INDEX IdxClsEmpId
ON TESTLOOKUP(EmpId ASC )

/*stmt8END (create clustered index)*/

/*stmt9START(select records from clustered table based on the least value like SOHAN,
in execution plan we see Key lookup )*/

SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'SOHAN'
/*Execution Plan6*/

/*stmt9END (select records from clustered table based on the least value like SOHAN,
in execution plan we see Key lookup )*/


/*stmt10START (select records from clustered table based on the large value like VERMA,
in execution plan we see clustered index scan)*/

SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'VERMA'

/*Execution Plan7*/

/*stmt10END(select records from clustered table based on the large value like VERMA,
in execution plan we see clustered index scan)*/


/*LET US REMOVE KEY lookup, to remove key lookup add DeptName in
non clustered index or create covering index*/

DROP INDEX idxNonClsEmpName on TESTLOOKUP
CREATE NONCLUSTERED INDEX idxNonClsEmpNameDeptName
ON TESTLOOKUP(EmpName, DeptName)


SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'SOHAN'
/*Execution Plan8*/

SELECT EmpId,
EmpName,
DeptName
FROM TESTLOOKUP
WHERE EmpName = 'VERMA'

/*Execution Plan9*/

Note: Please try above SQL statements in Test/development environment, this is must know information for performance optimization

Thursday, July 2, 2015

Transaction in SQL Server


Transaction in SQL Server

Executing below piece of SQL Code, we will know why we should use Transaction in SQL Server.
according to below code when we execute insert tSQL script with transaction it gives around 25 times more performance comparing to without Transaction tSQL Script.

we should always have Transaction scope AS SMALL AS POSSIBLE based on requirements, as it locks the resources which are part of the transactions till the Transaction is COMMIT/ROLLBACK.

having the transaction with small scope help us to get better performance, fewer locking/blocking and better data/referential integrity.


CREATE TABLE TestTran
(
TranId INT IDENTITY(1, 1),
TranName VARCHAR(100)
)

SET NOCOUNT ON

DECLARE @StartTime1 DATETIME
DECLARE @StartTime2 DATETIME
DECLARE @EndTime1 DATETIME
DECLARE @EndTime2 DATETIME

/*Start SQL Code With Transaction*/
SET @StartTime1= Getdate()
DECLARE @WithTranI INT =1
BEGIN TRAN

WHILE ( @WithTranI <= 100000 ) /*>*/
BEGIN
INSERT INTO TestTran
(TranName)
SELECT 'Transaction ' + Cast(@WithTranI AS VARCHAR)

SET @WithTranI =@WithTranI + 1
END
COMMIT
SET @EndTime1= Getdate()

SELECT Datediff(millisecond, @StartTime1, @EndTime1) AS TimediffWithTranInMilliSecond
SELECT Datediff(second, @StartTime1, @EndTime1) AS TimediffWithTranInSecond
/*End SQL Code With Transaction*/


/*Start SQL Code Without Transaction*/
SET @StartTime2= Getdate()
DECLARE @WithOutTranI INT =1

WHILE ( @WithOutTranI <= 100000 ) /*>*/
BEGIN
INSERT INTO TestTran
(TranName)
SELECT 'Transaction ' + Cast(@WithOutTranI AS VARCHAR)

SET @WithOutTranI =@WithOutTranI + 1
END

SET @EndTime2= Getdate()

SELECT Datediff(millisecond, @StartTime2, @EndTime2) AS TimediffWithOutTranInMilliSecond
SELECT Datediff(second, @StartTime2, @EndTime2) AS TimediffWithOutTranInSecond
/*End SQL Code Without Transaction*/


DROP TABLE TestTran

GO


Note: Please execute this script in development/test environment and use transaction scope wisely.

Tuesday, June 16, 2015

Nth Highest Salary in SQL Server

Please find Below Script to get 14 different ways to find Nth Highest salary.

In script, I have selected 3rd highest salary.
For any other highest salary, Please modify queries accordingly
i.e in below Queries for 5th Highest 3 need to replace by 5 except Nested Sub Queries.

There might be some other ways too, as of now I was able to figure out below.


CREATE TABLE Employees
(
Empid INT IDENTITY(1, 1),
NAME VARCHAR(20),
Salary INT,
DeptId INT
)

INSERT INTO Employees
(NAME,
Salary,
DeptId)
VALUES ('Dinesh', 5000, 1),
('Babu' , 4000, 1),
('Verma' , 6000, 2),
('Allen' , 7000, 3),
('Joseph', 8000, 1),
('Denny' , 1000, 5),
('Ramy' , 2000, 5),
('Ramesh', 2000, 1),
('John' , 9000, 2),
('Mohan' , 5000, 1),
('Sohan' , 5000, 1),
('Brant' , 8000, 1),
('Hogg' , 6000, 3)


-- First Check 3rd Highest Salary using query with Order by Clause
SELECT *
FROM Employees
ORDER BY Salary DESC
-- According to Above query Output, 3rd highest salary is 7000

--1. Using Analytical Dense_Rank Function
SELECT TOP 1 Salary
FROM (SELECT TOP 100 PERCENT NAME,
Salary,
Dense_rank()
OVER(
ORDER BY Salary DESC) AS SalRank
FROM Employees
ORDER BY Salary DESC) EmpSalary
WHERE salrank = 3

--2. Using Nested Sub Queries
SELECT Max(Salary) AS Salary
FROM Employees
WHERE Salary < (SELECT Max(Salary) FROM Employees WHERE Salary < (SELECT Max(Salary) FROM Employees)) /*/>/>*/

--3. Using Corelated Sub Queries
SELECT TOP 1 Salary
FROM Employees e
WHERE 3 = (SELECT Count(DISTINCT Salary)
FROM Employees ee
WHERE e.Salary <= ee.Salary) /*/>/>*/

--4. Using Min and Top Clause
SELECT Min(Salary) AS Salary
FROM (SELECT DISTINCT TOP 3 Salary
FROM Employees e
ORDER BY Salary DESC) Sal

--5. Using Min,Top and Order by Clause
SELECT TOP 1 Salary
FROM (SELECT DISTINCT TOP 3 Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ORDER BY Salary ASC

--6. Using MAX, Top, NOT IN and Order BY Clause
SELECT Max(Salary) AS Salary
FROM Employees
WHERE Salary NOT IN (SELECT DISTINCT TOP (3-1) Salary
FROM Employees e
ORDER BY Salary DESC)

--7. Using Top, NOT IN and Order BY Clause
SELECT TOP 1 Salary
FROM Employees
WHERE Salary NOT IN (SELECT DISTINCT TOP (3-1) Salary
FROM Employees e
ORDER BY Salary DESC)
ORDER BY Salary DESC

--8. Using MIN, INNER JOIN,Top and Order BY Clause
SELECT Min(e.Salary) AS Salary
FROM Employees e
INNER JOIN (SELECT DISTINCT TOP 3 Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ON e.Salary = Sal.Salary

--9. Using MIN, INNER JOIN,Top and Order BY Clause
SELECT TOP 1 e.Salary
FROM Employees e
INNER JOIN (SELECT DISTINCT TOP 3 Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ON e.Salary = Sal.Salary
ORDER BY e.Salary ASC

--10. Using LEFT JOIN,MAX, Top and Order BY Clause
SELECT Max(e.Salary) AS Salary
FROM Employees e
LEFT JOIN (SELECT DISTINCT TOP (3-1) Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ON e.Salary = Sal.Salary
WHERE Sal.Salary IS NULL

--11. Using LEFT JOIN,Top and Order BY Clause
SELECT TOP 1 e.Salary
FROM Employees e
LEFT JOIN (SELECT DISTINCT TOP (3-1) Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ON e.Salary = Sal.Salary
WHERE Sal.Salary IS NULL
ORDER BY e.Salary DESC

--12. Using RIGHT JOIN,Top and Order BY Clause
SELECT TOP 1 Sal.Salary
FROM Employees e
RIGHT JOIN (SELECT DISTINCT TOP 3 Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ON e.Salary = Sal.Salary
ORDER BY e.Salary ASC

--13. Using MIN, RIGHT JOIN,Top and Order BY Clause
SELECT Min(Sal.Salary) AS Salary
FROM Employees e
RIGHT JOIN (SELECT DISTINCT TOP 3 Salary
FROM Employees e
ORDER BY Salary DESC) Sal
ON e.Salary = Sal.Salary


--14. Using OffSET, FETCH and Order BY Clause
SELECT DISTINCT Salary FROM Employees
ORDER BY salary DESC
OFFSET 3-1 ROWS
FETCH NEXT 1 ROWS ONLY


Please make sure you are trying this in development environment.


Monday, June 15, 2015

xp_readerrorlog/sp_readerrorlog in SQL Server



xp_readerrorlog/sp_readerrorlog

SQL Server support undocumented extended stored procedures (xp_readerrorlog) to read error logs, we can also see proc sp_readerrorlog, which ultimately uses procedure xp_readerrorlog in its definition

t-SQL code of sp_readerrorlog:



create proc sys.sp_readerrorlog(
@p1 int = 0,
@p2 int = NULL,
@p3 nvarchar(4000) = NULL,
@p4 nvarchar(4000) = NULL)
as
begin

if (not is_srvrolemember(N'securityadmin') = 1)
begin
raiserror(15003,-1,-1, N'securityadmin')
return (1)
end

if (@p2 is NULL)
exec sys.xp_readerrorlog @p1
else
exec sys.xp_readerrorlog @p1,@p2,@p3,@p4
end




When we execute xp_readerrorlog/sp_readerrorlog it returns sample result as below.




As per definition of sp_readerrorlog, procedure will accept four parameters (@p1, @p2, @p3, and @p4)
Where parameters requires values as below.
@p1 [Log Number] : 0 for current log, 1 for log achieve#1, 2 for log achieve#2 etc…
@p2 [Log Type] : 1  Read SQL server error log, 2  Read SQL server agent error log.
@p3 [Search String1]: first parameter value to search in field Text
@p4 [Search String4]: second parameter value to search in field Text

All above parameters also have default values as 0 or NULL.

Let us take below example for sp_readerrorlog
EXEC sys.sp_readerrorlog 0, 1 ,'failed','user'
According to above statement procedure will read errors from current SQL Server error Log.

In real, actual extended stored procedure is XP_readerrorlog, it accept three more parameters
StartTime,EndTime,SortOrder these three parameters satisfy column LogDate from procedure result

Example:
EXEC sys.xp_readerrorlog 0, 1 , N'failed',N'user','2015-05-25 15:13:02.410','2015-06-15 13:54:10.550','ASC';

According to above statement procedure will read errors from current SQL Server error Log and will have filter criteria on LogDate column as startDate and EndDate in Ascending order

Procedure with required sample parameters name are:

EXEC XP_readerrorlog [LogNumber],[LogType],[SearchStr1],[SearchStr2],[StartTime],[EndTime],[SortOrder]




Thursday, May 21, 2015

SQL Server 2016 New Features



SQL Server 2016 New Features (welcome updates)

1. Always Encrypted
Always encrypted is designed to protect data while retrieving or when stored in
Databases, encryption/decryption keys reside in applications in customer trusted environment. SQL Server can perform operations on encrypted data. This help to hide data from DBA and SQL developer. Encryption/decryption of data happens transparently in application which helps for minimal code changes to existing application.

2. Dynamic Data Masking
This feature helps to mask/hide actual values of a sensitive columns and keeping rest as it. i.e. Email/Phone/Salary etc. this can be maintained at user/role level
This is also available in Azure SQL DB V12

3. Row level security
This help to maintain the policies which help to filter specific rows based on the user queries
This is also available in Azure SQL DB V12

4. Stretch Database
This feature helps to move cold or history data to move in azure environment at lower cost and hot and operational data in on-premises. Always encrypted feature is available in azure environment.

5. Real-time Operational Analytics
This feature combines in-memory OLTP with in-memory column store for real time analysis.
It helps to optimize system for optimal transactional performance and increased workload concurrency.

6. PolyBase into SQL Server
This feature help to integrate sql server with hadoop

7. Native JSON Support
8. Java script object notation (JSON) currently not supported natively by sql server to support it has to have t-SQL,SQLCLR, JavaSupport. Now in this release JSON Support feature help to integrate SQL server directly like xml.

9. Enhancements to AlwaysOn
This release will have up to three synchronous replicas and will support DTC and also support for round robin load balancing for secondary replicas. And will also have automatic fail over based on database health.

10. Enhanced In-Memory OLTP
This release will support more memory (in TB) and greater number of parallel CPU’s and will also support for extended t-SQL surface areas.

11. Revamped SQL Server Data Tools
This release will consolidate different version of SQL Server data tools into one.



Tuesday, May 12, 2015

SQL Server System Databases



MS SQL Server internally uses system databases to manage/maintain every SQL Server activity.
When we install SQL server following system databases created with default recovery model

1. Master(Simple): Records all system level info for an instance of SQL server, SQL server logins, link server
Configuration, service broker end points, system stored proc and functions
Full recovery model for master database is supported and can be changed if master database corrupted rest of the databases will also not work we need to run setup.exe to rebuild the instance of sql server.


2. Model(configurable: Is a template database, when we create new database this new database always contain the exact object/configuration of model database.


3. Msdb(Simple): Is used by SQL Server agent also by alert and jobs when we take backup(Log/diff/tran)
Respective entries also get stored in MSDB..BACKUPSET


4. Tempdb(Simple): Is a temporary database which generally hold temporary objects like (local/global temp table etc) and also intermediate result set. On start of SQL server instance tempdb database get re created & log space always reclaimed. We cannot backup tempdb database.


5. Distribution: When we configure any replication (i.e. Transactional, merge etc) a new Distribution database gets created having data file distribution.mdf and log file distribution_log.ldf


6. Resource: SQL server also uses read only Resource database to maintain all system level information under sys schema. These all system objects physically reside in resource database but logically appear in all databases under sys schema like sys.objects.

Resource database always has database id 32767 & uses mssqlsystemresource.ldf log and mssqlsystemresource.mdf as data file which always present at. :\Program Files\Microsoft SQL Server\MSSQL.\MSSQL\Binn\
SQL server cannot backup/restore detach/attach resource database like other databases, its .ldf & .mdf file should be copied and paste according to system requirement (copy paste is not possible with other databases).
Resource database info can be checked using below query.

SELECT
'Resource Database' AS 'Database Name'
, NAME AS [Database File]
, FILENAME AS [Database File Location]
FROM sys.sysaltfiles
WHERE DBID = 32767