Follow by Email

Thursday, July 31, 2014

Read Committed VS Repeatable Read in SQL Server


Difference between Read Committed and Repeatable Read

READ COMMITTED
Specifies that statements cannot read data that has been modified but not committed by other transactions.
This prevents dirty reads.

Data can be changed by other transactions between individual statements within the current transaction, resulting in nonrepeatable reads or phantom data. This option is the SQL Server default

REPEATABLE READ
Specifies that statements cannot read data that has been modified but not committed by other transactions
and that no other transactions can modify data that has been read by the current transaction until the current transaction completes

Example: let us create table Emp_Temp with following records.

EmpId         EmpName   DeptId
1                 Allen                6
2                 Defriez             6
3                 Sean                6
4                 Alex                 6
5                 Blex                 6


-- Drop table if already exists
if exists(SELECT * FROM SYS.OBJECTS WHERE NAME = 'Emp_Temp')
BEGIN
DROP TABLE Emp_Temp
END

--- Create and insert above data.
create table Emp_Temp (EmpId int primary key,EmpName Varchar(100), DeptId int)
insert Emp_Temp values (1,'Allen',    6)
insert Emp_Temp values (2,'Defriez', 6)
insert Emp_Temp values (3,'Sean',    6)
insert Emp_Temp values (4,'Alex',    6)
insert Emp_Temp values (5,'Blex',    6)

READ COMMITTED
-- Now in session 1 Execute Below Statement, this statement will lock 4th row after passing through shared(S) lock till 3rd Row.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN
update Emp_Temp set EmpId = 4 where EmpId = 4

-- Now in same session 1 read/select all rows.
select * from Emp_Temp

-- Now in Session 2(Create by clicking New Query in Query analyzer) update rows in table Emp_Temp
update Emp_Temp set EmpId = 3 where EmpId = 3

Messages
(1 row(s) affected)

-- in case of Read Committed in session 2 transaction completes by updating 1 rows without waiting for session 1 transaction to complete


REPEATABLE READ
-- Now in session 1 Execute Below Statement, this statement will lock 4th row after passing through shared(S) lock till 3rd Row.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
update Emp_Temp set EmpId = 4 where EmpId = 4

-- Now in same session 1 read/select all rows.
select * from Emp_Temp

-- Now in Session 2(Create by clicking New Query in Query analyzer) update rows in table Emp_Temp
update Emp_Temp set EmpId = 3 where EmpId = 3

-- In case Repeatable Read This will keep on executing till session 1 transaction completes


Note: I Created table Emp_Temp with proper columns to further understand "Non Repeatable Reads OR Phantom Reads")

Tuesday, July 29, 2014

Columnstore Index in SQL Server 2014



A Columnstore index is a technology which we use to store, select and managing data by using a columnar data format, called a columnstore. SQL Server supports both clustered and nonclustered columnstore indexes. Both use the same in-memory columnstore technology,
But they do have differences in purpose and in features they support.
SQL server 2014 support updateable clustered columnstore index comparing to SQL server 2012 Non updateable clustered columnstore Index.

In columnstore data is logically organized data as table in rows and column format but physically store data in column wise format.



Example:
CREATE TABLE tbl_ProductColumnStore(
ProductId [int] NOT NULL,
OrderDateId [int] NOT NULL,
Quantity [int] NOT NULL,
RecNo [int] NOT NULL);
GO

CREATE CLUSTERED COLUMNSTORE INDEX CCI_tbl_EmployeeColumnStore ON tbl_ProductColumnStore;
GO

-- We can have only one columnstore index on a table, if we create additional index it show error.

CREATE NONCLUSTERED COLUMNSTORE index NCCI_tbl_EmployeeColumnStore ON tbl_ProductColumnStore(ProductId);
GO

Error:
Msg 35303, Level 16, State 1, Line 14
CREATE INDEX statement failed because a nonclustered index cannot be created on a table that has a clustered columnstore index. Consider replacing the clustered columnstore index with a nonclustered columnstore index.


Some Key points on Clustered ColumnStore:
1 Has storage on entire table.
2 No key columns. All columns are included columns.
3 Has only one index on the table.
4 Can be configured to use columnstore or columnstore archival compression.
5 Columns does not physically stored in a sorted order. Instead, it stores data to improve compression and performance.

Some Key points on Non Clustered ColumnStore:
1 Can index a subset of columns in the clustered index
2 Requires extra storage to store a copy of the columns in the index.
3 Is updated by rebuilding the index or switching partitions in and out.
4 Can be combined with other indexes on the table.
5 Can be configured to use columnstore or columnstore archival compression.
6 Does not physically store columns in a sorted order. Instead, it stores data to improve compression and performance. Pre-sorting the data before creating the columnstore index is not required, but can improve columnstore compression.



Friday, July 25, 2014

Transpose of Rows to Columns in SQL Server

-- Transpose of a Records in SQL Server

IF Object_id('tempdb.dbo.#temp')IS NOT NULL
BEGIN
DROP TABLE #temp
END

CREATE TABLE #temp
(
year INT,
qtr VARCHAR(20),
saleamt INT
)

INSERT INTO #temp
SELECT 2011,
'Q1',
15

INSERT INTO #temp
SELECT 2012,
'Q2',
16

INSERT INTO #temp
SELECT 2012,
'Q3',
25

INSERT INTO #temp
SELECT 2012,
'Q4',
35

INSERT INTO #temp
SELECT 2012,
'Q5',
45

INSERT INTO #temp
SELECT 2011,
'Q2',
112

INSERT INTO #temp
SELECT 2011,
'Q3',
125

INSERT INTO #temp
SELECT 2011,
'Q4',
135

INSERT INTO #temp
SELECT 2011,
'Q5',
145

INSERT INTO #temp
SELECT 2012,
'Q6',
15

SELECT *
FROM #temp

DECLARE @sql VARCHAR(max) =''
DECLARE @Qry VARCHAR(max) =''

SELECT @sql = @sql
+ '(select sum(saleamt) from #Temp tt where t.year= tt.year and Qtr = '''
+ qtr + ''') as ''' + qtr + '''' + ','
FROM (SELECT DISTINCT qtr
FROM #temp) t

SET @sql =LEFT(@sql, Len(@sql) - 1)
SET @Qry = 'select Year, ' + @sql
+ ' from #Temp t group by year '

PRINT @Qry

EXECUTE (@Qry)

IF Object_id('tempdb.dbo.#temp')IS NOT NULL
BEGIN
DROP TABLE #temp
END



DDL Trigger (Know DDL commands on a Database)

DDL Trigger (Know DDL commands on a Database)

-- Table to Log DDL Level Events

CREATE TABLE databaseleveleventlog
(
id INT IDENTITY,
dbuser NVARCHAR(100),
systemuser NVARCHAR(100),
hostname NVARCHAR(100),
hostid NVARCHAR(100),
event NVARCHAR(500),
command NVARCHAR(max),
eventtime DATETIME,
);

go

-- Trigger Definition
CREATE TRIGGER tgr_databaseleveleventlog
ON DATABASE
FOR ddl_database_level_events
AS
INSERT databaseleveleventlog
(dbuser,
systemuser,
hostname,
hostid,
event,
command,
eventtime)
VALUES (CONVERT(NVARCHAR(100), CURRENT_USER),
SYSTEM_USER,
Host_name(),
Host_id(),
Eventdata().value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(100)')
,
Eventdata().value('(/EVENT_INSTANCE/TSQLCommand)[1]',
'nvarchar(2000)'),
Getdate())

go


-- DROP TRIGGER tgr_DatabaseLevelEventLog ON DATABASE;
-- DROP TABLE DatabaseLevelEventLog;

Note: Once our task/goal is achieved, we need to drop this Trigger and Table, If not required any more.


Thursday, July 24, 2014

Know Stored Procedure which is Updating/Manipulating(DML) a Table ?


How to know SP's which are Manipulating/Updating(DML) a Table?

Suppose we have a table Employee, now we want to know the list of DB objects and users which are manipulating Employee Table
-- Create Table Employee
CREATE TABLE EMPLOYEE
(
EMPID INT,
EMPNAME VARCHAR(100),
Address VARCHAR(200)
)

-- Create Log Table to store the calling sp's/DB objects and Users.
CREATE TABLE TBL_CALLINGOBJECTS
(
ID INT IDENTITY(1, 1),
DTIME DATETIME DEFAULT(GETDATE()),
EMPID INT,
EMPNAME VARCHAR(100),
USERNAME VARCHAR(100) DEFAULT(SUSER_SNAME()),
SPID INT DEFAULT(@@SPID),
SPNAME VARCHAR(200)
)

-- Create Trigger Start on Table Employee
CREATE TRIGGER TRIGGEREMPLOYEE_INSERT
ON EMPLOYEE
FOR INSERT -- ,UPDATE, DELETE
AS
DECLARE @Query VARCHAR(100)
DECLARE @SpName VARCHAR(255)

CREATE TABLE #SESSIONOBJECTS
(
EVENTTYPE NVARCHAR(30),
PARAMETERS INT,
EVENTINFO NVARCHAR(255)
)

SET @Query = 'DBCC INPUTBUFFER(' + Str(@@SPID) + ')'

INSERT INTO #SESSIONOBJECTS
EXEC (@Query)

SELECT @SpName = EVENTINFO FROM #SESSIONOBJECTS

INSERT INTO TBL_CALLINGOBJECTS
(EMPID,
EMPNAME,
SPNAME)
SELECT EMPID,
EMPNAME,
@SpName
FROM INSERTED
GO
-- Create Trigger End on Table Employee

-- Now select the data from table TBL_CALLINGOBJECTS and check calling objects in column SPName
-- Note:
Once we know the calling sp's /objecs we have to delete/disable this trigger as per requirement
else trigger will keep on inserting data in log table(TBL_CALLINGOBJECTS)
right now trigger is defined for Insert command, we can modify it for Delete and Update if reqiured.

Friday, March 7, 2014

Index Selectivity in SQL Server


Index Selectivity in SQL Server

Index selectivity describes distribution of distinct values in a given data set.


To more generalize the term count the number of row and number of distinct values in a given column across all rows then divide count of distinct values by number of rows, the result in a ratio describe the selectivity of the index.


Better the selectivity more useful the index is and used by query optimizer.

There’s a common piece of advice given about columns in an index key that says that the most selective column should go first. It's correct, but the problem is that it’s often given without any explanation as to why the most selective column should go first, nor are the other considerations for index key order mentioned.
This can lead to misunderstandings like, in the extreme case, where one person after hearing that advice went and added the primary key column as the leading column of every single nonclustered index (because it’s highly selective), and then wondered why his database performance decreased dramatically.
The comment about selectivity is because of the way SQL keeps statistics on indexes. SQL only keeps the histogram for the first column of the index. That means that it only knows the actual distribution of values of the first column. If the first column is not selective, the index may not be used. However, that’s not the whole story.
SQL also, in addition to the histogram, keeps density values for all of the left-based subsets of the index keys. So, for a 4 column index key, SQL knows the density of the first column, of the first and second columns, first and second third columns and of all four columns. The density is, in a nutshell, a value that shows how unique the set of columns is. It’s 1/(distinct values). The value can be seen for any index using DBCC Show_Statistics with the DENSITY_VECTOR option.
This means, while SQL only knows the actual data distribution of the first column, it does know, on average, how many rows will be returned by an equality match on any left-based subset of the index keys.


Therefore, a good rule for the order of columns in an index key is to put the most selective columns first, when all other considerations are equal.



Refer below query on employee table having four columns Id,Name,Addr,Dept

SELECT * FROM Employee
-- CREATE INDEX idx_Employee ON Employee(id,Name,Addr)
-- CREATE INDEX idx_Employee1 ON Employee(Addr,Name,id)
DBCC SHOW_STATISTICS(Employee,'idx_Employee')
-- DBCC SHOW_STATISTICS(Employee,'idx_Employee1')