Follow by Email

Thursday, July 24, 2014

Know Stored Procedure which is Updating/Manipulating(DML) a Table ?


How to know SP's which are Manipulating/Updating(DML) a Table?

Suppose we have a table Employee, now we want to know the list of DB objects and users which are manipulating Employee Table
-- Create Table Employee
CREATE TABLE EMPLOYEE
(
EMPID INT,
EMPNAME VARCHAR(100),
Address VARCHAR(200)
)

-- Create Log Table to store the calling sp's/DB objects and Users.
CREATE TABLE TBL_CALLINGOBJECTS
(
ID INT IDENTITY(1, 1),
DTIME DATETIME DEFAULT(GETDATE()),
EMPID INT,
EMPNAME VARCHAR(100),
USERNAME VARCHAR(100) DEFAULT(SUSER_SNAME()),
SPID INT DEFAULT(@@SPID),
SPNAME VARCHAR(200)
)

-- Create Trigger Start on Table Employee
CREATE TRIGGER TRIGGEREMPLOYEE_INSERT
ON EMPLOYEE
FOR INSERT -- ,UPDATE, DELETE
AS
DECLARE @Query VARCHAR(100)
DECLARE @SpName VARCHAR(255)

CREATE TABLE #SESSIONOBJECTS
(
EVENTTYPE NVARCHAR(30),
PARAMETERS INT,
EVENTINFO NVARCHAR(255)
)

SET @Query = 'DBCC INPUTBUFFER(' + Str(@@SPID) + ')'

INSERT INTO #SESSIONOBJECTS
EXEC (@Query)

SELECT @SpName = EVENTINFO FROM #SESSIONOBJECTS

INSERT INTO TBL_CALLINGOBJECTS
(EMPID,
EMPNAME,
SPNAME)
SELECT EMPID,
EMPNAME,
@SpName
FROM INSERTED
GO
-- Create Trigger End on Table Employee

-- Now select the data from table TBL_CALLINGOBJECTS and check calling objects in column SPName
-- Note:
Once we know the calling sp's /objecs we have to delete/disable this trigger as per requirement
else trigger will keep on inserting data in log table(TBL_CALLINGOBJECTS)
right now trigger is defined for Insert command, we can modify it for Delete and Update if reqiured.

Friday, March 7, 2014

Index Selectivity in SQL Server


Index Selectivity in SQL Server

Index selectivity describes distribution of distinct values in a given data set.


To more generalize the term count the number of row and number of distinct values in a given column across all rows then divide count of distinct values by number of rows, the result in a ratio describe the selectivity of the index.


Better the selectivity more useful the index is and used by query optimizer.

There’s a common piece of advice given about columns in an index key that says that the most selective column should go first. It's correct, but the problem is that it’s often given without any explanation as to why the most selective column should go first, nor are the other considerations for index key order mentioned.
This can lead to misunderstandings like, in the extreme case, where one person after hearing that advice went and added the primary key column as the leading column of every single nonclustered index (because it’s highly selective), and then wondered why his database performance decreased dramatically.
The comment about selectivity is because of the way SQL keeps statistics on indexes. SQL only keeps the histogram for the first column of the index. That means that it only knows the actual distribution of values of the first column. If the first column is not selective, the index may not be used. However, that’s not the whole story.
SQL also, in addition to the histogram, keeps density values for all of the left-based subsets of the index keys. So, for a 4 column index key, SQL knows the density of the first column, of the first and second columns, first and second third columns and of all four columns. The density is, in a nutshell, a value that shows how unique the set of columns is. It’s 1/(distinct values). The value can be seen for any index using DBCC Show_Statistics with the DENSITY_VECTOR option.
This means, while SQL only knows the actual data distribution of the first column, it does know, on average, how many rows will be returned by an equality match on any left-based subset of the index keys.


Therefore, a good rule for the order of columns in an index key is to put the most selective columns first, when all other considerations are equal.



Refer below query on employee table having four columns Id,Name,Addr,Dept

SELECT * FROM Employee
-- CREATE INDEX idx_Employee ON Employee(id,Name,Addr)
-- CREATE INDEX idx_Employee1 ON Employee(Addr,Name,id)
DBCC SHOW_STATISTICS(Employee,'idx_Employee')
-- DBCC SHOW_STATISTICS(Employee,'idx_Employee1')


Friday, February 28, 2014

How to find out Dropped DataBase Objects Details in Given Time?




CREATE PROCEDURE uspRecover_dropped_DBobjects @Date_From DATETIME=
'1900/01/01',
@Date_To DATETIME =
'9999/12/31'
AS
; WITH cte
AS (SELECT
B.name
AS [Schema Name]
,
Replace(
Substring(A.[rowlog contents 0], 14, Len(A.[rowlog contents 0])), 0x00, 0x)
AS
[Object Name],
[transaction id],
A.[rowlog contents 0]
FROM Fn_dblog(NULL, NULL) A
LEFT JOIN sys.schemas B
ON CONVERT(INT, Substring([rowlog contents 0], 2, 2)) =
B.schema_id
WHERE A.[allocunitname] = 'sys.sysschobjs.nc1'
AND A.[transaction id] IN (SELECT DISTINCT [transaction id]
FROM sys.Fn_dblog(NULL, NULL)
WHERE context IN ( 'LCX_NULL' )
AND operation IN ( 'LOP_BEGIN_XACT'
)
AND [transaction name] LIKE
'%DROP%'
AND CONVERT(NVARCHAR(11),
[begin time]
)
BETWEEN
@Date_From AND @Date_To))
SELECT[schema name],
[object name],
B.[begin time] AS [Dropped Date & Time],
C.[name] AS [Dropped By User Name]
FROM cte A
INNER JOIN Fn_dblog(NULL, NULL) B
ON A.[transaction id] = B.[transaction id]
AND context IN ( 'LCX_NULL' )
AND operation IN ( 'LOP_BEGIN_XACT' )
AND [transaction name]LIKE '%DROP%'
LEFT JOIN sys.sysusers C
ON B.[transaction sid] = C.[sid]

GO



-- ************ Start Create Procedure and View on User defined Table **************
--*********CREATE table temp_Employee without column Location **************
IF OBJECT_ID('temp_Employee') IS NULL
BEGIN
CREATE TABLE temp_Employee
(EmpId INT,
Name VARCHAR(100),
Department VARCHAR(100))
END

--*********Insert data into table temp_Employee **************
INSERT INTO temp_Employee (EmpId ,Name,Department )
VALUES (1, 'Allen','IT')
GO

--*********CREATE VIEW on Table temp_Employee **************
CREATE VIEW vw_temp_Employee
AS
SELECT * FROM temp_Employee
GO

--*********CREATE Proc on Table temp_Employee **************
CREATE Proc usp_Temp_Employee
AS
SELECT * FROM temp_Employee
GO


-- ************ End Create Procedure and View on User defined Table **************




--- DROP PROC WITH USER SUPPOSE BABU --------
DROP PROC usp_Temp_Employee
GO


--- DROP VIEW WITH USER SUPPOSE BABU1 --------

DROP VIEW vw_Temp_Employee
GO




----- ***** Execute the sp to find out list of dropped objects in given time frame ************

EXEC uspRecover_dropped_DBobjects '2014/02/20 04:45:06:450','2014/03/20 04:45:06:450'


-- OUTPUT
schema name object name Dropped Date & Time Dropped By User Name
dbo uspTemp_Employee 2014/02/28 05:24:28:903 NULL
dbo vw_temp_Employee 2014/02/28 05:32:51:437 Babu
dbo vw_temp_Employee 2014/02/28 05:42:46:610 Babu1
dbo usp_Temp_Employee 2014/02/28 05:42:50:587 Babu
dbo vw_temp_Employee 2014/02/28 05:54:29:100 Babu1
dbo Recover_dropped_objects_detail_proc 2014/02/28 06:03:32:850 Babu1



sp_refreshview (update metadata/definition of a view)


sp_refreshview System stored procedure


USE TEMPDB

BEGIN TRANSACTION

--*********CREATE table temp_Employee without column Location **************
CREATE Table temp_Employee
(EmpId INT,
Name VARCHAR(100),
Department VARCHAR(100))

--*********Insert data into table temp_Employee **************
INSERT INTO temp_Employee (EmpId ,Name,Department )
VALUES (1, 'Allen','IT')
GO

--*********CREATE VIEW on Table temp_Employee **************
CREATE VIEW vw_temp_Employee
AS
SELECT * FROM temp_Employee
GO


--*********SELECT Data from Table and View both query returns same number of columns **************
SELECT TOP 1 * FROM temp_Employee
SELECT TOP 1 * FROM vw_temp_Employee


--*********ALTER TABLE temp_Employee Add New column Location **************
ALTER TABLE temp_employee
ADD Location VARCHAR(100)


--*********SELECT Data from Table and View both query do not returns same number of columns(view is not returning Location Column) **************
SELECT TOP 1 * FROM temp_Employee
SELECT TOP 1 * FROM vw_temp_Employee


-- Now Execute sp_refreshview to update the view definition in System defined module object
EXEC sp_refreshview vw_temp_Employee


--*********SELECT Data from Table and View both query returns same number of columns **************
SELECT TOP 1 * FROM temp_Employee
SELECT TOP 1 * FROM vw_temp_Employee

ROLLBACK TRANSACTION

Thursday, February 27, 2014

DB


ColumnStore index in SQL Server


ColumnStore index in SQL Server


The SQL Server in-memory columnstore index stores and manages data by using column-based data storage and column-based query processing. Columnstore indexes work well for data warehousing workloads that primarily perform bulk loads and read-only queries. Use the columnstore index to achieve up to 10x query performance gains over traditional row-oriented storage, and up to 7x data compression over the uncompressed data size.
Two types of columnstore indexes we have
1. Clustered column store index
2. Non Clustered Column store index
Clustered colulmnstore index updateable means it allows data updation on underlying table, which leads work load while insert, update, delete operation.

A columnstore index is a technology for storing, retrieving and managing data by using a columnar data format, called a columnstore. SQL Server supports both clustered and nonclustered columnstore indexes. Both use the same in-memory columnstore technology, but they do have differences in purpose and in features they support
Columnstore index works well for read only data with large data set, it helps to scan whole Index/table rather than index seek or to search particular value.
Benefits of columnstore index:
• Columns often have similar data, which results in high compression rates.
• High compression rates improve query performance by using a smaller in-memory footprint. In turn, query performance can improve because SQL Server can perform more query and data operations in-memory.
• A new query execution mechanism called batch-mode execution has been added to SQL Server that reduces CPU usage by a large amount. Batch-mode execution is closely integrated with, and optimized around, the columnstore storage format. Batch-mode execution is sometimes known as vector-based or vectorized execution.
• Queries often select only a few columns from a table, which reduces total I/O from the physical media.

• Can index a subset of columns in the clustered index or heap. For example, it can index the frequently used columns.
• Requires extra storage to store a copy of the columns in the index.
• Is updated by rebuilding the index or switching partitions in and out. It is not updateable by using the DML operations such as insert, update, and delete.
• Can be combined with other indexes on the table.
• Can be configured to use columnstore or columnstore archival compression.
• Does not physically store columns in a sorted order. Instead, it stores data to improve compression and performance. Pre-sorting the data before creating the columnstore index is not required, but can improve columnstore compression.

Characteristics:
Clustered columnstore index:
• Is available in Enterprise, Developer, and Evaluation editions.
• Is updateable.
• Is the primary storage method for the entire table.
• Has no key columns. All columns are included columns.
• Is the only index on the table. It cannot be combined with any other indexes.
• Can be configured to use columnstore or columnstore archival compression.
• Does not physically store columns in a sorted order. Instead, it stores data to improve compression and performance
Non Clustered columnstore index:
• Can index a subset of columns in the clustered index or heap. For example, it can index the frequently used columns.
• Requires extra storage to store a copy of the columns in the index.
• Is updated by rebuilding the index or switching partitions in and out. It is not updateable by using the DML operations such as insert, update, and delete.
• Can be combined with other indexes on the table.
• Can be configured to use columnstore or columnstore archival compression.
• Does not physically store columns in a sorted order. Instead, it stores data to improve compression and performance. Pre-sorting the data before creating the columnstore index is not required, but can improve columnstore compression.

Columnstore index works well for all data types except followings.
• ntext, text, and image
• varchar(max) and nvarchar(max)
• rowversion (and timestamp)
• sql_variant
• CLR types (hierarchyid and spatial types)
• Xml

Columnstore indexes cannot be combined with the following features:
• Page and row compression, and vardecimal storage format (A columnstore index is already compressed in a different format.)
• Replication
• Change tracking
• Change data capture
• Filestream
Syntex to create columnstore index
CREATE TABLE MyTable
(ProductId [int] NOT NULL,
RequestDateKey [int] NOT NULL,
DueDateKey [int] NOT NULL,
ShipDateKey [int] NOT NULL);
GO

CREATE CLUSTERED INDEX cl_myTable ON MyTable (ProductId);
GO

CREATE NONCLUSTERED COLUMNSTORE INDEX csindx_MyTable
ON MyTable
(RequestDateKey, DueDateKey, ShipDateKey);
GO

Create columnstore index using options

CREATE NONCLUSTERED COLUMNSTORE INDEX csindx_MyTable
ON MyTable
(RequestDateKey, DueDateKey, ShipDateKey)
WITH (DROP_EXISTING = ON,
MAXDOP = 2)
ON "default"
GO

Change the data in a nonclustered columnstore index
Once you create a nonclustered columnstore index on a table, you cannot directly modify the data in that table. A query with INSERT, UPDATE, DELETE, or MERGE will fail and return an error message. To add or modify the data in the table, you can do one of the following:

Disable or drop the columnstore index. You can then update the data in the table. If you disable the columnstore index, you can rebuild the columnstore index when you finish updating the data.
Syntex to Disable and Rebuild the columnstore index
ALTER INDEX csindx_MyTable ON MyTable DISABLE;
ALTER INDEX csindx_MyTable on MyTable REBUILD

Conclusion:
After columnstore index implementation around 10 times performance(in terms of IO and Time) we get comparing to existing row based index storage approach