In my previous blog I explained about clustered index (https://sqldose.wordpress.com/2015/09/20/clustered-index-and-related-interview-questions/ ).
Considering that we are versed with the concept of heap, binary and clustered. Lets move on to non clustered index(NCI).
NCI for me is the most mysterious object in SQL Server. The way SQL Server work with it, sometimes can be very unpredictable. But hey do not worry, that is why we are for. Share the most with you all.
Have you used information at the end of book. Which predict all the page number where a specific topic is referred.
Above image shows a topic Member Function, Which can be find at the corresponding pages.
Similarly, NCI is a structure ( binary or heap) which holds the pointer to the data rows of table at it leaves.
Even i am. 😛
Ok! let me explain this in another way. As soon as I create an NCI on a column, SQL Server will create a separate structure and fill it with data(just one value) and PageID where that data can be find.
So if I have to explain NCI in a plain simple language for the MemberFunction shown in above image. It will be, The Value “MemberFunction” can be find at PageID = 101,108…
NOTE : I mentioned NCI structure as btree or heap. Why? NCI will be a heap if my table on which NCI is created does not have a CI(i.e. A HEAP). NCI will be a btree if my table has a CI. So structure of NCI depends on table on which NCI is created.
The most important point to be noted about NCI is they are stored separately. So every time when you perform any DML on your table, the same statement will be applied to NCI as well.
Now, How about if you can actually see what my NCI is filled with. What values it actually holds.
Lets get started.
Here, I am using the SalesOrderDetail (I call it SOD) table of AdventureWorks Database. (You can download this database for your practice.)
SOD already have two NCI. We will look for only one NCI i.e. IX_SalesOrderDetailProductID.
Let us first check how this Non-Clustered Index looks like physically.
First query in above snapshot provide me the index id for my NCI. I then used the DBCC IND command to find the pages allocated to this table under my NCI. From the result of DBCC IND I will look for maximum value in IndexLevel column. This column help me understand whether this page is part of ROOT node or INTERMEDIATE NODE or LEAF NODE. Highest Value represent Root Node. Lowest represent Leaf and others are intermediate nodes.
In our case we have only two type of values i.e. 0 and 1. It means 1 represent the Root Node, 0 as leaf and we do not have any intermediate nodes.
- PagePID=24336 is a root node for my NCI.
Now check this page for data. We will use DBCC PAGE command for this.
Let us understand the above result.
ProductID is my NCI column, while SalesOrderID and SalesOrderDetailID are my CI. These will act as pointer. If I remove the clustered index these two column will be replaced by RID which is physical location address. See below snapshot.
(As I have deleted the CI, Pages have been shuffled so new Root page for my index is 24920).
Now back to NCI ProductID. The above snapshots represent that ProductID=707 can be found at ChildPageID 24857,24858…so on. And the way to reach there is HEAP RID/ClusteredIndex column.
To summarize the above. Value 707 lies at page 24857(and many more) and you can use RID or CI as pointer.
I hope this clears what NCI is. If still in doubt drop me your questions.
Now few questions about NCI.
- Which column should be the best candidate for non clustered index?
Ans. Column which is most queried with operator = should be the one.
- Can we have more than one nci?
Ans. Yes, you can. But having many nci on table can hit your performance adversely. With Every INSERT or UPDATE, SQL server will have to perform changes in NCI as well. So if you have many nci. SQL server will actually consume more time in updating indexes rather than just working with your data in table.
In some rare scenario, NCI can even cause deadlocks. So we need to be extra care full with those.
- Can nci be added to a column having duplicate or null values?
- Can nci be added to a column having unique value?
- Will SQL Server uses NCI every time for a where clause on NCI column?
Ans. NO, It totally depend on the percentage of data which is being retrieved.
If productID=707 have 90% of rows out of total records. Then SQL Server will prefer to use CI rather then taking a long route of NCI.