Understanding the basics of indexes is important for any database developer. A simple index can drastically improve the performance of your queries. This blog looks at the basic concepts around indexes in SQL Server, first outlining what an index is before looking into the two main indexing architectures in SQL Server, clustered and non-clustered indexes.
What is an Index?
An index is a copy of columns of data from a table or a pointer to data within a table, sorted by key values. They are used to enhance the efficiency of retrieving data from a table by aiding the database optimizer in finding that data quickly. Without an index on a table, the database has to perform a full table scan to retrieve that data. When indexes are applied correctly, the database can retrieve the data by only accessing the logical grouping of data created by the index.
While indexes may improve read access to a table, it is worth keeping in mind that they also slow down the write and delete operations on a table. This is because the database now not only has to insert or remove data, it also has to deal with organising the indexes. If your table requires more write access than read, it is probably detrimental to performance to create indexes.
In one of the original types of indexing, imagine a library. You walk into the library looking for your favourite book. Within the library building, books are grouped in sections based on their category. Each section is then sorted by alphabetical order of last name. You quickly find the section your book is residing, and then find the book on the shelf based on the authors last name.
In this Picture It, a common analogy for indexing, the grouping of the books in sections, ordered by the author’s last name, represents the index created on the table. Without indexing in a library, it would take much longer to locate the book you were looking for. The same goes for a database table. Providing an index helps the database quickly locate the data you are after. However not any index will do, it is important you create indexes that aid in finding the data. If a library grouped their books based on the first letter of chapter one, very few people would be able to find their book.
In SQL Server, there are two main types of indexing architecture, clustered and nonclustered.
A clustered index stores the actual table data sorted in the order of the defined cluster key. The cluster key is all the columns included in the index definition and should be chosen to maximise the efficiency of your queries. You can only have one clustered index per table as it is sorting the actual data in the table and as such, can only be sorted one way. Before a clustered index is created, data in the table has no logical sort order, it is entirely random.
A non-clustered index is its own structure which stores a pointer to the data using the key values provided in the index definition. Unlike clustered indexes, you can have many non-clustered indexes as they are their own structure and just point the database in the right direction of the data. A non-clustered index does not sort the physical data in the table, that remains random.
The general rule of thumb is that clustered indexes perform much faster than non-clustered indexes as there is less work for the database to do. Before choosing, make sure you do your own research into where you are looking to make performance gains and decide firstly if an index is beneficial, and secondly, what type of index will provide you with the most performance gain from a database-wide perspective. That concludes this brief blog about indexes in SQL Server. This subject is of course much larger than what I have outlined here but I hope this has been a good place to start.
Thanks for reading,