Why or why not to pad an index with fill fator on sql server

Goal:


Using sql server's fill factor to pad an index with free space seems like a great idea if you have high fragmentation on an index or you think you have a very high write to read ratio, or simply a write intensive table.

The problem:


So you go an add lets say a 70% fill factor and suddenly your reads slow down. Why is that?! I just improved performance by adding free space to the leaf pages of the index.

You assumed that since the index has to be updated frequently due to inserts or updates that you need to leave free space.

Fill factor other than 100% or 0% (they are synonymous) can have a negative impact on performance, so don't just globally set it to 90% as some folks like to do. It can in some cases kill performance and bring a system to a halt. Be very selective when you decide to add it.

Fill factor facts:


  • Leaves free space on the leaf pages of the index ONLY when the index is created or rebuilt, NOT when it is re
  • Reorganized or when a page split occurs. The reason it does not happen on a page split is that sql server does a 50% split anyway, half the record from the old page remain and half of the remainder gets move to the new page.
  • The fill factor does not leave space (logically that is) at the end of the index in the page, it interleaves it between the rows of the leaf page's rows. This is an important factor as it is key in deciding when to use fill factor and when not. We will touch on this point here and there so it makes sense.
  • Changing the fill factor on an index with existing rows will lock the table, so be careful when doing it during production high-volume transactions
  • Fill factor is useless on a CLUSTERED index where the key(s) do not change, like IDENTITY key of GUID for example, why, because these are in sequential order already. Remember from my point earlier that fill factor interleaves space between the rows and not at the end, so if the keys don't change and the keys always remain in sequence, what is the purpose of interleaving space?!
  • Adding a fill factor to the CLUSTERED index does not affect the fill factor of NONCLUSTERED indexes.

When to use or even consider fill factor is obviously what you need and want to know, a quick "cheat sheet" so to speak to very quickly determine if you should bother or not so as to save you from tons of time unnecessarily trying to tune an index. We will discuss some tactics to very quickly isolate which ones to worry about and which ones to ignore.

Which type of index should you not add any fill factors to:


- The table that is being indexed never gets any inserts or updates
- The table has less than 5000 records, sql server won't even bother using an index on small tables
- The index only contains keys that are never updated/changed even though the table may get inserts and updates, for example the table has a status field that is always true for all records (not sure why bother, just an example to drive home the point).
- CLUSTERED or NONCLUSTERED index when it is keyed on a column that is always in sequential order and never changes, like an IDENTITY key or a GUID, or any other sequential key that ensures that any new records are appended to the last record on the last leaf page.

You may be tempted to add a fill factor to a NONCLUSTERED index because you needed to add one to the CLUSTERED index. Don't just add a fill factor to all the other indexes on a table just because it made sense for the CLUSTERED index.

Remember an important fact when when deciding to look at a specific index that fill factor does not ad free space to the end of the index's leaf pages. It adds it in between rows, interleaving thr rows. Why? Because when this concept was designed one of its main purposes was to avoid page-splits, which are very expensive. During a page split, and remember this happens during production on live data as the record gets inserted or updated, half of the data from the existing page gets moved to a new page, half stays where it is. You want to avoid this from happening too often. So by adding empty space in between solves this. How? If the free space was added at the end of each leaf page, it would have made no sense because if the key on the index is a sequential key the new rows would always be appended anyway to the last leaf page, hence all the prior leaf pages, which could be thousands will have a fill factor and in this example never be used. But since it is geared towards random inserts because of inserting data based on random keys or because of the key(s) being altered on the index, it will find a new place for the row and if it finds it on a free spot/row reserved by fill factor, no page split is needed, no rows need to be moved or sorted, very fast and very cheap. That is why you don't add a fill factor to any index where the key(s) are sequential.

When it is imperative to consider adding a fill factor:


- Massive inserts and/or updates to the index. It is important to note that I am referring to the index and not the table. You could have a table with massive updates but if the index does not contain any of the index keys that are being changed on the table, the index does not change and so adding a fill factor is not necessary and could have a significant degradation in performance if you do add it when it is not needed.
- This is a more reactive approach, but when the index is fragmented more than 30% you should start considering reorganizing it, rebuilding it, changing the index or adding a fill factor
- The ratio between read and write is high, low read rate and high write rate. What I mean by high and low is that if IMHO the ratio if 1:5 (read:write), you have five times more writes than reads, you should definitely consider a fill factor. If the number of reads is much higher than the write rate, 5:1, five times more reads than writes I would be hesitant to add a fill factor, but for a 25:1, for example 25,000 writes and only 1000 reads I would consider a fill factor of 90%. This number is not too scientific you need to experiment a little.

When to analyze index to determine if they need a fill factor?



This is a great question. Many developers and DBAs get involved in the deployment phase to set fill factors based on what they see at the time. The reason this is short sighted is because:

- Data loads initially may be very low and grow steadily over time
- Data loads may be very high initially due to data migrations from existing databases
- Data load may be very low and grow substantially in the first few days or weeks due to data entry, like when data migrations were not an option or it was very thin and requires a lot of manual tuning
- Data loads may be seasonal, sporadic or depending on project cycles

So look at your read/write rations and fragmentation often, including database size growth.

So now you have identifies some indexes that may need to be tuned because:

  • Their fragmentation levels are above 30%
  • Or you have noticed performance degradation

In short, I would suggest the following fill factor when purely looking at fragmentation:

  • 0 - 30% fragmentation: do not add a fill factor
  • 30 - 50%fragmentation: add a 90% fill factor to start off with and watch fragmentation for a few days or hours depending on your read-write load an adjust it
  • > 50% fragmentation: add a 70% fill factor and again keep an eye on it for a while and adjust as needed

Why so vague about the fill factor, isn't there a perfect science to it? No there is not simply because the behavior of your data changes over time and how people search for data gets changed by programmers very often.

Here are some useful sql queries:


Find out how your index is performing, seeks, scans and no of updates to your index:



SELECT
o.name AS object_name,
i.name AS index_name,
i.type_desc,
u.user_seeks, u.user_scans,
u.user_lookups, u.user_updates,
o.type
FROM
sys.indexes i
JOIN
sys.objects o ON i.object_id = o.object_id
LEFT JOIN
sys.dm_db_index_usage_stats u ON i.object_id = u.OBJECT_ID
AND i.index_id = u.index_id
AND u.database_id = DB_ID('enter your database name here')
WHERE
o.type IN ('U', 'V') AND
i.name IS NOT NULL
ORDER BY
o.name, i.name


Find out the read/write ratio of your tables:



DECLARE @dbid int
SELECT @dbid = db_id('enter your database name here')

SELECT TableName = object_name(s.object_id),
       Reads = SUM(user_seeks + user_scans + user_lookups), Writes =  SUM(user_updates)
FROM sys.dm_db_index_usage_stats AS s
INNER JOIN sys.indexes AS i
ON s.object_id = i.object_id
AND i.index_id = s.index_id
WHERE objectproperty(s.object_id,'IsUserTable') = 1
AND s.database_id = @dbid
GROUP BY object_name(s.object_id)
ORDER BY writes DESC


Find out how fragmented your index is for those fragmented more than 30%:



SELECT OBJECT_NAME(i.OBJECT_ID) AS NameOfTable,
i.name AS NameOfIndex,
indexstats.avg_fragmentation_in_percent as PercentageFragmentation
FROM sys.dm_db_index_physical_stats(DB_ID('enter your database name here'), NULL, NULL, NULL, 'DETAILED') indexstats
INNER JOIN sys.indexes i ON i.OBJECT_ID = indexstats.OBJECT_ID
AND i.index_id = indexstats.index_id
WHERE indexstats.avg_fragmentation_in_percent > 30


Good luck and let me know if you have any interesting stats to share from using fill factor.

Print | posted @ Friday, April 27, 2012 11:24 AM

Comments on this entry:

Gravatar # http://adidastenis2014.activys.com/adidas-tenis/
by Adidas Tênis at 11/12/2013 9:16 PM

Initial thoughts on the new Adidas Smart Run GPS watch
Post A Comment
Title:
Name:
Email:
Comment:
Verification: