Wednesday, July 25, 2007

Causes and cures of fragmentation

Causes and cures

Fragmentation occurs when the operating system cannot or will not allocate enough contiguous space to store a complete file as a unit, but instead puts parts of it in gaps between other files (usually those gaps exist because they formerly held a file that the operating system has since deleted, or because the operating system allocated excess space for the file in the first place). As advances in technology bring larger disk drives, the performance loss due to fragmentation squares with each doubling of the size of the drive. Larger files and greater numbers of files also contribute to fragmentation and the consequent performance loss. Defragmentation attempts to alleviate these problems.

Consider the following scenario:

An otherwise blank disk has five files, A, B, C, D and E, each using 10 blocks of space (for this section, a block is the file system's allocation unit; it could be 1 KB, 100 KB or 1 MB, and is not any specific size).

1. On a blank disk, all of these files are allocated one after the other.

2. If file B is deleted, there are two options: leave the space for B empty and use it again later, or compress all the files after B so that the empty space follows them. Compressing could be time consuming if there were hundreds or thousands of files that needed to be moved, so in general the empty space is simply left there, marked in a table as available for later use, and then reused as needed.[1]

3. Now, if a new file, F, is allocated 7 blocks of space, it can be placed into the first 7 blocks of the space formerly holding file B, and the 3 blocks following it remain available.

4. If another new file, G, is added and needs only 3 blocks, it can occupy the space after F and before C.

5. If F subsequently needs to be expanded, the space immediately following it is no longer available, so there are again two options: (1) add a new block somewhere else and indicate that F has a second extent, or (2) move the file F to some place where it can be created as one contiguous file of the new, larger size. The latter may not be possible, because the file may be larger than any contiguous free space available, or the file could conceivably be so large that the operation would take an undesirably long time; the usual practice is therefore simply to create an extent somewhere else and chain the new extent onto the old one.

Repeat this process hundreds or thousands of times and eventually the file system has many free segments in many places, and many files spread over many extents. If, as a result of this free-space fragmentation, a newly created file (or a file which has been extended) has to be placed in a large number of extents, access time for that file (or for all files) may become excessively long.
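The five steps above can be reproduced with a small first-fit block allocator. The ToyDisk class below, its file names, and the first-fit policy are assumptions made purely for illustration; no real file system allocates blocks this simplistically.

# A toy first-fit block allocator reproducing steps (1)-(5) above.
class ToyDisk:
    def __init__(self, size):
        self.blocks = [None] * size          # None marks a free block

    def allocate(self, name, count):
        """Fill free blocks from left to right; a file may end up split."""
        placed = 0
        for i, b in enumerate(self.blocks):
            if b is None:
                self.blocks[i] = name
                placed += 1
                if placed == count:
                    return
        raise RuntimeError("disk full")

    def delete(self, name):
        self.blocks = [None if b == name else b for b in self.blocks]

    def extents(self, name):
        """Count the contiguous runs of blocks belonging to `name`."""
        runs, prev = 0, None
        for b in self.blocks:
            if b == name and prev != name:
                runs += 1
            prev = b
        return runs

disk = ToyDisk(64)
for f in "ABCDE":
    disk.allocate(f, 10)      # (1) five contiguous 10-block files
disk.delete("B")              # (2) B's blocks are simply marked free
disk.allocate("F", 7)         # (3) F reuses 7 of B's 10 blocks
disk.allocate("G", 3)         # (4) G takes the remaining 3 blocks
disk.allocate("F", 4)         # (5) F grows, but the adjacent space is gone
print(disk.extents("F"))      # -> 2: F now occupies two separate extents

The sketch prints 2 because F's extra blocks had to be placed in the free space at the end of the disk and chained on as a second extent, exactly as described in step 5.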

The process of creating new files, and of deleting and expanding existing files, is sometimes colloquially referred to as churn, and it can occur both at the level of the root file system and within subdirectories. Fragmentation occurs not only at the level of individual files, but also when different files in a directory (and perhaps its subdirectories) that are often read in sequence start to "drift apart" as a result of churn.

A defragmentation program must move files around within the free space available in order to undo fragmentation. This is an intensive operation and cannot be performed on a file system with little or no free space. The reorganization involved in defragmentation does not change the logical location of the files (defined as their location within the directory structure).
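As a rough illustration of that reorganization, here is a naive compaction pass over the ToyDisk sketch from the walkthrough above. It simply rebuilds the block layout so that every file becomes one contiguous run; a real defragmenter works incrementally, shuffles data through whatever free space exists, and must preserve file contents and file-system metadata, so this is only a sketch of the idea.

def defragment(disk):
    # Remember each file in order of first appearance on the disk.
    order = []
    for b in disk.blocks:
        if b is not None and b not in order:
            order.append(b)
    sizes = {name: disk.blocks.count(name) for name in order}
    # Rebuild the layout so every file is stored contiguously.
    disk.blocks = [None] * len(disk.blocks)
    pos = 0
    for name in order:
        for _ in range(sizes[name]):
            disk.blocks[pos] = name
            pos += 1

defragment(disk)
print(disk.extents("F"))      # -> 1: F is back to a single extent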

Another common strategy for reducing the impact of fragmentation, and for making defragmentation faster, is to partition the hard disk(s) so that the parts of the file system that experience many more reads than writes are separated from the more volatile zones where files are created and deleted frequently. In Microsoft Windows, the contents of directories such as "\Program Files" or "\Windows" are modified far less often than they are read. The directories that contain the users' profiles are modified constantly (especially the Temp directory and the Internet Explorer cache, which create thousands of files that are deleted within a few days). If files from user profiles are held on a dedicated partition (as is commonly done on UNIX systems), the defragmenter runs better, since it does not need to deal with all the static files from other directories. For partitions with relatively little write activity, defragmentation performance greatly improves after the first defragmentation, since the defragmenter will only need to defragment a small number of new files in the future.
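The partitioning argument can also be sketched with the same ToyDisk: when temp-file churn shares a partition with rarely modified files, a new large file ends up scattered across the holes the churn leaves behind, while on a dedicated, quiet partition it stays contiguous. The file names, sizes, and churn pattern below are invented for the example.

def churn(disk, rounds):
    """Create small temporary files, then delete every other one,
    leaving scattered two-block holes behind."""
    for i in range(rounds):
        disk.allocate(f"tmp{i}", 2)
    for i in range(0, rounds, 2):
        disk.delete(f"tmp{i}")

mixed = ToyDisk(200)
for f in ("sys1", "sys2", "sys3"):
    mixed.allocate(f, 20)          # rarely modified "system" files
churn(mixed, 40)                   # temp-file churn on the same partition
mixed.allocate("bigfile", 30)      # a new large file lands in the churn holes
print(mixed.extents("bigfile"))    # -> 15 extents on the mixed partition

static, volatile = ToyDisk(120), ToyDisk(80)
for f in ("sys1", "sys2", "sys3"):
    static.allocate(f, 20)
churn(volatile, 40)                # the churn is confined to its own partition
static.allocate("bigfile", 30)
print(static.extents("bigfile"))   # -> 1: the quiet partition stays contiguous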

1 comment:

Anonymous said...

HDDs nowadays are fast and large yet not immune to the fragmentation disease. It can be like an epidemic over workstations and servers and be a nightmare to manage if it gets severe.