Resilient File System (ReFS), codenamed „Protogon“, is a Microsoft proprietary file system introduced with Windows Server 2012 with the intent of becoming the „next generation“ file system after NTFS.
ReFS was designed to overcome problems that had become significant over the years since NTFS was conceived, which are related to how data storage requirements had changed. The key design advantages of ReFS include automatic integrity checking and data scrubbing, removal of the need for running chkdsk, protection against data degradation, built-in handling of hard disk drive failure and redundancy, integration of RAID functionality, a switch to copy/allocate on write for data and metadata updates, handling of very long paths and filenames, and storage virtualization and pooling, including almost arbitrarily sized logical volumes (unrelated to the physical sizes of the used drives). (src)
Test: good for virtualization but is it stable or buggy?
„Microsoft states that ReFS is largely intended for use on file servers in conjunction with Storage Spaces“ (src)
Today Windows Update restarted our server and the ReFS file system was not detected after reboot. The D:\ volume was recognized as RAW!
After some initial investigations we did a full shutdown and cold boot. When the system came online the D:\ volume was recognized as ReFS again.
I looked into the event viewer and saw several entries under \Application and Services Logs\Microsoft\Windows\DataIntegrityScan\Admin and \CrashRecovery.
They look like this, and such entries have existed since we took the server into production.
Volume metadata inconsistency was detected and was repaired successfully.
Volume name: D:
Metadata reference: 0x204
Range offset: 0x0
Range length (in bytes): 0x0
Bytes repaired: 0x3000
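The values in such an entry are hexadecimal, so it helps to convert them before judging how much was repaired. A minimal sketch, assuming the event message has been copied out of the Event Viewer as plain text; the `parse_event` helper is hypothetical and not part of any Microsoft tooling:

```python
import re

# Event message as shown above, pasted as plain text.
EVENT_TEXT = """\
Volume metadata inconsistency was detected and was repaired successfully.
Volume name: D:
Metadata reference: 0x204
Range offset: 0x0
Range length (in bytes): 0x0
Bytes repaired: 0x3000
"""


def parse_event(text):
    """Return the 'key: value' fields of an event message as a dict.

    Hypothetical helper for illustration only.
    """
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?):\s*(.*)$", line)
        if match:
            key, value = match.groups()
            # Hex values such as 0x3000 become integers; other values stay strings.
            fields[key] = int(value, 16) if value.startswith("0x") else value
    return fields


event = parse_event(EVENT_TEXT)
repaired = event["Bytes repaired"]
print(f"{event['Volume name']} repaired {repaired} bytes ({repaired // 1024} KiB)")
```

For the entry above, 0x3000 works out to 12288 bytes, i.e. 12 KiB of repaired metadata.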
„I’m afraid that the fastest way would be recovering from backup.“
Is this Microsoft’s Chinese Spy Chip revenge? „ReFS shall fail only on SuperMicro servers“ X-D
If you need a „fit for all“ file system, stay with NTFS; if you need a host that only stores Hyper-V hard disk images… it might be worth looking at.
Results for ReFS-formatted volume with File Integrity on
As can be seen from the diagram, at the beginning of the testing (around 4 or 5 minutes), there’s a strong leap of performance, but, after the short-time increase, the values go back to nominal/standard ones, which were obtained at the earlier stages of testing, and remain the same for about 10 minutes, and afterwards the system freezes with zero results.
A similar test was carried out 3 or 4 times. In each case, the system froze again after a while. We failed to complete the performance test on ReFS with FileIntegrity on.
With FileIntegrity on, ReFS changes the sequence of commands, the requests, and the I/O type, which eliminates many random writes. As a result, IOPS jump sharply and then fall, the garbage collector is launched, and free disk space runs out. By the end of the test, free disk space drops to zero.
ReFS with FileIntegrity off performs like a conventional file system, identically to NTFS (https://en.wikipedia.org/wiki/NTFS), which preceded it. As is typical of a regular file system, no changes in free disk space are observed because reads and writes take place on pre-allocated blocks. The I/O block size and pattern don’t change either. This mode makes ReFS suitable for modern high-capacity drives and huge files since no chkdsk or scrubber is active.
With FileIntegrity on, we observe fluctuations in free disk space and changes in the sequence of commands, the requests, and the I/O size and pattern. This means that in this mode ReFS works like a Log-Structured File System (https://en.wikipedia.org/wiki/Log-structured_file_system). This sounds good for virtualization workloads because the system transforms multiple small random writes into bigger pages, which increases performance and prevents the “I/O blender” effect. This issue is typical for virtualization and refers to a dramatic performance degradation that results from multiple virtualized workloads being merged into a single stream of random I/O. Before LSFS appeared, solving this problem was expensive. Now we have LSFS (WAFL, CASL) and, as it turned out, ReFS can help, too.
So, the main conclusion to be drawn here is that ReFS with FileIntegrity on works like a Log-Structured File System. Is this a good thing? The issue is that, when FileIntegrity’s on, it’s not quite clear which workload we’re going to deal with. There are a number of other issues, as well, but we’d prefer to leave this topic for another day and another article. Stay tuned.“ (src)
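To make the log-structured behaviour described in the quote more tangible, here is a toy model in Python. It is only a sketch of the general idea (buffer small random writes, append them as one large sequential segment, reclaim stale copies with a garbage collector); it says nothing about how ReFS is actually implemented. The `ToyLog` class, the block size, and the segment size are all assumptions made up for illustration.

```python
import random

BLOCK = 4096            # assumed size of one small random write, in bytes
SEGMENT_BLOCKS = 256    # assumed number of blocks buffered per sequential flush


class ToyLog:
    """Toy log-structured store: never overwrites in place, only appends."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.live = {}      # logical block -> position of its newest copy in the log
        self.log_len = 0    # blocks occupied in the log (live plus stale copies)
        self.buffer = []    # logical blocks waiting for the next flush
        self.flushes = 0    # number of large sequential writes issued

    def write(self, logical_block):
        """Queue one small write; flush once a full segment has accumulated."""
        self.buffer.append(logical_block)
        if len(self.buffer) == SEGMENT_BLOCKS:
            self.flush()

    def flush(self):
        """Append the buffered blocks to the log as one sequential write."""
        if not self.buffer:
            return
        if self.log_len + len(self.buffer) > self.capacity:
            self.collect_garbage()
        if self.log_len + len(self.buffer) > self.capacity:
            raise RuntimeError("out of space even after garbage collection")
        for logical_block in self.buffer:
            self.live[logical_block] = self.log_len  # older copy becomes stale
            self.log_len += 1
        self.buffer.clear()
        self.flushes += 1

    def collect_garbage(self):
        """Compact the log so that only the newest copy of each block remains."""
        for pos, logical_block in enumerate(sorted(self.live, key=self.live.get)):
            self.live[logical_block] = pos
        self.log_len = len(self.live)

    def free_blocks(self):
        return self.capacity - self.log_len


# 10,000 random 4 KiB overwrites inside a 1024-block file on a 4096-block volume.
random.seed(0)
store = ToyLog(capacity_blocks=4096)
free_history = []
for _ in range(10_000):
    store.write(random.randrange(1024))
    free_history.append(store.free_blocks())

print("sequential flushes:", store.flushes)
print("free blocks, min/max:", min(free_history), max(free_history))
```

In this model the 10,000 random writes reach the disk as a few dozen large sequential flushes, and free space swings between full and empty as stale copies pile up and the garbage collector reclaims them. That mirrors the fluctuating free space in the quoted test, and it also hints at what goes wrong when the collector cannot keep up.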