Synchronizing Files with C++

In this post, we look at how to monitor a directory (and its subdirectories) so that we will be notified when a change occurs. We can use the Windows API to do this in C++.

There are two APIs that can achieve this:

  • FindFirstChangeNotification()
  • ReadDirectoryChangesW()

We will use ReadDirectoryChangesW() as this API provides useful information such as which specific file was changed.

From the Microsoft documentation, the function has the following prototype:

BOOL ReadDirectoryChangesW(
  [in]                HANDLE                          hDirectory,
  [out]               LPVOID                          lpBuffer,
  [in]                DWORD                           nBufferLength,
  [in]                BOOL                            bWatchSubtree,
  [in]                DWORD                           dwNotifyFilter,
  [out, optional]     LPDWORD                         lpBytesReturned,
  [in, out, optional] LPOVERLAPPED                    lpOverlapped,
  [in, optional]      LPOVERLAPPED_COMPLETION_ROUTINE lpCompletionRoutine
);

The first parameter is a handle to the directory we want to watch. We obtain this handle using the CreateFile() function. A working example is provided below.

The ReadDirectoryChangesW() function can be used in synchronous or asynchronous operation. The last two parameters control this.

In synchronous operation, the function call will block until the operating system detects filesystem changes. To choose synchronous operation, set the last two parameters to NULL.

There is no timeout parameter, so this call could block indefinitely. However, if you need to interrupt the call (e.g. to shut down your application gracefully), you can call CancelIoEx() on the directory handle from another thread.
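As a rough illustration of that idea, here is a minimal sketch in which a worker thread blocks in ReadDirectoryChangesW() and the main thread cancels it. The names WatchLoop, g_stopRequested and g_workerDone are made up for this sketch, and the "press Enter to stop" trigger is just a stand-in for whatever shutdown signal your application uses. A cancelled call fails with ERROR_OPERATION_ABORTED.

```cpp
#include <Windows.h>
#include <atomic>
#include <iostream>
#include <thread>

// Hypothetical flags used only by this sketch.
std::atomic<bool> g_stopRequested{false};
std::atomic<bool> g_workerDone{false};

// Worker thread: blocks in ReadDirectoryChangesW() until a change or a cancel.
void WatchLoop(HANDLE dir)
{
    alignas(DWORD) BYTE buffer[8192];
    while (!g_stopRequested)
    {
        DWORD bytesReturned = 0;
        BOOL ok = ReadDirectoryChangesW(dir, buffer, sizeof(buffer), TRUE,
            FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_LAST_WRITE,
            &bytesReturned, NULL, NULL);
        if (!ok)
        {
            // A cancelled call reports ERROR_OPERATION_ABORTED.
            if (GetLastError() != ERROR_OPERATION_ABORTED)
                std::wcerr << L"ReadDirectoryChangesW failed\n";
            break;
        }
        // ... walk the buffer here, as in the main example ...
    }
    g_workerDone = true;
}

int main()
{
    HANDLE dir = CreateFileW(L".", FILE_LIST_DIRECTORY,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (dir == INVALID_HANDLE_VALUE)
        return 1;

    std::thread worker(WatchLoop, dir);

    std::wcout << L"Press Enter to stop watching...\n";
    std::wcin.get();

    g_stopRequested = true;
    // Retry in case the worker was between calls when we first cancelled.
    while (!g_workerDone)
    {
        CancelIoEx(dir, NULL);   // cancel pending I/O on this handle
        Sleep(10);
    }
    worker.join();
    CloseHandle(dir);
    return 0;
}
```

The retry loop is a design choice of this sketch: a single CancelIoEx() call does nothing if the worker happens to be between two calls at that moment, so we keep cancelling until the worker reports that it has finished.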

For asynchronous operation, the directory handle must be opened with FILE_FLAG_OVERLAPPED. Two commonly used notification mechanisms are:

  • Event-based notification
  • Completion routines

An example using event-based notification by Nick Aversano can be found here:

For my example, I will use synchronous operation.

When we call ReadDirectoryChangesW() we must provide a DWORD-aligned buffer for Windows to write the change records into. When the function returns, the buffer may contain several FILE_NOTIFY_INFORMATION records, one per file event, each giving the offset of the next, so it’s necessary to iterate through the buffer. The example code demonstrates how to do this.

If Windows can’t fit all the file notifications that have occurred into the buffer provided, it will write as many as it can and return control to your application. Windows also queues changes that occur between calls in an internal buffer associated with the directory handle, so when you call ReadDirectoryChangesW() again, it will pick up where it left off and inform you of the remaining file change events.

Example code

In this demonstration application, we use ReadDirectoryChangesW() to monitor the current directory for file changes. We set bWatchSubtree to TRUE so that we also receive notifications about files in subfolders.

When a change is detected, we call our event handler HandleFileEvent(), which simply displays information to the console.

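#include <Windows.h>
#include <cstdint>
#include <iostream>
#include <string>

void HandleFileEvent(FILE_NOTIFY_INFORMATION* event);

int main()
{
    // FILE_FLAG_BACKUP_SEMANTICS is required to open a handle to a directory.
    HANDLE dir = CreateFileW(L".",
        FILE_LIST_DIRECTORY,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        NULL,
        OPEN_EXISTING,
        FILE_FLAG_BACKUP_SEMANTICS,
        NULL);
    if (dir == INVALID_HANDLE_VALUE)
    {
        std::wcerr << L"CreateFileW failed, error " << GetLastError() << std::endl;
        return 1;
    }

    // ReadDirectoryChangesW() requires a DWORD-aligned buffer.
    alignas(DWORD) uint8_t changeBuff[8192];

    while (true)
    {
        DWORD bytesRead = 0;
        BOOL success = ReadDirectoryChangesW(
            dir, changeBuff, sizeof(changeBuff), TRUE,
            FILE_NOTIFY_CHANGE_FILE_NAME |
            FILE_NOTIFY_CHANGE_DIR_NAME |
            FILE_NOTIFY_CHANGE_LAST_WRITE,
            &bytesRead, NULL, NULL);
        if (!success)
        {
            std::wcerr << L"ReadDirectoryChangesW failed, error " << GetLastError() << std::endl;
            break;
        }
        if (bytesRead == 0)
            continue;   // Overflow: changes were lost and the buffer holds no entries.

        // Walk the chain of variable-length entries; NextEntryOffset == 0 marks the last one.
        uint8_t* pCurrentEntry = changeBuff;
        while (true)
        {
            FILE_NOTIFY_INFORMATION* event = reinterpret_cast<FILE_NOTIFY_INFORMATION*>(pCurrentEntry);
            HandleFileEvent(event);
            if (event->NextEntryOffset == 0)
                break;
            pCurrentEntry += event->NextEntryOffset;
        }
    }

    CloseHandle(dir);
    return 0;
}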

void HandleFileEvent(FILE_NOTIFY_INFORMATION* event)
{
    std::wstring filename(event->FileName, event->FileNameLength / sizeof(event->FileName[0]));
    if (event->Action == FILE_ACTION_ADDED)
        std::wcout << "FILE_ACTION_ADDED            ";
    else if (event->Action == FILE_ACTION_REMOVED)
        std::wcout << "FILE_ACTION_REMOVED          ";
    else if (event->Action == FILE_ACTION_MODIFIED)
        std::wcout << "FILE_ACTION_MODIFIED         ";
    else if (event->Action == FILE_ACTION_RENAMED_OLD_NAME)
        std::wcout << "FILE_ACTION_RENAMED_OLD_NAME ";
    else if (event->Action == FILE_ACTION_RENAMED_NEW_NAME)
        std::wcout << "FILE_ACTION_RENAMED_NEW_NAME ";

    std::wcout << filename << std::endl;
}

This simple example keeps monitoring indefinitely. In a real-world application, you will probably want a way to break out of the loop. When you do, don’t forget to close the directory handle with CloseHandle() once you have finished with it.

One note of caution: according to the Microsoft documentation, if too many changes occur at the same time, it’s possible for Windows to run out of buffer space to record all the changes. In this case, the function still returns TRUE, but the buffer contents are discarded and the value written through lpBytesReturned is zero. In this situation, it’s up to the developer to find another way to figure out which files have changed! It’s worth reading the Remarks section in the Microsoft documentation:
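One possible fallback, sketched below, is to rescan the directory tree and compare it with a snapshot taken earlier. This is only an assumption about how you might recover, not something the Windows API does for you. The names Snapshot, Changes, TakeSnapshot and DiffSnapshots are invented for this sketch, and it uses the portable std::filesystem library (C++17) rather than Windows calls.

```cpp
#include <filesystem>
#include <map>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper type: file path -> last modification time.
using Snapshot = std::map<fs::path, fs::file_time_type>;

struct Changes
{
    std::vector<fs::path> addedOrModified;
    std::vector<fs::path> removed;
};

// Record every regular file under `root` together with its last-write time.
// Entries that cannot be read are skipped rather than treated as fatal.
Snapshot TakeSnapshot(const fs::path& root)
{
    Snapshot snap;
    std::error_code ec;
    fs::recursive_directory_iterator it(
        root, fs::directory_options::skip_permission_denied, ec), end;
    for (; !ec && it != end; it.increment(ec))
    {
        std::error_code fileEc;
        if (it->is_regular_file(fileEc) && !fileEc)
        {
            fs::file_time_type t = it->last_write_time(fileEc);
            if (!fileEc)
                snap[it->path()] = t;
        }
    }
    return snap;
}

// Compare two snapshots and report what differs between them.
Changes DiffSnapshots(const Snapshot& before, const Snapshot& after)
{
    Changes c;
    for (const auto& [path, time] : after)
    {
        auto found = before.find(path);
        if (found == before.end() || found->second != time)
            c.addedOrModified.push_back(path);
    }
    for (const auto& entry : before)
    {
        if (after.find(entry.first) == after.end())
            c.removed.push_back(entry.first);
    }
    return c;
}
```

In the monitoring loop, you would keep the most recent snapshot and, whenever a call succeeds with zero bytes returned, take a new one and feed both to DiffSnapshots(). A full rescan can be slow on large trees, so it is best kept for the overflow case only.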

Interpreting the output

Once you start to use the ReadDirectoryChangesW API in earnest, it begins to throw up a few conundrums.

1. File or directory?

When you receive a notification, there is nothing in the API to tell you whether that notification pertains to a file or a folder. If the notification arises from an item being modified or created, you can check whether the filename refers to a file or folder by calling PathIsDirectory().

But if the notification is FILE_ACTION_REMOVED, that could relate to a file or a folder. Since it has already been deleted, you can’t tell from the file system a posteriori whether the deleted item was a file or folder.
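One way around this, sketched below, is to remember what each item was while it still exists and consult that record when the removal notification arrives. ItemKind and ItemKindCache are hypothetical names for this sketch. It assumes you call Remember() when you handle FILE_ACTION_ADDED (after checking the path on disk), and that paths are compared exactly as received, which ignores the case-insensitivity of most Windows file systems.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

enum class ItemKind { File, Directory };

// Remembers whether each known path is a file or a directory, so the
// answer is still available after the item has been deleted.
class ItemKindCache
{
public:
    // Call while the item still exists, e.g. when handling FILE_ACTION_ADDED.
    void Remember(const std::wstring& path, ItemKind kind)
    {
        kinds_[path] = kind;
    }

    // Call when handling FILE_ACTION_REMOVED. Returns the recorded kind and
    // forgets the path, or an empty optional if the path was never seen.
    std::optional<ItemKind> Forget(const std::wstring& path)
    {
        auto it = kinds_.find(path);
        if (it == kinds_.end())
            return std::nullopt;
        ItemKind kind = it->second;
        kinds_.erase(it);
        return kind;
    }

private:
    std::unordered_map<std::wstring, ItemKind> kinds_;
};
```

Items that already existed before monitoring started will not be in the cache, so an initial scan of the directory tree would be needed to seed it.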

2. Multiple notifications

In some circumstances, it’s possible to receive multiple notifications about the same file in quick succession. Some of these notifications may be superfluous. For example, if you are synchronizing a local file with a remote server (e.g. a Dropbox-style application), you only need to process/upload that file once.

One possible solution is to check when the file was last modified with the Windows API function GetFileTime(). For example, each time you process a file, you could take a record of the file’s modification time. If you receive a subsequent notification for the same file, you can quickly check if the file’s modification time has actually changed. If not, you can ignore the second notification as there’s no need to process that file again.
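The bookkeeping described above might look something like the following sketch. ModificationTracker is a made-up name, and to keep the example portable it reads the timestamp with std::filesystem::last_write_time() (C++17) instead of GetFileTime(); the idea is the same either way.

```cpp
#include <filesystem>
#include <map>
#include <system_error>

namespace fs = std::filesystem;

// Tracks the modification time each file had when it was last processed.
class ModificationTracker
{
public:
    // Returns true if the file at `path` should be processed, i.e. its
    // last-write time differs from the one recorded previously.
    bool ShouldProcess(const fs::path& path)
    {
        std::error_code ec;
        fs::file_time_type mtime = fs::last_write_time(path, ec);
        if (ec)
            return false;   // e.g. the file was deleted or is inaccessible
        return ShouldProcess(path, mtime);
    }

    // Same check with the timestamp supplied by the caller.
    bool ShouldProcess(const fs::path& path, fs::file_time_type mtime)
    {
        auto [it, inserted] = lastSeen_.try_emplace(path, mtime);
        if (inserted)
            return true;        // first time we have seen this file
        if (it->second == mtime)
            return false;       // unchanged since we last processed it
        it->second = mtime;     // record the new timestamp
        return true;
    }

private:
    std::map<fs::path, fs::file_time_type> lastSeen_;
};
```

In HandleFileEvent(), you would call ShouldProcess() for FILE_ACTION_MODIFIED notifications and skip the file when it returns false.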

3. Unexpected operations

Suppose you want to monitor changes made to a folder containing source code. Some IDEs and editors don’t always behave as you would expect. When a file is saved, you might be surprised to see hidden/temporary files created in the background, and backups of files created.

Often, these files have specific file extensions, so you can suppress these unwanted file notifications by pattern-matching on the filename.
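A simple filter along these lines could be a case-insensitive suffix check, as in the sketch below. The list of suffixes is purely illustrative (an assumption, not a catalogue of what any particular editor produces), and kIgnoredSuffixes, EndsWithIgnoreCase and ShouldIgnore are names invented here.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical ignore list; adjust it to the tools you actually use.
const std::vector<std::wstring> kIgnoredSuffixes = { L".tmp", L".swp", L".bak", L"~" };

// Returns true if `text` ends with `suffix`, ignoring case.
bool EndsWithIgnoreCase(const std::wstring& text, const std::wstring& suffix)
{
    if (suffix.size() > text.size())
        return false;
    return std::equal(suffix.rbegin(), suffix.rend(), text.rbegin(),
        [](wchar_t a, wchar_t b) { return std::towlower(a) == std::towlower(b); });
}

// Returns true if a notification for `filename` should be suppressed.
bool ShouldIgnore(const std::wstring& filename)
{
    return std::any_of(kIgnoredSuffixes.begin(), kIgnoredSuffixes.end(),
        [&](const std::wstring& suffix) { return EndsWithIgnoreCase(filename, suffix); });
}
```

Calling ShouldIgnore(filename) at the top of HandleFileEvent() and returning early when it is true would keep these notifications out of the rest of your logic.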

4. File renaming

When a file is renamed, you’ll get two distinct notifications:

  • FILE_ACTION_RENAMED_OLD_NAME – this tells you a file with the specified name was renamed. E.g. a file that was originally called foo was renamed to something else.
  • FILE_ACTION_RENAMED_NEW_NAME – this tells you that a file was renamed to the specified name. E.g. a file that was originally called something else is now called bar.

Usually, you see these notifications in sequence. So it is reasonable to infer from these two events that the file foo was renamed to bar.

But there’s nothing in the Microsoft documentation that mandates that this will always be the case. So we don’t know for sure if these two notifications can ever be interleaved with other notifications. This could be important if there are a lot of file changes happening at the same time.
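Given that uncertainty, one defensive approach is to hold on to the old name until the matching new name arrives, without assuming that the two notifications are adjacent. The sketch below does this; RenamePairer, Rename and the Action enum are hypothetical names for this example, and the enum is a stand-in for the FILE_ACTION_* constants so that the code does not depend on Windows.h.

```cpp
#include <optional>
#include <string>
#include <utility>

// Stand-in for the FILE_ACTION_* values carried by FILE_NOTIFY_INFORMATION.
enum class Action { Added, Removed, Modified, RenamedOldName, RenamedNewName };

struct Rename
{
    std::wstring from;
    std::wstring to;
};

// Pairs an "old name" notification with the next "new name" notification,
// even if unrelated notifications arrive in between.
class RenamePairer
{
public:
    // Feed notifications in the order received. Returns a completed rename
    // when a new name arrives while an old name is pending.
    std::optional<Rename> OnEvent(Action action, const std::wstring& name)
    {
        if (action == Action::RenamedOldName)
        {
            pendingOldName_ = name;
            return std::nullopt;
        }
        if (action == Action::RenamedNewName && pendingOldName_)
        {
            Rename result{ std::move(*pendingOldName_), name };
            pendingOldName_.reset();
            return result;
        }
        return std::nullopt;   // other events leave any pending old name in place
    }

private:
    std::optional<std::wstring> pendingOldName_;
};
```

This only tracks one pending rename at a time, which matches the sequence usually observed. If you find that renames can overlap in your environment, a more elaborate scheme would be needed.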

Summary

In this post, we have discussed how to use the native Windows API to receive notifications about changes to files detected by the operating system. At first glance, the API is quite complex and fiddly to use. However, once you have a working framework (such as the example provided), it’s possible to customise the event handler HandleFileEvent() to do something more interesting and useful. For example, it could upload changed files to a remote web server, or even form the basis of your own competitor to Dropbox!