Deployment

The topics in this section provide information regarding the deployment of applications built with CBFS Connect.

Topics

Driver-specific

The topics in this section provide information relevant to the deployment of applications built with the classes that include a kernel-mode driver or make use of a third-party kernel driver: CBFS or FUSE.

This information should be reviewed carefully when designing a deployment strategy for such an application, since CBFS Connect's kernel mode drivers and other supplementary DLLs must be distributed along with the application in order for it to function correctly.

Topics

Prerequisites

CBFS and FUSE classes:

The classes create a virtual drive, visible to other processes, which requires a certain level of integration between the classes and the system itself. In order for an application that uses such class to function correctly, the following prerequisites must first be met on the target machine:

  • Windows: The kernel mode drivers must be installed; please refer to the Driver Installation topic for more information.
  • macOS: macFUSE together with its "FUSE Compatibility Layer" must be installed in the target system (including the final end-users' systems). Note that the product does not include macFUSE, it must be downloaded and installed separately.
  • Linux: The system's kernel must have been compiled with support for FUSE. Also, FUSE 2.9 user-mode libraries must be installed in the development system. This can be achieved using the following commands:
    • RedHat/CentOS and derivative Linux distributions: sudo yum install fuse-devel
    • Debian/Ubuntu and derivative Linux distributions: sudo apt-get install libfuse-dev
    FUSE 2 must be installed in target systems, where your application is deployed. Modern versions of Linux do not include FUSE 2 by default, but it can be installed using these commands:

    • RedHat/CentOS and derivative Linux distributions: sudo yum install fuse-libs
    • Debian/Ubuntu and derivative Linux distributions: sudo apt-get install libfuse2

Driver Installation

The topics in this section provide information regarding the deployment of applications built with CBFS or FUSE classes on Windows. The information in these topics should be reviewed carefully when designing a deployment strategy, because CBFS Connect's kernel mode drivers and supplementary dynamic link libraries (DLLs) must be distributed along with the application for it to function correctly.

At a high level, CBFS consists of a kernel mode driver, a helper DLL, and a user mode library; all of which work together in tandem to provide the product's functionality. Therefore, it is necessary to install the CBFS kernel mode driver and helper DLL when deploying an application built with the CBFS Connect user mode library.

The functionality needed to install the above-mentioned modules is included in the user mode library itself, as well as in a separate installer DLL. The drivers directory, located within the product's Python package (see notes below), contains the following files:

{ClassName}.cab Contains the main drivers, PnP bus drivers, helper DLLs, and the supplementary installation/uninstallation files.
installer/{ClassName}Inst.h A header file for the installer DLL. The installer DLL may be used on the target system to install (or uninstall) the items within {ClassName}.cab.
installer/x64/{ClassName}Inst.dll The C/C++ installer DLL for the x64 (AMD64) processor architecture.
installer/x86/{ClassName}Inst.dll The C/C++ installer DLL for 32-bit x86 processor architecture.
installer/ARM/{ClassName}Inst.dll The C/C++ installer DLL for 32-bit ARM processor architecture.
installer/ARM64/{ClassName}Inst.dll The C/C++ installer DLL for 64-bit ARM processor architecture.
*Note: The FUSE class uses the same driver as the CBFS class.

Windows: Note: When the user-mode library is installed or updated on end-user systems, it is required to ensure that the kernel-mode drivers already present in the system are updated to match the version of the installed user-mode library.

Python Package Notes

In the Python edition, the drivers directory described above is included as part of the product's Python package, <install_dir>\cbfsconnect-24.0.xxxx.tar.gz. After installing the package with pip (see the User Mode Library topic), the drivers directory can be found within the product's package directory, which can be located by running pip show cbfsconnect.

Installation and Uninstallation via User Mode Library Methods

The class includes the following methods to install and uninstall the required files; please refer to their documentation for more information:

Important: uninstall must only be used when completely removing the driver. When updating the driver, this method must not be used as it may cause the OS to incorrectly remove the driver on reboot. Please refer to the "Updating the Driver" section, below, for more information.

Installation and Uninstallation via Installer DLL Functions

The installer DLL is a lightweight, stand-alone library that contains only the functionality required for installing and uninstalling the required files. It is available in both 32-bit and 64-bit versions (each of which is capable of installing both 32-bit and 64-bit drivers and helper DLLs); and may be used as desired in installation scripts, setup applications, or any other executable capable of loading dynamically-linked libraries (DLLs).

The functions exposed by the installer DLL mirror the class methods listed above. Each function is available in two forms: those with an *A suffix, which can be used with ANSI/UTF8 strings; and those with a *W suffix, which can be used with Unicode (UTF16) strings.

Updating the Driver

To update the driver, call the install method. The new version of the driver will replace the older version. Please do not call the uninstall method when updating the driver.

Uninstalling the Driver

To uninstall the driver completely, call the uninstall method. If the driver cannot be immediately uninstalled, it will be marked for removal and uninstalled on the next reboot.

Use caution when calling uninstall ; if it gets called and the driver cannot be uninstalled immediately, and then install is subsequently called to install a new version, then upon reboot, the OS will end up uninstalling the newly-installed driver.

Important: The driver should only be uninstalled when the intent is to completely remove it from the system. Do not uninstall the driver to update it.

Reboot Requirements

Depending on the current state of the system, as well as the options chosen when installing or uninstalling the driver, the OS may need to reboot to complete the operation.

For example, the helper DLL must be loaded by Windows File Explorer when it starts, and a reboot or restart of Explorer is required for this to occur. When installing or uninstalling the Plug-and-Play (PnP) drivers, a reboot is almost always requested by Windows.

Always check the return value of the install and uninstall methods/functions; it will indicate whether a reboot is required (and if so, which module(s) required it).

Additional Notes

The OS treats major versions of the driver as separate products; they can operate in parallel and do not share any resources. Old major versions may optionally be removed from the system when calling install by passing the appropriate value for its Flags parameter.

For each major version of the product, only one copy of the driver can be installed at any time. When the driver is being installed, its version is checked, and one of the following three things occurs:

  • If no driver with the same major version is currently installed, then the install procedure installs the driver as a new product.
  • If a driver with the same major version and an older minor version is currently installed, then the install procedure updates the existing driver with the new one.
  • If a driver with the same major version and a newer minor version is currently installed, then the install procedure leaves the existing driver unchanged.

When deploying files to a target system, the CAB file must remain present on the system. This file is required for uninstallation of the driver at a later time.

The product's installation code maintains a ProductGUID-based record of driver installations in the Windows Registry, creating a separate registry entry for each different ProductGUID. When the driver is "uninstalled", the corresponding registry entry is removed. The driver is only removed from the system if there are no entries left in the registry that reference the driver.

Windows 7 and Windows 2008 Server R2

Kernel-mode drivers are signed using the SHA2 algorithm. The original releases of Windows 7 and Windows 2008 Server R2 didn't support SHA2. To be able to load the newest versions of the drivers, the system needs to have certain updates installed. The updates are KB976932 (Service Pack 1 of the mentioned systems) and KB4474419 (Security Update).

Required Permissions

By default, Windows only allows installation and uninstallation of the CBFS Connect system files (kernel mode drivers and helper DLLs) to be performed from a user account which is a member of the Administrators group.

On systems where UAC is enabled, the process responsible for installing or uninstalling the system files must run with elevated permissions. Detection of current privileges and elevation of permissions is not within the scope of the class itself.

Some examples of obtaining the required permissions for driver installation and uninstallation are below.

  • Starting the application which uses the class with the "Run as administrator" option.
  • Modifying the Load and unload device drivers setting in the Local Security Policy under the User Rights Assignment section.
  • Including a manifest alongside the application indicating the requirement for elevated permissions. For instance, if a file MyApp.exe.manifest with the content below exists next to the application MyApp.exe, it will prompt for elevated permissions when started (if required).

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0"> <assemblyIdentity version="1.0.0.0" processorArchitecture="X86" name="ExeName" type="win32"/> <description>elevate execution level</description> <trustInfo xmlns="urn:schemas-microsoft-com:asm.v2"> <security> <requestedPrivileges> <requestedExecutionLevel level="requireAdministrator" uiAccess="false"/> </requestedPrivileges> </security> </trustInfo> </assembly>

User Mode Library

Windows Only: This section only applies to CBFS and FUSE on Windows. The user-mode library must be deployed to end-user systems along with the kernel-mode drivers; the version of the kernel-mode drivers on the end-user systems must be equal to or newer than the version of the user-mode library. Thus, when the user-mode library is installed or updated on end-user systems, it is required to ensure that the kernel-mode drivers already present in the system are updated to match the version of the installed user-mode library.

The user-mode library comes in two pieces, both of which must be deployed with the application:

  • A Python module named cbfsconnect.py.
  • A native dynamic library (unmanaged), named as follows:
    • Windows: pycbfsconnect24.dll (available for x64 and x86 processor architectures)
    • Linux: libpycbfsconnect.so.24.0 (available for x64 and x86 processor architectures)
    • macOS: libpycbfsconnect24.0.dylib (available for x64 and ARM64 processor architectures)

Both the module and the native library are included in the product's Python package, <install_dir>\cbfsconnect-24.0.xxxx.tar.gz, which should be installed using pip: cd C:\path\to\install_dir python -m pip install cbfsconnect-24.0.xxxx.tar.gz

Once the product's Python package has been installed, the module can be imported and used: from cbfsconnect import *. Nothing else is required to deploy the application.

As an alternative to installing the module using pip, you can utilize the built-in setuptools module to package the module for deployment or install it to the machine. python setup.py build --build-lib=<app_dir> The above setup command packages the module and native library for deployment. A folder is created in the app_dir directory with the module and the native library packaged inside.

For CBFS and FUSE classes, Windows Only: Remember to deploy the drivers too, as they are an integral part of CBFS Connect.

General Information

The topics in this section provide general information about various aspects of the product's functionality.

CBFS Topics

The topics in this section provide additional information specific to the CBFS class.

Note: This functionality is only available in Windows.

Topics

Thread Safety of the API

The properties and methods provided by CBFS Connect's APIs fall into one of the following categories when it comes to thread safety:

  • Properties and methods that are intended to be accessed from event handlers (i.e., those documented as such) are thread-safe, since event handlers always execute in the context of worker threads.
  • Methods related to installation, uninstallation, and status retrieval are not thread-safe, since they are expected to be used during the application installation process. (Methods in this category are those which are documented as being "available in both the class API and the Installer DLL".)
  • Any helper methods documented in the class's API (i.e., those used to convert error codes, or provide similar supplementary functionality) don't use any explicit thread synchronization, but also don't access any other class properties or methods, and thus are inherently thread-safe.
  • All other properties and methods are not thread-safe, so applications must employ proper thread synchronization techniques when using them on multiple threads (including, but not limited to, during event handlers).

Event Handling

The topics in this section provide information about event handling.

Note: When events are fired, the system and the kernel modules, such as filter drivers or a filesystem driver, are in the middle of handling a filesystem request (the I/O request packets, or IRP). In this state, it is dangerous to initiate new operations on the drives or filesystems - both those handled by the product and other drives and filesystems. If you need to write data to the file, it is better to do this asynchronously by offloading the task to a separate worker thread and returning immediately. If you need to read data from the local file, in-memory caching of data, performed by a worker thread, is a good idea (if it is possible in your application).

In general, it is relatively safe to access other filesystems than the one to which the request in progress is directed. However, some third-party filesystem filters may be not prepared for such recursive operations and will deadlock. Such filters must be dealt with on a case-by-case basis (usually by disabling them).

To ensure stable operation, it is critical to avoid accessing drives and filesystems recursively. Essentially, this means that event handlers must not perform any operations involving the drive or filesystems that the event fired for (i.e., do not read from/write to files on it, do not unmount the media).

If an event handler detects that the file, for which this handler was called, has been changed externally (not through the virtual drive), this event handler may return the ERROR_FILE_INVALID error code to tell the driver and the OS about the change. This error code causes the on_get_file_info event to be fired, after which, the class will call the notify_directory_change method, so your code doesn't need to call it.

Additional Topics

Contexts

It is often necessary for an application to associate certain information with a given file/directory, file handle, or enumeration operation. To assist developers in doing so in a convenient and performant manner, the CBFS class provides context parameters in a number of events.

A context carries an application-defined value that identifies or points to some application-defined data, and each file/directory, file handle, and enumeration operation has a separate context associated with it. The CBFS class treats context values as opaque; it stores the context values passed to it by the application and ensures that the correct values are exposed again whenever some event fires for a particular file, handle, or enumeration; but it does not otherwise attempt to use said values in any way.

Context Lifetimes

Contexts can be grouped into a few categories, each of which is subject to a different lifetime:

  • File contexts and directory contexts, which are associated with an open file or directory.
  • Handle contexts, which are associated with a specific open file or directory handle.
  • Enumeration contexts, which are associated with an ongoing enumeration.

File/directory contexts are created the first time a file or directory is opened, and they live until the last handle to that file or directory is closed. Handle contexts, conversely, are created every time a file or directory is opened, and they live only until the associated file handle is closed. For example, consider the following sequence of operations:

Operation on File X Context Creations/Deletions Active Contexts
1. Opened by process A File context FX and handle context HXA created FX, HXA
2. Opened by process B Handle context HXB created FX, HXA, HXB
3. Closed by process B Handle context HXB deleted FX, HXA
4. Opened by process C Handle context HXC created FX, HXA, HXC
5. Closed by process A Handle context HXA deleted FX, HXC
6. Closed by process C File context FX and handle context HXC deleted

Note: When the fire_all_open_close_events property is disabled, the on_open_file and on_close_file events are both fired only once (for steps 1 and 6, respectively), which renders handle contexts useless.

File/directory contexts are available in all events corresponding to operations performed on some open file or directory. Handle contexts have similar availability, with the notable exception of the on_read_file and on_write_file events because of behavior caused by the system's memory mapping manager and cache manager. Because these managers act as proxies for file I/O requests, said requests may be buffered initially, and then combined and submitted to the filesystem later in a single request. It is not possible to determine which processes contributed to these "aggregate requests", nor is there any guarantee that any or all of them are still alive, hence, the lack of handle context parameters in the aforementioned events.

Enumeration contexts are created any time a new enumeration operation begins, and they live until the enumeration operation ends.

All contexts, when created, are created before their corresponding "first event" fires (e.g., on_open_file, on_enumerate_directory); and when deleted, they are deleted after their corresponding "last event" fires (e.g., on_close_file, on_close_directory_enumeration). If, however, a context's "first event" fails, whether expectedly (e.g., due to Security Checks) or otherwise (see Error Handling), then that context's value is immediately discarded because its corresponding "last event" will not ever fire.

Additionally, for file and handle contexts, if the route_to_file method is called at some point, then depending on which flags are passed to its Flags argument, the current event may be the "last event" for the current handle context. In such cases, the application must be sure to perform the necessary cleanup actions for the handle context before the event handler finishes. If additional events are expected, it is also reasonable to pass the CBFS_ROUTE_CLOSE_EVENT flag when calling route_to_file so that the on_close_file event will fire again later (at which point, the application can perform the necessary cleanup actions).

Context Use Cases

Contexts are most helpful when used to store information associated with a file/directory or file handle. Typically, contexts are initialized during their corresponding "first event", and then are used in subsequent events to speed up those operations.

For example, when an on_open_file event arrives, an application might wish to obtain, for example, a stream for some real file and to store it (directly or indirectly) in the file context. Then, when the application needs to handle subsequent requests like, for example, on_read_file, it can do so quickly using the stream stored in the file context, instead of having to go through the process of obtaining that stream a second time.

As another example, consider the first time an on_enumerate_directory event fires. In this case, a full list of files can be built, and a reference to it will be stored in the enumeration context. This list can then be used to speed up subsequent firings.

Using Contexts

Python does not have a completely safe way to store object references in contexts either directly or indirectly, which is why all context parameters are integers. To emulate such capabilities, the following approach is recommended:

  1. Create a global dictionary instance for the application (i.e., a singleton), with keys that are integers and values of whatever type is desired.
  2. When the application needs to create a context object in an event handler, a "key" can be created using the hash of the full file/directory name (including path), potentially mixed with additional information.
    • For file contexts, a hash of the full file/directory name is sufficiently unique because the context is exposed in all events pertaining to that file/directory.
    • For handle and enumeration contexts, additional information must be mixed in because multiple handle or enumeration contexts may be present at once for any given file/directory.
  3. Using the created key, add the object to the dictionary.
  4. Set the context parameter to the key used in the previous step.
  5. To access the object in a later event, use the key stored by the context to retrieve the object from the dictionary.

Notes:

  • Applications must take care to enforce proper thread synchronization when accessing the dictionary because events are always fired using worker threads. Please refer to the Threading and Concurrency topic for more information.
  • In 32-bit applications, contexts are stored in 32-bit variables internally, and thus, the higher 32 bits of 64-bit values are lost.

Security Checks

Filesystems play a key role in securing access to the files and directories they contain; in addition to storing the security information associated with each filesystem object, they must also enforce access restrictions based on that information. CBFS-based virtual filesystems, if desired, should generally handle security using one of the following options:

  • For virtual filesystems that identify themselves (via the file_system_name property) as NTFS, security should be handled using standard Windows security mechanisms: access control lists (ACLs).
  • Virtual filesystems handle security by implementing additional custom security checks.

Each security handling method is described in more detail below, followed by additional information relevant for both.

Windows Security Mechanisms (ACLs)

The CBFS class can optionally advertise support for NTFS security attributes by enabling the use_windows_security property (and, as implied earlier, this is highly recommended for virtual filesystems that identify as NTFS). When use_windows_security is enabled, the on_get_file_security and on_set_file_security events will fire any time Windows needs to read or write security information. To correctly implement these events, applications must store and retrieve the OS-provided security information alongside the associated filesystem objects.

Note: Windows does not handle the task of actually checking or enforcing access rights when sending requests to create or open filesystem objects. It is up to the virtual filesystem to do so. Therefore, when such requests arrive, the application must retrieve the previously stored security information for the applicable file or directory, and then use that information to validate whether the process attempting access has sufficient rights to do so.

The aforementioned validation can be done easily using the Windows API's AccessCheck function, which validates a process's access rights against some object's security attributes. Please refer to its documentation for more information. Once the application has performed the access validation, it should either allow or deny the request according to the results.

Custom Security Checks

The CBFS class also offers a number of methods that applications can use to implement their own custom security checks. The two most notable methods are get_originator_process_name, which returns the name of the process that initiated the request, and get_originator_token, which returns the system-defined security token of the process that initiated the request.

The latter is particularly useful when used with various methods in the Windows API, such as the GetTokenInformation function, which can be used to obtain various pieces of information about the object the token is associated with.

Implementing Security Checks

Regardless of which security handling method is used, a number of factors must be considered when implementing security checks in general. First and foremost, for an application to effectively enforce file security, it is important that the fire_all_open_close_events property is enabled so that all file open requests are surfaced (and thus given a chance to be allowed or denied).

Second, consider that effective security enforcement does not require that all event handlers perform security checks. Applications are technically free to deny almost any file-related event because of a failed security check. However, it is not actually meaningful to perform security checks in all event handlers in the first place, and unnecessary security checks will decrease an application's overall performance.

For example, it makes sense to validate access rights in the on_create_file and on_open_file event handlers, but not in the on_read_file or on_write_file event handlers. If an application denies a file create or open request made by some process, then that process will not be able to make a subsequent read/write request. Therefore, applications can safely assume that all read/write requests come from processes whose access rights they have already verified. Similar logic can be applied for directories and directory enumerations. Note: This is not an exhaustive set of use cases; each application's needs will differ.

Third, as mentioned earlier, an application may allow or deny any operation based on security checks. However, it must not alter (or "selectively report") filesystem information based on the results of the validation process (it is "all or nothing"). Said another way: at any given time, and for each CBFS-based virtual filesystem associated with it, an application must expose a single, consistent representation of the filesystem's state to all processes accessing it. The following clarifying examples should help guide implementation:

  • A filesystem object cannot be reported as existent to process A, but as nonexistent to process B.
  • A filesystem object cannot be reported as a file to process A, but as a directory to process B.
  • A filesystem object's metadata, absent of any changes, must be reported consistently between requests; for example, if a file's size is reported as 1 KB to process A, then it must also be reported as 1 KB to process B.
  • A file's contents, absent of any changes, must also be reported consistently between requests; for example, for a file whose size is reported as 1 KB, the application must return exactly 1024 bytes when the file is read, and those 1024 bytes must be exactly the same (including ordering) regardless of which process is doing the reading.

Timeouts

Because the class's events are typically tied directly to requests from the OS itself, it is critical that event handlers complete requests quickly to prevent the system from being blocked. To help prevent such blocking, the CBFS Connect system driver can enforce request timeouts on a per-virtual-drive basis.

A virtual drive's request timeout interval is specified by passing the desired number of milliseconds to the Timeout parameter when calling the mount_media method. The value passed to the Timeout parameter must either be a positive value greater than or equal to 3000, or 0, indicating that timeouts should not be enforced, in which case events may take as long as necessary to execute.

When timeout enforcement is in effect, and an event executes long enough for its timeout to expire, the driver cancels the underlying request by reporting an error to the OS. The tardy event still runs to completion, but any results it returns once finished are ignored because the underlying request has already been handled.

If an event handler knows it will require additional time to complete an operation, it can call the reset_timeout method (before the current timeout expires) to restart the timeout timer.

Applications should always strive to ensure that all event handlers complete quickly, even if request timeouts are disabled. Do not perform time-consuming work, especially network operations, within event handlers; offload such work to background threads. For example, a viable strategy for writing to a remote file would be to pass the data to be written to a background thread, and then to immediately finish the event handler. Similarly, to read from a remote file, an application could pre-cache some of its data with the help of a worker thread, and then return the cached data when requested.

Threading and Concurrency

Through the use of multithreading, the CBFS and FUSE classes provide powerful concurrency features to help applications maximize their performance. For data integrity purposes, the class also strictly enforces the order in which events fire in certain situations and allows applications to specify the extent to which events should be fired concurrently.

Note: Even when configured for minimal concurrency, the class always fires events in the context of some worker thread, and not in the thread the class was originally created on. Therefore, applications must be sure to synchronize operations between event handlers and other threads as necessary (including, but not limited to, calls to the class instance, unless a method is explicitly documented as callable within events).

Configuring Event Concurrency

Generally speaking, the CBFS and FUSE classes will always enforce per-file event serialization; that is, it always fires events relating to the same file in sequence (although technically speaking, there is one optional exception to this behavior, discussed at the end of this section). For example, if there are multiple read or write operations pending against a given file, then an event will be fired for the first operation, and after its event handler has returned, another event will be fired for the second operation, and so on.

With per-file event serialization already ensured, the most important concurrency-related consideration for CBFS- and FUSE-based applications is whether to enforce multifile event serialization as well. The classes' serialize_events property controls whether events relating to different files should be serialized on a single worker thread, or fired in parallel on several threads; by default, this property is set to seOnMultipleThreads, and events for different files are allowed to fire in parallel.

CBFS-specific features

When the serialize_events property is set to seOnMultipleThreads, the MinWorkerThreadCount and MaxWorkerThreadCount configuration settings control the minimum and maximum number of worker threads the class can use for firing events. By default, both are set to 0, which indicates that the class's system driver should automatically choose appropriate values based on how many CPU cores the system has. These settings are both ignored if serialize_events is set to values other than seOnMultipleThreads.

As mentioned previously, the CBFS class offers one feature that alters its otherwise strictly enforced per-file event serialization model: the ability to handle nonoverlapping read operations on a single file in parallel. This functionality is controlled by the serialize_access property, which is enabled by default.

Mounting Points

A mounting point is a name that can be used to access a volume. When the filesystem driver mounts a volume, it must make that volume accessible by creating one or more mounting points for it.

Windows:

Mounting points can be global (visible in all user sessions) or local (visible only to a specific user session). The add_mounting_point method creates global mounting points by default; applications must include the STGMP_LOCAL flag in the Flags parameter value to create local mounting points. (Note: The STGMP_MOUNT_MANAGER flag is not compatible with the STGMP_LOCAL flag.) Local mounting points are not supported by the FUSE class.

When creating a local mounting point, applications can specify a specific user session for it to be visible in by passing that session's Authentication ID for the AuthenticationId parameter (retrieval of Authentication IDs is discussed in a later section). If no Authentication ID is provided (i.e., 0 is passed), the local mounting point is created in the current user session; and if the application does this while running with elevated rights, then the local mounting point will only be visible in the elevated session, and consequently won't be available to applications in other sessions (such as, e.g., Windows File Explorer).

When mounting points are added or removed, a system message (WM_DEVICECHANGE) is broadcast. It instructs Windows File Explorer to refresh the list of drives available. However, these messages cannot cross user session boundaries; so if, for example, the application is running as a service, Windows File Explorer may not receive the broadcast and thus fail to refresh the list of drives. To address this issue, CBFS Connect includes a Helper DLL which, among other things, helps ensure that Windows File Explorer always refreshes the list of drives regardless of which user session the application is running in; please refer to that topic for more information.

Types of Mounting Points

There are a handful of different mounting point types, each of which exposes volumes in a slightly different manner:

  • Drive letter mounting points
  • Folder mounting points
  • Network mounting points (not supported by the FUSE class)
  • UNC path mounting points

Each type of mounting point is discussed in more detail below.

Drive Letter Mounting Points

Drive letter mounting points are one of the more commonly-used mounting point types thanks to users' familiarity with them. To create a drive letter mounting point, pass a string composed of a single character in the A-Z range followed by a colon (e.g., Z:) for the add_mounting_point method's MountingPoint parameter (when using FUSE class, pass the value to the mount method).

If the value passed for the add_mounting_point method's Flags parameter includes the STGMP_AUTOCREATE_DRIVE_LETTER flag, the class will assign a drive letter automatically. In this case, the value passed for the MountingPoint parameter must not include a drive letter.

Folder Mounting Points

A folder mounting point makes a volume accessible through a folder located on another (pre-existing) NTFS volume. Folder mounting points are always visible to all users in the system, and their creation requires administrative privileges.

To create a folder mounting point using the add_mounting_point method, include the STGMP_MOUNT_MANAGER flag in the Flags parameter (in FUSE, this flag is always set), and pass the target folder's absolute path for the MountingPoint parameter (e.g., C:\MountedDrives\MyMountingPoint). The target folder must already exist, must reside on an NTFS volume, and must be empty; otherwise, the call will fail.

Authentication IDs

(not supported by FUSE class)

An Authentication ID is a locally unique identifier (LUID) assigned to a logon session (or, "user session"), retrievable through the access token that represents said session. Applications can obtain the Authentication ID of a session from an access token or by enumerating logon sessions.

To obtain an Authentication ID from an access token, call the Windows API's GetTokenInformation function and pass either TokenGroupsAndPrivileges or TokenStatistics for the TokenInformation parameter. The resulting value will be a reference to a structure (TOKEN_GROUPS_AND_PRIVILEGES or TOKEN_STATISTICS, respectively) containing the needed Authentication ID.

To enumerate logon sessions, use the Windows API's LsaEnumerateLogonSessions function, which returns a list of existing logon session IDs (that is, Authentication IDs). To obtain additional information about a particular logon session (e.g., in order to determine if it's the desired one), use the Windows API's LsaGetLogonSessionData function. Network Mounting Points (not supported by FUSE class)

Network mounting points are similar to other mounting point types, except that the system treats them as "remote devices". This distinction is useful since:

  • Windows File Explorer makes fewer requests for files located on remote devices.
  • Some applications are more tolerant of timeouts and delays when working with remote devices.

Therefore, when an application is designed to work with some slow or remote storage medium, it's recommended that it use a network mounting point. When using network mounting points, it's important that the Helper DLL be used so that Windows File Explorer displays the correct drive status.

To create a network mounting point using the add_mounting_point (for the FUSE class, mount) method, include the STGMP_NETWORK flag in the Flags parameter, and pass a string of the form <Local Name>;<Server Name>;<Share Name> for the MountingPoint parameter.

  • <Local Name> is the name to use for the mounting point on the local system; it can be a drive letter or a name for use in a UNC path. Alternatively, it can be left empty, in which case the volume will only be accessible via the network path (see below) or the drive letter will be assigned automatically if the STGMP_AUTOCREATE_DRIVE_LETTER flag is used.
    • Note: This "local name" is not related to the concept of "local and global mounting points" discussed in the overview.
  • <Server Name> and <Share Name> are used to create a network path of the form \\<Server Name>\<Share Name>. This network path is not shared by default (see notes following examples below).

The set of characters allowed in server names, is defined in this document. The set of characters allowed in share names, is defined in this document.

With the above information in mind, here are some examples of valid MountingPoint parameter values when creating network mounting points:

  • Y:;MyServer;VirtualShare: Creates a network mounting point accessible both via the drive letter Y: and via the network path \\MyServer\VirtualShare.
  • MyMountingPoint;MyServer;VirtualShare: Creates a network mounting point accessible both via the UNC path \\.\MyMountingPoint and via the network path \\MyServer\VirtualShare
  • ;MyServer;VirtualShare: Creates a network mounting point accessible only via the network path \\MyServer\VirtualShare.

As stated above, the network paths created for network mounting points are not shared (i.e., visible to other computers on the network) by default. To have the class create an actual network share when add_mounting_point is called, applications must include either the STGMP_NETWORK_READ_ACCESS or the STGMP_NETWORK_WRITE_ACCESS flag in the Flags parameter value, and use empty string for the <Server Name> segment of the MountingPoint parameter value (the local computer's name is used). Note that when the mounting point is shared in this way, a local resource is created and then shared. The name of the resource is derived from the Share Name defined above. However, the set of allowed characters for such name is not strictly defined. Additionally, sharing is done using a call to NetShareAdd Windows API function, which can be called by Administrators, System Operators, and Power Users.

UNC Path Mounting Points

UNC path mounting points make a volume available via a specific name, and unlike other mounting points types, they are not displayed anywhere in Windows File Explorer; the UNC path must already be known.

UNC path mounting points consist of the \\.\ prefix, followed by a name (e.g., \\.\CBDrive1). The mounting-point-related class methods expect just the name (i.e., the UNC path with the \\.\ prefix omitted). So to add a new UNC path mounting point like, e.g., \\.\CBDrive1, call the add_mounting_point (for the FUSE class, mount) method and pass CBDrive1 for the MountingPoint parameter.

Linux and macOS:

A virtual drive / filesystem is mounted to a directory, which must exist at the time of mounting and be empty; otherwise, the call will fail.

Error Handling

Error Codes

The CBFS and FUSE on Windows communicate errors using the Win32 error codes defined in WinError.h, which is part of the Windows Platform SDK. The CBFS and FUSE on Windows system drivers, however, use NT native error codes, which do not have one-to-one mappings with Win32 error codes. The class includes logic for converting between the two as necessary, and that logic can be updated if any codes are not covered.

Additionally, CBFS and FUSE on Windows use certain error codes in a special manner. For more information about such error codes, please refer to the Error Codes page. In the FUSE class on Linux, errors are communicated using positive and negative Unix/Linux system error codes (e.g., the absence of some file, normally indicated in Linux using the ENOENT error code, may be reported by the Linux FUSE library as -ENOENT and so on).

Reporting Errors to the Class from Event Handlers

If an event has a ResultCode parameter, an event handler can use it to return the result code of the operation to the class. The ResultCode parameter is set to 0 by default, which indicates the operation was successful. In the FUSE class, on all supported platforms, you may use negative Unix/Linux system error codes to report errors (on Windows, Unix system codes will be converted to Win32 error codes implicitly). Also, on Windows, you may return Win32 error codes.

If an unhandled exception occurs in the event handler, it will be caught by the class, which will fire the OnError event.

In some events, the OS does not expect the error code to be returned and either the class or the OS ignores the returned error code. Please refer to the description of a particular event for more information.

How to Handle Errors Reported by the Class

If an error occurs, the class will throw an exception. The code property of the exception object will contain an error code, and the message property will contain an error message (if available).

Extended Logging in Windows

Some class methods in CBFS Connect are capable of writing extended information about reported errors to the Windows event logs, which can be viewed using the system's eventvwr.exe tool. The user mode part of the class writes to the "Windows Logs \ Application" folder, while the kernel mode part writes to the "Windows Logs \ System" folder.

The information written in the extended logs is meaningful to the Callback Technologies development team, but not to end-users, so extended logging is disabled by default. If issues occur during the installation of the CBFS Connect system drivers, or while using the class, please do the following:

  1. Enable extended logging (see below).
  2. Replicate the issue.
  3. Using Event Viewer (eventvwr.exe), export the event log entries from the locations mentioned above in native format (please restrict the scope of the export to just those entries related to CBFS Connect).
  4. Submit an issue report that includes the exported file.

There are two ways to toggle extended logging for a class:

  1. By toggling the class's LoggingEnabled configuration setting.
  2. By adding a DWORD-typed value named Enabled to the HKEY_LOCAL_MACHINE\SOFTWARE\Callback Technologies\{ClassName}\EventLog registry key and setting it to 0 (disabled) or 1 (enabled).
    • Replace the {ClassName} part of the registry key path with the name of the applicable class.
    • If this registry key, one of its parents, or the actual value does not exist, please create it manually.

    Note: If your code runs in emulated mode (x86 mode on x64 or ARM64 architecture), you need to add the value to the HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Callback Technologies\{ClassName}\EventLog registry key in addition to the "main" registry key.

The system must be rebooted anytime extended logging is enabled or disabled to make the changes take effect.

Troubleshooting

Note: The below information applies to the operations of the class on Windows .

CBFS Connect is a complex product that operates in both user mode and kernel mode simultaneously; so when a serious issue occurs, it's critical that we are able to obtain sufficient information about the circumstances of the failure.

In order to help us assist you in a more expedient manner, please collect the information described in the instructions below when reporting a serious issue (i.e., one that causes the system to crash or hang). Our development team cannot effectively diagnose such issues without this information.

Also, please note that these sorts of issues commonly involve environmental differences and other factors that are either unforeseen or otherwise out of our control. It is also not unheard-of for a crash to appear attributable to one thing while in fact being caused by something completely different. Rest assured that we are committed to assisting you as best we can, and we thank you ahead-of-time for your patience and understanding throughout the support process.

System Crashes (BSODs)

If you encounter a consistently-reproducible system crash (BSOD) that you suspect may be due to CBFS Connect, please obtain a crash dump and include it when reporting the issue to us. Our development team is unable to diagnose system crashes without the information these dumps contain.

Ensure that your system is set up to generate crash dumps, and to not restart automatically after a crash, by following the steps found in Microsoft's Enabling a Kernel-Mode Dump File article. The options available in the memory dump dropdown vary depending on your version of Windows; please choose the first one from the following list that is present in yours:

  • Complete
  • Full
  • Automatic
  • Kernel

Once your system is set up to generate crash dumps, perform the same action that caused the BSOD originally to trigger the crash again. When it occurs, be sure to copy the information on the BSOD screen exactly so that it can be included in your submission (a picture of the screen in which all of the information is legible is also acceptable). Here are some examples of the specific information we're looking for:

Recent versions of Windows:

What failed: cb***24.sys Stop Code: FILE_SYSTEM

Older versions of Windows:

STOP: 0x00000022 (0x00240076, 0xF7A07AA8, 0xF7A077A8, 0xF7800C82) cb***24.sys - Address F7800C82 base at F77CD000, DateStamp 447d6975

After you've copied this information, reboot and check that the memory dump file was created at %SYSTEMROOT%\MEMORY.DMP (typically this is C:\Windows\MEMORY.DMP; if you changed the dump file location in the crash dump settings, check the location you specified instead). It will be a very large file that is too big to attach to an email, so please upload it to a file sharing site of your choice and generate a sharing link that our development team can use to download it.

Finally, submit a support issue to us that includes the link to your dump file, all of the information from the BSOD screen (if you took a picture, attach it or provide another sharing link), a description of how the BSOD was triggered, and any other information that you feel is relevant.

System Hangs

If you encounter a consistently-reproducible system hang that you suspect may be due to CBFS Connect, you'll need to collect the same information as described above. But in order to obtain a crash dump, you'll first need to configure your system so that you can trigger a crash from the keyboard once it hangs. To make this possible, follow these steps (adapted from Microsoft's Forcing a System Crash from the Keyboard article):

  1. First, using the instructions provided in the section above, configure your system to generate crash dumps, and to not restart automatically after a crash.
  2. Next, you must enable keyboard-initiated crashes in the registry by creating a new value named CrashOnCtrlScroll, and setting it equal to a REG_DWORD value of 0x01, in all of the following registry keys:
    • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\i8042prt\Parameters
    • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\kbdhid\Parameters
    • HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\hyperkbd\Parameters
  3. Finally, you must restart the system in order for these settings to take effect.

After these steps are complete, you'll be able to trigger a keyboard-initiated crash by using the following hotkey sequence: hold down the Right CTRL key, and press the SCROLL LOCK key twice.

At this point, you can perform the same action that caused the system to hang originally to trigger the hang again. Once the system hangs, use the hotkey sequence to force it to crash, and then follow the rest of the instructions from the section above to collect and submit the necessary information.

Method Never Returns

If your application makes a call to some CBFS Connect method and that method never returns, there's a high chance that the driver deadlock has occurred. This can be caused by various factors; to determine the reason and possible solutions, we need a memory dump, as described above. If your system remains functional, you may also use easier ways to initiate a crash:

  1. Use the NotMyFault tool by SysInternals to initiate a system crash. This is the preferred way because it generates Complete memory dumps. This tool has a self-explanatory GUI.
  2. Use the LiveKD tool by SysInternals to create a live memory dump without crashing a system. The disadvantage of this method is that it creates Kernel dumps that are missing user-mode information. To use LiveKD on 32-bit systems, use this command: "livekd.exe -ml -o memory.dmp". On 64-bit systems, the command line would be "livekd64.exe -ml -k .\kd64.exe -o memory.dmp".

Application Crashes

Sometimes, an application crashes while the OS continues to operate, and the name of one of the modules of CBFS Connect is present in the crash information. A crashing application can be the one that uses CBFS Connect or some third-party process. If the crash occurs repeatedly, it is possible to make use of a User-Mode Crash Dump to locate or narrow down the source of the crash. Generation of crash dumps is disabled by default. Before you reproduce the crash, you need to Enable Collecting User-Mode Crash Dumps.

After you enable the crash dump, you don't need to reboot, you can proceed to reproduction of the crash immediately. After the crash re-occurs, you can pick the dump file from its location. The default locations of user-mode dump files are:

  • For regular applications: %LOCALAPPDATA%\CrashDumps
  • For System services: %WINDIR%\System32\Config\SystemProfile
  • For Network and Local services: %WINDIR%\ServiceProfiles
If you changed the dump file location in the crash dump settings in the Registry, check the location you specified instead.

A crash dump can be a large file (depending on the settings) that is too big to attach to an email, so please upload it to a file sharing site of your choice and generate a sharing link that our development team can use to download it.

Finally, submit a support issue to us that includes the link to your dump file, a description of how the BSOD or a manual crash was triggered, and any other information that you feel is relevant.

Helper DLL

The Helper DLL is integrated into Windows File Explorer and offers functionality designed to provide users with a consistent and pleasant experience. It is recommended that the Helper DLL be installed alongside the system driver, which can be accomplished by including the MODULE_HELPER_DLL flag when calling the install method.

The Helper DLL is distributed in the same .cab file as the system driver; its name is CBFSShellHelper24.dll, and it is shipped in both 32-bit and 64-bit variants. Its functionality is described below.

Mounting Point Change Broadcasts

Anytime a mounting point is added or removed, the system driver will send a notification to the Helper DLL, which then broadcasts a system message instructing Windows File Explorer to refresh the list of drives. Without this functionality, Windows File Explorer will not refresh the list of drives if a mounting point is added or removed from a Windows service or another user session.

Network Mounting Point

When a network mounting point is used, the Helper DLL provides the functionality that allows Windows File Explorer to correctly display the current status of, and interact with, the virtual drive. Without this functionality, the virtual drive will display as "Disconnected", which may result in unexpected behavior.

Custom Icons

When custom icons are used for a virtual drive, the Helper DLL ensures that they are properly displayed in Windows File Explorer.

Caching

Caching is a mechanism used to increase performance by optimizing access to commonly used data or information. The CBFS class provides caching functionality for three distinct areas:

  • File data caching
  • File metadata caching
  • Nonexistent file information caching

Each virtual drive can be individually configured to use all, some, or none of these caches; please refer to the following sections for more information about each one.

File Data Caching

The CBFS class can employ a file data cache to help decrease the on_read_file and on_write_file events' rate of fire. Three different file data cache implementations are available: (1) the Windows file management system cache, which is built into Windows; and the internal (2) kernel-mode and (3) user-mode caches, which are implemented using the CBFS class and its system driver.

At a basic level, all three cache implementations in CBFS work by storing blocks of file data in memory pages (which may, or may not, be swappable), and use some sort of least recently used (LRU)-based cache replacement algorithm to decide which pages to discard first (when necessary). Implementation-specific information is discussed in more detail in the following.

Additionally, an application may use the CBCache class for caching and buffering file data on the disk for further transfer to and from external or remote locations.

For each virtual drive, the file_cache property controls whether file data caching is enabled and specifies which file data cache implementation to use; by default, file data caching is enabled, and the system's cache implementation is used.

The FileCachePolicyWriteThrough configuration setting controls the behavior of the driver with regards to writing file data. When the setting is enabled, the driver will fire on_write_file as soon as the data have been cached to ensure that it is written out to storage as soon as the write request is received. When the setting is disabled (default), the data are cached, and flushed out to storage later; this causes write operations to appear to finish quickly, but since some operations (e.g., file copying) require a flushing step, they can appear to get "stuck" at the last moment as flushing occurs.

If the writing process opens a file with FILE_FLAG_WRITE_THROUGH option, the system and the driver behave as if FileCachePolicyWriteThrough were set.

System Cache Details

By default, the CBFS class leverages Windows' built-in file management system cache, offloading the job of file data caching to the OS entirely. For CBFS-based virtual drives configured to use the system cache, Windows automatically caches file data in a dedicated area of the system memory region. This caching operates on a per-file basis, and the overall cache size is managed automatically by the system based on current memory pressure. The system automatically balances the rate at which cached data are flushed to storage for optimal system performance. Together, these attributes ensure access to frequently read file data are as fast as possible. Please refer to Microsoft's File Caching article for more information about the system cache.

The system cache, by virtue of being a built-in Windows mechanism, is faster, more robust, and more flexible than either of the CBFS-implemented file data caches. Therefore, unless there are specific reasons to do otherwise, applications should always prefer the system cache over either of the internal implementations.

Kernel Mode Cache Details

The internal kernel mode file data cache uses a nonpaged (i.e., nonswappable) memory region to cache file data. As with the system cache, this approach guarantees high-speed access to cached data because it is guaranteed to remain in-memory. However, there are two caveats to be aware of regarding this memory region:

  • The memory region is limited to a finite size that cannot be increased. On modern systems, this is typically 300 MB, but the exact amount depends on how much physical memory the system has.
  • Because the memory region is allocated by the CBFS system driver, it is shared between all CBFS-based virtual drives on the system configured to use kernel mode file data caching.

If the kernel mode cache's size limit is reached, pages are freed in LRU order. Pages are also freed if they have not been used for 120 seconds.

User-Mode Cache Details

The internal user-mode file data cache uses pageable (i.e., swappable) memory regions, allocated per-process, to cache file data. Caching the file data in each process's memory space means the cache size can be as large as desired. As with the internal kernel mode cache, however, there are some caveats to keep in mind:

  • Because the memory regions are allocated per-process, all CBFS-based virtual drives within a given process (that are configured to use user-mode file data caching) share that process' memory region.
  • Because the memory is pageable, the OS can swap any page out to storage at any time, causing the next access of such pages' data to be slower.
  • CBFS locks recently used pages to prevent them from being swapped out, but the total size of all locked pages' data is limited to about 300 MB (and this limit is shared among all of the CBFS-based virtual drives on the system that are configured to use user-mode file data caching).

The UserModeFileCacheSize configuration setting controls the size of the user-mode file data cache. Because the memory is allocated per-process, this setting's value is synced between all CBFS class instances in a process and thus can be changed only when the user-mode cache is not in use by any virtual drive in the process (please refer to its documentation for details).

If the user-mode cache's size limit is reached, pages are freed in LRU order.

File Metadata Caching

The CBFS class can employ a file metadata cache to help decrease the on_get_file_info events' rate of fire. The file metadata cache stores entries containing file names, times, and attributes, which the class uses to automatically respond to file information requests (rather than firing the aforementioned events).

The metadata_cache_enabled property controls whether file metadata should be cached for a virtual drive; it is enabled by default. The metadata_cache_size property specifies the maximum number of cache entries.

While a file or directory is open, its metadata is kept available in a special record called a File Control Block, regardless of whether or not the metadata cache is enabled.

Nonexistent File Information Caching

The CBFS class can also employ a "nonexistent files" cache to further decrease the on_get_file_info event's rate of fire. The nonexistent files cache stores entries representing files that are known to be nonexistent (based on previous responses to the on_get_file_info event) so that the class can automatically respond to information requests for such files.

The non_existent_files_cache_enabled property controls whether nonexistent file information should be cached for a virtual drive. The non_existent_files_cache_size property specifies the maximum number of cache entries.

Docker Containers

CBFS Connect can be used to create virtual drives within Docker containers on Windows. To operate correctly the following conditions must be met:

  • The CBFS drivers are installed on the host machine.
  • The container uses Process isolation (Hyper-V isolation is not supported).

Isolation Modes and Windows Containers

Docker containers on Windows run in two isolation modes: Process isolation and Hyper-V isolation. These isolation modes are similar in that containers do not share resources with one another, but they differ in how much interaction the containers have with the host kernel. Process isolation containers share their kernel with the host system and allow for the isolation boundary to be crossed upon a user request. Containers running in Hyper-V isolation mode do not share a kernel with the host and are not supported.

The CBFS drivers need to be installed only on the host. The application creating the virtual drive can run either in the container or on the host.

Virtual Drives

Virtual drives can be created in a variety of ways. Common scenarios include creating virtual drives from within a container, or while running on the host, creating a virtual drive for use within a container.

To create and use a virtual drive within a container no special settings are required, as shown in the following example:

//Create a virtual drive within a container //This code runs in a container. cbfs.AddMountingPoint("Z:", STGMP_SIMPLE, 0);

When mounting a drive within a container, the host will not have access to the drive. If, however, a network mounting point with a UNC (universal naming convention) path is created, the drive will be accessible as a regular network share to applications running on the host or in other process-isolated containers in the same system.

When running on the host, set the DockerContainerId setting to the container Id to specify the container where the drive will be created. The container Id is either the 12-character short Id or the full 64-character Id of the container. See the following example:

//Create a virtual drive form the host, for use by a container //This code runs on the host, and must specify a container Id cbfs.Config("DockerContainerId=4c01db0b339c"); cbfs.AddMountingPoint("Z:", STGMP_SIMPLE, 0);

Mounting Point Restrictions

Certain mounting point types are not supported depending on where the mount operation takes place. An application that is running on the host cannot create a network mounting point (STGMP_NETWORK) for a virtual drive that has been created in a container. STGMP_MOUNT_MANAGER mounting points are not supported in containers, regardless of whether the process performing the mount operation is running on the host or within the container.

Advanced Features

The topics in this section provide information about various advanced features of CBFS Connect.

Topics

Custom Drive Icons

Virtual drives created with the CBFS class can have a custom icon associated with them to better distinguish them in Windows File Explorer. There are a few different ways to accomplish this:

Using Additional Files

If placing additional files into the virtual drive itself is an acceptable condition, Windows provides a couple of file-based mechanisms for specifying a custom icon.

To specify a custom icon for the virtual drive itself, an autorun.inf file can be created based on the information in Microsoft's Autorun.inf article.

Additionally, custom icons can be specified for subdirectories of the virtual drive using desktop.ini files, which can be created based on the information in Microsoft's Desktop.ini article. Note that desktop.ini files cannot be used to specify a custom icon for the root directory of the virtual drive (i.e., they cannot be used to change the icon of the virtual drive itself).

Using Registry Keys

If the virtual drive is assigned a persistent drive letter, using registry keys to assign a custom icon may be a good option. To specify a custom icon using the registry, create a subkey like {DriveLetter}\DefaultIcon (e.g., K\DefaultIcon) under one of the following keys:

  • HKEY_CURRENT_USER\SOFTWARE\Classes\Applications\Explorer.exe\Drives, if the custom icon should only be used for the current user.
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\DriveIcons, if the custom icon should be used for all users. (Note that manipulating anything under the HKEY_LOCAL_MACHINE registry hive requires administrative rights.)

Regardless of which key it is created under, the {DriveLetter}\DefaultIcon subkey's (Default) (i.e., "unnamed") value should then be set to the absolute path of the icon file.

Please note that custom icons specified in this manner are only effective so long as the drive letter assigned to the virtual drive remains unchanged over time; if its drive letter changes, the registry keys used to specify the custom icon will need to be updated accordingly.

Using the Class and its Shell Helper DLL

As long as the Helper DLL has been installed to the system using the install method, custom icons can be assigned to a virtual drive directly using the class. This method of specifying custom icons is especially valuable when project constraints preclude placing additional files into the drive or modifying the registry.

Custom icons assigned in this manner function a bit differently than those assigned using the two methods described above, as they are implemented using Windows' icon overlay mechanism. Consequently, the custom icons are restricted to 25% of the original icon's area (except for 16x16 icons); the tables below describe the required sizes and color levels of the assets in the icon file.

Overlay icon sizes map as follows:

Main Icon SizeOverlay Icon Size
16x1610x10
32x3216x16
48x4824x24
256x256128x128

Icon assets must have the following color levels:

Icon SizeColor Level
16x1616 colors
32x3216 colors
48x48256 colors
256x25632-bit color

Because it's possible to specify multiple different overlay icons (e.g., to represent different drive states), icons are assigned through the class using a two-step process:

  1. Register the desired icon(s) using the register_icon method. (Note that administrative rights are required to execute this method successfully.)
  2. Switch between the registered icon(s) using the set_icon and reset_icon methods.

Icons are copied to a temporary location when registered; and removed from said location when unregistered using the unregister_icon method.

It is important to keep in mind that Windows limits the number of registered overlay icons to 15 (this is a global limit for the entire system, and it cannot be changed). Since other applications on the system (e.g., OneDrive, Dropbox, etc.) may have registered multiple overlay icons, it's not uncommon to get into a situation where various applications are competing to have their overlay icons registered.

Overlay icons are registered by placing values in the following keys in the Registry:

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers
  • HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\explorer\ShellIconOverlayIdentifiers (64-bit Windows only)
If necessary, it's up to the application (or better yet, the user) to decide whether or not to remove other entries; however, doing so too aggressively will likely have a negative impact on the user's experience with other applications.

Hard Links

For filesystems that support them, hard links are essentially arbitrary names used to expose a particular file's data at one or more locations within the filesystem's directory structure. For example, although not typically referred to as such, a file's "original" file name is technically a hard link in such a filesystem.

Expanding on the previous example, hard links could be thought of as "additional" names for the "original" file (although a file's "original" name technically is not any more important than any "additional" names for it that may exist elsewhere in the filesystem). Regardless of the order in which they are created and deleted, or their location within the directory structure, all of the names (i.e., hard links) for a particular file are equivalent because they all reference the exact same file data.

Because of this equivalence, a CBFS-based virtual filesystem that indicates support for hard links (see below for details) must handle file deletion requests carefully. When such a request arrives, the application must check how many hard links to the actual file data remain after removing the requested one; the file data should not actually be deleted until the last hard link has been deleted.

Note: Hard links, by definition, can refer only to files. Directories only ever have one name, in one location, within a CBFS-based virtual filesystem (to avoid recursion).

Hard links are a powerful and complex feature of Windows filesystems; and for clarity and brevity, this overview omits many details not relevant to understanding how the CBFS class interacts with them. To learn more about hard links, please refer to Microsoft's Hard Links article.

Supporting Hard Links with CBFS

In CBFS, the use_hard_links property is used to indicate whether a virtual filesystem supports hard links; it is disabled by default. Virtual filesystems that do advertise hard link support must also support file Ids and Id-to-name resolution.

When hard link support is enabled, the following events must be handled by the class:

Additionally, the on_get_file_info event handler must always be sure to return the number of hard links a file has by setting the HardLinkCount event parameter appropriately.

Finally, as described previously, the application must be careful to delete file data only after the last hard link to a file has been deleted.

Reparse Points

For filesystems that support them, reparse points are essentially storage containers for some application-specific data associated with a file or directory. The format of the data held by a reparse point is determined by the application that stored it; the data are treated as opaque by the OS, which in turn delegates the interpretation of it to some filesystem filter. The notable exceptions are symbolic links and NFS links, implemented using reparse points and handled by the OS and a filesystem.

To assist the OS in determining which filesystem filter to use, each reparse point includes a reparse tag that uniquely identifies the data's format. When the filesystem encounters a reparse point, it attempts to find the filesystem filter associated with the data format identified by this tag. If a matching filesystem filter is found, it is invoked to process the file/directory using the reparse point's data.

Some reparse tags are predefined by Microsoft, and others are third-party-defined (although they are still assigned by Microsoft to ensure uniqueness). All Microsoft-defined tags are listed in this MSDN article; there is no central list of third-party tags.

A complete reparse point (e.g., tag, data) is always represented as either a REPARSE_GUID_DATA_BUFFER structure, or (for certain Microsoft-reserved tags) a REPARSE_DATA_BUFFER structure. That said, the CBFS class interacts with reparse points in a limited fashion, and its API exposes reparse point structures as nothing more than opaque blobs of data for the application to store and retrieve.

Reparse points are a powerful and complex feature of Windows filesystems; and for clarity and brevity, this overview omits many details not relevant to understanding how the CBFS class interacts with them. To learn more about reparse points, please refer to Microsoft's Reparse Point articles.

Supporting Reparse Points with CBFS

In CBFS, the use_reparse_points property is used to indicate whether a virtual filesystem supports reparse points; it is disabled by default. When a virtual filesystem has reparse point support enabled, the associated CBFS-based application is responsible for storing reparse points created within the virtual filesystem.

When reparse point support is enabled, the following events must be handled by the class:

Folling is the generalized flow for reparse point support:

  1. When a reparse point is created somewhere within the virtual filesystem, the on_set_reparse_point event will fire. To properly handle this event, the application must store the data provided in the ReparseBuffer parameter in some persistent manner (this data should not replace other file/directory data; it should be stored in addition to such data). In addition, it must record (in some manner) the fact that the file/directory specified by the event has a reparse point associated with it.
  2. Any time information about a file/directory is requested via the on_enumerate_directory and/or on_get_file_info events, the application must check whether a reparse point is associated with it. If so, the FILE_ATTRIBUTE_REPARSE_POINT (0x00000400) flag must be included in the returned Attributes event parameter value.
  3. Any time the OS encounters a file/directory with the aforementioned file attribute flag set, it will request the reparse point data before continuing so that it knows how to process the file/directory correctly. Such requests cause the on_get_reparse_point event to fire, at which point the application must return the requested reparse point data.
  4. When a reparse point is deleted somewhere within the virtual filesystem, the on_delete_reparse_point event will fire. At this point, the application should delete the data it had stored for that reparse point and should update its records for the file/directory accordingly (to indicate that a reparse point is no longer associated with it).

Please refer to each of the above events' documentation for more information about how to properly handle each scenario.

File IDs

File Ids are used by some system components (e.g., NFS sharing) and third-party applications as an alternative to file names. Ids associated with files must be unique within a filesystem and should not change over time.

The root directory (\) always uses the predefined Id 0x7FFFFFFFFFFFFFFF.

For proper File Id support, an application must do all of the following:

If an Id, returned during directory enumeration, is 0, the class will generate an Id. This generated Id is not guaranteed to be unique, especially under a heavy load. File Ids, including the generated ones, are kept in the metadata cache. If the metadata cache is disabled and Ids are not provided by an application, the driver will fail to provide proper File Ids and will open files by Id, which can prevent operations of components and applications that rely on File Id support.

Short File Names

The CBFS class includes optional support for short filenames (also known as 8.3 filenames), which allows files to be accessed using the 8.3 name format: an eight-character name followed by a three-character extension, separated by a period, containing no spaces. A CBFS-based virtual filesystem may need to enable this feature, for example, for its files to be accessible to some older, third-party software that only knows of the 8.3 name format.

In CBFS, the use_short_file_names property is used to indicate whether a virtual filesystem supports short filenames; it is disabled by default. Virtual filesystems that do advertise short filename support must also:

  • Be able to generate short filenames that uniquely identify any and all files.
  • Be able to map between files' normal (i.e., long) and short names, in both directions.
  • Be sure to return files' short names when handling the on_enumerate_directory and on_get_file_info events.
  • Be able to handle a short filename being provided (instead of a normal one) in any event.

For more information about short filenames, please refer to this section of Microsoft's file naming article.

Symbolic Links

Symbolic links are implemented in Windows as reparse points with a dedicated reparse tag. Such reparse points' data contains the name of the target of the symbolic link (i.e., the file or directory referenced by the symbolic link).

To support symbolic links, an application must do the following things:

  1. Enable reparse points by setting the use_reparse_points property to True and implementing the corresponding events:
  2. For files and directories that are symbolic links, include the FILE_ATTRIBUTE_REPARSE_POINT flag in the attributes of the corresponding file or directory when handling the on_enumerate_directory and on_get_file_info events.
  3. When handling the on_open_file event, check whether the Attributes event parameter contains the FILE_FLAG_OPEN_REPARSE_POINT flag. If the flag is present, the application must open the target instead of the file itself. If the flag is not present, the application must act as if the file did not have a reparse point attached to it.

Additionally, for Network-type mounting points (i.e., those for which STGMP_NETWORK was included in the value passed for the add_mounting_point method's Flags parameter), symbolic links will work only if support for symbolic links on network drives is enabled in the system. This can be done by running the following commands in an elevated command prompt:

  • fsutil behavior set symlinkEvaluation R2R:1
  • fsutil behavior set symlinkEvaluation L2R:1
  • fsutil behavior set symlinkEvaluation R2L:1

Re-use of Process IDs

When using various rules that are based on process IDs (PID), you need to be aware that Windows tends to reuse PID numbers. Once the process with a certain PID is finished in any way, Windows can re-use this PID for another process being started. And it does reuse PIDs quite frequently for the purpose of keeping PID numbers low.

Such reuse can cause unexpected and sometimes unpleasant consequence for your application. To counteract it, you can take one or both actions:

  1. Open a handle to the process with the needed PID and not close it as long as your rule exists. Windows documentation states that as long as there exists an open handle to a process, its PID is not reused.
  2. Track completion of the process with the given PID (either by monitoring the state of the process by its handle or using CBProcess component of the CBFS Filter product) and once the process is finished, delete the corresponding rule.

CBCache Topics

The topics in this section provide additional information specific to the CBCache class.

Topics

File Ids

Each file in the cache must have a unique file Id that identifies it. File Ids are strings of unlimited length that may contain any character; the CaseSensitive parameter of the CBCache class's open_cache method determines whether the class will treat file Ids in a case-sensitive or a case-insensitive manner.

File Sizes

For each file in the cache, the CBCache class keeps track of an application-defined file size value that indicates the total size of the file (i.e., the size of the real file outside of the cache, not the amount of data cached for the file).

The CBCache class will not change this value by itself. It is up to the application to set it. If True is passed for the FetchMissingData parameter when a file's size is changed using the file_set_size method, the CBCache class will fire the on_read_data event to fetch additional data from external storage so that it may be cached.

File Tags

CBFS Connect allows applications to attach arbitrary metadata to any cached file as tags. Such metadadata can be used to store supplementary information (e.g., file metadata or some file handle).

File tags are stored as key-value pairs. They use numeric Ids as keys and store raw binary data.

  • Valid Id values are those in the range 0x0001 to 0xCFFF (inclusive).
  • A tag should contain at least one (1) byte of data.
  • The maximum size of a file tag's binary data is 65531 bytes.

Each cached file can have up to 53247 raw file tags attached to it at one time. The following methods are used to manage and interact with file tags:

Local Files

Local files are files created in the cache that do not have corresponding files in an external storage. Local files are stored wholly within the cache, and the cache will never attempt to read data into them from external storage, or flush their data out to an external storage.

Typically, applications make use of local files in cases in which certain files are known to be temporary or for some other reason that they should not be written to an external storage. For example, the Windows Thumbs.db and desktop.ini files and the macOS .DS_Store files. It is up to the application to determinate the state of, and subsequently manage, all local files.

Orphan Files

Files that are not currently open, yet have unflushed data, are referred to as orphan files. Typically, orphan files appear when an application is closed or terminated before all file modifications have a chance to be flushed to external storage.

To recover orphan files, applications should enumerate them by passing the ENUM_MODE_ORPHAN flag to the enumerate_cached_files method. The application is then free to open, flush, or delete these files as needed.

Reporting Transfer Errors

If the on_read_data or on_write_data event reports a RWRESULT_FILE_FAILURE result, the class will not send any more requests for the specified file until the application calls reset_error_state to reset the error state of the file.

If the on_read_data or on_write_data event reports a RWRESULT_PERMANENT_FAILURE result, the class will not send any more requests for any file until the application calls reset_error_state to reset the global error state.

Cache Corruption

Vault files used by CBCache have a rather complex internal structure that may become corrupted if the class is not deactivated properly or if some operation is interrupted. Typically, such things are caused by an application crash or a system crash. Corruption also can occur if a cache's raw data is modified externally, either intentionally or because of storage failure.

Journaling

To reduce the chances of cache corruption in the event of a crash, CBCache can make use of journaling. Journaling works by wrapping vault modification operations in transactions, as follows:

  1. A new transaction is opened by writing information about a change to a journal located within the vault.
  2. The changes themselves are written to the vault.
  3. The transaction is committed by writing another entry in the journal.

If a crash occurs, any interrupted modification operations will appear in a cache vault's journal as a pending transaction. The next time CBFS Connect opens the cache vault, it will discover any pending transactions and automatically recover them. During the transaction recovery process, each transaction is either committed or rolled back, depending on its last known state.

Overall, journaling is an effective technique used to maintain data integrity. However, keep the following considerations in mind:

  • When journaling is enabled, all file data changes incur additional write operations; this has a significant impact on overall write performance.
  • Journaling does not provide any kind of data redundancy or consistency; it cannot protect against corruption caused by bit-rot, storage failures, or external modification of a vault's physical data.

CBFS Connect implements journaling as an operational mode rather than a cache vault attribute, so there is no such thing as a "journaled cache vault" or a "non-journaled cache vault". Applications control whether the journaling mode is used by setting the JournalingMode configuration setting before activating the class.

The class will always perform transaction recovery when the cache is opened (if there are pending transactions in its journal) , even if journaling is disabled.

Error Handling

Reporting Errors to the Class from Event Handlers

If the event has a ResultCode parameter, the event handler should use it to return the result code of the operation to the class. The ResultCode parameter is set to 0 by default, which indicates the operation was successful. If a problem occurs while handling the event, set the ResultCode parameter to one of the following values to inform the class about it:

RWRESULT_SUCCESS0x00000000Success.

Returning this code indicates that all data were successfully transferred.

RWRESULT_PARTIAL0x00000080Partial success.

Returning this code indicates that some, but not all, data were successfully transferred. i.e., when BytesRead < BytesToRead (for on_read_data), or BytesWritten < BytesToWrite (for on_write_data).

RWRESULT_RANGE_BEYOND_EOF0x00000081Specified range is beyond the end of the file (EOF).

For the on_read_data event, returning this code indicates to the cache that there is no need to request data beyond (Position + BytesRead), and that the tail should be zeroed instead.

RWRESULT_FILE_MODIFIED_EXTERNALLY0x00000082The specified file's size was modified externally.

Returning this code indicates that the operation was not completed successfully because of this conflicting situation.

RWRESULT_FILE_MODIFIED_LOCALLY0x00000083The requested file's size was modified locally.

RWRESULT_FAILURE0x0000008DThe operation failed for some transient reason.

Returning this code indicates that, even though the current request failed, it is safe to continue sending requests for both the specified file and others. (Other operations may continue.)

RWRESULT_FILE_FAILURE0x0000008EThe operation failed for some external-file-related reason.

Returning this code indicates that some failure related to the external file has occurred, and it is expected that any further requests made against that file will also fail. The class will not send any more requests for the specified file until the file-specific error is reset as described in Reporting Transfer Errors.

RWRESULT_PERMANENT_FAILURE0x0000008FThe operation failed for some external-storage-related reason.

Returning this code indicates that some failure related to the external storage has occurred, and it is expected that all further requests will also fail. The class will not send any more requests for any file until the global error is reset as described in Reporting Transfer Errors.

If an unhandled exception occurs in the event handler, it will be caught by the class, which will fire the OnError event.

How to Handle Errors Reported by the Class

The CBCache class APIs communicate errors to applications using the error codes defined on the Error Codes page.

If an error occurs, the class will throw an exception. The code property of the exception object will contain an error code, and the message property will contain an error message (if available).

Recursive Calls

To ensure stable operation, it is critical to avoid accessing drives and filesystems recursively. Essentially, this means that event handlers must not perform any operations involving the drive or filesystem that the event fired for (i.e., don't read from/write to files on it, don't unmount the media, etc.).

Buffer Parameters

Some events include one or more parameters intended for use as a binary data buffer. Depending on the event, these parameters may contain data when the event is fired, or it may be expected that the application populates them with the desired amount of data during the event handler. Some events combine both paradigms and then expect the application to modify the data already present when the event is fired.

The documentation for such events will describe which of these cases applies to each buffer parameter. In all cases, buffer parameters point to a preallocated block of unmanaged memory, the size of which is specified by the parameter immediately following the buffer parameter. In cases in which data are to be written, be sure to write it directly to the pointed-to memory, do not change the value of the buffer parameter itself. Buffer parameters are always of the ctypes.c_void_p type; use the ctypes.memmove() method to read and write data from and to the unmanaged memory region. To obtain a pointer to a Python buffer suitable for use with ctypes.memmove(), call ctypes.c_void_p.from_buffer() and pass it a byte array or some other type that implements Python's buffer interface.