IOCTL-FIEXCHANGE_RANGE

Section: Linux Programmer's Manual (2)
Updated: 2021-04-01
 

NAME

ioctl_fiexchange_range - exchange the contents of parts of two files  

SYNOPSIS


#include <sys/ioctl.h>
#include <linux/fiexchange.h>

int ioctl(int file2_fd, FIEXCHANGE_RANGE, struct file_xchg_range *arg);  

DESCRIPTION

Given a range of bytes in a first file file1_fd and a second range of bytes in a second file file2_fd, this ioctl(2) exchanges the contents of the two ranges.

Exchanges are atomic with regard to concurrent file operations, so no userspace-level locks need to be taken to obtain consistent results. Implementations must guarantee that readers see either the old contents or the new contents in their entirety, even if the system fails.

The exchange parameters are conveyed in a structure of the following form:

struct file_xchg_range {
    __s64    file1_fd;
    __s64    file1_offset;
    __s64    file2_offset;
    __s64    length;

    __u64    flags;

    __s64    file2_ino;
    __s64    file2_mtime;
    __s64    file2_ctime;
    __s32    file2_mtime_nsec;
    __s32    file2_ctime_nsec;

    __u64    pad[6];
};

The field pad must be zero.
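A minimal sketch of preparing the argument structure, assuming the layout shown above. Because <linux/fiexchange.h> may not be installed on a given system, the sketch declares a local mirror of the structure using <stdint.h> types; that mirror and the helper name init_xchg_range are assumptions of this example, not part of the interface. Clearing the whole structure first is one way to satisfy the requirement that pad be zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of the layout shown above (an assumption of this sketch).
 * Where <linux/fiexchange.h> is available, include it instead.
 */
struct file_xchg_range {
    int64_t  file1_fd;
    int64_t  file1_offset;
    int64_t  file2_offset;
    int64_t  length;
    uint64_t flags;
    int64_t  file2_ino;
    int64_t  file2_mtime;
    int64_t  file2_ctime;
    int32_t  file2_mtime_nsec;
    int32_t  file2_ctime_nsec;
    uint64_t pad[6];
};

/*
 * Hypothetical helper: zero every field (including pad and flags),
 * then describe the two ranges. Flags and freshness fields are left
 * for the caller to fill in.
 */
void init_xchg_range(struct file_xchg_range *req, int file1_fd,
                     int64_t file1_offset, int64_t file2_offset,
                     int64_t length)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));
    req->file1_fd = file1_fd;
    req->file1_offset = file1_offset;
    req->file2_offset = file2_offset;
    req->length = length;
}
```

The prepared structure would then be passed as the third argument of the ioctl(2) call shown in the SYNOPSIS, issued on file2_fd.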

The fields file1_fd, file1_offset, and length define the first range of bytes to be exchanged.

The file descriptor file2_fd passed to ioctl(2), together with the fields file2_offset and length, defines the second range of bytes to be exchanged.

Both files must be from the same filesystem mount. If the two file descriptors represent the same file, the byte ranges must not overlap. Most disk-based filesystems require the starts of both ranges to be aligned to the file block size. On such filesystems, the ends of the ranges must also be so aligned unless the FILE_XCHG_RANGE_TO_EOF flag is set.
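The alignment and overlap rules above can be checked in userspace before issuing the call. The following is a sketch only: the function names are hypothetical, the block size is supplied by the caller (how it is obtained is filesystem-dependent and not covered here), and arithmetic overflow of offset plus length is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical pre-check: both starts must be multiples of blksz.
 * The ends must be too, unless the caller intends to set
 * FILE_XCHG_RANGE_TO_EOF (to_eof == true). Because the starts are
 * aligned, the ends are aligned exactly when len is.
 */
bool range_alignment_ok(int64_t off1, int64_t off2, int64_t len,
                        int64_t blksz, bool to_eof)
{
    if (blksz <= 0 || off1 < 0 || off2 < 0 || len < 0)
        return false;
    if (off1 % blksz != 0 || off2 % blksz != 0)
        return false;
    return to_eof || len % blksz == 0;
}

/*
 * Only meaningful when both descriptors refer to the same file:
 * true if [off1, off1+len) and [off2, off2+len) share any byte.
 */
bool ranges_overlap(int64_t off1, int64_t off2, int64_t len)
{
    return off1 < off2 + len && off2 < off1 + len;
}
```

A passing pre-check does not guarantee success; the filesystem remains the authority and may still return EINVAL.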

The field flags controls the behavior of the exchange operation.

FILE_XCHG_RANGE_FILE2_FRESH
Check the freshness of file2_fd after locking the file but before exchanging the contents. The supplied file2_ino field must match file2's inode number, and the supplied file2_mtime, file2_mtime_nsec, file2_ctime, and file2_ctime_nsec fields must match the modification time and change time of file2. If they do not match, EBUSY will be returned.
FILE_XCHG_RANGE_TO_EOF
Ignore the length parameter. All bytes in file1_fd from file1_offset to EOF are moved to file2_fd, and file2's size is set to (file2_offset+(file1_length-file1_offset)). Meanwhile, all bytes in file2 from file2_offset to EOF are moved to file1 and file1's size is set to (file1_offset+(file2_length-file2_offset)). This option is not compatible with FILE_XCHG_RANGE_FULL_FILES.
FILE_XCHG_RANGE_FSYNC
Ensure that all modified in-core data in both file ranges and all metadata updates pertaining to the exchange operation are flushed to persistent storage before the call returns. Opening either file descriptor with O_SYNC or O_DSYNC will have the same effect.
FILE_XCHG_RANGE_SKIP_FILE1_HOLES
Skip sub-ranges of file1_fd that are known not to contain data. This facility can be used to implement atomic scatter-gather writes of any complexity for software-defined storage targets.
FILE_XCHG_RANGE_DRY_RUN
Check the parameters and the feasibility of the operation, but do not change anything.
FILE_XCHG_RANGE_COMMIT
This flag is a combination of FILE_XCHG_RANGE_FILE2_FRESH | FILE_XCHG_RANGE_FSYNC and can be used to commit changes to file2_fd to persistent storage if and only if file2 has not changed.
FILE_XCHG_RANGE_FULL_FILES
Require that file1_offset and file2_offset are zero, and that the length field matches the lengths of both files. If not, EDOM will be returned. This option is not compatible with FILE_XCHG_RANGE_TO_EOF.
FILE_XCHG_RANGE_NONATOMIC
This flag relaxes the requirement that readers see only the old contents or the new contents in their entirety. If the system fails before all modified in-core data and metadata updates are persisted to disk, the contents of both file ranges after recovery are not defined and may be a mix of both.

Do not use this flag unless the contents of both ranges are known to be identical and there are no other writers.
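The file sizes that result from FILE_XCHG_RANGE_TO_EOF follow directly from the two formulas given in that flag's description. The sketch below merely evaluates them; the structure and function names are hypothetical and nothing here calls into the kernel.

```c
#include <stdint.h>

/* Hypothetical container for the two predicted sizes. */
struct xchg_sizes {
    int64_t file1_size;
    int64_t file2_size;
};

/*
 * Evaluate the TO_EOF size formulas:
 *   file2 size = file2_offset + (file1_length - file1_offset)
 *   file1 size = file1_offset + (file2_length - file2_offset)
 */
struct xchg_sizes sizes_after_to_eof(int64_t file1_length,
                                     int64_t file1_offset,
                                     int64_t file2_length,
                                     int64_t file2_offset)
{
    struct xchg_sizes s;

    s.file2_size = file2_offset + (file1_length - file1_offset);
    s.file1_size = file1_offset + (file2_length - file2_offset);
    return s;
}
```

For example, with file1 of 10000 bytes exchanged from offset 4096 and file2 of 20000 bytes exchanged from offset 8192, file1 ends up 15904 bytes long and file2 ends up 14096 bytes long.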

 

RETURN VALUE

On error, -1 is returned, and errno is set to indicate the error.

 

ERRORS

Error codes can be one of, but are not limited to, the following:
EBADF
file1_fd is not open for reading and writing or is open for append-only writes; or file2_fd is not open for reading and writing or is open for append-only writes.
EBUSY
The inode number and timestamps supplied do not match file2_fd and FILE_XCHG_RANGE_FILE2_FRESH was set in flags.
EDOM
The ranges do not cover the entirety of both files, and FILE_XCHG_RANGE_FULL_FILES was set in flags.
EINVAL
The parameters are not correct for these files. This error can also appear if either file descriptor represents a device, FIFO, or socket. Disk filesystems generally require the offset and length arguments to be aligned to the fundamental block sizes of both files.
EIO
An I/O error occurred.
EISDIR
One of the files is a directory.
ENOMEM
The kernel was unable to allocate sufficient memory to perform the operation.
ENOSPC
There is not enough free space in the filesystem to exchange the contents safely.
EOPNOTSUPP
The filesystem does not support exchanging bytes between the two files.
EPERM
file1_fd or file2_fd is immutable.
ETXTBSY
One of the files is a swap file.
EUCLEAN
The filesystem is corrupt.
EXDEV
file1_fd and file2_fd are not on the same mounted filesystem.
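One way an application might react to the errors listed above is to sort them into a few coarse actions. The grouping below is a policy assumed for this sketch, not something the interface prescribes, and the enum and function names are hypothetical.

```c
#include <errno.h>

/* Hypothetical actions a caller could take after a failed exchange. */
enum xchg_action {
    XCHG_RESTAGE,   /* file2 changed: re-read it, restage, retry */
    XCHG_FALLBACK,  /* exchange unavailable: use another strategy */
    XCHG_FIX_ARGS,  /* caller passed unusable descriptors or ranges */
    XCHG_FAIL       /* anything else: report the error */
};

/* Map an errno value from a failed call to one of the actions above. */
enum xchg_action classify_xchg_errno(int err)
{
    switch (err) {
    case EBUSY:
        return XCHG_RESTAGE;
    case EOPNOTSUPP:
    case EXDEV:
        return XCHG_FALLBACK;
    case EBADF:
    case EDOM:
    case EINVAL:
        return XCHG_FIX_ARGS;
    default:
        return XCHG_FAIL;
    }
}
```

Since the list of error codes is not exhaustive, the default branch treats every unlisted value as a plain failure.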
 

CONFORMING TO

This API is Linux-specific.  

USE CASES

Three use cases are imagined for this system call.

The first is a filesystem defragmenter, which copies the contents of a file into another file and wishes to exchange the space mappings of the two files, provided that the original file has not changed. The flags NONATOMIC and FILE2_FRESH are recommended for this application.
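For the FILE2_FRESH check in the defragmenter case, the inode number and timestamps of the original file have to be sampled before the copy is made. The sketch below assumes that the values reported by fstat(2) in st_ino, st_mtim, and st_ctim are the ones the kernel compares against; the structure is a local mirror of the layout given in the DESCRIPTION, and the helper name is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for st_mtim and st_ctim */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* Local mirror of the argument structure (an assumption of this sketch). */
struct file_xchg_range {
    int64_t  file1_fd;
    int64_t  file1_offset;
    int64_t  file2_offset;
    int64_t  length;
    uint64_t flags;
    int64_t  file2_ino;
    int64_t  file2_mtime;
    int64_t  file2_ctime;
    int32_t  file2_mtime_nsec;
    int32_t  file2_ctime_nsec;
    uint64_t pad[6];
};

/*
 * Hypothetical helper: record file2's current identity and timestamps
 * in req. Returns 0 on success, or -1 with errno set by fstat(2).
 * The caller still has to set FILE_XCHG_RANGE_FILE2_FRESH in req->flags.
 */
int sample_file2_freshness(int file2_fd, struct file_xchg_range *req)
{
    struct stat sb;

    assert(req != NULL);
    if (fstat(file2_fd, &sb) < 0)
        return -1;

    req->file2_ino = (int64_t)sb.st_ino;
    req->file2_mtime = (int64_t)sb.st_mtim.tv_sec;
    req->file2_mtime_nsec = (int32_t)sb.st_mtim.tv_nsec;
    req->file2_ctime = (int64_t)sb.st_ctim.tv_sec;
    req->file2_ctime_nsec = (int32_t)sb.st_ctim.tv_nsec;
    return 0;
}
```

If the original file is modified between the sample and the exchange, the call is expected to fail with EBUSY, and the defragmenter can discard its copy and start over.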

The second is a data storage program that wants to commit non-contiguous updates to a file atomically. This can be done by creating a temporary file, calling FICLONE(2) to share the contents, and staging the updates into the temporary file. Either the FULL_FILES or the TO_EOF flag is recommended, along with FSYNC. Depending on the application's locking design, the flags FILE2_FRESH or COMMIT may be applicable here. The temporary file can be deleted or punched out afterwards.
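Before requesting FULL_FILES for the commit step, an application can test the condition that the kernel would otherwise reject with EDOM. This is a sketch with a hypothetical function name; the file sizes are assumed to have been obtained by the caller, for instance with fstat(2).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical pre-check for FILE_XCHG_RANGE_FULL_FILES: both offsets
 * must be zero and length must equal the size of each file.
 */
bool full_files_args_ok(int64_t file1_size, int64_t file2_size,
                        int64_t file1_offset, int64_t file2_offset,
                        int64_t length)
{
    return file1_offset == 0 && file2_offset == 0 &&
           length == file1_size && length == file2_size;
}
```

When the staged updates have changed the temporary file's length so that the check fails, the TO_EOF flag with both offsets set to zero is the other option the text above recommends.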

The third is a software-defined storage host (e.g. a disk jukebox) which implements an atomic scatter-gather write command. Provided the exported disk's logical block size matches the file's allocation unit size, this can be done by creating a temporary file and writing the data at the appropriate offsets. Use this call with the SKIP_FILE1_HOLES flag to exchange only the blocks involved in the write command. The use of the FSYNC flag is recommended here. The temporary file should be deleted or punched out completely before being reused to stage another write.
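The staging step of the scatter-gather case can be sketched with pwrite(2), which writes each segment at its target offset and leaves the untouched parts of the temporary file as holes. The segment structure and function name are hypothetical, and the sketch treats any short or failed write as an error rather than retrying on EINTR.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for pwrite */
#endif
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical description of one piece of a scatter-gather write. */
struct sg_segment {
    off_t       offset;  /* where the data belongs in the target file */
    const void *data;
    size_t      len;
};

/*
 * Write every segment into tmp_fd at its own offset. Regions that no
 * segment covers are never written and so remain holes in a freshly
 * created temporary file. Returns 0 on success and -1 on failure.
 */
int stage_segments(int tmp_fd, const struct sg_segment *segs, size_t nsegs)
{
    for (size_t i = 0; i < nsegs; i++) {
        const char *p = segs[i].data;
        size_t left = segs[i].len;
        off_t off = segs[i].offset;

        while (left > 0) {
            ssize_t n = pwrite(tmp_fd, p, left, off);

            if (n <= 0)
                return -1;
            p += n;
            off += n;
            left -= (size_t)n;
        }
    }
    return 0;
}
```

After staging, the temporary file would be passed as file1_fd, so that the holes it contains are the sub-ranges skipped by SKIP_FILE1_HOLES.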

NOTES

Some filesystems may limit the amount of data or the number of extents that can be exchanged in a single call.  

SEE ALSO

ioctl(2)


 
