#define _GNU_SOURCE             /* See feature_test_macros(7) */
#include <sched.h>

int sched_setaffinity(pid_t pid, size_t cpusetsize,
                      const cpu_set_t *mask);
int sched_getaffinity(pid_t pid, size_t cpusetsize,
                      cpu_set_t *mask);
A CPU affinity mask is represented by the cpu_set_t structure, a "CPU set", pointed to by mask. A set of macros for manipulating CPU sets is described in CPU_SET(3).
sched_setaffinity() sets the CPU affinity mask of the thread whose ID is pid to the value specified by mask. If pid is zero, then the calling thread is used. The argument cpusetsize is the length (in bytes) of the data pointed to by mask. Normally this argument would be specified as sizeof(cpu_set_t).
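By way of illustration, the following sketch restricts the calling thread (pid specified as 0) to CPU 0; the choice of CPU 0 is arbitrary:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    cpu_set_t set;

    CPU_ZERO(&set);                 /* Clear the set */
    CPU_SET(0, &set);               /* Add CPU 0 to the set */
    if (sched_setaffinity(0, sizeof(set), &set) == -1)
        perror("sched_setaffinity");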
If the thread specified by pid is not currently running on one of the CPUs specified in mask, then that thread is migrated to one of the CPUs specified in mask.
sched_getaffinity() writes the affinity mask of the thread whose ID is pid into the cpu_set_t structure pointed to by mask. The cpusetsize argument specifies the size (in bytes) of mask. If pid is zero, then the mask of the calling thread is returned.
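Similarly, a sketch that retrieves the calling thread's mask and prints the CPUs it contains (CPU_SETSIZE, described in CPU_SET(3), is the fixed capacity of a cpu_set_t):

    cpu_set_t set;

    if (sched_getaffinity(0, sizeof(set), &set) == -1)
        perror("sched_getaffinity");
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            printf("CPU %d is in the affinity mask\n", cpu);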
There are various ways of determining the number of CPUs available on the system, including: inspecting the contents of /proc/cpuinfo; using sysconf(3) to obtain the values of the _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN parameters; and inspecting the list of CPU directories under /sys/devices/system/cpu/.
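For example, a sketch of the sysconf(3) approach:

    #include <unistd.h>

    long conf = sysconf(_SC_NPROCESSORS_CONF);  /* CPUs configured */
    long onln = sysconf(_SC_NPROCESSORS_ONLN);  /* CPUs currently online */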
sched(7) has a description of the Linux scheduling scheme.
The affinity mask is a per-thread attribute that can be adjusted independently for each of the threads in a thread group. The value returned from a call to gettid(2) can be passed in the argument pid. Specifying pid as 0 will set the attribute for the calling thread, and passing the value returned from a call to getpid(2) will set the attribute for the main thread of the thread group. (If you are using the POSIX threads API, then use pthread_setaffinity_np(3) instead of sched_setaffinity().)
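For example, a POSIX threads sketch that binds the calling thread to CPU 1 (the CPU number is arbitrary; note that the pthreads functions return an errno-style value rather than setting errno):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>

    cpu_set_t set;
    int s;

    CPU_ZERO(&set);
    CPU_SET(1, &set);
    s = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (s != 0)
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(s));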
The isolcpus boot option can be used to isolate one or more CPUs at boot time, so that no processes are scheduled onto those CPUs. Following the use of this boot option, the only way to schedule processes onto the isolated CPUs is via sched_setaffinity() or the cpuset(7) mechanism. For further information, see the kernel source file Documentation/admin-guide/kernel-parameters.txt. As noted in that file, isolcpus is the preferred mechanism of isolating CPUs (versus the alternative of manually setting the CPU affinity of all processes on the system).
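For example, booting with the kernel command-line parameter

    isolcpus=2,3

isolates CPUs 2 and 3 (a range form such as isolcpus=2-3 is also accepted); the CPU numbers here are purely illustrative.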
A child created via fork(2) inherits its parent's CPU affinity mask. The affinity mask is preserved across an execve(2).
On success, the raw sched_getaffinity() system call returns the number of bytes copied into the mask buffer; this will be the minimum of cpusetsize and the size (in bytes) of the cpumask_t data type that is used internally by the kernel to represent the CPU set bit mask.
If the kernel's affinity mask is larger than the 1024-bit cpu_set_t used by glibc, then a call of the form:

    sched_getaffinity(pid, sizeof(cpu_set_t), &mask);

fails with the error EINVAL, the error produced by the underlying system call for the case where the mask size specified in cpusetsize is smaller than the size of the affinity mask used by the kernel. (Depending on the system CPU topology, the kernel affinity mask can be substantially larger than the number of active CPUs in the system.)
When working on systems with large kernel CPU affinity masks, one must dynamically allocate the mask argument (see CPU_ALLOC(3)). Currently, the only way to do this is by probing for the size of the required mask using sched_getaffinity() calls with increasing mask sizes (until the call does not fail with the error EINVAL).
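A sketch of that probing loop follows; the initial guess of 1024 CPUs is arbitrary, and is doubled each time the call fails with EINVAL:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <sched.h>
    #include <stdlib.h>

    cpu_set_t *mask;
    size_t size;
    int nrcpus = 1024;          /* Arbitrary initial guess */

    for (;;) {
        mask = CPU_ALLOC(nrcpus);
        if (mask == NULL)
            exit(EXIT_FAILURE); /* Out of memory */
        size = CPU_ALLOC_SIZE(nrcpus);

        if (sched_getaffinity(0, size, mask) == 0)
            break;              /* Success: mask was large enough */

        CPU_FREE(mask);
        if (errno != EINVAL)    /* Some error other than a too-small mask */
            exit(EXIT_FAILURE);
        nrcpus *= 2;            /* Mask too small: retry with a larger one */
    }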
Be aware that CPU_ALLOC(3) may allocate a slightly larger CPU set than requested (because CPU sets are implemented as bit masks allocated in units of sizeof(long)). Consequently, sched_getaffinity() can set bits beyond the requested allocation size, because the kernel sees a few additional bits. Therefore, the caller should iterate over the bits in the returned set, counting those which are set, and stop upon reaching the value returned by CPU_COUNT(3) (rather than iterating over the number of bits requested to be allocated).
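Continuing the sketch above, the returned set can be scanned using the dynamically sized macros described in CPU_SET(3), stopping once CPU_COUNT_S() set bits have been seen:

    #include <stdio.h>

    int nset = CPU_COUNT_S(size, mask);

    for (int cpu = 0, seen = 0; seen < nset; cpu++) {
        if (CPU_ISSET_S(cpu, size, mask)) {
            printf("CPU %d is in the affinity mask\n", cpu);
            seen++;
        }
    }

    CPU_FREE(mask);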
As the sample runs below demonstrate, the amount of real and CPU time consumed when running the program will depend on intra-core caching effects and whether the processes are using the same CPU.
We first employ lscpu(1) to determine that this (x86) system has two cores, each with two CPUs:
$ lscpu | egrep -i 'core.*:|socket'
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
We then time the operation of the example program for three cases: both processes running on the same CPU; both processes running on different CPUs on the same core; and both processes running on different CPUs on different cores.
$ time -p ./a.out 0 0 100000000
real 14.75
user 3.02
sys 11.73
$ time -p ./a.out 0 1 100000000
real 11.52
user 3.98
sys 19.06
$ time -p ./a.out 0 3 100000000
real 7.89
user 3.29
sys 12.07
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)
int
main(int argc, char *argv[])
{
    cpu_set_t set;
    int parentCPU, childCPU;
    int nloops;

    if (argc != 4) {
        fprintf(stderr, "Usage: %s parent-cpu child-cpu num-loops\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    parentCPU = atoi(argv[1]);
    childCPU = atoi(argv[2]);
    nloops = atoi(argv[3]);

    CPU_ZERO(&set);

    switch (fork()) {
    case -1:            /* Error */
        errExit("fork");

    case 0:             /* Child */
        CPU_SET(childCPU, &set);

        if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
            errExit("sched_setaffinity");

        for (int j = 0; j < nloops; j++)
            getppid();

        exit(EXIT_SUCCESS);

    default:            /* Parent */
        CPU_SET(parentCPU, &set);

        if (sched_setaffinity(getpid(), sizeof(set), &set) == -1)
            errExit("sched_setaffinity");

        for (int j = 0; j < nloops; j++)
            getppid();

        wait(NULL);     /* Wait for child to terminate */
        exit(EXIT_SUCCESS);
    }
}