This configuration file contains metadata necessary to implement standard operations against the container. This includes the process to run, environment variables to inject, sandboxing features to use, etc.
The canonical schema is defined in this document, but there is a JSON Schema in schema/config-schema.json and Go bindings in specs-go/config.go.
Platform-specific configuration schemas are defined in the platform-specific documents linked below.
For properties that are only defined for some platforms, the Go property has a platform tag listing those platforms (e.g. platform:"linux,solaris").
Below is a detailed description of each field defined in the configuration format, along with its valid values. Platform-specific fields are identified as such. For all platform-specific configuration values, the scope defined below in the Platform-specific configuration section applies.
ociVersion
(string, REQUIRED) MUST be in SemVer v2.0.0 format and specifies the version of the Open Container Initiative Runtime Specification with which the bundle complies. The Open Container Initiative Runtime Specification follows semantic versioning and retains forward and backward compatibility within major versions. For example, if a configuration is compliant with version 1.1 of this specification, it is compatible with all runtimes that support any 1.1 or later release of this specification, but is not compatible with a runtime that supports 1.0 and not 1.1.
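The compatibility rule above can be sketched as a small check. This is a minimal illustration, not part of the specification: the function names are hypothetical, pre-release and build suffixes are simply ignored, and a runtime from a different major version is assumed to be incompatible.

```python
def parse_core_version(version: str) -> tuple[int, int, int]:
    """Return (major, minor, patch) from a SemVer 2.0.0 string."""
    # Strip build metadata ("+...") first, then any pre-release tag ("-...").
    core = version.split("+", 1)[0].split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def runtime_supports(config_version: str, runtime_version: str) -> bool:
    """Check whether a runtime release can run a configuration."""
    cfg_major, cfg_minor, _ = parse_core_version(config_version)
    rt_major, rt_minor, _ = parse_core_version(runtime_version)
    # Compatibility is only retained within a major version, and the
    # runtime must implement at least the configuration's minor release.
    return rt_major == cfg_major and rt_minor >= cfg_minor
```

With this sketch, a 1.1 configuration is accepted by a 1.1 or 1.2 runtime and rejected by a runtime that only supports 1.0.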
"ociVersion": "0.1.0"
root
(object, OPTIONAL) specifies the container's root filesystem.
On Windows, for Windows Server Containers, this field is REQUIRED.
For Hyper-V Containers, this field MUST NOT be set.
On all other platforms, this field is REQUIRED.
- path (string, REQUIRED) Specifies the path to the root filesystem for the container.
  - On Windows, path MUST be a volume GUID path.
  - On POSIX platforms, path is either an absolute path or a relative path to the bundle. For example, with a bundle at /to/bundle and a root filesystem at /to/bundle/rootfs, the path value can be either /to/bundle/rootfs or rootfs. The value SHOULD be the conventional rootfs.

  A directory MUST exist at the path declared by the field.
- readonly (bool, OPTIONAL) If true then the root filesystem MUST be read-only inside the container, defaults to false.
  - On Windows, this field MUST be omitted or false.
"root": {
"path": "rootfs",
"readonly": true
}
"root": {
"path": "\\\\?\\Volume{ec84d99e-3f02-11e7-ac6c-00155d7682cf}\\"
}
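For POSIX platforms, the rule that root.path is either absolute or relative to the bundle could be applied as in the sketch below. The helper name is hypothetical, and the check that a directory actually exists at the result is left out.

```python
import posixpath


def resolve_root_path(bundle: str, root_path: str) -> str:
    """Return the absolute root filesystem path for a POSIX bundle."""
    if posixpath.isabs(root_path):
        # Absolute values are used as given.
        return posixpath.normpath(root_path)
    # Relative values are interpreted relative to the bundle directory.
    return posixpath.normpath(posixpath.join(bundle, root_path))
```

Both spellings from the example in the text, rootfs and /to/bundle/rootfs, resolve to the same directory for a bundle at /to/bundle.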
mounts
(array of objects, OPTIONAL) specifies additional mounts beyond root.
The runtime MUST mount entries in the listed order.
For Linux, the parameters are as documented in mount(2) system call man page.
For Solaris, the mount entry corresponds to the 'fs' resource in the zonecfg(1M) man page.
- destination (string, REQUIRED) Destination of mount point: path inside container.
  - Linux: This value SHOULD be an absolute path. For compatibility with old tools and configurations, it MAY be a relative path, in which case it MUST be interpreted as relative to "/". Relative paths are deprecated.
  - Windows: This value MUST be an absolute path. One mount destination MUST NOT be nested within another mount (e.g., c:\foo and c:\foo\bar).
  - Solaris: This value MUST be an absolute path. Corresponds to "dir" of the fs resource in zonecfg(1M).
  - For all other platforms: This value MUST be an absolute path.
- source (string, OPTIONAL) A device name, but can also be a file or directory name for bind mounts or a dummy. Path values for bind mounts are either absolute or relative to the bundle. A mount is a bind mount if it has either bind or rbind in the options.
  - Windows: a local directory on the filesystem of the container host. UNC paths and mapped drives are not supported.
  - Solaris: corresponds to "special" of the fs resource in zonecfg(1M).
- options (array of strings, OPTIONAL) Mount options of the filesystem to be used.
  - Linux: See Linux mount options below.
  - Solaris: corresponds to "options" of the fs resource in zonecfg(1M).
  - Windows: runtimes MUST support ro, mounting the filesystem read-only when ro is given.
Runtimes MUST/SHOULD/MAY implement the following option strings for Linux:
Option name | Requirement | Description |
---|---|---|
async | MUST | 1 |
atime | MUST | 1 |
bind | MUST | Bind mount 2 |
defaults | MUST | 1 |
dev | MUST | 1 |
diratime | MUST | 1 |
dirsync | MUST | 1 |
exec | MUST | 1 |
iversion | MUST | 1 |
lazytime | MUST | 1 |
loud | MUST | 1 |
mand | MAY | 1 (Deprecated in kernel 5.15, util-linux 2.38) |
noatime | MUST | 1 |
nodev | MUST | 1 |
nodiratime | MUST | 1 |
noexec | MUST | 1 |
noiversion | MUST | 1 |
nolazytime | MUST | 1 |
nomand | MAY | 1 |
norelatime | MUST | 1 |
nostrictatime | MUST | 1 |
nosuid | MUST | 1 |
nosymfollow | SHOULD | 1 (Introduced in kernel 5.10, util-linux 2.38) |
private | MUST | Bind mount propagation 2 |
ratime | SHOULD | Recursive atime 3 |
rbind | MUST | Recursive bind mount 2 |
rdev | SHOULD | Recursive dev 3 |
rdiratime | SHOULD | Recursive diratime 3 |
relatime | MUST | 1 |
remount | MUST | 1 |
rexec | SHOULD | Recursive exec 3 |
rnoatime | SHOULD | Recursive noatime 3 |
rnodiratime | SHOULD | Recursive nodiratime 3 |
rnoexec | SHOULD | Recursive noexec 3 |
rnorelatime | SHOULD | Recursive norelatime 3 |
rnostrictatime | SHOULD | Recursive nostrictatime 3 |
rnosuid | SHOULD | Recursive nosuid 3 |
rnosymfollow | SHOULD | Recursive nosymfollow 3 |
ro | MUST | 1 |
rprivate | MUST | Bind mount propagation 2 |
rrelatime | SHOULD | Recursive relatime 3 |
rro | SHOULD | Recursive ro 3 |
rrw | SHOULD | Recursive rw 3 |
rshared | MUST | Bind mount propagation 2 |
rslave | MUST | Bind mount propagation 2 |
rstrictatime | SHOULD | Recursive strictatime 3 |
rsuid | SHOULD | Recursive suid 3 |
rsymfollow | SHOULD | Recursive symfollow 3 |
runbindable | MUST | Bind mount propagation 2 |
rw | MUST | 1 |
shared | MUST | Bind mount propagation 2 |
silent | MUST | 1 |
slave | MUST | Bind mount propagation 2 |
strictatime | MUST | 1 |
suid | MUST | 1 |
symfollow | SHOULD | Opposite of nosymfollow |
sync | MUST | 1 |
tmpcopyup | MAY | copy up the contents to a tmpfs |
unbindable | MUST | Bind mount propagation 2 |
idmap | SHOULD | Indicates that the mount MUST have an idmapping applied. This option SHOULD NOT be passed to the underlying mount(2) call. If uidMappings or gidMappings are specified for the mount, the runtime MUST use those values for the mount's mapping. If they are not specified, the runtime MAY use the container's user namespace mapping, otherwise an error MUST be returned. If there are no uidMappings and gidMappings specified and the container isn't using user namespaces, an error MUST be returned. This SHOULD be implemented using mount_setattr(MOUNT_ATTR_IDMAP), available since Linux 5.12. |
ridmap | SHOULD | Indicates that the mount MUST have an idmapping applied, and the mapping is applied recursively 3. This option SHOULD NOT be passed to the underlying mount(2) call. If uidMappings or gidMappings are specified for the mount, the runtime MUST use those values for the mount's mapping. If they are not specified, the runtime MAY use the container's user namespace mapping, otherwise an error MUST be returned. If there are no uidMappings and gidMappings specified and the container isn't using user namespaces, an error MUST be returned. This SHOULD be implemented using mount_setattr(MOUNT_ATTR_IDMAP), available since Linux 5.12. |
The "MUST" options correspond to mount(8).
Runtimes MAY also implement custom option strings that are not listed in the table above.
If a custom option string is already recognized by mount(8), the runtime SHOULD follow the behavior of mount(8).
Runtimes SHOULD treat unknown options as filesystem-specific ones and pass those as a comma-separated string to the fifth (const void *data) argument of mount(2).
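The handling of unknown options can be illustrated with a short sketch. It assumes a hypothetical helper and a deliberately small subset of the option table; a real runtime would recognize every option it implements and translate the recognized ones into mount flags.

```python
# Illustrative subset of the option names in the table above (an assumption
# made for brevity, not the full list a runtime would recognize).
KNOWN_OPTIONS = {
    "bind", "rbind", "ro", "rw", "nosuid", "nodev", "noexec",
    "relatime", "strictatime",
}


def split_mount_options(options: list[str]) -> tuple[list[str], str]:
    """Separate recognized options from filesystem-specific ones."""
    recognized = [opt for opt in options if opt in KNOWN_OPTIONS]
    # Everything else is joined into the comma-separated string that would
    # be passed as the data argument of mount(2).
    data = ",".join(opt for opt in options if opt not in KNOWN_OPTIONS)
    return recognized, data
```

For a tmpfs mount with the options nosuid, strictatime, mode=755 and size=65536k, the first two are recognized and the last two end up in the data string.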
"mounts": [
{
"destination": "C:\\folder-inside-container",
"source": "C:\\folder-on-host",
"options": ["ro"]
}
]
For POSIX platforms the mounts structure has the following fields:
- type (string, OPTIONAL) The type of the filesystem to be mounted.
  - Linux: filesystem types supported by the kernel as listed in /proc/filesystems (e.g., "minix", "ext2", "ext3", "jfs", "xfs", "reiserfs", "msdos", "proc", "nfs", "iso9660"). For bind mounts (when options include either bind or rbind), the type is a dummy, often "none" (not listed in /proc/filesystems).
  - Solaris: corresponds to "type" of the fs resource in zonecfg(1M).
- uidMappings (array of type LinuxIDMapping, OPTIONAL) The mapping to convert UIDs from the source file system to the destination mount point. This SHOULD be implemented using mount_setattr(MOUNT_ATTR_IDMAP), available since Linux 5.12. If specified, the options field of the mounts structure SHOULD contain either idmap or ridmap to specify whether the mapping should be applied recursively for rbind mounts, as well as to ensure that older runtimes will not silently ignore this field. The format is the same as user namespace mappings. If specified, it MUST be specified along with gidMappings.
- gidMappings (array of type LinuxIDMapping, OPTIONAL) The mapping to convert GIDs from the source file system to the destination mount point. This SHOULD be implemented using mount_setattr(MOUNT_ATTR_IDMAP), available since Linux 5.12. If specified, the options field of the mounts structure SHOULD contain either idmap or ridmap to specify whether the mapping should be applied recursively for rbind mounts, as well as to ensure that older runtimes will not silently ignore this field. For more details see uidMappings. If specified, it MUST be specified along with uidMappings.
"mounts": [
{
"destination": "/tmp",
"type": "tmpfs",
"source": "tmpfs",
"options": ["nosuid","strictatime","mode=755","size=65536k"]
},
{
"destination": "/data",
"type": "none",
"source": "/volumes/testing",
"options": ["rbind","rw"]
}
]
"mounts": [
{
"destination": "/opt/local",
"type": "lofs",
"source": "/usr/local",
"options": ["ro","nodevices"]
},
{
"destination": "/opt/sfw",
"type": "lofs",
"source": "/opt/sfw"
}
]
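The pairing rules for uidMappings and gidMappings on a mount could be checked along the following lines. This is a sketch with a hypothetical function name; it reports the MUST rule as an error and the SHOULD rule as a warning, and it does not inspect the individual mapping entries.

```python
def check_idmapped_mount(mount: dict) -> list[str]:
    """Return problems found in a mount's ID-mapping configuration."""
    problems = []
    has_uid = "uidMappings" in mount
    has_gid = "gidMappings" in mount
    # uidMappings and gidMappings MUST be specified together.
    if has_uid != has_gid:
        problems.append("error: uidMappings and gidMappings must be specified together")
    # When mappings are given, options SHOULD contain idmap or ridmap.
    options = set(mount.get("options", []))
    if (has_uid or has_gid) and not options & {"idmap", "ridmap"}:
        problems.append("warning: options should contain idmap or ridmap")
    return problems
```

A mount that carries both mappings and the idmap option yields no problems, while one that sets only uidMappings and no options yields both messages.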
process
(object, OPTIONAL) specifies the container process.
This property is REQUIRED when start is called.
- terminal (bool, OPTIONAL) specifies whether a terminal is attached to the process, defaults to false. As an example, if set to true on Linux a pseudoterminal pair is allocated for the process and the pseudoterminal pty is duplicated on the process's standard streams.
- consoleSize (object, OPTIONAL) specifies the console size in characters of the terminal. Runtimes MUST ignore consoleSize if terminal is false or unset.
  - height (uint, REQUIRED)
  - width (uint, REQUIRED)
- cwd (string, REQUIRED) is the working directory that will be set for the executable. This value MUST be an absolute path.
- env (array of strings, OPTIONAL) with the same semantics as IEEE Std 1003.1-2008's environ.
- args (array of strings, OPTIONAL) with similar semantics to IEEE Std 1003.1-2008 execvp's argv. This specification extends the IEEE standard in that at least one entry is REQUIRED (non-Windows), and that entry is used with the same semantics as execvp's file. This field is OPTIONAL on Windows, and commandLine is REQUIRED if this field is omitted.
- commandLine (string, OPTIONAL) specifies the full command line to be executed on Windows. This is the preferred means of supplying the command line on Windows. If omitted, the runtime will fall back to escaping and concatenating fields from args before making the system call into Windows.
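The interplay between args and commandLine described above can be expressed as a small check. The function and its platform parameter are hypothetical names used only for illustration; other process fields are not validated here.

```python
def check_process_command(process: dict, platform: str) -> list[str]:
    """Return errors for the args/commandLine rules of a process object."""
    errors = []
    args = process.get("args") or []
    if platform == "windows":
        # On Windows args is optional, but then commandLine is required.
        if not args and not process.get("commandLine"):
            errors.append("commandLine is required when args is omitted")
    elif not args:
        # On other platforms at least one args entry is required.
        errors.append("args must contain at least one entry")
    return errors
```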
For systems that support POSIX rlimits (for example Linux and Solaris), the process object supports the following process-specific properties:
- rlimits (array of objects, OPTIONAL) allows setting resource limits for the process. Each entry has the following structure:
  - type (string, REQUIRED) the platform resource being limited.
    - Linux: valid values are defined in the getrlimit(2) man page, such as RLIMIT_MSGQUEUE.
    - Solaris: valid values are defined in the getrlimit(3) man page, such as RLIMIT_CORE.

    The runtime MUST generate an error for any values which cannot be mapped to a relevant kernel interface. For each entry in rlimits, a getrlimit(3) on type MUST succeed. For the following properties, rlim refers to the status returned by the getrlimit(3) call.
  - soft (uint64, REQUIRED) the value of the limit enforced for the corresponding resource. rlim.rlim_cur MUST match the configured value.
  - hard (uint64, REQUIRED) the ceiling for the soft limit that could be set by an unprivileged process. rlim.rlim_max MUST match the configured value. Only a privileged process (e.g. one with the CAP_SYS_RESOURCE capability) can raise a hard limit.

  If rlimits contains duplicated entries with same type, the runtime MUST generate an error.
For Linux-based systems, the process object supports the following process-specific properties.
- apparmorProfile (string, OPTIONAL) specifies the name of the AppArmor profile for the process. For more information about AppArmor, see AppArmor documentation.
- capabilities (object, OPTIONAL) is an object containing arrays that specify the sets of capabilities for the process. Valid values are defined in the capabilities(7) man page, such as CAP_CHOWN. Any value which cannot be mapped to a relevant kernel interface, or cannot be granted otherwise MUST be logged as a warning by the runtime. Runtimes SHOULD NOT fail if the container configuration requests capabilities that cannot be granted, for example, if the runtime operates in a restricted environment with a limited set of capabilities. capabilities contains the following properties:
  - effective (array of strings, OPTIONAL) the effective field is an array of effective capabilities that are kept for the process.
  - bounding (array of strings, OPTIONAL) the bounding field is an array of bounding capabilities that are kept for the process.
  - inheritable (array of strings, OPTIONAL) the inheritable field is an array of inheritable capabilities that are kept for the process.
  - permitted (array of strings, OPTIONAL) the permitted field is an array of permitted capabilities that are kept for the process.
  - ambient (array of strings, OPTIONAL) the ambient field is an array of ambient capabilities that are kept for the process.
- noNewPrivileges (bool, OPTIONAL) setting noNewPrivileges to true prevents the process from gaining additional privileges. As an example, the no_new_privs article in the kernel documentation has information on how this is achieved using a prctl system call on Linux.
- oomScoreAdj (int, OPTIONAL) adjusts the oom-killer score in [pid]/oom_score_adj for the process's [pid] in a proc pseudo-filesystem. If oomScoreAdj is set, the runtime MUST set oom_score_adj to the given value. If oomScoreAdj is not set, the runtime MUST NOT change the value of oom_score_adj.

  This is a per-process setting, whereas disableOOMKiller is scoped for a memory cgroup. For more information on how these two settings work together, see the memory cgroup documentation section 10. OOM Control.
- scheduler (object, OPTIONAL) is an object describing the scheduler properties for the process. The scheduler contains the following properties:
  - policy (string, REQUIRED) represents the scheduling policy. A valid list of values is:
    - SCHED_OTHER
    - SCHED_FIFO
    - SCHED_RR
    - SCHED_BATCH
    - SCHED_ISO
    - SCHED_IDLE
    - SCHED_DEADLINE
  - nice (int32, OPTIONAL) is the nice value for the process, affecting its priority. A lower nice value corresponds to a higher priority. If not set, the runtime must use the value 0.
  - priority (int32, OPTIONAL) represents the static priority of the process, used by real-time policies like SCHED_FIFO and SCHED_RR. If not set, the runtime must use the value 0.
  - flags (array of strings, OPTIONAL) is an array of strings representing scheduling flags. A valid list of values is:
    - SCHED_FLAG_RESET_ON_FORK
    - SCHED_FLAG_RECLAIM
    - SCHED_FLAG_DL_OVERRUN
    - SCHED_FLAG_KEEP_POLICY
    - SCHED_FLAG_KEEP_PARAMS
    - SCHED_FLAG_UTIL_CLAMP_MIN
    - SCHED_FLAG_UTIL_CLAMP_MAX
  - runtime (uint64, OPTIONAL) represents the amount of time in nanoseconds during which the process is allowed to run in a given period, used by the deadline scheduler. If not set, the runtime must use the value 0.
  - deadline (uint64, OPTIONAL) represents the absolute deadline for the process to complete its execution, used by the deadline scheduler. If not set, the runtime must use the value 0.
  - period (uint64, OPTIONAL) represents the length of the period in nanoseconds used for determining the process runtime, used by the deadline scheduler. If not set, the runtime must use the value 0.
- selinuxLabel (string, OPTIONAL) specifies the SELinux label for the process. For more information about SELinux, see SELinux documentation.
- ioPriority (object, OPTIONAL) configures the I/O priority settings for the container's processes within the process group. The I/O priority settings will be automatically applied to the entire process group, affecting all processes within the container. The following properties are available:
  - class (string, REQUIRED) specifies the I/O scheduling class. Possible values are IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, and IOPRIO_CLASS_IDLE.
  - priority (int, REQUIRED) specifies the priority level within the class. The value should be an integer ranging from 0 (highest) to 7 (lowest).
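The scheduler properties described above, with their default of 0 for unset numeric fields, could be normalized as in this sketch. The function name is hypothetical, and rejecting unknown policies or flags with a ValueError is one possible way of generating an error, not a mandated one.

```python
VALID_POLICIES = {
    "SCHED_OTHER", "SCHED_FIFO", "SCHED_RR", "SCHED_BATCH",
    "SCHED_ISO", "SCHED_IDLE", "SCHED_DEADLINE",
}
VALID_FLAGS = {
    "SCHED_FLAG_RESET_ON_FORK", "SCHED_FLAG_RECLAIM", "SCHED_FLAG_DL_OVERRUN",
    "SCHED_FLAG_KEEP_POLICY", "SCHED_FLAG_KEEP_PARAMS",
    "SCHED_FLAG_UTIL_CLAMP_MIN", "SCHED_FLAG_UTIL_CLAMP_MAX",
}
NUMERIC_FIELDS = ("nice", "priority", "runtime", "deadline", "period")


def normalize_scheduler(scheduler: dict) -> dict:
    """Validate policy and flags, and fill unset numeric fields with 0."""
    policy = scheduler.get("policy")
    if policy not in VALID_POLICIES:
        raise ValueError(f"invalid or missing scheduler policy: {policy!r}")
    flags = list(scheduler.get("flags", []))
    unknown = [flag for flag in flags if flag not in VALID_FLAGS]
    if unknown:
        raise ValueError(f"invalid scheduler flags: {unknown}")
    result = {"policy": policy, "flags": flags}
    for name in NUMERIC_FIELDS:
        # Unset numeric properties default to 0.
        result[name] = scheduler.get(name, 0)
    return result
```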
The user for the process is a platform-specific structure that allows specific control over which user the process runs as.
For POSIX platforms the user structure has the following fields:
- uid (int, REQUIRED) specifies the user ID in the container namespace.
- gid (int, REQUIRED) specifies the group ID in the container namespace.
- umask (int, OPTIONAL) specifies the umask of the user. If unspecified, the umask should not be changed from the calling process' umask.
- additionalGids (array of ints, OPTIONAL) specifies additional group IDs in the container namespace to be added to the process.

Note: symbolic names for uid and gid, such as uname and gname respectively, are left to upper levels to derive (i.e. /etc/passwd parsing, NSS, etc).
"process": {
"terminal": true,
"consoleSize": {
"height": 25,
"width": 80
},
"user": {
"uid": 1,
"gid": 1,
"umask": 63,
"additionalGids": [5, 6]
},
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/root",
"args": [
"sh"
],
"apparmorProfile": "acme_secure_profile",
"selinuxLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675",
"ioPriority": {
"class": "IOPRIO_CLASS_IDLE",
"priority": 4
},
"noNewPrivileges": true,
"capabilities": {
"bounding": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"permitted": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"inheritable": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"effective": [
"CAP_AUDIT_WRITE",
"CAP_KILL"
],
"ambient": [
"CAP_NET_BIND_SERVICE"
]
},
"rlimits": [
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
]
}
"process": {
"terminal": true,
"consoleSize": {
"height": 25,
"width": 80
},
"user": {
"uid": 1,
"gid": 1,
"umask": 7,
"additionalGids": [2, 8]
},
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/root",
"args": [
"/usr/bin/bash"
]
}
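Because JSON numbers have no octal notation, the umask values in the examples above are written in decimal: 63 is the familiar 0077 and 7 is 0007. The two hypothetical helpers below sketch the conversion in both directions.

```python
def umask_to_octal(umask: int) -> str:
    """Format a decimal umask from config.json as a four-digit octal string."""
    return format(umask, "04o")


def octal_to_umask(text: str) -> int:
    """Convert an octal string such as "0077" to the integer stored in JSON."""
    return int(text, 8)
```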
For Windows based systems the user structure has the following fields:
- username (string, OPTIONAL) specifies the user name for the process.
"process": {
"terminal": true,
"user": {
"username": "containeradministrator"
},
"env": [
"VARIABLE=1"
],
"cwd": "c:\\foo",
"args": [
"someapp.exe"
]
}
hostname
(string, OPTIONAL) specifies the container's hostname as seen by processes running inside the container. On Linux, for example, this will change the hostname in the container UTS namespace. Depending on your namespace configuration, the container UTS namespace may be the runtime UTS namespace.
"hostname": "mrsdalloway"
domainname
(string, OPTIONAL) specifies the container's domainname as seen by processes running inside the container. On Linux, for example, this will change the domainname in the container UTS namespace. Depending on your namespace configuration, the container UTS namespace may be the runtime UTS namespace.
"domainname": "foobarbaz.test"
- linux (object, OPTIONAL) Linux-specific configuration. This MAY be set if the target platform of this spec is linux.
- windows (object, OPTIONAL) Windows-specific configuration. This MUST be set if the target platform of this spec is windows.
- solaris (object, OPTIONAL) Solaris-specific configuration. This MAY be set if the target platform of this spec is solaris.
- vm (object, OPTIONAL) Virtual-machine-specific configuration. This MAY be set if the target platform and architecture of this spec support hardware virtualization.
- zos (object, OPTIONAL) z/OS-specific configuration. This MAY be set if the target platform of this spec is zos.
{
"linux": {
"namespaces": [
{
"type": "pid"
}
]
}
}
For POSIX platforms, the configuration structure supports hooks for configuring custom actions related to the lifecycle of the container.
- hooks (object, OPTIONAL) MAY contain any of the following properties:
  - prestart (array of objects, OPTIONAL, DEPRECATED) is an array of prestart hooks.
    - Entries in the array contain the following properties:
      - path (string, REQUIRED) with similar semantics to IEEE Std 1003.1-2008 execv's path. This specification extends the IEEE standard in that path MUST be absolute.
      - args (array of strings, OPTIONAL) with the same semantics as IEEE Std 1003.1-2008 execv's argv.
      - env (array of strings, OPTIONAL) with the same semantics as IEEE Std 1003.1-2008's environ.
      - timeout (int, OPTIONAL) is the number of seconds before aborting the hook. If set, timeout MUST be greater than zero.
    - The value of path MUST resolve in the runtime namespace.
    - The prestart hooks MUST be executed in the runtime namespace.
  - createRuntime (array of objects, OPTIONAL) is an array of createRuntime hooks.
    - Entries in the array contain the following properties (the entries are identical to the entries in the deprecated prestart hooks):
      - path (string, REQUIRED) with similar semantics to IEEE Std 1003.1-2008 execv's path. This specification extends the IEEE standard in that path MUST be absolute.
      - args (array of strings, OPTIONAL) with the same semantics as IEEE Std 1003.1-2008 execv's argv.
      - env (array of strings, OPTIONAL) with the same semantics as IEEE Std 1003.1-2008's environ.
      - timeout (int, OPTIONAL) is the number of seconds before aborting the hook. If set, timeout MUST be greater than zero.
    - The value of path MUST resolve in the runtime namespace.
    - The createRuntime hooks MUST be executed in the runtime namespace.
  - createContainer (array of objects, OPTIONAL) is an array of createContainer hooks.
    - Entries in the array have the same schema as createRuntime entries.
    - The value of path MUST resolve in the runtime namespace.
    - The createContainer hooks MUST be executed in the container namespace.
  - startContainer (array of objects, OPTIONAL) is an array of startContainer hooks.
    - Entries in the array have the same schema as createRuntime entries.
    - The value of path MUST resolve in the container namespace.
    - The startContainer hooks MUST be executed in the container namespace.
  - poststart (array of objects, OPTIONAL) is an array of poststart hooks.
    - Entries in the array have the same schema as createRuntime entries.
    - The value of path MUST resolve in the runtime namespace.
    - The poststart hooks MUST be executed in the runtime namespace.
  - poststop (array of objects, OPTIONAL) is an array of poststop hooks.
    - Entries in the array have the same schema as createRuntime entries.
    - The value of path MUST resolve in the runtime namespace.
    - The poststop hooks MUST be executed in the runtime namespace.
Hooks allow users to specify programs to run before or after various lifecycle events. Hooks MUST be called in the listed order. The state of the container MUST be passed to hooks over stdin so that they may do work appropriate to the current state of the container.
The prestart hooks MUST be called as part of the create operation after the runtime environment has been created (according to the configuration in config.json) but before the pivot_root or any equivalent operation has been executed.
On Linux, for example, they are called after the container namespaces are created, so they provide an opportunity to customize the container (e.g. the network namespace could be specified in this hook).
The prestart hooks MUST be called before the createRuntime hooks.
Note: prestart hooks were deprecated in favor of createRuntime, createContainer and startContainer hooks, which allow more granular hook control during the create and start phase.
The prestart hooks' path MUST resolve in the runtime namespace.
The prestart hooks MUST be executed in the runtime namespace.
The createRuntime hooks MUST be called as part of the create operation after the runtime environment has been created (according to the configuration in config.json) but before the pivot_root or any equivalent operation has been executed.
The createRuntime hooks' path MUST resolve in the runtime namespace.
The createRuntime hooks MUST be executed in the runtime namespace.
On Linux, for example, they are called after the container namespaces are created, so they provide an opportunity to customize the container (e.g. the network namespace could be specified in this hook).
The definition of createRuntime hooks is currently underspecified, and hook authors should only expect from the runtime that the mount namespace has been created and the mount operations performed. Other operations such as cgroups and SELinux/AppArmor labels might not have been performed by the runtime.
The createContainer hooks MUST be called as part of the create operation after the runtime environment has been created (according to the configuration in config.json) but before the pivot_root or any equivalent operation has been executed.
The createContainer hooks MUST be called after the createRuntime hooks.
The createContainer hooks' path MUST resolve in the runtime namespace.
The createContainer hooks MUST be executed in the container namespace.
For example, on Linux this would happen before the pivot_root operation is executed but after the mount namespace was created and set up.
The definition of createContainer hooks is currently underspecified, and hook authors should only expect from the runtime that the mount namespace and different mounts will be set up. Other operations such as cgroups and SELinux/AppArmor labels might not have been performed by the runtime.
The startContainer hooks MUST be called before the user-specified process is executed as part of the start operation.
This hook can be used to execute some operations in the container, for example running the ldconfig binary on Linux before the container process is spawned.
The startContainer hooks' path MUST resolve in the container namespace.
The startContainer hooks MUST be executed in the container namespace.
The poststart hooks MUST be called after the user-specified process is executed but before the start operation returns.
For example, this hook can notify the user that the container process is spawned.
The poststart hooks' path MUST resolve in the runtime namespace.
The poststart hooks MUST be executed in the runtime namespace.
The poststop hooks MUST be called after the container is deleted but before the delete operation returns.
Cleanup or debugging functions are examples of such a hook.
The poststop hooks' path MUST resolve in the runtime namespace.
The poststop hooks MUST be executed in the runtime namespace.
See the below table for a summary of hooks and when they are called:
Name | Namespace | When |
---|---|---|
prestart (Deprecated) | runtime | During the create operation, after the runtime environment has been created and before the pivot root or any equivalent operation. |
createRuntime | runtime | During the create operation, after the runtime environment has been created and before the pivot root or any equivalent operation. |
createContainer | container | During the create operation, after the runtime environment has been created and before the pivot root or any equivalent operation. |
startContainer | container | After the start operation is called but before the user-specified program command is executed. |
poststart | runtime | After the user-specified process is executed but before the start operation returns. |
poststop | runtime | After the container is deleted but before the delete operation returns. |
"hooks": {
"prestart": [
{
"path": "/usr/bin/fix-mounts",
"args": ["fix-mounts", "arg1", "arg2"],
"env": [ "key1=value1"]
},
{
"path": "/usr/bin/setup-network"
}
],
"createRuntime": [
{
"path": "/usr/bin/fix-mounts",
"args": ["fix-mounts", "arg1", "arg2"],
"env": [ "key1=value1"]
},
{
"path": "/usr/bin/setup-network"
}
],
"createContainer": [
{
"path": "/usr/bin/mount-hook",
"args": ["-mount", "arg1", "arg2"],
"env": [ "key1=value1"]
}
],
"startContainer": [
{
"path": "/usr/bin/refresh-ldcache"
}
],
"poststart": [
{
"path": "/usr/bin/notify-start",
"timeout": 5
}
],
"poststop": [
{
"path": "/usr/sbin/cleanup.sh",
"args": ["cleanup.sh", "-f"]
}
]
}
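How a single hook entry might be invoked can be sketched with Python's subprocess module. This is an illustration under stated assumptions rather than a reference implementation: run_hook is a hypothetical name, the state dictionary passed in is a placeholder for the container state defined elsewhere in the specification, an omitted env is treated as an empty environment, and namespace handling is not modelled at all.

```python
import json
import subprocess


def run_hook(hook: dict, state: dict) -> int:
    """Run one hook entry, passing the container state on stdin."""
    path = hook["path"]
    # args carries argv (including argv[0]); fall back to the path itself.
    argv = hook.get("args") or [path]
    env = {}
    for entry in hook.get("env", []):
        # Each env entry has the form NAME=value.
        name, _, value = entry.partition("=")
        env[name] = value
    completed = subprocess.run(
        argv,
        executable=path,  # program to execute, independent of argv[0]
        env=env,
        input=json.dumps(state).encode(),
        timeout=hook.get("timeout"),  # seconds; None means no limit
    )
    return completed.returncode
```

If the hook runs longer than its timeout, subprocess.run kills the child and raises subprocess.TimeoutExpired, which a caller would translate into a hook failure.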
annotations
(object, OPTIONAL) contains arbitrary metadata for the container.
This information MAY be structured or unstructured.
Annotations MUST be a key-value map.
If there are no annotations then this property MAY either be absent or an empty map.
Keys MUST be strings.
Keys MUST NOT be an empty string.
Keys SHOULD be named using a reverse domain notation, e.g. com.example.myKey.
The org.opencontainers namespace for keys is reserved for use by this specification; annotations using keys in this namespace MUST be as described in this section.
The following keys in the org.opencontainers namespace MAY be used:
Key | Definition |
---|---|
org.opencontainers.image.os | Indicates the operating system the container image was built to run on. The annotation value MUST have a valid value for the os property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.os.version | Indicates the operating system version targeted by the container image. The annotation value MUST have a valid value for the os.version property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.os.features | Indicates mandatory operating system features required by the container image. The annotation value MUST have a valid value for the os.features property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.architecture | Indicates the architecture that binaries in the container image are built to run on. The annotation value MUST have a valid value for the architecture property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.variant | Indicates the variant of the architecture that binaries in the container image are built to run on. The annotation value MUST have a valid value for the variant property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.author | Indicates the author of the container image. The annotation value MUST have a valid value for the author property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.created | Indicates the date and time when the container image was created. The annotation value MUST have a valid value for the created property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
org.opencontainers.image.stopSignal | Indicates the signal that SHOULD be sent by container runtimes to kill the container. The annotation value MUST have a valid value for the config.StopSignal property as defined in the OCI image specification. This annotation SHOULD only be used in accordance with the OCI image specification's runtime conversion specification. |
All other keys in the org.opencontainers
namespace not specified in the above table are reserved and MUST NOT be used by subsequent specifications.
Runtimes MUST handle unknown annotation keys like any other unknown property.
Values MUST be strings. Values MAY be an empty string.
"annotations": {
"com.example.gpu-cores": "2"
}
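The annotation rules above can be checked mechanically before a bundle is handed to a runtime. The following is a minimal sketch of an author-side lint, not runtime behavior: a runtime must still treat unknown annotation keys like any other unknown property. The names `lint_annotations`, `DEFINED_KEYS`, and `RESERVED_NAMESPACE` are hypothetical and exist only for this illustration.

```python
# Hypothetical helper: flags annotations that break the rules in this section.
RESERVED_NAMESPACE = "org.opencontainers."

# Keys listed in the table above.
DEFINED_KEYS = frozenset({
    "org.opencontainers.image.os",
    "org.opencontainers.image.os.version",
    "org.opencontainers.image.os.features",
    "org.opencontainers.image.architecture",
    "org.opencontainers.image.variant",
    "org.opencontainers.image.author",
    "org.opencontainers.image.created",
    "org.opencontainers.image.stopSignal",
})


def lint_annotations(annotations):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    for key, value in annotations.items():
        # Values MUST be strings (an empty string is allowed).
        if not isinstance(value, str):
            problems.append(
                f"{key}: value must be a string, got {type(value).__name__}"
            )
        # Keys in the reserved namespace must be ones the table defines.
        if key.startswith(RESERVED_NAMESPACE) and key not in DEFINED_KEYS:
            problems.append(f"{key}: reserved namespace, key not defined in the table")
    return problems


if __name__ == "__main__":
    print(lint_annotations({"com.example.gpu-cores": "2"}))
    print(lint_annotations({"com.example.gpu-cores": 2}))
```

The sketch only checks the value type and the reserved namespace. It does not check that a value is valid for the corresponding OCI image specification property.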
Runtimes MAY log unknown properties but MUST otherwise ignore them. That includes not generating errors if they encounter an unknown property.
Runtimes MUST generate an error when invalid or unsupported values are encountered. Unless support for a valid value is explicitly required, runtimes MAY choose which subset of the valid values they will support.
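The two rules above pull in different directions: an unknown property is skipped without error, while a known property with an unsupported value is an error. A minimal sketch of that split follows. The function names, the set of known properties, and the supported subset of `rootfsPropagation` values are assumptions made for this illustration and are not taken from any real runtime.

```python
import logging

log = logging.getLogger("example-runtime")  # hypothetical logger name


def drop_unknown(config, known):
    """Keep known properties; log and otherwise ignore unknown ones."""
    kept = {}
    for name, value in config.items():
        if name in known:
            kept[name] = value
        else:
            # Logging is allowed, raising an error here is not.
            log.warning("ignoring unknown property %r", name)
    return kept


def require_supported(name, value, supported):
    """Raise an error for a value this runtime does not support."""
    if value not in supported:
        raise ValueError(f"{name}: unsupported value {value!r}")
    return value


if __name__ == "__main__":
    # Assumption: this illustrative runtime knows only these two properties.
    known = {"hostname", "rootfsPropagation"}
    config = {"hostname": "slartibartfast", "rootfsPropagation": "slave", "futureField": 1}
    cleaned = drop_unknown(config, known)
    # Assumption: this illustrative runtime supports only this subset of values.
    require_supported("rootfsPropagation", cleaned["rootfsPropagation"], {"slave", "private"})
    print(cleaned)
```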
Here is a full example config.json
for reference.
{
"ociVersion": "1.0.1",
"process": {
"terminal": true,
"user": {
"uid": 1,
"gid": 1,
"additionalGids": [
5,
6
]
},
"args": [
"sh"
],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"TERM=xterm"
],
"cwd": "/",
"capabilities": {
"bounding": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"permitted": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"inheritable": [
"CAP_AUDIT_WRITE",
"CAP_KILL",
"CAP_NET_BIND_SERVICE"
],
"effective": [
"CAP_AUDIT_WRITE",
"CAP_KILL"
],
"ambient": [
"CAP_NET_BIND_SERVICE"
]
},
"rlimits": [
{
"type": "RLIMIT_CORE",
"hard": 1024,
"soft": 1024
},
{
"type": "RLIMIT_NOFILE",
"hard": 1024,
"soft": 1024
}
],
"apparmorProfile": "acme_secure_profile",
"oomScoreAdj": 100,
"selinuxLabel": "system_u:system_r:svirt_lxc_net_t:s0:c124,c675",
"ioPriority": {
"class": "IOPRIO_CLASS_IDLE",
"priority": 4
},
"noNewPrivileges": true
},
"root": {
"path": "rootfs",
"readonly": true
},
"hostname": "slartibartfast",
"mounts": [
{
"destination": "/proc",
"type": "proc",
"source": "proc"
},
{
"destination": "/dev",
"type": "tmpfs",
"source": "tmpfs",
"options": [
"nosuid",
"strictatime",
"mode=755",
"size=65536k"
]
},
{
"destination": "/dev/pts",
"type": "devpts",
"source": "devpts",
"options": [
"nosuid",
"noexec",
"newinstance",
"ptmxmode=0666",
"mode=0620",
"gid=5"
]
},
{
"destination": "/dev/shm",
"type": "tmpfs",
"source": "shm",
"options": [
"nosuid",
"noexec",
"nodev",
"mode=1777",
"size=65536k"
]
},
{
"destination": "/dev/mqueue",
"type": "mqueue",
"source": "mqueue",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys",
"type": "sysfs",
"source": "sysfs",
"options": [
"nosuid",
"noexec",
"nodev"
]
},
{
"destination": "/sys/fs/cgroup",
"type": "cgroup",
"source": "cgroup",
"options": [
"nosuid",
"noexec",
"nodev",
"relatime",
"ro"
]
}
],
"hooks": {
"prestart": [
{
"path": "/usr/bin/fix-mounts",
"args": [
"fix-mounts",
"arg1",
"arg2"
],
"env": [
"key1=value1"
]
},
{
"path": "/usr/bin/setup-network"
}
],
"poststart": [
{
"path": "/usr/bin/notify-start",
"timeout": 5
}
],
"poststop": [
{
"path": "/usr/sbin/cleanup.sh",
"args": [
"cleanup.sh",
"-f"
]
}
]
},
"linux": {
"devices": [
{
"path": "/dev/fuse",
"type": "c",
"major": 10,
"minor": 229,
"fileMode": 438,
"uid": 0,
"gid": 0
},
{
"path": "/dev/sda",
"type": "b",
"major": 8,
"minor": 0,
"fileMode": 432,
"uid": 0,
"gid": 0
}
],
"uidMappings": [
{
"containerID": 0,
"hostID": 1000,
"size": 32000
}
],
"gidMappings": [
{
"containerID": 0,
"hostID": 1000,
"size": 32000
}
],
"sysctl": {
"net.ipv4.ip_forward": "1",
"net.core.somaxconn": "256"
},
"cgroupsPath": "/myRuntime/myContainer",
"resources": {
"network": {
"classID": 1048577,
"priorities": [
{
"name": "eth0",
"priority": 500
},
{
"name": "eth1",
"priority": 1000
}
]
},
"pids": {
"limit": 32771
},
"hugepageLimits": [
{
"pageSize": "2MB",
"limit": 9223372036854772000
},
{
"pageSize": "64KB",
"limit": 1000000
}
],
"memory": {
"limit": 536870912,
"reservation": 536870912,
"swap": 536870912,
"kernel": -1,
"kernelTCP": -1,
"swappiness": 0,
"disableOOMKiller": false
},
"cpu": {
"shares": 1024,
"quota": 1000000,
"period": 500000,
"realtimeRuntime": 950000,
"realtimePeriod": 1000000,
"cpus": "2-3",
"idle": 1,
"mems": "0-7"
},
"devices": [
{
"allow": false,
"access": "rwm"
},
{
"allow": true,
"type": "c",
"major": 10,
"minor": 229,
"access": "rw"
},
{
"allow": true,
"type": "b",
"major": 8,
"minor": 0,
"access": "r"
}
],
"blockIO": {
"weight": 10,
"leafWeight": 10,
"weightDevice": [
{
"major": 8,
"minor": 0,
"weight": 500,
"leafWeight": 300
},
{
"major": 8,
"minor": 16,
"weight": 500
}
],
"throttleReadBpsDevice": [
{
"major": 8,
"minor": 0,
"rate": 600
}
],
"throttleWriteIOPSDevice": [
{
"major": 8,
"minor": 16,
"rate": 300
}
]
}
},
"rootfsPropagation": "slave",
"seccomp": {
"defaultAction": "SCMP_ACT_ALLOW",
"architectures": [
"SCMP_ARCH_X86",
"SCMP_ARCH_X32"
],
"syscalls": [
{
"names": [
"getcwd",
"chmod"
],
"action": "SCMP_ACT_ERRNO"
}
]
},
"timeOffsets": {
"monotonic": {
"secs": 172800,
"nanosecs": 0
},
"boottime": {
"secs": 604800,
"nanosecs": 0
}
},
"namespaces": [
{
"type": "pid"
},
{
"type": "network"
},
{
"type": "ipc"
},
{
"type": "uts"
},
{
"type": "mount"
},
{
"type": "user"
},
{
"type": "cgroup"
},
{
"type": "time"
}
],
"maskedPaths": [
"/proc/kcore",
"/proc/latency_stats",
"/proc/timer_stats",
"/proc/sched_debug"
],
"readonlyPaths": [
"/proc/asound",
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
],
"mountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c715,c811"
},
"annotations": {
"com.example.key1": "value1",
"com.example.key2": "value2"
}
}
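JSON has no octal literals, so the fileMode values in the example above are written in decimal. Reading them as octal permission bits makes them easier to recognize: 438 is 0666 and 432 is 0660. The short sketch below does that conversion; the helper name `mode_as_octal` is hypothetical.

```python
import json

# Device entries copied from the example above, reduced to the relevant fields.
DEVICES_JSON = """
[
  {"path": "/dev/fuse", "fileMode": 438},
  {"path": "/dev/sda", "fileMode": 432}
]
"""


def mode_as_octal(mode):
    """Format a decimal file mode as a zero-padded octal string."""
    return format(mode, "04o")


if __name__ == "__main__":
    for device in json.loads(DEVICES_JSON):
        print(device["path"], mode_as_octal(device["fileMode"]))
```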
Footnotes
- Corresponds to mount(8) (filesystem-independent).
- Corresponds to bind mounts and shared subtrees.
- These AT_RECURSIVE options need kernel 5.12 or later. See mount_setattr(2).