|  | .. SPDX-License-Identifier: GPL-2.0 | 
|  |  | 
|  | Idmappings | 
|  | ========== | 
|  |  | 
|  | Most filesystem developers will have encountered idmappings. They are used when | 
|  | reading from or writing ownership to disk, reporting ownership to userspace, or | 
|  | for permission checking. This document is aimed at filesystem developers who | 
|  | want to know how idmappings work. | 
|  |  | 
|  | Formal notes | 
|  | ------------ | 
|  |  | 
|  | An idmapping is essentially a translation of a range of ids into another or the | 
|  | same range of ids. The notational convention for idmappings that is widely used | 
|  | in userspace is:: | 
|  |  | 
|  | u:k:r | 
|  |  | 
|  | ``u`` indicates the first element in the upper idmapset ``U`` and ``k`` | 
|  | indicates the first element in the lower idmapset ``K``. The ``r`` parameter | 
|  | indicates the range of the idmapping, i.e. how many ids are mapped. From now | 
|  | on, we will always prefix ids with ``u`` or ``k`` to make it clear whether | 
|  | we're talking about an id in the upper or lower idmapset. | 
|  |  | 
|  | To see what this looks like in practice, let's take the following idmapping:: | 
|  |  | 
|  | u22:k10000:r3 | 
|  |  | 
|  | and write down the mappings it will generate:: | 
|  |  | 
|  | u22 -> k10000 | 
|  | u23 -> k10001 | 
|  | u24 -> k10002 | 
|  |  | 
|  | From a mathematical viewpoint ``U`` and ``K`` are well-ordered sets and an | 
|  | idmapping is an order isomorphism from ``U`` into ``K``. So ``U`` and ``K`` are | 
|  | order isomorphic. In fact, ``U`` and ``K`` are always well-ordered subsets of | 
|  | the set of all possible ids usable on a given system. | 
|  |  | 
|  | Looking at this mathematically briefly will help us highlight some properties | 
|  | that make it easier to understand how we can translate between idmappings. For | 
|  | example, we know that the inverse idmapping is an order isomorphism as well:: | 
|  |  | 
|  | k10000 -> u22 | 
|  | k10001 -> u23 | 
|  | k10002 -> u24 | 
|  |  | 
|  | Given that we are dealing with order isomorphisms, plus the fact that we're | 
|  | dealing with subsets, we can embed idmappings into each other, i.e. we can | 
|  | sensibly translate between different idmappings. For example, assume we've been | 
|  | given the three idmappings:: | 
|  |  | 
|  | 1. u0:k10000:r10000 | 
|  | 2. u0:k20000:r10000 | 
|  | 3. u0:k30000:r10000 | 
|  |  | 
|  | and id ``k11000`` which has been generated by the first idmapping by mapping | 
|  | ``u1000`` from the upper idmapset down to ``k11000`` in the lower idmapset. | 
|  |  | 
|  | Because we're dealing with order isomorphic subsets it is meaningful to ask | 
|  | what id ``k11000`` corresponds to in the second or third idmapping. The | 
|  | straightforward algorithm to use is to apply the inverse of the first idmapping, | 
|  | mapping ``k11000`` up to ``u1000``. Afterwards, we can map ``u1000`` down using | 
|  | either the second or the third idmapping. The second idmapping would map | 
|  | ``u1000`` down to ``k21000``. The third idmapping would map ``u1000`` down to | 
|  | ``k31000``. | 
|  |  | 
|  | If we were given the same task for the following three idmappings:: | 
|  |  | 
|  | 1. u0:k10000:r10000 | 
|  | 2. u0:k20000:r200 | 
|  | 3. u0:k30000:r300 | 
|  |  | 
|  | we would fail to translate as the sets aren't order isomorphic over the full | 
|  | range of the first idmapping anymore. (However, they are order isomorphic over | 
|  | the full range of the second idmapping.) Neither the second nor the third | 
|  | idmapping contains ``u1000`` in the upper idmapset ``U``. This is equivalent to | 
|  | not having an id mapped. We can simply say that ``u1000`` is unmapped in the | 
|  | second and third idmapping. The kernel will report unmapped ids as the | 
|  | overflowuid ``(uid_t)-1`` or overflowgid ``(gid_t)-1`` to userspace. | 
|  |  | 
|  | The algorithm to calculate what a given id maps to is pretty simple. First, we | 
|  | need to verify that the range can contain our target id. We will skip this step | 
|  | for simplicity. After that if we want to know what ``id`` maps to we can do | 
|  | simple calculations: | 
|  |  | 
|  | - If we want to map from left to right:: | 
|  |  | 
|  | u:k:r | 
|  | id - u + k = n | 
|  |  | 
|  | - If we want to map from right to left:: | 
|  |  | 
|  | u:k:r | 
|  | id - k + u = n | 
|  |  | 
|  | Instead of "left to right" we can also say "down" and instead of "right to | 
|  | left" we can also say "up". Obviously mapping down and up invert each other. | 
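
As a rough illustration, the two formulas can be written down as a small
userspace C model. This is only a sketch: ``struct idmap_model``,
``in_range()``, ``map_down()``, ``map_up()`` and ``ID_UNMAPPED`` are names made
up for this illustration and are not kernel interfaces. Unlike the text above,
the sketch does perform the range check and returns ``ID_UNMAPPED`` (modelling
``(uid_t)-1``) when the id is not covered by the idmapping:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for "unmapped"; the document writes this as (uid_t)-1. */
#define ID_UNMAPPED UINT32_MAX

/* Hypothetical model of one u:k:r idmapping; not a kernel structure. */
struct idmap_model {
    uint32_t u; /* first id in the upper idmapset */
    uint32_t k; /* first id in the lower idmapset */
    uint32_t r; /* number of ids mapped */
};

/* The range check the text skips: is id one of the r ids starting at first? */
static bool in_range(uint32_t first, uint32_t r, uint32_t id)
{
    return id >= first && id - first < r;
}

/* Left to right ("down"): id - u + k = n */
static uint32_t map_down(const struct idmap_model *m, uint32_t id)
{
    assert(m != NULL);
    if (!in_range(m->u, m->r, id))
        return ID_UNMAPPED;
    return id - m->u + m->k;
}

/* Right to left ("up"): id - k + u = n */
static uint32_t map_up(const struct idmap_model *m, uint32_t id)
{
    assert(m != NULL);
    if (!in_range(m->k, m->r, id))
        return ID_UNMAPPED;
    return id - m->k + m->u;
}
```

With this model, ``map_down()`` on ``u22:k10000:r3`` turns ``22`` into
``10000``, while ``25`` falls outside the range and yields ``ID_UNMAPPED``.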
|  |  | 
|  | To see whether the simple formulas above work, consider the following two | 
|  | idmappings:: | 
|  |  | 
|  | 1. u0:k20000:r10000 | 
|  | 2. u500:k30000:r10000 | 
|  |  | 
|  | Assume we are given ``k21000`` in the lower idmapset of the first idmapping. We | 
|  | want to know what id this was mapped from in the upper idmapset of the first | 
|  | idmapping. So we're mapping up in the first idmapping:: | 
|  |  | 
|  | id     - k      + u  = n | 
|  | k21000 - k20000 + u0 = u1000 | 
|  |  | 
|  | Now assume we are given the id ``u1100`` in the upper idmapset of the second | 
|  | idmapping and we want to know what this id maps down to in the lower idmapset | 
|  | of the second idmapping. This means we're mapping down in the second | 
|  | idmapping:: | 
|  |  | 
|  | id    - u    + k      = n | 
|  | u1100 - u500 + k30000 = k30600 | 
|  |  | 
|  | General notes | 
|  | ------------- | 
|  |  | 
|  | In the context of the kernel an idmapping can be interpreted as mapping a range | 
|  | of userspace ids into a range of kernel ids:: | 
|  |  | 
|  | userspace-id:kernel-id:range | 
|  |  | 
|  | A userspace id is always an element in the upper idmapset of an idmapping of | 
|  | type ``uid_t`` or ``gid_t`` and a kernel id is always an element in the lower | 
|  | idmapset of an idmapping of type ``kuid_t`` or ``kgid_t``. From now on | 
|  | "userspace id" will be used to refer to the well known ``uid_t`` and ``gid_t`` | 
|  | types and "kernel id" will be used to refer to ``kuid_t`` and ``kgid_t``. | 
|  |  | 
|  | The kernel is mostly concerned with kernel ids. They are used when performing | 
|  | permission checks and are stored in an inode's ``i_uid`` and ``i_gid`` field. | 
|  | A userspace id on the other hand is an id that is reported to userspace by the | 
|  | kernel, or is passed by userspace to the kernel, or a raw device id that is | 
|  | written or read from disk. | 
|  |  | 
|  | Note that we are only concerned with idmappings as the kernel stores them, not | 
|  | with how userspace would specify them. | 
|  |  | 
|  | For the rest of this document we will prefix all userspace ids with ``u`` and | 
|  | all kernel ids with ``k``. Ranges of idmappings will be prefixed with ``r``. So | 
|  | an idmapping will be written as ``u0:k10000:r10000``. | 
|  |  | 
|  | For example, within this idmapping, the id ``u1000`` is an id in the upper | 
|  | idmapset or "userspace idmapset" starting with ``u0``. And it is mapped to | 
|  | ``k11000`` which is a kernel id in the lower idmapset or "kernel idmapset" | 
|  | starting with ``k10000``. | 
|  |  | 
|  | A kernel id is always created by an idmapping. Such idmappings are associated | 
|  | with user namespaces. Since we mainly care about how idmappings work we're not | 
|  | going to be concerned with how idmappings are created nor how they are used | 
|  | outside of the filesystem context. This is best left to an explanation of user | 
|  | namespaces. | 
|  |  | 
|  | The initial user namespace is special. It always has an idmapping of the | 
|  | following form:: | 
|  |  | 
|  | u0:k0:r4294967295 | 
|  |  | 
|  | which is an identity idmapping over the full range of ids available on this | 
|  | system. | 
|  |  | 
|  | Other user namespaces usually have non-identity idmappings such as:: | 
|  |  | 
|  | u0:k10000:r10000 | 
|  |  | 
|  | When a process creates or wants to change ownership of a file, or when the | 
|  | ownership of a file is read from disk by a filesystem, the userspace id is | 
|  | immediately translated into a kernel id according to the idmapping associated | 
|  | with the relevant user namespace. | 
|  |  | 
|  | For instance, consider a file that is stored on disk by a filesystem as being | 
|  | owned by ``u1000``: | 
|  |  | 
|  | - If a filesystem were to be mounted in the initial user namespace (as most | 
|  | filesystems are) then the initial idmapping will be used. As we saw this is | 
|  | simply the identity idmapping. This would mean id ``u1000`` read from disk | 
|  | would be mapped to id ``k1000``. So an inode's ``i_uid`` and ``i_gid`` field | 
|  | would contain ``k1000``. | 
|  |  | 
|  | - If a filesystem were to be mounted with an idmapping of ``u0:k10000:r10000`` | 
|  | then ``u1000`` read from disk would be mapped to ``k11000``. So an inode's | 
|  | ``i_uid`` and ``i_gid`` would contain ``k11000``. | 
|  |  | 
|  | Translation algorithms | 
|  | ---------------------- | 
|  |  | 
|  | We've already seen briefly that it is possible to translate between different | 
|  | idmappings. We'll now take a closer look at how that works. | 
|  |  | 
|  | Crossmapping | 
|  | ~~~~~~~~~~~~ | 
|  |  | 
|  | This translation algorithm is used by the kernel in quite a few places. For | 
|  | example, it is used when reporting back the ownership of a file to userspace | 
|  | via the ``stat()`` system call family. | 
|  |  | 
|  | If we've been given ``k11000`` from one idmapping we can map that id up in | 
|  | another idmapping. In order for this to work both idmappings need to contain | 
|  | the same kernel id in their kernel idmapsets. For example, consider the | 
|  | following idmappings:: | 
|  |  | 
|  | 1. u0:k10000:r10000 | 
|  | 2. u20000:k10000:r10000 | 
|  |  | 
|  | and we are mapping ``u1000`` down to ``k11000`` in the first idmapping. We can | 
|  | then translate ``k11000`` into a userspace id in the second idmapping using the | 
|  | kernel idmapset of the second idmapping:: | 
|  |  | 
|  | /* Map the kernel id up into a userspace id in the second idmapping. */ | 
|  | from_kuid(u20000:k10000:r10000, k11000) = u21000 | 
|  |  | 
|  | Note how we can get back to the kernel id in the first idmapping by inverting | 
|  | the algorithm:: | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the second idmapping. */ | 
|  | make_kuid(u20000:k10000:r10000, u21000) = k11000 | 
|  |  | 
|  | /* Map the kernel id up into a userspace id in the first idmapping. */ | 
|  | from_kuid(u0:k10000:r10000, k11000) = u1000 | 
|  |  | 
|  | This algorithm allows us to answer the question of what userspace id a given | 
|  | kernel id corresponds to in a given idmapping. In order to be able to answer | 
|  | this question both idmappings need to contain the same kernel id in their | 
|  | respective kernel idmapsets. | 
|  |  | 
|  | For example, when the kernel reads a raw userspace id from disk it maps it down | 
|  | into a kernel id according to the idmapping associated with the filesystem. | 
|  | Let's assume the filesystem was mounted with an idmapping of | 
|  | ``u0:k20000:r10000`` and it reads a file owned by ``u1000`` from disk. This | 
|  | means ``u1000`` will be mapped to ``k21000`` which is what will be stored in | 
|  | the inode's ``i_uid`` and ``i_gid`` field. | 
|  |  | 
|  | When someone in userspace calls ``stat()`` or a related function to get | 
|  | ownership information about the file the kernel can't simply map the id back up | 
|  | according to the filesystem's idmapping as this would give the wrong owner if | 
|  | the caller is using an idmapping. | 
|  |  | 
|  | So the kernel will map the id back up in the idmapping of the caller. Let's | 
|  | assume the caller has the somewhat unconventional idmapping | 
|  | ``u3000:k20000:r10000``. Then ``k21000`` would map back up to ``u4000``. | 
|  | Consequently the user would see that this file is owned by ``u4000``. | 
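
The ``stat()`` case can be sketched in userspace C as follows. The helpers
``model_make_kuid()`` and ``model_from_kuid()`` only mimic the arithmetic of
``make_kuid()`` and ``from_kuid()`` for a single ``u:k:r`` range. They, along
with ``struct idmap_model``, ``crossmap()`` and ``ID_UNMAPPED``, are
hypothetical names and not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_UNMAPPED UINT32_MAX /* models the document's (uid_t)-1 */

/* Hypothetical single-range u:k:r idmapping; not a kernel structure. */
struct idmap_model {
    uint32_t u, k, r;
};

/* Model of make_kuid(): map a userspace id down into a kernel id. */
static uint32_t model_make_kuid(const struct idmap_model *m, uint32_t uid)
{
    assert(m != NULL);
    if (uid < m->u || uid - m->u >= m->r)
        return ID_UNMAPPED;
    return uid - m->u + m->k;
}

/* Model of from_kuid(): map a kernel id up into a userspace id. */
static uint32_t model_from_kuid(const struct idmap_model *m, uint32_t kuid)
{
    assert(m != NULL);
    if (kuid < m->k || kuid - m->k >= m->r)
        return ID_UNMAPPED;
    return kuid - m->k + m->u;
}

/*
 * Crossmapping: map the on-disk id down in the filesystem's idmapping,
 * then map the resulting kernel id up in the caller's idmapping.
 */
static uint32_t crossmap(const struct idmap_model *fs,
                         const struct idmap_model *caller,
                         uint32_t disk_uid)
{
    uint32_t kuid = model_make_kuid(fs, disk_uid);

    if (kuid == ID_UNMAPPED)
        return ID_UNMAPPED;
    return model_from_kuid(caller, kuid);
}
```

For the filesystem idmapping ``u0:k20000:r10000`` and the caller idmapping
``u3000:k20000:r10000``, ``crossmap()`` turns the on-disk id ``1000`` into
``4000``, matching the example above.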
|  |  | 
|  | Remapping | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | It is possible to translate a kernel id from one idmapping to another one via | 
|  | the userspace idmapset of the two idmappings. This is equivalent to remapping | 
|  | a kernel id. | 
|  |  | 
|  | Let's look at an example. We are given the following two idmappings:: | 
|  |  | 
|  | 1. u0:k10000:r10000 | 
|  | 2. u0:k20000:r10000 | 
|  |  | 
|  | and we are given ``k11000`` in the first idmapping. In order to translate this | 
|  | kernel id in the first idmapping into a kernel id in the second idmapping we | 
|  | need to perform two steps: | 
|  |  | 
|  | 1. Map the kernel id up into a userspace id in the first idmapping:: | 
|  |  | 
|  | /* Map the kernel id up into a userspace id in the first idmapping. */ | 
|  | from_kuid(u0:k10000:r10000, k11000) = u1000 | 
|  |  | 
|  | 2. Map the userspace id down into a kernel id in the second idmapping:: | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the second idmapping. */ | 
|  | make_kuid(u0:k20000:r10000, u1000) = k21000 | 
|  |  | 
|  | As you can see we used the userspace idmapset in both idmappings to translate | 
|  | the kernel id in one idmapping to a kernel id in another idmapping. | 
|  |  | 
|  | This allows us to answer the question of what kernel id we would need to use to | 
|  | get the same userspace id in another idmapping. In order to be able to answer | 
|  | this question both idmappings need to contain the same userspace id in their | 
|  | respective userspace idmapsets. | 
|  |  | 
|  | Note how we can easily get back to the kernel id in the first idmapping by | 
|  | inverting the algorithm: | 
|  |  | 
|  | 1. Map the kernel id up into a userspace id in the second idmapping:: | 
|  |  | 
|  | /* Map the kernel id up into a userspace id in the second idmapping. */ | 
|  | from_kuid(u0:k20000:r10000, k21000) = u1000 | 
|  |  | 
|  | 2. Map the userspace id down into a kernel id in the first idmapping:: | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the first idmapping. */ | 
|  | make_kuid(u0:k10000:r10000, u1000) = k11000 | 
|  |  | 
|  | Another way to look at this translation is to treat it as inverting one | 
|  | idmapping and applying another idmapping if both idmappings have the relevant | 
|  | userspace id mapped. This will come in handy when working with idmapped mounts. | 
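
The two remapping steps can be sketched as a small userspace C model. As
before, this only mimics the arithmetic for single-range idmappings.
``remap()``, the ``model_*()`` helpers, ``struct idmap_model`` and
``ID_UNMAPPED`` are hypothetical names chosen for this illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ID_UNMAPPED UINT32_MAX /* models the document's (uid_t)-1 */

/* Hypothetical single-range u:k:r idmapping; not a kernel structure. */
struct idmap_model {
    uint32_t u, k, r;
};

/* Model of make_kuid(): map a userspace id down into a kernel id. */
static uint32_t model_make_kuid(const struct idmap_model *m, uint32_t uid)
{
    assert(m != NULL);
    if (uid < m->u || uid - m->u >= m->r)
        return ID_UNMAPPED;
    return uid - m->u + m->k;
}

/* Model of from_kuid(): map a kernel id up into a userspace id. */
static uint32_t model_from_kuid(const struct idmap_model *m, uint32_t kuid)
{
    assert(m != NULL);
    if (kuid < m->k || kuid - m->k >= m->r)
        return ID_UNMAPPED;
    return kuid - m->k + m->u;
}

/*
 * Remapping: map the kernel id up in the first idmapping, then map the
 * resulting userspace id down in the second idmapping.
 */
static uint32_t remap(const struct idmap_model *first,
                      const struct idmap_model *second, uint32_t kuid)
{
    uint32_t uid = model_from_kuid(first, kuid);

    if (uid == ID_UNMAPPED)
        return ID_UNMAPPED;
    return model_make_kuid(second, uid);
}
```

Remapping ``11000`` from ``u0:k10000:r10000`` to ``u0:k20000:r10000`` yields
``21000``. If the second idmapping only covered ``r200``, the result would be
``ID_UNMAPPED`` since ``u1000`` is not mapped there.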
|  |  | 
|  | Invalid translations | 
|  | ~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | It is never valid to use an id in the kernel idmapset of one idmapping as the | 
|  | id in the userspace idmapset of another or the same idmapping. While the kernel | 
|  | idmapset always indicates an idmapset in the kernel id space the userspace | 
|  | idmapset indicates a userspace id. So the following translations are forbidden:: | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the first idmapping. */ | 
|  | make_kuid(u0:k10000:r10000, u1000) = k11000 | 
|  |  | 
|  | /* INVALID: Map the kernel id down into a kernel id in the second idmapping. */ | 
|  | make_kuid(u10000:k20000:r10000, k11000) = k21000 | 
|  | ~~~~~~ | 
|  |  | 
|  | and equally wrong:: | 
|  |  | 
|  | /* Map the kernel id up into a userspace id in the first idmapping. */ | 
|  | from_kuid(u0:k10000:r10000, k11000) = u1000 | 
|  |  | 
|  | /* INVALID: Map the userspace id up into a userspace id in the second idmapping. */ | 
|  | from_kuid(u20000:k0:r10000, u1000) = u21000 | 
|  | ~~~~~ | 
|  |  | 
|  | Since userspace ids have type ``uid_t`` and ``gid_t`` and kernel ids have type | 
|  | ``kuid_t`` and ``kgid_t`` the compiler will throw an error when they are | 
|  | conflated. So the two examples above would cause a compilation failure. | 
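
The idea behind this type safety can be sketched outside the kernel. Assuming,
for illustration, that the kernel id is wrapped in a single-member struct while
the userspace id stays a plain integer, conflating the two no longer compiles.
All names containing ``model`` below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain integer, standing in for uid_t. */
typedef uint32_t uid_model_t;

/* Single-member struct, standing in for kuid_t. */
typedef struct {
    uint32_t val;
} kuid_model_t;

/* Hypothetical single-range u:k:r idmapping; not a kernel structure. */
struct idmap_model {
    uint32_t u, k, r;
};

/* Model of make_kuid(): takes a userspace id, returns a kernel id. */
static kuid_model_t model_make_kuid(const struct idmap_model *m,
                                    uid_model_t uid)
{
    kuid_model_t kuid = { UINT32_MAX }; /* unmapped by default */

    assert(m != NULL);
    if (uid >= m->u && uid - m->u < m->r)
        kuid.val = uid - m->u + m->k;
    return kuid;
}

/* Model of from_kuid(): takes a kernel id, returns a userspace id. */
static uid_model_t model_from_kuid(const struct idmap_model *m,
                                   kuid_model_t kuid)
{
    assert(m != NULL);
    if (kuid.val < m->k || kuid.val - m->k >= m->r)
        return UINT32_MAX; /* unmapped */
    return kuid.val - m->k + m->u;
}

/*
 * Both of the following calls are rejected by a C compiler, because a
 * struct cannot be passed where an integer is expected and vice versa:
 *
 *   model_make_kuid(&second, kuid);   kernel id passed as userspace id
 *   model_from_kuid(&second, uid);    userspace id passed as kernel id
 */
```

The valid direction still works as expected in this model: mapping ``1000``
down in ``u0:k10000:r10000`` gives a ``kuid_model_t`` holding ``11000``, which
maps back up to ``1000``.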
|  |  | 
|  | Idmappings when creating filesystem objects | 
|  | ------------------------------------------- | 
|  |  | 
|  | The concepts of mapping an id down or mapping an id up are expressed in the two | 
|  | kernel functions filesystem developers are rather familiar with and which we've | 
|  | already used in this document:: | 
|  |  | 
|  | /* Map the userspace id down into a kernel id. */ | 
|  | make_kuid(idmapping, uid) | 
|  |  | 
|  | /* Map the kernel id up into a userspace id. */ | 
|  | from_kuid(idmapping, kuid) | 
|  |  | 
|  | We will take an abbreviated look into how idmappings figure into creating | 
|  | filesystem objects. For simplicity we will only look at what happens when the | 
|  | VFS has already completed path lookup right before it calls into the filesystem | 
|  | itself. So we're concerned with what happens when e.g. ``vfs_mkdir()`` is | 
|  | called. We will also assume that the directory we're creating filesystem | 
|  | objects in is readable and writable for everyone. | 
|  |  | 
|  | When creating a filesystem object the kernel will look at the caller's | 
|  | filesystem ids. These are just regular ``uid_t`` and ``gid_t`` userspace ids | 
|  | but they are exclusively used when determining file ownership which is why they | 
|  | are called "filesystem ids". They are usually identical to the uid and gid of | 
|  | the caller but can differ. We will just assume they are always identical to not | 
|  | get lost in too many details. | 
|  |  | 
|  | When the caller enters the kernel two things happen: | 
|  |  | 
|  | 1. Map the caller's userspace ids down into kernel ids in the caller's | 
|  | idmapping. | 
|  | (To be precise, the kernel will simply look at the kernel ids stashed in the | 
|  | credentials of the current task but for our education we'll pretend this | 
|  | translation happens just in time.) | 
|  | 2. Verify that the caller's kernel ids can be mapped up to userspace ids in the | 
|  | filesystem's idmapping. | 
|  |  | 
|  | The second step is important as a regular filesystem will ultimately need to map | 
|  | the kernel id back up into a userspace id when writing to disk. | 
|  | So with the second step the kernel guarantees that a valid userspace id can be | 
|  | written to disk. If it can't the kernel will refuse the creation request to not | 
|  | even remotely risk filesystem corruption. | 
|  |  | 
|  | The astute reader will have realized that this is simply a variation of the | 
|  | crossmapping algorithm we mentioned above in a previous section. First, the | 
|  | kernel maps the caller's userspace id down into a kernel id according to the | 
|  | caller's idmapping and then maps that kernel id up according to the | 
|  | filesystem's idmapping. | 
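
A minimal userspace sketch of these two steps might look as follows. It is not
how the kernel implements the check. ``may_create()``, the ``model_*()``
helpers, ``struct idmap_model`` and ``ID_UNMAPPED`` are hypothetical names, and
each idmapping is assumed to consist of a single ``u:k:r`` range:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ID_UNMAPPED UINT32_MAX /* models the document's (uid_t)-1 */

/* Hypothetical single-range u:k:r idmapping; not a kernel structure. */
struct idmap_model {
    uint32_t u, k, r;
};

/* Model of make_kuid(): map a userspace id down into a kernel id. */
static uint32_t model_make_kuid(const struct idmap_model *m, uint32_t uid)
{
    assert(m != NULL);
    if (uid < m->u || uid - m->u >= m->r)
        return ID_UNMAPPED;
    return uid - m->u + m->k;
}

/* Model of from_kuid(): map a kernel id up into a userspace id. */
static uint32_t model_from_kuid(const struct idmap_model *m, uint32_t kuid)
{
    assert(m != NULL);
    if (kuid < m->k || kuid - m->k >= m->r)
        return ID_UNMAPPED;
    return kuid - m->k + m->u;
}

/*
 * Step 1: map the caller's userspace id down in the caller's idmapping.
 * Step 2: verify the kernel id maps up in the filesystem's idmapping.
 * On success *disk_uid holds the userspace id that would land on disk.
 */
static bool may_create(const struct idmap_model *caller,
                       const struct idmap_model *fs,
                       uint32_t caller_uid, uint32_t *disk_uid)
{
    uint32_t kuid = model_make_kuid(caller, caller_uid);
    uint32_t uid;

    if (kuid == ID_UNMAPPED)
        return false;

    uid = model_from_kuid(fs, kuid);
    if (uid == ID_UNMAPPED)
        return false;

    *disk_uid = uid;
    return true;
}
```

When caller and filesystem both use ``u0:k0:r4294967295`` the check passes and
``u1000`` lands on disk. For a caller with ``u0:k10000:r10000`` and a
filesystem with ``u0:k20000:r10000`` the second step fails and the request is
refused.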
|  |  | 
|  | From an implementation point of view it's worth mentioning how idmappings are represented. | 
|  | All idmappings are taken from the corresponding user namespace. | 
|  |  | 
|  | - caller's idmapping (usually taken from ``current_user_ns()``) | 
|  | - filesystem's idmapping (``sb->s_user_ns``) | 
|  | - mount's idmapping (``mnt_idmap(vfsmnt)``) | 
|  |  | 
|  | Let's see some examples with caller/filesystem idmapping but without mount | 
|  | idmappings. This will exhibit some problems we can hit. After that we will | 
|  | revisit/reconsider these examples, this time using mount idmappings, to see how | 
|  | they can solve the problems we observed before. | 
|  |  | 
|  | Example 1 | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | caller id:            u1000 | 
|  | caller idmapping:     u0:k0:r4294967295 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  |  | 
|  | Both the caller and the filesystem use the identity idmapping: | 
|  |  | 
|  | 1. Map the caller's userspace ids into kernel ids in the caller's idmapping:: | 
|  |  | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 2. Verify that the caller's kernel ids can be mapped to userspace ids in the | 
|  | filesystem's idmapping. | 
|  |  | 
|  | For this second step the kernel will call the function | 
|  | ``fsuidgid_has_mapping()`` which ultimately boils down to calling | 
|  | ``from_kuid()``:: | 
|  |  | 
|  | from_kuid(u0:k0:r4294967295, k1000) = u1000 | 
|  |  | 
|  | In this example both idmappings are the same so there's nothing exciting going | 
|  | on. Ultimately the userspace id that lands on disk will be ``u1000``. | 
|  |  | 
|  | Example 2 | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | caller id:            u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k20000:r10000 | 
|  |  | 
|  | 1. Map the caller's userspace ids down into kernel ids in the caller's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k10000:r10000, u1000) = k11000 | 
|  |  | 
|  | 2. Verify that the caller's kernel ids can be mapped up to userspace ids in the | 
|  | filesystem's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k20000:r10000, k11000) = u-1 | 
|  |  | 
|  | It's immediately clear that while the caller's userspace id could be | 
|  | successfully mapped down into kernel ids in the caller's idmapping the kernel | 
|  | ids could not be mapped up according to the filesystem's idmapping. So the | 
|  | kernel will deny this creation request. | 
|  |  | 
|  | Note that while this example is less common, because most filesystems can't be | 
|  | mounted with non-initial idmappings, this is a general problem as we can see in | 
|  | the next examples. | 
|  |  | 
|  | Example 3 | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | caller id:            u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  |  | 
|  | 1. Map the caller's userspace ids down into kernel ids in the caller's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k10000:r10000, u1000) = k11000 | 
|  |  | 
|  | 2. Verify that the caller's kernel ids can be mapped up to userspace ids in the | 
|  | filesystem's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k0:r4294967295, k11000) = u11000 | 
|  |  | 
|  | We can see that the translation always succeeds. The userspace id that the | 
|  | filesystem will ultimately put to disk will always be identical to the value of | 
|  | the kernel id that was created in the caller's idmapping. This has mainly two | 
|  | consequences. | 
|  |  | 
|  | First, that we can't allow a caller to ultimately write to disk with another | 
|  | userspace id. We could only do this if we were to mount the whole filesystem | 
|  | with the caller's or another idmapping. But that solution is limited to a few | 
|  | filesystems and not very flexible. But this is a use-case that is pretty | 
|  | important in containerized workloads. | 
|  |  | 
|  | Second, the caller will usually not be able to create any files or access | 
|  | directories that have stricter permissions because none of the filesystem's | 
|  | kernel ids map up into valid userspace ids in the caller's idmapping: | 
|  |  | 
|  | 1. Map raw userspace ids down to kernel ids in the filesystem's idmapping:: | 
|  |  | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 2. Map kernel ids up to userspace ids in the caller's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k10000:r10000, k1000) = u-1 | 
|  |  | 
|  | Example 4 | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | file id:              u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  |  | 
|  | In order to report ownership to userspace the kernel uses the crossmapping | 
|  | algorithm introduced in a previous section: | 
|  |  | 
|  | 1. Map the userspace id on disk down into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 2. Map the kernel id up into a userspace id in the caller's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k10000:r10000, k1000) = u-1 | 
|  |  | 
|  | The crossmapping algorithm fails in this case because the kernel id in the | 
|  | filesystem idmapping cannot be mapped up to a userspace id in the caller's | 
|  | idmapping. Thus, the kernel will report the ownership of this file as the | 
|  | overflowid. | 
|  |  | 
|  | Example 5 | 
|  | ~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | file id:              u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k20000:r10000 | 
|  |  | 
|  | In order to report ownership to userspace the kernel uses the crossmapping | 
|  | algorithm introduced in a previous section: | 
|  |  | 
|  | 1. Map the userspace id on disk down into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k20000:r10000, u1000) = k21000 | 
|  |  | 
|  | 2. Map the kernel id up into a userspace id in the caller's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k10000:r10000, k21000) = u-1 | 
|  |  | 
|  | Again, the crossmapping algorithm fails in this case because the kernel id in | 
|  | the filesystem idmapping cannot be mapped to a userspace id in the caller's | 
|  | idmapping. Thus, the kernel will report the ownership of this file as the | 
|  | overflowid. | 
|  |  | 
|  | Note how in the last two examples things would be simple if the caller were | 
|  | using the initial idmapping. For a filesystem mounted with the initial | 
|  | idmapping it would be trivial. So we only consider a filesystem with an | 
|  | idmapping of ``u0:k20000:r10000``: | 
|  |  | 
|  | 1. Map the userspace id on disk down into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k20000:r10000, u1000) = k21000 | 
|  |  | 
|  | 2. Map the kernel id up into a userspace id in the caller's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k0:r4294967295, k21000) = u21000 | 
|  |  | 
|  | Idmappings on idmapped mounts | 
|  | ----------------------------- | 
|  |  | 
|  | The examples we've seen in the previous section where the caller's idmapping | 
|  | and the filesystem's idmapping are incompatible cause various issues for | 
|  | workloads. For a more complex but common example, consider two containers | 
|  | started on the host. To completely prevent the two containers from affecting | 
|  | each other, an administrator may often use different non-overlapping idmappings | 
|  | for the two containers:: | 
|  |  | 
|  | container1 idmapping:  u0:k10000:r10000 | 
|  | container2 idmapping:  u0:k20000:r10000 | 
|  | filesystem idmapping:  u0:k30000:r10000 | 
|  |  | 
|  | An administrator wanting to provide easy read-write access to the following set | 
|  | of files:: | 
|  |  | 
|  | dir id:       u0 | 
|  | dir/file1 id: u1000 | 
|  | dir/file2 id: u2000 | 
|  |  | 
|  | to both containers currently can't. | 
|  |  | 
|  | Of course the administrator has the option to recursively change ownership via | 
|  | ``chown()``. For example, they could change ownership so that ``dir`` and all | 
|  | files below it can be crossmapped from the filesystem's into the container's | 
|  | idmapping. Let's assume they change ownership so it is compatible with the | 
|  | first container's idmapping:: | 
|  |  | 
|  | dir id:       u10000 | 
|  | dir/file1 id: u11000 | 
|  | dir/file2 id: u12000 | 
|  |  | 
|  | This would still leave ``dir`` rather useless to the second container. In fact, | 
|  | ``dir`` and all files below it would continue to appear owned by the overflowid | 
|  | for the second container. | 
|  |  | 
|  | Or consider another increasingly popular example. Some service managers such as | 
|  | systemd implement a concept called "portable home directories". A user may want | 
|  | to use their home directories on different machines where they are assigned | 
|  | different login userspace ids. Most users will have ``u1000`` as the login id | 
|  | on their machine at home and all files in their home directory will usually be | 
|  | owned by ``u1000``. At uni or at work they may have another login id such as | 
|  | ``u1125``. This makes it rather difficult to interact with their home directory | 
|  | on their work machine. | 
|  |  | 
|  | In both cases changing ownership recursively has grave implications. The most | 
|  | obvious one is that ownership is changed globally and permanently. In the home | 
|  | directory case this change in ownership would even need to happen every time the | 
|  | user switches from their home to their work machine. For really large sets of | 
|  | files this becomes increasingly costly. | 
|  |  | 
|  | If the user is lucky, they are dealing with a filesystem that is mountable | 
|  | inside user namespaces. But this would also change ownership globally and the | 
|  | change in ownership is tied to the lifetime of the filesystem mount, i.e. the | 
|  | superblock. The only way to change ownership is to completely unmount the | 
|  | filesystem and mount it again in another user namespace. This is usually | 
|  | impossible because it would mean that all users currently accessing the | 
|  | filesystem can't anymore. And it means that ``dir`` still can't be shared | 
|  | between two containers with different idmappings. | 
|  | But usually the user doesn't even have this option since most filesystems | 
|  | aren't mountable inside containers. And not having them mountable might be | 
|  | desirable as it doesn't require the filesystem to deal with malicious | 
|  | filesystem images. | 
|  |  | 
|  | But the use cases mentioned above and more can be handled by idmapped mounts. | 
|  | They allow exposing the same set of dentries with different ownership at | 
|  | different mounts. This is achieved by marking the mounts with a user namespace | 
|  | through the ``mount_setattr()`` system call. The idmapping associated with it | 
|  | is then used to translate from the caller's idmapping to the filesystem's | 
|  | idmapping and vice versa using the remapping algorithm we introduced above. | 
|  |  | 
|  | Idmapped mounts make it possible to change ownership in a temporary and | 
|  | localized way. The ownership changes are restricted to a specific mount and the | 
|  | ownership changes are tied to the lifetime of the mount. All other users and | 
|  | locations where the filesystem is exposed are unaffected. | 
|  |  | 
|  | Filesystems that support idmapped mounts don't have any real reason to support | 
|  | being mountable inside user namespaces. A filesystem could be exposed | 
|  | completely under an idmapped mount to get the same effect. This has the | 
|  | advantage that filesystems can leave the creation of the superblock to | 
|  | privileged users in the initial user namespace. | 
|  |  | 
|  | However, it is perfectly possible to combine idmapped mounts with filesystems | 
|  | mountable inside user namespaces. We will touch on this further below. | 
|  |  | 
|  | Filesystem types vs idmapped mount types | 
|  | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | With the introduction of idmapped mounts we need to distinguish between | 
|  | filesystem ownership and mount ownership of a VFS object such as an inode. The | 
|  | owner of an inode might be different when looked at from a filesystem | 
|  | perspective than when looked at from an idmapped mount. Such fundamental | 
|  | conceptual distinctions should almost always be clearly expressed in the code. | 
|  | So, to distinguish idmapped mount ownership from filesystem ownership separate | 
|  | types have been introduced. | 
|  |  | 
|  | If a uid or gid has been generated using the filesystem or caller's idmapping | 
|  | then we will use the ``kuid_t`` and ``kgid_t`` types. However, if a uid or gid | 
|  | has been generated using a mount idmapping then we will be using the dedicated | 
|  | ``vfsuid_t`` and ``vfsgid_t`` types. | 
|  |  | 
|  | All VFS helpers that generate or take uids and gids as arguments use the | 
|  | ``vfsuid_t`` and ``vfsgid_t`` types and we will be able to rely on the compiler | 
|  | to catch errors that originate from conflating filesystem and VFS uids and gids. | 
|  |  | 
|  | The ``vfsuid_t`` and ``vfsgid_t`` types are often mapped from and to ``kuid_t`` | 
|  | and ``kgid_t`` types similar to how ``kuid_t`` and ``kgid_t`` types are mapped | 
|  | from and to ``uid_t`` and ``gid_t`` types:: | 
|  |  | 
|  | uid_t <--> kuid_t <--> vfsuid_t | 
|  | gid_t <--> kgid_t <--> vfsgid_t | 
|  |  | 
|  | Whenever we report ownership based on a ``vfsuid_t`` or ``vfsgid_t`` type, | 
|  | e.g., during ``stat()``, or store ownership information in a shared VFS object | 
|  | based on a ``vfsuid_t`` or ``vfsgid_t`` type, e.g., during ``chown()`` we can | 
|  | use the ``vfsuid_into_kuid()`` and ``vfsgid_into_kgid()`` helpers. | 
|  |  | 
|  | To illustrate why these helpers exist, consider what happens when we change |
|  | ownership of an inode from an idmapped mount. We first generate a ``vfsuid_t`` |
|  | or ``vfsgid_t`` based on the mount idmapping and later commit it as the new |
|  | filesystem-wide ownership. In doing so we turn the ``vfsuid_t`` or ``vfsgid_t`` |
|  | into a global ``kuid_t`` or ``kgid_t``, which is done with |
|  | ``vfsuid_into_kuid()`` and ``vfsgid_into_kgid()``. |
|  |  | 
|  | Note, whenever a shared VFS object, e.g., a cached ``struct inode`` or a cached | 
|  | ``struct posix_acl``, stores ownership information, a filesystem or "global" |
|  | ``kuid_t`` and ``kgid_t`` must be used. Ownership expressed via ``vfsuid_t`` | 
|  | and ``vfsgid_t`` is specific to an idmapped mount. | 
|  |  | 
|  | We already noted that ``vfsuid_t`` and ``vfsgid_t`` types are generated based | 
|  | on mount idmappings whereas ``kuid_t`` and ``kgid_t`` types are generated based | 
|  | on filesystem idmappings. To prevent abusing filesystem idmappings to generate |
|  | ``vfsuid_t`` or ``vfsgid_t`` types, or mount idmappings to generate ``kuid_t`` |
|  | or ``kgid_t`` types, filesystem idmappings and mount idmappings are different |
|  | types as well. |
|  |  | 
|  | All helpers that map to or from ``vfsuid_t`` and ``vfsgid_t`` types require | 
|  | a mount idmapping to be passed which is of type ``struct mnt_idmap``. Passing | 
|  | a filesystem or caller idmapping will cause a compilation error. | 
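
The type separation described above can be sketched in plain userspace C. This
is only a model: every ``model_*`` name is hypothetical and stands in for the
kernel types and helpers of similar name, and idmappings are reduced to a
single ``u:v:r`` extent. Wrapping the raw id in distinct struct types is what
lets the compiler reject a ``vfsuid_t`` where a ``kuid_t`` is expected, or a
filesystem idmapping where a mount idmapping is expected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for kuid_t and vfsuid_t. */
typedef struct { uint32_t val; } model_kuid_t;
typedef struct { uint32_t val; } model_vfsuid_t;

/* Hypothetical stand-ins for the two kinds of idmapping. */
struct model_fs_idmap  { uint32_t upper, lower, range; };
struct model_mnt_idmap { uint32_t upper, lower, range; };

/*
 * Map a userspace id down into a VFS id. Only a mount idmapping is
 * accepted: passing a struct model_fs_idmap by value does not compile.
 */
static model_vfsuid_t model_make_vfsuid(struct model_mnt_idmap mnt, uint32_t uid)
{
    assert(mnt.range > 0); /* an idmapping maps at least one id */
    if (uid < mnt.upper || uid - mnt.upper >= mnt.range)
        return (model_vfsuid_t){ .val = UINT32_MAX }; /* not mapped */
    return (model_vfsuid_t){ .val = mnt.lower + (uid - mnt.upper) };
}

/* Commit a mount-relative owner as the filesystem-wide owner. */
static model_kuid_t model_vfsuid_into_kuid(model_vfsuid_t vfsuid)
{
    /* The numeric value is kept; only the type changes. */
    return (model_kuid_t){ .val = vfsuid.val };
}

/* Shared objects such as a cached inode only take filesystem-wide owners. */
static void model_store_owner(model_kuid_t *slot, model_kuid_t owner)
{
    *slot = owner;
}

/*
 * model_store_owner(&slot, vfsuid);                         rejected
 * model_store_owner(&slot, model_vfsuid_into_kuid(vfsuid)); accepted
 */
```

The explicit conversion call marks the one place where a mount-specific owner
becomes a global one, which is the property the dedicated types are meant to
make visible in the code.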
|  |  | 
|  | Similar to how we prefix all userspace ids in this document with ``u`` and all | 
|  | kernel ids with ``k`` we will prefix all VFS ids with ``v``. So a mount | 
|  | idmapping will be written as: ``u0:v10000:r10000``. | 
|  |  | 
|  | Remapping helpers | 
|  | ~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | Idmapping functions were added that translate between idmappings. They make use | 
|  | of the remapping algorithm we've introduced earlier. We're going to look at: | 
|  |  | 
|  | - ``i_uid_into_vfsuid()`` and ``i_gid_into_vfsgid()`` | 
|  |  | 
|  | The ``i_*id_into_vfs*id()`` functions translate the filesystem's kernel ids into |
|  | VFS ids in the mount's idmapping:: | 
|  |  | 
|  | /* Map the filesystem's kernel id up into a userspace id in the filesystem's idmapping. */ | 
|  | from_kuid(filesystem, kid) = uid | 
|  |  | 
|  | /* Map the filesystem's userspace id down into a VFS id in the mount's idmapping. */ |
|  | make_kuid(mount, uid) = kuid | 
|  |  | 
|  | - ``mapped_fsuid()`` and ``mapped_fsgid()`` | 
|  |  | 
|  | The ``mapped_fs*id()`` functions translate the caller's kernel ids into | 
|  | kernel ids in the filesystem's idmapping. This translation is achieved by | 
|  | remapping the caller's VFS ids using the mount's idmapping:: | 
|  |  | 
|  | /* Map the caller's VFS id up into a userspace id in the mount's idmapping. */ | 
|  | from_kuid(mount, kid) = uid | 
|  |  | 
|  | /* Map the mount's userspace id down into a kernel id in the filesystem's idmapping. */ | 
|  | make_kuid(filesystem, uid) = kuid | 
|  |  | 
|  | - ``vfsuid_into_kuid()`` and ``vfsgid_into_kgid()`` | 
|  |  | 
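|  | Whenever a ``vfsuid_t`` or ``vfsgid_t`` is reported to userspace or stored in a |
|  | shared VFS object, these helpers turn it into a ``kuid_t`` or ``kgid_t``, as |
|  | described in the previous section. |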
|  |  | 
|  | Note that the ``i_*id_into_vfs*id()`` and ``mapped_fs*id()`` functions invert |
|  | each other. Consider the following idmappings:: |
|  |  | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k20000:r10000 | 
|  | mount idmapping:      u0:v10000:r10000 | 
|  |  | 
|  | Assume a file owned by ``u1000`` is read from disk. The filesystem maps this id | 
|  | to ``k21000`` according to its idmapping. This is what is stored in the | 
|  | inode's ``i_uid`` and ``i_gid`` fields. | 
|  |  | 
|  | When the caller queries the ownership of this file via ``stat()`` the kernel | 
|  | would usually simply use the crossmapping algorithm and map the filesystem's | 
|  | kernel id up to a userspace id in the caller's idmapping. | 
|  |  | 
|  | But when the caller is accessing the file on an idmapped mount the kernel will | 
|  | first call ``i_uid_into_vfsuid()`` thereby translating the filesystem's kernel | 
|  | id into a VFS id in the mount's idmapping:: | 
|  |  | 
|  | i_uid_into_vfsuid(k21000): | 
|  | /* Map the filesystem's kernel id up into a userspace id. */ | 
|  | from_kuid(u0:k20000:r10000, k21000) = u1000 | 
|  |  | 
|  | /* Map the filesystem's userspace id down into a VFS id in the mount's idmapping. */ | 
|  | make_kuid(u0:v10000:r10000, u1000) = v11000 | 
|  |  | 
|  | Finally, when the kernel reports the owner to the caller it will turn the | 
|  | VFS id in the mount's idmapping into a userspace id in the caller's | 
|  | idmapping:: | 
|  |  | 
|  | k11000 = vfsuid_into_kuid(v11000) | 
|  | from_kuid(u0:k10000:r10000, k11000) = u1000 | 
|  |  | 
|  | We can test whether this algorithm really works by verifying what happens when | 
|  | we create a new file. Let's say the user is creating a file with ``u1000``. | 
|  |  | 
|  | The kernel maps this to ``k11000`` in the caller's idmapping. Usually the | 
|  | kernel would now apply the crossmapping, verifying that ``k11000`` can be | 
|  | mapped to a userspace id in the filesystem's idmapping. Since ``k11000`` can't |
|  | be mapped up in the filesystem's idmapping directly, this creation request |
|  | fails. |
|  |  | 
|  | But when the caller is accessing the file on an idmapped mount the kernel will |
|  | first call ``mapped_fs*id()`` thereby translating the caller's kernel id |
|  | ``k11000``, treated as the VFS id ``v11000``, into a kernel id in the |
|  | filesystem's idmapping via the mount's idmapping:: |
|  |  |
|  | mapped_fsuid(v11000): |
|  | /* Map the caller's VFS id up into a userspace id in the mount's idmapping. */ |
|  | from_kuid(u0:v10000:r10000, v11000) = u1000 |
|  |  |
|  | /* Map the mount's userspace id down into a kernel id in the filesystem's idmapping. */ |
|  | make_kuid(u0:k20000:r10000, u1000) = k21000 |
|  |  |
|  | When finally writing to disk the kernel will then map ``k21000`` up into a |
|  | userspace id in the filesystem's idmapping:: |
|  |  |
|  | from_kuid(u0:k20000:r10000, k21000) = u1000 |
|  |  | 
|  | As we can see, we end up with an invertible and therefore information | 
|  | preserving algorithm. A file created from ``u1000`` on an idmapped mount will | 
|  | also be reported as being owned by ``u1000`` and vice versa. |
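
The invertibility claim can be checked mechanically with a small userspace
model. The sketch below is not kernel code: ``struct idmap_model``,
``map_up()``, ``map_down()`` and the ``model_*`` functions are hypothetical
names, and each idmapping is assumed to consist of a single ``u:k:r`` extent.
It applies the two remapping steps in both directions and verifies that one
undoes the other for every id the filesystem maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX

/* Hypothetical single-extent model of an idmapping u:k:r. */
struct idmap_model { uint32_t upper, lower, range; };

/* Models make_kuid(): map an upper id down to a lower id. */
static uint32_t map_down(struct idmap_model m, uint32_t id)
{
    assert(m.range > 0);
    if (id < m.upper || id - m.upper >= m.range)
        return INVALID_ID;
    return m.lower + (id - m.upper);
}

/* Models from_kuid(): map a lower id up to an upper id. */
static uint32_t map_up(struct idmap_model m, uint32_t id)
{
    assert(m.range > 0);
    if (id < m.lower || id - m.lower >= m.range)
        return INVALID_ID;
    return m.upper + (id - m.lower);
}

/* Models i_uid_into_vfsuid(): filesystem kernel id -> VFS id. */
static uint32_t model_i_uid_into_vfsuid(struct idmap_model mnt,
                                        struct idmap_model fs, uint32_t kuid)
{
    uint32_t uid = map_up(fs, kuid);

    return uid == INVALID_ID ? INVALID_ID : map_down(mnt, uid);
}

/* Models mapped_fsuid(): VFS id -> filesystem kernel id. */
static uint32_t model_mapped_fsuid(struct idmap_model mnt,
                                   struct idmap_model fs, uint32_t vfsuid)
{
    uint32_t uid = map_up(mnt, vfsuid);

    return uid == INVALID_ID ? INVALID_ID : map_down(fs, uid);
}

/* True if the two helpers undo each other for every id the mount maps. */
static bool model_round_trips(struct idmap_model mnt, struct idmap_model fs)
{
    for (uint32_t i = 0; i < fs.range; i++) {
        uint32_t kuid = fs.lower + i;
        uint32_t vfsuid = model_i_uid_into_vfsuid(mnt, fs, kuid);

        if (vfsuid != INVALID_ID &&
            model_mapped_fsuid(mnt, fs, vfsuid) != kuid)
            return false;
    }
    return true;
}
```

With the mount idmapping ``u0:v10000:r10000`` and the filesystem idmapping
``u0:k20000:r10000`` from above, the model turns ``k21000`` into ``v11000`` and
``v11000`` back into ``k21000``.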
|  |  | 
|  | Let's now briefly reconsider the failing examples from earlier in the context | 
|  | of idmapped mounts. | 
|  |  | 
|  | Example 2 reconsidered | 
|  | ~~~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | caller id:            u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k20000:r10000 | 
|  | mount idmapping:      u0:v10000:r10000 | 
|  |  | 
|  | When the caller is using a non-initial idmapping the common case is to attach | 
|  | the same idmapping to the mount. We now perform three steps: | 
|  |  | 
|  | 1. Map the caller's userspace ids into kernel ids in the caller's idmapping:: | 
|  |  | 
|  | make_kuid(u0:k10000:r10000, u1000) = k11000 | 
|  |  | 
|  | 2. Translate the caller's VFS id into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | mapped_fsuid(v11000): | 
|  | /* Map the VFS id up into a userspace id in the mount's idmapping. */ | 
|  | from_kuid(u0:v10000:r10000, v11000) = u1000 | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the filesystem's idmapping. */ | 
|  | make_kuid(u0:k20000:r10000, u1000) = k21000 | 
|  |  | 
|  | 3. Verify that the caller's kernel ids can be mapped to userspace ids in the | 
|  | filesystem's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k20000:r10000, k21000) = u1000 | 
|  |  | 
|  | So the ownership that lands on disk will be ``u1000``. | 
|  |  | 
|  | Example 3 reconsidered | 
|  | ~~~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | caller id:            u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  | mount idmapping:      u0:v10000:r10000 | 
|  |  | 
|  | The same translation algorithm works with the third example. | 
|  |  | 
|  | 1. Map the caller's userspace ids into kernel ids in the caller's idmapping:: | 
|  |  | 
|  | make_kuid(u0:k10000:r10000, u1000) = k11000 | 
|  |  | 
|  | 2. Translate the caller's VFS id into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | mapped_fsuid(v11000): | 
|  | /* Map the VFS id up into a userspace id in the mount's idmapping. */ | 
|  | from_kuid(u0:v10000:r10000, v11000) = u1000 | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the filesystem's idmapping. */ | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 3. Verify that the caller's kernel ids can be mapped to userspace ids in the | 
|  | filesystem's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k0:r4294967295, k1000) = u1000 | 
|  |  | 
|  | So the ownership that lands on disk will be ``u1000``. | 
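
The three creation steps used in the last two examples can be written as one
function in a userspace model. This is a sketch under assumptions:
``struct idmap_model``, ``map_up()``, ``map_down()`` and
``model_owner_on_disk()`` are hypothetical names, and every idmapping is a
single extent. Passing ``NULL`` for the mount models a mount without an
idmapping, so that the effect of the idmapped mount can be compared directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX

/* Hypothetical single-extent model of an idmapping u:k:r. */
struct idmap_model { uint32_t upper, lower, range; };

/* Models make_kuid(): map an upper id down to a lower id. */
static uint32_t map_down(struct idmap_model m, uint32_t id)
{
    assert(m.range > 0);
    if (id < m.upper || id - m.upper >= m.range)
        return INVALID_ID;
    return m.lower + (id - m.upper);
}

/* Models from_kuid(): map a lower id up to an upper id. */
static uint32_t map_up(struct idmap_model m, uint32_t id)
{
    assert(m.range > 0);
    if (id < m.lower || id - m.lower >= m.range)
        return INVALID_ID;
    return m.upper + (id - m.lower);
}

/*
 * Return the userspace id that lands on disk when a caller with userspace
 * id uid creates a file, or INVALID_ID if the id cannot be mapped.
 */
static uint32_t model_owner_on_disk(struct idmap_model caller,
                                    const struct idmap_model *mnt,
                                    struct idmap_model fs, uint32_t uid)
{
    /* Step 1: caller's userspace id -> caller's kernel id. */
    uint32_t id = map_down(caller, uid);

    if (id == INVALID_ID)
        return INVALID_ID;
    if (mnt) {
        /* Step 2: treat it as a VFS id and remap it via the mount. */
        id = map_up(*mnt, id);
        if (id == INVALID_ID)
            return INVALID_ID;
        id = map_down(fs, id);
        if (id == INVALID_ID)
            return INVALID_ID;
    }
    /* Step 3: the kernel id must map up in the filesystem's idmapping. */
    return map_up(fs, id);
}
```

In this model both examples yield ``u1000`` when the mount idmapping is
passed. Without it, the idmappings of example 2 yield ``INVALID_ID`` and those
of example 3 yield ``u11000``.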
|  |  | 
|  | Example 4 reconsidered | 
|  | ~~~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | file id:              u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  | mount idmapping:      u0:v10000:r10000 | 
|  |  | 
|  | In order to report ownership to userspace the kernel now does three steps using | 
|  | the translation algorithm we introduced earlier: | 
|  |  | 
|  | 1. Map the userspace id on disk down into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 2. Translate the kernel id into a VFS id in the mount's idmapping:: | 
|  |  | 
|  | i_uid_into_vfsuid(k1000): | 
|  | /* Map the kernel id up into a userspace id in the filesystem's idmapping. */ | 
|  | from_kuid(u0:k0:r4294967295, k1000) = u1000 | 
|  |  | 
|  | /* Map the userspace id down into a VFS id in the mount's idmapping. */ |
|  | make_kuid(u0:v10000:r10000, u1000) = v11000 | 
|  |  | 
|  | 3. Map the VFS id up into a userspace id in the caller's idmapping:: | 
|  |  | 
|  | k11000 = vfsuid_into_kuid(v11000) | 
|  | from_kuid(u0:k10000:r10000, k11000) = u1000 | 
|  |  | 
|  | Earlier, the file's kernel id couldn't be crossmapped into the caller's |
|  | idmapping. With the idmapped mount in place it can now be crossmapped into the |
|  | caller's idmapping via the mount's idmapping. The file is now reported as owned |
|  | by ``u1000`` according to the mount's idmapping. |
|  |  | 
|  | Example 5 reconsidered | 
|  | ~~~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | :: | 
|  |  | 
|  | file id:              u1000 | 
|  | caller idmapping:     u0:k10000:r10000 | 
|  | filesystem idmapping: u0:k20000:r10000 | 
|  | mount idmapping:      u0:v10000:r10000 | 
|  |  | 
|  | Again, in order to report ownership to userspace the kernel now does three | 
|  | steps using the translation algorithm we introduced earlier: | 
|  |  | 
|  | 1. Map the userspace id on disk down into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k20000:r10000, u1000) = k21000 | 
|  |  | 
|  | 2. Translate the kernel id into a VFS id in the mount's idmapping:: | 
|  |  | 
|  | i_uid_into_vfsuid(k21000): | 
|  | /* Map the kernel id up into a userspace id in the filesystem's idmapping. */ | 
|  | from_kuid(u0:k20000:r10000, k21000) = u1000 | 
|  |  | 
|  | /* Map the userspace id down into a VFS id in the mount's idmapping. */ |
|  | make_kuid(u0:v10000:r10000, u1000) = v11000 | 
|  |  | 
|  | 3. Map the VFS id up into a userspace id in the caller's idmapping:: | 
|  |  | 
|  | k11000 = vfsuid_into_kuid(v11000) | 
|  | from_kuid(u0:k10000:r10000, k11000) = u1000 | 
|  |  | 
|  | Earlier, the file's kernel id couldn't be crossmapped into the caller's |
|  | idmapping. With the idmapped mount in place it can now be crossmapped into the |
|  | caller's idmapping via the mount's idmapping. The file is now reported as owned |
|  | by ``u1000`` according to the mount's idmapping. |
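
The reporting direction used in the last two examples can be modelled the same
way. Again this is a hedged userspace sketch rather than kernel code:
``struct idmap_model``, ``map_up()``, ``map_down()`` and
``model_reported_owner()`` are hypothetical names, each idmapping is one
extent, and an unmappable id is returned as ``INVALID_ID`` instead of being
replaced by an overflow id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX

/* Hypothetical single-extent model of an idmapping u:k:r. */
struct idmap_model { uint32_t upper, lower, range; };

/* Models make_kuid(): map an upper id down to a lower id. */
static uint32_t map_down(struct idmap_model m, uint32_t id)
{
    assert(m.range > 0);
    if (id < m.upper || id - m.upper >= m.range)
        return INVALID_ID;
    return m.lower + (id - m.upper);
}

/* Models from_kuid(): map a lower id up to an upper id. */
static uint32_t map_up(struct idmap_model m, uint32_t id)
{
    assert(m.range > 0);
    if (id < m.lower || id - m.lower >= m.range)
        return INVALID_ID;
    return m.upper + (id - m.lower);
}

/*
 * Return the userspace id the caller sees for a file stored on disk with
 * disk_uid. Pass mnt == NULL for a mount without an idmapping.
 */
static uint32_t model_reported_owner(uint32_t disk_uid, struct idmap_model fs,
                                     const struct idmap_model *mnt,
                                     struct idmap_model caller)
{
    /* Step 1: on-disk userspace id -> filesystem kernel id. */
    uint32_t id = map_down(fs, disk_uid);

    if (id == INVALID_ID)
        return INVALID_ID;
    if (mnt) {
        /* Step 2: filesystem kernel id -> VFS id via the mount. */
        id = map_down(*mnt, map_up(fs, id));
        if (id == INVALID_ID)
            return INVALID_ID;
    }
    /* Step 3: map the resulting id up in the caller's idmapping. */
    return map_up(caller, id);
}
```

For the idmappings of examples 4 and 5 the model reports ``u1000`` when the
mount idmapping is passed and ``INVALID_ID`` when it is left out.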
|  |  | 
|  | Changing ownership on a home directory | 
|  | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 
|  |  | 
|  | We've seen above how idmapped mounts can be used to translate between |
|  | idmappings when either the caller, the filesystem or both use a non-initial |
|  | idmapping. A wide range of use cases exists when the caller is using |
|  | a non-initial idmapping. This mostly happens in the context of containerized |
|  | workloads. As we have seen, the consequence is that access to the filesystem |
|  | doesn't work, both for filesystems mounted with the initial idmapping and for |
|  | filesystems mounted with non-initial idmappings, because the kernel ids can't |
|  | be crossmapped between the caller's and the filesystem's idmapping. |
|  |  | 
|  | As we've seen above idmapped mounts provide a solution to this by remapping the | 
|  | caller's or filesystem's idmapping according to the mount's idmapping. | 
|  |  | 
|  | Aside from containerized workloads, idmapped mounts have the advantage that | 
|  | they also work when both the caller and the filesystem use the initial | 
|  | idmapping which means users on the host can change the ownership of directories | 
|  | and files on a per-mount basis. | 
|  |  | 
|  | Consider our previous example where a user has their home directory on portable | 
|  | storage. At home they have id ``u1000`` and all files in their home directory | 
|  | are owned by ``u1000`` whereas at uni or work they have login id ``u1125``. | 
|  |  | 
|  | Taking their home directory with them becomes problematic. They can't easily | 
|  | access their files, they might not be able to write to disk without applying | 
|  | lax permissions or ACLs and even if they can, they will end up with an annoying | 
|  | mix of files and directories owned by ``u1000`` and ``u1125``. | 
|  |  | 
|  | Idmapped mounts solve this problem. A user can create an idmapped |
|  | mount for their home directory on their work computer or their computer at home | 
|  | depending on what ownership they would prefer to end up on the portable storage | 
|  | itself. | 
|  |  | 
|  | Let's assume they want all files on disk to belong to ``u1000``. When the user |
|  | plugs in their portable storage at their workstation they can set up a job that |
|  | creates an idmapped mount with the minimal idmapping ``u1000:v1125:r1``. So now |
|  | when they create a file the kernel performs the following steps we already know |
|  | from above:: |
|  |  | 
|  | caller id:            u1125 | 
|  | caller idmapping:     u0:k0:r4294967295 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  | mount idmapping:      u1000:v1125:r1 | 
|  |  | 
|  | 1. Map the caller's userspace ids into kernel ids in the caller's idmapping:: | 
|  |  | 
|  | make_kuid(u0:k0:r4294967295, u1125) = k1125 | 
|  |  | 
|  | 2. Translate the caller's VFS id into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | mapped_fsuid(v1125): | 
|  | /* Map the VFS id up into a userspace id in the mount's idmapping. */ | 
|  | from_kuid(u1000:v1125:r1, v1125) = u1000 | 
|  |  | 
|  | /* Map the userspace id down into a kernel id in the filesystem's idmapping. */ | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 3. Verify that the caller's kernel ids can be mapped to userspace ids in the |
|  | filesystem's idmapping:: | 
|  |  | 
|  | from_kuid(u0:k0:r4294967295, k1000) = u1000 | 
|  |  | 
|  | So ultimately the file will be created with ``u1000`` on disk. | 
|  |  | 
|  | Now let's briefly look at what ownership the caller with id ``u1125`` will see | 
|  | on their work computer: | 
|  |  | 
|  | :: | 
|  |  | 
|  | file id:              u1000 | 
|  | caller idmapping:     u0:k0:r4294967295 | 
|  | filesystem idmapping: u0:k0:r4294967295 | 
|  | mount idmapping:      u1000:v1125:r1 | 
|  |  | 
|  | 1. Map the userspace id on disk down into a kernel id in the filesystem's | 
|  | idmapping:: | 
|  |  | 
|  | make_kuid(u0:k0:r4294967295, u1000) = k1000 | 
|  |  | 
|  | 2. Translate the kernel id into a VFS id in the mount's idmapping:: | 
|  |  | 
|  | i_uid_into_vfsuid(k1000): | 
|  | /* Map the kernel id up into a userspace id in the filesystem's idmapping. */ | 
|  | from_kuid(u0:k0:r4294967295, k1000) = u1000 | 
|  |  | 
|  | /* Map the userspace id down into a VFS id in the mount's idmapping. */ |
|  | make_kuid(u1000:v1125:r1, u1000) = v1125 | 
|  |  | 
|  | 3. Map the VFS id up into a userspace id in the caller's idmapping:: | 
|  |  | 
|  | k1125 = vfsuid_into_kuid(v1125) | 
|  | from_kuid(u0:k0:r4294967295, k1125) = u1125 | 
|  |  | 
|  | So ultimately the file will be reported to the caller as belonging to ``u1125`` |
|  | which is the caller's userspace id on their workstation in our example. |
|  |  | 
|  | The raw userspace id that is put on disk is ``u1000`` so when the user takes | 
|  | their home directory back to their home computer where they are assigned | 
|  | ``u1000`` using the initial idmapping and mount the filesystem with the initial | 
|  | idmapping they will see all those files owned by ``u1000``. |
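
Because both the caller and the filesystem use the initial idmapping in this
scenario, userspace ids and kernel ids are numerically identical, so only the
mount's idmapping changes any value. The following userspace sketch relies on
that simplification. ``struct mnt_idmap_model``, ``model_home_create()`` and
``model_home_stat()`` are hypothetical names, and ids the mount does not map
are returned as ``INVALID_ID`` in this model.

```c
#include <assert.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX

/* Hypothetical single-extent model of a mount idmapping u:v:r. */
struct mnt_idmap_model { uint32_t upper, lower, range; };

/*
 * Id that lands on disk when a caller with kernel id caller_kuid creates
 * a file: the id is treated as a VFS id and mapped up in the mount.
 */
static uint32_t model_home_create(struct mnt_idmap_model mnt,
                                  uint32_t caller_kuid)
{
    assert(mnt.range > 0);
    if (caller_kuid < mnt.lower || caller_kuid - mnt.lower >= mnt.range)
        return INVALID_ID;
    return mnt.upper + (caller_kuid - mnt.lower);
}

/*
 * Id reported to the caller for a file stored with disk_uid: the id is
 * mapped down into a VFS id in the mount.
 */
static uint32_t model_home_stat(struct mnt_idmap_model mnt, uint32_t disk_uid)
{
    assert(mnt.range > 0);
    if (disk_uid < mnt.upper || disk_uid - mnt.upper >= mnt.range)
        return INVALID_ID;
    return mnt.lower + (disk_uid - mnt.upper);
}
```

With the mount idmapping ``u1000:v1125:r1`` the model stores ``u1000`` for a
file created by ``u1125`` and reports ``u1125`` for a file stored as ``u1000``.
Any other id falls outside the one-id range and is not mapped.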