##
# policy.txt <description to the Xen access control architecture>
#
# Author:
# Reiner Sailer 08/15/2005 <sailer@watson.ibm.com>
#
#
# This file gives an overview of the security policies currently
# provided and also gives some reasoning about how to assign
# labels to domains.
##

Xen access control policies


General explanation of supported security policies:
=====================================================

We have implemented the mandatory access control architecture of our
hypervisor security architecture (sHype) for the Xen hypervisor. It
controls communication (in Xen: event channels, grant tables) between
Virtual Machines (from here on called domains) and through this the
virtual block devices, networking, and shared memory are implemented
on top of these communication means. While we have implemented the
described policies and access control architecture for other
hypervisor systems, we will describe below specifically its
implementation and use in the Xen hypervisor. The policy enforcement
is called mandatory regarding user domains since the policy it is
given by the security administration and enforced independently of the
user domains by the Xen hypervisor in cooperation with the domain
management.

The access control architecture consists of three parts:

i) The access control policy determines the "command set" of the ACM
and the hooks with which they can be configured to constrain the
sharing of virtual resources. The current access control architecture
implemented for Xen supports two policies: Chinese Wall and Simple
Type Enforcement, which we describe in turn below.


ii) The actually enforced policy instantiation uses the policy
language (i) to configure the Xen access control in a way that suits
the specific application (home desktop environment, company desktop,
Web server system, etc.). We have defined an exemplary policy
instantiation for Chinese Wall (chwall policy) and Simple Type
Enforcement (ste policy) for a desktop system. We offer these policies
in combination since they are controlling orthogonal events.


iii) The access control module (ACM) and related hooks are part of the
core hypervisor and their controls cannot be bypassed by domains. The
ACM and hooks are the active security components. We refer to
publications that describe how access control is enforced in the Xen
hypervisor using the ACM (access decision) and the hooks (decision
enforcement) inserted into the setup of event channels and grant
tables, and into domain operations (create, destroy, save, restore,
migrate). These controls decide based on the active policy
configuration (see i. and ii.) if the operation proceeds of if the
operation is aborted (denied).

In general, security policy instantiations in the Xen access control
framework are defined by XML policy files. Each security policy has
exactly one file including all the information the hypervisor needs to
enforce the policy.

The name of a policy is unique and consists of a colon-separated list
of names, which can be translated into the location (subtree) where
this policy must be located. The last part of the name is the file
name pre-fix for the policy xml file. The preceding name parts are
translated into the local path relative to the global policy root
(/etc/xen/acm-security/policies) pointing to the policy xml file. For
example: example.chwall_ste.client_v1 denotes the policy file
example/chwall_ste/client_v1-security_policy.xml relative to the
global policy root directory.

Every security policy has its own sub-directory under the global
policy root directory /etc/xen/acm-security/policies, which is
installed during the Xen installation or can be manually installed
(when switching from a "security disabled" Xen to a "security enabled"
Xen AFTER configuring security, see install.txt) by the command
sequence:

   cd "Xen-root"/tools/security/policies; make install

We will describe those files for our example policy (Chinese Wall and
Simple Type Enforcement) in more detail as we go along. Eventually, we
will move towards a system installation where the policies will reside
under /etc.


CHINESE WALL
============

The Chinese Wall policy enables the user to define "which workloads
(domain payloads) cannot run on a single physical system at the same
time". Why would we want to prevent workloads from running at the same
time on the same system? This supports requirements that can (but
don't have to) be rooted in the measure of trust into the isolation of
different domains that share the same hardware. Since the access
control architecture aims at high performance and non-intrusive
implementation, it currently does not address covert (timing) channels
and aims at medium assurance. Users can apply the Chinese Wall policy
to guarantee an air-gap between very sensitive payloads both regarding
covert information channels and regarding resource starvation.

To enable the CW control, each domain is labeled with a set of Chinese
Wall types and CW Conflict Sets are defined which include those CW
types that cannot run simultaneously on the same hardware. This
interpretation of conflict sets is the only policy rule for the Chines
Wall policy.

This is enforced by controlling the start of domains according to
their assigned CW worload types. Domains with Chinese Wall types that
appear in a common conflict set are running mutually exclusive on a
platform, i.e., once a domain with one of the cw-types of a conflict
set is running, no domain with another cw-type of the same conflict
set can start until the first domain is destroyed, paused, or migrated
away from the physical system (this assumes that such a partition can
no longer be observed). The idea is to assign cw-types according to
the type of payload that a domain runs and to use the Chinese Wall
policy to ensure that payload types can be differentiated by the
hypervisor and can be prevented from being executed on the same system
at the same time. Using the flexible CW policy maintains system
consolidation and workload-balancing while introducing guaranteed
constraints where necessary.


Example of a Chinese Wall Policy Instantiation
----------------------------------------------

The file client_v1-security_policy.xml defines the Chinese Wall types
as well as the conflict sets for our example policy (you find it in
the directory "policy_root"/example/chwall).

It defines four Chinese Wall types (prefixed with cw_) with the
following meaning:

* cw_SystemsManagement is a type identifying workloads for systems
management, e.g., domain management, device management, or hypervisor
management.

* cw_Sensitive is identifying workloads that are critical to the user
for one reason or another.

* cw_Distrusted is identifying workloads a user does not have much
confidence in. E.g. a domain used for surfing in the internet without
protection( i.e., active-X, java, java-script, executing web content)
or for (Internet) Games should be typed this way.

* cw_Isolated is identifying workloads that are supposedly isolated by
use of the type enforcement policy (described below). For example, if
a user wants to donate cycles to seti@home, she can setup a separate
domain for a Boinc (http://boinc.ssl.berkeley.edu/) client, disable
this domain from accessing the hard drive and from communicating to
other local domains, and type it as cw_Isolated. We will look at a
specific example later.

The example policy uses the defined types to define one conflict set:
Protection1 = {cw_Sensitive, cw_Distrusted}. This conflict set tells
the hypervisor that once a domain typed as cw_Sensitive is running, a
domain typed as cw_Distrusted cannot run concurrently (and the other
way round). With this policy, a domain typed as cw_Isolated is allowed
to run simultaneously with domains tagged as cw_Sensitive.

Consequently, the access control module in the Xen hypervisor
distinguishes in this example policy 4 different workload types in
this example policy. It is the user's responsibility to type the
domains in a way that reflects the workloads of these domains and, in
the case of cw_Isolated, its properties, e.g. by configuring the
sharing capabilities of the domain accordingly by using the simple
type enforcement policy.

Users can define their own or change the existing example policy
according to their working environment and security requirements. To
do so, replace the file chwall-security_policy.xml with the new
policy.


SIMPLE TYPE ENFORCEMENT
=======================

The file client_v1-security_policy.xml defines the simple type
enforcement types for our example policy (you find it in the directory
"policy_root"/example/ste). The Simple Type Enforcement policy defines
which domains can share information with which other domains. To this
end, it controls

i) inter-domain communication channels (e.g., network traffic, events,
and shared memory).

ii) access of domains to physical resources (e.g., hard drive, network
cards, graphics adapter, keyboard).

In order to enable the hypervisor to distinguish different domains and
the user to express access rules, the simple type enforcement defines
a set of types (ste_types).

The policy defines that communication between domains is allowed if
the domains share a common STE type. As with the chwall types, STE
types should enable the differentiation of workloads. The simple type
enforcement access control implementation in the hypervisor enforces
that domains can only communicate (setup event channels, grant tables)
if they share a common type, i.e., both domains have assigned at least
on type in common. A domain can access a resource, if the domain and
the resource share a common type. Hence, assigning STE types to
domains and resources allows users to define constraints on sharing
between domains and to keep sensitive data confined from distrusted
domains.

Domain <--> Domain Sharing
''''''''''''''''''''''''''
(implemented but its effective use requires factorization of Dom0)

a) Domains with a single STE type (general user domains): Sharing
between such domains is enforced entirely by the hypervisor access
control. It is independent of the domains and does not require their
co-operation.

b) Domains with multiple STE types: One example is a domain that
virtualizes a physical resource (e.g., hard drive) and serves it as
multiple virtual resources (virtual block drives) to other domains of
different types. The idea is that only a specific device domain has
assigned the type required to access the physical hard-drive. Logical
drives are then assigned the types of domains that have access to this
logical drive. Since the Xen hypervisor cannot distinguish between the
logical drives, the access control (type enforcement) is delegated to
the device domain, which has access to the types of domains requesting
to mount a logical drive as well as the types assigned to the
different available logical drives.

Currently in Xen, Dom0 controls all hardware, needs to communicate
with all domains during their setup, and intercepts all communication
between domains. Consequently, Dom0 needs to be assigned all types
used and must be completely trusted to maintain the separation of
informatio ncoming from domains with different STE types. Thus a
refactoring of Dom0 is recommended for stronger confinement
guarantees.

Domain --> RESOURCES Access
'''''''''''''''''''''''''''
(current work)

We define for each resource that we want to distinguish a separate STE
type. Each STE type is assigned to the respective resource and to
those domains that are allowed to access this resource. Type
enforcement will guarantee that other domains cannot access this
resource since they don't share the resource's STE type.

Since in the current implementation of Xen, Dom0 controls access to
all hardware (e.g., disk drives, network), Domain-->Resource access
control enforcement must be implemented in Dom0. This is possible
since Dom0 has access to both the domain configuration (including the
domain STE types) and the resource configuration (including the
resource STE types).

For purposes of gaining higher assurance in the resulting system, it
may be desirable to reduce the size of dom0 by adding one or more
"device domains" (DDs). These DDs, e.g. providing storage or network
access, can support one or more physical devices, and manage
enforcement of MAC policy relevant for said devices. Security benefits
come from the smaller size of these DDs, as they can be more easily
audited than monolithic device driver domains. DDs can help to obtain
maximum security benefit from sHype.


Example of a Simple Type Enforcement Policy Instantiation
---------------------------------------------------------

We define the following types:

* ste_SystemManagement identifies workloads (and domains that runs
them) that must share information to accomplish the management of the
system

* ste_PersonalFinances identifies workloads that are related to
sensitive programs such as HomeBanking applications or safely
configured web browsers for InternetBanking

* ste_InternetInsecure identifies workloads that are very
function-rich and unrestricted to offer for example an environment
where internet games can run efficiently

* ste_DonatedCycles identifies workloads that run on behalf of others,
e.g. a Boinc client

* ste_PersistentStorage identifies workloads that have direct access
to persistent storage (e.g., hard drive)

* ste_NetworkAccess identifies workload that have direct access to
network cards and related networks



SECURITY LABEL TEMPLATES
========================

We introduce security label templates because it is difficult for
users to ensure tagging of domains consistently and since there are
--as we have seen in the case of isolation-- useful dependencies
between the policies. Security Label Templates define type sets that
can be addressed by more user-friendly label names,
e.g. dom_Homebanking describes a typical typeset tagged to domains
used for sensitive Homebanking work-loads. Labels are defined in the
file

Using Security Label Templates has multiple advantages:
a) easy reference of typical sets of type assignments
b) consistent interpretation of type combinations
c) meaningful application-level label names

The definition of label templates depends on the combination of
policies that are used. We will describe some of the labels defined
for the Chinese Wall and Simple Type Enforcement combination.

In the BoincClient example, the label_template file specifies that
this Label is assigned the Chinese Wall type cw_Isolated. We do this
assuming that this BoincClient is isolated against the rest of the
system infrastructure (no persistent memory, no sharing with local
domains). Since cw_Isolated is not included in any conflict set, it
can run at any time concurrently with any other domain. The
ste_DonatedCycles type assigned to the BoincClient reflect the
isolation assumption: it is only assigned to the dom_NetworkDomain
giving the BoincClient domain access to the network to communicate
with its BoincServer.

The strategy for combining types into Labels is the following: First
we define a label for each type of general user domain
(workload-oriented). Then we define a new label for each physical
resource that shall be shared using a DD domain (e.g., disk) and for
each logical resource offered through this physical resource (logical
disk partition). We define then device domain labels (here:
dom_SystemManagement, dom_StorageDomain, dom_NetworkDomain) which
include the types of the physical resources (e.g. hda) their domains
need to connect to. Such physical resources can only be accessed
directly by device domains types with the respective device's STE
type. Additionally we assign to such a device domain Label the STE
types of those user domains that are allowed to access one of the
logical resources (e.g., hda1, hda2) built on top of this physical
resource through the device domain.


Label Construction Example:
---------------------------

We define here a storage domain label for a domain that owns a real
disk drive and creates the logical disk partitions hda1 and hda2 which
it serves to domains labeled dom_HomeBanking and dom_Fun
respectively. The labels we refer to are defined in the label template
file policies/chwall_ste/chwall_ste-security-label-template.xml.

step1: To distinguish different shared disk drives, we create a
separate Label and STE type for each of them. Here: we create a type
ste_PersistentStorageA for disk drive hda. If you have another disk
drive, you may define another persistent storage type
ste_PersistentStorageB in the chwall_ste-security_policy.xml.

step2: To distinguish different domains, we create multiple domain
labels including different types. Here: label dom_HomeBanking includes
STE type ste_PersonalFinances, label dom_Fun includes STE type
ste_InternetInsecure.

step3: The storage domain in charge of the hard drive A needs access
to this hard drive. Therefore the storage domain label
dom_StorageDomain must include the type assigned to the hard drive
(ste_PersistentStorageA).

step4: In order to serve dom hda1 to domains labeled dom_HomeBanking
and hda2 to domains labeled dom_Fun, the storage domain label must
include the types of those domains as well (ste_PersonalFinance,
ste_InternetInsecure).

step5: In order to keep the data for different types safely apart, the
different logical disk partitions must be assigned unique labels and
types, which are used inside the storage domain to extend the ACM
access enforcement to logical resources served from inside the storage
domain. We define labels "res_LogicalDiskPartition1 (hda1)" and assign
it to hda1 and "res_LogicalDiskPartition2 (hda2)" and assign it to
hda2. These labels must include the STE types of those domains that
are allowed to use them (e.g., ste_PersonalFinances for hda1).

The overall mandatory access control is then enforced in 3 different
Xen components and these components use a single consistent policy to
co-operatively enforce the policy. In the storage domain example, we
have three components that co-operate:

1. The ACM module inside the hypervisor enforces: communication between
user domains and the storage domain (only domains including types
ste_PersonalFinances or ste_InternetInsecure can communicate with the
storage domain and request access to logical resource). This confines
the sharing to the types assigned to the storage domain.

2. The domain management will enforce (work in progress): assignment of
real resources (hda) to domains (storage domain) that share a
type with the resource.

3. If the storage domain serves multiple STE types (as in our example),
it enforces (work in progress): that domains can access (mount)
logical resources only if they share an STE type with the respective
resource. In our example, domains with the STE type
ste_PersonalFinances can request access (mount) to logical resource
hda1 from the storage domain.

If you look at the virtual machine label dom_StorageDomain, you will
see the minimal set of types assigned to our domain manageing disk
drive hda for serving logical disk partitions exclusively to
dom_HomeBanking and dom_Fun.

Similary, network domains can confine access to the network or
network communication between user domains.

As a result, device domains (e.g., storage domain, network domain)
must be simple and small to ensure their correct co-operation in the
type enforcement model. If such trust is not possible, then hardware
should be assigned exclusively to a single type (or to a single
partition) in which case the hypervisor ACM enforcement enforces the
types independently.
