What is the partition checker in ARM Secure Mode - Linux

As per this link
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0333h/Chdfjdgi.html
under
System boot sequence
...
Program the partition checker to allocate physical memory available to the Non-secure OS.
What is the partition checker? Is it a subsystem which has registers, and what is its programming model?

What is the partition checker?
It is outside of the TrustZone specification for the CPU. However, in a nutshell, it partitions (divides) memory spaces into regions with different permitted accesses. If an access is not permitted, it raises an external bus error.
Is it a subsystem which has registers, what is its programming model?
Typically, it is a bunch of registers. It may be multiple register files. For instance, an APB (peripheral bus), an AHB (older ARM bus) and a newer AXI (TrustZone-aware bus) may all be present in one system. There may even be multiple APB buses, etc.
From the same page,
The principle of TrustZone memory management is to partition the physical memory into Secure and Non-secure regions.
It should be added that partitioning the masters as secure and non-secure is also important. The partitioning is outside the ARM CPU TrustZone specification; it is part of the BUS architecture. It is up to a bus controller/structure to implement this. The bus controller has both masters (CPUs, DMA peripherals, etc) and slaves (memory devices, register interfaces, etc) connected.
Partitioning in the context of the ARM TrustZone document is a little nebulous, as it is up to each SoC and its bus controllers (and hierarchy) to implement the details. As above, it partitions (divides) memory spaces into regions with different permitted accesses. This is just like supervisor versus user access on traditional ARM (AMBA) AHB buses. The AXI interface adds an NS bit.
Here are possible combinations for a bus controller to support.
             |  Read  | Write
-------------+--------+--------
Normal User  | yes/no | yes/no
Normal Super | yes/no | yes/no
Secure User  | yes/no | yes/no
Secure Super | yes/no | yes/no
The SCR NS bit dynamically determines whether the 'NS' bit is set on bus accesses; this is a TrustZone difference. For super versus user, there is the traditional HPROT signal. As well, each master asserts a WRITE/~READ signal (maybe the polarity is different, but we are software, not hardware).
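As a rough illustration (not from the linked page), here is how secure monitor code might read and set SCR.NS using the ARMv7 CP15 encoding (c1, c1, 0); the surrounding monitor logic is omitted and the helper names are mine.

    #include <stdint.h>

    #define SCR_NS  (1u << 0)   /* Non-secure bit: 1 = CPU bus accesses carry NS=1 */

    /* Read the Secure Configuration Register (ARMv7: CP15 c1, c1, 0).
     * Only accessible from secure privileged modes; illustrative sketch only. */
    static inline uint32_t scr_read(void)
    {
        uint32_t scr;
        __asm__ volatile("mrc p15, 0, %0, c1, c1, 0" : "=r"(scr));
        return scr;
    }

    static inline void scr_write(uint32_t scr)
    {
        __asm__ volatile("mcr p15, 0, %0, c1, c1, 0" : : "r"(scr));
        __asm__ volatile("isb");
    }

    /* Monitor code would set NS just before the exception return to the
     * normal world, so subsequent bus transactions are tagged NS=1 and
     * judged against the non-secure partition rules. */
    static void tag_accesses_non_secure(void)
    {
        scr_write(scr_read() | SCR_NS);
    }

A bus-level partition checker then judges each transaction against the NS attribute this produces.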
A DMA master (Ethernet, USB, etc) may also send out requests to a bus. Typically, these are set up and locked at boot time. If your secure world uses the Ethernet, then it is probably a secure DMA master so it can access secure memory. The Ethernet chip also typically has a slave register interface, which must be marked (or partitioned) as secure. If the normal world accesses the Ethernet register file, then a bus error is thrown. A vendor may also make DMA peripherals that dynamically set the NS bit depending on the command structure. The CAAM is a crypto engine that can set up job descriptors to handle both normal and secure access, as an example of a DMA master which does both.
A CPU (say Cortex-M4 or Cortex-R) may also be globally secure or normal. Only the Cortex-A series (and ARMv6) with full TrustZone will dynamically toggle the NS bit allowing the CPU to be both secure and normal, depending on context.
Slave peripherals may be partitioned. For example, the first 10MB of SDRAM may be both normal and secure read/write for inter-world communication. The next 54MB may be read/write for the normal world only. A final 64MB may be read/write for the secure world only. Typically, register interfaces for peripherals are an all-or-nothing setup.
These are all outside the scope of the MMU and deal only with physical addresses. If the SoC locks them after boot, it is impossible for anyone to change the mapping. If the secure world code is read-only, it may be more difficult to engineer an exploit.
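To make "program the partition checker" concrete, here is a boot-time sketch against a hypothetical TZASC-style register layout; the base address, offsets and permission bits are invented for illustration (a real part such as the TZC-380 has its own documented map), but the flow of programming the three SDRAM regions above and then locking the checker is the point.

    #include <stdint.h>

    /* Hypothetical partition checker; all offsets and bits are illustrative. */
    #define PC_BASE            0x10110000u
    #define PC_REGION_LOW(n)   (PC_BASE + 0x100u + (n) * 0x10u) /* first address   */
    #define PC_REGION_HIGH(n)  (PC_BASE + 0x104u + (n) * 0x10u) /* last address    */
    #define PC_REGION_ATTR(n)  (PC_BASE + 0x108u + (n) * 0x10u) /* permissions     */
    #define PC_LOCK            (PC_BASE + 0x008u)               /* lock until reset */

    #define ATTR_S_RW          (1u << 0)  /* secure read/write permitted      */
    #define ATTR_NS_RW         (1u << 1)  /* non-secure read/write permitted  */

    static inline void pc_write(uintptr_t addr, uint32_t val)
    {
        *(volatile uint32_t *)addr = val;  /* device register write */
    }

    /* Secure boot code: carve 128MB of SDRAM at 0x80000000 into the three
     * regions described above, then lock the checker. */
    static void partition_sdram(void)
    {
        /* 10MB shared region for inter-world communication */
        pc_write(PC_REGION_LOW(0),  0x80000000u);
        pc_write(PC_REGION_HIGH(0), 0x80A00000u - 1u);
        pc_write(PC_REGION_ATTR(0), ATTR_S_RW | ATTR_NS_RW);

        /* 54MB normal-world-only region */
        pc_write(PC_REGION_LOW(1),  0x80A00000u);
        pc_write(PC_REGION_HIGH(1), 0x84000000u - 1u);
        pc_write(PC_REGION_ATTR(1), ATTR_NS_RW);

        /* 64MB secure-world-only region */
        pc_write(PC_REGION_LOW(2),  0x84000000u);
        pc_write(PC_REGION_HIGH(2), 0x88000000u - 1u);
        pc_write(PC_REGION_ATTR(2), ATTR_S_RW);

        /* Lock the configuration so nothing can remap it after boot. */
        pc_write(PC_LOCK, 1u);
    }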
Typically, APB buses are layered on an AHB bus, which connects to an AXI main bus like a tree. The AXI bus is the default for a Cortex-A. Each bus will have a list of slaves and masters and will support various yes/no configurations, which may be a subset of the list above; i.e., it may not care about read/write or super/user or some other permutations. It will be different for each ARM system. In some cases, a vendor may not even support it; in that case, it may be more difficult to make the system secure or even to use TrustZone. See: Handling ARM TrustZones, where some of the bus issues are touched on in less detail.
See: TrustZone versus Hypervisor which gives some more details.

Related

Why can't applications access a hardware device directly? Why do we need to switch to kernel space to do this?

I wondered why we need to switch to kernel space when we want to access a hardware device. I understand that sometimes, for specific actions such as memory allocation, we need to make system calls in order to switch from user space to kernel space, because the operating system needs to organize everything and keep a separation between processes and how they use memory and other resources. But why can't we access a hardware device directly?
There is no problem writing your own driver to access the hardware from user space, and plenty of documentation is available. For example, this tutorial at xatlantis seems to be a recent and good source.
The reason it has been designed like that is mainly security. Most systems I know about specifically do not allow user programs to do I/O or to access kernel-space memory. Such things would lead to wildly insecure systems, because with access to the kernel a user program could change permissions and get access to any data anywhere in the system, and presumably change it.
References:
XATLANTIS
STACKEXCHANGE
A device driver may choose to provide user processes access to device registers, device memory, or both. A common method is a device-specific service connected to an mmap() request. Consider a frame buffer's on-board memory, and the efficiency of a user process being able to read/write that space directly. For devices in general, there are notable security considerations, and drivers that provide direct access often limit it to processes with sufficient credentials. Files within /dev are usually set with similarly limited owner/group access permissions.
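As a sketch of that mmap() path from the user side, assuming a standard Linux frame buffer at /dev/fb0 and sufficient permissions:

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <linux/fb.h>

    /* Map a frame buffer's memory into this process and clear it. */
    int main(void)
    {
        int fd = open("/dev/fb0", O_RDWR);
        if (fd < 0) {
            perror("open /dev/fb0");
            return 1;
        }

        struct fb_fix_screeninfo fix;
        if (ioctl(fd, FBIOGET_FSCREENINFO, &fix) < 0) {
            perror("FBIOGET_FSCREENINFO");
            close(fd);
            return 1;
        }

        /* The driver's mmap() handler maps the on-board memory for us. */
        uint8_t *fb = mmap(NULL, fix.smem_len, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (fb == MAP_FAILED) {
            perror("mmap");
            close(fd);
            return 1;
        }

        memset(fb, 0, fix.smem_len);   /* direct r/w: blank the display */

        munmap(fb, fix.smem_len);
        close(fd);
        return 0;
    }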

What is cost of context switching to secure mode (arm trustzone)

I am trying to understand the cost of switching back and forth between trusted (secure) and non-secure modes in ARM.
What exactly needs to happen when moving from non-secure to secure world? I know the NS bit needs to be set (based on some special instruction?), the page tables need to be flushed and updated (?), and the processor caches flushed and updated. Anything else that needs to happen?
Processor caches: Are the caches segmented and shared, or is the whole cache used for each mode? That determines the cost of the switch.
RAM: This must be 'partitioned' and used by both modes. So addressing is just an offset into the 'partition'. Is this right?
What is different about this from a user space to kernel mode switch or a process to process switch in user space?
Is there anything in moving from non-secure to secure modes that would make it more expensive than the regular process context switch?
Are there any articles that explain what exactly happens?
EDIT: Based on a reply below, I am looking to understand what exactly happens when a process switches from non-secure mode to secure mode (TrustZone) on an ARM processor.
What exactly needs to happen when moving from non-secure to secure world?
TL;DR: the minimum is to save/restore all CPU registers that are needed by the secure world and change the NS bit. Normally R0-R14, the current mode, and the banked LR and SP registers (aborts, interrupts, etc.) are in this register group. Everything else depends on your security model.
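For reference, the normal world usually enters the secure world via the SMC instruction; the fragment below is a sketch assuming an SMC Calling Convention style interface (function ID in r0, arguments in r1-r3), and the 0x82000001 ID is made up.

    #include <stdint.h>

    /*
     * Issue an SMC from the normal world. The monitor's SMC handler runs in
     * monitor mode (NS bit handling, register save/restore) and returns the
     * result in r0.
     */
    static uint32_t smc_call(uint32_t fid, uint32_t a1, uint32_t a2, uint32_t a3)
    {
        register uint32_t r0 __asm__("r0") = fid;
        register uint32_t r1 __asm__("r1") = a1;
        register uint32_t r2 __asm__("r2") = a2;
        register uint32_t r3 __asm__("r3") = a3;

        __asm__ volatile("smc #0"
                         : "+r"(r0), "+r"(r1), "+r"(r2), "+r"(r3)
                         :
                         : "memory");
        return r0;   /* result placed in r0 by the secure world */
    }

    /* Example: ask the secure world to decrypt a buffer at a physical address.
     * The function ID is a placeholder, not a real service. */
    uint32_t secure_decrypt(uint32_t phys_addr, uint32_t len)
    {
        return smc_call(0x82000001u, phys_addr, len, 0);
    }

Everything the monitor must save and restore around that instruction is exactly the register group described above.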
First off, there are many different models that can be used in TrustZone; TrustZone is a tool, not a solution. The most basic model is a library with an API where some secure data (i.e. decryption keys) is stored and processed on behalf of an external source (some DRM download from the 'normal world' space). I assume you don't mean this.
An OS can be pre-emptible or non-pre-emptible. If you have two OSes in both worlds, then how control is relinquished, how resources are shared, and how security assets are protected all come into play on a world switch.
In many cases, the caches and TLB are world-aware. Devices may also be world-aware and designed with the intent that context is built into the device. That is not to say that some systems might not leak information in some way:
Meltdown (2017)
Spectre (2017)
Hyperthreading exploit (2004)
If you are really concerned about this type of attack, it may be appropriate to mark the secure world memory that needs to be protected as non-cached. In many ARM systems, the L1/L2 caches and TLB are shared between worlds and can provide a side-channel attack vector.
TrustZone as implemented on many ARM devices comes with a GIC which can route FIQ to the secure world, and masking of FIQ by the normal world can be prevented. Many GIC features are banked between worlds, allowing both OSes to use the GIC without 'context switch' information; i.e., the GIC automatically presents the appropriate banked features based on the state of the NS bit (so the context is stored in the device). Many other vendor-specific devices are designed to behave this way.
If both worlds use NEON/VFP, then you need to save/restore those registers on a world switch as well. For pre-emption, you may need to hook into the secure OS scheduler to allow a normal world interrupt to pre-empt the secure world main line (obviously this depends on the assets you are trying to protect; if you allow this, the secure main line has a DoS vector).
If there are glitches in devices, then you may need to save/restore device state. Even if the normal world is restricted from using FIQ mode, you still need to at least clear SP_fiq and LR_fiq when going to the normal world (and restore the secure values the other way). Some of these registers are difficult to save/restore, as you must switch modes, which can itself be a security risk if care is not taken.
RAM: This must be 'partitioned' and used by both modes. So addressing is just an offset into the 'partition'. Is this right?
Secure boot will partition memory based on the 'NS bit'. Physical memory will be visible or not based on the partition checker logic, which can often be locked at boot; i.e., if it is non-visible, an access is a bus error, like any non-existent memory. There is no 'switch' besides the NS bit.
Is there anything in moving from non-secure to secure modes that would make it more expensive than the regular process context switch?
Yes: a normal switch is only for a 'mode'. A world switch is for all ARM modes, and so all banked registers must be switched. Depending on the system, the TLB and cache would not normally need to be switched.
Related:
How to introspect normal world
TrustZone monitor mode switch design
Preventing memory access from the normal world
How is a TrustZone OS secure?
TrustZone scheduler in secure/non-secure OS
IMX53 and TrustZone
ARM Trusted firmware on github
TrustZone Whitepaper

Windows/Linux: Can a malicious program read/write the memory-mapped space of a PCIe peripheral?

I apologize in advance for the lack of precision in my phrasing/terminology...I'm not a system programmer by any means...
This is a security-related programming question...at work, I've been asked to assess the "risk" to a PCIe add-in card depending on the integrity of the host operating-system (specifically, Windows Server 2012 x64, and Redhat Enterprise 6/7 x86-64.)
So my question is this:
We have a PCIe-peripheral (add-in board) that contains several embedded processors that will handle sensitive data. The preferred solution would be to encrypt the data before it enters the PCIe-bus, and decrypt it after it leaves the PCIe-bus...but we can't do this for a variety of reasons (performance, cost, etc.) Instead, we'll be passing data in cleartext form over the PCIe-bus.
Let's assume an attacker has network access to the machine, but not physical access. If a vendor's PCIe-endpoint device is installed in a server, and the vendor's (signed) driver is up and running with the associated hardware, is it possible for a malicious process/thread to access (read/write) the PCI memory-mapped space(s) of the PCIe-endpoint?
I know there are utilities that allow me to dump (read) the PCI config space of all endpoints in a PCIe hierarchy...but I have no idea if that extends to reading and writing inside the memory-mapped windows of the installed endpoints (especially if the endpoint is already associated with a device driver.)
Also, if this is possible, how difficult is it?
Are we talking about a user-space program being able to do this, or does it require the attacker to have root/admin access to the machine (to run a program of his design, or install a fake/proxy driver)?
Also, does virtualization make a difference?
Accessing device memory requires operating at a lower protection ring than userland software, i.e. in kernel mode. The only way to access it is by going through a driver or the kernel.
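As an illustration of "going through a driver", a minimal Linux PCI driver sketch that maps an endpoint's BAR0 is shown below; the vendor/device IDs and the register read are placeholders, and error unwinding is abbreviated.

    #include <linux/module.h>
    #include <linux/pci.h>

    /* Placeholder IDs; a real driver matches its own endpoint. */
    #define MY_VENDOR_ID 0x1234
    #define MY_DEVICE_ID 0x5678

    static void __iomem *bar0;

    static int my_probe(struct pci_dev *pdev, const struct pci_device_id *id)
    {
        int rc = pci_enable_device(pdev);
        if (rc)
            return rc;

        rc = pci_request_regions(pdev, "my_endpoint");
        if (rc)
            return rc;

        /* Map BAR0: only kernel code can obtain this mapping. */
        bar0 = pci_iomap(pdev, 0, 0);
        if (!bar0)
            return -ENOMEM;

        /* Example MMIO access; offset 0 is a placeholder register. */
        pr_info("my_endpoint: reg0 = 0x%x\n", ioread32(bar0));
        return 0;
    }

    static void my_remove(struct pci_dev *pdev)
    {
        pci_iounmap(pdev, bar0);
        pci_release_regions(pdev);
        pci_disable_device(pdev);
    }

    static const struct pci_device_id my_ids[] = {
        { PCI_DEVICE(MY_VENDOR_ID, MY_DEVICE_ID) },
        { }
    };
    MODULE_DEVICE_TABLE(pci, my_ids);

    static struct pci_driver my_driver = {
        .name     = "my_endpoint",
        .id_table = my_ids,
        .probe    = my_probe,
        .remove   = my_remove,
    };
    module_pci_driver(my_driver);
    MODULE_LICENSE("GPL");

User-space code never sees that mapping unless such a driver deliberately exports it (e.g. via mmap), which is exactly the risk boundary the question is about.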

What kind of api does a sata hard drive expose?

I understand that the linux kernel uses a driver to communicate with the hard disk device and that there is firmware code on the device to service the driver's requests. My questions are:
what kind of functionality (i.e. api) does the firmware expose? For example, does it only expose an address space that the kernel manages, or is there some code in the linux kernel that deals with some of the physics related to the hard drive (i.e. data layout on track/sector/platter etc...)
Does the kernel schedule the disk's head movement, or is it the firmware?
Is there a standard spec for the apis exposed by hard disk devices?
I understand that the linux kernel uses a driver to communicate with the hard disk device
That's true for all peripherals.
there is firmware code on the device to service the driver's requests
Modern HDDs (since the advent of IDE) have an integrated disk controller.
"Firmware" by itself isn't going to do anything, and is an ambiguous description. I.E. what is executing this "firmware"?
what kind of functionality (i.e. api) does the firmware expose? For example, does it only expose an address space that the kernel manages, or is there some code in the linux kernel that deals with some of the physics related to the hard drive (i.e. data layout on track/sector/platter etc...)
SATA drives use the ATA/ATAPI command set.
The old SMD and ST506 drive interfaces used cylinder, head, and sector (aka CHS) addressing. Disk controllers for such drives typically kept a similar interface on the host side, so the operating system was obligated to be aware of the drive (physical) geometry. OSes would try to optimize performance by aligning partitions to cylinders, and minimize seek/access time by ordering requests by cylinder address.
Although the disk controller typically required CHS addressing, the higher layers of an OS would use a sequential logical sector address. Conversion between a logical sector address to cylinder, head, & sector address is straightforward so long as the drive geometry is known.
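For example, the conversion is just integer division and remainders, assuming the heads-per-cylinder (hpc) and sectors-per-track (spt) figures from the drive geometry:

    #include <stdint.h>

    struct chs {
        uint32_t cylinder;
        uint32_t head;
        uint32_t sector;    /* sectors are conventionally numbered from 1 */
    };

    /* Classic LBA -> CHS conversion, valid only when the (logical) geometry
     * is known: hpc = heads per cylinder, spt = sectors per track. */
    static struct chs lba_to_chs(uint32_t lba, uint32_t hpc, uint32_t spt)
    {
        struct chs c;
        c.cylinder = lba / (hpc * spt);
        c.head     = (lba / spt) % hpc;
        c.sector   = (lba % spt) + 1;
        return c;
    }

    /* e.g. lba_to_chs(2049, 16, 63) for an old 16-head, 63-sector geometry */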
The SCSI and IDE (ATA) interfaces for the host side of the disk controller offered logical block addressing (block = sector) rather than CHS addressing. The OS no longer had to be aware of the physical geometry of the drive, and the disk controller was able to use the abstraction of logical addressing to implement a more consistent areal density per sector using zone-bit recording.
So the OS should only issue a read or write block operation with a logical block address, and not be too concerned with the drive's geometry.
For example, low-level format is no longer possible through the ATA interface, and the drive's geometry is variable (and unknown to the host) due to zone-bit recording. Bad sector management is typically under sole control of the integrated controller.
However you can probably still find some remnants of CHS optimization in various OSes (e.g. drive partitions aligned to a "cylinder").
Does the kernel schedule the disk's head movement, or is it the firmware?
It's possible with a seek operation, but more likely the OS uses R/W operations with auto-seek or LBA R/W operations.
However with LBA and modern HDDs that have sizeable cache and zone-bit recording, such seek operations are not needed and can be counterproductive.
Ultimately the disk controller performs the actual seek.
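To give a feel for what an LBA read looks like at the host register level, here is a sketch of a legacy ATA PIO READ SECTORS using the traditional primary-channel I/O ports; note that no seek command is issued, the drive handles head movement itself. A real kernel uses libata/AHCI with DMA instead, and this only applies where the legacy taskfile interface is exposed (plus root and ioperm() if run from user space).

    #include <stdint.h>
    #include <sys/io.h>

    #define ATA_DATA     0x1F0
    #define ATA_SECCNT   0x1F2
    #define ATA_LBA_LO   0x1F3
    #define ATA_LBA_MID  0x1F4
    #define ATA_LBA_HI   0x1F5
    #define ATA_DRIVE    0x1F6
    #define ATA_CMD      0x1F7   /* write: command, read: status */
    #define ATA_CMD_READ_SECTORS 0x20
    #define ATA_ST_BSY   0x80
    #define ATA_ST_DRQ   0x08

    /* Read one 512-byte sector by 28-bit LBA via PIO. */
    static void ata_pio_read_sector(uint32_t lba, uint16_t *buf)
    {
        while (inb(ATA_CMD) & ATA_ST_BSY)             /* wait until not busy */
            ;

        outb(0xE0 | ((lba >> 24) & 0x0F), ATA_DRIVE); /* master, LBA mode    */
        outb(1, ATA_SECCNT);                          /* one sector          */
        outb(lba & 0xFF, ATA_LBA_LO);
        outb((lba >> 8) & 0xFF, ATA_LBA_MID);
        outb((lba >> 16) & 0xFF, ATA_LBA_HI);
        outb(ATA_CMD_READ_SECTORS, ATA_CMD);

        while ((inb(ATA_CMD) & (ATA_ST_BSY | ATA_ST_DRQ)) != ATA_ST_DRQ)
            ;                                         /* wait for data ready */

        for (int i = 0; i < 256; i++)                 /* 256 words = 512 B   */
            buf[i] = inw(ATA_DATA);
    }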
Is there a standard spec for the apis exposed by hard disk devices?
ATA/ATAPI is a published specification (although it seems to have been in a "working draft" state for 20 years).
See http://www.t13.org/Documents/UploadedDocuments/docs2013/d2161r5-ATAATAPI_Command_Set_-_3.pdf
ABSTRACT
This standard specifies the AT Attachment command set used to communicate between host systems and storage devices. This provides a common command set for systems manufacturers, system integrators, software suppliers, and suppliers of storage devices. The AT Attachment command set includes the PACKET feature set implemented by devices commonly known as ATAPI devices. This standard maintains a high degree of compatibility with the ATA/ATAPI Command Set - 2 (ACS-2).

TrustZone Memory Partitioning

I am reading about ARM TrustZone at this link. I understand that using TrustZone, one can partition the memory into secure and non-secure regions. Vendors may use this to run a secure OS.
What I am curious about is the granularity supported for this partitioning. Is it just that there can be a single block of memory marked "secure", with only one such block per OS? Does TrustZone have the capacity to partition memory for individual processes?
Let's say I have a .so file (hypothetical example) for a Linux application. Could the same code be marked secure at virtual addresses 0x1000 to 0x2000 in process A, while being marked secure at virtual addresses 0x5000 to 0x6000 in process B?
TrustZone partitioning happens at the physical memory level, so the process-level parts of your question don't really apply. Note that Linux as the non-secure OS can't even see secure memory, so having virtual mappings for inaccessible addresses would be of little use; however the secure OS does have the ability to map both secure and non-secure physical addresses by virtue of the NS bit in its page table entries.
As for how that physical partitioning goes, it depends on the implementation. The TZC-380 your link refers to supports 2-16 regions with a minimum 32KB granularity; its successor the TZC-400 has 9 regions, and goes all the way down to 4KB granularity. Other implementations may be different still, although granularity below 4KB is unlikely since that would be pretty much unusable for the CPU with its MMU on. Also, there are usually some things in a system which are going to be hardwired to the secure memory map only (the TZC's programming interface, for one), and that often includes some dedicated secure SRAM.
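As a small sketch of the secure OS side mentioned above: with the ARMv7 short-descriptor format, the NS flag is bit 19 of a first-level section entry, so the secure world chooses per-mapping whether accesses through it are tagged secure or non-secure. The helper below only builds the descriptor word; attribute bits are simplified and the names are illustrative.

    #include <stdint.h>

    /* ARMv7 short-descriptor first-level "section" entry (1MB mapping).
     * Only the fields relevant here are shown. */
    #define L1_SECTION        (0x2u)        /* bits[1:0] = 10: section entry   */
    #define L1_AP_RW_PRIV     (0x1u << 10)  /* AP = privileged read/write      */
    #define L1_NS             (1u << 19)    /* NS: tag accesses as non-secure  */

    static uint32_t section_entry(uint32_t phys_mb_base, int non_secure)
    {
        uint32_t desc = (phys_mb_base & 0xFFF00000u) | L1_SECTION | L1_AP_RW_PRIV;
        if (non_secure)
            desc |= L1_NS;  /* secure OS mapping non-secure memory */
        return desc;
    }

    /* e.g. a secure-world page table mapping a shared SDRAM region as NS:
     *   l1_table[va >> 20] = section_entry(0x80000000u, 1);
     */

The NS bit in the entry only has an effect when the translation is performed by the secure world; the non-secure OS's own page tables are always treated as non-secure.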

Resources