TrustZone Memory Partitioning - linux

I am reading about ARM TrustZone at this link. I understand that using TrustZone, one can partition the memory into secure and non-secure regions. Vendors may use this to run a secure OS.
What I am curious about is the granularity supported for this partitioning. Is it just that there can be a block of memory marked "secure", with only one such block of memory per OS? Does TrustZone have the capacity to partition memory for individual processes?
Let's say I have a .so file (hypothetical example) for a Linux application. Would it be possible for the same code to be marked secure at virtual addresses 0x1000 to 0x2000 in process A, while in process B it is marked secure at virtual addresses 0x5000 to 0x6000?

TrustZone partitioning happens at the physical memory level, so the process-level parts of your question don't really apply. Note that Linux as the non-secure OS can't even see secure memory, so having virtual mappings for inaccessible addresses would be of little use; however the secure OS does have the ability to map both secure and non-secure physical addresses by virtue of the NS bit in its page table entries.
As for how that physical partitioning goes, it depends on the implementation. The TZC-380 your link refers to supports 2-16 regions with a minimum 32KB granularity; its successor the TZC-400 has 9 regions, and goes all the way down to 4KB granularity. Other implementations may be different still, although granularity below 4KB is unlikely since that would be pretty much unusable for the CPU with its MMU on. Also, there are usually some things in a system which are going to be hardwired to the secure memory map only (the TZC's programming interface, for one), and that often includes some dedicated secure SRAM.
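To make the granularity point concrete, here is a minimal Python sketch of the kind of alignment check such a controller implies. The function name `region_is_valid`, the `MIN_GRANULE` table, and the simplified rule (base and size must both be multiples of the granule) are assumptions for illustration only; the real TZC region registers have their own encoding and extra constraints, so consult the relevant TRM before relying on this.

```python
KIB = 1024

# Minimum granularity as quoted in the answer above (illustrative table).
MIN_GRANULE = {"TZC-380": 32 * KIB, "TZC-400": 4 * KIB}

def region_is_valid(controller: str, base: int, size: int) -> bool:
    """Return True if base and size are multiples of the controller's granule."""
    granule = MIN_GRANULE[controller]
    return size > 0 and base % granule == 0 and size % granule == 0

# A 4 KB secure carve-out at 0x80001000 fits 4 KB granularity but not 32 KB.
print(region_is_valid("TZC-400", 0x80001000, 4 * KIB))  # True
print(region_is_valid("TZC-380", 0x80001000, 4 * KIB))  # False
```

Note that the check works purely on physical addresses, which is why per-process virtual ranges never enter the picture.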

Related

Virtual Memory and Virtual Address in Linux

I am currently studying virtual memory in operating systems and I have a few questions.
Is a swap partition or swap file the same as virtual memory in Linux?
If yes, and I have no swapping enabled on my Linux system, does that mean my system has no virtual memory?
I have also read that virtual memory makes the system more secure: with virtual memory, the CPU generates virtual addresses which are then translated to actual physical addresses by the MMU, so no process can interact with physical memory directly. So if I just enable swapping on my Linux system, will my CPU start generating virtual addresses, whereas currently it is generating physical addresses directly because I have no swap partition?
How does the CPU know if virtual memory is present or not?
Having no swap file/partition doesn't imply that you don't have virtual memory. Modern operating systems always use paging/virtual memory no matter what.
Is a swap partition or swap file the same as virtual memory in Linux?
No, a swap file and virtual memory are not the same thing in any OS. Virtual memory just means that all memory accesses are translated by the MMU using the page tables. Modern OSes always use paging.
If yes, and I have no swapping enabled on my Linux system, does that mean my system has no virtual memory?
Your system certainly has virtual memory. To use long mode (64-bit mode on x86-64), the OS must enable paging. I doubt that you have a system old enough to not use paging. Swapping pages to the hard disk is not virtual memory. It is more a feature built on virtual memory that can be used to extend physical memory, because a page which isn't required immediately can be swapped out to the hard disk temporarily.
I have also read that virtual memory makes the system more secure: with virtual memory, the CPU generates virtual addresses which are then translated to actual physical addresses by the MMU, so no process can interact with physical memory directly. So if I just enable swapping on my Linux system, will my CPU start generating virtual addresses, whereas currently it is generating physical addresses directly because I have no swap partition?
Your computer certainly has paging/virtual memory enabled. Having no swap partition doesn't mean that you don't have virtual memory. Paging can also be used to avoid fragmentation of RAM and for security. You are right that paging secures your system, because the page tables prevent a process from accessing the memory of another process. Page table entries also carry a per-page privilege level, which makes it possible to distinguish between kernel-mode and user-mode pages.
How does the CPU know if virtual memory is present or not?
The OS just enables paging by setting a bit in a control register. Then the CPU starts blindly translating every memory access using the MMU.
No. A swap file is not the same as virtual memory.
Once the firmware/kernel sets up the necessary registers and/or in-memory data structures and switches the processor mode, virtual memory mappings are used for accessing the physical memory.
Yes, the inability of processes to refer to memory locations without a mapping allows the kernel to employ isolation and access control mechanisms.
Through active mappings, different virtual addresses can map to the same physical memory region at different times. The kernel can maintain the illusion that a larger amount of memory is available than the capacity of the actual physical memory, where only a subset of the virtual memory resides in physical memory at any given time. The rest is stored in the swap file.
Accesses to virtual addresses where the corresponding data is currently in the swap file are trapped by the kernel (via a page fault) and might lead to the kernel swapping the data in, and swapping some other data from physical memory out.
If you disable the swap file, the kernel has no place to store the swapped-out data. This reduces the amount of virtual memory available.
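The distinction between translation (always on) and swapping (optional) can be sketched with a toy single-level page table in Python. Everything here is hypothetical and simplified: the `translate` function, the `PageFault` exception, the dictionary layout, and the 4 KB page size are illustrative assumptions, not how any real kernel or MMU stores its tables.

```python
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when a virtual page has no resident physical frame."""

def translate(page_table: dict, vaddr: int) -> int:
    """Translate a virtual address using a toy one-level page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)
    if entry is None or not entry["present"]:
        # A real kernel would decide here whether to swap the page in
        # or to kill the process for an invalid access.
        raise PageFault(hex(vaddr))
    return entry["frame"] * PAGE_SIZE + offset

# Two processes use the same virtual page but get different physical frames.
proc_a = {1: {"present": True, "frame": 7}}
# Page 2 of process B is "swapped out": mapped, but not resident.
proc_b = {1: {"present": True, "frame": 9}, 2: {"present": False, "frame": None}}

print(hex(translate(proc_a, 0x1234)))  # 0x7234
print(hex(translate(proc_b, 0x1234)))  # 0x9234
```

Translation happens on every access whether or not swap exists; swap only determines what the kernel can do when an access hits a non-resident page such as page 2 of `proc_b`.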

What is cost of context switching to secure mode (arm trustzone)

I am trying to understand the cost of switching back and forth between trusted (secure) and non-secure modes on ARM.
What exactly needs to happen when moving from non-secure to secure world? I know the ns bit needs to be set (based on some special instruction?), the page tables need to be flushed and updated (?), the processor caches flushed and updated. Anything else that needs to happen?
Processor caches: Are the caches segmented and shared, or is the whole cache used for each mode? That determines the cost of the switch.
RAM: This must be 'partitioned' and used by both modes. So addressing is just an offset into the 'partition'. Is this right?
What is different about this from a user space to kernel mode switch or a process to process switch in user space?
Is there anything in moving from non-secure to secure modes that would make it more expensive than the regular process context switch?
Are there any articles that explain what exactly happens?
EDIT: Based on a reply below, I am looking to understand what exactly happens when a process switches from non-secure mode to a secure mode (trust zone) on an arm processor.
What exactly needs to happen when moving from non-secure to secure world?
TL;DR: the minimum is to save/restore all CPU registers that are needed by the secure world and change the NS bit. Normally, R0-R14 as well as the current mode and the banked LR and SP (abort, interrupt, etc.) are in this register group. Everything else depends on your security model.
First off, there are many different models that can be used in TrustZone; TrustZone is a tool, not a solution. The most basic model is a library with an API, where some secure data is stored (e.g. decryption keys) to process data from an external source (some DRM download from the 'normal world' space). I assume you don't mean this.
An OS can be pre-emptible or non-pre-emptible. If you have two OSes, one in each world, then how control is relinquished, how resources are shared and how security assets are protected will all come into play on a world switch.
In many cases, the caches and TLB are world aware. Devices may also be world aware and designed with the intent that context is built into the device. This is not to say that a system cannot leak information in some way; see for example:
Meltdown (2017)
Spectre (2017)
Hyperthreading exploit (2004)
If you are really concerned about this type of attack, it may be appropriate to mark the secure world memory that needs to be protected as non-cached. In many ARM systems, the L1/L2 caches and the TLB are unified between worlds and can provide a side channel for attack.
TrustZone as implemented on many ARM devices comes with a GIC which can run FIQ in the secure world, and masking of FIQ can be prevented in the normal world. Many GIC features are banked between worlds, allowing both OSes to use it without 'context switch' information. I.e., the GIC features that are accessed change automatically based on the state of the NS bit (so the context is stored in the device). Many other vendor-specific devices are designed to behave this way.
If both worlds use NEON/VFP, then you need to save/restore these registers on a world switch as well. For pre-emption, you may need to hook into the secure OS scheduler to allow a normal world interrupt to pre-empt the secure world mainline (obviously this depends on the assets you are trying to protect; if you allow this, the secure mainline has a DoS vector).
If there are glitches in devices, then you may need to save/restore device state. Even if the normal world is restricted from using FIQ mode, you still need to at least clear SP_fiq and LR_fiq when going to the normal world (and restore the secure values the other way). Some of these registers are difficult to save/restore because you must switch modes, which can itself be a security risk if care is not taken.
RAM: This must be 'partitioned' and used by both modes. So addressing is just an offset into the 'partition'. Is this right?
Secure boot will partition memory based on the 'NS bit'. The physical memory will be visible or not based on the partition manager device logic, which can often be locked at boot. I.e., if the memory is not visible, an access results in a bus error, just as for any non-existent memory. There is no 'switch' besides the NS bit.
Is there anything in moving from non-secure to secure modes that would make it more expensive than the regular process context switch?
Yes, a normal switch is only for a 'mode'. A world switch covers all ARM modes, so all banked registers must be switched. Depending on the system, the TLB and cache would not normally need to be switched.
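The save/restore step described above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the `Monitor` class, `blank_context`, and the register names are hypothetical, the mode list is illustrative rather than exhaustive (it omits, for example, the FIQ-banked R8-R12, the SPSRs and NEON/VFP state), and real monitor code is written in assembly and entered via SMC.

```python
from dataclasses import dataclass, field

# Exception modes whose SP/LR are banked (illustrative, not exhaustive).
BANKED_MODES = ("svc", "abt", "und", "irq", "fiq")

def blank_context() -> dict:
    """R0-R14 plus banked SP/LR per exception mode, all zeroed."""
    ctx = {f"r{i}": 0 for i in range(15)}
    for mode in BANKED_MODES:
        ctx[f"sp_{mode}"] = 0
        ctx[f"lr_{mode}"] = 0
    return ctx

@dataclass
class Monitor:
    ns: int = 0  # models SCR.NS: 0 = secure world, 1 = normal world
    cpu: dict = field(default_factory=blank_context)
    saved: dict = field(
        default_factory=lambda: {0: blank_context(), 1: blank_context()}
    )

    def world_switch(self) -> None:
        self.saved[self.ns] = dict(self.cpu)  # save the outgoing world
        self.ns ^= 1                          # flip the NS bit
        self.cpu = dict(self.saved[self.ns])  # restore the incoming world

mon = Monitor()
mon.cpu["r0"] = 0x5EC          # a secure-world value
mon.world_switch()             # secure -> normal
print(mon.ns, mon.cpu["r0"])   # 1 0
mon.world_switch()             # normal -> secure
print(mon.ns, hex(mon.cpu["r0"]))  # 0 0x5ec
```

The normal world never observes the secure value of `r0` because the whole register set is swapped, which is the extra cost compared with an ordinary mode switch.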
Related:
How to introspect normal world
TrustZone monitor mode switch design
Preventing memory access from the normal world
How is a TrustZone OS secure?
TrustZone scheduler in secure/non-secure OS
IMX53 and TrustZone
ARM Trusted firmware on github
TrustZone Whitepaper

Windows Program Memory Vs Linux Program Memory

Linux creates virtual memory pages for every program to use, and the OS handles mapping the virtual addresses to genuine hardware addresses, correct?
But how does Windows do this? Do Windows programs actually have memory that translates to real hardware addresses? I'm also aware that Windows can use hard disk space when RAM is overused, and this process is again called virtual memory, but I believe this is an entirely different concept?
Windows and Linux (at least on Intel 32/64 bit systems) both implement virtual memory using the same mechanism: hardware supported page tables. The OS and the hardware cooperate together to do the address mapping.
The entire concept of separating the logical addresses a program uses from the physical addresses is what is called virtual memory. The use of the hard disk as a backing store is an implementation of virtual memory that uses a swap file to increase the amount of virtual memory to an amount greater than the physical memory installed in the system.
Virtual memory is a pretty deep and wide subject. Maybe start with this Wiki article on Memory Management and then search the web for a deeper understanding.

Secure mode can access secure / non secure memory how?

As per the Cortex-A Programmer's Guide:
TrustZone hardware also effectively provides two virtual MMUs, one for each virtual processor. This enables each world to have a local set of translation tables, with the Secure world mappings hidden and protected from the Normal world.
The page table descriptions include a NS bit, which is used to determine whether accesses are made to the secure or non-secure physical address space.
Although the page table entry bit is still present, the Normal virtual processor hardware does not use this field, and memory accesses are always made with NS = 1. The Secure virtual processor can therefore access either Secure or Normal memory. Cache and TLB hardware permits Normal and Secure entries to co-exist.
So if code running in secure mode needs to access, say, address 0xA0000000 [NS] and 0xA0000000 [S], how would it be written?
So if code running in secure mode needs to access, say, address 0xA0000000 [NS] and 0xA0000000 [S], how would it be written?
It is possible you have a conceptual issue here. There is no physical address 0xA0000000 [NS] and 0xA0000000 [S]; there is only the physical address 0xA0000000. The NS bit is used by a bus controller, much like the HPROT (user/supervisor) signal, to check permissions on the access; afterwards, only one physical memory address stores the result. In this way, the SDRAM device does not need to be TrustZone aware, just the bus controllers.
You need to set up the partition checker to have a world-shareable mapping, that is, read/write access in both worlds. Then the information scott gives applies. If both OSes have an MMU, then create two mappings with the same physical address. Two copies of the memory and MMU entries may exist in the L1 cache and TLB. There is no issue with the duplicate TLB entry. The L1 may need flushing after writing to this memory. There will be two lines, both with the same data, but one tagged with NS and one without.
Hyperthreading for fun and profit may be an interesting paper in this context.
The easiest way would be to set up two mappings in the secure MMU translation table which both use physical address 0xA0000000, one which has the NS bit set and another copy at a different virtual address that has the NS bit clear. Then secure-state code can use the two virtual addresses to make the different accesses.
You could also use just one mapping and change the NS bit, but this would require flushing the TLB after each change.
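The two-mapping approach can be illustrated with a small Python model. It is a sketch only: the table layout, the `bus_access` helper, and the second virtual address 0xB0000000 are hypothetical choices made for the example, and real translation tables are hardware-defined descriptors rather than dictionaries.

```python
PAGE = 0x1000

# Hypothetical secure-world table: virtual page base -> (physical base, NS bit).
secure_table = {
    0xA0000000: (0xA0000000, 0),  # NS clear: issued as a secure access
    0xB0000000: (0xA0000000, 1),  # NS set: non-secure access, same physical page
}
# Normal-world table: the NS field exists but the hardware ignores it.
normal_table = {0xA0000000: (0xA0000000, 0)}

def bus_access(table: dict, vaddr: int, secure_world: bool) -> tuple:
    """Return (physical address, NS attribute) as seen on the bus."""
    base = vaddr & ~(PAGE - 1)
    paddr, ns = table[base]
    if not secure_world:
        ns = 1  # normal-world accesses are always made with NS = 1
    return paddr + (vaddr - base), ns

print(bus_access(secure_table, 0xA0000010, True))   # (2684354576, 0)
print(bus_access(secure_table, 0xB0000010, True))   # (2684354576, 1)
print(bus_access(normal_table, 0xA0000010, False))  # (2684354576, 1)
```

All three accesses reach physical address 0xA0000010 (2684354576 in decimal); only the NS attribute differs, and that attribute is what the bus-level partition checker uses to allow or reject the access.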

What is partition checker in ARM Secure Mode

As per this link
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0333h/Chdfjdgi.html
under
System boot sequence
...
Program the partition checker to allocate physical memory available to the Non-secure OS.
What is the partition checker? Is it a subsystem which has registers, what is its programming model ?
What is the partition checker?
It is outside of the TrustZone specification for the CPU. However, in a nutshell, it partitions or divides memory spaces into different permitted accesses. If the access is not permitted, it throws an external BUS error.
Is it a subsystem which has registers, what is its programming model?
Typically, it is a bunch of registers. It may be multiple register files. For instance, an APB (peripheral bus), an AHB (older ARM bus) and a newer AXI (TrustZone-aware bus) may all be present in one system. There may even be multiple APB buses, etc.
From the same page,
The principle of TrustZone memory management is to partition the physical memory into Secure and Non-secure regions.
It should be added that partitioning the masters as secure and non-secure is also important. The partitioning is outside the ARM CPU TrustZone specification; it is part of the BUS architecture. It is up to a bus controller/structure to implement this. The bus controller has both masters (CPUs, DMA peripherals, etc) and slaves (memory devices, register interfaces, etc) connected.
Partitioning in the context of the ARM TrustZone document is a little nebulous, as it is up to each SOC and its bus controllers (and hierarchy) to implement the details. As above, it partitions or divides memory spaces into different permitted accesses. This is just like supervisor versus user access with traditional ARM (AMBA) AHB buses. The AXI interface adds an NS bit.
Here are possible combinations for a bus controller to support.
             | Read   | Write
-------------+--------+-------
Normal User  | yes/no | yes/no
Normal Super | yes/no | yes/no
Secure User  | yes/no | yes/no
Secure Super | yes/no | yes/no
The SCR NS bit will dynamically determine whether the 'NS' bit is set on bus accesses. This is a TrustZone difference. For the super and user, there is a traditional HPROT bit. As well, each master will assert a WRITE/~READ signal (maybe the polarity is different, but we are software not hardware).
A DMA master (Ethernet, USB, etc.) may also send out requests to a BUS. Typically, these are set up and locked at boot time. If your secure world uses the Ethernet, then it is probably a secure DMA master, so that it can access secure memory. The Ethernet chip also typically has a slave register interface. It must be marked (or partitioned) as secure. If the normal world accesses the Ethernet register file, then a BUS error is thrown. A vendor may also make DMA peripherals that dynamically set the NS bit depending on the command structure. The CAAM is a crypto driver that can set up job descriptions to handle both normal and secure access, as an example of a DMA master which does both.
A CPU (say Cortex-M4 or Cortex-R) may also be globally secure or normal. Only the Cortex-A series (and ARMv6) with full TrustZone will dynamically toggle the NS bit allowing the CPU to be both secure and normal, depending on context.
Slave peripherals may be partitioned. For example, the first 10MB of SDRAM may be both normal and secure read and write for inter-world communication. Then the next 54MB may be normal-only read/write for the normal world. Then a final 64MB may be read/write secure for the secure world. Typically, register interfaces for peripherals are an all-or-none setup.
These are all outside of the scope of an MMU and deal only with physical addresses. If the SOC locks them after boot, it is impossible for anyone to change the mapping. If the secure world code is read-only, it may be more difficult to engineer an exploit.
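The SDRAM example above can be expressed as a small Python model of a partition checker. This is a sketch under assumptions: `REGIONS`, `check_access`, `allowed` and `BusError` are hypothetical names, offsets are taken relative to the start of SDRAM, and the model follows the example literally (the middle region is normal-only). Real checkers are register-programmed hardware and may also distinguish read/write and user/supervisor, as in the table earlier.

```python
MB = 1 << 20

class BusError(Exception):
    """Models the external bus error raised on a rejected access."""

# (start offset, end offset, worlds permitted to read/write)
REGIONS = [
    (0,       10 * MB,  {"normal", "secure"}),  # shared, inter-world communication
    (10 * MB, 64 * MB,  {"normal"}),            # next 54 MB: normal world only
    (64 * MB, 128 * MB, {"secure"}),            # final 64 MB: secure world only
]

def check_access(offset: int, world: str) -> None:
    """Raise BusError unless `world` may access the SDRAM offset."""
    for start, end, permitted in REGIONS:
        if start <= offset < end:
            if world in permitted:
                return
            raise BusError(f"{world} access to {offset:#x} denied")
    raise BusError(f"no memory at {offset:#x}")

def allowed(offset: int, world: str) -> bool:
    try:
        check_access(offset, world)
        return True
    except BusError:
        return False

print(allowed(5 * MB, "normal"), allowed(5 * MB, "secure"))      # True True
print(allowed(100 * MB, "normal"), allowed(100 * MB, "secure"))  # False True
```

A denied access and an access to non-existent memory both end in the same `BusError`, which mirrors the point that secure memory simply looks absent to the normal world.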
Typically, all APB buses are layered on an AHB bus, which connects to an AXI main bus like a tree. The AXI bus is the default for a Cortex-A. Each BUS will have a list of slaves and masters and will support various yes and no configurations, which may be a subset of the list above; i.e., it may not care about read/write or super/user or some other permutations. It will be different for each ARM system. In some cases, a vendor may not even support it. In this case, it may be more difficult to make the system secure or even use TrustZone. See: Handling ARM TrustZones, where some of the bus issues are touched on in less detail.
See: TrustZone versus Hypervisor which gives some more details.

Resources