I am trying to understand ARM Linux Boot Process.
These are the things I understood:
When the reset button is pressed (or the processor otherwise comes out of reset), execution starts at the reset vector; on ARM this is either 0x00000000 or 0xFFFF0000.
This location contains the start-up code, also called the ROM code or Boot ROM code.
My query is: how does this Boot ROM code get the address of u-boot's first instruction?
It depends on the SoC, and the scheme used for booting will differ from one SoC to the other. It is usually documented in the SoC's reference manual, which describes the various conventions (where to read u-boot from, specific addresses) that the u-boot port for this SoC should follow in order for the code in ROM to be able to load u-boot and ultimately transfer control to it.
This code in the ROM could do something like:

- If pin x is 0, read 64 KiB from the first sector of the eMMC into the on-chip static RAM (OCRAM), then transfer control to the code located at offset 256 of the OCRAM, for example.
- If pin x is 1, configure the UART for 19200 baud, 8 data bits, no parity, attempt to read 64 KiB from the serial port using the X-MODEM protocol into the OCRAM, then transfer control to the code located at offset 256 of the OCRAM.

This code, which is often named the Secondary Program Loader (SPL), would then be responsible for, say, configuring the SDRAM controller, reading the non-SPL part of u-boot to the beginning of the SDRAM, and then jumping to a specific address in the SDRAM. The SPL for a given SoC should be small enough to fit in the SoC's on-chip RAM. The ROM code would be the Primary Boot Loader in this scenario.
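As a purely illustrative sketch of what such ROM code might do (the pin, the addresses, the sizes and the helper names are all invented here; the real ones are defined by each SoC's reference manual):

```
#include <stdint.h>

/* All names and numbers below are hypothetical; each SoC's Boot ROM
 * defines its own boot pins, OCRAM layout and entry offset. */
#define OCRAM_BASE   0x40300000u
#define SPL_SIZE     (64u * 1024u)
#define ENTRY_OFFSET 256u

extern int  bootpin_x(void);                              /* sample strap pin */
extern void emmc_read(uint32_t lba, void *dst, uint32_t len);
extern void uart_init(uint32_t baud);
extern int  xmodem_receive(void *dst, uint32_t max_len);

void rom_boot(void)
{
    void *ocram = (void *)OCRAM_BASE;

    if (bootpin_x() == 0) {
        /* boot source A: first sectors of the eMMC */
        emmc_read(0, ocram, SPL_SIZE);
    } else {
        /* boot source B: serial download over X-MODEM */
        uart_init(19200);
        xmodem_receive(ocram, SPL_SIZE);
    }

    /* transfer control to the SPL at the documented entry offset */
    void (*spl_entry)(void) = (void (*)(void))(OCRAM_BASE + ENTRY_OFFSET);
    spl_entry();
}
```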
In the case of the TI AM335x Cortex-A8 SoCs for example, section 26.1.6 of the Technical Reference Manual, and more specifically figure 26-10, explains the boot process. Some input pins may be used by the ROM code to direct the boot process - see SYSBOOT Configuration Pins in table 26-7. See The AM335x U-Boot User's Guide for more u-boot specific, AM335x-related information.
ARM doesn't make chips; it makes IP that chip vendors purchase. The ARM core is one block in their chip design; usually they have many other blocks: USB controllers (likely purchased IP), a PCIe controller (likely purchased IP), DDR, Ethernet, SATA, eMMC/SD, etc. Plus glue logic, plus whatever secret sauce they add to make the chip different and more interesting to the consumer than the competition.
The address space, particularly for the full-sized ARMs, is fairly wide open, so even if a vendor uses the same USB IP as everyone else, that doesn't mean it sits at the same address as everyone else's.
There is no reason to assume that all chips with a Cortex-A are centered around the Cortex-A; the Cortex-A may just be there to boot and manage the real thing the chip was made for. The chips you are asking about, though, are most likely centered around the ARM processor: the purpose of the chip is to make an ARM-based "CPU". What we have seen in that market is a need to support various non-volatile storage solutions. Some customers may wish to be RAM-heavy and don't care that a slow SPI flash is used to get the kernel and root file system over, with everything at runtime being RAM-based, including the file system. Some may wish to support traditional hard drives as well as RAM, with the file system on something SATA, for example spinning media or SSD. Some may wish to use eMMC, SD, etc.

With the very high cost of chip production, it does not make sense to make one chip for each combination; rather, you make one chip that supports many combinations. You use several "strap" pins (which are not really pins but balls/pads on the BGA) that the customer ties to ground or "high" (whatever the definition of that voltage is), so that when the chip comes out of reset (whichever of the reset pins for that product is documented as sampling the strap pins), those strap pins tell the "processor" (the chip as an entity) how you want it to boot: first look for an SD card on this SPI bus; if nothing is there, look for a SATA drive on this interface; if nothing is there, fall into the X-MODEM bootloader on UART0.
This leads into Frant's excellent answer. What IP is in the chip, what possible non-volatile storage is supported, and what possible ways exist of loading a bootloader (if the "chip" itself supports that) are very much chip-specific: not just "Broadcom does it this way and TI does it another way", but specific to a chip or family of chips within a vendor's possibly vast array of products. There is no reason to assume any two products from a vendor work the same way; you read the documentation for each one you are interested in. Certainly don't assume any two vendors' details are remotely the same; it is very likely they have purchased similar IP for certain technologies (maybe everyone uses the same USB IP, for example, or most USB IP conforms to a common set of registers, or maybe not...).
I have not even gotten to the ARM core yet: in these designs you could likely change your mind, pull the ARM out, put a MIPS in, and sell that as a product...
Now, does it make sense to write logic that reads an SPI flash, loads the contents of that flash into an internal SRAM, and then, for that boot mode, maps that SRAM at the ARM processor's address zero and resets the ARM? Yes, that is not a horrible thing to do in logic only. But does it make sense, for example, to have logic dig through a file system on a SATA drive to find some bootloader? Maybe not so much. Possible, sure, but maybe your product will be viable longer if you instead put a ROM in the product that can be aimed at the ARM's address zero. The ARM boots that; the ARM code in that ROM reads the straps, decides what the boot media is, spins up that peripheral (SATA, eMMC, SPI, etc.), wades through the file system looking for a file name, copies that file to SRAM, remaps the ARM address space (not using an MMU but using the logic in the chip), and fakes a reset by branching to address zero. (The ROM is mapped in at least two places, address zero plus some other address, so that it can branch into the other address space, allowing address zero to be remapped and reused.)

That way, if down the road you find a bug, all you have to do is change the image burned into the ROM before you deliver the chips, rather than spin the chip to change the transistors and/or the wiring of the transistors (pennies and days/weeks versus millions of dollars and months). So you may actually never see, or be the author of, the code that the ARM processor boots into on reset. The reset line to the ARM core is something you might never have any physical nor software access to.
THEN, depending on the myriad of boot options for this or any of the many chip offerings, the next step is very much specific to that chip and possibly that boot mode. You, as the owner of all the boot code for that board-level product, may have to, per the chip and board design, bring up DDR, bring up PCIe, bring up USB. Or perhaps some chip-vendor logic/code has done some of that for you (unlikely, but maybe for specific boot cases). Now, you have these generic and popular bootloaders like u-boot; you, as the software designer and implementer, may choose to have code that precedes u-boot and does a fair amount of the work, because maybe porting u-boot is a PITA, maybe not. Also note that u-boot is in no way required for Linux. Booting Linux is easy; u-boot is a monstrosity, a beast of its own, and the easiest way to boot Linux is to not bother porting u-boot. What u-boot gives you is an already written bootloader; it is an argument that can go either way whether it is cheaper to port u-boot or cheaper to just roll your own (or port one of u-boot's competitors). It depends on the boot options you want: if you want BOOTP/TFTP or anything involving a network stack, well, that is a task, although there are off-the-shelf solutions. If you want to access a file system on some media, that is another strong argument to just use u-boot. But if you don't need all of that, then maybe you don't need u-boot.
You have to walk the list of things that need to happen before Linux boots. These chips tend not to have enough on-chip RAM to hold the Linux kernel image and the root file system, so you need to get DDR up before Linux. You probably need to get PCIe up and enumerated, and maybe USB, I have not looked at that. But Ethernet, for example, can be brought up by the Linux driver for that peripheral.
The requirements to "boot" linux on arm ports of linux and probably others are relatively simple. you copy the linux kernel to some space in memory ideally aligned or at an agreed offset from an aligned address (say 0x10001000 for example, just pulling that out of the air), you then provide a table of information, how much ram there is, the ascii kernel boot string, and these days the device tree information. you branch to the linux kernel with one of the registers say r0 pointed at this table (google ATAG arm linux or some such combination of words). thats it booting linux using a not that old kernel is setting a few dozen bytes in ram, copy the kernel to ram, and branch to it, a few dozen lines of code, no need for the u-boot monstrosity. Now it is more than a few dozen bytes but it is still a table generated independent of u-boot, place that in ram, place the kernel in ram set one or more registers to point at the table, branch to the address where the kernel lives "booting linux" is complete or the linux bootloader is complete.
You still have to port Linux, which is a task that requires lots of trial and error and eventually years of experience, particularly since Linux is a constantly evolving beast in and of itself.
How do you get to u-boot code? You may have some pre-u-boot code you have to write to find u-boot, copy it to RAM and branch to it. Or the chip vendor may have solved this for you, and "all you have to do" is put u-boot where they tell you for the media choice; then u-boot is placed at address zero in the ARM memory space in some internal SRAM, or u-boot is placed at some non-zero address in the ARM memory space and some magic (a ROM-based bootloader in the chip) causes your u-boot code to execute from that address.
One I messed with recently is the TI chip used on the various Beagle boards (Black, Green, White, Pocket, etc.). In one of its boot modes it looks at a specific offset on the SD card (not part of a file system yet; a specific logical block if you will, or basically a specific offset in the SD card's address space) for a table. That table includes where in the "processor's" address space you want the attached "bootloader" copied to, whether it is compressed, and so on. You make your bootloader (roll your own or build a u-boot port), and you build the correct table per the documentation with a destination address, how much data, possibly a CRC/checksum, whatever the docs say. The "chip" magically (probably software, but it might be pure logic) copies that over and causes the ARM to start executing at that address (likely ARM software that simply branches there). And that is how YOU get u-boot running on that product line with that boot option.
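Conceptually, the table is just a tiny header placed ahead of the bootloader image at that fixed offset. A hedged sketch of what such a raw-mode boot header can look like (field names are illustrative; the authoritative layout is in the AM335x TRM):

```
#include <stdint.h>

/* Illustrative raw-mode boot header, along the lines of what the ROM code
 * reads from a fixed offset on the SD card: it copies image_size bytes that
 * follow the header to load_address (internal SRAM) and jumps there. Field
 * names and ordering here are not authoritative; check the TRM. */
struct raw_boot_header {
    uint32_t image_size;    /* number of bytes of bootloader that follow */
    uint32_t load_address;  /* where in the SoC address space to copy it */
};
```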
The SAME product line has other strap options and other SD-card boot options to get a u-boot loaded and running.
Other products from other vendors have different solutions.
The Broadcom chip in the Raspberry Pi is a totally different beast, or at least it is used totally differently. It has a Broadcom (invented or purchased) GPU in it. That GPU boots some ROM-based code that knows how to find its own first-stage bootloader on an SD card. That first-stage loader does things like initialize DDR; there isn't PCIe, so that doesn't have to happen, and I don't think the GPU cares about USB, so that doesn't have to get enumerated either. But it does search out a second-stage bootloader of GPU code, which is really an RTOS it is loading: the code the GPU uses to do its graphics features and offload the burden from the ARM. In addition, that software also looks for a third file on the flash (and a fourth and an nth); let's just go with the third, kernel.img, which it copies to RAM (the DDR is shared between the GPU and the ARM, but with different addressing schemes) at an agreed offset (0x8000 if kernel.img is used without config.txt adjustments). The GPU then writes a bootstrap program and ATAGs into the ARM's memory at address zero and releases reset on the ARM core(s). The GPU is the bootloader, with relatively limited options, but for that platform design/solution there is one media option, a removable SD card, and whatever operating system, etc., you run on the ARM is whatever is on that SD card.
I think you will find lots of straps driving multiple possible non-volatile storage peripherals to be the more common solution. Whether one or any of these boot options for a particular SoC can take u-boot (or the bootloader of your choice, or one you write yourself) directly, or whether a pre-u-boot program is required for any number of reasons (the on-chip SRAM is too small for a full u-boot, say, for the sake of argument), is specific to that boot option for that chip from that vendor, and it is documented somewhere, although if you are not part of the company making the board that signed the NDA with that chip vendor, you may not get to see that documentation. And, as you may know or will learn, that doesn't mean the documentation is good or makes sense. Some companies or products do a bad job, some do a good job, and most are somewhere in between. If you are paying them for parts and have an NDA, you at least have the option of getting or buying tech support and can ask direct questions (again, the answers may not be useful; it depends on the company and their support).
Just because there is an ARM inside means next to nothing. ARM makes processor cores, not chips; depending on the SoC design it may be easy or moderately painful, but possible, to pull the ARM out, put some other purchased IP (like MIPS) or free IP (like RISC-V) in there, and re-use the rest of your already-tested design/glue. Trying to generalize ARM-based processors is like trying to generalize automobiles with V6 engines: if I have a vehicle with a V6 engine, does that mean the headlight controls are always on the dash to the left of the steering column? Nope!
Documentation of GPIOs in Linux states:
A "General Purpose Input/Output" (GPIO) is a flexible software-controlled
digital signal. They are provided from many kinds of chip, and are familiar
to Linux developers working with embedded and custom hardware.
If we are capable of controlling the behavior of a pin through software, then why aren't all the pins on a chip GPIOs?
OR
How can we provide functionality through software for a pin on a chip?
Please explain.
When you design an integrated circuit (chip), you design with some component model in mind. Those internal components may have specific needs that cannot be reassigned among different pins; those pins are then fixed-function.
For example, pins related to the memory controller have a very strict set of performance requirements (in terms of signal integrity, toggle rate, output driver, capacitance); those pins are fixed-function and not reassignable, so you cannot use them as GPIOs. If you did, you would end up with a slower chip, because the additional multiplexing circuitry would degrade those values and make them unfeasible. Another example is the power domain pins (those typically called VCC, VDD, VEE, GND).
That's why GPIO pins are always shared with slow interfaces like SPI, I2C and SMBus, but never with fast interfaces like SATA, DDR, etc.
In other cases the only reason is that the chip doesn't make sense without a particular component. For example, given that you must have RAM, the RAM-dedicated pins don't need to be reassignable: you will never build the system without RAM, so you will never need to reuse those pins as GPIOs.
Not all the pins on an SoC are GPIOs. A specific group of pins is mapped as GPIO. Other pins are configured for specific interfaces like DDR, SPI, I2C, etc., which include clock, data and power supply pins. GPIOs are generic pins that can be used for any purpose based on user requirements: handling IRQs, triggering resets, lighting LEDs, etc.
For example, consider an FPGA connected to an SoC via GPIOs, where the user needs to program the FPGA through those GPIO pins. On the SoC side, the user has to write a program that drives those GPIOs in the documented sequence to load the FPGA configuration file, as sketched below.
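As an illustration only: the GPIO numbers and the clock/data scheme below are made up, a real FPGA defines its own documented configuration sequence, and newer kernels prefer the libgpiod character-device API over the legacy sysfs interface used here for brevity.

```
/* Illustrative only: bit-bang a configuration bitstream out of two GPIOs
 * using the legacy sysfs interface. GPIO numbers and the clock/data
 * scheme are hypothetical; a real FPGA defines its own sequence. */
#include <stdio.h>
#include <stdlib.h>

#define CLK_GPIO  "68"   /* hypothetical GPIO numbers */
#define DATA_GPIO "69"

static void write_str(const char *path, const char *val)
{
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); exit(1); }
    fputs(val, f);
    fclose(f);
}

static void set_gpio(const char *gpio, int value)
{
    char path[64];
    snprintf(path, sizeof(path), "/sys/class/gpio/gpio%s/value", gpio);
    write_str(path, value ? "1" : "0");
}

int main(void)
{
    /* export both pins and make them outputs */
    write_str("/sys/class/gpio/export", CLK_GPIO);
    write_str("/sys/class/gpio/export", DATA_GPIO);
    write_str("/sys/class/gpio/gpio" CLK_GPIO "/direction", "out");
    write_str("/sys/class/gpio/gpio" DATA_GPIO "/direction", "out");

    unsigned char bitstream[] = { 0xDE, 0xAD, 0xBE, 0xEF }; /* stand-in data */

    /* shift each bit out MSB first: set data, then pulse the clock */
    for (size_t i = 0; i < sizeof(bitstream); i++) {
        for (int bit = 7; bit >= 0; bit--) {
            set_gpio(DATA_GPIO, (bitstream[i] >> bit) & 1);
            set_gpio(CLK_GPIO, 1);
            set_gpio(CLK_GPIO, 0);
        }
    }
    return 0;
}
```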
What are some physical materials, such as metal plates, that are efficient at blocking out Bluetooth Low Energy signals? If I have a BLE beacon sensor, how can I limit the sensor to receive signals from only one direction and not from the sides or behind the sensor?
While it is not possible to completely block signals from other directions, you can reduce the strength of signals from various directions with metal shielding. Just as important as the material is its placement relative to the sensor, and grounding the shielding material so it does not act as an antenna itself. Specific instructions are difficult to give, as a lot depends on your physical environment. Expect lots of trial and error.
A more reliable approach than the above is to get a BLE sensor with an antenna connector and attach a directional antenna pointing in the direction from which you want to collect signals.
I need to transfer data from a bare metal microcontroller system to a linux PC with 2 MBaud.
The linux PC is currently running a 32 bit Kubuntu 14.04.
To achieve this, I tried to use an FT232R-based USB-UART adapter, but I sometimes observed lost data.
As long as the Linux PC is mainly idle, it seems to work most of the time; however, I still see rare data loss.
But when I put load on the CPU (e.g. by rebuilding my project), the data loss increases significantly.
After some research I read here that the FT232R has a receive buffer with a capacity of only 384 bytes. This means the FT232R has to be read out (USB-polled) at least every 1.9 ms (at 2 MBaud, one 10-bit frame takes 5 µs, so 384 bytes last about 1.9 ms). Well, FTDI recommends using flow control, but because of the microcontroller system used, I cannot use any flow control.
I can live with the fact that there is no absolute guarantee against data loss. But the observed amount of data loss is far too high for my needs.
So I tried to find a way to increase the priority of the "FT232 driver" on my Linux system, but cannot find out how to do this. It's not described in
AN220 FTDI Drivers Installation Guide for Linux
and the document
AN107 FTDI Advanced Driver Options
has a chapter about "Changing the Driver Priority", but only for Windows.
So, does anybody know how to increase the FT232R driver priority in Linux?
Any other ideas to solve this problem?
BTW: As I read the FT232H datasheet, it seems that it comes with a 1 KiB RX buffer. I ordered one just now and will check out its behaviour. Edit: No significant improvement.
If you want reliable data transfer, there is absolutely no way to use any USB-to-serial bridge correctly without hardware flow control, and without dedicating at least all remaining RAM in your microcontroller as the serial buffer (or at least enough to store ~1 s worth of data).
I've been using FTDI devices since FT232AM was a hot new thing, and here's how I implement them:
(At least) four lines go between the bridge and the MCU: RXD, TXD, RTS#, CTS#.
Flow control is enabled on the PC side of things.
Flow control is enabled on the MCU side of things.
MCU code only sends a communication when it can fit a complete reply packet into the buffer. Otherwise, it lets the PC side time out and retry the request. For requests that stream data back, the entire frame is dropped if it can't fit into the transmit buffer at the time the frame is ready.
If you wish the PC to be reliably notified of new data, say every number of complete samples/frames, you must use event characters to flush the FTDI buffers to the host, and encode your data accordingly. HDLC works great for that purpose and is documented in free standards (RFCs and the ITU X and Q series - all free!).
The VCP driver, or the D2XX port bring-up, is set up with transfer sizes and latencies suited to the needs of the application.
The communication protocol is framed, with CRCs. I usually use a cut-down version of X.25/Q.921/HDLC, limited to SNRM(E) mode for simple "dumb" command-and-respond devices, and SABM(E) for devices that stream data.
The size of the FTDI buffers is immaterial; your MCU should have at least an order of magnitude more storage available to buffer things.
If you're running hard real-time code, such as signal processing, make sure that you account for the overhead of lots of transmit interrupts running "back-to-back". Once the FTDI device purges its buffers after a USB transfer, and indicates that it's ready to receive more data from your MCU, your code can potentially transmit a full FTDI buffer's worth of data at once.
If you're close to running out of cycles in your real-time code, you can use a timer as the source of transmit interrupts instead of the UART interrupt. You can then set the timer rate much lower than the UART speed. This allows you to pace the transmission without lowering the baud rate. If you're running in setup/preoperational mode or with a lower real-time task load, you can then trivially raise the transmit rate without changing the baud rate. You can use a similar trick to pace the receives by flipping the RTS# output on the MCU under timer control. Of course, this isn't a problem if you use DMA or a sufficiently fast MCU. (A minimal sketch of timer-paced transmission follows this list.)
If you're out of timers, note that many other peripherals can also be repurposed as a source of timer interrupts.
This advice applies no matter what the USB host is.
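To illustrate the timer-paced transmit idea from the list above, here is a minimal, vendor-neutral sketch; the HAL names (uart_tx_ready(), uart_write_byte(), the timer ISR hookup) are hypothetical stand-ins for whatever your MCU actually provides:

```
/* Hypothetical sketch: pace UART transmission from a periodic timer
 * interrupt instead of the UART TX-empty interrupt. */
#include <stdint.h>
#include <stdbool.h>

extern bool uart_tx_ready(void);          /* stand-in HAL functions */
extern void uart_write_byte(uint8_t b);

#define TX_BUF_SIZE 1024u                 /* power of two for cheap wrapping */

static volatile uint8_t  tx_buf[TX_BUF_SIZE];
static volatile uint32_t tx_head, tx_tail;

/* called from application code: returns false if the byte doesn't fit,
 * so the caller can drop the whole frame instead of sending half of it */
bool tx_enqueue(uint8_t byte)
{
    uint32_t next = (tx_head + 1) & (TX_BUF_SIZE - 1);
    if (next == tx_tail)
        return false;
    tx_buf[tx_head] = byte;
    tx_head = next;
    return true;
}

/* periodic timer ISR: the period is chosen well above the raw byte time,
 * e.g. 50 us per byte instead of the 5 us that 2 MBaud would allow */
void timer_isr(void)
{
    if (tx_tail != tx_head && uart_tx_ready()) {
        uart_write_byte(tx_buf[tx_tail]);
        tx_tail = (tx_tail + 1) & (TX_BUF_SIZE - 1);
    }
}
```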
Sidebar: Admittedly, Linux USB serial driver "architecture" is in the state of suspended animation as far as I can tell, so getting sensible results there may require a lot of work. It's not a matter of a simple kernel thread priority change, I'm afraid. Part of the reason is that funding for a lot of Linux work focuses on server/enterprise applications, and there the USB performance is a matter of secondary interest at best. It works well enough for USB storage, but USB serial is a mess nobody really cares enough to overhaul, and overhaul it needs. Just look at the amount of copy-pasta in that department...
There are so many structures in the Linux wireless driver framework mac80211: things like struct net_device, struct ieee80211_hw, struct ieee80211_vif, struct ieee80211_local and so on. So many structures that I don't understand what information they contain and when they are initialized.
How can I learn about them and the whole architecture of wireless drivers?
You may want to check out Johannes Berg's (mac80211 maintainer) slides here.
They may be somewhat outdated but should give you a place to start.
A high level description of the Linux WiFi kernel stack:
It's important to understand that there are two paths through which userspace communicates with the kernel when we're talking about WiFi:
Data path: the data being received is passed from the wireless driver to the netdev core (usually using netif_rx()). From there the net core passes it through the TCP/IP stack code and queues it on the relevant sockets, from which the userspace process will read it. On the Tx path, packets are sent from the netdev core to the wireless driver using the ndo_start_xmit() callback. The driver registers a set of operation callbacks (like other netdevices, such as an Ethernet driver) by using struct net_device_ops (a condensed sketch of such a registration follows the control-path item below).
Control path: This path is how userspace controls the WiFi interface/device and performs operations like scan / authentication / association. The userspace interface is based on netlink and called nl80211 (see include/uapi/linux/nl80211.h). You can send commands and get events in response.
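To make the data-path side concrete, here is a condensed, non-authoritative sketch of how a netdevice driver wires up struct net_device_ops; the driver name, private struct and the omitted error handling are my own simplifications:

```
#include <linux/netdevice.h>

/* hypothetical per-device driver state */
struct mywifi_priv {
    /* hardware queues, locks, ... */
};

static netdev_tx_t mywifi_start_xmit(struct sk_buff *skb,
                                     struct net_device *dev)
{
    /* hand the frame to the hardware here, then free or queue the skb */
    dev_kfree_skb_any(skb);
    return NETDEV_TX_OK;
}

static const struct net_device_ops mywifi_netdev_ops = {
    .ndo_start_xmit = mywifi_start_xmit,
    /* .ndo_open, .ndo_stop, ... */
};

/* at probe time the driver would do roughly this, and on receive it
 * pushes frames up the stack with netif_rx(skb) */
static void mywifi_setup(struct net_device *dev)
{
    dev->netdev_ops = &mywifi_netdev_ops;
}
```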
When you send an nl80211 command, it gets initially handled by the cfg80211 kernel module (its code is under net/wireless and the handlers are in net/wireless/nl80211.c).
cfg80211 will usually call into a lower level driver. In the case of Full MAC hardware, the HW-specific driver sits right below cfg80211. The driver below cfg80211 registers a set of ops with cfg80211 by using the cfg80211_ops struct. For example, see the brcmfmac driver (drivers/net/wireless/brcm80211/brcmfmac/wl_cfg80211.c).
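Schematically, a Full MAC driver's registration with cfg80211 can be sketched like this (heavily trimmed and not authoritative; op signatures have changed across kernel versions, and real drivers implement many more callbacks):

```
#include <net/cfg80211.h>

struct myfullmac_priv { /* firmware handles, state, ... */ };

/* signature varies by kernel version */
static int myfullmac_scan(struct wiphy *wiphy,
                          struct cfg80211_scan_request *request)
{
    /* ask the firmware to scan, report results via cfg80211_scan_done() */
    return 0;
}

static const struct cfg80211_ops myfullmac_cfg_ops = {
    .scan = myfullmac_scan,
    /* .connect, .disconnect, .add_key, ... */
};

static int myfullmac_attach(struct device *dev)
{
    struct wiphy *wiphy;

    /* allocate a wiphy with room for our private data and register it */
    wiphy = wiphy_new(&myfullmac_cfg_ops, sizeof(struct myfullmac_priv));
    if (!wiphy)
        return -ENOMEM;
    set_wiphy_dev(wiphy, dev);
    return wiphy_register(wiphy);
}
```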
For Soft MAC hardware there's mac80211, which is a kernel module implementing the 802.11 MAC layer. In this case cfg80211 talks to mac80211, which in turn uses the hardware-specific lower level driver. An example of this is iwlwifi (for Intel chips).
mac80211 registers itself with cfg80211 by using the cfg80211_ops (see net/mac80211/cfg.c). The specific HW driver registers itself with mac80211 by using the ieee80211_ops struct (for example drivers/net/wireless/iwlwifi/mvm/mac80211.c).
Initialization of a newly connected NIC happens from the bottom of the stack up. The HW-specific driver calls mac80211's ieee80211_alloc_hw(), usually after probing the HW. ieee80211_alloc_hw() receives the size of the private data struct used by the HW driver. It in turn calls cfg80211's wiphy_new(), which does the actual allocation of space sufficient for the wiphy struct, the ieee80211_local struct (used internally by mac80211) and the HW driver's private data (the layering can be seen in the ieee80211_alloc_hw() code).
ieee80211_hw is an embedded struct within ieee80211_local; it is the part that is "visible" to the HW driver. All of these (wiphy, ieee80211_local, ieee80211_hw) represent a single connected physical device.
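A heavily condensed, non-authoritative sketch of how a Soft MAC driver plugs into mac80211 via ieee80211_ops and ieee80211_alloc_hw()/ieee80211_register_hw() (the driver name and private struct are invented, and callback signatures vary across kernel versions):

```
#include <net/mac80211.h>

/* hypothetical per-device private data, allocated by mac80211 for us */
struct mydrv_priv {
    /* register mappings, firmware state, ... */
};

/* callback signatures vary across kernel versions */
static void mydrv_tx(struct ieee80211_hw *hw,
                     struct ieee80211_tx_control *control,
                     struct sk_buff *skb)
{
    /* push the frame to the hardware queues */
}

static int  mydrv_start(struct ieee80211_hw *hw) { return 0; }
static void mydrv_stop(struct ieee80211_hw *hw)  { }

static const struct ieee80211_ops mydrv_ops = {
    .tx    = mydrv_tx,
    .start = mydrv_start,
    .stop  = mydrv_stop,
    /* .add_interface, .config, .configure_filter, ... */
};

static int mydrv_probe(void)
{
    struct ieee80211_hw *hw;
    struct mydrv_priv *priv;

    /* mac80211 allocates wiphy + ieee80211_local + our private area */
    hw = ieee80211_alloc_hw(sizeof(*priv), &mydrv_ops);
    if (!hw)
        return -ENOMEM;
    priv = hw->priv;    /* our private area, sized as requested above */

    /* ... set hw capabilities, supported bands, etc., then: */
    return ieee80211_register_hw(hw);
}
```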
On top of a single physical device (also referred to as a phy) you can set up multiple virtual interfaces. These are essentially what you know as wlan0 or wlan1, which you control with ifconfig. Each such virtual interface is represented by an ieee80211_vif. This struct also contains, at its end, private data accessed by the HW driver. Multiple interfaces can be used to run, for example, a station on wlan0 and an AP on wlan1 (whether this is possible depends on the HW capabilities).