$HASP373 and IEF403I z/OS syslog - mainframe

I have a question about the z/OS syslog:
I would just like to know: is every operation that gets started always announced by $HASP373 and IEF403I?
And is the Ended status always announced by $HASP395 and IEF404I?

The trouble with z/OS is that it's really hard to explain something without introducing another concept that also needs explaining, which in turn requires another explanation, etc. This is partly because the z/OS operating system is from a different planet compared to Unix, Windows, OS X etc., all of which are broadly similar.
Those messages are issued by the system for a lot of the work that happens on a mainframe, but not all of it.
All work on z/OS runs in its own address space, which is almost like a mini-VM. There will be many address spaces in a z/OS system (380 in ours currently). A program in an address space is not aware of any other address spaces and thinks it has access to the entire 2GB (31-bit addressing) range of memory (different address spaces can communicate if necessary and authorised, and more than 2GB is available with 64-bit addressing). A program in one address space cannot crash a program in another address space by overwriting storage. Programs in two different address spaces can access the same memory address, but don't affect each other, as they will actually, unbeknownst to them, access different memory.
There are 4 types of address spaces:
TSO (Time Sharing Option) - these are users logged on to the system, typing commands and getting responses. They may run scripts, using the languages REXX and Clist (Command Lists - older, generally replaced by REXX) much like Perl and shell scripts, submit batch jobs, write and compile code etc.
BATCH JOBS (or JOB) - This is where you want to run a program, so you create a text file with the name of the program(s) to run and the file(s) that it/they need(s) and SUBMIT it. The system will run the program(s) and tell you when they are done. Whilst it's running, you can go and do something else. You don't even need to be logged on - you can prepare an FTP job (for example) to run at 01:00 whilst you're asleep, and another job to run if the first one works.
STARTED TASKS (STCs) - Very similar to a batch job. Usually started either by the system itself when it starts or by an operator issuing a START command for that STC at the system console. (E.g. 'START DB2' starts the DB2 started task. Alternatively a user may submit a batch job for their own test DB2 system.)
System Address Spaces (SYSAS). Consider these like a Unix daemon: started by the operating system itself for various essential processes. There are also address spaces representing processes running under the 'Unix' half of z/OS (USS - Unix System Services), but that's another story.
There is no such thing as an 'operation' in z/OS terms. Within an address space, many programs may be running, each one identified by a TCB (Task Control Block) or SRB (System Request Block).
However, if you knew that the information you wanted was produced by a normal batch job, then looking for the £HASP373 and £HASP395 messages for that job would be the right place to start. Bear in mind that the message ids (HASP373 and HASP395) might not start with a '£' on your system. '£' is the default, but it is a customisable parameter. $ and # are also fairly common.
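To make that concrete, here's roughly what the start/end sequence for an ordinary batch job looks like in the syslog. The jobname and details are invented for illustration, and the exact text varies by release and configuration:

$HASP373 PAYROLL1 STARTED - INIT 1 - CLASS A - SYS SYS1
IEF403I PAYROLL1 - STARTED - TIME=10.15.30
IEF404I PAYROLL1 - ENDED - TIME=10.15.34
$HASP395 PAYROLL1 ENDED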
I do know what I'm talking about, but if any of the above is not clear, then I haven't explained it very well. I may be guilty of doing exactly what I warned against and explaining an unknown concept by using another unknown concept. :-)

Work gets into z/OS through something called the subsystem interface. Part of this flow is that generally, when an address space is started, it requests work from the subsystem that started the address space through a well-defined interface (IEFSSREQ). This handshake is where things like your HASP messages come from.
Here's a watered down example.
An operator enters a START command from a system console. As part of processing that command, the system creates an address space, and eventually a thread in the new address space says, "ok - I'm ready...give me some work to do". This goes to the primary job entry subsystem, who hands the address space something to do - the internal data structures representing the task that the operator started in this case. As part of this chain, the various $HASP messages are issued, and this works pretty much the same way for TSO sessions, started tasks (STCs) and JCL submitted for a batch job.
JES2/JES3 are examples of subsystems, but there are others.
For example, if our operator added the SUB=MSTR parameter on the START command, the requests wouldn't go through the primary JES - and so there wouldn't be any of the $HASP messages you're looking for. There are plenty of vendor applications that start and manage address spaces outside of JES, and this is the stuff you miss by limiting yourself to the HASP and IEF403I messages.
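For illustration, the difference is literally one parameter on the command (DB2 used here just as a familiar name):

START DB2,SUB=MSTR

This starts the address space under the master subsystem rather than the primary JES, so JES never sees it and never issues $HASP373/$HASP395 for it.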
Also, UNIX Services has a variety of APIs similar to UNIX "fork" that can be used to spawn address spaces without necessarily involving JES.
If you want to know about activity starting and ending, there are better ways - SMF, ENF signals, etc. A great way to learn this stuff if you don't know already is to use the system trace facilities and read some dumps. The wonderful thing about z/OS is that it's all right there, for those who spend the time figuring out where to look.

No. Those messages are for jobs. Not all operations are jobs. An example of an operation that is not a job would be a system command. I don't have a z/OS system at hand right now, but I believe another example of an operation that would not use the messages you reference would be a started task.
This may be helpful, as it attempts to explain z/OS concepts in Unix terms.

A job is something that goes through JES2/JES3. (In your case, JES2.) JES2/JES3 jobs are generally used for batch type of work. For example, a sort job, where I submit something, and come back later and get an answer. However, there's a lot of work running under z/OS that doesn't go through JES2/JES3.
Part of the problem here is what you mean by an operation; for example, while you may get a message saying that DB2 has started, after it's started, it's not going to tell you every time it gets a query. A TSO user might run a REXX exec underneath his/her address space, but that's not going to go through JES.
Another way to look at this is that JES2/JES3 are job management subsystems, but they aren't equivalent to the kernel on a Unix/Windows system, which does schedule all the work running on the system. For z/OS, there are multiple ways that work can come in to a system; examples include JES2/JES3, TSO, ISPF, CICS, DB2, IMS, via the console, etc. It's then up to the master scheduler/WLM/SRM to manage all the requests that come in through all of the subsystems.
If you have access to a z/OS system, look into SDSF, or whatever you use to manage JES2. The ST panel, under SDSF, is a list of things that are running/eligible to run that are managed by JES2. However, if you look at the DA panel (assuming you have authority to do so), you'll note that there are a lot of address spaces that show up on the DA panel that don't show up in the ST panel.

If address spaces are started through the JES2-subsystem, which is normally the case unless another subsystem or MSTR is specified using the MVS START command, then the $HASP373 jobname STARTED is issued. Similarly, when the address space ends, message $HASP395 is issued.
The IEF403I and IEF404I messages are issued by the system in similar situations and independent of what either JES2 or JES3 are doing and regardless under what subsystem the address space was started. The messages are only issued when the operator has requested to monitor job names using the SETCON MONITOR or the MONITOR JOBNAMES command. Products for automated operations typically do this.
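For example, an operator (or an automation product) might enter the following at the console (syntax from memory - check your system's command reference):

MONITOR JOBNAMES,T

after which the IEF403I/IEF404I messages are issued for work starting and ending (the T adds timestamps); STOPMN JOBNAMES turns the monitoring off again.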

Related

Persistent storage values handling in Linux

I have a QSPI flash on my embedded board.
I have a driver plus a process "Q" that handles reading from and writing to it.
I want to store variables like SW revisions, IP, operation time, etc.
I would like to ask for suggestions on how to handle the different access rights for writing and reading values from user space and other processes.
I was thinking of having a file for each variable. Then I can assign access rights to those files, and process Q can update the value in a file when it changes. So process Q will only write to them, and other processes or users can only read.
But I'm not sure about the writing side. I was thinking about using a message queue or ZeroMQ and building the software around it, but I'm not sure that isn't overkill. And I'm not sure how to manage access rights with that approach anyway.
What would be the best approach? I would really appreciate it if you could propose even a totally different approach.
Thanks!
This question will probably be downvoted / flagged due to the "Please suggest an X" nature.
That said, if a file per variable is what you're after, you might want to look at implementing a FUSE file system that wraps your SPI driver/utility "Q" (or build it into "Q" if you get to compile/control the source to "Q"). I'm doing this to store settings in an EEPROM on a current work project and it's turned out nicely. So I have, for example, a file that, when read, retrieves 6 bytes from the EEPROM (or a cached copy) and provides a MAC address in standard hex/colon-separated notation.
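To make the idea concrete, here is a heavily stripped-down sketch of the read-only side of such a filesystem in C against the libfuse 2.x API. The single hard-coded 'mac' file and the fake_read_eeprom() helper are invented for illustration - a real version would talk to your SPI driver (or process Q) and expose one file per setting:

#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: in a real version this would ask the SPI/QSPI
   driver (or process Q) for the 6 raw bytes of the MAC address. */
static void fake_read_eeprom(unsigned char out[6])
{
    static const unsigned char demo[6] = {0xDE, 0xAD, 0xBE, 0xEF, 0x00, 0x01};
    memcpy(out, demo, 6);
}

/* Render the raw bytes as "de:ad:be:ef:00:01\n". */
static int render_mac(char *buf, size_t buflen)
{
    unsigned char m[6];
    fake_read_eeprom(m);
    return snprintf(buf, buflen, "%02x:%02x:%02x:%02x:%02x:%02x\n",
                    m[0], m[1], m[2], m[3], m[4], m[5]);
}

static int settings_getattr(const char *path, struct stat *st)
{
    char tmp[64];
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, "/mac") == 0) {
        st->st_mode = S_IFREG | 0444;   /* world-readable; writes stay with Q */
        st->st_nlink = 1;
        st->st_size = render_mac(tmp, sizeof(tmp));
    } else {
        return -ENOENT;
    }
    return 0;
}

static int settings_readdir(const char *path, void *buf, fuse_fill_dir_t fill,
                            off_t off, struct fuse_file_info *fi)
{
    (void)off; (void)fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    fill(buf, ".", NULL, 0);
    fill(buf, "..", NULL, 0);
    fill(buf, "mac", NULL, 0);          /* one entry per stored setting */
    return 0;
}

static int settings_read(const char *path, char *buf, size_t size, off_t off,
                         struct fuse_file_info *fi)
{
    char tmp[64];
    int len;
    (void)fi;
    if (strcmp(path, "/mac") != 0)
        return -ENOENT;
    len = render_mac(tmp, sizeof(tmp));
    if (off >= len)
        return 0;
    if (off + (off_t)size > (off_t)len)
        size = (size_t)(len - off);
    memcpy(buf, tmp + off, size);
    return (int)size;
}

static struct fuse_operations settings_ops = {
    .getattr = settings_getattr,
    .readdir = settings_readdir,
    .read    = settings_read,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &settings_ops, NULL);
}

Compile it with something like gcc settingsfs.c -o settingsfs $(pkg-config fuse --cflags --libs), mount it on an empty directory, and reading mountpoint/mac behaves like reading an ordinary file; write-side entry points (restricted to process Q) would go in the same operations table.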
The biggest advantage here is that it becomes trivial to access all your configuration/settings data from shell scripts (e.g. your init process) or other scripting languages.
Another neat feature of doing it this way is that you can use inotify (which comes "free", no extra code in the fusefs) to create applications that efficiently detect when settings are changed.
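For instance, with the inotify-tools package installed, a shell script could watch the whole settings directory (the mount point name is invented for the example):

inotifywait -m -e close_write /mnt/settings/

and react whenever a value is rewritten.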
A disadvantage of this approach is that it's non-trivial to do atomic transactions on multiple settings and still maintain normal file semantics.

How to monitor a process in Linux (CPU, memory and time)

How can I benchmark a process in Linux? I need something like "top" and "time" put together for a particular process name (it is a multiprocess program, so many PIDs will be given).
Moreover, I would like to have a plot over time of memory and CPU usage for these processes, not just final numbers.
Any ideas?
I typically throw together a simple script for this type of work.
Take a look at the kernel documentation for the proc filesystem (Google 'linux proc.txt').
The first line of /proc/stat (Section 1.8 in proc.txt) will give you cumulative cpu usage stats (i.e. user, nice, system, idle, ...). For each process, the file /proc/$PID/stat (Table 1-4 in proc.txt) will provide you with both process-specific cpu usage stats and memory usage stats (see rss).
If you google a bit you'll find plenty of detailed info on these files, and pointers to libraries / apps / code snippets that can help you obtain / derive the values you need. With that in mind, I'll focus on the high-level strategy.
For CPU stats, use your favorite scripting language to create an executable that takes a set of process ids for monitoring. At a fixed interval (ex: 1 second) poll / calculate the cumulative totals for each process and the system as a whole. During each poll interval, write all results on a single line to stdout.
For memory stats, write a similar script, but simply log the per-process memory usage. Memory is a bit easier as we directly obtain the instantaneous values.
Run these scripts for the duration of your test, passing the set of process ids that you'd like to monitor and redirecting their output to log files.
./logcpu $(pidof foo) $(pidof bar) > cpustats
./logmem $(pidof foo) $(pidof bar) > memstats
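In case a concrete starting point helps, here is a rough sketch of what such a logcpu could look like in C. The name and the one-second interval just follow this answer's conventions; field positions come from proc.txt, and error handling is deliberately minimal:

/* logcpu.c - print cumulative jiffies for the system and each PID given.
   Usage: ./logcpu PID [PID ...] > cpustats */
#include <stdio.h>
#include <unistd.h>

/* Sum of the fields on the first line of /proc/stat = total CPU jiffies. */
static long long total_jiffies(void)
{
    FILE *f = fopen("/proc/stat", "r");
    long long total = 0, v;
    if (!f)
        return -1;
    fscanf(f, "cpu");                 /* skip the leading "cpu" tag */
    while (fscanf(f, "%lld", &v) == 1)
        total += v;
    fclose(f);
    return total;
}

/* utime + stime, fields 14 and 15 of /proc/PID/stat (see proc.txt). */
static long long proc_jiffies(const char *pid)
{
    char path[64];
    unsigned long long utime, stime;
    FILE *f;
    snprintf(path, sizeof(path), "/proc/%s/stat", pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    /* Field 2 (comm) is parenthesised and may contain spaces, so skip
       past the ')' first, then skip fields 3 to 13. */
    if (fscanf(f, "%*d (%*[^)]) %*c %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u"
               " %llu %llu", &utime, &stime) != 2) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return (long long)(utime + stime);
}

int main(int argc, char *argv[])
{
    for (;;) {
        printf("%lld", total_jiffies());
        for (int i = 1; i < argc; i++)
            printf(" %lld", proc_jiffies(argv[i]));
        printf("\n");
        fflush(stdout);
        sleep(1);                     /* the fixed poll interval */
    }
}

It prints one line per second - total system jiffies first, then utime+stime per PID - all cumulative, so the delta step described next still applies.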
Import the contents of these files into a spreadsheet (for certain applications this is as easy as copy / paste). For CPU, you are after instantaneous values but have cumulative values, so you'll need to do some minor spreadsheet work to derive these values (it's just the delta 't(x + 1) - t(x)'). Of course you could have your cpu logger write the delta, but you'll be spending a bit more time up front on the script.
Finally, use your spreadsheet to generate a nice plot.
Following are some tools for monitoring a Linux system:
System commands like top, free -m, vmstat, iostat, iotop, sar, netstat, etc. Nothing comes near these Linux utilities when you are debugging a problem. These commands give you a clear picture of what is going on inside your server (see the pidstat example after this list).
SeaLion: An agent executes all the commands mentioned in #1 (also user-defined ones) and the outputs of these commands can be accessed in a beautiful web interface. This tool comes in handy when you are debugging across hundreds of servers, as installation is simple. And it's free.
Nagios: It is the mother of all monitoring/alerting tools. It is very customizable but also very difficult to set up for beginners. There is a set of tools called Nagios plugins that covers pretty much all important Linux metrics.
Munin
Server Density: A cloud-based paid service that collects important Linux metrics and gives users the ability to write their own plugins.
New Relic: Another well-known hosted monitoring service.
Zabbix
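As a concrete example for the original question (per-process CPU and memory sampled over time), the sysstat package that provides sar also ships pidstat; something like

pidstat -u -r -p 1234,5678 1 > stats

samples CPU (-u) and memory (-r) for the listed PIDs once per second, in a format that's easy to plot.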

Address space identifiers using QEMU for an i386 Linux kernel

Friends, I am working on an in-house architectural simulator which is used to simulate the timing effect of code running on different architectural parameters like cores, memory hierarchy and interconnects.
I am working on a module that takes the actual trace of a running program from an emulator like "PinTool" or "qemu-linux-user" and feeds this trace to the simulator.
Till now my approach was like this:
1) Take the objdump of a binary executable and parse this information.
2) Now the emulator just has to feed me an instruction pointer and other info like the load address/store address.
Such approaches work only if the program content is known.
But now I have been trying to take traces of an executable running on top of a standard Linux kernel. The problem now is that the base kernel image does not contain the code for LKMs (Loadable Kernel Modules). Also, the daemons are not known when starting a kernel.
So, my approach to this solution is :
1) use qemu to emulate a machine.
2) When an instruction is encountered for the first time, I will parse it and save this info for later.
3) create a helper function which sends the ip, load/store address when an instruction is executed.
I am stuck at step 2. How do I differentiate between different processes from QEMU, which is just an emulator and does not know anything about the guest OS?
I can modify the scheduler of the guest OS but I am really not able to figure out the way forward.
Sorry if the question is very lengthy. I know I could have abstracted some part but felt that some part of it gives an explanation of the context of the problem.
In the first case, using qemu-linux-user to perform user mode emulation of a single program, the task is quite easy because the memory is linear and there is no virtual memory involved in the emulator. The second case of whole system emulation is a lot more complex, because you basically have to parse the addresses out of the kernel structures.
If you can get the virtual addresses directly out of QEmu, your job is a bit easier; then you just need to identify the process, and everything else functions just like in the single-process case. You might be able to get the PID by faking a system call to getpid().
Otherwise, this all seems quite a bit similar to debugging a system from a physical memory dump. There are some tools for this task. They are probably too slow to run for every instruction, but you can look for hints there.

Would executable files be Machine Code - made for the hardware?

Here is a quote from Wikipedia:
"In computing, an executable file causes a computer 'to perform indicated tasks according to encoded instructions'" (machine code??)
"Modern operating systems retain control over the computer's resources, requiring that individual programs make system calls to access privileged resources. Since each operating system family features its own system call architecture, executable files are generally tied to specific operating systems."
Well, this is my perspective.
Executables cannot be pure machine code, as they need to talk to the OS for hardware services (system calls). Hence an executable is just not yet "machine code"... Perhaps it is like some parts of the code are actual machine code and some parts are just meant to call the machine code embedded in the operating system? Overall, it contains some chunks of machine code, and some chunks of code that call the operating system.
Edited after Damon's Answer :
In the end the OS is a set of machine code. Basically the OS does the job of copying the user's machine code (created by the C compiler) into memory, and then if an instruction is a system call, control transfers to the OS memory region to handle it. Now the question is: what machine code generated from C can do this part? Like asking to transfer control to the OS, etc. I suppose it's system calls at a higher abstraction, but under the hood, how does it work?
I get the feeling it's similar to the chicken-and-egg problem: C creates the OS, and C uses the OS. I can't find out exactly how the process goes.
Can anyone break the puzzle for me ?
One thing does not exclude the other. Executables are (unless they are some form of bytecode running in a virtual machine) machine code. However, there are different kinds of instructions, some of which are not usable at certain privilege levels.
That is where the operating system comes in; it is "machine code" that runs at the highest privilege level, working as arbiter for the "important" parts and tasks, such as deciding who gets CPU time and what value goes into some hardware register.
(originally comment, made an answer by request)
EDIT: About your extended question, this works approximately as follows. When the computer is turned on, the processor runs at its highest privilege level. In this "mode", the BIOS, the boot loader, and the operating system can do just what they want. This sounds great, but you don't want just any kind of code to be able to do whatever it wants.
For example, the code can tell the MMU which memory pages are allowed to be read or written to, and which ones are not. Or, it can define what address is called if "something special" such as a trap or interrupt happens. Or, it can directly write to some special memory addresses that map ports of some devices (disk, network, whatever).
Eventually, the OS switches to "unprivileged" mode and calls some non-OS code. When a trap or interrupt happens, execution is interrupted and continues elsewhere (as specified by the OS previously), and the privilege level is upped again. Once the interrupt has been dealt with, privilege is taken away, and user code is called again.
If a user program needs the OS to do something "OS like", it sets up parameters according to an agreed scheme (for example in some particular registers) and executes a trap instruction.
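To make that concrete, here is a tiny Linux x86-64 example (gcc inline assembly) of such an agreed scheme: the system call number goes in rax, the arguments in rdi/rsi/rdx, and the syscall instruction is the trap that hands control to the kernel. The raw_write wrapper name is invented; normally libc's write() hides exactly this:

/* raw_write.c - Linux x86-64, gcc. */
#include <stddef.h>

static long raw_write(int fd, const void *buf, size_t len)
{
    register long rax asm("rax") = 1;          /* syscall number: __NR_write */
    register long rdi asm("rdi") = fd;         /* arg 1 in rdi */
    register long rsi asm("rsi") = (long)buf;  /* arg 2 in rsi */
    register long rdx asm("rdx") = (long)len;  /* arg 3 in rdx */
    /* 'syscall' is the trap: the CPU raises the privilege level and jumps
       to the kernel's entry point. The kernel clobbers rcx and r11. */
    asm volatile("syscall"
                 : "+r"(rax)
                 : "r"(rdi), "r"(rsi), "r"(rdx)
                 : "rcx", "r11", "memory");
    return rax;                                /* result, or -errno */
}

int main(void)
{
    raw_write(1, "hello via raw syscall\n", 22);
    return 0;
}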
This is for example how things like multithreading or virtual memory are implemented. At regular intervals, a timer fires off an interrupt, which stops execution of "normal" code and calls some code in the kernel (in privileged mode). That code then decides which user process control should be returned to, according to some kind of priority scheme. Those are the "CPU time slices" that are handed out.
If some process reads from or writes to a page that it isn't allowed, a trap is generated by the MMU. The OS then looks at what happened and where, and decides whether to load some data from disk into some memory region (and possibly purge something else) and change the process' mappings, or whether to kill the process with a "segmentation fault" error.
Of course in reality, it is a million times more complicated, but in principle that's about how it works.
It does not really matter whether the OS or the programs were originally written in C or with an assembler. To the processor, it's just a sequence of machine instructions. Even a python or perl script is "just machine instructions" in the end, only with a detour via the interpreter.

How to "hibernate" a process in Linux by storing its memory to disk and restoring it later?

Is it possible to 'hibernate' a process in Linux?
Just like 'hibernate' on a laptop, I would like to write all the memory used by a process to disk, freeing up the RAM. And then later on, I can 'resume the process', i.e., read all the data back from disk into RAM, and continue with my process?
I used to maintain CryoPID, which is a program that does exactly what you are talking about. It writes the contents of a program's address space, VDSO, file descriptor references and states to a file that can later be reconstructed. CryoPID started when there were no usable hooks in Linux itself and worked entirely from userspace (actually, it still does work, depending on your distro / kernel / security settings).
Problems were (indeed) sockets, pending RT signals, numerous X11 issues, the glibc caching getpid() implementation, amongst many others. Randomization (especially of the VDSO) turned out to be insurmountable for the few of us working on it after Bernard walked away from it. However, it was fun and became the topic of several master's theses.
If you are just contemplating a program that can save its running state and re-start directly into that state, it's far, far easier to just save that information from within the program itself, perhaps when servicing a signal.
I'd like to put a status update here, as of 2014.
The accepted answer suggests CryoPID as a tool to perform checkpoint/restore, but I found the project to be unmaintained and impossible to compile with recent kernels.
Now, I have found two actively maintained projects providing the application checkpointing feature.
The first, the one I suggest because I have had better luck running it, is CRIU. It performs checkpoint/restore mainly in userspace, and requires the kernel option CONFIG_CHECKPOINT_RESTORE enabled to work.
Checkpoint/Restore In Userspace, or CRIU (pronounced kree-oo, IPA: /krɪʊ/, Russian: криу), is a software tool for Linux operating system. Using this tool, you can freeze a running application (or part of it) and checkpoint it to a hard drive as a collection of files. You can then use the files to restore and run the application from the point it was frozen at. The distinctive feature of the CRIU project is that it is mainly implemented in user space.
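In case it helps, basic CRIU usage looks roughly like this (the PID and image directory are just examples; check the CRIU docs for details):

criu dump -t 2345 -D /tmp/ckpt --shell-job
criu restore -D /tmp/ckpt --shell-job

The dump command freezes the process tree rooted at PID 2345 into image files under /tmp/ckpt; restore brings it back from those files.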
The latter is DMTCP; quoting from their main page:
DMTCP (Distributed MultiThreaded Checkpointing) is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.
There is also a nice Wikipedia page on the topic: Application_checkpointing
The answers mentioning ctrl-z are really talking about stopping the process with a signal, in this case SIGTSTP. You can issue a stop signal with kill:
kill -STOP <pid>
That will suspend execution of the process. It won't immediately free the memory used by it, but as memory is required for other processes the memory used by the stopped process will be gradually swapped out.
When you want to wake it up again, use
kill -CONT <pid>
The more complicated solutions, like CryoPID, are really only needed if you want the stopped process to be able to survive a system shutdown/restart - it doesn't sound like you need that.
The Linux kernel has now partially implemented the checkpoint/restart feature: https://ckpt.wiki.kernel.org/. The status is here.
Some useful information is in LWN (Linux Weekly News):
http://lwn.net/Articles/375855/ http://lwn.net/Articles/412749/
So the answer is "YES"
The issue is restoring the streams - files and sockets - that the program has open.
When your whole OS hibernates, the local files and such can obviously be restored. Network connections don't survive, but then code that accesses the internet typically does more error checking and such, and survives the error conditions (or ought to).
If you did per-program hibernation (without application support), how would you handle open files? What if another process accesses those files in the interim? etc?
Maintaining state when the program is not loaded is going to be difficult.
Simply suspending the threads and letting it get swapped to disk would have much the same effect?
Or run the program in a virtual machine and let the VM handle suspension.
Short answer is "yes, but not always reliably". Check out CryoPID:
http://cryopid.berlios.de/
Open files will indeed be the most common problem. CryoPID states explicitly:
Open files and offsets are restored. Temporary files that have been unlinked and are not accessible on the filesystem are always saved in the image. Other files that do not exist on resume are not yet restored. Support for saving file contents for such situations is planned.
The same issues will also affect TCP connections, though CryoPID supports tcpcp for connection resuming.
I extended CryoPID, producing a package called Cryopid2, available from SourceForge. This can migrate a process as well as hibernating it (along with any open files and sockets - data in sockets/pipes is sucked into the process on hibernation and spat back into these when the process is restarted).
The reason I have not been active with this project is that I am not a kernel developer - both this (and/or the original CryoPID) need to get someone on board who can get them running with the latest kernels (e.g. Linux 3.x).
The CryoPID method does work - and is probably the best solution to general-purpose process hibernation/migration in Linux I have come across.
The short answer is "yes." You might start by looking at this for some ideas: ELF executable reconstruction from a core image (http://vx.netlux.org/lib/vsc03.html)
As others have noted, it's difficult for the OS to provide this functionality, because the application needs to have some error checking built in to handle broken streams.
However, on a side note, some programming languages and tools that use virtual machines explicitly support this functionality, such as the Self programming language.
This is sort of the ultimate goal of a clustered operating system. Matthew Dillon puts a lot of effort into implementing something like this in his DragonFly BSD project.
Adding another workaround: you can use VirtualBox. Run your applications in a regular virtual machine and simply "save the machine state" whenever you want.
I know this is not an answer, but I thought it could be useful when there are no real options.
If for any reason you don't like VirtualBox, VMware and QEMU are just as good.
Ctrl-Z increases the chances that the process's pages will be swapped, but it doesn't free the process's resources completely. The problem with freeing a process's resources completely is that things like file handles and sockets are kernel resources the process gets to use, but doesn't know how to persist on its own. So Ctrl-Z is as good as it gets.
There was some research on checkpoint/restore for Linux back in the 2.2 and 2.4 days, but it never made it past prototype. It is possible (with the caveats described in the other answers) for certain values of possible - if you can write a kernel module to do it, it is possible. But for the common value of possible (can I do it from the shell on a commercial Linux distribution), it is not yet possible.
There's Ctrl+Z in Linux, but I'm not sure it offers the features you specified. I suspect you asked this question because it doesn't.
