I wrote a program with a switch that enables a multi-core feature.
I did two different runs:
1st - with a single core
2nd - with two cores
Both used the same executable on the same machine, in different terminals. I had to debug something, so I was debugging both runs in parallel, and while debugging I found that the same pointer in both runs was pointing to the same address.
I know the shared memory concept when we use fork, but here I was running two different processes.
How is this possible, and what is the concept behind it?
You have not specified an operating system, but typically processes have independent address spaces. What you are probably seeing are two pointers that happen to have the same value but actually refer to each process's own memory space.
Protected-mode OSs often remap physical memory into new address spaces for user-level programs, so the same virtual address in two processes is usually backed by different physical memory.
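To make that concrete, here is a minimal sketch in Python (assuming a Unix-like system and CPython). It uses fork only because that is an easy way to guarantee both processes hold the same pointer value; the function name is made up for illustration. The child writes through the "same" address, and the parent's copy is unaffected:

```python
import ctypes
import os


def same_address_different_memory():
    """Return (parent_addr, parent_value, child_addr, child_value)."""
    value = ctypes.c_int(1)
    address = ctypes.addressof(value)  # virtual address inside this process

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: same virtual address, but the write lands in the child's
        # own private copy of the page.
        try:
            os.close(read_fd)
            value.value = 2
            report = f"{ctypes.addressof(value)} {value.value}"
            os.write(write_fd, report.encode())
            os.close(write_fd)
        finally:
            os._exit(0)

    # Parent: collect the child's report, then look at its own copy.
    os.close(write_fd)
    data = os.read(read_fd, 64).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    child_address, child_value = (int(x) for x in data.split())
    return address, value.value, child_address, child_value
```

Both addresses come back equal, yet the parent still sees 1 while the child saw 2. Two independently started runs of the same executable can show equal pointer values for the same reason: the number is only meaningful inside one address space.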
I have a question about the z/OS log:
I would like to know whether every operation that starts is always announced by $HASP373 and IEF403I,
and whether the Ended status is always announced by $HASP395 and IEF404I?
The trouble with z/OS is that it's really hard to explain something without introducing another concept that also needs explaining. This, in turn, requires another explanation etc. This is partly due to the z/OS operating system being from a different planet compared to Unix, Windows, OS X etc, all of which are broadly similar.
Those messages are issued by the system for a lot of the work that happens on a mainframe, but not all of it.
All work on z/OS runs in its own address space, which is almost like a mini-VM. There will be many address spaces in a z/OS system (380 in ours currently). A program in an address space is not aware of any other address spaces and thinks it has access to the entire 2 GB (31-bit addressing) range of memory (different address spaces can communicate if necessary and authorised, and more than 2 GB is available with 64-bit addressing). A program in one address space cannot crash a program in another address space by overwriting storage. Programs in two different address spaces can access the same memory address without affecting each other because, unbeknown to them, they actually access different memory.
There are 4 types of address spaces:
TSO (Time Sharing Option) - these are users logged on to the system, typing commands and getting responses. They may run scripts, using the languages REXX and Clist (Command Lists - older, generally replaced by REXX) much like Perl and shell scripts, submit batch jobs, write and compile code etc.
BATCH JOBS (or JOB) - This is where you want to run a program, so you create a text file with the name of the program(s) to run and the file(s) that it/they need(s) and SUBMIT it. The system will run the program(s) and tell you when they are done. Whilst they are running, you can go and do something else. You don't even need to be logged on - you can prepare an FTP job (for example) to run at 01:00 whilst you're asleep and another job to run if the first one works.
STARTED TASKS (STCs) - Very similar to a batch job. Usually started either by the system itself when it starts or by an operator issuing a START command for that STC at the system console. (E.g. 'START DB2' starts the DB2 started task. Alternatively a user may submit a batch job for their own test DB2 system.)
System Address Spaces (SYSAS) - Consider these like a Unix daemon, started by the operating system itself for various essential processes. There are also address spaces representing processes running under the 'Unix' half of z/OS (USS - Unix System Services), but that's another story.
There is no such thing as an 'operation' in z/OS terms. Within an address space, many programs may be running, each one identified by a TCB (Task Control Block) or SRB (System Request Block).
However, if you knew that the information you wanted was produced by a normal batch job, then looking for the £HASP373 and £HASP395 messages for that job would be the right place to start. Bear in mind that the message ids (HASP373 and HASP395) might not start with a '£' on your system. '£' is the default, but it is a customisable parameter. $ and # are also fairly common.
I do know what I'm talking about, but if any of the above is not clear, then I haven't explained it very well. I may be guilty of doing exactly what I warned against and explaining an unknown concept by using another unknown concept. :-)
Work gets into z/OS through something called the subsystem interface. Part of this flow is that generally, when an address space is started, it requests work from the subsystem that started the address space through a well-defined interface (IEFSSREQ). This handshake is where things like your HASP messages come from.
Here's a watered down example.
An operator enters a START command from a system console. As part of processing that command, the system creates an address space, and eventually a thread in the new address space says, "ok - I'm ready...give me some work to do". This goes to the primary job entry subsystem, who hands the address space something to do - the internal data structures representing the task that the operator started in this case. As part of this chain, the various $HASP messages are issued, and this works pretty much the same way for TSO sessions, started tasks (STCs) and JCL submitted for a batch job.
JES2/JES3 are examples of subsystems, but there are others.
For example, if our operator added the SUB=MSTR parameter on the start command, the requests wouldn't go through the primary JES - and so there wouldn't be any of the $HASP messages you're looking for. There are plenty of vendor applications that start and manage address spaces outside of JES, and this is the stuff you miss by limiting yourself to the HASP and IEF403I/IEF404I messages.
Also, UNIX Services has a variety of APIs similar to UNIX "fork" that can be used to spawn address spaces without necessarily involving JES.
If you want to know about activity starting and ending, there are better ways - SMF, ENF signals, etc. A great way to learn this stuff if you don't know already is to use the system trace facilities and read some dumps. The wonderful thing about z/OS is that it's all right there, for those who spend the time figuring out where to look.
No. Those messages are for jobs. Not all operations are jobs. An example of an operation that is not a job would be a system command. I don't have a z/OS system at hand right now, but I believe another example of an operation that would not use the messages you reference would be a started task.
This may be helpful, as it attempts to explain z/OS concepts in Unix terms.
A job is something that goes through JES2/JES3. (In your case, JES2.) JES2/JES3 jobs are generally used for batch type of work. For example, a sort job, where I submit something, and come back later and get an answer. However, there's a lot of work running under z/OS that doesn't go through JES2/JES3.
Part of the problem here is what you mean by an operation; for example, while you may get a message saying that DB2 has started, after it's started, it's not going to tell you every time it gets a query. A TSO user might run a REXX exec underneath his/her address space, but that's not going to go through JES.
Another way to look at this is that JES2/JES3 are job management subsystems, but they aren't equivalent to the kernel on a Unix/Windows system, which does schedule all the work running on the system. For z/OS, there are multiple ways that work can come into a system; examples include JES2/JES3, TSO, ISPF, CICS, DB2, IMS, the console, etc. It's then up to the master scheduler/WLM/SRM to manage all the requests that come in through all of the subsystems.
If you have access to a z/OS system, look into SDSF, or whatever you use to manage JES2. The ST panel, under SDSF, is a list of things that are running/eligible to run that are managed by JES2. However, if you look at the DA panel (assuming you have authority to do so), you'll note that there are a lot of address spaces that show up on the DA panel that don't show up in the ST panel.
If address spaces are started through the JES2 subsystem, which is normally the case unless another subsystem or MSTR is specified using the MVS START command, then message $HASP373 jobname STARTED is issued. Similarly, when the address space ends, message $HASP395 is issued.
The IEF403I and IEF404I messages are issued by the system in similar situations, independently of what JES2 or JES3 is doing and regardless of which subsystem the address space was started under. These messages are only issued when the operator has asked to monitor job names using the SETCON MONITOR or the MONITOR JOBNAMES command. Products for automated operations typically do this.
We are trying to set up Eclipse in a shared environment, i.e., it will be installed on a server and each user will connect to it using VNC. There are different reasons for sharing Eclipse, one being proper integration with ClearCase.
We identified that Eclipse is using large amounts of memory. We are wondering whether Eclipse (the JVM?) loads each class once per user/session, or whether there is any sort of sharing of objects that are already loaded into memory.
This makes me think about a more basic question in general: how many copies of a program get loaded into memory when two or more users are accessing the host at the same time?
Is it one per user, or is a single copy shared between users?
Two questions here:
1) How many copies of a program get loaded into memory when two or
more users are using it at the same time?
2) How does the above hold in the world of Java/JVM?
Linux allows for sharing binary code between running processes, i.e. the segments that hold executable parts of a program are mapped into virtual memory space of each running copy. Then each process gets its own data parts (stack, heap, etc.).
The issue with Java, or almost any other interpreted language, is that the runtime, the JVM, treats byte-code as data, loading it into the heap. The fact that Java is half-compiled and half-interpreted is irrelevant here. This results in a situation where the JVM executable itself is eligible for code sharing by the OS, but your application's Java code is not.
In general, a single copy of a program (i.e. its text segment) is loaded into RAM and shared by all instances, so every instance uses the exact same read-only, memory-mapped physical pages (possibly/probably mapped to different addresses in different address spaces, but it's still the same memory). Data is usually private to each process, i.e. each program's data lives in separate pages of RAM (though it can be shared).
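On Linux you can see which parts of a process are file-backed executable mappings (the ones the kernel can share between running copies) by looking at /proc/<pid>/maps. Below is a small sketch of a parser for that format; the function name and the sample lines are illustrative only, not taken from a real system:

```python
def executable_file_mappings(maps_text):
    """Return (pathname, address_range) for file-backed executable mappings.

    Expects text in the /proc/<pid>/maps layout:
    address perms offset dev inode pathname
    """
    result = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        address_range, perms, _offset, _dev, inode, pathname = fields
        # inode 0 means the mapping is not backed by a file ([heap], [stack]).
        if "x" in perms and inode != "0":
            result.append((pathname, address_range))
    return result


# Illustrative sample in the maps format (made-up addresses and inode).
SAMPLE_MAPS = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/dbus-daemon
00651000-00652000 r--p 00051000 08:02 173521 /usr/bin/dbus-daemon
00e03000-00e24000 rw-p 00000000 00:00 0 [heap]
7ffd1a2b3000-7ffd1a2d4000 rw-p 00000000 00:00 0 [stack]
"""
```

Feeding it the contents of /proc/self/maps from inside a running program lists that program's own shareable code mappings. For a Java application you would expect to see the JVM binary and its libraries there, but not the application's bytecode, which matches the point above.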
BUT
The problem is that the actual program here is only the Java runtime interpreter, or the JIT compiler. Eclipse, like all Java programs, is data rather than a program (data which, however, is interpreted as a program). That data is either loaded into the private address space and interpreted by the JVM, or turned into an executable by the JIT compiler, resulting in a (temporary) executable binary, which is launched. This means that, in principle, each Java program runs as a separate copy, using separate RAM.
Now, you might of course be lucky, and the JVM might load the data as a shared mapping, in this case the bytecode would occupy the same identical RAM in all instances. However, whether that's the case is something only the author of the JVM could tell, and it's not something you can rely on in general.
Also, depending on how clever the JIT is, it might cache that binary for some time and reuse it for identical Java programs, which would be very advantageous, not only because it saves the compilation. All instances launched from the same executable image share the same memory, so this would be just what you want.
It is even likely that this is done -- at least to some extent -- on your JIT compiler, because compiling is rather expensive and it's a common optimization.
I have the homework question:
Explain how a process can refer to objects that are not in its
address space (for example, a file or another process)?
I know that each process is created with an address space that defines access to every memory-mapped resource in that process (got that from this book). I think that the second part of this question does not make sense. How can a process reference an object of another process? Isn't the OS supposed to restrict that? Maybe I am not understanding the question correctly. Anyway, if I understood the question correctly, I believe the only way that would be possible is by going through the kernel.
If you are asking in a general sense, then it's a no. Operating systems do not allow one process to access another process's virtual address space under normal circumstances.
However there are ways in which you can create a controlled environment where such a thing can be done using various techniques.
A perfect example is the debugger. It uses a process-tracing mechanism (such as reading from the /proc filesystem or using the ptrace() system call) to read from and write to another process's address space.
There is also a shared memory concept, where a particular piece of memory is explicitly shared between two processes and can be controlled via a shared memory object.
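As a rough sketch of explicit sharing (assuming a Unix-like system, where Python's anonymous mmap is a shared mapping by default), the child below writes into the mapping and the parent sees the change. The function name is made up for illustration:

```python
import mmap
import os
import struct


def share_int_with_child():
    """Let a forked child overwrite an int that the parent then reads back."""
    # Anonymous mapping; on Unix this defaults to MAP_SHARED, so the pages
    # stay common to parent and child after fork.
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    shared[:4] = struct.pack("i", 1)

    pid = os.fork()
    if pid == 0:
        # Child: write through the shared mapping, then exit immediately.
        try:
            shared[:4] = struct.pack("i", 42)
        finally:
            os._exit(0)

    os.waitpid(pid, 0)
    (value,) = struct.unpack("i", shared[:4])
    shared.close()
    return value
```

Ordinary process memory would not behave this way: the child would get a private copy. Here the region was explicitly set up as shared, which is the controlled exception to the isolation rule.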
You can attach to the application as a debugger. Or, if you are using Windows, you can use Windows hooks.
I have researched and I have the answer to the file part of the question.
First of all, an address space is the collection of addresses that a
thread can reference. Normally these addresses refer to an
executable in memory. Some operating systems allow a programmer to
read and write the contents of a file using addresses from the process
address space. This is accomplished by opening the file, then binding each byte in the file to an address in the address space.
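A small sketch of that file-mapping idea, using Python's mmap module on a temporary file (the function name and file contents are made up for illustration):

```python
import mmap
import os
import tempfile


def edit_file_through_mapping():
    """Change a file's bytes by writing to memory, then read the file back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
        path = tmp.name
    try:
        with open(path, "r+b") as f:
            # Length 0 maps the whole file into this process's address space.
            with mmap.mmap(f.fileno(), 0) as view:
                view[0:5] = b"HELLO"  # plain memory write, no write() call
                view.flush()          # push the modified pages to the file
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

After the mapping is set up, the file is an object outside the address space that the process reaches purely through addresses inside it.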
As for the second part of the question, this is my answer:
Most operating systems will not allow reading addresses from another
process. That would be a huge security risk. I have not heard of any
operating system that lets you read data from a thread that is
not owned by the current process. So in short, I believe this is not
possible.
Friends, I am working on an in-house architectural simulator which is used to simulate the timing effects of code running with different architectural parameters such as cores, memory hierarchy and interconnects.
I am working on a module that takes the actual trace of a running program from an emulator like "PinTool" or "qemu-linux-user" and feeds this trace to the simulator.
Until now my approach was like this:
1) take objdump of a binary executable and parse this information.
2) Now the emulator has to just feed me an instruction-pointer and other info like load-address/store-address.
Such approaches work only if the program content is known.
But now I have been trying to take traces of an executable running on top of a standard Linux kernel. The problem now is that the base kernel image does not contain the code for LKMs (Loadable Kernel Modules). Also, the daemons are not known when starting a kernel.
So, my approach to solving this is:
1) use qemu to emulate a machine.
2) When an instruction is encountered for the first time, I will parse it and save this info for later.
3) create a helper function which sends the ip, load/store address when an instruction is executed.
I am stuck at step 2. How do I differentiate between different processes from within qemu, which is just an emulator and does not know anything about the guest OS?
I can modify the scheduler of the guest OS but I am really not able to figure out the way forward.
Sorry if the question is very lengthy. I know I could have abstracted some part but felt that some part of it gives an explanation of the context of the problem.
In the first case, using qemu-linux-user to perform user mode emulation of a single program, the task is quite easy because the memory is linear and there is no virtual memory involved in the emulator. The second case of whole system emulation is a lot more complex, because you basically have to parse the addresses out of the kernel structures.
If you can get the virtual addresses directly out of QEMU, your job is a bit easier; then you just need to identify the process, and everything else functions just like in the single-process case. You might be able to get the PID by faking a getpid() system call.
Otherwise, this all seems quite a bit similar to debugging a system from a physical memory dump. There are some tools for this task. They are probably too slow to run for every instruction, though, but you can look for hints there.
This is from the Wiki:
"In computing, an executable file causes a computer "to perform indicated tasks according to encoded instructions," (Machine code??)
"Modern operating systems retain control over the computer's resources, requiring that individual programs make system calls to access privileged resources. Since each operating system family features its own system call architecture, executable files are generally tied to specific operating systems."
Well, this is my perspective.
Executables cannot be machine code, as they need to talk to the OS for hardware services (system calls). Hence an executable is not yet "machine code". Perhaps some part of the code is actual machine code, and some parts are just meant to call the machine code embedded in the operating system? Overall it contains some chunks of machine code, and some chunks of code that call the operating system.
Edited after Damon's Answer :
In the end, the OS is a set of machine code. Basically, the OS would be doing the job of copying the user's machine code (created by the C compiler) into memory, and then, if an instruction is a system call, control transfers to the OS memory region for handling it. Now the question is: what machine code generated from C can do this part, i.e. ask to transfer control to the OS? I suppose it's system calls at a higher level of abstraction, but how does it work under the hood?
I get the feeling it's similar to a chicken-and-egg problem: C creates the OS and C uses the OS. I can't find exactly how the process goes.
Can anyone solve the puzzle for me?
One thing does not exclude the other. Executables are (unless they are some form of bytecode running in a virtual machine) machine code. However, there are different kinds of instructions, some of which are not usable at certain privilege levels.
That is where the operating system comes in: it is "machine code" that runs at the highest privilege level, working as arbiter for the "important" parts and tasks, such as deciding who gets CPU time and what value goes into some hardware register.
(originally comment, made an answer by request)
EDIT: About your extended question, this works approximately as follows. When the computer is turned on, the processor runs at its highest privilege level. In this "mode", the BIOS, the boot loader, and the operating system can do just what they want. This sounds great, but you don't want any kind of code being able to do just whatever it wants.
For example, the code can tell the MMU which memory pages are allowed to be read or written to, and which ones are not. Or, it can define what address is called if "something special" such as a trap or interrupt happens. Or, it can directly write to some special memory addresses that map ports of some devices (disk, network, whatever).
Eventually, the OS switches to "unprivileged" mode and calls some non-OS code. When a trap or interrupt happens, execution is interrupted and continues elsewhere (as specified by the OS previously), and the privilege level is upped again. Once the interrupt has been dealt with, privilege is taken away, and user code is called again.
If a user program needs the OS to do something "OS like", it sets up parameters according to an agreed scheme (for example in some particular registers) and executes a trap instruction.
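To give a feel for that "agreed scheme", here is a hedged sketch that asks the Linux kernel for the process ID by system call number instead of by name. It assumes 64-bit Linux on x86_64 or aarch64 (the numbers in the table are the getpid numbers for those two ABIs) and falls back to the ordinary libc getpid() wrapper anywhere else. libc's syscall() function places the number and arguments where the kernel expects them and executes the trap instruction; the Python function name is made up:

```python
import ctypes
import os
import platform
import sys

# Linux system call numbers for getpid, per 64-bit architecture.
GETPID_NUMBERS = {"x86_64": 39, "aarch64": 172}


def getpid_via_raw_syscall():
    """Ask the kernel for our PID by system call number (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    number = GETPID_NUMBERS.get(platform.machine())
    is_64bit = ctypes.sizeof(ctypes.c_void_p) == 8
    if not sys.platform.startswith("linux") or number is None or not is_64bit:
        # Unknown platform: use the named libc wrapper instead of guessing.
        return libc.getpid()
    libc.syscall.restype = ctypes.c_long
    # syscall() loads the number into the agreed register and traps
    # into the kernel, which runs its getpid handler in privileged mode.
    return libc.syscall(ctypes.c_long(number))
```

The result matches os.getpid(), which ends up making the same request. The point is that nothing magic is involved on the user side: it is ordinary machine code that puts a number in a register and executes one special instruction.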
This is, for example, how things like multithreading or virtual memory are implemented. At regular intervals, a timer fires off an interrupt, which stops execution of "normal" code and calls some code in the kernel (in privileged mode). That code then decides which user process control should be returned to, according to some kind of priority scheme. Those are the "CPU time slices" that are handed out.
If some process reads from or writes to a page that it isn't allowed, a trap is generated by the MMU. The OS then looks at what happened and where, and decides whether to load some data from disk into some memory region (and possibly purge something else) and change the process' mappings, or whether to kill the process with a "segmentation fault" error.
Of course in reality, it is a million times more complicated, but in principle that's about as it works.
It does not really matter whether the OS or the programs were originally written in C or with an assembler. To the processor, it's just a sequence of machine instructions. Even a Python or Perl script is "just machine instructions" in the end, only with a detour via the interpreter.