LC-3 Assembly Language - swapping values

How can I swap two values stored at two addresses? I currently have two registers that contain the addresses, and two temporary variables that store those addresses. Since I have the addresses, I loaded the values, but I cannot figure out how to swap them. I am trying to implement bubble sort. The code below is what I currently have:
IF ;swapping condition
ST R2,idata ;temporarily hold the smaller data
ST R1,imindata ;temporarily hold the larger data
ST R2,iminaddres ;store the values into that address
ST R2,iaddress ;finish the swapping of the two values
LD R1,iminaddres ;put the address back into the register
LD R2,iaddres ;put the address back into the register to be used for next cycle

How would you do it in C?
temp = a;
a = b;
b = temp;
Then recognize that the values have to be loaded from memory, which changes things a bit:
tempa = a;
tempb = b;
b = tempa;
a = tempb;
Then isolate the loads and stores:
rega <= load(a);
regb <= load(b);
store(a) <= regb;
store(b) <= rega;
Then implement that in assembly. This smells like a homework assignment, so I won't do it for you.
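As a minimal sketch of the load/store sequence above, the following Python model treats memory as a list and registers as local variables. The names memory, swap_in_memory, addr_a and addr_b are hypothetical and exist only for illustration; this is a model of the idea, not LC-3 code.

```python
def swap_in_memory(memory, addr_a, addr_b):
    """Swap two memory cells using two 'registers' and no extra memory."""
    reg_a = memory[addr_a]   # rega <= load(a)
    reg_b = memory[addr_b]   # regb <= load(b)
    memory[addr_a] = reg_b   # store(a) <= regb
    memory[addr_b] = reg_a   # store(b) <= rega


memory = [7, 3, 9, 1]
swap_in_memory(memory, 0, 1)
print(memory)  # [3, 7, 9, 1]
```

Both loads happen before either store, so neither value is overwritten before it has been read. In LC-3 terms, accesses through an address held in a register would presumably use LDR and STR, since LD and ST address a fixed label.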

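If all you want to do is swap the contents of two registers, there's a simple bit-twiddling trick for machines that have an XOR instruction (LC-3 itself only offers ADD, AND and NOT as arithmetic/logic operations, so XOR would have to be composed from those):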
XOR R1,R2
XOR R2,R1
XOR R1,R2
This will exchange the contents of the two registers without using any memory.
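To see why the three steps work, here is a small Python sketch that tracks the values through each XOR. The 16-bit MASK and the function name xor_swap are assumptions made for the illustration.

```python
MASK = 0xFFFF  # assumption: 16-bit registers


def xor_swap(r1, r2):
    """Return (r1, r2) exchanged using three XOR steps."""
    r1 = (r1 ^ r2) & MASK   # r1 = a ^ b
    r2 = (r2 ^ r1) & MASK   # r2 = b ^ (a ^ b) = a
    r1 = (r1 ^ r2) & MASK   # r1 = (a ^ b) ^ a = b
    return r1, r2


print(xor_swap(0x1234, 0xABCD))  # (43981, 4660)
```

One caveat: the trick only works when the two operands are different registers. If both name the same register, the first XOR sets it to zero and the original value is lost.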

Related

Proper way to use a bus in a for loop in SystemVerilog?

I'm trying to make a module in SystemVerilog that can find the dot product between two vectors with up to 8 8-bit values. I'm trying to make it flexible for vectors of different length, so I have an input called EN that's 3 bits and determines the number of multiplications to perform.
So, if EN == 3'b101, the first five values of each vector will be multiplied and added together, then output as a 32-bit value. Right now, I'm trying to do that like:
int acc = 0;
always_comb
begin
for(int i = 0; i < EN; i++) begin
acc += A[i] * B[i];
end
end
assign OUT = acc;
Where A and B are the two input vectors. However, SystemVerilog is telling me there's an illegal comparison being performed between i and EN.
So my questions are:
1) Is this the proper way to have a variable vector "length" in SystemVerilog?
2) If so, what's the proper way to iterate n times where n is the value on a bus?
Thank you!

I have to guess here, but I assume it is a synthesizer complaining about that code. The synthesizer I use accepts your code with minor modifications, but perhaps not all do, since the loop cannot be unrolled statically (note that I declared input logic [2:0] EN; input int EN may not work because the maximum number of iterations is too large). Your loop per se (question #2) is fine.
int acc;
always_comb
begin
// If acc is not reset, always_comb reads its old value and adds it to the
// sensitivity list, hanging the simulation. Initializing a variable that is
// assigned in always_comb at its declaration is also not allowed.
acc = 0;
...
This is a somewhat reasonable thing for a tool to complain about in otherwise perfectly good code: the tool does not assume that it is "reasonable" to generate every possible loop length in this specific case (if EN were an unsigned integer, your chip would be stupidly huge, after all). You can force the tool to infer all possibilities with something like the following:
module test (
input int A[8],
input int B[8],
input logic [2:0] EN,
output int OUT
);
int acc[8]; // 8 accumulators
always_comb begin
acc[0] = A[0] * B[0]; // acc[-1] does not exist, different formula!
for (int i = 1; i < 8; i++) begin
// Each partial sum builds on previous one.
acc[i] = acc[i-1] + (A[i] * B[i]);
end
end
assign OUT = acc[EN]; // EN used as selector for a multiplexer on partial sums
endmodule: test
The above module is an explicit description of the "parallel loop" my synthesizer infers.
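The behavior of that module can be sketched in Python as follows. This is only a software model under stated assumptions: the name dot_partial_sums is hypothetical, and 32-bit overflow is ignored.

```python
def dot_partial_sums(a, b, en):
    """Model of the module: compute all 8 partial sums, then select one."""
    assert len(a) == len(b) == 8 and 0 <= en < 8
    acc = [0] * 8
    acc[0] = a[0] * b[0]
    for i in range(1, 8):
        # Each partial sum builds on the previous one.
        acc[i] = acc[i - 1] + a[i] * b[i]
    return acc[en]  # the multiplexer: en picks one of the 8 sums


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 1, 1, 1, 1, 1, 1, 1]
print(dot_partial_sums(a, b, 4))  # 15
```

Note that with this indexing acc[EN] covers products 0 through EN, that is, EN+1 terms. If EN is meant to be the number of terms, as in the question, the selection would need to be adjusted accordingly, for example by selecting acc[EN-1] and treating EN == 0 separately.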
Regarding your question #1, the answer is "it depends". In hardware there is no variable length, so unless you fix the number of iterations as a parameter rather than an input, you either have a maximum size and ignore some values, or you iterate over multiple cycles using pointers into some memory. If you want a variable vector length in a test (not going to silicon), you can declare a "dynamic array" that you can resize at will (IEEE 1800-2017, 7.5: Dynamic arrays):
int dyn_vec[];
As a final side note: for everything that is not testbench code, prefer integer (4-state) over int (2-state), in order to catch X values and avoid RTL-synthesis mismatches.

16-bit CPU design: Issues with implementing fetch-execute cycle

I am doing a computer architecture course on Coursera called
NandtoTetris and have been struggling with my 16-bit CPU design. The
course uses a language called HDL, which is a very simple Verilog-like
language.
I have spent so many hours trying to iterate on my CPU design based on
the diagram below and I don't understand what I am doing wrong. I
tried my best to represent the fetch and execute mechanics. Does
anyone have any advice on how to solve this?
Here are the design and control syntax diagram links:
CPU IO high-level diagram:
Gate level CPU diagram:
Control instruction syntax:
Here is my code below:
// Put your code here:
// Instruction decoding: from i of “ixxaccccccdddjjj”
// A-instruction: the instruction is the 16-bit value of the constant that should be loaded into the A register
// C-instruction: the a- and c-bits code the comp part, the d- and j-bits code dest and jump (x-bits are ignored).
Mux16(a=outM, b=instruction, sel=instruction[15], out=aMUX); // 0 for A-instruction or 1 for a C-instruction
Not(in=instruction[15], out=aInst); // assert A instruction with op-code as true
And(a=instruction[15], b=instruction[5], out=cInst); // assert write-to-A C-instruction with op code AND d1-bit
Or(a=aInst, b=cInst, out=aMuxload); // assert A-instruction or write-to-A C-instruction is true
ARegister(in=aMUX, load=cInst, out=addressM); // load A-instruction or write-to-A C-instruction
// For C-instruction, a-bit determines if ALU will operate on A register input (0) vs M input (1)
And(a=instruction[15], b=instruction[12], out=Aselector); // assert that c instruction AND a-bit
Mux16(a=addressM, b=inM, sel=Aselector, out=aluMUX); // select A=0 or A=1
ALU(x=DregisterOut, y=aluMUX, zx=instruction[11], nx=instruction[10], zy=instruction[9], ny=instruction[8], f=instruction[7], no=instruction[6], zr=zr, ng=ng,out=outM);
// The 3 d-bits of “ixxaccccccdddjjj” determine which registers are destinations for ALUout
// Whenever there is a C-Instruction and d2 (bit 4) is a 1 the D register is loaded
And(a=instruction[15], b=instruction[4], out=writeD); // assert that c instruction AND d2-bit
DRegister(in=outM, load=writeD, out=DregisterOut); // d2 of d-bits for D register destination
// Whenever there is a C-Instruction and d3 (bit 3) is a 1 then writeM (aka RAM[A]) is true
And(a=instruction[15], b=instruction[3], out=writeM); // assert that c instruction AND d3-bit
// Program counter to fetch next instruction
// PC logic: if (reset==1), then PC = 0
// else:
// load = comparison(instruction jump bits, ALU output zr & ng)
// if load == 1, PC = A
// else: PC ++
And(a=instruction[2], b=ng, out=JLT); // J2 test against ng: out < 0
And(a=instruction[1], b=zr, out=JEQ); // J1 test against zr: out = 0
Or(a=ng, b=zr, out=JGToutMnot); // J0 test if ng and zr are both zero
Not(in=JGToutMnot, out=JGToutM); // J0 test if ng and zr are both zero
And(a=instruction[0], b=JGToutM, out=JGT);
Or(a=JLT, b=JEQ, out=JLE); // out <= 0
Or(a=JGT, b=JLE, out=JMP); // final jump assertion
And(a=instruction[15], b=JMP, out=PCload); // C instruction AND JMP assert to get the PC load bit
// load in all values into the programme counter if load and reset, otherwise continue increasing
PC(in=addressM, load=PCload, inc=true, reset=reset, out=pc);

It is tricky to answer these kinds of questions without doing the work for you, which isn't helpful to you in the long run.
Some general thoughts.
Consider each element in isolation (including the circles where signals come together).
Label each line between elements with a name. These will become internal control lines. It helps reduce the chances of confusion.
Be very careful about junk outputs. If you're not supposed to be putting valid data on outM, use a Mux to output false.
Potential gotcha: I seem to remember that it's a bad idea to use a design output (like outM) as an input to something else. Outputs should only be outputs. Right now you are sending the output of the ALU to outM and using outM as an input to other elements. I suggest you try outputting the ALU to a new signal "ALUout", and using that as the input for the other elements and (through a mux with false controlled by writeM) outM. But remember, writeM is an output! So the block that generates writeM needs to generate a copy of itself to use as the control to the mux. FORTUNATELY, a block can have multiple out statements!
For example, right now you're generating writeM like this (I won't comment on whether it is wrong, I am just using it as an illustration):
And(a=instruction[15], b=instruction[3], out=writeM);
You can create a second output like this:
And(a=instruction[15], b=instruction[3], out=writeM, out=writeM2);
and then "clean" your outM like this:
Mux16(a=false,b=ALUout,sel=writeM2,out=outM);
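A rough Python model of what those two parts compute is shown below, as an illustration only. The function name memory_outputs is hypothetical, and the sketch makes no claim about whether the surrounding CPU design is otherwise correct.

```python
def memory_outputs(instruction, alu_out):
    """Return (write_m, out_m) for a 16-bit instruction and an ALU result."""
    is_c_instruction = (instruction >> 15) & 1   # instruction[15]
    d3 = (instruction >> 3) & 1                  # instruction[3]
    write_m = bool(is_c_instruction and d3)      # And(..., out=writeM, out=writeM2)
    out_m = alu_out if write_m else 0            # Mux16(a=false, b=ALUout, sel=writeM2)
    return write_m, out_m


print(memory_outputs(0b1110001100001000, 42))  # (True, 42)
print(memory_outputs(0b0000000000001000, 42))  # (False, 0)
```

The second call has bit 15 clear (an A-instruction), so outM is forced to zero even though bit 3 happens to be set.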
Good luck!

Evaluation order for always blocks triggered within always blocks in Verilog?

I understand that, for 2 always blocks with the same trigger, their order of evaluation is completely unpredictable.
However, suppose I have:
always @(a) begin : blockX
c = 0;
d = a + 2;
if(c != 1) e = 2;
end
always @(a) begin : blockY
e = 3;
end
always @(d) begin : blockZ
c = 1;
e = 1;
end
Suppose block X evaluates first. Does changing d in blockX immediately jump to blockZ? If not, when is blockZ evaluated with respect to blockY?
My programmer's instinct thinks of the sequence of events as a stack, where evaluating blockX is like a function call to blockZ and I immediately jump there in the code, then finish evaluating blockX.
However, because we call the active events queue, well, a queue, this suggests blockZ is enqueued at the back of the active events queue, and I'm 100% guaranteed it will be evaluated last (unless there are other triggered always blocks).
There's also the intermediate possibility, where it's neither first nor last but is also evaluated in a random and unpredictable order.
So in this example, are 1, 2, or 3 all possible final values for e, depending on how the compiler is feeling at run time?
Additionally, while I understand, of course, that this represents awful style, where might I find the specification for this kind of behavior?

Always blocks are not function calls. See a recent answer I just gave for a similar question. These blocks are concurrent processes. The LRM only guarantees the ordering of statements within a begin/end block. There is no defined ordering between concurrently executing begin/end blocks (see Section 4.7, Nondeterminism, in the 1800-2012 LRM), so a simulator is free to interleave the statements in any way as long as it honors the order within a single block.
So you are correct that e could have the final values 1, 2 or 3 depending on how a simulator decides to implement and optimize your code.

swap two variables in verilog using XOR

I have a 264-bit row of data in a memory buffer, written in Verilog HDL:
buffer[2]=264'b000100000100001000000000001000000000000001000001000000000000000000000000000000000000100000010000010000100000000000100000000010000100001100000000000000000000000000000000000010000001000001000010000000000010000000000000010001010000000000000000000000000000000000001000;
I want to transfer the 10 bits buffer[2][147:138] of the above row into buffer[2][59:50], and then transfer buffer[2][235:226] into buffer[2][147:138].
I tried to do this using XOR, but it does not work:
buffer[2][59:50]=buffer[2][59:50]^buffer[2][147:138];
buffer[2][147:138]=buffer[2][59:50]^buffer[2][147:138];
buffer[2][59:50]=buffer[2][59:50]^buffer[2][147:138];
buffer[2][235:226]=buffer[2][235:226]^buffer[2][147:138];
buffer[2][147:138]=buffer[2][235:226]^buffer[2][147:138];
buffer[2][235:226]=buffer[2][235:226]^buffer[2][147:138];
How can I do this without using non-blocking assignments?

You can swap with concatenations, no xor required:
{buffer[2][147:138],buffer[2][59:50]} = {buffer[2][59:50],buffer[2][147:138]};
{buffer[2][235:226],buffer[2][147:138]} = {buffer[2][147:138],buffer[2][235:226]};
Your title says swap, but your description says transfer. To transfer, you can still use the same approach:
{buffer[2][147:138],buffer[2][59:50]} = {buffer[2][235:226],buffer[2][147:138]};
// Or you can do this, beware order matters
buffer[2][59:50] = buffer[2][147:138];
buffer[2][147:138] = buffer[2][235:226];
Be careful where you do this in an always block. It can create a combinational feedback loop after synthesis if done incorrectly. The bits must first be assigned a determinate value (ideally from a flop) before doing the swap.
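The "order matters" remark can be checked with a small Python model of the two blocking assignments, using an arbitrary-precision integer as the 264-bit row. The helpers get_bits and set_bits and the marker values are hypothetical, chosen only to make the three fields easy to tell apart.

```python
def get_bits(value, hi, lo):
    """Return value[hi:lo] (inclusive, Verilog-style) as an integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def set_bits(value, hi, lo, field):
    """Return value with bits [hi:lo] replaced by field."""
    width = hi - lo + 1
    mask = ((1 << width) - 1) << lo
    return (value & ~mask) | ((field << lo) & mask)


def transfer(row):
    """Blocking-style transfer: [147:138] -> [59:50], then [235:226] -> [147:138]."""
    row = set_bits(row, 59, 50, get_bits(row, 147, 138))
    row = set_bits(row, 147, 138, get_bits(row, 235, 226))
    return row


def transfer_wrong_order(row):
    """Same two statements reversed: [147:138] is overwritten before it is read."""
    row = set_bits(row, 147, 138, get_bits(row, 235, 226))
    row = set_bits(row, 59, 50, get_bits(row, 147, 138))
    return row


# Hypothetical test row: distinct 10-bit markers in the three fields.
row = (0x111 << 226) | (0x222 << 138) | (0x333 << 50)
good = transfer(row)
bad = transfer_wrong_order(row)
print(hex(get_bits(good, 59, 50)), hex(get_bits(good, 147, 138)))  # 0x222 0x111
print(hex(get_bits(bad, 59, 50)), hex(get_bits(bad, 147, 138)))    # 0x111 0x111
```

With the statements reversed, the middle field is overwritten first, so the lowest field ends up with a copy of [235:226] instead of the original [147:138]. The concatenation form avoids this because the whole right-hand side is evaluated before anything is assigned.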

Just create a new variable to hold the new, rearranged, array. This should not generate any logic, you are just rearranging wires.
reg [263:0] reArrBuffer [0:2];
assign reArrBuffer =
'{buffer[0],
buffer[1],
{buffer[2][263:148], buffer[2][235:226], buffer[2][137:60], buffer[2][147:138], buffer[2][49:0]}
};
Note: You need ' in front of the first { to create an assignment pattern for an unpacked array. It can be removed if buffer and reArrBuffer are packed.
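When rebuilding a row from slices like this, it is easy to drop or duplicate a bit. The following Python sketch models the concatenation and checks that the pieces add up to 264 bits; the helper names concat, field and rearrange are hypothetical.

```python
def concat(*parts):
    """Verilog-style {a, b, ...}: parts are (value, width), most significant first."""
    result, total_width = 0, 0
    for value, width in parts:
        result = (result << width) | (value & ((1 << width) - 1))
        total_width += width
    return result, total_width


def field(value, hi, lo):
    """Return (value[hi:lo], width) so it can be passed straight to concat."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1), width


def rearrange(row):
    """Rebuild the 264-bit row with the two 10-bit fields moved."""
    new_row, width = concat(
        field(row, 263, 148),
        field(row, 235, 226),   # lands in [147:138]
        field(row, 137, 60),
        field(row, 147, 138),   # lands in [59:50]
        field(row, 49, 0),
    )
    assert width == 264  # 116 + 10 + 78 + 10 + 50
    return new_row


row = (0x111 << 226) | (0x222 << 138) | (0x333 << 50)
new_row = rearrange(row)
print(hex(field(new_row, 147, 138)[0]), hex(field(new_row, 59, 50)[0]))  # 0x111 0x222
```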

How to count the number of occurrences of a type passed to a function in haskell

How would one count the number of times each data type was passed into a function, along with a total of the values? I am new to FP and not sure whether this is permitted by immutability or referential transparency. The context is working with stacks: given a series of instructions passed to the stack, I would like to work out how often each particular instruction was passed in and the total value of all instructions of that type, as a sort of counter. I have searched around to no avail and am starting to think my approach may be fundamentally flawed, so any advice would be appreciated. I was working along these lines:
> data Value
> = Numeric Int
> | Logical Bool
> deriving (Eq, Show, Read)
...
> data Instruction
> = Push Value
> | Pop
> | Fetch Int
> | Store Int
...
> step inst c=
> case (inst) of
> (Push, stack) -> (c', x : stack)
> (Pop, _ : stack) -> (c', stack)
> where
> c = c' + 1
...

Instead of explicitly managing the stack, you can use the State monad from Control.Monad.State. For details of the inner workings you should read the docs.
step :: Instruction -> State [Value] ()
step (Push v) = do
stack <- get
put (v:stack)
step Pop = do
(_:stack) <- get
put stack
You can also store the number of each instruction in the state:
step :: Instruction -> State (Int, Int, Int, Int, [Value]) ()
step (Push v) = do
(a, b, c, d, stack) <- get
put (a+1, b, c, d, v:stack)
step Pop = do
(a, b, c, d, (_:stack)) <- get
put (a, b+1, c, d, stack)
Working with a 5-tuple is somewhat cumbersome so you may want to define your own datatype for this. In this model, the first Int is the number of Pushes, the second, the number of Pops, etc.
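As a loose analogue of replacing the 5-tuple with a named record, here is a Python sketch in which an immutable dataclass is threaded through a pure step function. It is not Haskell and does not use a State monad; the names Machine, step and program are hypothetical, and only Push and Pop are modeled.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class Machine:
    pushes: int = 0
    pops: int = 0
    stack: tuple = ()


def step(machine, instruction):
    """Pure step: returns a new Machine instead of mutating the old one."""
    op = instruction[0]
    if op == "push":
        return replace(machine, pushes=machine.pushes + 1,
                       stack=(instruction[1],) + machine.stack)
    if op == "pop":
        # Assumption: popping an empty stack leaves it empty.
        return replace(machine, pops=machine.pops + 1,
                       stack=machine.stack[1:])
    raise ValueError(f"unknown instruction: {instruction!r}")


program = [("push", 1), ("push", 2), ("pop",), ("push", 3)]
final = reduce(step, program, Machine())
print(final)  # Machine(pushes=3, pops=1, stack=(3, 1))
```

Named fields make it harder to confuse one counter with another than positional tuple slots do, which is the same motivation as defining a dedicated datatype in Haskell.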

So you need to apply a whole sequence of stack operations and get both the resulting stack and statistics on which operations were called. To accumulate the statistics in a pure way, you need to carry them along your chain of operations. There are a few ways to do it:
1) add the stats explicitly to each function call and combine them manually;
2) or wrap them in a monad so that calls can be chained automatically with >>= or sequence.
The latter suggests some particular variants.
2.1) Use State, as user2407038 proposed earlier. It hides the additional argument that carries the stats, so it looks like imperative state that can be manipulated via put, get and modify.
2.2) Use Writer, which can be considered a «fat free» State where you can only add «something» (e.g. which operation was called) to your carried stats, which is actually what you need (as far as I understand). Computations will be simpler because instead of all those calls to put, get and modify you'll have a single tell. But you'll need to make your Stats type an instance of Monoid (which is quite easy and straightforward, though).
2.3) Use ST, where the types can be quite frightening, but you can use mutable counters for performance. I wouldn't recommend that without real necessity, however.
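The Writer idea from 2.2 can be approximated in Python, purely as an illustration: each step returns a small log alongside the new stack, and the logs are combined with a monoid operation. Here collections.Counter plays the role of the Stats monoid, with + as the combining operation and an empty Counter as the identity. The names step and run are hypothetical.

```python
from collections import Counter


def step(instruction, stack):
    """Return (log, new_stack); the log plays the role of Writer's tell."""
    op = instruction[0]
    if op == "push":
        return Counter({"push": 1}), (instruction[1],) + stack
    if op == "pop":
        return Counter({"pop": 1}), stack[1:]
    raise ValueError(f"unknown instruction: {instruction!r}")


def run(program, stack=()):
    """Chain the steps, combining logs with + (Counter's monoid operation)."""
    stats = Counter()  # the monoid's identity element
    for instruction in program:
        log, stack = step(instruction, stack)
        stats = stats + log
    return stats, stack


stats, stack = run([("push", 1), ("push", 2), ("pop",), ("push", 3)])
print(stats["push"], stats["pop"], stack)  # 3 1 (3, 1)
```

Each step only reports what it did and never reads the accumulated statistics, which is the restriction that distinguishes Writer from State.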

You could assign a number to each instruction and add an extra argument to the function, so that the count is incremented whenever the number and the instruction match. The input and output would be the program, the stack(s), and the counter.
step i1 (Push x : insts, stack, c) = if i1 == 0 then step i1 (insts, x : stack, c + 1) else step i1 (insts, x : stack, c)
step i1 (Pop : insts, _ : stack, c) = if i1 == 1 then step i1 (insts, stack, c + 1) else step i1 (insts, stack, c)
