Number of ways a program can execute on a sequentially consistent architecture - multithreading

I am having trouble understanding and solving this problem.
Let's say we have 3 threads, each with instructions a, b, and c. How many different ways can the program execute on a sequentially consistent architecture?
How should I approach this problem?

You have 3 threads. Each one processes a sequence of actions a, b, and c. Let's use numbers for the threads. The key thing to understand: each thread, 1 to 3, has to process its actions in order. Variations can only happen because the threads can interleave their work in many combinations. Let's also assume that our machine can serve only one thread at any time, and that an action is completed before a thread context switch happens.
You can have:
1a, 1b, 1c, 2a, 2b, 2c, 3a, 3b, 3c
1a, 2a, 1b, 1c, 2b, 2c, 3a, 3b, 3c
1a, 2a, 2b, 1b, 1c, 2c, 3a, 3b, 3c
1a, 2a, 2b, 2c, 1b, 1c, 3a, 3b, 3c
1a, 2a, 2b, 2c, 3a, 1b, 1c, 3b, 3c
1a, 2a, 2b, 2c, 3a, 3b, 1b, 1c, 3c
...
Looking at that, one can think up an algorithm to build a tree spanning all combinations. You basically pick a candidate, like "1a", and then you figure out which next steps are possible (in this case 1b, 2a, or 3a: the next unexecuted step of each thread).
You can then start building paths:
1a-1b
1a-2a
...
and so on. For each path, you remember the elements on it. And in order to add another element, you check out the "remaining" ones. Each remaining object defines another new path. Repeat.
By doing so, you should be able to define an algorithm that computes all possible paths.
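The path-building idea above can be sketched as a depth-first search. This is only a minimal sketch under the same single-core assumption; the function name `interleavings` and the list-of-lists input format are illustrative choices, not something from the question.

```python
def interleavings(threads):
    """Yield every execution order that keeps each thread's own steps in order."""
    total_steps = sum(len(steps) for steps in threads)
    positions = [0] * len(threads)  # index of the next unexecuted step per thread
    path = []                       # steps chosen so far on the current path

    def extend():
        if len(path) == total_steps:
            yield tuple(path)
            return
        # The "remaining" candidates are the next step of every unfinished thread.
        for i, steps in enumerate(threads):
            if positions[i] < len(steps):
                path.append(steps[positions[i]])
                positions[i] += 1
                yield from extend()
                positions[i] -= 1   # backtrack and try the next candidate
                path.pop()

    yield from extend()


threads = [["1a", "1b", "1c"], ["2a", "2b", "2c"], ["3a", "3b", "3c"]]
all_paths = list(interleavings(threads))
print(len(all_paths))
print(all_paths[0])
```

Because the search only ever offers the next step of each thread, it never produces an invalid path, so nothing has to be filtered out afterwards.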
This would make up a nice coding kata exercise - and my input should be good enough to get you going. If you need a shortcut, maybe look here. Or there.
Beyond that: obviously, this could also be solved as a pure mathematical problem: if you just put all elements 1a, ... 3c into a list, and create all permutations of that list, you would receive 9!, so 362880 possibilities. But of course, this doesn't work, as your problem should exclude permutations such as 1b, 1a (because a, b, c will always be "in order" given your requirements).
So ( number of threads * number of steps ) ! gives you an upper boundary for the number of valid paths. Maybe someone else comes by and adds a bit more of maths to figure the number of invalid paths.
( btw: that would be another approach for "printing" all possible paths - simply create ALL permutations of the 9 elements, and drop those that are invalid )
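For the counting side, one standard way to discount the invalid permutations is the multinomial coefficient: of the 3! ways to arrange one thread's own steps among the slots it occupies, only one is in order, so the 9! total is divided by 3! once per thread. A small sketch follows; the helper name `count_interleavings` is made up for illustration.

```python
from math import factorial


def count_interleavings(steps_per_thread):
    """Multinomial coefficient: (total steps)! / product of (steps in each thread)!"""
    total = factorial(sum(steps_per_thread))
    for n in steps_per_thread:
        total //= factorial(n)  # only 1 of the n! orderings of a thread's steps is valid
    return total


print(count_interleavings([3, 3, 3]))  # 9! / (3! * 3! * 3!) = 362880 / 216 = 1680
```

So, under the single-core assumption, 3 threads with 3 steps each have 1680 valid orderings out of the 362880 permutations.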
Disclaimer: all of the above only makes sense when we assume that the underlying machine has exactly one "real" thread, and that context switches happen only after an operation has completed. If you drop these assumptions, then you make room for:
1: aaaabbbbcccc
2: aaaabbbbcccc
3: aaaabbbbccc
In other words: if you consider the potential paths in a real machine, things become much more complicated.

Related

Why are labels in BASIC increments of 10?

In BASIC, line labels are in increments of 10. For example, mandlebrot.bas from github/linguist:
10 REM Mandelbrot Set with ANSI Colors in BASIC
20 REM https://github.com/telnet23
30 REM 20 November 2020
40 CLS
50 MAXK = 32
60 MINRE = -2.5
70 MAXRE = 1.5
80 MINIM = -1.5
90 MAXIM = 1.5
100 FOR X = 1 TO WIDTH
110 FOR Y = 1 TO HEIGHT
120 LOCATE Y, X
130 REC = MINRE + (MAXRE - MINRE) / (WIDTH - 1) * (X - 1)
140 IMC = MINIM + (MAXIM - MINIM) / (HEIGHT - 1) * (Y - 1)
150 K = 0
160 REF = 0
170 IMF = 0
180 K = K + 1
190 REF = REC + REF * REF - IMF * IMF
200 IMF = IMC + REF * IMF + REF * IMF
210 IF REF * REF + IMF * IMF > 4 THEN GOTO 230
220 IF K < MAXK THEN GOTO 180
230 M = 40 + INT(8 / MAXK * (K - 1))
240 PRINT CHR$(27) + "[" + STR$(M) + "m";
250 PRINT " ";
260 PRINT CHR$(27) + "[49m";
270 NEXT Y
280 NEXT X
Why isn't it just increments of 1? That would make more sense.
The short answer is that BASIC numbering is in increments of one, but programmers can and do skip some of the increments. BASIC grew out of Fortran, which also used numeric labels, and often used increments of 10. Unlike Fortran, early BASIC required numbering all lines, so that they changed from labels to line numbers.
BASIC is numbered in increments greater than one to allow adding new lines between existing lines.
Most early home computer BASIC implementations did not have a built-in means of renumbering lines.
Code execution in BASIC implementations with line numbers happened in order of line number.
This meant that if you wanted to add new lines, you needed to leave numbers free between those lines. Even on computers with a RENUM implementation, renumbering could take time. So if you wanted standard increments you’d still usually only RENUM at the end of a session or when you thought you were mostly finished.
Speculation: Programmers use increments of 10 specifically for BASIC line numbers for at least two reasons. First, tradition. Fortran code from the era appears to use increments of 10 for its labels when it uses any standard increments at all. Second, appearance. On the smaller screens of the era it is easier to see where BASIC lines start if they all end in the same symbol, and zero is a very useful symbol for that purpose. Speaking from personal experience, I followed the spotty tradition of starting different routines on hundreds boundaries and thousands boundaries to take advantage of the multiple zeroes at the beginning of the line. This made it easier to recognize the starts of those routines later when reading through the code.
BASIC grew from Fortran, which also used numbers, but as labels. Fortran lines only required a label if they needed to be referred to, such as with a GO TO, to know where a loop can be exited, or as a FORMAT for a WRITE. Such lines were also often in increments greater than 1—and commonly also 10—so as to allow space to add more in between if necessary. This wasn’t technically necessary. Since they were labels and not line numbers, they didn’t need to be sequential. But most programmers made them sequential for readability.
In his commonly-used Fortran 77 tutorial, Erik Boman writes:
Typically, there will be many loops and other statements in a single program that require a statement label. The programmer is responsible for assigning a unique number to each label in each program (or subprogram). The numerical value of statement labels have no significance, so any integer numbers can be used. Typically, most programmers increment labels by 10 at a time.
BASIC required that all lines have numbers and that the line numbers be sequential; that was part of the purpose of having line numbers: a BASIC program could be entered out of order. This allowed for later edits. Thus, line 15 could be added after lines 10 and 20 had been added. This made leaving potential line numbers between existing line numbers even more useful.
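The effect of entering lines out of order can be illustrated with a toy model. This is a sketch of the idea only, not how any particular BASIC interpreter stored programs; the `program` dictionary and `enter` function are invented for the illustration.

```python
program = {}  # line number -> statement text


def enter(line):
    """Store a typed line under its number; retyping a number replaces that line."""
    number, _, statement = line.partition(" ")
    program[int(number)] = statement


# Lines 10 and 20 are typed first, then line 15 is added afterwards.
for typed in ['10 PRINT "HELLO"', "20 END", '15 PRINT "WORLD"']:
    enter(typed)

# LIST and RUN both follow numerical order, not entry order.
listing = [f"{n} {program[n]}" for n in sorted(program)]
print(listing)
```

Line 15 lands between 10 and 20 only because the gap was left free, which is the whole point of stepping by 10.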
If you look at magazines with BASIC program listings, such as Rainbow Magazine or Creative Computing, you’ll often see numbers sandwiched somewhat randomly between the tens. And depending on style, many people used one less than the line number at the start of a routine or subroutine to comment the routine. Routines and DATA sections might also start on even hundreds or even thousands.
Programmers who used conventions like this might not even want to renumber a program, as it would mess up their conventions. BASIC programs were often a mass of text; any convention that improved readability was savored.
Ten was a generally accepted spacing even before the home computer era. In his Basic BASIC, second edition (1978, and expecting that the user would be using “a remote terminal”), James S. Coan writes (page 2):
It is conventional although not required to use intervals of 10 for the numbers of adjacent lines in a program. This is because any modification in the program must also have line numbers. So you can use the in-between numbers for that purpose. It should be comforting to know at this point that the line numbers do not have to be typed in order. No matter what order they are typed in, the computer will follow the numerical order in executing the program.
There are examples of similar patterns in Coan’s Basic Fortran. For example, page 46 has a simple program to “search for pythagorean triples”; while the first label is 12, the remaining labels are 20, 30, and 40, respectively.
He used similar patterns without increments of 10; for example, on page 132 of Basic Fortran, Coan uses increments of 2 for his labels, and keeps the calculation section of the program in the hundreds with the display section of the program in the two hundreds. The END statement uses label 9900.
Similarly, in their 1982 Elementary BASIC, Henry Ledgard and Andrew Singer write (page 27):
Depending on the version of Basic you are using, a line number can consist of 1 to 4 or 5 digits. Here, all line numbers will consist of 4 digits, a common practice accepted by almost every version of Basic. The line numbers must be in sequential order. Increasing line numbers are often given in increments of 10, a convention we will also follow. This convention allows you to make small changes to a program without changing all the line numbers.
And Jerald R. Brown’s 1982 Instant BASIC: 2nd Astounding Edition (p. 7):
You don’t have to enter or type in a program in line number order. That is, you don’t have to enter line 10 first, then line 20, and then line 30. If we type in a program out of line number order, the computer doesn’t care. It follows the line numbers not the order they were entered or typed in. This makes it easy to insert more statements in a program already stored in the computer’s memory. You may have noticed how we cleverly number the statements in our programs by 10's. This makes it easy to add more statements between the existing line numbers -- up to nine more statements between lines 10 and 20, for example.
Much of the choice of how to number lines in a BASIC program was based on tradition and a vague sense of what worked. This was especially true in the home computer era where most users didn’t take classes on how to use BASIC but rather learned by reading other people’s programs, typing them in from the many books and magazines that provided program listings. The tradition of incrementing by 10 and inserting new features between those increments was an obvious one.
You can see it scanning through old books of code, such as 101 BASIC Computer Games. The very first program, “Amazin” increments its line numbers by 10. But at some point, a user/coder decided they needed an extra space after the code prints out how many dollars the player has; so that extra naked PRINT is on line 195. And the display of the instructions for the game are all kept between lines 100 and 109, another common pattern.
The program listing on page 30 for Basket displays the common habit of starting separate routines at even hundreds and thousands. Line numbers within those routines continue to increment by 10. The pattern is fairly obvious even though new features (and possibly other patterns) have added several lines outside the pattern.
As BASIC implementations began to get RENUM commands, more BASIC code listings appeared with increments of one. This is partly because using an increment of one used less memory. While the line number itself used a fixed amount of RAM (with the result that the maximum line number was often somewhere around FFFF, or 65535), references to line numbers did not tend to use a fixed length. Thus, smaller line numbers used less RAM overall.
Depending on how large the program was, and how much branching it used, this could be significant compared to the amount of RAM the machine itself had.
For example, I recently typed in the SKETCH.BAS program from the October 1984 Rainbow Magazine, page 97. This is a magazine, and a program, for the TRS-80 Color Computer. This program uses increments of 1 for its line numbering. On CLOADing the program in, free memory stands at 17049. After using RENUM 10,1,10 to renumber it in increments of 10, free memory stands at 16,953.
A savings of 96 bytes may not sound like much, but this is a very small program; and it’s still half a percent of available RAM. The difference could be the difference between a program fitting into available RAM or not fitting. This computer only has 22823 bytes of RAM free even with no program in memory at all.

What data structure should I use in Python to represent a preflop tree in Poker - NL Texas Holdem?

I am trying to model preflop play of NL Texas Holdem and store corresponding ranges/actions in Python.
It is often referred to colloquially in videos as "in this branch of the tree..." (for, let's say, the player on the Button facing a 3-bet from the Big Blind).
So I thought a tree would be appropriate.
But the problem is that in a tree branches have no value, that is "Facing 3-Bet from the Button" is just a "child" of "Big Blind 3-bets", but in the scenario that I am trying to model "Facing 3-Bet from the Button" is different if the Big Blind bets 9, 12 or 15 big blinds.
How should I model that?
The second problem is then comparing similar situations preflop.
If any action taken by a player leads to another sub-node, then similar situations preflop reside on different depths or levels of the tree and are not really comparable.
For example:
LJ (Seat 1) folds, HJ (Seat 2) folds, CO (Seat 3) folds, Button is facing unopened pot ---> level 4
LJ (Seat 1) raises first in, HJ (Seat 2) raises/3-bets, LJ (Seat 1) raises all in, HJ (Seat 2) is facing all-in bet ---> level 4
So both are reached after 4-actions although the scenarios could not be more apart. Makes it difficult to compare between similar situations. How should I go about it?
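One possible way to approach both problems, offered only as a sketch: put the value on the edge by keying each child on the full action (including the bet size), and compare situations by a label derived from the action sequence rather than by depth. Every name below (`Action`, `Node`, `walk`, `situation`) and the choice of counting raises as the comparison label are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    position: str         # e.g. "BTN", "BB"
    kind: str             # "fold", "call", or "raise"
    size_bb: float = 0.0  # raise size in big blinds (0 for fold/call)


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # Action -> Node (the edge carries the size)
    ranges: dict = field(default_factory=dict)    # hand -> action frequencies, filled by the caller

    def child(self, action):
        return self.children.setdefault(action, Node())


def walk(root, actions):
    """Follow (and create if needed) the branch for a sequence of actions."""
    node = root
    for action in actions:
        node = node.child(action)
    return node


def situation(actions):
    """Depth-independent label based on how many raises have been made so far."""
    raises = sum(1 for a in actions if a.kind == "raise")
    names = {0: "unopened", 1: "facing open", 2: "facing 3-bet"}
    return names.get(raises, f"facing {raises + 1}-bet")


root = Node()
vs_9bb = [Action("BTN", "raise", 2.5), Action("BB", "raise", 9)]
vs_12bb = [Action("BTN", "raise", 2.5), Action("BB", "raise", 12)]
print(walk(root, vs_9bb) is walk(root, vs_12bb))  # different sizes, different nodes
print(situation(vs_9bb), "|", situation(vs_12bb))  # same kind of situation
```

With this layout, a 9 BB and a 12 BB 3-bet lead to separate nodes, while `situation` groups lines of different lengths that pose the same kind of decision.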

Cobol - parsing group items in a cobol program

I need to extract information from a COBOL program. I'm using the ANTLR grammar for COBOL. I need to extract group variables as a whole. I'm not able to extract this with ANTLR as the parser extracts every variable subdivision/group item as an individual element.
I need somehow to get the group items as a bunch. I'm new to COBOL, so I want to get an understanding of how the compiler understands which elements to include in a group, and where to stop.
EX:
01 EMPREC.
02 EEMPNAME.
10 FIRSTNAME PIC X(10)
10 LASTNAM PIC X(15)
07 SNO PIC X(15)
Is the above definition valid? Will the compiler include all elements(=>2 and <=49) after the first item (01 EMPREC), in the group EMPREC until it encounters another 01 or 77 ? Is this safe to assume?
Is the level information enough to derive what elements fall under a group?
Any pointers are appreciated.
I am the author of the COBOL ANTLR4 grammar you found in the ANTLR4 grammars project. The COBOL grammar generates only an Abstract Syntax Tree (AST).
In contrast, what you ask for is an Abstract Semantic Graph (ASG), which represents grouping of variables and in general relationships between AST elements.
Such an ASG is generated by the COBOL parser at my proleap-cobol-parser project. This project uses the mentioned COBOL grammar and resolves relationships between AST elements.
An example for parsing data description entries can be found in this unit test.
You actually had two questions:
"Is the [...] definition valid?" No it is not as you have no previous level 07. If you change the level of EEMPNAME to 07 or SNO to 02 it is valid. Group items may have a USAGE clause but no PICTURE.
This leads to the question "I want to get an understanding of how the compiler understands which elements to include in a group, and where to stop".
You need to store the level number together with the variable. If you want to know what is part of the group, then you need to check this level and everything below it. If you want the complete level 02 group, take only the following variables with a higher level number, until you get to the next level 02 or a higher-ranking level (that is, a lower number, in this case 01), if you want the
Depending on your needs, you additionally need to check whether the next variable with the same level has a REDEFINES clause; in that case it belongs to the same group (storage-wise). Something similar applies to level 66 (RENAMES, which doesn't have its own storage).
Level 88 has no storage either; it is just for validation entries. Depending on the parsing you want to do, you can ignore them.
Important: level 88 does not create a sub-item, you can have multiple ones and a lower level number afterwards.
The level numbers that always define a new item are 01, and with extensions 66, 77 and 78.
01 vargroup.
02 var-1 pic 9.
88 var-is-even values 0, 2, 4 6 8 .
88 var-is-not-even values 1 3 5 7 9.
88 var-is-big value 6 thru 9.
02 var-2 pic x.
01 new-var pic x.
77 other-var pic 9.
I suggest reading some COBOL sources and coming up with a new question, if necessary. For example CBL_OC_DUMP.
I suspect you are going to need to put some additional code behind your ANTLR parser. If you tokenize each individual item, then keeping up with a stack of group items is somewhat easy. However, trying to grab the entire group item as a single production will be very hard.
Some of the challenges that ANTLR will not be up to are 1) group items can contain group items; 2) group items can redefine other items, or be redefined; 3) the little used, but very complicating level-66 renames clause.
If you treat each numbered data definition as a separate production, and maintain a stack, pushing for new items, popping once you have completed processing an item, and knowing that you have completed a group once you see the same level number again, your life will be easier.
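The stack approach described above might look like the following sketch. It assumes the data description entries have already been reduced to `(level, name)` pairs in source order, and it follows the rules stated in the answers: levels 01, 66, 77 and 78 start a new item, and level 88 attaches to the preceding item without opening a group. The function name `build_groups` and the dictionary layout are hypothetical, and the sketch does not validate level numbers or handle REDEFINES.

```python
def build_groups(entries):
    """Nest (level, name) pairs into groups using a stack of currently open items."""
    roots = []
    stack = []  # open items, outermost first
    for level, name in entries:
        item = {"level": level, "name": name, "children": []}
        if level == 88:
            # Condition name: belongs to the most recent item, opens no group.
            if stack:
                stack[-1]["children"].append(item)
            continue
        if level in (1, 66, 77, 78):
            stack = []  # these levels always start a new item
        else:
            # An equal or lower level number closes the open items above it.
            while stack and stack[-1]["level"] >= level:
                stack.pop()
        (stack[-1]["children"] if stack else roots).append(item)
        stack.append(item)
    return roots


entries = [
    (1, "vargroup"), (2, "var-1"),
    (88, "var-is-even"), (88, "var-is-not-even"), (88, "var-is-big"),
    (2, "var-2"), (1, "new-var"), (77, "other-var"),
]
groups = build_groups(entries)
print([g["name"] for g in groups])
print([c["name"] for c in groups[0]["children"]])
```

Keeping each entry as its own production and doing the grouping in a pass like this avoids having to express the nesting in the grammar itself.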
It is quite a while now since I've done COBOL, but there are quite a lot of issues if my memory serves me correctly.
1) 01 levels always start in column 8.
2) When assigning subsequent levels you are better off incrementing by 5
01 my-record.
05 my-name pic x(30) value spaces.
05 my-address1 pic x(40) value spaces.
3) I thought 77 levels were now obsolete, since they are not an efficient use of memory. Also, when 77 levels are used, they should always be defined at the start of the working-storage section. Obviously, record layouts are defined in the file section unless you are using WRITE FROM and READ INTO?
4) If you are defining lots of new-var pic x. Don't use new 01 levels for each!
01 ws-flages.
05 ws_flag1 pic x value space.
05 ws_flag2 pic x value space.
etc.
For COBOL manuals try Stern & Stern.
Hope this helps!

how to find edge from data in Excel

I'm trying to find the relations (edges) between nodes using Excel and VBA. I will use the output in Gephi, but the data that I have in Excel is too large, so this is a small example for my question about finding the true relations.
If I have this data:
'data for id_books that user_id borrowed
user_id id_book book
1 55 physic
2 55 physic
2 55 physic
3 55 physic
4 55 physic
This is the output, showing the users that borrowed the same book from the library:
nodes (user_id):    edges (relation between user_id):
                    source,target
1                   1,2
2                   1,3
3                   1,4
4                   2,3
                    2,4
                    2,3
                    2,4
Is it correct to show 1,2 just once?
There are two closely related structures in Discrete Mathematics, graphs and multigraphs. A graph is a set of nodes and a set of pairs of nodes. If you want to define a graph whose nodes are users and whose edges correspond to the relation of having borrowed the same book at least once, then it wouldn't make sense to list an edge like (1,2) more than once. On the other hand, in a multigraph edges can be repeated. Storing (1,2) multiple times would tell you that user 1 and user 2 have borrowed the same book, with at least one of those users having borrowed the book at least twice. If you would find that information useful, use a multigraph. Otherwise use a graph. I would think that something like Gephi would be able to draw both graphs and multigraphs, so in that sense it really is up to you. Note, however, that drawings of multigraphs can be harder to read since they have more visual clutter. I hate cluttered diagrams, so I would probably prefer to use single-edge graphs rather than multi-edge multigraphs, but that is more of a preference on my part. You might have a strong reason to prefer multigraphs in your intended application.
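The difference between the two structures can be made concrete with the sample data. The sketch below uses Python rather than VBA purely for illustration; pairing up individual borrow records to get repeated edges is one possible convention for the multigraph, not the only one, and the variable names are invented.

```python
from collections import defaultdict
from itertools import combinations

# (user_id, id_book) rows from the example; user 2 borrowed book 55 twice.
borrows = [(1, 55), (2, 55), (2, 55), (3, 55), (4, 55)]

by_book = defaultdict(list)
for user, book in borrows:
    by_book[book].append(user)

# Multigraph: pair up borrow records, so a repeat borrower yields repeated edges.
multi_edges = [
    (a, b)
    for users in by_book.values()
    for a, b in combinations(sorted(users), 2)
    if a != b  # skip pairing a user with their own second borrow
]

# Simple graph: each pair of distinct users appears at most once.
simple_edges = sorted(set(multi_edges))

print(simple_edges)
print(len(multi_edges))
```

Either list can be written out as `source,target` rows for Gephi; the simple version is the one to use if only "borrowed the same book at least once" matters.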

Question about relations between two numbers

Is there any relation between two numbers' bits when one is divisible by the other? What is the relation between the bits of 36 and the bit sequences of 9 or 4 or 12, or between 10 (1010) and 5 (101), or 21 (10101) and 7 (00111)?
Thanks. I am sorry if some sentence is not correct, but I hope you understand what I want.
I know this is not exactly what you're asking, but it may be helpful. There are many tricks for establishing binary number divisibility by manipulation of bits. For example, a binary number is divisible by three if the sum of its bits in even positions minus the sum of its bits in odd positions is zero modulo 3. Here's a link discussing binary divisibility.
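The divisibility-by-three rule works because 2 is congruent to -1 modulo 3, so each bit contributes +1 or -1 depending on its position. A short sketch that checks the rule against ordinary division; the function name is made up, and position 0 is taken to be the least significant bit.

```python
def divisible_by_3(n):
    """Apply the alternating bit-sum rule to a non-negative integer."""
    even = odd = 0
    position = 0  # position 0 is the least significant bit
    while n:
        if n & 1:
            if position % 2 == 0:
                even += 1
            else:
                odd += 1
        n >>= 1
        position += 1
    return (even - odd) % 3 == 0


# 21 = 10101 has three 1 bits in even positions and none in odd positions.
print(divisible_by_3(21), divisible_by_3(36), divisible_by_3(10))
```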
Let's take the example of 36.
36 = 0010 0100
36 is 4 * 9, that is
4 = 0100
9 = 1001
If you multiply them (like you would on a normal multiplication) you'll have
      0100 x
      1001
  --------
      0100
     0000
    0000
   0100
  --------
   0100100
So essentially 0100 x 1001 = 0010 0100 (you can repeat the same for any other pair of divisors of course)
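The long multiplication above is the usual shift-and-add procedure: every 1 bit in the multiplier adds a shifted copy of the other factor. A minimal sketch, with `shift_and_add` as an illustrative name:

```python
def shift_and_add(a, b):
    """Multiply non-negative integers by adding a shifted copy of a per set bit of b."""
    result = 0
    shift = 0
    while b:
        if b & 1:
            result += a << shift  # partial product for this bit of b
        b >>= 1
        shift += 1
    return result


product = shift_and_add(0b0100, 0b1001)
print(product, format(product, "08b"))
```

Going from the factors to the product is mechanical like this; the hard direction is recovering the factors from the product's bits.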
Now, is there any special relation that will allow you to get all the divisors of 36 just by looking at its bits? The answer, alas, is no :)
EDIT: there is no KNOWN relation at least but, who knows, in the future maybe some smart mathematician will find one. As of today, the answer is still no.
So you want to know if you can 'quickly' do Integer Factorization by just looking at the bits?
Good luck with that!
Obviously, whether a is a multiple of b can be recognized from the binary representations of a and b (it's what the hardware does when executing the following code
boolean isMultiple = a % b == 0;
) and hence there is such a relationship.
Ask a more specific question to get a more specific response ...
The easiest relation to see is that the number of consecutive 0s in the least significant digits designates the largest power of two that is a factor of your number n. There are apparently other tests, as DonnyD pointed out (I hadn't known that one), but I expect they're not going to scale very well. If they did, public key cryptography, as it's generally implemented, would quickly become a thing of the past.
That's not to say that such methods can't be discovered / invented. For instance it's been shown that arbitrarily large numbers can be easily factored using quantum methods, but nobody's ever been actually able to implement a working system.
The bottom line is that we've entrusted our online financial system and national security apparatus to PKI based methods primarily because we assume that factoring numbers is hard for arbitrarily large numbers. But as Moron seemed to be implying in his answer, you're welcome to give it a whirl.
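The trailing-zeros observation mentioned above is easy to check in code. In this sketch the function names are invented, and `n` is assumed to be a positive integer.

```python
def largest_power_of_two_factor(n):
    """n & -n isolates the lowest set bit, i.e. 2 ** (number of trailing zeros)."""
    return n & -n


def trailing_zeros(n):
    """Count the consecutive 0 bits at the least significant end of n."""
    return (n & -n).bit_length() - 1


for n in (36, 10, 21):
    print(n, format(n, "b"), trailing_zeros(n), largest_power_of_two_factor(n))
```

For 36 (100100) this gives two trailing zeros and a factor of 4, which matches 36 = 4 * 9.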

Resources