List comprehensions and matrices (raws) - python-3.x

Please have a look at this construction:
M = [[1,2,3],
[4,5,6],
[7,8,9]]
T = [row[1] for row in M]
print(T)
The result is [2, 5, 8]
I managed to find something here:
http://docs.python.org/py3k/tutorial/datastructures.html#nested-list-comprehensions
But I'm not satisfied with my understanding of this scheme with 'raw'.
Could you tell me where else in the documentation I can read about it?
By the way, why raw? It seems to be a column?

T = [row[1] for row in M]
This is a list comprehension. List comprehensions basically allow you to create lists on the fly while iterating through other iterables (in this case M).
The code above is more or less identical to this:
T = []                # create empty list that holds the result
for row in M:         # iterate through all 'rows' in M
    cell = row[1]     # get the second cell of the current row
    T.append(cell)    # append the cell to the list
This is all just put together into a single line and a bit more efficient, but the basic idea is the same.
M is a matrix, but the internal representation you chose is a list of lists, i.e. a list of rows. In T you want to select a single column of the matrix, although you have no direct access to columns in M. So you go through each row, take the cell of the column you are interested in, and create a new list from those cells (as lists are usually written horizontally, you are strictly getting the transposed vector of your column).
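As a minimal sketch of the equivalence described above, the loop and the comprehension can be wrapped in two functions and compared directly. The names column_loop and column_comprehension are made up for this illustration; they are not part of the question.

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]


def column_loop(matrix, index):
    """Collect one column with an explicit loop."""
    result = []
    for row in matrix:             # each row is itself a list
        result.append(row[index])  # pick the cell in the wanted column
    return result


def column_comprehension(matrix, index):
    """Collect the same column with a list comprehension."""
    return [row[index] for row in matrix]


print(column_loop(M, 1))           # [2, 5, 8]
print(column_comprehension(M, 1))  # [2, 5, 8]
```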

You iterate through the rows and take the second element of each row. Then you collect the extracted elements. This means that you have extracted the column.
Read the list comprehension from the right to the left. It says:
Loop through the matrix M to get the row each time (for row in M).
Apply the expression to the row to get what you need (here row[1]).
Iterate through the constructed results and build the list of them ([...]).
The last point is what makes it a list comprehension. The part between the [ and ] has the same form as a generator expression. You can also try:
column = list(row[1] for row in M)
And you get exactly the same result. That is because list() constructs a list from any iterable, and a generator expression is such an iterable. You can also try:
my_set = set(row[1] for row in M)
to get the set of the elements that form the column. The syntactically brief form is:
my_set = {row[1] for row in M}
and it is called a set comprehension. There is also a dictionary comprehension, like this:
d = { row[1]: True for row in M }
Here, rather artificially, row[1] is used as the key and True is used as the value.
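The variants mentioned above can be run side by side. This is only a sketch; the variable names are chosen for the illustration. It also shows one difference that is easy to miss: a bare generator expression produces its values lazily and can be consumed only once.

```python
M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

as_list = [row[1] for row in M]        # list comprehension
via_list = list(row[1] for row in M)   # generator expression passed to list()
as_set = {row[1] for row in M}         # set comprehension
as_dict = {row[1]: True for row in M}  # dictionary comprehension

print(as_list == via_list)  # True
print(as_set == {2, 5, 8})  # True
print(as_dict)              # {2: True, 5: True, 8: True}

# A bare generator expression is lazy: values are produced only on demand.
gen = (row[1] for row in M)
first = next(gen)   # takes the first value out of the generator
rest = list(gen)    # only the remaining values are left
print(first, rest)  # 2 [5, 8]
```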

Related

Textjoin values of column B if duplicates are present in column A

I want to consolidate the data of column B into a single cell ONLY IF the index (i.e., column A) is duplicated.
For example:
Currently, I'm doing this manually for each duplicated index by using the following formula:
=TEXTJOIN(", ",TRUE,B4:B6)
Is there a better way to do this all at once?
Any help is appreciated.
There may be an easier way, but you can try this formula:
=BYROW(A2:A17,LAMBDA(p,IF(INDEX(MAP(A2:A17,LAMBDA(x,SUM(--(A2:INDEX(A2:A17,ROW(x)-1)=x)))),ROW(p)-1,1)=1,TEXTJOIN(", ",1,FILTER(B2:B17,A2:A17=p)),"")))
Using REDUCE might be possible for a more succinct solution, though try this for now:
=BYROW(A2:A17,LAMBDA(ζ,LET(α,A2:A17,IF((COUNTIF(α,ζ)>1)*(COUNTIF(INDEX(α,1):ζ,ζ)=1),TEXTJOIN(", ",,FILTER(B2:B17,α=ζ)),""))))
For the sake of alternatives, here are two other ways to solve it:
Using XMATCH/UNIQUE
=LET(A, A2:A17, ux, UNIQUE(A),idx, FILTER(XMATCH(ux, A), COUNTIF(A, ux)>1),
MAP(SEQUENCE(ROWS(A)), LAMBDA(s, IF(ISNA(XMATCH(s, idx)), "", TEXTJOIN(",",,
FILTER(B2:B17, A=INDEX(A,s)))))))
or using SMALL/INDEX to identify the first element of the repetition:
=LET(A, A2:A17, n, ROWS(A), s, SEQUENCE(n),
MAP(A, s, LAMBDA(aa,ss, LET(f, FILTER(B2:B17, A=aa), IF((ROWS(f)>1)
* (INDEX(s, SMALL(IF(A=aa, s, n+1),1))=ss), TEXTJOIN(",",, f), "")))))
Here is the output:
Explanation
XMATCH and UNIQUE
The main idea here is to identify the unique elements of column A via ux, and find the index position of their first occurrence in A via XMATCH(ux, A). The result is an array of the same size as ux. Then COUNTIF(A, ux)>1 returns an array of the same size as the XMATCH output, indicating where we have a repetition.
Here is the intermediate result:
XMATCH(ux, A)   COUNTIF(A, ux)>1
1               FALSE
2               FALSE
3               TRUE
6               FALSE
7               TRUE
9               TRUE
11              FALSE
12              TRUE
15              FALSE
16              FALSE
So FILTER takes only the rows from the first column where the second column is TRUE, i.e. the index positions (idx) where a repetition starts. For our sample it will be: {3;7;9;12}.
Now we iterate over the sequence of index positions (s) via MAP. If s is found in idx via XMATCH (XLOOKUP(s, idx, TRUE, FALSE) can also be used for the same purpose), then we join the values of column B filtered by column A equal to INDEX(A,s).
SMALL and INDEX
This is a more flexible approach: if we want to do the concatenation at another position of the repetition, we only need to specify the order, and the rest of the formula doesn't change.
We iterate via MAP through the elements of column A and the index positions (s). The name f holds the filtered values from column B where column A is equal to the current value of the iteration, aa. We need to identify only filtered rows with a repetition, and the first condition, ROWS(f) > 1, ensures that.
The second condition identifies only the first element of the repetition:
INDEX(s, SMALL(IF(A=aa, s, n+1),1))=ss
The second argument of SMALL indicates we want the first smallest value, but it could be the second, third, etc.
Where A is equal to aa, IF assigns the corresponding value of the sequence (remember that IF works as an array formula); otherwise it assigns a value that will never be the smallest one, for example n+1, where n is the number of rows of column A (the same as column B). SMALL returns the smallest index position. If the current index position ss is not the smallest one, the condition is FALSE.
Finally, we do a TEXTJOIN only when both conditions are met (we multiply them to ensure an AND condition).
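For readers who find the logic easier to follow outside of Excel, here is a rough Python sketch of the same two conditions: the key occurs more than once, and the current row is its first occurrence. The function name join_duplicates and the sample data are invented for this illustration; it does not reproduce TEXTJOIN's handling of empty cells.

```python
from collections import Counter


def join_duplicates(keys, values, sep=", "):
    """For each row, return the joined values of its key, but only on the
    first row of a key that occurs more than once; otherwise return ''."""
    counts = Counter(keys)   # plays the role of COUNTIF(A, key)
    seen = set()             # keys whose first occurrence was already handled
    out = []
    for key in keys:
        if counts[key] > 1 and key not in seen:
            # Equivalent of TEXTJOIN over FILTER(B, A = key)
            joined = sep.join(str(v) for k, v in zip(keys, values) if k == key)
            out.append(joined)
        else:
            out.append("")
        seen.add(key)
    return out


# Hypothetical sample data, not taken from the question
keys = ["a", "b", "b", "c", "b"]
values = [1, 2, 3, 4, 5]
print(join_duplicates(keys, values))  # ['', '2, 3, 5', '', '', '']
```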

builtins.TypeError: list indices must be integers or slices, not list

The question is:
Write a function get_column(game, col_num) that takes a legal 3 x 3 game of noughts and crosses as explained above and returns a 3-element list containing the values from column number col_num, top to bottom. You may assume col_num is in the range 0 to 2 inclusive.
Hint: Since noughts and crosses is always played on a 3 x 3 grid, you don't need to handle general n x m grids. It is sufficient to just explicitly select the row and column elements you need, so you don't actually require a loop for this question. However, you're welcome to try using a loop to give yourself more practice.
Hence I want to retrieve any column which I mention in the function from a list of list.
Below code is what I tried
def get_column(game, col_num):
    """returns a 3-element list containing the values from column number
    col_num, top to bottom"""
    j = col_num
    result = []
    for i in game:
        result.append(game[i][j])
    return result
I won't try to solve your exercise for you, but I can tell you why you are getting the error.
Your loop
for i in game:
loops through the 3x3 list of lists, so it runs 3 times, namely
i = ['O', 'X', 'O'] # pass 1
i = ['X', '',''] # pass 2
i = ['X', '',''] # pass 3
So i is a list. You are then trying to use i to index a list in this statement
result.append(game[i][j])
but lists must be indexed with a single integer (0, 1 or 2), or a slice (like 0:1, 1:2, etc.).
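To make the explanation concrete, here is a small sketch that first reproduces the error and then shows one possible way to use the loop variable as the row it already is. The board is the one from the passes listed above; this is only one way to write it, and the exercise can equally be solved without a loop.

```python
game = [['O', 'X', 'O'],
        ['X', '', ''],
        ['X', '', '']]

# Indexing a list with a list reproduces the reported error.
try:
    game[game[0]]
except TypeError as error:
    message = str(error)
    print(message)


def get_column(game, col_num):
    """Return the values in column col_num, top to bottom."""
    result = []
    for row in game:                 # row is already a list such as ['O', 'X', 'O']
        result.append(row[col_num])  # so index the row, not game
    return result


print(get_column(game, 0))  # ['O', 'X', 'X']
```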

How to sort list by another list?

I have an Excel list of names (Datalist) and I need to sort it so that it is in exactly the same order as a similar list (Patternlist).
How can I sort Datalist to have the same order as Patternlist?
Patternlist(each letter is first cell in a row):
X
Y
Z
Q
Datalist(each letter is first cell in a row):
Q
X
Y
Z
Doing it manually:
1. Label each row in Patternlist with 1, 2, 3, ...
2. Use INDEX/MATCH to generate a 'sequence' list from the Datalist:
=index([Datalist_QXYZ],match([1st_named_cell],[Patternlist_XYZQ],0))
3. Copy the generated sequence list, paste it as values, then sort.
3b. If you need to actively generate a new list, use RANK() to sort it manually.
Hope it helps.
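The same idea (give every pattern entry a position number, then sort the data by that number) can be sketched in Python. The lists are the ones from the question; the sketch assumes every name in the data list also appears in the pattern list, otherwise the lookup raises a KeyError.

```python
pattern = ["X", "Y", "Z", "Q"]
data = ["Q", "X", "Y", "Z"]

# Step 1: label each pattern entry with its position (0, 1, 2, ...)
rank = {name: position for position, name in enumerate(pattern)}

# Steps 2 and 3: look up each data entry's position and sort by it
sorted_data = sorted(data, key=lambda name: rank[name])

print(sorted_data)  # ['X', 'Y', 'Z', 'Q']
```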

Union of cell array of cells

I'm looking for the way to do the union of two cell arrays of cell arrays of strings. For example:
A = {{'one' 'two'};{'three' 'four'};{'five' 'six'}};
B = {{'five' 'six'};{'seven' 'eight'};{'nine' 'ten'}};
And I'd like to get something like:
C = {{'one' 'two'};{'three' 'four'};{'five' 'six'};{'seven' 'eight'};{'nine' 'ten'}};
But when I use C = union(A, B) MATLAB returns an error saying:
Input A of class cell and input B of class cell must be cell arrays of strings, unless one is a string.
Does anyone know how to do something like this in a hopefully simple way? I'd greatly appreciate it.
ALTERNATIVE: A way to have a cell array of separated strings in any other way than a cell array of cell array of strings would be also useful, but as far as I know, it's not possible.
Thank you!
C=[A;B]
allWords=unique([A{:};B{:}])
F=cell2mat(cellfun(@(x)(ismember(allWords,x{1})+2*ismember(allWords,x{2}))',C,'uni',false))
[~,uniqueindices,~]=unique(F,'rows')
C(sort(uniqueindices))
What my code does: it builds a list of all words, allWords, and then uses this list to build a matrix that records which words each row contains (1 = match for the first word, 2 = match for the second word). Finally, unique can be applied to this numeric matrix to get the indices.
Including my update, the limit of 2 words per cell is now hardcoded. To get rid of this limitation it would be necessary to replace the anonymous function (@(x)(ismember(allWords,x{1})+2*ismember(allWords,x{2}))) with a more generic implementation, probably using cellfun again.
union does not seem to be compatible with cell arrays of cells, so we need to look for a workaround.
One approach would be to concatenate the data from A and B vertically. Then, along each column, assign each string a unique ID. Those IDs can then be combined into a double array, which opens up the possibility of using unique with the 'rows' option to get the desired output. This is precisely what is done here.
%// Slightly complicated input for safest verification of results
A = {{'three' 'four'};
{'five' 'six'};
{'five' 'seven'};
{'one' 'two'}};
B = {{'seven' 'eight'};
{'five' 'six'};
{'nine' 'ten'};
{'three' 'six'};};
t1 = [A ; B] %// concatenate all cells from A and B vertically
t2 = vertcat(t1{:}) %// Get all the cells of strings from A and B
t22 = mat2cell(t2,size(t2,1),ones(1,size(t2,2)));
[~,~,row_ind] = cellfun(@(x) unique(x,'stable'),t22,'uni',0)
mat1 = horzcat(row_ind{:})
[~,ind] = unique(mat1,'rows','stable')
out1 = t2(ind,:) %// output as a cell array of strings, used for verification too
out = mat2cell(out1, ones(1,size(out1,1)),size(out1,2)) %//desired output
Output -
out1 =
'three' 'four'
'five' 'six'
'five' 'seven'
'one' 'two'
'seven' 'eight'
'nine' 'ten'
'three' 'six'
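For comparison only, here is how the same order-preserving union could be sketched in Python, where rows can be converted to hashable tuples instead of numeric IDs. This is not a translation of the MATLAB code above; the function name union_rows is made up, and the inputs are the ones used for verification above.

```python
def union_rows(a, b):
    """Concatenate two lists of rows and drop repeated rows,
    keeping the first occurrence of each (like unique with 'stable')."""
    seen = set()
    out = []
    for row in a + b:
        key = tuple(row)   # lists are not hashable, tuples are
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


A = [['three', 'four'], ['five', 'six'], ['five', 'seven'], ['one', 'two']]
B = [['seven', 'eight'], ['five', 'six'], ['nine', 'ten'], ['three', 'six']]

for row in union_rows(A, B):
    print(row)
```

With these inputs the result has seven rows, in the same order as out1 above.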

Adding two matrices in python

def addM(a, b):
    res = []
    for i in range(len(a)):
        row = []
        for j in range(len(a[0])):
            row.append(a[i][j]+b[i][j])
        res.append(row)
    return res
I found this code here; it was written by @Petar Ivanov and adds two matrices. I really don't understand the 3rd line: why does he use len(a)? And in the 5th line, why does he use len(a[0])? Also, in the 6th line, why is it a[i][j]+b[i][j]?
The matrix here is a list of lists, for example a 2x2 matrix will look like: a=[[0,0],[0,0]]. Then it is easy to see:
len(a) - number of rows.
len(a[0]) - number of columns (since this is a matrix, the length of a[0] is the same as length of any a[i]).
This way, i is the row index, j is the column index, and a[i][j]+b[i][j] simply adds the elements that sit at the same position in the two matrices.
For all this to work, a and b must have the same shape (the same number of rows and the same number of columns).
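The shape requirement can be made explicit in code. The following is a sketch of an alternative formulation, not the original author's version: it pairs up rows and cells with zip instead of indexing, and raises an error when the shapes differ. The name add_matrices and the choice of ValueError are assumptions made for the example.

```python
def add_matrices(a, b):
    """Add two matrices stored as lists of rows, element by element."""
    same_rows = len(a) == len(b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(a, b))
    if not (same_rows and same_cols):
        raise ValueError("matrices must have the same shape")
    # zip(a, b) pairs row i of a with row i of b;
    # zip(ra, rb) pairs the cells in column j of those two rows.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


print(add_matrices([[1, 2], [3, 4]], [[10, 20], [30, 40]]))  # [[11, 22], [33, 44]]
```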
