How to interpret a Tcl "list as string" as a list?

For some unknown reason, the variable result in the following line of code
set result [[$sqlCmd execute] allrows -as lists]
gets a string that looks like a list: {2 3 4 5}
If I write puts "result $result => [llength $result]", it prints {2 3 4 5} => 1.
If I write puts [list $result], it prints {{2 3 4 5}}, which is correct because list creates a list from one string.
Is there any way to convert this string into what it is expected to be, a list, without string-processing steps such as deleting the braces and splitting the string with the split function? I suppose there must be some way to reinterpret it, but I'm unable to find a nice solution.

The allrows method always returns a list, one per row (even when there's only a single row returned). When the -as lists option is passed in, each element of that list is itself a list representing the columns in that row.
Thus, to iterate over the columns of that row, you'd do:
set result [[$sqlCmd execute] allrows -as lists]
set rowresult [lindex $result 0]
foreach col $rowresult {
    puts "I've got a '$col'"
}
You're usually recommended to use the default that represents rows as dictionaries indexed by column name, as that has a better representation of SQL NULLs (i.e., the column is absent then instead of being the driver-designated null value, which is often and ambiguously the empty string).

allrows is giving you a list of lists, each sublist representing a row. There is 1 row in this list, 2 3 4 5, so the length is 1. You can index or iterate over the list the usual ways to access its one element.
# If you're assuming there will only be one row
set only_row [lindex $result 0]
# Or if you want to iterate over all rows
foreach row $result {
    puts $row    ;# do whatever with $row
}
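For readers more at home in Python, the same shape can be sketched as a list of rows in which each row is itself a list. This is only an analogy and not Tcl database code; the values are the ones from the question.

```python
# Analogy only: a one-row result set modelled as a list of rows.
result = [[2, 3, 4, 5]]

# The outer list has one element (the single row), not four.
print(len(result))

# Pick the row first, then iterate over its columns.
only_row = result[0]
for col in only_row:
    print(f"I've got a '{col}'")
```

The length of 1 reported in the question is therefore expected: it counts rows, not columns.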

Related

TCL, extract 2 integers from string into list?

I have 2 strings formatted as such:
(1234, 4567)
And I have a list
points {0 1 2 4}
I would like to extract 2 integers from the first string and use them to replace the first two integers in the list, then extract two more integers from the 2nd string and use them to replace the 3rd and 4th integers in the list, so that at the end I have a list of 4 integers taken from the two strings.
So far I have tried all kinds of things but always end up with errors or with brackets in the list, which I do not want. I feel I am missing the easy way to do this.
With the first set of values, you can parse with scan or regexp; in this case, I think scan looks better:
set input "(1234, 5678)"
scan $input "(%d,%d)" a b
To update a Tcl list (formally, one in a variable), you use lset; you can give a sequence of (zero-based) indices to it to navigate into the exact place in the list where you want to update:
set workingArea "points {0 1 2 4}"
lset workingArea 1 2 $a
lset workingArea 1 3 $b
puts $workingArea
# prints: points {0 1 1234 5678}
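The same parse-then-update idea can be sketched in Python. This is a rough equivalent rather than a translation of the Tcl above: the helper name parse_pair and the second input string are made up for illustration, and a regular expression stands in for scan.

```python
import re

def parse_pair(text):
    """Extract two integers from a string shaped like '(1234, 4567)'."""
    match = re.fullmatch(r"\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)", text.strip())
    if match is None:
        raise ValueError(f"not a '(a, b)' pair: {text!r}")
    return int(match.group(1)), int(match.group(2))

points = [0, 1, 2, 4]

# Replace the first two entries, then the third and fourth.
points[0:2] = parse_pair("(1234, 4567)")
points[2:4] = parse_pair("(89, 10)")   # hypothetical second string

print(points)
```

Slice assignment replaces the elements in place, so no brackets or nested lists end up in the result.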

About lists in python

I have an Excel file with a column whose values span multiple rows in this format: 25/02/2016. I want to save all these rows of dates in a list, with each row as a separate value. How do I do this? So far this is my code:
import openpyxl
wb = openpyxl.load_workbook ('LOTERIAREAL.xlsx')
sheet = wb.get_active_sheet()
rowsnum = sheet.get_highest_row()
wholeNum = []
for n in range(1, rowsnum):
    wholeNum = sheet.cell(row=n, column=1).value
print (wholeNum[0])
When I use the print statement, instead of printing the value of the first row, which should be the first item in the list (e.g. 25/02/2016), it prints the first character of the row, which is the number 2. Apparently it is slicing through the date. I want the first row and subsequent rows saved as separate items in the list. What am I doing wrong? Thanks in advance.
wholeNum = sheet.cell(row=n, column=1).value assigns the value of the cell to the variable wholeNum, so you never add anything to the initial empty list; you just overwrite the value each time. When you evaluate wholeNum[0] at the end, wholeNum is the last string that was read, and you get its first character.
You probably want wholeNum.append(sheet.cell(row=n, column=1).value) to accumulate a list.
wholeNum =
This is an assignment. It makes the name wholeNum refer to whatever object the expression to the right of the = operator evaluates to.
for ...:
    wholeNum = ...
Performing assignment in a loop is frequently not useful. The name wholeNum will refer to whatever value was assigned to it in the last iteration of the loop. The other iterations have no discernible effect.
To append values to a list, use the .append() method.
for ...:
    wholeNum.append( ... )
print( wholeNum )
print( wholeNum[0] )
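To see the difference without a workbook at hand, here is a small self-contained sketch. The list column_values is a made-up stand-in for the cell values; with openpyxl each value would instead come from sheet.cell(row=n, column=1).value, looping over range(1, sheet.max_row + 1) so that the last row is included.

```python
# Hypothetical stand-in for the values in column A of the sheet.
column_values = ["25/02/2016", "26/02/2016", "27/02/2016"]

# Assignment in the loop: only the last value survives, as a plain string.
overwritten = []
for value in column_values:
    overwritten = value

# Appending in the loop: every value is kept as a separate list item.
whole_num = []
for value in column_values:
    whole_num.append(value)

print(overwritten[0])   # first character of the last string
print(whole_num[0])     # first item of the list
```

Indexing a string gives a character and indexing a list gives an item, which is why the original code printed 2.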

Count number of occurrences of a string and relabel

I have a n x 1 cell that contains something like this:
chair
chair
chair
chair
table
table
table
table
bike
bike
bike
bike
pen
pen
pen
pen
chair
chair
chair
chair
table
table
etc.
I would like to rename these elements so they will reflect the number of occurrences up to that point. The output should look like this:
chair_1
chair_2
chair_3
chair_4
table_1
table_2
table_3
table_4
bike_1
bike_2
bike_3
bike_4
pen_1
pen_2
pen_3
pen_4
chair_5
chair_6
chair_7
chair_8
table_5
table_6
etc.
Please note that the underscore (_) is necessary. Could anyone help? Thank you.
Interesting problem! This is the procedure that I would try:
1. Use unique, and its third output parameter in particular, to assign each string in your cell array a unique ID.
2. Initialize an empty array, then create a for loop that goes through each unique string (given by the first output of unique) and creates a numerical sequence from 1 up to as many times as we have encountered this string. Place this numerical sequence in the corresponding positions where we have found each string.
3. Use strcat to attach each element in the array created in Step #2 to each cell array element in your problem.
Step #1
Assuming that your cell array is defined as a bunch of strings stored in A, we would call unique this way:
[names, ~, ids] = unique(A, 'stable');
The 'stable' is important as the IDs that get assigned to each unique string are done without re-ordering the elements in alphabetical order, which is important to get the job done. names will store the unique names found in your array A while ids would contain unique IDs for each string that is encountered. For your example, this is what names and ids would be:
names =
'chair'
'table'
'bike'
'pen'
ids =
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
1
1
1
1
2
2
names is actually not needed in this algorithm. However, I have shown it here so you can see how unique works. Also, ids is very useful because it assigns a unique ID for each string that is encountered. As such, chair gets assigned the ID 1, followed by table getting assigned the ID of 2, etc. These IDs will be important because we will use these IDs to find the exact locations of where each unique string is located so that we can assign those linear numerical ranges that you desire. These locations will get stored in an array computed in the next step.
Step #2
Let's pre-allocate this array for efficiency. Let's call it loc. Then, your code would look something like this:
loc = zeros(numel(A), 1);
for idx = 1 : numel(names)
    id = find(ids == idx);
    loc(id) = 1 : numel(id);
end
As such, for each unique name we find, we look for every location in the ids array that matches this particular name found. find will help us find those locations in ids that match a particular name. Once we find these locations, we simply assign an increasing linear sequence from 1 up to as many names as we have found to these locations in loc. The output of loc in your example would be:
loc =
1
2
3
4
1
2
3
4
1
2
3
4
1
2
3
4
5
6
7
8
5
6
Notice that this corresponds with the numerical sequence (the right most part of each string) of your desired output.
Step #3
Now all we have to do is piece loc together with each string in our cell array. We would thus do it like so:
out = strcat(A, '_', num2str(loc));
This takes each element in A, concatenates a _ character, and then attaches the corresponding number to the end of that element. Because we want to output strings, the numbers stored in loc must first be converted into their string equivalents, which is what num2str does. The output is stored in out, and we thus get:
out =
'chair_1'
'chair_2'
'chair_3'
'chair_4'
'table_1'
'table_2'
'table_3'
'table_4'
'bike_1'
'bike_2'
'bike_3'
'bike_4'
'pen_1'
'pen_2'
'pen_3'
'pen_4'
'chair_5'
'chair_6'
'chair_7'
'chair_8'
'table_5'
'table_6'
For your copying and pasting pleasure, this is the full code. Be advised that I've nulled out the first output of unique as we don't need it for your desired output; since names is then no longer available, the loop runs up to max(ids), which is the number of unique strings:
[~, ~, ids] = unique(A, 'stable');
loc = zeros(numel(A), 1);
for idx = 1 : max(ids)
    id = find(ids == idx);
    loc(id) = 1 : numel(id);
end
out = strcat(A, '_', num2str(loc));
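For comparison, the three steps can be mirrored in Python. This is only a sketch of the same idea, not the MATLAB code itself: the function name relabel_by_ids is invented here, and a dictionary of first appearances plays the role of unique with 'stable'.

```python
def relabel_by_ids(items):
    """Mirror the three steps: stable ids, per-id sequences, joined labels."""
    # Step 1: give each distinct string an id in order of first appearance.
    first_seen = {}
    ids = []
    for item in items:
        if item not in first_seen:
            first_seen[item] = len(first_seen) + 1
        ids.append(first_seen[item])

    # Step 2: for each id, write 1, 2, 3, ... into the positions holding it.
    loc = [0] * len(items)
    for idx in range(1, len(first_seen) + 1):
        positions = [p for p, i in enumerate(ids) if i == idx]
        for k, p in enumerate(positions, start=1):
            loc[p] = k

    # Step 3: join each string with its running number.
    out = [f"{item}_{n}" for item, n in zip(items, loc)]
    return ids, loc, out

A = ["chair", "chair", "table", "bike", "chair", "table"]
ids, loc, out = relabel_by_ids(A)
print(ids)
print(loc)
print(out)
```

Returning ids and loc alongside out makes it easy to compare the intermediate arrays with the ones shown in Steps #1 and #2.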
If you want an alternative to unique, you can work with a hash table, which in MATLAB means using the containers.Map object. You can then store the number of occurrences of each individual label and create the new labels on the go, as in the code below.
data={'table','table','chair','bike','bike','bike'};
map=containers.Map(data,zeros(numel(data),1)); % labels=keys, counts=values (zeroed)
new_data=data; % initialize cell array that will hold the outputs
for ii=1:numel(data)
    map(data{ii}) = map(data{ii})+1; % increment count of the current label
    new_data{ii} = sprintf('%s_%d',data{ii},map(data{ii})); % format outputs
end
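The hash-table idea carries over directly to other languages. As a minimal Python sketch (the function name relabel is made up for this illustration), a dictionary keeps the running count per label:

```python
from collections import defaultdict

def relabel(items):
    """Append a running per-label occurrence count to each item."""
    counts = defaultdict(int)   # label -> number of times seen so far
    out = []
    for item in items:
        counts[item] += 1
        out.append(f"{item}_{counts[item]}")
    return out

data = ["table", "table", "chair", "bike", "bike", "bike"]
print(relabel(data))
```

Each element is visited once and each lookup is a single dictionary access, so the work grows linearly with the number of labels.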
This is similar to rayryeng's answer but replaces the for loop by bsxfun. After the strings have been reduced to unique labels (line 1 of code below), bsxfun is applied to create a matrix of pairwise comparisons between all (possibly repeated) labels. Keeping only the lower "half" of that matrix and summing along rows gives how many times each label has previously appeared (line 2). Finally, this is appended to each original string (line 3).
Let your cell array of strings be denoted as c.
[~, ~, labels] = unique(c); %// transform each string into a unique label
s = sum(tril(bsxfun(@eq, labels, labels.')), 2); %'// accumulated occurrence number
result = strcat(c, '_', num2str(s)); %// build result
Alternatively, the second line could be replaced by the more memory-efficient
n = numel(labels);
M = cumsum(full(sparse(1:n, labels, 1)));
s = M((1:n).' + (labels-1)*n);
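To make the lower-triangle trick concrete, here is a plain-Python sketch of the same computation. The name occurrence_numbers is hypothetical, and nested loops stand in for the vectorised bsxfun/tril call, so this shows the logic rather than the performance.

```python
def occurrence_numbers(labels):
    """For each position i, count positions j <= i that hold the same label.

    This is the row sum of the lower triangle (diagonal included) of the
    pairwise-equality matrix labels[i] == labels[j].
    """
    n = len(labels)
    return [
        sum(1 for j in range(i + 1) if labels[j] == labels[i])
        for i in range(n)
    ]

labels = [1, 1, 2, 1, 3, 2]
print(occurrence_numbers(labels))
```

Because the diagonal is included, the first occurrence of a label gets 1 rather than 0, which is what the relabelling needs.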
I'll give you some pseudocode; try it yourself, and post your code if it doesn't work.
Initialize a counter to 1
Iterate over the cell
    If past the first element, check whether the string is the same as the previous value
        Yes: increment the counter
    else
        No: reset the counter to 1
    end
    sprintf the string value + counter into a new array
Hope this helps!

Importing Two Sets of Data from Same Excel Sheet In MATLAB

I am working on a small project for a professor at the university who needs data sorted through MATLAB with various other operations done to the data. I have read in the data with no problem using:
filename = 'file.xlsx';
data = xlsread(filename)
This imports all the data into one big matrix. Within the file itself, the data is divided into 2 main categories, left knee and right knee.
My issue is that I have tried to separate the data but have not had any luck. Since the two sets are NOT divided into equal numbers of rows, I can't use a simple array to select the different columns. The green columns are set one and the gold columns are set two. Is there a way to look at the second row to see whether it's left or right and then put the data into different sets that way? Or is there a better way to do this?
Saw your screenshot ... you've GOT the left or right knee designation right there in the column header.
But xlsread doesn't give the column headers, only the numbers ... or does it?
From Matlab help for xlsread:
[ndata, text, alldata] = xlsread('myExample.xlsx')
ndata =
     1     2     3
     4     5   NaN
     7     8     9
text =
    'First'    'Second'    'Third'
    ''         ''          ''
    ''         ''          'x'
alldata =
    'First'    'Second'    'Third'
    [    1]    [     2]    [    3]
    [    4]    [     5]    'x'
    [    7]    [     8]    [    9]
xlsread returns numeric data in array ndata, text data in cell array text, and unprocessed data in cell array alldata.
So, right now you are getting "ndata" but you want to get "text" too. Add one more output argument to the xlsread call and you should get it.
[data, text, ~] = xlsread(filename); % the ~ just means throw away that third output
Then you can use strfind or strcmp on the appropriate row to pull out "Left" or "Right."
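The header-matching step can be sketched independently of MATLAB. In the Python below, the header strings and numbers are invented stand-ins for the spreadsheet contents, and split_by_side is a hypothetical helper; the point is only to show columns being selected by the text in their header.

```python
def split_by_side(headers, rows):
    """Split numeric rows into left/right column sets based on header text."""
    left_cols = [i for i, h in enumerate(headers) if "left" in h.lower()]
    right_cols = [i for i, h in enumerate(headers) if "right" in h.lower()]
    left = [[row[i] for i in left_cols] for row in rows]
    right = [[row[i] for i in right_cols] for row in rows]
    return left, right

# Made-up header row and data rows standing in for the sheet.
headers = ["Left Knee", "Left Knee", "Right Knee"]
rows = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

left, right = split_by_side(headers, rows)
print(left)
print(right)
```

Selecting by header text means the two sets do not need to have the same number of columns.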
If you know the data is placed in a specific range you can do this for each set of data:
filename = 'file.xlsx';
sheet = 1;
xlRange = 'B2:C3';
subsetA = xlsread(filename, sheet, xlRange)
or you can read a column of data:
columnB = xlsread(filename,'B:B')
otherwise you should separate the data after loading it into data array as you did before.

Grouping common elements using awk

The following table illustrates a brief snapshot of the data that I wish to manipulate. I am looking for an awk script that will group similar elements into one group. For example, if you look at the table below:
Numbers (1,2,3,4,6) should all belong to one group. So row1 row2 row4 row8 will be group "1"
Number 9 is unique and does not have any common elements. So it will reside alone in a separate group say group 2
Similarly numbers 5,7 will reside in one group say group 3 and so on...
The file:
heading1 heading2 numberlist group
name1 text 1,2,3 1
name2 text 2 1
name3 text 9 2
name4 text 1,4 1
name5 text 5,7 3
name6 text 7 3
name7 text 8 4
name8 text 6,2 1
I was searching for queries similar to mine and found this link. Grouping lists by common elements. But the solution is in C++ and not awk, which is my primary requirement.
Incidentally I also found this awk solution that is somewhat related to my query but it was devoid of handling of comma separated values.
awk script grouping with array
Numberlist i.e. $3 is my only consideration for grouping.
This problem seemed almost the same as one of my own problems, and I used one column from your example to solve mine :) So...
[[bash_prompt$]]$ cat log ; echo "########"; \
> cat test.sh ;echo "########"; awk -f test.sh log
heading1 heading2 numberlist group
name1 text 1,2,3
name2 text 2
name3 text 9
name4 text 1,4
name5 text 5,7
name6 text 7
name7 text 8
name8 text 6,2
########
/^name/{
    i=0; j=0;
    split($3,a,",");
    for(var in a) {
        for(var1 in q) {
            split(q[var1],r,",");
            for(var2 in r) {
                if(r[var2] == a[var]) {
                    i=1;
                    j=((var1+1));
                }
            }
        }
    }
    if(i == 0) {
        q[length(q)] = $3;
        j=length(q);
    }
    print $1 "\t\t" $2 " \t\t" $3 "\t\t" j;
}
########
name1 text 1,2,3 1
name2 text 2 1
name3 text 9 2
name4 text 1,4 1
name5 text 5,7 3
name6 text 7 3
name7 text 8 4
name8 text 6,2 1
[[bash_prompt$]]$
Update:
split splits its first argument on the delimiter given as the third argument and stores the pieces in the array named by the second argument. The main array here is q, which holds the members of each group. It's basically an array of arrays, where the index of an element is the group id and the element is the collection of all members of that group. So q[0]="1,2,3" indicates that group 0 contains the members 1, 2 and 3.
In awk, a line that starts with name is read first (/^name/). Then the 3rd field (1,2,3) is broken down into an array a. For each element in a, we go through each group stored in q (for(var1 in q)), and inside each group we split the members into another temporary array r (split(q[var1],r,",")), i.e. "1,2,3" is split into the array r. Each element in r is then compared to the element in a. If a match is found, the group's index becomes the index of that row (array indices start from 0 and group ids from 1, hence ((var1+1))). If no match is found, the field is added as a new group in q, and the last index + 1, i.e. the length of the array, becomes the index for the row.
Update:
/^name/{
    j=0;
    split($3,a,",");
    for(var in a) {
        if(q[a[var]] != 0) {
            j=q[a[var]]; i=1;
            break;
        }
    }
    j = (j == 0) ? ++k : j;
    for(var1 in a) {
        if(q[a[var1]] == 0) {
            q[a[var1]] = j;
        }
    }
    print $1 "\t\t" $2 " \t\t" $3 "\t\t" j;
}
Update:
The basis is that awk has associative arrays, in which each element is accessed by a string key. The earlier approach stored each group in an array where the key is the index of the group. So when reading a column, we would read each group, split the group into individual elements, and compare each of those elements with each element of the column.
Instead of storing a group, we now store the elements in an array where the key is the element itself and the value at that key is the index of the group to which the element belongs. So when we read a column, we split it into individual elements (split($3,a,",");) and then check whether the array holds a group index under that element as key, in if(q[a[var]] != 0) (in awk, referencing an element that is not there creates it with a value that compares equal to 0, hence the check q[a[var]] != 0). If any element is found, we take that element's group index as the index of the column and break; otherwise j remains 0. If j remains 0, ++k gives the next new group index.
Now we have the group index for the column's elements, and we need to carry that index over to those elements which are not yet part of any group. There will be cases where multiple elements in the same column belong to different groups; here we take a first come, first served approach, but we do not overwrite the group index of elements already belonging to another group. So for each element in the column (for(var1 in a)), if it does not belong to a group (if(q[a[var1]] == 0)), we give it the group index with q[a[var1]] = j;.
Here every access is a direct lookup, because we use the elements themselves as keys. Thus there is no breaking up of a group again and again for every element, and hence a shorter running time. My first approach was based on one of my own problems (mentioned in the first line), which needed more complex processing on a smaller data set. This one only required simpler, more straightforward logic.
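For readers who want to trace the second approach outside awk, here is a rough Python sketch of the same logic. The names assign_groups, group_of and next_group are invented for this illustration, and a dictionary takes the place of the associative array q. One difference worth noting: Python walks the elements in their written order, whereas the order of for (var in a) in awk is not specified.

```python
def assign_groups(number_lists):
    """Assign a group id to each comma-separated list of elements.

    group_of maps an element to its group id. A row takes the group of the
    first of its elements that is already known, otherwise a new group.
    """
    group_of = {}
    next_group = 0
    result = []
    for numbers in number_lists:
        elements = numbers.split(",")

        # Look for any element that already belongs to a group.
        group = 0
        for element in elements:
            if element in group_of:
                group = group_of[element]
                break

        # No known element: open a new group.
        if group == 0:
            next_group += 1
            group = next_group

        # First come, first served: only unassigned elements get this group.
        for element in elements:
            group_of.setdefault(element, group)

        result.append(group)
    return result

numberlist = ["1,2,3", "2", "9", "1,4", "5,7", "7", "8", "6,2"]
print(assign_groups(numberlist))
```

With the numberlist column from the question this reproduces the group column shown there.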

Resources