I am trying to find duplicate pairs across multiple columns in Excel.
abc def 1 <-duplicate 1
ael fjw 1
dlf qwr 1
cvz god 1 <-duplicate 2
abc def -1 <-duplicate 1
slf erw -1
def abc -1 <-duplicate 1
god cvz -1 <-duplicate 2
cnv odf -1
After that, I need to remove the duplicated pairs that have the value -1.
I tried the approach from the post "excel duplicate values pairs in multiple column", but it gave an unexpected result.
If this is hard to do in Excel, a solution in Python or R is fine too.
In particular, I checked the post "Removing duplicate interaction pairs in python sets", which covers a similar problem in Python.
But that example deals with numerical values.
Also, if there are any problems with my question, please correct them.
Assuming your first row of data is in A1:C1, enter this formula in D1:
=IF(AND(SUM(COUNTIFS(A$1:A1,INDEX(A1:B1,{1;2}),B$1:B1,INDEX(A1:B1,{2;1})))>1,C1=-1),"Delete","")
and copy it down.
If your version of Excel does not use the semicolon as row- or column-separator within array constants then the parts
{1;2}
and
{2;1}
will require amendment.
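Since the question also allows Python, the same rule can be sketched there. This is a minimal sketch under assumptions, not a drop-in solution: the function name `drop_negative_duplicates` and the hard-coded rows are illustrative. A row counts as a duplicate when the same two names already appeared in an earlier row, in either order, and it is dropped only when its value is -1, which mirrors the "Delete" flag of the formula above.

```python
def drop_negative_duplicates(rows):
    """Keep every row unless its unordered pair was seen earlier AND its value is -1."""
    seen = set()
    kept = []
    for first, second, value in rows:
        # Sorting the two names makes ("abc", "def") and ("def", "abc") the same key.
        key = tuple(sorted((first, second)))
        is_repeat = key in seen
        seen.add(key)
        if is_repeat and value == -1:
            continue  # corresponds to a row flagged "Delete" by the formula
        kept.append((first, second, value))
    return kept


# Sample data copied from the question.
rows = [
    ("abc", "def", 1),
    ("ael", "fjw", 1),
    ("dlf", "qwr", 1),
    ("cvz", "god", 1),
    ("abc", "def", -1),
    ("slf", "erw", -1),
    ("def", "abc", -1),
    ("god", "cvz", -1),
    ("cnv", "odf", -1),
]

result = drop_negative_duplicates(rows)
for row in result:
    print(*row)
```

For the sample data this keeps six rows: the four rows with value 1 plus "slf erw -1" and "cnv odf -1", which have no earlier partner.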
Hi everyone,
I started with Python a couple of days ago because I need to process some Excel data and automatically copy the values of certain cells from one file into another.
However, I'm kind of stuck since I have barely programmed before and this is my first time using Python. My job requires me to find a solution, so I'm trying to make it work even though it's not my field of expertise.
I used the xlrd library, opened my file and managed to print the columns I need. However, I can't find a way to put those columns into a matrix so that I can handle the data like this:
Matrix = [DataColumnA DataColumnG DataColumnH] with size [nrows x 3]
For now, I have 3 separate outputs for the 3 columns I need, but I'm trying to join them into one big matrix.
So far my code looks like this:
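import xlrd

# Open both workbooks and select the sheets by name.
# Note: xlrd 2.0 and later only read .xls files, so the .xlsx file below
# needs an older xlrd version or a different library.
workbook = xlrd.open_workbook("190219_serviciosWRAmanualV5.xls")
worksheet = workbook.sheet_by_name("ServiciosDWDM")
workbook2 = xlrd.open_workbook("Potencia2.xlsx")
worksheet2 = workbook2.sheet_by_name("Hoja1")

# Row and column counts ("filas" = rows, "columnas" = columns)
filas = worksheet.nrows
filas2 = worksheet2.nrows
columnas = worksheet.ncols

# Print three columns for every row, skipping the first two rows
for row in range(2, filas):
    Equipo_A = worksheet.cell(row, 12).value
    Client_A = worksheet.cell(row, 13).value
    Line_A = worksheet.cell(row, 14).value
    print(Equipo_A, Line_A, Client_A)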
So far, as mentioned above, I have only retrieved the data from those columns, which is what I print.
What I mainly need to do is read the cell in the first row of Column A and look for it in the other Excel file. If the names match, I have to check that, for the same row in file 1, the data in Column G and Column H is the same as the data in the second file.
If they match, I have to update Column J in the first file with the data from the second file.
My other approach is to retrieve the value of the cell in Column A and look for it in Column A of the second file, then use an if statement to check whether Columns G and H are equal to Column C of the second file, and so on.
The problem is that I have no idea how to pinpoint the position of the cell and extract the data to write the condition for this second approach.
I'm not sure whether building that matrix is a good approach or whether the second way is better, so any suggestion would be much appreciated.
Thank you in advance!
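One possible way to combine both ideas, as a rough sketch under assumptions: build the nrows x 3 matrix with `zip`, index the second file by name with a dictionary, and then compare row by row. The helper names, the sample values and the layout of the second file (name, two data columns, new value) are hypothetical, since the question does not spell them out. With xlrd, each column list could come from `worksheet.col_values(colx, start_rowx)`.

```python
def build_matrix(col_a, col_g, col_h):
    """Join three equally long columns into a list of [a, g, h] rows (nrows x 3)."""
    return [list(row) for row in zip(col_a, col_g, col_h)]


def index_by_name(rows2):
    """Map the name in the first column of file 2 to its full row."""
    return {row[0]: row for row in rows2}


def find_updates(matrix, lookup):
    """Return (row_index, new_value) for each row of file 1 whose name and
    two data columns match a row of file 2."""
    updates = []
    for i, (name, g, h) in enumerate(matrix):
        other = lookup.get(name)
        if other is not None and (g, h) == (other[1], other[2]):
            updates.append((i, other[3]))
    return updates


# Hypothetical data standing in for the values read from the two sheets.
col_a = ["N1", "N2", "N3"]
col_g = [1.0, 2.0, 3.0]
col_h = [4.0, 5.0, 6.0]
rows2 = [
    ["N1", 1.0, 4.0, -10.5],  # same name and same data: update
    ["N2", 2.0, 9.9, -11.0],  # same name, different data: no update
    ["N4", 3.0, 6.0, -12.0],  # name not present in file 1
]

matrix = build_matrix(col_a, col_g, col_h)
updates = find_updates(matrix, index_by_name(rows2))
print(updates)
```

The dictionary avoids searching the second file from the top for every row of the first one. Writing the new values back to Column J is left out of the sketch.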
I am trying to figure out how to specify a common range for the xlsread() function in MATLAB.
Usually I use n=xlsread('filename','#sheet','A1:A10'), but I have quite a bit of data in the same sheet and I'd like to know if I can specify it with one range, i.e. if all my data is in rows '1:10', I want to specify 1:10 as the range and pass only the column letter for each column.
I was thinking to do it as follows:
function [a,b,c]=getdata(filename,'1:10')
a=xlsread(filename,1,'A:A'???)
b=xlsread(filename,1,'B:B'???)
c=xlsread(filename,1,'C:C'???)
end
After some research I could not find any information as to how this is done.
Thanks in advance,
Greg
If you want to read rows 1 to 10 of column A, use:
data = xlsread(filename, 1, 'A1:A10');
If you want to read rows 1 to 10 of all columns, use:
data = xlsread(filename, 1, '1:10');
If you want to read rows 1 to 10 of, say, the first three columns A, B, and C, use:
data = xlsread(filename, 1, 'A1:C10');
Using dynamic variable names is always a bad idea. Read this for explanation. But if you still want to create a, b, and c and so on depending on the number of columns in the Excel file, you can use:
for k=1:size(data,2)
    assignin('caller', char(96+k), data(:,k)); %or char(64+k) for block letters
end
The above works if the number of columns is less than or equal to 26. This may only be feasible if you're dealing with a few columns. I still recommend avoiding it.
I'm still fairly new to MATLAB. I picked up this data analysis code from someone and had to add new functions.
For one function I calculate the average of every 3 entries in one column and print the result in another column, so it looks something like this:
1 -1
3 -1
5 =(1+3+5)/3
7 -1
1 -1
1 =(7+1+1)/3
4 -1
What I want to do is print a blank in the cells that have -1. My first thought was to assign string values to my results instead of ints. This didn't work, I think because there is a line of code somewhere that converts everything to ints.
Another possible solution is to reopen the file and loop through all cells, replacing any -1's with blank strings, though I'm not sure how to do this, and it's inefficient.
As a last resort, I guess I can always tell the user of this xls sheet to use the find/replace function in Excel before processing it.
Edit: partial code of the save part:
data = [data.time, data.avg_time'];
data2 = num2cell(data);
data3 = {'t', 'avg t'};
data = [data3; data2];
xlswrite([filename, '.xls'], data);
I misunderstood your question at first (I thought you wanted to replace NaNs with -1; thanks, Amro).
You can use this:
A(A(:,2)==-1,2)=NaN
where A is the matrix you created first.
Hope it helps you :)
I need to be able to search my whole table for a row that matches multiple criteria. We use a program that outputs data in the form of a .csv file. It has header rows that separate sets of data. None of these header rows has a column that is unique by itself, but if I search the table for multiple values I should be able to pinpoint each header row. I know I can use Application.WorksheetFunction.Match to return a row on a single criterion, but I need to search on two, three or four criteria.
In pseudo-code it would be something like this:
Return row number where column A = bill & column B = Woods & column C = some other data
We need to work with arrays.
There are 2 kinds of arrays:
numeric {1,0,1,1,1,0,0,1}
boolean {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
To convert between them we can use:
a comparison (numeric to boolean)
1={1,0,1,1,1,0,0,1} -> results in {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
simple multiplication (boolean to numeric)
{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}*{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE} -> results in {1,0,1,1,1,0,0,1}
You can inspect an array through the MATCH function dialog, as described at the end of this answer. Be warned that MATCH treats an array like an "OR": a single match is enough, and it returns one result (the position of the first match), not an array,
i.e.:
MATCH(1,{1,0,1,1,1,0,0,1},0) returns 1
You must commit with Ctrl+Shift+Enter for an array formula to give an array back.
In the example below I want to sum, per case, the hours of all employees except the admin.
We have 2 options: the long simple way and the fast complicated way.
The long simple way
D2=SUMPRODUCT(C2:C9,(A2=A2:A9)*("admin"<>B2:B9)) <<- SUMPRODUCT multiplies the arrays element by element and sums the products
Basically D2={2,3,11,3,2,4,5,6}*{0,1,1,0,0,0,0,0} (the array on the right must be numeric in SUMPRODUCT)
i.e.: D2=2*0+3*1+11*1+3*0+2*0+4*0+5*0+6*0
This causes a problem: if you drag the cell to autofill the rest of the cells, the lower and upper bounds of the ranges shift as well,
i.e.: D9=SUMPRODUCT(C9:C16,(A9=A9:A16)*("admin"<>B9:B16)), which is out of bounds.
The same happens if you have a table and want to view the results in a different order.
The fast complicated way
D3=SUMPRODUCT(INDIRECT("c2:c9"),(A3=INDIRECT("a2:a9"))*("admin"<>INDIRECT("b2:b9")))
It is the same, except that INDIRECT is used on the ranges that should not change when autofilling or reordering the table.
Be warned that INDIRECT is a volatile function and sometimes gives errors. I recommend not using it on a single cell, or using it only once in an array.
I can't post pictures, so here is the table:
case employee hours totalHoursPerCaseWithoutAdmin
1 admin 2 14
1 him 3 14
1 her 11 14
2 him 3 5
2 her 2 5
3 you 4 10
3 admin 5 10
3 her 6 10
To check the arrays, click the Insert Function button (it looks like an fx), double-click MATCH, and enter an expression in the Lookup_array field. For our example,
A2=A2:A9 gives {TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE}, because only the first 3 rows belong to case 1.
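To make the mask arithmetic concrete, here is a small Python sketch of what the SUMPRODUCT formula computes. It is only an illustration of the logic: the function name `total_hours_without_admin` is made up, and the table is copied from the answer above. Multiplying two booleans in Python gives 0 or 1, which plays the role of the numeric array.

```python
# (case, employee, hours) rows copied from the table above.
table = [
    (1, "admin", 2),
    (1, "him", 3),
    (1, "her", 11),
    (2, "him", 3),
    (2, "her", 2),
    (3, "you", 4),
    (3, "admin", 5),
    (3, "her", 6),
]


def total_hours_without_admin(rows, case):
    """Sum hours where the case matches and the employee is not the admin."""
    total = 0
    for row_case, employee, hours in rows:
        # The product of the two comparisons is 1 only when both are true.
        mask = (row_case == case) * (employee != "admin")
        total += hours * mask
    return total


totals = {case: total_hours_without_admin(table, case) for case in (1, 2, 3)}
print(totals)
```

The totals match the last column of the table: 14 for case 1, 5 for case 2 and 10 for case 3.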
Something like this?
Assuming that your data is in A1:C20
I am looking for "Bill" in A, "Woods" in B and "some other data" in C
Change as applicable
=IF(INDEX(A1:A20,MATCH("Bill",A1:A20,0),1)="Bill",IF(INDEX(B1:B20,MATCH("Woods",B1:B20,0),1)="Woods",IF(INDEX(C1:C20,MATCH("some other data",C1:C20,0),1)="some other data",MATCH("Bill",A1:A20,0),"Not Found")))
I would use this array* formula (for three criteria):
=MATCH(1,((Range1=Criterion1)*(Range2=Criterion2)*(Range3=Criterion3)),0)
*commit with Ctrl+Shift+Enter
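The logic behind this array formula can be sketched in Python: multiplying the comparisons amounts to requiring every criterion to hold on the same row, and MATCH(1,...,0) returns the position of the first such row. The function name `match_all` and the sample columns are hypothetical. Note that Python's `==` is case-sensitive, whereas Excel's `=` comparison on text is not.

```python
def match_all(columns, criteria):
    """Return the 1-based position of the first row where every column
    equals its criterion, or None when no row matches."""
    for position, values in enumerate(zip(*columns), start=1):
        if all(value == wanted for value, wanted in zip(values, criteria)):
            return position
    return None


# Hypothetical columns A, B and C.
col_a = ["Ann", "Bill", "Bill", "Bill"]
col_b = ["Woods", "Smith", "Woods", "Woods"]
col_c = ["x", "some other data", "y", "some other data"]

row = match_all([col_a, col_b, col_c], ["Bill", "Woods", "some other data"])
print(row)
```

Only the fourth row satisfies all three criteria at once, even though each value also appears on earlier rows, which is why the criteria have to be tested together per row.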