I'm still fairly new to MATLAB. I picked up this data analysis code from someone else and have had to add new functions to it.
For one function, I calculate the average of every 3 entries in one column and print the result in another column, so the output looks something like this:
1 -1
3 -1
5 =(1+3+5)/3
7 -1
1 -1
1 =(7+1+1)/3
4 -1
What I want is to print a blank in the cells that contain -1. My first thought was to assign string values to my results instead of ints. That didn't work, I think because a line of code somewhere converts everything to ints.
Another possible solution is to reopen the file and loop through all cells, replacing any -1's with blank strings, though I'm not sure how to do this, and it's inefficient.
As a last resort, I guess I can always tell the user of this xls sheet to use Excel's find/replace function before processing it.
Edit: partial code of the save part:
data = [data.time, data.avg_time'];
data2 = num2cell(data);
data3 = {'t', 'avg t'};
data = [data3; data2];
xlswrite([filename, '.xls'], data);
I misunderstood your question at first (I thought you wanted to replace NaNs with -1, thanks Amro).
You can use this:
A(A(:,2)==-1,2)=NaN
where A is the matrix you created first (data in your code, before the num2cell call).
Hope it helps you :)
First of all, thanks in advance. There are always answers here, so we learn a lot from the experts. I'm a noob using pandas (it's super handy for what I have tried and achieved so far).
I have this data, handed to me like this (I don't have access to the origin), 20k rows or more sometimes. The 'in' and 'out' columns may have one or more entries per date, so after an 'in' the next entry could be an 'out' or another 'in', which leaves me with blank cells. That's the problem (see first image).
I want to keep the first datetime-in in one column and the last datetime-out in another, with the two in one row (see second image). The data comes in a CSV file. I am currently doing this work manually with LibreOffice Calc (yep).
So far I have tried locating and relocating, merging, grouping... nothing works for me, so I feel frustrated. Would you please lend me a hand? Here is a minimal sample of the file
By the way, English is not my first language. Thanks so much!
First:
out_column = df["out"].tolist()
This gives you all the out dates as a list; we will need that later.
in_column = df["in"].tolist()  # "in" is a Python keyword, so I suggest renaming that column
I treat NaT as NaN (null) in this case.
Now we have to find which rows to keep, which we do by going through the in column and keeping only the first row and the rows right after a NaN:
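import pandas as pd

filtered_df = []   # Boolean mask: True for the rows to keep
tracker = False    # True when the previous row's "in" value was null
for index, element in enumerate(in_column):
    if index == 0 or tracker:
        filtered_df.append(True)
        tracker = False
        continue
    if pd.isna(element):   # "is None" would not catch NaT/NaN
        tracker = True
    filtered_df.append(False)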
Then you filter your df by this Boolean List:
df = df[filtered_df]
Now you fix up your out column by removing the null values:
# pd is pandas (import pandas as pd); "null" is not a defined name in Python
out_column = [value for value in out_column if not pd.isna(value)]
Last but not least, you overwrite your old out column with the new one (this only works if the cleaned list has exactly as many entries as the filtered df has rows):
df["out"] = out_column
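If the goal is one row per day with the first in and the last out, a groupby might be simpler than the loop. This is only a minimal sketch: the sample frame, the column names "in" and "out", and the assumption that each row carries exactly one of the two timestamps are all made up here, since the original images are not available.

```python
import pandas as pd

# Hypothetical sample data: values and column names are assumptions.
df = pd.DataFrame({
    "in": pd.to_datetime([
        "2021-03-01 08:00", None, "2021-03-01 13:00", None,
        "2021-03-02 09:00", None,
    ]),
    "out": pd.to_datetime([
        None, "2021-03-01 12:00", None, "2021-03-01 17:30",
        None, "2021-03-02 18:00",
    ]),
})

# Each row has either an "in" or an "out"; use whichever is present
# to work out which calendar day the row belongs to.
day = df["in"].fillna(df["out"]).dt.date.rename("day")

# min and max skip NaT, so this yields the earliest "in" and the
# latest "out" for every day, side by side in one row.
result = (
    df.groupby(day)
      .agg(first_in=("in", "min"), last_out=("out", "max"))
      .reset_index()
)

print(result)
```

With real data you would read the CSV first and parse both columns as datetimes before grouping.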
everyone...
I just started with Python a couple of days ago because I need to handle some Excel data in order to automatically update certain cells in one file with data from another.
However, I'm kind of stuck since I have barely programmed before, and it's my first time using Python as well, but my job required me to find a solution and I'm trying to make it work even though it's not my field of expertise.
I used the xlrd library, imported my file and managed to print the columns I need. However, I can't find a way to put those columns into a matrix so that I can handle the data like this:
Matrix =[DataColumnA DataColumnG DataColumnH] in the size [nrows x 3]
For now I have 3 separate outputs for the 3 columns I need, but I'm trying to join them together into one big matrix.
So far my code looks like this:
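import xlrd

# Note: xlrd 2.0 and later only reads .xls files, so opening the .xlsx
# below needs an older xlrd version or a different library.
workbook = xlrd.open_workbook("190219_serviciosWRAmanualV5.xls")
worksheet = workbook.sheet_by_name("ServiciosDWDM")
workbook2 = xlrd.open_workbook("Potencia2.xlsx")
worksheet2 = workbook2.sheet_by_name("Hoja1")

filas = worksheet.nrows      # number of rows in the first sheet
filas2 = worksheet2.nrows    # number of rows in the second sheet
columnas = worksheet.ncols   # number of columns in the first sheet

# Skip the first two rows, then read columns 12-14 (0-based) of each row
for row in range(2, filas):
    Equipo_A = worksheet.cell(row, 12).value
    Client_A = worksheet.cell(row, 13).value
    Line_A = worksheet.cell(row, 14).value
    print(Equipo_A, Line_A, Client_A)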
So far, as mentioned above, I have only gotten the data in the columns, which is what I'm printing.
What I mainly need to do is read the cell in the first row of Column A and look for it in the other Excel file. If the names match, I have to validate that, for the same row in file 1, the data in both Column G and Column H is the same as the data in the second file.
If they match, I have to update Column J in the first file with the data from the second file.
My other approach is to retrieve the value of the cell in Column A and look for it in Column A of the second file; then I would use an if conditional to check whether Columns G and H are equal to Column C of the second file, and so on...
The thing is, I have no idea how to pinpoint the position of the cell and extract the data to write the conditional for this second approach.
I'm not sure whether building that matrix is the right approach or whether the second way is better, so any suggestion would be greatly appreciated.
Thank you in advance!
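One way to picture the matching step is to hold each sheet as a list of rows (the "matrix") and index the second one by name. The sketch below uses plain lists with made-up values in place of the real sheets; the names file1_rows, file2_rows and lookup, and the column layout, are assumptions for illustration only.

```python
# Stand-in rows for the two sheets. In the real script each row could be
# built from worksheet.cell(row, col).value for the columns of interest.
# All values here are invented.
file1_rows = [
    # [name, column G, column H, column J]
    ["NODE-1", "A1", "B1", "old"],
    ["NODE-2", "A2", "B2", "old"],
    ["NODE-3", "A3", "B3", "old"],
]
file2_rows = [
    # [name, column G, column H, new value for column J]
    ["NODE-2", "A2", "B2", -3.5],
    ["NODE-3", "XX", "B3", -7.0],  # name matches, but column G differs
]

# Index the second sheet by name so each lookup is a dict access
# instead of a scan over every row.
lookup = {row[0]: row for row in file2_rows}

for row in file1_rows:
    match = lookup.get(row[0])
    # Update column J only when the name exists in the second sheet
    # and both check columns (G and H) agree.
    if match is not None and row[1:3] == match[1:3]:
        row[3] = match[3]

print(file1_rows)
```

Only NODE-2 gets updated here: NODE-1 is missing from the second sheet and NODE-3 fails the column G check. Writing the result back to an Excel file is a separate step that xlrd itself does not cover, since it only reads.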
I have these Excel files. This is what my data looks like in the first workbook, which could have 2000+ entries, in General format.
A
1 5001987
2 1458285
3 2506588
4 4745089
5 2540486
.
.
My other Excel file looks like this, also in General format, but its data is generated by another program, whose output looks like this:
A
1 ['2506588']
2 ['2540181']
3 ['2553486']
4 ['2540181']
5 ['2540389']
6 ['2553384']
In a specific column somewhere, I have written this formula:
=IF(VLOOKUP([outputbarcode.xlsx]Sheet1!$B$4,B2:B1992,2,TRUE),"Y","N")
I simply want it to check whether the value of cell A1 in Excel file 2 exists in Excel file 1: print Y if it does, N if not.
Running the function above returns #N/A
Is there something wrong with my function?
In Excel file 2, try:
=IFERROR(IF(INDEX(MATCH(VALUE(MID(A1,3,7)), Sheet1!A:A, 0),)>0, "Y"), "N")
Sheet1 is Excel file 1 here. I prefer INDEX & MATCH to VLOOKUP; you can search for why.
I suggest that you do an edit/replace and remove those odd characters permanently. Then you won't need the MID() function, but the rest of @Sangbok lee's answer will work fine, and that may help with future operations.
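For anyone checking the same logic outside Excel, here is a rough Python sketch of the idea: strip the odd characters, then test membership. The variable names are made up, and the values are just the few shown in the question.

```python
import re

# Stand-ins for column A of each workbook (values copied from the question).
file1_values = {5001987, 1458285, 2506588, 4745089, 2540486}
file2_cells = ["['2506588']", "['2540181']", "['2553486']"]


def clean(cell):
    """Drop everything except digits, e.g. "['2506588']" becomes 2506588."""
    return int(re.sub(r"\D", "", cell))


# "Y" when the cleaned value exists in the first file, otherwise "N".
flags = ["Y" if clean(cell) in file1_values else "N" for cell in file2_cells]
print(flags)
```

This mirrors what MID plus VALUE do in the formula: the text wrapper is removed first, so the comparison is number against number.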
I am trying to extract a substring from a string. The strings are currently in an Excel column, one per row, and look like this:
ABC 54 SOMETHING 11165 POP 1234567890
SOMETHING ABC/W 05/1234500022385
SomethingW1234500006840Abc05 d 13/1/15
What I want is to extract any 5-digit or 13-digit number from each row's string.
I have come up with this algorithm for the job:
1) Enter line
2) Scan string
3) If numeric/integer found, check length from start to end of numeric string
4) If length = 5 or if length = 13, output only numeric string to next column
5) Enter new line...
6) Continue 1 - 5 Till the data set is exhausted
Is there a function in Excel that can do this?
P.S.: I am open to learning any language/tool that can get the job done.
It might be easier than you are making it. If I were you, I'd update that question to give unambiguous pairs of inputs and desired outputs. And I would take a good hard look at the accepted answer to this possibly similar question as it looks like it could be useful. Undoubtedly, someone will come up with a more beautiful regex for you, but here is an idea that might work..
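Since the asker is open to other tools, one possible sketch in Python uses a regular expression with lookarounds, so that a digit run only counts when it is exactly 5 or exactly 13 digits long. The pattern and the helper name extract_codes are my own assumptions about what "any 5 or 13-digit number" means; the sample rows are the ones from the question.

```python
import re

# A run of exactly 13 or exactly 5 digits that does not touch another digit.
PATTERN = re.compile(r"(?<!\d)(?:\d{13}|\d{5})(?!\d)")

rows = [
    "ABC 54 SOMETHING 11165 POP 1234567890",
    "SOMETHING ABC/W 05/1234500022385",
    "SomethingW1234500006840Abc05 d 13/1/15",
]


def extract_codes(text):
    """Return every 5-digit or 13-digit number found in text."""
    return PATTERN.findall(text)


for row in rows:
    print(extract_codes(row))
```

The lookbehind and lookahead are what stop the 10-digit number in the first row from producing a false 5-digit match.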
I need to know whether there is any function that can import data from Excel row by row.
I used to work with xlsread, but it won't work for this case unless I use it in a function that takes all the columns and groups all the elements in the same row together...
Edit: I was able to do it using simple xlsread by the following code:
num = xlsread(excel_file,'B2:BI174');
row1=num(1:173:end);
It is tempting to read the data one row at a time, but that means you will waste time due to file access overhead. It's a lot faster to read all at once and re-pack into a cell array:
allData = xlsread('filename.xls');
oneRowPerElementCell = mat2cell(allData, ones(size(allData,1),1), size(allData,2));
See the xlsread documentation for how to read a block from an Excel file.
Example: to read the first row from the 1st to the 26th column, use:
row1 = xlsread('filename.xlsx',sheet_no,'A1:Z1');