Storing data in Haskell: Excel-like rows and columns

I have the following task:
I have to write a program in Haskell that lets me create something like an Excel sheet.
There are columns and rows, and each cell can hold a number, a string, or a function (sum, mean, multiply, etc.). Each function takes as parameters a list of cells, which are then summed, averaged, and so on.
Now I am trying to figure out how to store this data in my program...
I was thinking about something like this:
data CellPos = CellPos Int Int -- row and col of Cell
data DataType = Text | String | SumFunction | ...... deriving (Enum)
data Cell = Cell CellPos DataType -- but here is a problem , how to put here data with type which depends on DataType???
I just wanted to have a big list of Cell values and search it for a specified column/row, etc.
But there must be a better solution for this – maybe some two-dimensional array that automatically adjusts its size, or something similar?
I will also have to save/load a sheet to/from a file...

Let's answer one question at a time:
data Cell = Cell CellPos DataType
"but here is a problem , how to put here data with type which depends on DataType???"
Put that data into DataType:
data DataType = Text String | Number Double | Function CellPos (DataType -> DataType)
"I wanted just to have big list of Cell and search in it for specified column/row etc. But there must be some better solution for this - maybe some two dimensional array which auto adjust its size or something?"
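I suggest a Map CellPos DataType (Map from Data.Map). For CellPos to serve as a key it needs an Ord instance, for example by adding deriving (Eq, Ord) to its declaration.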
"I will have to save/load a sheet to /from file..."
The simplest thing will probably be to derive Show and Read and use the resulting functions together with readFile and writeFile. The only caveat here (with respect to DataType as defined earlier in this answer) is that functions cannot be serialized. To get around this, make a more explicit type for the functions in cells -- perhaps an abstract syntax tree for some simple expression language.

Related

Data extraction from excel with operators is unable to store values

I have an Excel file with two columns: one holds a name and the other the corresponding mass. I use the lines below to read it and find the position of the name. But when I try to look up the mass for that name, as shown below, the value is not stored in memory. In the Excel file, the mass values are written as 1.989*10^30. This seems to be the cause, because the same code works fine when the cells in the Excel file contain plain numeric values.
majbod = 'Sun';
minbod = 'Earth';
majbodin = readtable("Major_and_Minor_Bodies.xlsx","Sheet",1);
minbodin = readtable("Major_and_Minor_Bodies.xlsx","Sheet",2);
MAJORBODY = table2array(majbodin(:,"Major_Body"));
MINORBODY = table2array(minbodin(:,"Minor_Body"));
mmaj = table2array(majbodin(:,"Mass"));
mmin = table2array(minbodin(:,"Mass"));
selected_majbody = find(strcmp(MAJORBODY,majbod));
selected_minbody = find(strcmp(MINORBODY,minbod));
M = mmaj(selected_majbody);
m = mmin(selected_minbody);
disp([M ;m])
Is there a better way to write this code?
Thanks.
Excel does its best to figure out what kind of data is in each cell. Since your data has something besides just numbers, Excel treats it as a string. You have a couple of options for getting around that:
If you put an equals sign in front of it, Excel will treat it as a formula and calculate the value of 1.989*10^30 for you. The result will be a number.
Since scientific notation is so common, programmers have created a shortcut for representing it. They often use the character 'E' where you use "*10^". This means that if you type "1.989E30", Excel will recognize it as a number.
If keeping the current string format is very important, you could modify the string during extraction: replace '*10^' with 'E', and then whatever language you are using will have a string-to-number parser you can use.
If the real problem is that the numbers are just too long in Excel, you can always format the cells they are in. (Right-click the cell, select Format Cells, then select Scientific.)
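The string-replacement option can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB used in the question, and the helper name parse_mass is hypothetical; it assumes every value is either a plain number or uses exactly the "*10^" form.

```python
def parse_mass(text):
    """Convert a string such as '1.989*10^30' to a float."""
    # Drop stray spaces, then rewrite '*10^' as 'E' so the value
    # becomes ordinary scientific notation that float() understands.
    return float(text.replace(" ", "").replace("*10^", "E"))


# Sample strings in the same form as the values described in the question.
for raw in ["1.989*10^30", "5.972*10^24", "42"]:
    print(raw, "->", parse_mass(raw))
```

The same idea carries over to other languages: do the textual replacement first, then hand the result to the language's standard number parser.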
Good luck

Making a vector out of excel columns using python

everyone...
I just started with Python a couple of days ago because I need to handle some Excel data in order to automatically update certain cells in one file with data from another.
However, I'm kind of stuck since I have barely programmed before, and it's my first time using Python as well, but my job requires me to find a solution and I'm trying to make it work even though it's not my field of expertise.
I used the xlrd library, imported my file and managed to print the columns I need... However, I can't find a way to put those columns into a matrix in order to handle the data like this:
Matrix =[DataColumnA DataColumnG DataColumnH] in the size [nrows x 3]
As for now, I have 3 different outputs for the 3 different columns I need, but I'm trying to join them together into one big matrix.
So far my code looks like this:
import xlrd
workbook = xlrd.open_workbook("190219_serviciosWRAmanualV5.xls")
worksheet = workbook.sheet_by_name("ServiciosDWDM")
workbook2 = xlrd.open_workbook("Potencia2.xlsx")
worksheet2 = workbook2.sheet_by_name("Hoja1")
filas = worksheet.nrows
filas2 = worksheet2.nrows
columnas = worksheet.ncols
for row in range(2, filas):
    Equipo_A = worksheet.cell(row, 12).value
    Client_A = worksheet.cell(row, 13).value
    Line_A = worksheet.cell(row, 14).value
    print(Equipo_A, Line_A, Client_A)
So far, as mentioned above, I have only managed to get the data in those columns, which is what I'm printing.
What I'm trying to do, or the main thing I need to do is to read the cell of the first row in Column A and look for it in the other excel file... if the names match, I would have to validate that for the same row (in file 1) the data in both the ColumnG and ColumnH is the same as the data in the second file.
If they match I would have to update Column J in the first file with the data from the second file.
My other approach is to retrieve the value of the cell in ColumnA and look for it in the column A of the second file, then I would make an if conditional to see if ColumnsG and H are equal to Column C of 2nd file and so on...
The thing is, I have no idea how to pinpoint the position of the cell and extract the data to build the conditional for this second approach.
I'm not sure if by making that matrix my approach is okay or if the second way is better, so any suggestion would be absolutely appreciated.
Thank you in advance!
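One possible way to approach this is sketched below, using plain Python lists in place of the sheet data so that it runs without any Excel file. With xlrd, each column could be read with worksheet.col_values(colx, start_rowx). The function names, the row layout (name, two check columns, one value column) and the sample values are all hypothetical, not taken from the actual files.

```python
def build_matrix(*columns):
    """Combine equal-length columns into a list of rows (nrows x ncols)."""
    return [list(row) for row in zip(*columns)]


def update_from_second_file(rows1, rows2):
    """Copy the last field from rows2 into rows1 where the rows agree.

    Assumed (hypothetical) layout of every row: [name, check1, check2, value].
    """
    # Index the second file by name so each lookup is a dictionary access.
    lookup = {row[0]: row for row in rows2}
    for row in rows1:
        match = lookup.get(row[0])
        # Update only when the name exists and both check columns agree.
        if match is not None and row[1:3] == match[1:3]:
            row[3] = match[3]
    return rows1


# Hypothetical sample data standing in for the columns of the first file.
col_a = ["node1", "node2", "node3"]
col_g = ["cardX", "cardY", "cardZ"]
col_h = ["port1", "port2", "port3"]
col_j = [0.0, 0.0, 0.0]

matrix = build_matrix(col_a, col_g, col_h, col_j)
second = [["node1", "cardX", "port1", -3.5],
          ["node2", "cardY", "port9", -7.0]]

for row in update_from_second_file(matrix, second):
    print(row)
```

Building a dictionary keyed on the name avoids searching the second sheet row by row for every name in the first one, which covers both the "matrix" idea and the "look it up and compare" idea described above.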

append cell row to matrix

I'm reading an Excel file in MATLAB with
[NUM,TXT,RAW]=xlsread(DATENEXCEL,sSheet_Data);
The Excel file contains different data matrices in different sheets, in the following form:
Date Firm1 Firm2 Firm3 ...
1.1.16 12 12 12
... ... ... ...
Currently I'm handling the pure data with the NUM object and the header row with the TXT object. My first issue is how to combine the header row with the data rows. Looping does not work, since I predefine the data matrix with
daten=zeros([length(sDatesequence) size(RAW,2)]);
because I want to be able to add more data from different sources to that object. Predefining with zeros, however, leads Matlab to expect doubles and not characters. Converting the cell array TXT with cell2mat delivers unsatisfying results:
cell2mat(TXT(1,:))=Firm1Firm2Firm3...
hence only a long string vector.
Question: Is there another way to combine character vectors and double matrices?
Regards,
Richard
You can combine them in a cell array.
c{1,1} = 'Firm1';
c{1,2} = datavector;
c{2,1} = 'Firm2';
c{2,2} = datavector;
But as far as I know it is not possible to add text headers to a numerical matrix, unless you do something with typecasting, and I would not recommend that.
d(1:8)='Firm1   '; %must have exactly eight characters (a double has a length of 8 bytes)
y = typecast(uint8(d),'double') %now you have a number that would fit in a matrix of doubles
x=char(typecast(y,'uint8')) %now it's converted back to text

Find string (from table) in cell in matlab

I want to find the location of a string (which I take from a table) inside a cell:
A is my table, and B is the cell.
I have tested :
strncmp(A(1,8),B(:,1),1)
but it couldn't find the location.
I have tested many commands like:
ismember, strmatch, find(strcmp), find(strcmpi), find(ismember), strfind, etc., but they all give me errors, mostly because of the type of my data!
Please suggest a solution.
You want strfind:
>> strfind('0123abcdefgcde', 'cde')
ans =
7 12
If A is a table and B a cell array, you need to index this way:
strfind(B{1}, A.VarName{1});
For example:
>> A = cell2table({'cde'},'VariableNames',{'VarName'}); %// create A as table
>> B = {'0123abcdefgcde'}; %// create B as cell array of strings
>> strfind(B{1}, A.VarName{1})
ans =
7 12
Luis Mendo's answer is absolutely correct, but I want to add some general information.
Your problem is that all the functions you tried (strfind, ...) only work on normal strings, not on cell arrays. The way you index A and B in your code snippet, the results are still cell arrays (of dimension (1,1)). You need to use curly brackets {} to "get rid of" the cell array and get the string it contains. Luis Mendo shows how to do this.
Modified solution from a MathWorks forum, for the case of a single-column table with ragged strings:
find(strcmp('mystring',mytable{:,:}))
will give you the row number.

Apache POI : How to format numeric cell values

I am using Apache POI 3.9 for XLS/XLSX file processing.
In the XLS sheet, there is a column with numeric value like "3000053406".
When I read it with POI with..
cell.getNumericCellValue()
It gives me a value like "3.00E+08". This creates a huge problem in my application.
How can I set the number formatting while reading data in Apache POI?
One way I know of is to set the column type to "text". But I want to know whether there is any other way on the Apache POI side while reading the data. Or can we format it using Java's plain DecimalFormat?
This one comes up very often....
Picking one of my past answers to an almost identical question
What you want to do is use the DataFormatter class. You pass this a cell, and it does its best to return you a string containing what Excel would show you for that cell. If you pass it a string cell, you'll get the string back. If you pass it a numeric cell with formatting rules applied, it will format the number based on them and give you the string back.
For your case, I'd assume that the numeric cells have an integer formatting rule applied to them. If you ask DataFormatter to format those cells, it'll give you back a string with the integer string in it.
The problem may also be purely Java-related rather than POI-related.
Since your call returns a double,
double val = cell.getNumericCellValue();
you may want to format it like this:
DecimalFormat df = new DecimalFormat("#");
int fractionalDigits = 2; // say 2
df.setMaximumFractionDigits(fractionalDigits);
String formatted = df.format(val); // format() returns a String, not a double
Creating a BigDecimal from the numeric cell's double value, converting it to a plain string with
BigDecimal.toPlainString()
and then storing that string back in the same cell after clearing it solved the problem of exponential representation of numeric values.
The below code solved the issue for me.
Double dnum = cellContent.getNumericCellValue();
BigDecimal bd = new BigDecimal(dnum);
System.out.println(bd.toPlainString());
cellContent.setBlank();
cellContent.setCellValue(bd.toPlainString());
System.out.println(cellContent.getStringCellValue());
long varA = new Double(cellB1.getNumericCellValue()).longValue();
This stores the value in varA as a long, so it prints without an exponent (any fractional part is truncated).
