I've been using individual lists of data to update variables in my ABM. Unfortunately, due to the size of the data I am using now, it is becoming time-consuming to build the lists and then the tables. The data for these tables changes frequently, so it is not just a one-time thing.
I am hoping to get some ideas for a method to create a table that can be read directly from an Excel spreadsheet, without taking the time to build the table explicitly by inputting the individual lists. My table includes one list of keys (over 1000 of them) and nearly a hundred variables corresponding to each key that must be updated when the key is called. The data is produced by a different model (not an ABM), which outputs an Excel spreadsheet with keys (X values) and values (Y values), something like:
X1    Y1,1    Y1,2    Y1,3    … Y1,100
X2    Y2,1    Y2,2    Y2,3    … Y2,100
…
X1000 Y1000,1 Y1000,2 Y1000,3 … Y1000,100
If anyone has a faster method for getting large amounts of data from Excel into a NetLogo table, I would be very appreciative.
Two solutions, assuming you do not want to write an extension. You can either
1. save the Excel file as CSV and then write a NetLogo procedure to read your CSV file (to get started, see http://netlogoabm.blogspot.com/2014/01/reading-from-csv-file.html), or
2. use a scripting language (Python recommended) to read the CSV file and then write out a .nls file with code for creating the table.
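If you go the scripting route, a minimal Python sketch might look like the following (file names are placeholders; it assumes the key sits in the first CSV column, the ~100 values follow in the remaining columns, and your model loads the table extension):

import csv

# Read the exported CSV and emit a .nls file that builds the table.
with open('model_output.csv', newline='') as src, open('data-table.nls', 'w') as dst:
    dst.write('to-report load-data-table\n')
    dst.write('  let t table:make\n')
    for row in csv.reader(src):
        key, values = row[0], row[1:]
        dst.write('  table:put t "%s" [%s]\n' % (key, ' '.join(values)))
    dst.write('  report t\n')
    dst.write('end\n')

Include the generated file via __includes ["data-table.nls"], call load-data-table once in setup, and simply rerun the script whenever the upstream model produces a new spreadsheet.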
Related
I'm very new to this tool and I want to do a simple operation:
Dump data from an XLS file into tables.
I have an Excel file that has around 10-12 sheets, and almost every sheet corresponds to a table.
With the first Excel input operation there is no problem.
The only problem is that, I don't know why, but when I try to edit a second Excel Input step (show the list of sheets, or get the list of columns), the software just hangs, and when it responds it just opens a warning with an error.
This is an image of the actual diagram that I'm trying to use:
This is a typical out-of-memory problem. PDI cannot read the file because processing the Excel file requires more memory than is available. You need to give PDI more memory to work with your Excel file. Try increasing Spoon's memory; you can read Increase Spoon memory.
Alternatively, replicate your Excel file with a few rows of data, keeping the structure of the file as it is, i.e. a test file. Use that test file to generate the necessary sheet names and columns in the Excel Input step. Once you are done, point the step at the original file and execute the job.
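For reference, Spoon's heap size is set through the Java options in its launcher script; the exact variable name and defaults differ between PDI versions, so treat the values below as an illustration only:

REM spoon.bat (Windows): raise the maximum Java heap, e.g. to 2 GB
set PENTAHO_DI_JAVA_OPTIONS="-Xmx2048m"

# spoon.sh (Linux/Mac)
export PENTAHO_DI_JAVA_OPTIONS="-Xmx2048m"

Restart Spoon afterwards so the new heap size takes effect.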
I have two Excel files. One has data ("source.xlsx") and one has macros ("work.xlsm"). I can load the data from "source.xlsx" into "work.xlsm" using Excel's built-in load or using Application.GetOpenFilename. However, I don't want all the data in source.xlsx; I only want to select specific rows, the criteria for which will be determined at run time.
Think of this as a SELECT from a database with parameters. I need to do this to limit the time and processing of the data being handled by "work.xlsm".
Is there a way to do that?
I tried using parameterized query from Excel --> [Data] --> [From Other Sources] but when I did that, it complained about not finding a table (same with ODBC). This is because the source has no table defined, so it makes sense. But I am restricted from touching the source.
So, in short, I need to filter the data before it is brought into the target sheet, without touching the source file. I want to do this either interactively or via a VBA macro.
Note: I am using Excel 2003.
Any help or pointers will be appreciated. Thx.
I used a macro to convert the source file from .xlsx to .csv format and then loaded the csv formatted file using a loop that contained the desired filter during the load.
This approach may not be the best; nevertheless, no other suggestion was offered and this one works!
The other approach is to abandon the idea of pre-filtering, accept the load-time delay, and perform the filtering and removal of unwanted rows in the "work.xlsm" file. Performance and memory size are the major factors in this case, assuming code complexity is not the issue.
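For what it's worth, the first approach (convert to CSV, then filter while loading) is easy to sketch outside VBA as well. Here is a rough Python illustration with made-up file names and a made-up criterion, just to show the shape of the loop; the same pattern works in a VBA macro reading the CSV line by line:

import csv

# Keep only the rows that satisfy the run-time criterion, then hand the
# small filtered file to the workbook instead of the full source.
with open('source.csv', newline='') as src, open('filtered.csv', 'w', newline='') as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    writer.writerow(next(reader))        # copy the header row
    for row in reader:
        if row and row[2] == 'ACTIVE':   # placeholder filter on the third column
            writer.writerow(row)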
I'm trying to load data from Excel into MATLAB. I downloaded data from Yahoo Finance, but loading it into MATLAB is not working. Below are my code and the error message MATLAB gives me. Can somebody help me improve my code?
load SP100Duan.csv
Error using load
Number of columns on line 18 of ASCII file C:\Users\11202931\Desktop\SP100Duan.csv
must be the same as previous lines.
There are a TON of ways to get data into MATLAB. Generally speaking, mixed text/numbers give MATLAB problems. You may want to clean up the Excel file so there is no text in columns that should be numbers. Some possible methods to load the data:
Use the readtable function, e.g. mytable = readtable('mycsvfile.csv'). All your data is put into a table datatype, which I personally find convenient.
Read data directly from an Excel file using the xlsread function (from your description, though, it sounds like your file is a .csv).
Use the csvread function.
Copy and paste data from Excel directly into a variable in MATLAB: (i) type x = 0, then (ii) double-click the x variable in your workspace, then (iii) copy the data from Excel and paste it into x, and (iv) execute something like save mydata.mat so you can later load it with load mydata.mat.
This should be pretty simple . . .
xlsread('C:\Apple.xls', 'Sheet1', 'A1:G10')
Also, please see this link.
http://www.mathworks.com/help/matlab/ref/xlsread.html
Need your help badly. I am dealing with a workbook which has 7000 rows x 5000 columns of data in one sheet. Each of these data points has to be manipulated and pasted into another sheet. The manipulations are relatively simple; each one takes less than 10 lines of code (simple multiplications and divisions with a couple of IFs). However, the file crashes every now and then with various types of errors. The problem is the file size. To overcome this, I am trying a few approaches:
a) Separate the data and output into different files. Keep both files open, take the data chunk by chunk (typically 200 rows x 5000 columns), manipulate it, and paste it into the output file. However, if both files are open, I am not sure this remedies the problem, since the memory consumed will be the same either way, i.e. instead of one file consuming a large amount of memory, two files together consume the same amount.
b) Separate the data and output into different files. Access the data in the data file while it is still closed by inserting links in the output file through a macro, manipulate the data, and paste it into the output. This can be done chunk by chunk.
c) Separate the data and output into different files. Run a macro to open the data file, load a chunk of data (say 200 rows) into an array in memory, and close it. Process the array, then open the output file and paste the array of results.
Which of the three approaches is better? I am sure there are other methods which are more efficient; kindly suggest them.
I am not familiar with Access but I tried to import the raw data into Access and it failed because it allowed only 255 columns.
Is there a way to keep the file open but wash it in and out of memory? Then slight variations of a) and c) above could be tried. (I am afraid repeated opening and closing will crash the file.)
Looking forward to your suggestions.
If you don't want to leave Excel, one trick you can use is to save the base Excel file as a binary .xlsb. This will clean out a lot of potential rubbish that might be in the file (it all depends on where it first came from).
I just shrank a load of web data by 99.5% - from 300MB to 1.5MB - by doing this, and now the various manipulations in Excel work like a dream.
The other trick (from the 80s :) ), if you are using a lot of in-cell formulae rather than a macro to iterate through, is to:
turn calculation off,
copy your formulae,
turn calculation on, or just run a calculation manually,
copy and paste-special-values the formulae outputs.
My suggestion is to use a scripting language of your choice and work with decomposition/composition of spreadsheets in it.
I was composing and decomposing spreadsheets back in the day (in PHP, oh the shame) and it worked like a charm. I wasn't even using any libraries.
Just grab yourself the xlutils library for Python and get your hands dirty.
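As a rough illustration of that decompose/recompose idea, here is a Python sketch. It uses openpyxl rather than xlutils, because a 5000-column sheet only fits the .xlsx format and the legacy xlrd/xlwt writers top out at 256 columns; file names and the per-cell manipulation are placeholders:

from openpyxl import load_workbook, Workbook

src = load_workbook('data.xlsx', read_only=True)   # streams rows in, keeps memory low
dst = Workbook(write_only=True)                    # streams rows out
sheet_in = src.active
sheet_out = dst.create_sheet('output')

for row in sheet_in.iter_rows(values_only=True):
    # placeholder manipulation: double numeric cells, pass everything else through
    sheet_out.append([v * 2 if isinstance(v, (int, float)) else v for v in row])

dst.save('output.xlsx')

Because both workbooks are streamed a row at a time, the whole 7000 x 5000 block never sits in memory at once, which is essentially approach c) without Excel itself in the loop.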
I need to import tabular data into my database. The data is supplied via spreadsheets (mostly Excel files) from multiple parties. The format of each of these files is similar but not the same, and various transformations will be necessary to massage the data into the final format suitable for import. Furthermore, the input formats are likely to change in the future. I am looking for a tool that can be run and administered by regular users to transform the input files.
Now let me list some of the transformations I am looking to do:
swap columns:
Input is:
|Name|Category|Price|
|data|data |data |
Output is:
|Name|Price|Category|
|data|data |data |
rename columns
Input is:
|PRODUCTNAME|CAT |PRICE|
|data |data|data |
Output is:
|Name|Category|Price|
|data|data |data |
map column values according to a lookup table, for example:
replace every occurrence of the string "Car" by "automobile" in the column Category
basic maths:
multiply the price column by some factor
basic string manipulations
Let's say the format of the Price column is "3 x $45"; I would want to split that into two columns, amount and price
filtering of rows by value: exclude all rows containing the word "expensive"
etc.
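Just to make those transformations concrete, here is roughly what they would look like if scripted (a pandas sketch with made-up column names and a made-up price factor); what I am after is a GUI tool that lets regular users do the equivalent without writing code:

import pandas as pd

df = pd.read_excel('supplier.xlsx')                                    # one party's file

df = df.rename(columns={'PRODUCTNAME': 'Name', 'CAT': 'Category', 'PRICE': 'Price'})
df = df[['Name', 'Price', 'Category']]                                 # swap/reorder columns
df['Category'] = df['Category'].replace({'Car': 'automobile'})         # lookup mapping
df[['Amount', 'Price']] = df['Price'].str.split(' x ', expand=True)    # split "3 x $45"
df['Price'] = df['Price'].str.lstrip('$').astype(float) * 1.2          # basic maths
df = df[~df['Name'].str.contains('expensive', case=False, na=False)]   # filter rows (single column shown)
df.to_csv('normalized.csv', index=False)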
I have the following requirements:
it can run on any of these platforms: Windows, Mac, Linux
Open Source, Freeware, Shareware or commercial
the transformations need to be editable via a GUI
if the tool requires end user training to use that is not an issue
it can handle on the order of 1000-50000 rows
Basically I am looking for a graphical tool that will help the users normalize the data so it can be imported, without me having to write a bunch of adapters.
What tools do you use to solve this?
The simplest solution IMHO would be to use Excel itself - you'll get all the Excel built-in functions and macros for free. Have your transformation code in a macro that gets called via Excel controls (for the GUI aspect) on a spreadsheet. Find a way to insert that spreadsheet and macro into your clients' Excel files. That way you don't need to worry about platform compatibility (it's their file, so they must be able to open it) and all the rest. The other requirements are met as well. The only training would be to show them how to enable macros.
The Mule Data Integrator will do all of this from a CSV file. So you can export your spreadsheet to a CSV file and load the CSV file into the MDI. It can even load the data directly into the database, and the user can specify all of the transformations you requested. The MDI will work fine in non-Mule environments. You can find it at mulesoft.com (disclaimer: my company developed the transformation technology that this product is based on).
You didn't say which database you're importing into, or what tool you use. If you were using SQL Server, then I'd recommend using SQL Server Integration Services (SSIS) to manipulate the spreadsheets during the import process.
I tend to use MS Access as a pipeline between multiple data sources and destinations - but you're looking for something a little more automated. You can use macros and VB script with Access to help through a lot of the basics.
However, you're always going to have data consistency problems with users misinterpreting how to normalize their information. Good luck!