How to read Excel into Julia?

I need to read an Excel file into Julia. I tried the package "ExcelReaders". However, it additionally requires Python and the xlrd package. Although it uses the Conda.jl package to install these dependencies automatically, I keep running into different installation problems. Is there a simple way to read Excel into Julia? Has anyone tried the Taro.jl package?

There is only one pure-Julia v1.0-compatible Excel reader available:
XLSX.jl
It has no dependencies on Python or Java. Install it from the package manager by typing ]add XLSX in the console, then load it with using XLSX. Here is the tutorial document.

The Taro.jl package works well for reading Excel into Julia. You can install the package with Pkg.add("Taro"). Once it is installed, load it with using Taro; Taro.init(). You can then use Taro.readxl() to read Excel files. The following post provides a fairly nice tutorial on how to read Excel files in Julia using Taro.jl:
https://economictheoryblog.com/2018/01/03/how-to-read-an-excel-file-in-julia-language-an-example/

Taro works pretty well (even if I say so myself). You need Java installed on the machine, but after that, Pkg.add("Taro") will install all the dependencies for you. I also think you'll have better luck with Taro on more complex Excel files.

If you are fine with saving in the .ods format, you could also use the OdsIO.jl package.
It also relies on a Python module (ezodf), but that should be installed automatically on both Windows and Linux when you install OdsIO.jl.

If you can save as a .csv then CSV.jl works well.

The ExcelReaders package is also available:
https://github.com/davidanthoff/ExcelReaders.jl

Related

Searching for Python MSDOS parser library

Does anyone know a good Python library to parse MSDOS files and obtain metadata and start()'s bytecodes, like an alternative to the pefile library but for MSDOS? I can't seem to find any via Google.
If there isn't one, is there a good reference on the MSDOS file format? That way, I can write my own parser instead. I know there are tools like IDA Pro and the Reko decompiler, but I need an MSDOS file parser to automate some stuff. Thank you in advance!

Reko decompiler maintainer here. For what it's worth, you can use Reko's MS-DOS source code and translate it to Python. It's not a lot of code and MS-DOS executables aren't that complex to parse -- it's quite a simple format. The relevant files are:
https://github.com/uxmal/reko/blob/master/src/ImageLoaders/MzExe/ExeImageLoader.cs
https://github.com/uxmal/reko/blob/master/src/ImageLoaders/MzExe/MsdosImageLoader.cs
You could also try executing the Reko code directly from Python. The Reko binaries are available as a nuget package: https://www.nuget.org/packages/Reko.Decompiler.Runtime
Use the class Reko.ImageLoaders.MzExe.ExeImageLoader in the Reko.ImageLoaders.MzExe namespace. Integration could be done with http://pythonnet.github.io/
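If you would rather write the parser yourself, here is a minimal sketch of reading the 28-byte MZ header with Python's struct module. It is not a translation of the Reko code. The field names and the helpers parse_mz_header and entry_point_offset are my own (hypothetical) naming, and the entry point calculation assumes a plain MS-DOS executable with no overlays or extended (NE/PE) header to worry about.

```python
import struct
from collections import namedtuple

# The classic MZ header: a 2-byte signature followed by 13 little-endian words.
MZ_HEADER_FORMAT = "<2s13H"
MZ_HEADER_SIZE = struct.calcsize(MZ_HEADER_FORMAT)  # 28 bytes

# Field names are illustrative; the order follows the on-disk layout.
MzHeader = namedtuple(
    "MzHeader",
    "magic bytes_in_last_page pages relocation_count header_paragraphs "
    "min_alloc max_alloc ss sp checksum ip cs relocation_table_offset "
    "overlay_number",
)


def parse_mz_header(data):
    """Unpack the fixed-size MZ header from the start of `data`."""
    if len(data) < MZ_HEADER_SIZE:
        raise ValueError("file too short to contain an MZ header")
    header = MzHeader._make(struct.unpack_from(MZ_HEADER_FORMAT, data))
    if header.magic not in (b"MZ", b"ZM"):
        raise ValueError("missing MZ signature")
    return header


def entry_point_offset(header):
    """File offset of the first instruction (CS:IP), assuming no overlays.

    The load image starts right after the header, whose size is given in
    16-byte paragraphs. CS:IP is relative to the start of that image and
    wraps around within the 20-bit real-mode address space.
    """
    image_start = header.header_paragraphs * 16
    return image_start + (((header.cs << 4) + header.ip) & 0xFFFFF)
```

To use it, read the file in binary mode and pass the bytes to parse_mz_header; the bytes starting at entry_point_offset(header) are the code at the program's entry point.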

How to reduce package size of PySide6?

I am writing software with PySide6. On my Mac the package has a size of 1.0 GiB. Is there an easy way to leave out files that I don't need to package?
I manually identified the files below as unnecessary for my software, but I still end up with more than 500 MB.
/Assistant.app
/Designer.app
/Linguist.app
/lupdate
/QtWebEngineCore
/QtWebEngineCore.framework

You can install only the PySide6-Essentials package from PyPI.
Alternatively, you can build from source and include just what you need via the Qt installer.
P.S. If you are struggling with building PySide from source, I have a repo that might help.
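Whichever route you take, it can help to measure which parts of the packaged folder are actually large before pruning by hand. The following is a rough sketch using only the standard library; the function names dir_size and largest_entries are hypothetical, and the path you pass in is whatever folder your packaging tool produced.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files below `path` (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def largest_entries(package_dir, top=10):
    """Return the `top` biggest direct children of `package_dir` as (name, bytes)."""
    sizes = []
    for entry in os.scandir(package_dir):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((entry.name, dir_size(entry.path)))
        elif entry.is_file(follow_symlinks=False):
            sizes.append((entry.name, entry.stat().st_size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Calling largest_entries on the bundled PySide6 directory should show whether entries such as QtWebEngineCore dominate the total, which tells you where removing files pays off.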

Sqlite on Python 3 with Spatialite and full support for spatial indices (i.e. rtree)

This is the scenario:
I am using Python 3 (3.6 through 3.8 on Windows 10, using Pipenv and vanilla Python) to create an SQLite file with Spatial support and several triggers based on Spatial Indices.
Creating the database and adding records works just fine after loading Spatialite
conn.enable_load_extension(True)
conn.load_extension("mod_spatialite")
However, adding spatial links with code such as below
SELECT CreateSpatialIndex( 'nodes' , 'geometry' );
returns the following error
updateTableTriggers: "no such module: rtree"
I tried compiling the rtree extension following some recommendation from
Compiling SQLite RTREE in MSVC?
and using VS 2019 (16.4.2).
But I get all sorts of errors when trying to load that in SQL (Might not be compiling it properly, but I tried multiple things and nothing worked). My best attempt was a successful compilation using pretty much the instructions I referred to above, but when I attempted
p.conn.load_extension("libSqliteRtree.dll")
I got
sqlite3.OperationalError: The specified procedure could not be found.
I am really at a loss here, as there seems to be very little discussion on this topic everywhere I looked. A few questions that come to mind are:
Are there specific compilation instructions/tricks/compiler versions that I should be using?
Is it even possible to compile and load rtree in Python 3 using the standard sqlite3 library?
Is this particular to Windows?
Are there alternative SQLite Python packages that could do the job (I didn't find any on PyPI)?
It is critical, however, that the solution works across different platforms.

I was just having the exact same problem, with Python 3.8 x64. I believe the problem was that the sqlite3.dll file inside my Python's installation DLLs folder had been compiled without RTREE enabled (https://sqlite.org/rtree.html).
To resolve this I visited the SQLite website, downloaded the .zip with the latest sqlite3.dll for Windows x64 (hoping RTREE would be enabled on that version, because I tried compiling it on my own and it didn't work), and swapped the old DLL in the DLLs folder with the newly downloaded DLL from the website. The RTREE error was gone! Detailed steps below:
1. Access https://sqlite.org/download.html and choose the "Precompiled Binaries for Windows" for your system. Mine was x64 because I was running Python x64. Download the .zip and unzip it.
2. Find your Python's installation folder (the folder which contains the python.exe you're running), and open the DLLs folder. You'll see there is a file there called sqlite3.dll. That's the one that comes with the Python installation.
3. Copy the sqlite3.dll from the unzipped folder in step 1 and paste it into the DLLs folder, clicking Yes to replace the file in the DLLs folder with the new one. That should solve the problem.
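To confirm whether the SQLite library that Python is actually loading has RTREE support (before or after swapping the DLL), you can try to create an R*Tree virtual table in an in-memory database. This is a small sketch; the function name rtree_available and the table name rtree_probe are my own.

```python
import sqlite3


def rtree_available():
    """Return True if the linked SQLite library supports the rtree module."""
    conn = sqlite3.connect(":memory:")
    try:
        # An R*Tree table needs an id column plus one min/max pair per dimension.
        conn.execute("CREATE VIRTUAL TABLE rtree_probe USING rtree(id, minx, maxx)")
        return True
    except sqlite3.OperationalError:
        # Raised as "no such module: rtree" when RTREE was not compiled in.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    # sqlite_version is the version of the SQLite library, not of the Python module.
    print("SQLite library version:", sqlite3.sqlite_version)
    print("RTREE available:", rtree_available())
```

The same check runs unchanged on Linux and macOS, so it is also a quick way to verify each target platform.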

Copying base environment to create a new environment in Python

I'm working with some new libraries and I'm afraid that my script might run into trouble in the future with unexpected package updates. So I want to create a new environment, but I don't want to manually install all the basic packages like numpy, pandas, etc. Does it make sense to use conda to create a new environment that is an exact copy of my base environment, or could that create some sort of conflict?

Copying using conda works, but if you used only virtualenv, you should manually build requirements.txt, create a new virtual environment, activate it, and then simply use pip install -r requirements.txt. Note the key word - manually.
For example if you needed requests, numpy and pandas, your requirements.txt would look like this:
requests==2.20.0
numpy==1.15.2
pandas==0.23.4
You could actually leave numpy out in this case, since pandas pulls it in, but it is worth keeping because you use it directly and would still need it if you removed pandas. I build the file by installing a new package, running pip freeze to find the module I just installed, and adding it to requirements.txt with its current version. If I ever share the project with someone, I replace == with >=, which is enough most of the time. If that causes a conflict, check what the conflicting library requires and adjust if possible. For example, you may have listed the latest numpy version as a requirement while an older library needs specifically version x.y.z; ideally your code works fine with that version too.
Anyway, this is all you have to keep around to preserve your virtual environment. It also helps if you are going to distribute your project, as anyone can drop this file into a new folder with your source and create their own environment without any hassle.
Now, this is why you should build it manually:
$ pip freeze
certifi==2018.10.15
chardet==3.0.4
idna==2.7
numpy==1.15.2
pandas==0.23.4
python-dateutil==2.7.3
pytz==2018.5
requests==2.20.0
six==1.11.0
urllib3==1.24
virtualenv==16.0.0
six? pytz? What? Other libraries use them, but we don't even know what they are for unless we look them up. They shouldn't be listed as project dependencies; they will be installed automatically along with the libraries that depend on them.
This way you avoid most problems. The rare exception is when one library you use needs a new version of some dependency while another library wants an ancient version of the same dependency. That case is a big mess, but it doesn't normally happen.
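If you want a starting point for that manual list, one option is to ask the installed package metadata which distributions nothing else depends on. The sketch below is an assumption-laden helper, not a replacement for judgement: the function names are hypothetical, it needs Python 3.8+ for importlib.metadata, and it will also drop packages such as numpy that you import directly but that another library happens to require.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name so 'Python_Dateutil' equals 'python-dateutil'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the package name from a requirement string such as 'numpy>=1.9.0'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def top_level_packages(requires_by_package):
    """Return the packages that no other package in the mapping depends on."""
    required = set()
    for requirements in requires_by_package.values():
        for requirement in requirements:
            spec, _, marker = requirement.partition(";")
            if "extra" in marker:
                continue  # optional extras are not installed by default
            name = requirement_name(spec)
            if name:
                required.add(name)
    names = {normalize(name) for name in requires_by_package}
    return sorted(names - required)


def installed_requirements():
    """Map each installed distribution to its declared requirement strings."""
    return {
        dist.metadata["Name"]: (dist.requires or [])
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }
```

Running top_level_packages(installed_requirements()) in the environment from the pip freeze listing above should leave roughly pandas, requests and virtualenv (plus tools such as pip itself); you would then add numpy back by hand, since the script uses it directly.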

R (OSX) read/write xls

I'm having a persistent problem finding and implementing any packages capable of reading and writing XLS files in an OSX version of R.
Anyone have any suggestions?
I've gone so far as to try using Perl to implement WriteXls.
Thanks

I have had success with the Java-based xlsx package, which is available on CRAN. It handles .xls as well as .xlsx.

read.xls in package gdata has worked for me in the past. You can get some further tips at the wiki page.

Another package to try is XLConnect.
