Order of processing of text files in Python

A small part of my code is dedicated to unzipping a zip file and looking for certain things in text files. The problem I am facing is that Python processes the text files in a certain order on my system, but it seems to use another order when the code is run on my professor's system. That order of processing is critical for what the code is trying to do. It processes the files in the correct order on my system but doesn't on his, consequently producing wildly different outputs. My OS is openSUSE 15.0, his is openSUSE 15.1; he runs Python 3.6.9, I run 3.6.5.
I tried downloading everything from GitHub and running the code on my end, but I still cannot reproduce the issue he is having.
For what it's worth, here is the list of modules I am using:
os, errno, sys, logging, shutil, argparse, pandas, zipfile
What are some possible reasons for this kind of discrepancy?
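For what it's worth, os.listdir() (and os.scandir()) makes no ordering guarantee; the Python docs say entries come back in arbitrary, filesystem-dependent order, which would explain a discrepancy between two machines. A minimal sketch of forcing a deterministic order (names like data.zip and process() are hypothetical stand-ins):

import os
import zipfile

with zipfile.ZipFile("data.zip") as zf:
    zf.extractall("extracted")

# os.listdir() makes no ordering promise, so sort explicitly
# to get identical behaviour on every system.
for name in sorted(os.listdir("extracted")):
    if name.endswith(".txt"):
        with open(os.path.join("extracted", name)) as fh:
            process(fh.read())  # process() stands in for the real logic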

How to make Collab-dependent Python code self-contained?

Sorry if this question isn't in the right place, but I wouldn't qualify myself as a developer.
Overview: our organization runs a piece of critical Python code on Google Colab from Google Drive. This code processes sensitive data, and for privacy reasons we'd like to move it out of Google. There's also a performance reason: the RAM requirements are sometimes higher than the free plan allows, halting code execution.
The setup: the code takes one very large PDF file as input (in the same Google Drive directory), processes it, then spits out tens of usable XLSX files containing the relevant info, each in a different folder, since the code is expected to be run once every 4-5 weeks. The very few parameters allowed are set in a simple text file.
The problem: no one in the organization currently has the skills to edit code and maintain a compatible Python environment.
The ideal end product would be a self-contained executable with no external dependencies, able to run in any user directory. I know this is a rather inefficient use of disk space, but it's the only way to ensure proper execution every time. Primary target platforms would be Mac OS X 10.14 to current, or a Debian-based Linux distribution, more precisely Linux Mint. Windows compatibility isn't a priority.
What would be the proper way to make this code independent?
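Not an answer from the original thread, but for context: the route usually suggested for this is a freezing tool such as PyInstaller, which bundles the interpreter and all dependencies into a single executable. Note that it has to be run once per target OS (one build on macOS, one on Linux Mint). A minimal sketch, with process_pdf.py as a hypothetical script name:

pip install pyinstaller
pyinstaller --onefile process_pdf.py
# the self-contained executable appears in ./dist/process_pdf

The result still assumes a roughly compatible OS version, so building on the oldest supported macOS/Mint release is commonly advised.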

Enormous appimage created by appimage-builder

I'm packaging an application I have written into an AppImage so that it can be delivered to Linux users.
One of the key features of the GUI toolkit I'm using is that it is small and lightweight, allowing me to compile a build, statically linked to the GUI library, of around 6 MB.
However, after building the AppImage, where I do what the instructions say and exercise all the functionality (which basically amounts to using file browser dialogues to load files), it generates an absolutely enormous AppImage of around 200 MB!
I know that AppImages are supposed to be a "little bit" big, but this is completely mad as a proposed solution for portability, when the natively compiled binary, including a statically linked GUI toolkit, is only 6 MB.
However, I'm not at all convinced that I need all of that 200 MB. A very similar piece of software to mine, but one that additionally uses Qt (which is pretty bloated in comparison), is only about 30 MB. I actually suspect that appimage-builder is doing something very wrong: I think it is listing the files in the directory I browse to with the file dialogue as dependencies (they are big files). I have no other real explanation. But if so, how do I stop it doing that?
Why is mine so big? What can I do about it?
For the record, I am using this method for building my AppImage:
Building my binary separately
Running appimage-builder --generate and completing the form
Running appimage-builder --recipe AppImageBuilder.yml --skip-tests
Edit: Removing the obviously unneeded files that were being packaged has reduced the size of the AppImage to just 140 MB, but this is still almost 5 times bigger than equivalent AppImages I've seen. Are there some tricks/options I'm not aware of?
I got started with AppImage in the last few days and faced the same problem.
In short: check the dependencies of your app by every means possible and configure the recipe to include only the concrete dependencies, avoiding the inclusion of any theme/icon/etc. packages which are not really used :)
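To illustrate what configuring the recipe means here, a hypothetical excerpt of the files section of an AppImageBuilder.yml (the glob patterns are examples only; check which ones your app can really live without, and check your appimage-builder version's docs for the exact schema):

AppDir:
  files:
    exclude:
      - usr/share/man
      - usr/share/doc/*/README.*
      - usr/share/icons/*   # only if your app ships its own icons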
My case is a small app written in Dart (with Flutter). The built project itself weighs about 22 MB (du -sh . in the output directory). My host OS is Linux Mint (Cinnamon).
The first time I ran appimage-builder --generate, it generated a "recipe" with 17 packages to be installed and a bunch of libraries to be copied from /lib/x86_64-linux-gnu/. When I generated an AppImage from this recipe, the result was about 105 MB, which is extremely large in my opinion for a small app.
My first experiment was to clean up the included files section, as I guessed all the necessary libraries should be installed from apt. I referred to a few configs from the network in which only a few libraries were marked for inclusion and there was an exclude section containing some DE-related files (themes, fonts, icons, etc.).
Using that I got a result of about 50 MB (which is still quite large).
The next experiments came from this issue: https://github.com/AppImageCrafters/appimage-builder/issues/130#issuecomment-843288012
In short: after generating an AppImage, a .bundle.yml file appears inside the AppDir folder, listing the deployed libraries, and the advice is to try excluding entries from it. That may be good enough advice, but it takes far too long to check, for each package/library, whether removing it breaks the resulting AppImage, even with the official appimage-builder tests (docker containers). I faced more broken results than any sane size reduction.
My next experiment was to reduce the dependencies installed from the package manager and use files from the host system instead. I deleted the AppDir and appimage-builder-cache folders and regenerated the recipe. Next, I commented out all the packages to be installed from the package manager and left only the included files. The result was a failure, because one package was needed, but after adding it back I got an AppImage of 36 MB. That sounds much better than the starting 105 MB, or even the previous result of 50 MB.
Here I got a small "boost": a Flutter project is built into AOT binaries, without a runtime. So I checked the output of ldd for my app and then mapped the list of required libraries to the list of library files detected by appimage-builder. Some of them were correct, some were not found in the ldd output, and some were in the ldd output but not detected by appimage-builder. I added everything undetected and removed everything unused. My final result is 26 MB, and it passed all the appimage-builder tests (running in docker images of Fedora, CentOS, Debian, Ubuntu and Arch).
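For anyone wanting to reproduce that mapping step, the two lists can be produced with standard tools and compared by eye or with diff; the binary name myapp is hypothetical:

# libraries the AOT binary actually links against
ldd ./myapp | awk '/=>/ {print $3}' | sort -u
# libraries appimage-builder actually deployed
find AppDir -name '*.so*' | sort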
I understand this is bad for continuous building, because it will always require checking the used libraries and adapting the config when something changes, but for rare enough builds I guess it strikes some kind of balance between good and bad.

Python program in command line or .exe gives a MemoryError, but works fine in Spyder IDE

Version: Python 3.7 (Spyder) -- OS: Windows 10 -- System: Core i5 (6th gen) + 16 GB RAM
I wrote a program which handles a lot of data. The following structure is used to accomplish this:
Program Description
A GUI interface is used as the main function (class). An interface pops up, asks the user for input, and uses this input to make all kinds of calculations, which are specified in different functions.
The first function is an import function, in which a folder specified by the user is searched for all .wav files and their data is imported. All imported items are appended (numpy.append) to one big array.
The big array (about 2,000,000,000 datapoints for 20 files) is used to calculate characteristics of the sound files. The reason it has so many datapoints is that the sample rate of the .wav files is set to 78125 samples/s, which I need for accurate calculations.
After the calculations, 2 plots are generated in a specified folder, and 2 CSVs with the requested data are also stored in that folder.
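A side note on the import step described above (not from the original post): numpy.append copies the whole array on every call, so appending 20 files this way does a lot of redundant copying. The usual pattern is to collect per-file chunks in a list and concatenate once; a sketch with hypothetical names, assuming scipy is available for reading .wav files:

import numpy as np
from scipy.io import wavfile

chunks = []
for path in wav_paths:  # wav_paths: the user-selected .wav files (hypothetical)
    _rate, data = wavfile.read(path)
    chunks.append(data)

# one allocation instead of a copy per numpy.append call
big_array = np.concatenate(chunks)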
Problem Statement
Running the main function (program) in the Spyder environment works totally fine. It takes about 10 minutes to go through all the data and generate the output.
Compiling the function to an .exe using PyInstaller works fine: no errors, every dependency imported. However, when running the program, a MemoryError pops up almost immediately (see image below).
Image: error message from command line when executing the exe file
Tried solutions
Running the Python script via the CLI gives the same error.
Running the .exe program with only 2 files to import works fine, but incredibly slowly (much slower than when executed via Spyder).
Questions
Why does Spyder have enough memory to process all the data without problems, while executing the .py via the command line or executing the .exe file always gives a memory error?
Why does the .exe, or the .py via the CLI, run slower than in the Spyder IDE?
Goal
This program should be able to process noise data on every laptop in the company (sometimes with only 8 GB of RAM). So I want to find a way to let the program allocate all available RAM on the machine being used.
Thanks in advance, for any help!
Meanwhile I have found the answer to my question, thanks to Axe319:
The Spyder IDE was running a 64-bit version of Python, making the program run smoothly and without any issues. Nevertheless, my system's Python interpreter was still a 32-bit version.
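For anyone hitting the same thing: a 32-bit Python process on Windows can only address roughly 2 GB, which explains the immediate MemoryError regardless of the 16 GB of installed RAM. A quick check of which interpreter is actually running:

import struct
import sys

print(sys.version)                      # Windows builds include "32 bit" or "64 bit"
print(struct.calcsize("P") * 8, "bit")  # pointer size: 32 or 64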
Steps taken to solve the problem:
uninstall the 32-bit version of Python
install the 64-bit version of Python
reinstall all used packages with pip install -packages-
reinstall PyInstaller with pip install pyinstaller
compile the program to an .exe using PyInstaller
After this, everything seems to be working!

Can Xilinx ISE iMPACT write an SVF to a PicoBlaze like Adept can?

I'm midway through a VHDL class and have been able to play relatively nicely with the ISE and Digilent toolchain on Linux... until trying to reflash a PicoBlaze program. For details, I am currently running and targeting:
Fedora 21 64-bit (3.19.3-200.fc21.x86_64)
Nexys2 development board from Digilent (with a Spartan3)
Xilinx ISE 14.7
Adept 2.16.1 Runtime
Adept 2.2.1 Utilities
I've been able to run ISE and program the Nexys2 bit files with iMPACT just fine so far in Linux, but the current project is to write an assembly program for the PicoBlaze soft-core processor, compile it, and update the memory of the running design without having to resynthesize any VHDL.
Using the steps from Kris Chaplin's post, I can compile a PSM to HEX and then convert that HEX file to an SVF in DOSBox. From there I can use Digilent's Adept tool in Windows to program a top_level.bit file which has the PicoBlaze already synthesized; I could also do this with ISE's iMPACT in Linux. After the design is running, I can use Adept to program the SVF file into the running memory of the design and everything is peachy. However, trying to load the SVF into iMPACT in Linux throws an exception:
EXCEPTION:iMPACT:SVFYacc.c:208:1.10 - Data mismatch.
The only mention of that error I've found online says there is a '#' symbol that needs to be removed, but I haven't seen any '#'s anywhere in the SVF.
I also tried to convert the SVF to XSVF. iMPACT doesn't throw an error loading the XSVF, but programming/executing the XSVF freezes the design instead of running the new program.
Adept doesn't have a comparable GUI in Linux that I've seen, just a command-line tool, djtgcfg. Just as with iMPACT, I've been able to program the toplevel.bit file fine with
$ djtgcfg prog -d Nexys2 -i 0 -f ../../toplevel.bit
but attempting to program the SVF file with the same call doesn't seem to affect anything. It says it should take a few minutes but immediately reports "Programming succeeded", and I don't see any change on the device.
I'd really like to keep my environment all in Linux if I can, I don't have quite enough room on my laptop to juggle between two VMs.
Is it possible to use iMPACT to write an SVF file to the Nexys2? Or can/should I be using the Adept utility differently?
Has anyone gotten this to work? Thanks a ton!
There are many better ways to reconfigure the PicoBlaze instruction ROM without resynthesizing:
use Xilinx's data2mem tool
This tool is shipped with ISE and can patch BlockRAM contents in bit-files
=> requires FPGA reprogramming
use PicoBlaze's embedded JTAGLoader6
Enable the embedded JTAGLoader6 design in the template file. Use the JTAG_Loader_RH_64 binary or JTAG_Loader_Win7_64.exe to upload a hex file via JTAG into the PicoBlaze ROM.
=> reconfigures the ROM at runtime, no FPGA reprogramming needed
The manual from Ken Chapman devotes several pages to how to use JTAG_Loader. Additionally, have a look at the PicoBlaze discussions at forums.xilinx.com. There are some discussions of bugs and issues around JTAG_Loader and how to solve them.
Also have a look at opbasm from Kevin Thibedeau as an alternative, improved PicoBlaze assembler. It also ships with a ROM patch tool.
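For reference, a typical data2mem invocation for the first option looks like this; all file names are hypothetical, and the .bmm file must describe where the PicoBlaze instruction ROM sits in BlockRAM:

data2mem -bm program.bmm -bd program.mem -bt top_level.bit -o b top_level_new.bit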
I know it's a little late for the original poster, but I suspect I am taking the same class, and I believe I have found a solution for uploading PicoBlaze code on Linux.
Download the KCPSM3 zip file from the Xilinx IP download page, extract the contents, and move the executables from the JTAG_loader folder to your working directory.
In DOSBox, run hex2svfsetup.exe and, for the Nexys2 board, select menu options 4 - 0 - 1 - 8
Use the assembler to create the .hex file
In DOSBox, run hex2svf.exe to create the SVF file
Then run svf2xsvf.exe -d -i <input.svf> -o <output.xsvf>
Contrary to the JTAG_Loader_quick_guide.pdf in the initial zip file, use iMPACT to open the XSVF file and program with it.

Symbolic link to latest file in a folder

I have a program which requires the path to various files. The files live in different folders and are constantly updated, at irregular intervals.
When the files are updated, they change name: for instance, in the folder dir1 I have fv01 and fv02. Later in the day someone adds fv02_v1; the day after, someone adds fv03, and so on. In other words, I always have an updated file, but with a different name.
I want to create a symbolic link in my "run" folder to these files, such that said link always points to the latest file created.
I can do this in Python or Bash, but I was wondering what is out there, as this is hardly an uncommon problem.
How would you go about it?
Thank you.
Juan
PS. My operating system is Linux. I currently have a simple daemon (Python) that checks every so often (refreshing every minute) for the latest file. Seems kind of overkill to me.
Unless there is some compelling reason that you have left unstated (e.g. thousands of files in the directory), just do it the way you suggest, with a script sorting the files by modification time. There is no secret method that I am aware of.
You could write a daemon using inotify to monitor your directories and immediately set your links but that seems like overkill.
Edit: I just saw your edit. Since you already have the daemon, inotify might not be such a bad idea. It would be somewhat more efficient than constant polling, since the OS will tell you when something in your directories has changed.
I don't know Python well enough to point you to anything specific, but a wrapper for inotify must exist.
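To make the mtime approach concrete, here is a minimal Python sketch (paths hypothetical): it picks the newest file and swaps the symlink via a rename, which is atomic on POSIX, so the program never sees a missing link. If you go the daemon route instead, pyinotify and watchdog are the usual Python wrappers around inotify.

import os
from pathlib import Path

watch_dir = Path("dir1")        # folder with the rotating files
link_path = Path("run/latest")  # symlink the program reads

files = (p for p in watch_dir.iterdir() if p.is_file())
newest = max(files, key=lambda p: p.stat().st_mtime)

# Build the new link under a temporary name, then rename it over
# the old one; os.replace() is atomic on POSIX filesystems.
tmp_link = link_path.with_name(link_path.name + ".tmp")
if tmp_link.is_symlink() or tmp_link.exists():
    tmp_link.unlink()
tmp_link.symlink_to(newest.resolve())
os.replace(tmp_link, link_path)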
