How to automate repeated executions of the same program in PyCharm? - python-3.x

I have a bunch of datasets that all need to be tested with the same .py program.
I want to automate the whole testing process, so that once one dataset has been tested and evaluated, the .py program automatically starts testing (and evaluating) the next one.
I'm using the PyCharm IDE, and I tried adding configurations that execute the same .py program but take different file paths as input.
I was wondering whether there is a tool (or a sequence of commands to run on the command line) that lets me automate this testing process, or that lets me run all the configurations I created one after the other.
Each configuration takes about 6-8 hours to run, and I currently have 4 configurations.
Thanks in advance!
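One possible approach, independent of PyCharm, is a small driver script that launches the program once per dataset and waits for each run to finish before starting the next. The sketch below is only an illustration: it assumes the program accepts the dataset path as its first command-line argument, and the names `run_all`, `evaluate.py` and the `datasets` folder are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_all(script, dataset_paths, python=sys.executable):
    """Run `script` once per dataset, strictly one after the other.

    Returns a list of (dataset_path, return_code) pairs.
    """
    results = []
    for dataset in dataset_paths:
        # subprocess.run blocks until the child process exits, so the
        # next dataset only starts after the previous one has finished.
        completed = subprocess.run([python, str(script), str(dataset)])
        results.append((str(dataset), completed.returncode))
    return results


if __name__ == "__main__":
    # Hypothetical names: replace with your own script and dataset files.
    datasets = sorted(Path("datasets").glob("*.csv"))
    for path, code in run_all("evaluate.py", datasets):
        status = "ok" if code == 0 else f"failed ({code})"
        print(f"{path}: {status}")
```

Because a failed run only records a non-zero return code, the remaining datasets are still processed, which matters when each run takes several hours.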

Related

Filewatchers: infinite loops or simple function?

That's the question.
I'm developing a bash script and I need to watch for file modifications.
This script is a daemon: it creates BIND files to add domains on my server, and it watches specific Laravel log files to do the job.
Currently, I'm using an infinite loop in which I store the file hash in a variable and compare it with the previously stored value every 2 seconds to detect modifications. Is there a better way to do it?
Thanks in advance
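For reference, the polling approach described in the question can be sketched as follows. This is written in Python rather than bash, purely as an illustration; `HashWatcher`, `watch` and the choice of SHA-256 are assumptions, not part of the original script.

```python
import hashlib
import time


def file_hash(path):
    """Return the SHA-256 hex digest of the file's current contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


class HashWatcher:
    """Remembers the last seen hash of a file (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self.previous = file_hash(path)

    def poll(self):
        """Return True if the contents changed since the last poll."""
        current = file_hash(self.path)
        changed = current != self.previous
        self.previous = current
        return changed


def watch(path, on_change, interval=2.0):
    """Loop forever, calling on_change(path) after each detected change."""
    watcher = HashWatcher(path)
    while True:
        time.sleep(interval)
        if watcher.poll():
            on_change(path)
```

The cost of this design is that every poll rereads and rehashes the whole file, even when nothing has changed.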

Best way to program flow through a job loop

I see that Origen supports passing jobs to the program command in this video. What would be the preferred method to run the program command in a job loop (i.e. job == 'ws' then job == 'ft', etc.).
thx
The job is a runtime concept, not a compile/generate time concept, so it doesn't really make sense to run the program command (i.e. generate the program) against different settings of job.
Origen doesn't currently provide any mechanism to pass define-type arguments through to the program generator from the command line, though you could implement that in your app easily enough by overriding the program command - i.e. capture and store them somewhere in your app and then continue with the regular command.
The 'Origen-way' of doing things like this is to set up different target files with different variables set within them, then execute the program command for the different targets.

Best practice for using Kiba as a batch process on files

We'd like to run Kiba as a batch process on a series of files. What would be the best structure to give a file mask, download the files from FTP, and then run the ETL job on each, sending a success or failure notification on a per file basis?
Is there a way to do this from within Kiba, or is the best practice just to handle all the non-ETL stuff externally, and then just call kiba on each file?
I would initially start with the simplest possible thing, which is, as you said, handling the files externally and then calling Kiba on each one. For example:
Build a rake task to download the files locally (and remove them from the FTP, or at least move them to a separate folder to avoid double-processing), inside a well-known folder which will act as an inbox. See here for interesting links on how to do that.
Then build another rake task to iterate over the inbox folder and process a given file (using Dir[pattern].each).
Make sure to use a helper such as:
def system!(command)
  fail "Command #{command} failed" unless system(command)
end
to make sure you detect failures in execution when making system calls.
For your ETL file itself, you would use one at_exit block to capture failure and notify accordingly (see example here with Bugsnag), and a post_process block to capture success and notify in that case.
This will definitely work and is simple. That said, there are other possibilities, such as a single ETL file which downloads the files in a pre_process block, then has a source which yields one filename per downloaded file, and maybe a transform which itself calls kiba on the command line, or even more advanced solutions.
I would stick to the simplest possible solution to get started, as always!
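The inbox loop described above can be sketched as follows. This is a Python illustration rather than a rake task, and it is not Kiba's API: `process_inbox`, `build_command` and `notify` are hypothetical names, and the actual command used to run the ETL job on a file is left to the caller.

```python
import subprocess
from pathlib import Path


def process_inbox(inbox, pattern, build_command, notify):
    """Run one command per matching file and report each outcome.

    build_command(path) returns the argv list to execute (hypothetical hook).
    notify(path, ok) is called once per file (hypothetical hook).
    """
    outcomes = {}
    for path in sorted(Path(inbox).glob(pattern)):
        try:
            # check=True raises CalledProcessError on a non-zero exit,
            # playing the same role as the system! helper above.
            subprocess.run(build_command(path), check=True)
            ok = True
        except subprocess.CalledProcessError:
            ok = False
        notify(path, ok)
        outcomes[path.name] = ok
    return outcomes
```

Catching the failure per file, instead of letting it abort the loop, is what makes a success or failure notification possible for every file in the inbox.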

When running a shell script with loops, operation just stops

NOTICE: Feedback on how this question can be improved would be great, as I am still learning. I understand there is no code; I have left it out because I am confident it does not need fixing. I have researched online a great deal and cannot find the answer to my question. My script works as it should when I change the parameters to produce fewer outputs, so I know it works. I have debugged the script and got no errors. When the parameters are changed to produce more outputs and the script runs for hours, it stops. My goal for the question below is to determine whether Linux will time out a long-running process (or something related) and, if so, how that can be resolved.
I am running a shell script that has several for loops which do the following:
- Goes through existing files and copies data into a newly saved/named file
- Makes changes to the data in each file
- Submits these files (which number in the thousands) to another system
The script is very basic (beginner here), but so long as I don't give it too much to generate, it works as it should. However, if I want it to loop through all possible cases, which means generating tens of thousands of files, then after a certain amount of time the shell script just stops running.
I have more than enough hard drive storage to support all the files being created. One thing to note however is that during the part where files are being submitted, if the machine they are submitted to is full at that moment in time, the shell script I'm running will have to pause where it is and wait for the other machine to clear. This process works for a certain amount of time but eventually the shell script stops running and won't continue.
Is there a way to make it continue or prevent it from stopping? I typed control + Z to suspend the script and then fg to resume, but it still does nothing. I check the status by typing ls -la to see if the file size is increasing, and it is not, although top/ps says the script is still running.
Assuming that you are using 'Bash' for your script - most likely, you are running out of 'system resources' for your shell session. Also most likely, the manner in which your script works is causing the issue. Without seeing your script it will be difficult to provide additional guidance, however, you can check several items at the 'system level' that may assist you, i.e.
- review system logs for errors about your process or about 'system resources'
- check your docs: man ulimit (or 'man bash' and search for 'ulimit')
- consider removing 'deep nesting' (if present); instead, create work sets where step one builds the 'data' needed for the next step, i.e. if possible, instead of:
step 1 (all files) ## guessing this is what you are doing
step 2 (all files)
step 3 (all files)
try each step for each file, something like:
# step_1, step_2 and step_3 are placeholders for your own commands
for MY_FILE in ${FILE_LIST}
do
    step_1 "${MY_FILE}"
    step_2 "${MY_FILE}"
    step_3 "${MY_FILE}"
done
:)
Dale

Run a file multiple times with different parameters

I am working in Windows, and at work I have a VB program that I need to run multiple times. It takes two input files. One of them is constant; the other changes, and a new output file is created accordingly each time the program is run.
I need to know how I can automate this. Can this be done using a batch file? I am not sure whether the VB program takes command-line inputs. How can I check, and what should I read? I don't have access to its source.
Every program that runs, runs in the shell, right? So where can I see that? Maybe I could repeat the exe execution using different parameters.
One way is to write a GUI automation script. The following Stack Exchange threads can help you get started with it:
Automate GUI tasks
https://stackoverflow.com/questions/120359/tools-for-automated-gui-testing-on-windows

Resources