How to create macro variables in python and recall them - python-3.x

I have Python code that saves its results to a destination under a specific file name. The file name changes on every run, and the job runs on a recurring basis.
Here is my code:
import csv
outfile=open('path.macrovariable.csv','w',newline='')
writer=csv.writer(outfile)
writer.writerow(["Jobname","employement","company","descrption","location"])
writer.writerows([job_name])
writer.writerows([emplomnt_type])
writer.writerows([organisation])
writer.writerows([job_descrption])
writer.writerows([job_location])
In the outfile path, the "macrovariable" part of the file name changes every time. I want to define a variable at the top of the program and use it later in place of "macrovariable", instead of hardcoding the name.
Thanks & regards,
Sanjay.
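Python has no macro variables as such; an ordinary variable defined once at the top of the script and formatted into the file name does the same job. A minimal sketch, assuming a hypothetical output folder and run name (OUTPUT_DIR, RUN_NAME, build_output_path and write_jobs are illustrative names, not part of the original code):

```python
import csv
from pathlib import Path

# Hypothetical values: set these once at the top of the script.
OUTPUT_DIR = Path("output")   # assumed destination folder
RUN_NAME = "macrovariable"    # the part of the file name that changes per run


def build_output_path(output_dir, run_name):
    """Return the CSV path for one run, e.g. output/macrovariable.csv."""
    return Path(output_dir) / f"{run_name}.csv"


def write_jobs(path, rows):
    """Write the header plus one row per job to the given CSV path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["Jobname", "employment", "company", "description", "location"])
        writer.writerows(rows)
```

Calling write_jobs(build_output_path(OUTPUT_DIR, RUN_NAME), rows) then writes to whatever name RUN_NAME holds, so only the one line at the top needs to change between runs.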

Related

Avoiding for loops when working with folders in Python

The code below is an attempt at a minimal reproducible example. It relies on the folders (folder_source and folder_target) and files (file_id1.csv, file_id2.csv). The code loads a csv from a directory, changes the name, and saves it to another directory.
The code works fine. I would like to know if there is a way of avoiding the nested for loop.
Thank you!
import pandas as pd

list_of_file_paths = ['C:\\Users\\user\\Desktop\\folder_source\\file_id1.csv', 'C:\\Users\\user\\Desktop\\folder_source\\file_id2.csv']
list_of_variables = ['heat', 'patience', 'charmander']
target_path = 'C:\\Users\\user\\Desktop\\folder_target\\'
for filepath_load in list_of_file_paths:
    for variable in list_of_variables:
        df_loaded = pd.read_csv(filepath_load)  # grab one of the csv files in the source folder
        id_number = filepath_load.split(".")[0].split("_")[-1]  # extract the id from the csv file name
        df_loaded.to_csv(target_path + id_number + '_' + variable + '.csv', index=False)  # rename the file and save it into another folder
You're looking for the Cartesian product of the two lists, I guess?
from itertools import product
for filepath_load, variable in product(list_of_file_paths, list_of_variables):
    df_loaded = pd.read_csv(filepath_load)
    id_number = filepath_load.split(".")[0].split("_")[-1]
    df_loaded.to_csv(target_path + id_number + '_' + variable + '.csv', index=False)
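To see what product actually yields, here is a small sketch with made-up stand-ins for the two lists (file_ids and variables are illustrative, not the question's data):

```python
from itertools import product

# Made-up stand-ins for the two lists in the question.
file_ids = ["id1", "id2"]
variables = ["heat", "patience"]

# product() yields every (file_id, variable) pair, in the same order as a
# nested loop with file_ids on the outside and variables on the inside.
pairs = list(product(file_ids, variables))
target_names = [f"{file_id}_{variable}.csv" for file_id, variable in pairs]
print(target_names)
```

It flattens the two loops into one, but the number of iterations is the same as in the nested version.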
But as Roland Smith says, you have some redundancy here. I'd prefer his code, which has two loops but the minimal amount of I/O and computation.
If you really want to save each file into three identical copies with a different name, there is really no alternative.
Although I would move the inner loop down, removing redundant file reads.
for filepath_load in list_of_file_paths:
    df_loaded = pd.read_csv(filepath_load)
    id_number = filepath_load.split(".")[0].split("_")[-1]
    for variable in list_of_variables:
        df_loaded.to_csv(target_path + id_number + '_' + variable + '.csv', index=False)
Additionally, consider using shutil.copy since the source file is not modified:
import shutil
for filepath_load in list_of_file_paths:
    # No need to parse the csv with pandas: the file is copied as-is
    id_number = filepath_load.split(".")[0].split("_")[-1]
    for variable in list_of_variables:
        shutil.copy(filepath_load, target_path + id_number + '_' + variable + '.csv')
That would employ the operating system's buffer cache, at least for the second and third write.

How to read the most recent Excel export into a Pandas dataframe without specifying the file name?

I frequent a real estate website that shows recent transactions, from which I will download data to parse within a Pandas dataframe. Everything about this dataset remains identical every time I download it (regarding the column names, that is).
The name of the Excel output may change, though. For example, if I have already downloaded a few of these into my Downloads folder, the exported file may be named "Generic_File_(3)" or "Generic_File_(21)", because older "Generic_File" exports are still in that folder.
Ideally, I'd like my workflow to look like this: export this Excel file of real estate sales, then run a Python script to read in the most recent export as a Pandas dataframe. The catch is, I don't want to have to go in and change the filename in the script to match the number appended to the Excel export every time. I want the pd.read_excel method to simply read the "Generic_File" that is appended with the largest number (which will obviously correspond to the most recent export).
I suppose I could always just delete old exports out of my Downloads folder so the newest, freshest export is always named the same ("Generic_File", in this case), but I'm looking for a way to ensure I don't have to do this. Are wildcards the best path forward, or is there some other method to always read in the most recently downloaded Excel file from my Downloads folder?
I would use the os module and write a function that reads the file names in the Downloads folder. By parsing the file names as strings, you can then find the file that follows your specified format and has the highest copy number. Something like the following might help you get started.
import os
downloads = os.listdir('C:/Users/[username here]/Downloads/')
# Keep only entries that look like files (names containing a dot)
files = [item for item in downloads if '.' in item]
# ** INSERT CODE HERE TO IDENTIFY THE FILE OF INTEREST **
Regex might be the best way to find matches if you have a diverse listing of files in your downloads folder.
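One way the missing step could be filled in is sketched below. It assumes the exports are named like "Generic_File.xlsx" or "Generic_File_(3).xlsx"; the pattern, the extension and the function names are assumptions to adapt to the real export names. A second helper picks the file by modification time instead, which sidesteps parsing the number altogether.

```python
import re
from pathlib import Path

# Assumed naming scheme: "Generic_File.xlsx", "Generic_File (3).xlsx" or
# "Generic_File_(3).xlsx". Adjust the pattern to the real export names.
EXPORT_PATTERN = re.compile(r"^Generic_File(?:[ _]?\((\d+)\))?\.xlsx?$")


def copy_number(filename):
    """Return the copy number in the name (0 if none), or None if no match."""
    match = EXPORT_PATTERN.match(filename)
    if match is None:
        return None
    return int(match.group(1)) if match.group(1) else 0


def latest_by_copy_number(filenames):
    """Pick the matching file name with the highest copy number."""
    candidates = [name for name in filenames if copy_number(name) is not None]
    if not candidates:
        return None
    return max(candidates, key=copy_number)


def latest_by_mtime(folder, pattern="Generic_File*.xls*"):
    """Alternative: pick the most recently modified matching file in a folder."""
    matches = list(Path(folder).glob(pattern))
    if not matches:
        return None
    return max(matches, key=lambda p: p.stat().st_mtime)
```

The result of either helper can then be passed to pd.read_excel. Going by modification time is usually the more robust choice, since a copy number can be reused once older exports are deleted.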

Creating new text documents that have not existed before

I am using python to create a program that takes attendance. I want it to create a new text document each day that is saved as the date of that day. I know how to write into a file that already exists, but do not know how to create a new one through python. I also do not know how to make it create a new one each new day. Thanks guys!
This can be done easily: you just need to use the date as the file name, which can be computed like this:
import time
timestr = time.strftime("%Y%m%d-%H%M%S")
with open(timestr, 'w+') as f:
    f.write('XXXXX')
    # Write whatever else is needed
A file will be created with the proper timestamp.
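Because the timestamp above includes hours, minutes and seconds, every run creates a different file. For one file per day, a variation is to name the file after the date only and open it in append mode, which creates the file on the first write of the day and adds to it afterwards. A minimal sketch (attendance_path and record_attendance are illustrative names, and the .txt extension is an assumption):

```python
import datetime
from pathlib import Path


def attendance_path(folder, day=None):
    """Return the path of the attendance file for the given day (default: today)."""
    day = day or datetime.date.today()
    return Path(folder) / f"{day.isoformat()}.txt"


def record_attendance(folder, name, day=None):
    """Append a name to that day's file, creating the file if it does not exist yet."""
    path = attendance_path(folder, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "a") as f:
        f.write(name + "\n")
    return path
```

Opening with "a" rather than "w" matters here: "w" would wipe the day's earlier entries each time the program runs.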

Trigger function in Excel when external CSV file is updated or batch file is finished

I have an excel workbook that uses a hotkey that launches a batch file, which launches a Node script, which updates a CSV file. Technical details on that are further below.
The workbook uses the CSV file as a data source. I can manually update the Workbook with the data from the CSV file by going to Data > Refresh All > Refresh All.
Is there any way to trigger an update in the workbook once there is new data in the CSV file, or when the batch file finishes? Conceptually, I'm asking how an external event can trigger something in Excel.
Here are fine details on the process:
When a hotkey is pressed in the Excel workbook, it launches MS console ("cmd.exe") and passes the location of a batch file to be run and the value of the selected cell. The reason the batch file is run this way is probably not relevant to this question but I know it will be asked, so I'll explain: The batch file is to be located in the same directory as the workbook, which is not to be a hard-coded location. The problem is that launching a batch-file/cmd.exe directly will default to a working directory of C:\users\name\documents. So to launch the batch file in the same directory as the workbook, the path of the workbook is passed along to cmd.exe like so: CD [path] which is then concatenated inline with another command to launch the batch file with the value of the selected cell as an argument like so: CD [path] & batch.bat cellValue
Still with me?
The batch file then launches a Node script, again with the selected cell value as an argument.
The Node script pulls data from the web and dumps it into a CSV file.
At this point, the workbook still has outdated data, and needs to be Refreshed. How can this be automatic?
I could just start a static timer in VBA after the batch file is launched, which then runs ActiveWorkbook.RefreshAll, but if the batch file takes too long, there will be issues.
I found a solution, although it may not be the most efficient way.
Right now, after Excel launches the batch file, I have it set to loop and repeatedly check the date modified of the CSV file via FileDateTime("filename.csv")
At first, this looping was an issue because I was worried about Excel excessively checking the date modified of the CSV. I thought it may cause issues with resources while it checks however many hundred or thousands of times a second. I could add a 1 second delay with the sleep or wait functions, but those cause Excel to hang. It would be frozen until the CSV files were updated, if at all. The user would have to use CTRL+BREAK in an emergency.
I was able to use a solution that just loops and performs DoEvents while checking until a certain amount of time has passed. This way, Excel is still functional during the wait. More info on that here: https://www.myonlinetraininghub.com/pausing-or-delaying-vba-using-wait-sleep-or-a-loop
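The polling-with-timeout idea is language independent. The sketch below expresses it in Python purely for illustration; it is not the VBA from the answer. In VBA, the modification time would come from FileDateTime and the pause between checks would be a DoEvents loop, as described above. The function name and default values are assumptions.

```python
import os
import time


def wait_for_update(path, previous_mtime, timeout=30.0, interval=0.5):
    """Poll the file's modification time until it changes or the timeout expires.

    Returns True if the file was modified, False if the timeout passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path) and os.path.getmtime(path) != previous_mtime:
            return True
        if time.monotonic() >= deadline:
            return False
        # Short pause so the loop does not check thousands of times a second.
        time.sleep(interval)
```

The timeout guarantees the loop ends even if the CSV is never rewritten, which is the failure case a fixed timer or an unbounded loop handles badly.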

Run a VBA macro in Spotfire using Ironpython

I would have asked in this thread, IronPython - Run an Excel Macro, but I don't have enough reputation.
Roughly following the code given in the link, I created some code which saves a file to a specific location, then opens a workbook that exists there and calls the macros within it. The macros perform a small amount of manipulation on the data I downloaded to .xls, to make it more presentable.
Now I've isolated the problem to this particular part of the code (below).
Spotfire is normally not that informative, and it gives me very little to go on here. It seems to be something .NET related but that's about all I can tell.
The Error Message
Traceback (most recent call last):
  File "Spotfire.Dxp.Application.ScriptSupport", line unknown, in ExecuteForDebugging
  File "", line unknown, in
StandardError: Exception has been thrown by the target of an invocation.
The Script
from Spotfire.Dxp.Data.Export import DataWriterTypeIdentifiers
from System.IO import File, Directory
import clr
clr.AddReference("Microsoft.Office.Interop.Excel")
import Microsoft.Office.Interop.Excel as Excel
excel = Excel.ApplicationClass()
excel.Visible = True
excel.DisplayAlerts = False
workbook = excel.Workbooks.Open('myfilelocation')
excel.Run('OpenUp')
excel.Run('ActiveWorkbook')
excel.Run('DoStuff')
excel.Quit()
Right, so I'm answering my own question here, but I hope it helps somebody. As far as I'm aware, the above code was perfectly fine but didn't play well with the way my Spotfire environment is configured. However, after going for a much more macro-heavy approach I was able to find a solution.
On the Spotfire end I have two input fields: one gives the file path, the other the file name. Then there's a button to run the script below. The two values are joined together in the script but are crucially kept separate, as the file name needs to be written into a separate file, which the main macro reads so that it can find the location of the file.
Fundamentally I created three xml files to achieve this solution. The first is the main one, which contains all of the main VBA code. It looks up a folder recorded in another xml file, which this script populates, to find the save location of the file specified in an input field in Spotfire. After finding the most recent file using a custom-made function, it runs a simple series of conditional colour operations on the cell values in question.
This script populates two xml files and tells the main macro to run; the code in that macro automatically closes Excel after it is done.
The third xml file holds a series of colour rules, so users can define their own depending on their dashboard. This gives it some flexibility for customisation. It's not necessary, but it's a potential request from some users, so I decided to beat them to the punch.
Anyway, code is below.
from Spotfire.Dxp.Data.Export import DataWriterTypeIdentifiers
from System.IO import File, Directory
import clr
clr.AddReference("Microsoft.Office.Interop.Excel")
import Microsoft.Office.Interop.Excel as Excel
from Spotfire.Dxp.Data.Export import *
from Spotfire.Dxp.Application.Visuals import *
from System.IO import *
from System.Diagnostics import Process
# Input field which takes the name of the file you want to save
name = Document.Properties['NameOfDocument']
# Document property that takes the path
location = Document.Properties['FileLocation']
# just to debug to make sure it parses correctly. Declaring this in the script
# parameters section will mean that the escape characters of "\" will render in a
# unusable way whereas doing it here doesn't. Took me a long time to figure that out.
print(location)
# Gives the file the correct extension.
# Couldn't risk leaving it up to the user.
newname = name + ".xls"
#Join the two strings together into a single file path
finalProduct = location + "\\" + newname
#initialises the writer and filtering schema
writer = Document.Data.CreateDataWriter(DataWriterTypeIdentifiers.ExcelXlsDataWriter)
filtering = Document.ActiveFilteringSelectionReference.GetSelection(table).AsIndexSet()
# Writes to file
stream = File.OpenWrite(finalProduct)
# Here I created a separate xls which would hold the file path. This
# file path would then be used by the invoked macro to find the correct folder.
names = []
for col in table.Columns:
    names.append(col.Name)
writer.Write(stream, table, filtering, names)
stream.Close()
# Location of the macro. As this will be stored centrally
# it will not change so it's okay to hardcode it in.
runMacro = r"File location\macro name.xls"  # raw string so the backslash is kept literally
# uses System.Diagnostics to run the macro I declared. This will look in the folder I
# declared above in the second xls, auto run a function in vba to find the most
# up to date file
p = Process()
p.StartInfo.FileName = runMacro
p.Start()
Long story short: to run Excel macros from Spotfire, one solution is to use the System.Diagnostics approach shown above and simply have your macro set to auto-run.
