I have a large number of .xlsx files downloaded from an external database which I want to work with. Each file has two worksheets: the first only has some comments on the data and the second contains the data.
I've tried opening the Excel spreadsheet using the following two options, but they both give me an error. The error disappears when I delete the first worksheet. But since I have >350 files, I don't want to delete all those worksheets manually.
The code I tried
from openpyxl import load_workbook
wb = load_workbook('/Users/user/Desktop/Data_14.xlsx')
Which gives the error:
InvalidFileException: "There is no item named 'xl/styles.xml' in the archive"
And:
from xlrd import open_workbook
book = open_workbook('/Users/user/Desktop/Data_14.xlsx')
which gives a very long error message (KeyError: 9)
I think the problem is a formula error in the first excel worksheet. One cell in the worksheet says
- minimum percentage that must characterise the path from a subject Company up to its Ultimate owner: 50.01%
but it is not formatted as text. Executing the cell gives an error message in Excel. Inserting an " ' " to make it text lets me then open the file with python which is what I want to do.
Any ideas on how I can open the excel files automatically to solve this problem?
Solution:
I've named the script delsheet.py and placed it in a directory also containing the excel files.
I'm using Python 3.4.3 and Openpyxl 2.3.0 but this should work for Openpyxl 2.0+
I am on a Mac OS X running Yosemite.
Knowing your versions and settings would be useful because openpyxl's syntax can vary depending on the version.
Worksheet names: either I overlooked it or you did not mention whether the first worksheet in each Excel file has a unique name or whether they are all named the same.
If all the first sheets are named 'Sheet1', then this script will work as is. That is how I read the question, so that is how I've written the solution; if the names differ, please clarify. Thanks.
Understanding the script:
First, the script stores the path of its own location so it knows which directory it is running from.
From that location, the script lists the files in the same directory with the file extension .xlsx, appending them to the list 'spreadsheet_list'.
A for loop then iterates over the elements of 'spreadsheet_list', once per file.
The loop loads an Excel file from the list,
removes 'Sheet1', and
saves the spreadsheet under its original filename.
delsheet.py
#!/usr/bin/env python3
# Using python 3.4.3 and openpyxl 2.3.0
# Remove the first worksheet from a batch of excel sheets
# Import Modules
import sys, os, re
from openpyxl import Workbook, load_workbook
# Create List
spreadsheet_list = []
# Script path to the directory it is located.
pcwd=os.path.dirname(os.path.abspath(__file__))
# List files in directory by file extension
# Specify directory
items = os.listdir(pcwd)
# Specify extension in "if" loop and append the files in the directory to the "spreadsheet_list" list.
for names in items:
    if names.endswith(".xlsx"):
        spreadsheet_list.append(names)
# Debugging purposes: print out the list of appended excel files in script directory
# print(spreadsheet_list)
# For loop, using the number of elements in the spreadsheet_list we can determine how long the loop should go
for i in range(len(spreadsheet_list)):
    # print(i) to see that i runs from 0 up to the number of excel files located in the directory
    # Build the full path so the script also works when called from another directory
    path = os.path.join(pcwd, spreadsheet_list[i])
    # Load workbook into memory (Opening the Excel file automatically...)
    wb = load_workbook(path)
    ## Store Sheet1 in the workbook as 'ws'
    ws = wb['Sheet1']
    ## Remove the worksheet 'ws' (newer openpyxl releases deprecate remove_sheet in favour of wb.remove(ws))
    wb.remove_sheet(ws)
    ## Save the edited excel sheets (with the original name)
    wb.save(path)
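A side note on the 'xl/styles.xml' error from the question: an .xlsx file is a zip archive, so the standard library can report which files lack that entry before any batch edit is attempted. This is only a minimal sketch, assuming the files sit in one folder; the function name files_missing_member is made up for illustration and is not part of openpyxl.

```python
import zipfile
from pathlib import Path


def files_missing_member(folder, member="xl/styles.xml"):
    """Return names of .xlsx files in `folder` whose zip archive lacks `member`."""
    missing = []
    for path in sorted(Path(folder).glob("*.xlsx")):
        try:
            with zipfile.ZipFile(path) as archive:
                if member not in archive.namelist():
                    missing.append(path.name)
        except zipfile.BadZipFile:
            # Not a readable zip archive at all; report it as well.
            missing.append(path.name)
    return missing
```

Calling files_missing_member('/Users/user/Desktop') would list the affected file names, which may help confirm whether all of the downloaded files share the problem.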
Please try this add-in to merge all 2nd sheets.
http://www.rondebruin.nl/win/addins/rdbmerge.htm
Or, run this script to delete all first sheets in all workbooks . . .
Sub Example()
Dim MyPath As String, FilesInPath As String
Dim MyFiles() As String, Fnum As Long
Dim mybook As Workbook
Dim CalcMode As Long
Dim sh As Worksheet
Dim ErrorYes As Boolean
Application.DisplayAlerts = False
'Fill in the path\folder where the files are
MyPath = "C:\Users\rshuell001\Desktop\excel\"
'Add a slash at the end if the user forgot it
If Right(MyPath, 1) <> "\" Then
MyPath = MyPath & "\"
End If
'If there are no Excel files in the folder exit the sub
FilesInPath = Dir(MyPath & "*.xl*")
If FilesInPath = "" Then
MsgBox "No files found"
Exit Sub
End If
'Fill the array(myFiles)with the list of Excel files in the folder
Fnum = 0
Do While FilesInPath <> ""
Fnum = Fnum + 1
ReDim Preserve MyFiles(1 To Fnum)
MyFiles(Fnum) = FilesInPath
FilesInPath = Dir()
Loop
'Change ScreenUpdating, Calculation and EnableEvents
With Application
CalcMode = .Calculation
.Calculation = xlCalculationManual
.ScreenUpdating = False
.EnableEvents = False
End With
'Loop through all files in the array(myFiles)
If Fnum > 0 Then
For Fnum = LBound(MyFiles) To UBound(MyFiles)
Set mybook = Nothing
On Error Resume Next
Set mybook = Workbooks.Open(MyPath & MyFiles(Fnum))
On Error GoTo 0
If Not mybook Is Nothing Then
'Delete the first worksheet in mybook (not whichever sheet happens to be active)
On Error Resume Next
mybook.Worksheets(1).Delete
If Err.Number > 0 Then
ErrorYes = True
Err.Clear
'Close mybook without saving
mybook.Close savechanges:=False
Else
'Save and close mybook
mybook.Close savechanges:=True
End If
On Error GoTo 0
Else
'Not possible to open the workbook
ErrorYes = True
End If
Next Fnum
End If
If ErrorYes = True Then
MsgBox "There are problems in one or more files, possible problem:" _
& vbNewLine & "protected workbook/sheet or a sheet/range that does not exist"
End If
'Restore ScreenUpdating, Calculation and EnableEvents
With Application
.ScreenUpdating = True
.EnableEvents = True
.Calculation = CalcMode
End With
Application.DisplayAlerts = True
End Sub
Related
I'm having an issue with the following code, which is supposed to sequentially open ~100 CSV files, check a cell value (to validate that the file has the correct structure), copy a single line of data, and paste it into ThisWorkbook.Worksheets("2 CSV").Range("B" & row_number).
This solution worked for two years until this month. Now Excel crashes randomly on any file without any error message. Sometimes it manages to loop through 20 files, sometimes 5.
The weirdest thing is that I can step through the whole thing manually using F8 without any problem.
The macro:
Sub b_load_csv()
Dim appStatus As Variant
Dim folder_path As String 'folder path to where CSVs are stored
Dim file_name As String 'file name of current CSV file
Dim row_number As Integer 'row number in target sheet
Dim source_sheet_name As String 'name of the source sheet of the CSV = CSV file name
Dim wb_src As Workbook 'variable for opened CSV source workbook
Dim sht_src As Worksheet 'variable for opened CSV source sheet
Dim sht_csv As Worksheet 'variable for target sheet in ThisWorkbook
With Application
.Calculation = xlCalculationManual
.ScreenUpdating = False
.DisplayAlerts = False
If .StatusBar = False Then appStatus = False Else appStatus = .StatusBar 'show currently processing file in status bar
End With
folder_path = "C:\Folder\SubFolder\" 'here are the files stored
file_name = Dir(folder_path & "*.csv") 'using dir to get file names
row_number = 3 'row number for pasting values
Set sht_csv = ThisWorkbook.Worksheets("2 CSV") 'target sheet for data aggregation
Do While file_name <> ""
Workbooks.Open (folder_path & file_name), UpdateLinks:=False, Local:=True 'open csv file
Set wb_src = Workbooks(file_name) 'assign opened csv file to variable
source_sheet_name = Left(file_name, InStr(file_name, ".") - 1) 'sheet name in csv is the same as the file name
Set sht_src = wb_src.Worksheets(source_sheet_name) 'assign source sheet to variable
If sht_src.Range("C1").Value2 = "OJ_POPIS" Then 'checks if the csv has the correct structure
sht_src.Range("A2:FZ2").Copy 'if so copies desired range
sht_csv.Range("B" & row_number).PasteSpecial 'and pastes it into target worksheet column B
End If
sht_csv.Range("A" & row_number).Value2 = file_name 'writes file name into column A
Application.CutCopyMode = False
wb_src.Close SaveChanges:=False
file_name = Dir() 'fetch next file name
row_number = row_number + 1
'the following lines is what I tried to fix the problem of random excel crashing
Set wb_src = Nothing
Set sht_src = Nothing
Application.StatusBar = "Processing file " & file_name
DoEvents
Application.Wait (Now + TimeValue("0:00:02"))
ThisWorkbook.Save 'save after every loaded file to see which files are causing the problem
Loop
MsgBox "Data from CSV files copied", vbOKOnly
Set sht_csv = Nothing
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
Application.DisplayAlerts = True
End Sub
Source CSV files are encoded both in UTF-8 and ANSI (my ACP is ANSI, 1250) and ; delimited.
Group policy restricting macros doesn't apply to me. I can sign my own macros.
What I tried:
Lines of code at the end of the loop
Identifying and deleting files triggering the crash (they have nothing in common and seem random; by the time I remove half of them... what is the point)
Simplifying the macro
New workbook
Different machine
VPN On/Off
Thank you for your help!
First thing I'd try is include a proper error handler (not resume next), particularly with x64, and ensure 'Break on all unhandled errors' is selected in Tools / Options / General.
Second thing I'd try is avoid using the clipboard -
With sht_src.Range("A2:FZ2")
sht_csv.Range("B" & row_number).Resize(.Rows.Count, .Columns.Count).Value = .Value
End With
(no need then to clear CutCopyMode)
Third thing I'd try is don't filter with Dir but something like this -
sFilter = "*.csv"
file_name = Dir$(folder_path, 15) ' 15 excludes vbDirectory (16), so subfolders are not returned
Do While Len(file_name)
If LCase$(file_name) Like sFilter Then ' Like is case-sensitive by default
' process file
End If
file_name = Dir$() ' later calls take no arguments
Loop
Fourth thing I'd try is a good cup of coffee!
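For comparison, the same aggregation can be done without Excel or the clipboard at all. The following is a rough Python sketch, not a translation of the macro: it assumes the files are ';' delimited and encoded either as UTF-8 or cp1250 (as described in the question), it takes the whole second row rather than exactly A2:FZ2, and the names read_rows and collect_second_rows are invented for this example.

```python
import csv
from pathlib import Path


def read_rows(path, encodings=("utf-8-sig", "cp1250")):
    """Read a ';'-delimited CSV, trying each encoding in turn."""
    for encoding in encodings:
        try:
            with open(path, newline="", encoding=encoding) as handle:
                return list(csv.reader(handle, delimiter=";"))
        except UnicodeDecodeError:
            continue
    raise ValueError(f"could not decode {path}")


def collect_second_rows(folder, marker="OJ_POPIS"):
    """Return [file name, *second-row values] for each CSV whose cell C1 equals marker.

    Files that fail the check yield just [file name], like column A in the macro.
    """
    result = []
    for path in sorted(Path(folder).glob("*.csv")):
        rows = read_rows(path)
        record = [path.name]
        has_marker = bool(rows) and len(rows[0]) > 2 and rows[0][2] == marker
        if has_marker and len(rows) > 1:
            record.extend(rows[1])
        result.append(record)
    return result
```

Because every file is opened and closed inside a with block, a failure surfaces as a normal exception naming the file, which may make a problematic CSV easier to spot than a silent crash.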
I have multiple workbooks in folder #1 and I'm trying to copy information from certain cells in one workbook to another.
The source files in the folder are .xlsm and named "1" "2" "3".... etc
The target files (which I'm trying to copy the cells to) are in another folder, are .csv, and are named "1" "2" "3".... etc
I have about 1000 files that I'm trying to copy information from, so copying them one by one will take me forever.
Assuming the files you want to copy from are in a folder C:\MyExcelFiles\ and are named 1.xlsm, 2.xlsm, and the output files should be 1.xls and 2.xls, then it is a straightforward thing to do:
Sub CopyMacro()
Dim SourceFolder As String
Dim SourceFileName As String
Dim DestinationFileName As String
Dim SourceWorkbook As String
Dim DestinationWorkbook As String
SourceFolder = "C:\MyExcelFiles\"
Application.DisplayAlerts = False ' avoid security warning
For I = 1 To 100
SourceFileName = SourceFolder & I & ".xlsm"
DestinationFileName = SourceFolder & I & ".xls" ' could be any other file
On Error Resume Next
Workbooks.Open SourceFileName, ReadOnly:=True
If Err > 0 Then
MsgBox "Could not open file :" & SourceFileName
Exit Sub
End If
SourceWorkbook = ActiveWorkbook.Name
On Error GoTo 0
ActiveWorkbook.Sheets(1).Activate ' assuming the data you want to copy is on the first sheet
Range("a1:d6").Copy
Workbooks.Add
DestinationWorkbook = ActiveWorkbook.Name
Range("a1").PasteSpecial xlPasteValues
Workbooks(DestinationWorkbook).SaveAs DestinationFileName
ActiveWorkbook.Close
Workbooks(SourceWorkbook).Close
DoEvents ' give a chance for mouse events and keyboard events to get executed
' this will also allow you to press CTRL+PAUSE if you want to stop the macro
Next
Application.DisplayAlerts = True 'Switch alerts back on
End Sub
Please keep in mind, I did not test the code. But I am sure you will be able to fix it if it has any bugs, or errors.
I have a VBA script in Excel which works fine but when saved as script_name.vbs and executed in cmd/powershell as cscript.exe script_name.vbs it throws the error:
dir_path\script_name.vbs(30, 37) Microsoft VBScript compilation error: Expected ')'
Firstly I apologise. This seems like a well-worn question but no answer I could find explains any reasons why my particular VBA script won't work.
I learnt that you cannot Dim As when running vbs on the cmd line so I removed that, and then got the above error. No question I found seems to indicate to me as to why.
Help much appreciated!
Thanks
FYI: The macro is to iterate through all files which have passwords in a folder and
Attempt a number of any possible passwords to open the file
Same again for workbook protection passwords
Unhide all worksheets
Save the file
Move onto the next file
Sub BAUProcessVBA()
Dim wb
Dim ws
Dim myPath
Dim myFile
Dim myExtension
Dim i
'Optimize Macro Speed
Application.ScreenUpdating = False
Application.EnableEvents = False
Application.Calculation = xlCalculationManual
myPath = "C:\blah\dir\"
'Target File Extension (must include wildcard "*")
myExtension = "*.xls*"
'Target Path with Ending Extention
myFile = Dir(myPath & myExtension)
'Loop through each Excel file in folder
Do While myFile <> ""
'Set variable equal to opened workbook
Debug.Print myFile
On Error Resume Next
Set wb = Workbooks.Open(Filename:=myPath & myFile, Password:="pw1", IgnoreReadOnlyRecommended:=True, ReadOnly:=False)
Set wb = Workbooks.Open(Filename:=myPath & myFile, Password:="pw2", IgnoreReadOnlyRecommended:=True, ReadOnly:=False)
On Error GoTo 0
'Ensure Workbook has opened before moving on to next line of code
DoEvents
'Remove workbook protection and, unhide all tabs, save the file
On Error Resume Next
wb.Unprotect "pw1"
wb.Unprotect "pw2"
On Error GoTo 0
On Error Resume Next
wb.Password = ""
On Error GoTo 0
For Each ws In wb.Worksheets
ws.Visible = xlSheetVisible
Next ws
'Save and Close Workbook
Application.CutCopyMode = False
wb.Close SaveChanges:=True
Application.EnableEvents = False
'Ensure Workbook has closed before moving on to next line of code
DoEvents
'Get next file name
myFile = Dir
Loop
'Reset Macro Optimization Settings
Application.EnableEvents = True
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub
You seem to be under the impression that Visual Basic for Applications (VBA) and Visual Basic Script (VBScript) are identical languages. That is not the case. They may be more closely related to each other than Visual Basic .NET (VB.NET) is to either of them, but they are still different languages.
Which is why we have different tags for all of them.
Now, to tackle your question:
VBA has got the Microsoft Office Object Library reference, which means native support for office objects.
Application doesn't exist in vbs, so you need to create that object: Set Application = WScript.CreateObject("Excel.Application")
Excel constants don't exist:
xlCalculationManual = -4135, xlCalculationAutomatic = -4105 and xlSheetVisible = -1
Dir doesn't exist, so you need to create a FileSystemObject
Named arguments don't exist, so you need commas:
Set wb = Application.Workbooks.Open(myPath & myFile, , False, , "pw1", , True)
And DoEvents doesn't exist either.
To solve this problem I have used Python to open Excel and execute the Macro I want. Below is a function that should work for anyone.
Things I have learnt: If you have VBA code in Excel and want to run it without Excel then you cannot just save this as a .vbs and execute it on the command line with cscript.exe.
VBS and VBA are different languages.
Therefore, here is a quick tutorial for those who are stuck on the same problem but are unfamiliar with Python:
Download and install Python ensuring python is added to PATH. This script was written and successfully executed with Python 3.8 64-bit for Windows. https://www.python.org/downloads/
Save the below in a file called run_macro.py
On the last line of run_macro.py, with no indentation, type what is below within Code2
Carrying on with Code2: Inside of the quotes 'like this' type in what it's asking for. The filepath_incl_filename must contain the full path AND the filename whereas filename must contain ONLY the filename. Yes, it must be provided like this.
Copy the filepath where run_macro.py is located and press win+r and type 'cmd' to open the cmd terminal, then type cd <filepath from clipboard> and press enter
Now type python run_macro.py
So long as you get no errors and it appears to "freeze" then that means it's working. Otherwise, you will need to debug the errors.
Code:
import win32com.client as wincl
def run_excel_macro(filepath_incl_filename=r'raw_filepath', filename='', module_name='', macro_name=''):
    r"""
    :param filepath_incl_filename: Must be r'' filepath to dir with filename but also include the filename in the filepath (c:\etc\folder\wb_with_macro.xlsm)
    :param filename: Filename of xlsm with the Macro (wb_with_macro.xlsm)
    :param module_name: Found inside 'Modules' of macros within that workbook
    :param macro_name: The 'sub name_here()' means macro is called 'name_here'
    :return: Nothing. Executes the Macro.
    """
    # script taken from: https://stackoverflow.com/questions/19616205/running-an-excel-macro-via-python
    # DispatchEx is required in the newest versions of Python.
    excel_macro = wincl.DispatchEx("Excel.application")
    workbook = excel_macro.Workbooks.Open(Filename=filepath_incl_filename, ReadOnly=1)
    excel_macro.Application.Run(f"{filename}!{module_name}.{macro_name}")
    # Save the results in case you have generated data
    workbook.Save()
    excel_macro.Application.Quit()
    del excel_macro
Code2
run_excel_macro(
    filepath_incl_filename=r'',
    filename='',
    module_name='',
    macro_name=''
)
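Since the filename argument has to match the last part of filepath_incl_filename, it could also be derived instead of typed twice. A small sketch of that idea, using only the standard library; macro_reference is a hypothetical helper and not part of the function above or of win32com:

```python
import ntpath


def macro_reference(filepath_incl_filename, module_name, macro_name):
    """Build the 'workbook.xlsm!Module.Macro' string from a full Windows path."""
    # ntpath understands backslash-separated Windows paths on any platform.
    filename = ntpath.basename(filepath_incl_filename)
    if not filename:
        raise ValueError("the path must end with the workbook file name")
    return f"{filename}!{module_name}.{macro_name}"


print(macro_reference(r"c:\etc\folder\wb_with_macro.xlsm", "Module1", "name_here"))
```

The resulting string has the same shape as the one passed to Application.Run in run_excel_macro; "Module1" and "name_here" are placeholder names.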
I am importing a batch of csv files from a folder all in separate worksheets, yet when I import the file, my new data loses leading zeroes for numbers and also loses its UTF-8 format. Is there any possible way to import the csv files while keeping leading zeroes and UTF-8 format?
Below is my vba
Option Explicit
Sub ImportCSVs()
Dim fPath As String
Dim fCSV As String
Dim wbCSV As Workbook
Dim wbMST As Workbook
Set wbMST = ThisWorkbook
fPath = "C:\mycsvfiles\Q3 2017\" 'path to CSV files, include the final \
Application.ScreenUpdating = False 'speed up macro
Application.DisplayAlerts = False 'no error messages, take default answers
fCSV = Dir(fPath & "*.csv") 'start the CSV file listing
On Error Resume Next
Do While Len(fCSV) > 0
Set wbCSV = Workbooks.Open(fPath & fCSV) 'open a CSV file
wbMST.Sheets(ActiveSheet.Name).Delete 'delete sheet if it exists
ActiveSheet.Move After:=wbMST.Sheets(wbMST.Sheets.Count) 'move new sheet into Mstr
Columns.AutoFit 'clean up display
fCSV = Dir 'ready next CSV
Loop
Application.ScreenUpdating = True
Set wbCSV = Nothing
End Sub
Thanks a million in advance! Let me know if I can provide additional information
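One way to check that the files themselves still hold the leading zeros and the UTF-8 text is to read them outside Excel, where every field stays a string. This is only a diagnostic sketch in Python rather than VBA, with a made-up function name and made-up sample data; it assumes comma-delimited UTF-8 input, with or without a byte order mark.

```python
import csv
import io


def read_csv_as_text(raw_bytes, delimiter=","):
    """Decode UTF-8 CSV bytes (with or without BOM) and return rows of strings."""
    text = raw_bytes.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


# Hypothetical sample content standing in for one of the CSV files.
sample = "id,name\n00123,Zo\u00eb\n".encode("utf-8")
rows = read_csv_as_text(sample)
print(rows[1])
```

If the values look right here, the loss most likely happens when the fields are converted to numbers, or decoded with a different code page, during the import.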
I have a project where I maintain a list of all my students and their information in an Excel file labeled "BigList.xlsx". Then, I have about 40-50 other separate ancillary excel files that link to BigList by using VLOOKUP.
For example, in cell A1 of an ancillary file you might see a formula that looks like this:
=Vlookup(B3,
'c:\documents and settings\user\desktop\[BigList.xlsx]Sheet1'!$a$1:$b$10000,
2,false).
The vlookup link above references BigList.xlsx. However, I just realized that I need to change that file name to something else, like MasterDatabase.xlsm (notice the different extension). Is there an easy way to do this without having to manually go through all 40-50 files and doing a find & replace?
I think the basic idea is to change a hardcoded link into a dynamic one where I can change the filename of BigList.xlsx anytime, and not have to go back through all 40-50 files to update their links.
This should do what you require - maybe not super fast but if you only need to do it once on 50 workbooks it should be good enough. Note that the replace line should make the replacement in all the sheets of the workbook.
Option Explicit
Public Sub replaceLinks()
Dim path As String
Dim file As String
Dim w As Workbook
Dim s As Worksheet
On Error GoTo error_handler
path = "C:\Users\xxxxxx\Documents\Test\"
Application.DisplayAlerts = False
Application.ScreenUpdating = False
file = Dir$(path & "*.xlsx", vbNormal)
Do Until LenB(file) = 0
Set w = Workbooks.Open(path & file)
For Each s In w.Worksheets
s.Cells.Replace What:="'THE_LINK_YOU_WANT_TO_CHANGE'!", _
Replacement:="'THE_NEW_LINK'!", LookAt:=xlPart
Next s
w.Save
w.Close
file = Dir$
Loop
Application.DisplayAlerts = True
Application.ScreenUpdating = True
Exit Sub
error_handler:
MsgBox Err.Description
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End Sub
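To make the two placeholders more concrete, here is what the text replacement amounts to for the formula from the question. It is a plain string sketch in Python, not something that edits workbooks; retarget_formula is an invented name, and the file names are the ones mentioned in the question.

```python
def retarget_formula(formula, old_name="BigList.xlsx", new_name="MasterDatabase.xlsm"):
    """Replace the bracketed workbook name inside an external reference."""
    return formula.replace(f"[{old_name}]", f"[{new_name}]")


formula = (
    r"=VLOOKUP(B3,'c:\documents and settings\user\desktop"
    r"\[BigList.xlsx]Sheet1'!$a$1:$b$10000,2,FALSE)"
)
print(retarget_formula(formula))
```

Matching on the bracketed name keeps the folder path and the sheet name untouched, so only the workbook part of the link changes.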
You can do this in Excel 2010 without using any code. (If memory serves, it will also work in earlier versions of Excel.)
Open all 50 ancillary excel files in Excel at the same time.
Open BigList.xlsx. (You now have 51 files open in Excel.)
Click File - Save As and save BigList as MasterDatabase.xlsm
Close the new MasterDatabase.xlsm file.
Look at one of the ancillary files and verify that Excel has it pointed to the new file.
Close and save all files.
This code will automate the link change directly
Update your paths to BigList.xlsx and MasterDatabase.xlsm in the code
Update your path to the 40-50 files (I have used "c:\temp\")
The code will then open both these files (for quicker relinking), then open the files in strFilePath one by one, change the link from WB1 (strOldMasterFile) to WB2 (strNewMasterFile), then save and close each file
Please note it assumes all these files are closed when the code starts, as the code will open these files
Sub ChangeLinks()
Dim strFilePath As String
Dim strFileName As String
Dim strOldMasterFile As String
Dim strNewMasterFile As String
Dim WB1 As Workbook
Dim WB2 As Workbook
Dim WB3 As Workbook
Dim lngCalc As Long
strOldMasterFile = "c:\testFolder\bigList.xlsx"
strNewMasterFile = "c:\testFolder\newFile.xlsm"
On Error Resume Next
Set WB1 = Workbooks.Open(strOldMasterFile)
Set WB2 = Workbooks.Open(strNewMasterFile)
If WB1 Is Nothing Or WB2 Is Nothing Then
MsgBox "One (or both) of " & vbNewLine & strOldMasterFile & vbNewLine & strNewMasterFile & vbNewLine & "cannot be found"
WB1.Close False
WB2.Close False
Exit Sub
End If
On Error GoTo 0
With Application
.DisplayAlerts = False
.ScreenUpdating = False
.EnableEvents = False
lngCalc = .Calculation
.Calculation = xlCalculationManual
End With
strFilePath = "c:\temp\"
strFileName = Dir(strFilePath & "*.xls*")
'Error handling as link may not exist in all files
On Error Resume Next
Do While Len(strFileName) > 0
Set WB3 = Workbooks.Open(strFilePath & strFileName, False)
WB3.ChangeLink strOldMasterFile, strNewMasterFile, xlExcelLinks
WB3.Save
WB3.Close False
strFileName = Dir
Loop
On Error GoTo 0
WB1.Close False
WB2.Close False
With Application
.DisplayAlerts = True
.ScreenUpdating = True
.EnableEvents = True
.Calculation = lngCalc
End With
End Sub