Umlauts in Excel data - excel

I am reading data from Excel, and if the text in a cell contains umlauts (äöü), my Perl script does not read them correctly. Each such character is replaced by a substitution character.
What do I need to do to correctly read special characters from Excel?
# get reference to Excel, Active Window, Active Sheet
my $excel = Win32::OLE->GetActiveObject('Excel.Application');
my $book = $excel -> ActiveWindow;
my $sheet = $book -> ActiveSheet();
my $text = $sheet->Cells(1, 2)->{Value};

It works for me (Windows 10, Strawberry Perl 5.30) when printing the content to the Windows command prompt window and using STDOUT encoding cp437:
use feature qw(say);
use strict;
use warnings;
use Win32::OLE;
use open ':std', ':encoding(cp437)';
# get reference to Excel, Active Window, Active Sheet
my $excel = Win32::OLE->GetActiveObject('Excel.Application');
my $book = $excel -> ActiveWindow;
my $sheet = $book -> ActiveSheet();
my $text = $sheet->Cells(1, 1)->{Value};
say $text;
Output:
äöü
Edit:
As noted by @ikegami, you should determine the console output code page programmatically (instead of hardcoding the value cp437 as I did), like this:
use Win32;
my $coe = "cp" . Win32::GetConsoleOutputCP();
binmode STDOUT, ":encoding($coe)";
See also this post for more information.
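To see why the code page matters, here is a minimal sketch using only the core Encode module. The helper names console_layer and hex_bytes are hypothetical, and the fallback to UTF-8 when the Win32 module is unavailable is an assumption, not part of the answer above. It shows that the same three characters become different bytes under cp437 and cp1252, which is why a mismatched layer produces substitution characters.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build a PerlIO layer that matches the console's
# output code page. Falls back to UTF-8 where Win32 is not available
# (an assumption made so the sketch also runs outside Windows).
sub console_layer {
    my $cp = eval { require Win32; Win32::GetConsoleOutputCP() };
    return $cp ? ":encoding(cp$cp)" : ':encoding(UTF-8)';
}

# Hypothetical helper: hex dump of the bytes a string becomes in an encoding.
sub hex_bytes {
    my ($encoding, $string) = @_;
    my $bytes = encode($encoding, $string);
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $text = "\x{E4}\x{F6}\x{FC}";    # "äöü" as Unicode code points

binmode STDOUT, console_layer();
print "cp437:  ", hex_bytes('cp437',  $text), "\n";    # 84 94 81
print "cp1252: ", hex_bytes('cp1252', $text), "\n";    # E4 F6 FC
```

If the console uses a code page other than the one named in the layer, the bytes written will not map back to äöü on screen.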

Related

How to read Excel file in Perl by sheet name

I am looking for some examples/advice on how to write a Perl script
to read data from an Excel file by sheet name and not sheet number.
This is an example with Spreadsheet, but it doesn't work with sheet name:
#Code Perl :
use Spreadsheet::Read qw(ReadData);
{
my $book = ReadData ("test.xls");
my $sheet = $book->sheet ("name_3");
my @rows = rows ($sheet);
...
}
Can you help me please?
It works for me when I use the OO API:
use warnings;
use strict;
use Spreadsheet::Read;
my $book = Spreadsheet::Read->new('test.xls');
my $sheet = $book->sheet('Sheet1');
my @rows = $sheet->rows();
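If you would rather stay with the functional ReadData interface, the structure it returns keeps a name-to-index map in its control entry, $book->[0]{sheet}, so a name can be translated to a sheet number. The sketch below assumes that documented layout. The helper sheet_by_name is hypothetical, and the hand-built $book only stands in for ReadData("test.xls") so the example runs without a file.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sheet hash for a given sheet name, or undef.
# $book is the array ref returned by Spreadsheet::Read's ReadData();
# element 0 is the control hash, whose {sheet} entry maps names to indexes.
sub sheet_by_name {
    my ($book, $name) = @_;
    my $index = $book->[0]{sheet}{$name};
    return defined $index ? $book->[$index] : undef;
}

# Stand-in for ReadData("test.xls"): same shape, made-up contents.
my $book = [
    { sheets => 2, sheet => { Sheet1 => 1, name_3 => 2 } },
    { label => 'Sheet1', A1 => 'id',  A2 => '17' },
    { label => 'name_3', A1 => 'key', A2 => 'alpha', A3 => 'beta' },
];

my $sheet = sheet_by_name($book, 'name_3') or die "No such sheet\n";
print "$_\n" for map { $sheet->{"A$_"} } 2 .. 3;    # prints alpha, then beta
```

The same lookup answers the "sheet number versus sheet name" case: $book->[ $book->[0]{sheet}{$name} ]{A2} reads cell A2 of the named sheet.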

perl script to read an xlsx file(which has many sheets) using the sheet name

I am trying to write a Perl script that reads an Excel file (which has many sheets in it) using the sheet name.
I know how to access a particular sheet of the excel file using the sheet number, but not sure how to read it using sheet name.
Any help provided is highly appreciated.
Below is the code I wrote to access the sheet using sheet number:
my $Sheet_Number = 26;
my $workbook = ReadData("<path to the excel file>");
for (my $i =2; $i<$limit; $i++){
my $cell = "A" . $i;
my $key_1 = $workbook->[$Sheet_Number]{$cell};
}
Thanks
----Edit----
I want to open the particular sheet within the Excel file by using the sheet name. And then read the data from that particular sheet. The name of the sheet will be entered by the user while running the script from the command line arguments.
Below is the code that I am using after getting suggested answers for my earlier question:
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse("$path");
my $worksheet;
if ($worksheet->get_name() eq "$Sheet_Name"){
for (my $i =2; $i<$limit; $i++){
my $cell = $worksheet->get_cell($i,"A");
my $value = $cell->value();
push @array_keys, $value;
}
}
I want to read the values of Column A of the particular sheet and push it into an array.
$Sheet_Name : It is the name of the sheet which is entered by the user as cmd line arg.
$path : It is the complete path to the Excel file
Error Message: Can't call method "get_name" on an undefined value at perl_script.pl (The error points to the line where the if-condition is used.)
Thanks for the help.
-----EDIT----
Anyone, with any leads on this post, please post your answer or suggestions. Appreciate any responses.
Thanks
The get_name() method of the worksheet object, in conjunction with Perl's grep command should get you this:
my ($worksheet) = grep { $_->get_name() eq 'Sheet2' } $workbook->worksheets();
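This would be an un-golfed version of the same:
my $worksheet;
foreach my $candidate ($workbook->worksheets()) {
if ($candidate->get_name() eq 'Sheet2') {
$worksheet = $candidate;
last;
}
}
A separate loop variable is needed because Perl restores a foreach loop variable to its previous value when the loop exits. If no sheet matches, $worksheet stays undefined in both versions, so check it before calling methods on it.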
-- Edit --
I had assumed the workbook was already loaded. You do need to parse it first:
use strict;
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse('/var/tmp/foo.xls');
Then the code above should work.
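Putting the pieces together for the question's goal (column A of a sheet named on the command line), a minimal sketch could look like this. The helper column_values is hypothetical. Note that get_cell() takes numeric, zero-based row and column indexes, so column A is 0 rather than "A", and that it returns undef for empty cells. The sketch also uses the workbook's worksheet() method, which accepts a sheet name. Spreadsheet::ParseExcel reads .xls files; for .xlsx, a parser with the same interface such as Spreadsheet::ParseXLSX would be needed, which is an assumption about your setup.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the values of one column (0-based index)
# from a Spreadsheet::ParseExcel worksheet, starting at a 0-based row.
sub column_values {
    my ($worksheet, $col, $first_row) = @_;
    my (undef, $row_max) = $worksheet->row_range();
    my @values;
    for my $row ($first_row .. $row_max) {
        my $cell = $worksheet->get_cell($row, $col);
        push @values, $cell->value() if defined $cell;    # skip empty cells
    }
    return @values;
}

# Usage: perl script.pl <path to .xls file> <sheet name>
if (@ARGV == 2) {
    require Spreadsheet::ParseExcel;
    my ($path, $sheet_name) = @ARGV;
    my $workbook = Spreadsheet::ParseExcel->new->parse($path)
        or die "Cannot parse $path\n";
    my $worksheet = $workbook->worksheet($sheet_name)
        or die "No sheet named '$sheet_name'\n";
    my @array_keys = column_values($worksheet, 0, 1);     # column A, from row 2
    print "$_\n" for @array_keys;
}
```

Checking the return values of parse() and worksheet() avoids the "Can't call method ... on an undefined value" error from the question.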

Process .xlsx to csv with Powershell using rename and set delimiter

I have an Excel file that I receive and want to process it to a CSV using Powershell.
I have to alter it quite specifically so it can be a reliable input for a program that will process the csv info.
I don't know the exact headers, but I know there can be duplicates.
What I do is open the xlsx file with excel and save it as CSV:
$objExcel = New-Object -ComObject Excel.Application
$objExcel.Visible = $True
$objExcel.DisplayAlerts = $True
$Workbook = $objExcel.Workbooks.open($xlsx1)
$WorkSheet = $WorkBook.sheets.item($sheet)
$xlCSV = 6
$Workbook = $objExcel.Workbooks.open($xlsx2)
$WorkSheet = $WorkBook.sheets.item($sheet)
$WorkBook.SaveAs($csv2,$xlCSV)
Now, the XLSX file will have commas, so first I want to change them to dots.
I tried this, but it's not working:
$objRange = $worksheet.UsedRange
$objRange.Replace ",", "."
It errors out saying: Unexpected token '", "'.
Then when saving I want to set the Delimiter to comma, as it uses ";" standard.
With something like:
$WorkBook.SaveAs($csv2,$xlCSV) -delimiter ","
The last problem is the duplicate headers; they prevent PowerShell from using Import-CSV. Here is what I tried; it works when the file is separated with commas:
Get-Content $downloads\BBKS_DIR_AUTO_COMMA.csv -totalcount 1 >$downloads\Headers.txt
But then I need to rename the duplicate names; for example, I can have Regio, Regio, Regio.
I want to change this to Regio, Regio2, Regio3.
My plan was to look up the data in the txt file, search for duplicates, and then add an incremental number.
In the end I need to add a column with incremental numbers, always four digits wide, like 0001, 0002, 0010, 0020, 0200, 1500. I won't exceed 9999. How can this be done?
If you can help me, even only partially, I'd be very happy.
Further, I'm running Windows 7 x64, PowerShell 3.0, Excel 2016 (if relevant).
If it is easier, it's fine to go back to the Command Prompt for some tasks.
Personally, I wouldn't try and work with Excel sheets via Excel itself and COM - I'd use the excellent module https://github.com/dfinke/ImportExcel
Then you can import from the sheet straight to a native Powershell object array, and re-export with Export-Csv -Delimiter.
Edit: To answer follow ups :
Once you've loaded the module you can do "Get-Module ImportExcel | Select-Object -ExpandProperty ExportedCommands" to see what it makes available.
To import your Excel in the first place, do something like :
$WorkBook = Import-Excel
And if you need to take care of duplicate column names, you can do :
$WorkBook = Import-Excel -Header @("Regio1", "Regio2", "Regio")
Where the array you pass to -Header needs to include every column you want from the workbook.

How to save Excel file in working directory in Win32::OLE

I am trying to parse an Excel file in perl. After I extract the required info from it, I close the Excel file. At the end I am trying to save a new Excel file with a different name in the same directory. But this Excel is getting stored in 'My Documents' folder.
use Storable ;
use Cwd;
use Win32::OLE ;
use Win32::OLE qw(in with) ;
use Win32::OLE in ;
use Win32::OLE::Const 'Microsoft Excel';
use Excel::Writer::XLSX;
my $Excel = Win32::OLE->new("Excel.Application");
my $excel = $Excel->Workbooks->Add();
my $sheet = $excel->Worksheets(1);
$sheet->Activate();
my $new_file = "Temp_file.xlsm";
my $new_excel = cwd.'\\'.$new_file;
$new_excel =~ s/\//\\/g;
$excel->SaveAs($new_excel);
$Excel->{DisplayAlerts} = 0;
$excel->{Saved} = 1;
$excel->Close;
Here is an update based on your code. First, you are letting Win32::OLE errors be silently ignored. Instead, set $Win32::OLE::Warn = 3 so it croaks whenever something goes wrong. Second, the way you try to obtain an Excel.Application instance is not correct. For one thing, if something goes wrong, instances of Excel will remain floating around.
You are also confusing an Excel instance, a workbook, and the sheets it contains. If you did have $Win32::OLE::Warn = 3, you would have received a notification of where things are going wrong.
You should also always have use strict and use warnings in your script. Others will be more inclined to try to help if they know your problem is not caused by some trivial typo.
You also don't need three separate use Win32::OLE statements.
The code below "works". Compare it to yours.
Finally, if you are manipulating your sheet via Win32::OLE, there is no reason to have Excel::Writer::XLSX in your code.
use feature 'say';
use strict;
use warnings;
use File::Spec::Functions qw( rel2abs );
use Win32::OLE qw(in with) ;
use Win32::OLE::Const 'Microsoft Excel';
$Win32::OLE::Warn = 3;
my $excel = eval {
Win32::OLE->GetActiveObject('Excel.Application');
} || Win32::OLE->new('Excel.Application', sub { $_[0]->Quit });
my $wb = $excel->Workbooks->Add;
my $sheet = $wb->Worksheets->Add;
$wb->SaveAs( rel2abs('temp_file.xlsm') );
$excel->{DisplayAlerts} = 0;
$wb->{Saved} = 1;
$wb->Close;
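The key change for the "saved in My Documents" symptom is passing SaveAs an absolute path: a bare file name appears to be resolved against Excel's own default folder, not the script's working directory. As a minimal sketch using only core modules, the path handling from the answer can be tried on its own. The helper name excel_path is hypothetical, and the backslash conversion on Windows is an assumption added for safety.

```perl
use strict;
use warnings;
use File::Spec::Functions qw( rel2abs file_name_is_absolute );

# Hypothetical helper: turn a file name into the absolute path handed to SaveAs.
sub excel_path {
    my ($name) = @_;
    my $path = rel2abs($name);                   # resolved against the script's cwd
    $path =~ tr{/}{\\} if $^O eq 'MSWin32';      # use backslashes on Windows
    return $path;
}

my $target = excel_path('temp_file.xlsm');
print "$target\n";    # the value to pass to $wb->SaveAs(...)
```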

Using Perl to SaveAsXMLData on Excel 2010 Worksheet

I've an Excel 2010 spreadsheet with an XML map defined within it. Using Perl I want to save the worksheet as XML Data. I do not need to export the XML map file. From within Excel I can select "File > Save As > Save as type : XML Data". This is the output I want to create, but from my Perl script.
I can output the worksheet in CSV format using the SaveAs command with enum 6. I can also output the spreadsheet in XML format using SaveAs with enum 46, but this is not what I want. I want just the XML Data.
There appears to be a SaveAsXMLData function but I'm unable to get it working. Any help appreciated.
use strict;
use warnings;
use Win32::OLE qw(in with);
use Win32::OLE::Const 'Microsoft Excel';
use Win32::OLE::Variant;
use Win32::OLE::NLS qw(:LOCALE :DATE);
$Win32::OLE::Warn = 3; # Die on Errors.
my $Excel = Win32::OLE->GetActiveObject('Excel.Application')
|| Win32::OLE->new('Excel.Application', 'Quit');
$Excel->{DisplayAlerts}=0;
my $excel_file = 'c:\\temp\\master.xlsx';
my $csv_file = 'c:\\temp\\master.csv';
my $xml_file = 'c:\\temp\\master.xml';
my $workbook = $Excel->Workbooks->Open($excel_file);
# Alt+F11 in Excel to start VBA and after that F2 to start Object browser.
# 6 is CSV format
# 46 is XML spreadsheet
$workbook->SaveAs( $csv_file, 6 );
# Now just the XML Data
# The map is called MDBAC_Map
my $objMapToExport = $Excel->Workbooks->XmlMaps("MDBAC_Map");
$workbook->SaveAsXMLData( $xml_file, $objMapToExport );
$workbook->Close();
$Excel->Quit();
Fixed this myself (I was 99% there!). Using the macro recorder within Excel confirmed the required function calls as follows:
ChDir "C:\temp"
ActiveWorkbook.SaveAsXMLData Filename:="C:\temp\master.xml", Map:= _
ActiveWorkbook.XmlMaps("MDBAC_Map")
The line of code for exporting the XML map is wrong. Changed the above code as follows and the script works fine:
my $objMapToExport = $workbook->XmlMaps("MDBAC_Map");

Resources