Delete entire row using Spreadsheet::ParseXLSX - excel

I'm new to spreadsheet parsers in general, and can't find much info on CPAN other than a basic introduction of the main features.
I'm trying to read in a .xlsx file and delete an entire row if column 2 exists in a hash that I'm filtering against.
Then I want to write out the edited file, also as .xlsx.
This is what I could put together from the CPAN documentation for Spreadsheet::ParseExcel:
use strict;
use warnings;
use Spreadsheet::ParseXLSX;
my $parser = Spreadsheet::ParseXLSX->new;
my $workbook = $parser->parse("file.xlsx");
for my $worksheet ( $workbook->worksheets() ) {
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();
    for my $row ( $row_min .. $row_max ) {
        # Here I want to delete an entire row if column 2 of that row matches a value
        # pseudocode:
        # delete 'row' if 'row column 2' exists $hash{$key}
        # And then print out the edited .xlsx file
    }
}
Can anyone give me some pointers?
Is Spreadsheet::ParseExcel the right module to use for this?

Spreadsheet::ParseXLSX is just for reading spreadsheets. It doesn't have facilities for updating and saving data from Perl to an Excel spreadsheet.
Then there are modules like Spreadsheet::WriteExcel and Excel::Writer::XLSX that can write spreadsheets but can't read them.
But put them together in the same script? Stand back and watch the magic happen.
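As a rough sketch of what "putting them together" could look like for the question above: read each row with Spreadsheet::ParseXLSX and write only the rows you want to keep to a new file with Excel::Writer::XLSX. The sub name copy_without_rows, the file names, and the reading of "column 2" as the second column (index 1, since get_cell is zero-based) are my assumptions, not something from the question. Only cell values are copied here; formats and formulas would need more work.

```perl
use strict;
use warnings;
use Spreadsheet::ParseXLSX;
use Excel::Writer::XLSX;

# Copy every worksheet of $in_file to $out_file, skipping any row whose
# second column (index 1, an assumption) holds a value that is a key of %$skip.
# Only cell values are copied; formats and formulas are not preserved.
sub copy_without_rows {
    my ( $in_file, $out_file, $skip ) = @_;

    my $parser  = Spreadsheet::ParseXLSX->new;
    my $book_in = $parser->parse($in_file)
        or die "Cannot parse $in_file\n";
    my $book_out = Excel::Writer::XLSX->new($out_file)
        or die "Cannot create $out_file: $!\n";

    for my $sheet_in ( $book_in->worksheets() ) {
        my $sheet_out = $book_out->add_worksheet( $sheet_in->get_name() );
        my ( $row_min, $row_max ) = $sheet_in->row_range();
        my ( $col_min, $col_max ) = $sheet_in->col_range();

        my $out_row = $row_min;    # next row to fill in the output sheet
        for my $row ( $row_min .. $row_max ) {
            my $key_cell = $sheet_in->get_cell( $row, 1 );
            my $key      = $key_cell ? $key_cell->value() : undef;
            next if defined $key && exists $skip->{$key};    # "delete" = do not copy

            for my $col ( $col_min .. $col_max ) {
                my $cell = $sheet_in->get_cell( $row, $col );
                next unless $cell;    # empty cells come back as undef
                $sheet_out->write( $out_row, $col, $cell->value() );
            }
            $out_row++;
        }
    }
    $book_out->close();
    return;
}
```

Called as copy_without_rows('file.xlsx', 'filtered.xlsx', \%hash), this leaves the original file untouched and writes the filtered copy as a new file.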

Related

perl script to read an xlsx file(which has many sheets) using the sheet name

I am trying to write a Perl script that reads an Excel file (which has many sheets in it) using the sheet name.
I know how to access a particular sheet of the Excel file using the sheet number, but I am not sure how to read it using the sheet name.
Any help is highly appreciated.
Below is the code I wrote to access the sheet using sheet number:
my $Sheet_Number = 26;
my $workbook = ReadData("<path to the excel file>");
for (my $i = 2; $i < $limit; $i++) {
    my $cell = "A" . $i;
    my $key_1 = $workbook->[$Sheet_Number]{$cell};
}
Thanks
----Edit----
I want to open the particular sheet within the Excel file by using the sheet name. And then read the data from that particular sheet. The name of the sheet will be entered by the user while running the script from the command line arguments.
Below is the code that I am using after getting suggested answers for my earlier question:
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse("$path");
my $worksheet;
if ($worksheet->get_name() eq "$Sheet_Name"){
    for (my $i = 2; $i < $limit; $i++) {
        my $cell = $worksheet->get_cell($i,"A");
        my $value = $cell->value();
        push @array_keys, $value;
    }
}
I want to read the values of Column A of the particular sheet and push it into an array.
$Sheet_Name : It is the name of the sheet which is entered by the user as cmd line arg.
$path : It is the complete path to the Excel file
Error Message: Can't call method "get_name" on an undefined value at perl_script.pl (The error points to the line where the if-condition is used.)
Thanks for the help.
The get_name() method of the worksheet object, in conjunction with Perl's grep function, should get you this:
my ($worksheet) = grep { $_->get_name() eq 'Sheet2' } $workbook->worksheets();
This would be an un-golfed version of the same:
my $worksheet;
foreach my $sheet ($workbook->worksheets()) {
    if ($sheet->get_name() eq 'Sheet2') {
        $worksheet = $sheet;
        last;
    }
}
If there is no match, $worksheet stays undefined, so check it before calling methods on it.
-- Edit --
I had assumed the workbook was already loaded. You do need to parse the file first:
use strict;
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse('/var/tmp/foo.xls');
Then the code above should work.
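For what it's worth, the workbook object also has a worksheet() method that takes a sheet name (or a zero-based index) and returns undef when nothing matches, which avoids the loop entirely. Here is a small sketch that reads column A of a named sheet into an array, as the question asks. The sub name column_a_values and the $first_row parameter are my own choices for illustration; rows and columns are zero-based in Spreadsheet::ParseExcel, so column A is index 0 and row 1 is the second row.

```perl
use strict;
use warnings;
use Spreadsheet::ParseExcel;

# Return the values of column A (index 0) from the sheet called $sheet_name,
# starting at $first_row (zero-based, so 1 skips a header row).
sub column_a_values {
    my ( $path, $sheet_name, $first_row ) = @_;

    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($path)
        or die $parser->error(), ".\n";

    # worksheet() accepts a sheet name or a zero-based index
    my $worksheet = $workbook->worksheet($sheet_name)
        or die "No sheet named '$sheet_name' in $path\n";

    my ( $row_min, $row_max ) = $worksheet->row_range();
    my @values;
    for my $row ( $first_row .. $row_max ) {
        my $cell = $worksheet->get_cell( $row, 0 );
        next unless $cell;    # skip empty cells
        push @values, $cell->value();
    }
    return @values;
}
```

With the variables from the question this would be called as my @array_keys = column_a_values($path, $Sheet_Name, 1).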

In Perl, how can I copy a subset of columns from an XLSX work sheet to another?

I have a .xlsx file (only one sheet) with 15 columns. I want to read some specific columns, let's say columns 3, 5, 11 and 14, and write them to a new Excel sheet. Some cells in the input file are empty (they have no value).
Here is what I am trying:
use warnings;
use strict;
use Spreadsheet::ParseXLSX;
use Excel::Writer::XLSX;
my $parser = Spreadsheet::ParseXLSX->new;
my $workbook = $parser->parse("test.xlsx");
if ( !defined $workbook ) {
die $parser->error(), ".\n";
}
my $worksheet = $workbook->worksheet('Sheet1');
# but from here I don't know how to define row and column range to get specific column data.
# I am trying to get all data in an array, so I can write it in new .xlsx file.
# function to write data in new file
sub writetoexcel
{
    my @fields = @_;
    my $workbook = Excel::Writer::XLSX->new( 'report.xlsx' );
    $worksheet = $workbook->add_worksheet();
    my $row = 0;
    my $col = 0;
    for my $token ( @fields )
    {
        $worksheet->write( $row, $col, $token );
        $col++;
    }
    $row++;
}
I also followed this Question, but no luck.
How can I read specific columns from .xlsx file and write it into new .xlsx file?
Have you never copied a subset of columns from an array of arrays to another?
I ran the script below against a sample input sheet (the screenshots of the input sheet and of the resulting output file are not reproduced here):
#!/usr/bin/env perl
use strict;
use warnings;
use Excel::Writer::XLSX;
use Spreadsheet::ParseXLSX;
my @cols = (1, 3);
my $reader = Spreadsheet::ParseXLSX->new;
my $bookin = $reader->parse($ARGV[0]);
my $sheetin = $bookin->worksheet('Sheet1');
my $writer = Excel::Writer::XLSX->new($ARGV[1]);
my $sheetout = $writer->add_worksheet('Extract');
my ($top, $bot) = $sheetin->row_range;
for my $r ($top .. $bot) {
    $sheetout->write_row(
        $r,
        0,
        # of course, you need to do more work if you want
        # to preserve formulas, formats etc. That is left
        # to you, as you left that part of the problem
        # unspecified.
        # get_cell returns undef for an empty cell, so guard before calling value
        [ map { my $c = $sheetin->get_cell($r, $_); $c ? $c->value : undef } @cols ],
    );
}
$writer->close;

Append data to excel using perl

I am new to Perl. Can anyone tell me how to append data from one Excel file to another existing Excel file?
I have two Excel files. One is created every time I run the program, and I need to copy its data to the other one as a backup, so I want to append the data to the backup file.
I tried searching the web, but all I find is how to create new files.
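I think you want Spreadsheet::ParseExcel::SaveParser:
use strict;
use warnings;
use Spreadsheet::ParseExcel::SaveParser;
# Open an existing file with SaveParser
my $parser = Spreadsheet::ParseExcel::SaveParser->new();
my $template = $parser->Parse('template.xls');
# Get the first worksheet
my $worksheet = $template->worksheet(0);
# Append below the last used row: row_range() returns the (min, max) row indexes
my ( $row_min, $row_max ) = $worksheet->row_range();
my $row = $row_max + 1;    # keep incrementing this for each appended row
my $col = 0;               # column to write to
# Add a cell
$worksheet->AddCell( $row, $col, 'New string' );
# Write over the existing file or write a new file.
$template->SaveAs('newfile.xls');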

Can I create a Excel workbook inside a foreach in perl?

I want to create multiple Excel files. The files will have basically the same format; the only difference is that the data will be for different years.
If I run the program in the following way, it runs and creates the Excel file without problems:
use warnings;
use Excel::Writer::XLSX;
use Date::Parse;
... ###some validation of the data to work with
... ### put data on hashes
...
my $workbook = Excel::Writer::XLSX->new( "Monitoring_Report_2013.xlsx" );
$worksheet1 = $workbook->add_worksheet('Q1');
$worksheet2 = $workbook->add_worksheet('Q2');
$worksheet3 = $workbook->add_worksheet('Q3');
$worksheet4 = $workbook->add_worksheet('Q4');
... ### create the different tables on each worksheet
...
...
If I add the foreach part so that it automatically creates a separate file for each year, it runs, but when I try to open the Excel files I get a corruption error.
use warnings;
use Excel::Writer::XLSX;
use Date::Parse;
...
...
...
my @years_in_data = ("2012", "2013", "2014");
foreach my $year (@years_in_data)
{
    chomp $year;
    ...
    ...
    ...
    my $workbook = Excel::Writer::XLSX->new( "Monitoring_Report_$year.xlsx" );
    $worksheet1 = $workbook->add_worksheet('Q1');
    $worksheet2 = $workbook->add_worksheet('Q2');
    $worksheet3 = $workbook->add_worksheet('Q3');
    $worksheet4 = $workbook->add_worksheet('Q4');
    ...
    ...
    ...
}
Can I create the files automatically, or do I need to write each file manually?
Thanks for your help!
Are you calling the $worksheet->write() method on each of these worksheets? Maybe that is what all the dots stand for. If you are not, then I would expect none of them to get any data written.
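A second thing worth checking, though I can't tell from the elided code whether it applies here: the Excel::Writer::XLSX documentation recommends an explicit close() when objects might otherwise be destroyed in the wrong order, and in the loop from the question the $worksheet1 to $worksheet4 variables are not declared with my, so they outlive each $workbook. Below is a minimal sketch of the loop with lexical variables and an explicit close() per file. The sub name write_yearly_reports and the placeholder cell contents are made up for illustration; the file-name pattern and sheet names come from the question.

```perl
use strict;
use warnings;
use Excel::Writer::XLSX;

# Create one report per year, each with four quarterly sheets.
# Returns the list of file names that were written.
sub write_yearly_reports {
    my @years = @_;
    my @files;
    for my $year (@years) {
        my $file     = "Monitoring_Report_$year.xlsx";
        my $workbook = Excel::Writer::XLSX->new($file)
            or die "Cannot create $file: $!\n";

        for my $quarter (qw(Q1 Q2 Q3 Q4)) {
            my $worksheet = $workbook->add_worksheet($quarter);
            # Placeholder content; the real tables would be written here
            $worksheet->write( 0, 0, "$quarter $year" );
        }

        # Finish this file before the next iteration starts
        $workbook->close();
        push @files, $file;
    }
    return @files;
}
```

Calling write_yearly_reports("2012", "2013", "2014") should then leave three separate files on disk, each closed before the next one is opened.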

Using Spreadsheet::ParseExcel in Perl, but need help

I have a Perl program using Spreadsheet::ParseExcel. However, there are two difficulties that have arisen that I have been unable to figure out how to solve. The script for the program is as follows:
#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::ParseExcel;
use WordNet::Similarity::lesk;
use WordNet::QueryData;
my $wn = WordNet::QueryData->new();
my $lesk = WordNet::Similarity::lesk->new($wn);
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse ( 'input.xls' );
if ( !defined $workbook ) {
die $parser->error(), ".\n";
}
WORKSHEET:
for my $worksheet ( $workbook->worksheets() ) {
    my $sheetname = $worksheet->get_name();
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();
    my $target_col;
    my $response_col;
    # Skip worksheet if it doesn't contain data
    if ( $row_min > $row_max ) {
        warn "\tWorksheet $sheetname doesn't contain data. \n";
        next WORKSHEET;
    }
    # Check for column headers
    COLUMN:
    for my $col ( $col_min .. $col_max ) {
        my $cell = $worksheet->get_cell( $row_min, $col );
        next COLUMN unless $cell;
        $target_col = $col if $cell->value() eq 'Target';
        $response_col = $col if $cell->value() eq 'Response';
    }
    if ( defined $target_col && defined $response_col ) {
        ROW:
        for my $row ( $row_min + 1 .. $row_max ) {
            my $target_cell = $worksheet->get_cell( $row, $target_col);
            my $response_cell = $worksheet->get_cell( $row, $response_col);
            if ( defined $target_cell && defined $response_cell ) {
                my $target = $target_cell->value();
                my $response = $response_cell->value();
                my $value = $lesk->getRelatedness( $target, $response );
                print "Worksheet = $sheetname\n";
                print "Row = $row\n";
                print "Target = $target\n";
                print "Response = $response\n";
                print "Relatedness = $value\n";
            }
            else {
                warn "\tWorksheet $sheetname, Row = $row doesn't contain target and response data.\n";
                next ROW;
            }
        }
    }
    else {
        warn "\tWorksheet $sheetname: Didn't find Target and Response headings.\n";
        next WORKSHEET;
    }
}
So, my two problems:
First of all, sometimes the program returns the error "No Excel data found in file," even though the data is there. Each Excel file is formatted the same way. There is only one sheet, with the A and B columns labelled 'Target' and 'Response,' respectively, with a list of words beneath them. However, it does not ALWAYS return this error. It works for one Excel file, but it does not work for a different one, even though both are formatted the exact same way (and yes, they are both the same file type, as well). I cannot find any reason for it to not read the second file, because it is identical to the first. The only difference is that the second file was created using an Excel macro; however, why would that matter? The file types and format are exactly the same.
Second, the variables '$target' and '$response' need to be formatted as strings in order for the 'my $value' expression to work. How do I convert them into string format? The value assigned to each variable is a word from the appropriate cell of the Excel spreadsheet. I don't know what format that is (and there is no apparent way in Perl for me to check).
Any suggestions?
In relation to your first question, the "no data found" error indicates some problem with the file format. I've seen this error with pseudo-Excel files such as HTML or CSV files that have an .xls extension. I've also seen it with malformed files generated by third-party apps.
You could do an initial verification by taking a hexdump/xxd dump of a working and a non-working file and checking whether the overall structure is approximately the same (for example, whether both have similar magic numbers at the start and neither is HTML).
It could also be an issue with Spreadsheet::ParseExcel. I am the maintainer of that module. If you like, you could send me a "good" and a "bad" file at the email address in the docs, and I will have a look at them.
First of all, if you are getting "no data found" you can thank proprietary Excel data file formats and the inability of even a good Perl library to extract information from them.
I strongly suggest that you export the Excel data in something easily parsed like CSV especially given the simple nature of the data layout you described. There may be a way to get Excel to process a batch but I have no idea. A quick search yielded a tool to use OpenOffice to do batch conversion.
The rest of your question is rather moot once you accept that Excel data files will not play nicely.
I wrote this code after a client couldn't decide whether the XLS he was sending every week was really in XLS format or just CSV.... HTH!
sub testForXLS
{
    my ( $FileName ) = @_;
    my $XLSsignature = 'D0CF11E0A1B11AE10000';
    my $buffer = '';
    # Read the first 10 bytes of the file in binary mode
    open( my $fh, '<:raw', $FileName ) or die "Cannot open $FileName: $!";
    read( $fh, $buffer, 10 );
    close($fh);
    # Hex-encode the bytes in upper case and compare with the expected signature
    my $signature = uc( unpack( 'H*', $buffer ) );
    return $signature eq $XLSsignature ? 1 : 0;
}

Resources