CSV to xls conversion in perl - excel

I want to parse a CSV file in Perl and generate an Excel sheet from it.
So far I can parse the CSV file and convert it to xls. This code works properly and produces 6 rows and 3 columns, matching the CSV, which is correct. After parsing I also want to apply some formatting: any row or column that has "Pass" as a string should be green, and any that has "Fail" should be red. How can I do that? Please help.
#!/run/pkg/TWW-perl-/5.8.8/bin/perl -w
use strict;
use warnings;
use Spreadsheet::WriteExcel;
use Text::CSV::Simple;
use Spreadsheet::ParseExcel::Format;
my $infile = "/project/ls1socdft_nobackup/rev2.0/user/Shah-B53654/dft/dfta/perl/pattern_qa/output_0/xls_info.csv";
#usage() unless defined $infile && -f $infile;
my $parser = Text::CSV::Simple->new;
my @data = $parser->read_file($infile);
my $headers = shift @data;
my $outfile = shift || "/project/ls1socdft_nobackup/rev2.0/user/Shah-B53654/dft/dfta/perl/pattern_qa/output_0/xls_info.xls";
my $subject = shift || 'worksheet';
my $workbook = Spreadsheet::WriteExcel->new($outfile);
my $bold = $workbook->add_format();
$bold->set_bold(1) ;
my $color =$workbook->add_format();
$color->set_bg_color('green');
my $color1=$workbook->add_format();
$color1->set_bg_color('red');
import_data($workbook, $subject, $headers, \@data);
# Add a worksheet
sub import_data {
my $workbook = shift;
my $base_name = shift ;
my $colums = shift;
my $data = shift;
my $limit = shift || 50_000;
my $start_row = shift ||1;
my $worksheet = $workbook->add_worksheet($base_name);
$worksheet->add_write_handler(qr[\w], \&store_string_widths);
my $w = 1;
$worksheet->write('A' . $start_row, $colums,$bold);
my $i = $start_row;
my $qty = 0;
for my $row (@$data) {
$qty++;
if ($i > $limit) {
$i = $start_row;
$w++;
$worksheet = $workbook->add_worksheet("$base_name - $w");
$worksheet->write('A1', $colums);
}
$worksheet->write(1+$i++,0, $row);}
autofit_columns($worksheet);
warn "Converted $qty rows.";
return $worksheet;
}
sub store_string_widths {
my $worksheet = shift;
my $col = $_[1];
my $token = $_[2];
# Ignore some tokens that we aren't interested in.
return if not defined $token; # Ignore undefs.
return if $token eq ''; # Ignore blank cells.
return if ref $token eq 'ARRAY'; # Ignore array refs.
return if $token =~ /^=/; # Ignore formula
# Ignore numbers
#return if $token =~ /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
# Ignore various internal and external hyperlinks. In a real scenario
# you may wish to track the length of the optional strings used with
# urls.
return if $token =~ m{^[fh]tt?ps?://};
return if $token =~ m{^mailto:};
return if $token =~ m{^(?:in|ex)ternal:};
# We store the string width as data in the Worksheet object. We use
# a double underscore key name to avoid conflicts with future names.
#
my $old_width = $worksheet->{__col_widths}->[$col];
my $string_width = string_width($token);
if (not defined $old_width or $string_width > $old_width) {
# You may wish to set a minimum column width as follows.
#return undef if $string_width < 10;
$worksheet->{__col_widths}->[$col] = $string_width;
}
# Return control to write();
return undef;
}
sub string_width {
return length $_[0];
}
sub autofit_columns {
my $worksheet = shift;
my $col = 0;
for my $width (@{$worksheet->{__col_widths}}) {
$worksheet->set_column($col, $col, $width) if $width;
$col++;
}
}

There are two ways to get individual cells formatted the way you expect.
1: Instead of passing an array of data to the $worksheet->write() method, loop through each row and column and write each cell individually.
Example: change
$worksheet->write('A1', $colums);
to
for (my $r = 0; $r < @$colums; $r++) {
for (my $c = 0; $c < @{$colums->[$r]}; $c++) {
$worksheet->write($r, $c, $colums->[$r]->[$c]);
}
}
Now you can test each value as it is written. If it matches your criteria, pass the format you want as the fourth argument:
$worksheet->write($r, $c, $colums->[$r]->[$c], $color1);
2: The other option is to use Excel::Writer::XLSX
Spreadsheet::WriteExcel is in maintenance only mode and has
effectively been superseded by Excel::Writer::XLSX.
This module is more up to date and includes functions for conditional formatting, which can be added after writing your data.
Your Excel generation code should not need to change, apart from the line that loads the module and the line that creates the workbook (the output file becomes an .xlsx file).
Then you just specify the rules for the conditional formatting:
$worksheet->conditional_formatting( 'A1:J10',
{
type => 'text',
criteria => 'containing',
value => 'Pass',
format => $color,
}
);
$worksheet->conditional_formatting( 'A1:J10',
{
type => 'text',
criteria => 'containing',
value => 'Fail',
format => $color1,
}
);
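Option 1 can be sketched as a small helper that picks a format per cell. This is only an illustration: pick_format and write_row_with_formats are hypothetical names, the "whole cell equals Pass or Fail, ignoring case" rule is an assumption about what should count as a match, and $worksheet is assumed to be a worksheet object whose write() accepts (row, column, value, format).

```perl
use strict;
use warnings;

# Hypothetical helper: choose a format for one cell value.
# $formats is a hash ref such as { pass => $color, fail => $color1 }.
sub pick_format {
    my ($value, $formats) = @_;
    return undef unless defined $value;
    return $formats->{pass} if $value =~ /^\s*pass\s*$/i;
    return $formats->{fail} if $value =~ /^\s*fail\s*$/i;
    return undef;    # no special format for this cell
}

# Write one row cell by cell so that each cell can carry its own format.
# $worksheet only needs a write($row, $col, $value, $format) method.
sub write_row_with_formats {
    my ($worksheet, $row_idx, $row, $formats) = @_;
    for my $col_idx (0 .. $#{$row}) {
        my $value  = $row->[$col_idx];
        my $format = pick_format($value, $formats);
        $worksheet->write($row_idx, $col_idx, $value, $format);
    }
    return;
}
```

In the question's import_data(), the data-row call $worksheet->write(1+$i++, 0, $row) could then become write_row_with_formats($worksheet, 1 + $i++, $row, { pass => $color, fail => $color1 }).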

Related

add chart to an existing excel using perl

I am new to Perl.
I have an Excel sheet with a lot of data. I need to update it and create a graph based on the data, using Perl.
I have succeeded in updating the existing Excel file,
but adding a chart to it is not working:
use Spreadsheet::ParseExcel;
use Spreadsheet::ParseExcel::SaveParser;
use Spreadsheet::WriteExcel;
# Open an existing file with SaveParser
my $parser = Spreadsheet::ParseExcel::SaveParser->new();
my $template = $parser->Parse('MyExcel.xls');
my $worksheet = $template->worksheet('Firstsheet');
my $chart = $template->add_chart( type => 'line' );
$chart->add_series(
categories => '=URV!$A$17:$A$442',
values => '=URV!$D$17:$D$442',
name => 'pended graph',
);
This is not working. I get:
Can't call method "add_chart" on an undefined value at charts4.ps line 20
Please help me with some working sample code.
I want to know what the problem is here.
add_chart() is one of the WORKBOOK METHODS, so it has to be called on a Spreadsheet::WriteExcel workbook object. Try code like this:
use Spreadsheet::WriteExcel;
my $workbook = Spreadsheet::WriteExcel->new('perl.xls');
my $worksheet = $workbook->add_worksheet();
$worksheet->write('A1', 'Hi Chart!');
my $chart = $workbook->add_chart( type => 'line', embedded => 1, name => 'pended graph' );
# Insert the chart into the worksheet.
$worksheet->insert_chart( 'E2', $chart );
Update
The problem is that Excel files are very hard to update with Perl.
An Excel file is a binary file within a binary file. It contains
several interlinked checksums and changing even one byte can cause it
to become corrupted.
As such you cannot simply append or update an Excel file. The only way
to achieve this is to read the entire file into memory, make the
required changes or additions and then write the file out again.
Spreadsheet::ParseExcel will read in existing excel files:
my $parser = Spreadsheet::ParseExcel->new();
# $workbook is a Spreadsheet::ParseExcel::Workbook object
my $workbook = $parser->Parse('blablabla.xls');
What you really want is Spreadsheet::ParseExcel::SaveParser, which is a combination of Spreadsheet::ParseExcel and Spreadsheet::WriteExcel.
Here is an example.
Summing up, I would suggest reading the Excel data in and then trying one of the following:
Create another xls file and use the Spreadsheet::WriteExcel::Chart library.
Create an xlsx file and use the Excel::Writer::XLSX::Chart library.
Another close option would be to read the Excel file in with Spreadsheet::ParseExcel::SaveParser, then add the chart and save it, but with this module all original charts are lost.
If you are on a Windows machine you can try Win32::OLE.
Here is the example from Win32::OLE's own documentation:
use Win32::OLE;
# use existing instance if Excel is already running
eval {$ex = Win32::OLE->GetActiveObject('Excel.Application')};
die "Excel not installed" if $@;
unless (defined $ex) {
$ex = Win32::OLE->new('Excel.Application', sub {$_[0]->Quit;})
or die "Oops, cannot start Excel";
}
# get a new workbook
$book = $ex->Workbooks->Add;
# write to a particular cell
$sheet = $book->Worksheets(1);
$sheet->Cells(1,1)->{Value} = "foo";
# write a 2 rows by 3 columns range
$sheet->Range("A8:C9")->{Value} = [[ undef, 'Xyzzy', 'Plugh' ],
[ 42, 'Perl', 3.1415 ]];
# print "XyzzyPerl"
$array = $sheet->Range("A8:C9")->{Value};
for (@$array) {
for (@$_) {
print defined($_) ? "$_|" : "<undef>|";
}
print "\n";
}
# save and exit
$book->SaveAs( 'test.xls' );
undef $book;
undef $ex;
UPDATE#2
Here is an example code:
use strict;
use Spreadsheet::WriteExcel;
my $workbook = Spreadsheet::WriteExcel->new( 'chart_column.xls' );
my $worksheet = $workbook->add_worksheet();
my $bold = $workbook->add_format( bold => 1 );
# Add the worksheet data that the charts will refer to.
my $headings = [ 'Category', 'Values 1', 'Values 2' ];
my $data = [
[ 2, 3, 4, 5, 6, 7 ],
[ 1, 4, 5, 2, 1, 5 ],
[ 3, 6, 7, 5, 4, 3 ],
];
$worksheet->write( 'A1', $headings, $bold );
$worksheet->write( 'A2', $data );
###############################################################################
#
# Example 1. A minimal chart.
#
my $chart1 = $workbook->add_chart( type => 'column', embedded => 1 );
# Add values only. Use the default categories.
$chart1->add_series( values => '=Sheet1!$B$2:$B$7' );
# Insert the chart into the main worksheet.
$worksheet->insert_chart( 'E2', $chart1 );
###############################################################################
#
# Example 2. One more chart
#
my $chart2 = $workbook->add_chart( type => 'column', embedded => 1 );
# Configure the chart. # change the categories if required change the values as required
$chart2->add_series(
categories => '=Sheet1!$A$4:$A$7',
values => '=Sheet1!$B$4:$B$7',
);
$worksheet->insert_chart( 'N1', $chart2, 3, 3 );
Also,
if you don't mind xlsx instead of xls, you can use Excel::Writer::XLSX. It is more actively maintained.
The trick to parsing a file and still being able to use the functions of the WriteExcel module is to use the Spreadsheet::ParseExcel::SaveParser module.
Below I have an example. It does not use the chart functions, but your problem is not how to use the chart functions of the WriteExcel module; it is how to parse an existing Excel file and then use that parsed information with the WriteExcel module (which was originally designed only for NEW Excel files).
if ( ( -f $excel_file_name ) && ( ( stat $excel_file_name )[7] > 0 ) ) {
#PARSE EXCEL
use Spreadsheet::ParseExcel;
use Spreadsheet::ParseExcel::SaveParser;
# Open the template with SaveParser
my $parser = new Spreadsheet::ParseExcel::SaveParser;
my $template = $parser->Parse("$excel_file_name");
my $sheet = 0;
my $row = 0;
my $col = 0;
if ( !defined $template ) {
die $parser->error(), " Perlline:", __LINE__, " \n "; #probably the file is already open by your GUI
}
# Get the format from specific cell
my $format = $template->{Worksheet}[$sheet]->{Cells}[$row][$col]->{FormatNo};
# Add a new worksheet
#for my $worksheet ( $template->worksheets() ) {
my $worksheet_parser = $template->worksheet("$metrict_data_worksheet_name");
my ( $row_min, $row_max ) = $worksheet_parser->row_range();
my ( $col_min, $col_max ) = $worksheet_parser->col_range();
my @row_array_value;
for my $row ( 1 .. $row_max ) { #avoid header start from 1
for my $col ( $col_min .. $col_max ) {
my $cell = $worksheet_parser->get_cell( $row, $col );
next unless $cell;
#print "Row, Col = ($row, $col)\n";
#print "Value = ", $cell->value(), "\n";
#print "Unformatted = ", $cell->unformatted(), "\n";
#print "\n";
push( @row_array_value, $cell->value() );
} #end header column loops for one regression
} #end row loop all lines
#}
# The SaveParser SaveAs() method returns a reference to a
# Spreadsheet::WriteExcel object. If you wish you can then
# use this to access any of the methods that aren't
# available from the SaveParser object. If you don't need
# to do this just use SaveAs().
#
my $workbook;
{
# SaveAs generates a lot of harmless warnings about unset
# Worksheet properties. You can ignore them if you wish.
local $^W = 0;
# Rewrite the file or save as a new file
my $check_if_possible2write = Spreadsheet::WriteExcel->new($excel_file_name);
if ( defined $check_if_possible2write ) { #if not possible it will be undef
$workbook = $template->SaveAs("$excel_file_name");#IMPORTANT this is of type WriteExcel and not ParseExcel
}
else {
print "Not possible to write the Excel file :$excel_file_name, another user may have the file open. Aborting... ", __LINE__, " \n ";
exit;
}
}
#####################FROM HERE YOU CAN USE AGAIN use Spreadsheet::WriteExcel; ####################
use Spreadsheet::WriteExcel;
my $worksheet = $workbook->sheets("$metrict_data_worksheet_name");
my $column_header_count = 0;
foreach my $name ( sort { lc $a cmp lc $b } keys %merged_all_metrics ) {
$worksheet->write( $row_max + 1, $column_header_count, "$merged_all_metrics{$name}" ); #row,col start
$column_header_count++;
}
$worksheet->set_column( 'A:L', 50, undef, 0, 1, 0 ); #grouping #comp_src group
$worksheet->set_column( 'N:R', 50, undef, 0, 1, 0 ); #grouping
$workbook->close() or die "Error closing file: $!"; #CLOSE
}
The important part of the code is what happens after the comment line:
#####################FROM HERE YOU CAN USE AGAIN use Spreadsheet::WriteExcel; ####################
After that point you have a $workbook handle. This variable holds all the parsed information and, more importantly, it is a WriteExcel object, so all the methods of that module are available.
Important notice: the parser is not able to parse charts and formulas (only values), so you will have to write them again on each parse->write cycle.

Merged cells are unmerging on Save, using Spreadsheet::ParseExcel

I am writing a program to parse an .xls file. For that I have a template which contains five merged cells (B1,C1,D1,E1,F1) and written "User-Dependent errors" in that. In B2,C2,D2,E2,F2 I have written the error names and want to save their count every day. The code is working properly but after parsing and saving the merged cell (B1,C1,D1,E1,F1) is getting unmerged and the text is presented in B1. I need the merged cells as is (merged), even after parsing.
What do I have to do?
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Spreadsheet::ParseExcel;
use Spreadsheet::ParseExcel::SaveParser;
my $date=$ARGV[1]; #yymmdd
my $hour=$ARGV[0]; #06
$date or $date=`date --date='1 day ago' +%Y%m%d`;
chomp $date;
chomp $hour;
my $db_name = "ravi";
my $table = "CDR";
my $sub_table = "Submission_Failures";
my $del_table = "Delivery_Failures";
my $host = "xxx.xx.x.xxx";
my $command = "cp /root/prac/CDR/CDR.xls /root/prac/CDR/CDR_Report_20$date$hour.xls";
print $command;
`$command`;
sub NULL_count
{
my $type = $_[0];
my @temp_array;
my $error_db = DBI->connect("DBI:mysql:database=$db_name;host=$host;mysql_socket=/opt/lampstack-5.5.27-0/mysql/tmp/mysql.sock","root","", {'RaiseError' => 1});
my $error_sth = $error_db->prepare("SELECT Error_list from error_potrait WHERE Date='$date' and Type='$type'");
$error_sth->execute() or die $DBI::errstr;
while (my $temp = $error_sth->fetchrow_array())
{
push(@temp_array, $temp);
}
my $temp = @temp_array;
foreach my $i ($temp .. 4)
{
$temp_array[$i] = "NULL";
}
$error_sth->finish();
return @temp_array;
}
my @db_system_errors = NULL_count ("Submission_user_error");
my @db_network_errors = NULL_count ("Submission_ESME_error");
my @db_ESME_errors = NULL_count ("Submission_system_error");
my @db_user_errors = NULL_count ("Submission_network_error");
my @del_user_errors = NULL_count ("Delivery_user_error");
my @del_network_errors = NULL_count ("Delivery_network_error");
my @del_system_errors = NULL_count ("Delivery_system_error");
my @submission_errors = (@db_network_errors,@db_system_errors,@db_ESME_errors,@db_user_errors);
my @delivery_errors = (@del_user_errors,@del_network_errors,@del_system_errors);
sub error_headers
{
my $sheet_no = shift;
my @array = @_;
my $row = 1;
my $col = 1;
# Open an existing file with SaveParser
my $parser = Spreadsheet::ParseExcel::SaveParser->new();
my $template = $parser->Parse("CDR_Report_20$date$hour.xls") or die "Cant open xls";
# Get the first worksheet.
my $sheet = $template->worksheet($sheet_no);
$sheet->AddCell( 1, 0, $date );
foreach my $value (@array)
{
$sheet->AddCell( $row, $col, $value );
++$col;
}
$template->SaveAs("CDR_Report_20$date$hour.xls");
}
error_headers (3,@submission_errors);
error_headers (4,@delivery_errors);
sub parser_excel
{
my $sql_comm = $_[0];
my $sheet_no = $_[1];
my $row = $_[2];
my $col = $_[3];
my $dbh = DBI->connect("DBI:mysql:database=$db_name;host=$host;mysql_socket=/opt/lampstack-5.5.27-0/mysql/tmp/mysql.sock","root","", {'RaiseError' => 1});
#Selecting the data to fetch
my $sth = $dbh->prepare("$sql_comm");
$sth->execute() or die $DBI::errstr;
# Open an existing file with SaveParser
my $parser = Spreadsheet::ParseExcel::SaveParser->new();
my $template = $parser->Parse("CDR_Report_20$date$hour.xls") or die "Cant open xls";
# Get the first worksheet.
my $sheet = $template->worksheet($sheet_no);
$sheet->AddCell( $_[4], 0, $date );
while (my @row = $sth->fetchrow_array())
{
my $Date_db = shift @row;
foreach my $value (@row)
{
$sheet->AddCell( $row, $col, $value );
++$col;
}
$row++;
$col=0;
}
$template->SaveAs("CDR_Report_20$date$hour.xls");
$sth->finish();
}
parser_excel("Select * from $table where Date = $date and Hour = $hour",2,1,0,0);
parser_excel("Select * from $sub_table where Date = $date and Hour = $hour",3,2,0,1);
parser_excel("Select * from $del_table where Date = $date and Hour = $hour",4,2,0,1);
The docs for Spreadsheet::ParseExcel::SaveParser state that the module works "by reading it with Spreadsheet::ParseExcel and rewriting it with Spreadsheet::WriteExcel". So any merged cells will be lost when re-writing. You will need to use the WriteExcel module to re-create the merged cells, which means you will have to separate the reading and writing in your own script.
To merge the cells you use the "merge_range" method with a Format:
my $format = $workbook->add_format( align => 'left' );
$worksheet->merge_range('B1:F1', 'User-Dependent errors', $format);
See the docs for Spreadsheet::WriteExcel
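If the merged ranges should be carried over automatically rather than hard-coded, a sketch along the following lines may help. It assumes the parsed worksheet provides get_merged_areas(), which Spreadsheet::ParseExcel documents as returning areas in the form [ $start_row, $start_col, $end_row, $end_col ], and that the output worksheet's merge_range() accepts row/column numbers. restore_merged_areas is a hypothetical helper name, not part of either module.

```perl
use strict;
use warnings;

# Hypothetical helper: re-create the merged ranges of a parsed worksheet
# on a freshly written one.
#   $in_sheet  - parsed worksheet (read side, Spreadsheet::ParseExcel)
#   $out_sheet - new worksheet (write side, Spreadsheet::WriteExcel)
#   $format    - format object from $workbook->add_format(...)
# Returns the number of merged ranges that were re-created.
sub restore_merged_areas {
    my ($in_sheet, $out_sheet, $format) = @_;
    my $areas = $in_sheet->get_merged_areas() || [];
    for my $area (@{$areas}) {
        my ($first_row, $first_col, $last_row, $last_col) = @{$area};
        # The text of a merged range lives in its top-left cell.
        my $cell = $in_sheet->get_cell($first_row, $first_col);
        my $text = $cell ? $cell->value() : '';
        $out_sheet->merge_range($first_row, $first_col,
                                $last_row,  $last_col, $text, $format);
    }
    return scalar @{$areas};
}
```

This still requires separating reading and writing as described above: parse the template, write its cell values to a new Spreadsheet::WriteExcel workbook, then call the helper once per worksheet.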

In Perl, how can I copy a subset of columns from an XLSX work sheet to another?

I have a .xlsx file (only one sheet) with 15 columns. I want to read some specific columns, let's say columns 3, 5, 11 and 14, and write them to a new Excel sheet. Some cells of the input file are empty, meaning they don't have any value.
Here what I am trying:
use warnings;
use strict;
use Spreadsheet::ParseXLSX;
use Excel::Writer::XLSX;
my $parser = Spreadsheet::ParseXLSX->new;
my $workbook = $parser->parse("test.xlsx");
if ( !defined $workbook ) {
die $parser->error(), ".\n";
}
my $worksheet = $workbook->worksheet('Sheet1');
# but from here I don't know how to define row and column range to get specific column data.
# I am trying to get all data in an array, so I can write it in new .xlsx file.
# function to write data in new file
sub writetoexcel
{
my @fields = @_;
my $workbook = Excel::Writer::XLSX->new( 'report.xlsx' );
$worksheet = $workbook->add_worksheet();
my $row = 0;
my $col = 0;
for my $token ( @fields )
{
$worksheet->write( $row, $col, $token );
$col++;
}
$row++;
}
I also followed this Question, but no luck.
How can I read specific columns from .xlsx file and write it into new .xlsx file?
Have you never copied a subset of columns from an array of arrays to another?
Here is the input sheet I used for this:
and, this is what I get in the output file after the code is run:
#!/usr/bin/env perl
use strict;
use warnings;
use Excel::Writer::XLSX;
use Spreadsheet::ParseXLSX;
my @cols = (1, 3);
my $reader = Spreadsheet::ParseXLSX->new;
my $bookin = $reader->parse($ARGV[0]);
my $sheetin = $bookin->worksheet('Sheet1');
my $writer = Excel::Writer::XLSX->new($ARGV[1]);
my $sheetout = $writer->add_worksheet('Extract');
my ($top, $bot) = $sheetin->row_range;
for my $r ($top .. $bot) {
$sheetout->write_row(
$r,
0,
# of course, you need to do more work if you want
# to preserve formulas, formats etc. That is left
# to you, as you left that part of the problem
# unspecified.
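# get_cell() returns undef for empty cells, so guard before calling value.
[ map { my $cell = $sheetin->get_cell($r, $_); $cell ? $cell->value : undef } @cols ],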
);
}
$writer->close;
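The "subset of columns from an array of arrays" idea mentioned above can be shown without any spreadsheet module. This is a minimal sketch using an array slice; extract_columns and the sample rows are made up for illustration, and column numbers are zero-based.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the listed (zero-based) columns of each row.
# Empty (undef) cells are copied through as undef.
sub extract_columns {
    my ($rows, @cols) = @_;
    return [ map { [ @{$_}[@cols] ] } @{$rows} ];
}

my @rows = (
    [ 'id', 'name', 'city', 'score' ],
    [ 1,    'Ann',  undef,  90 ],
);
my $subset = extract_columns(\@rows, 1, 3);
# $subset is now [ [ 'name', 'score' ], [ 'Ann', 90 ] ]
```

For the columns in the question (3, 5, 11 and 14 counted from one), the zero-based list would be (2, 4, 10, 13).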

Error when using Spreadsheet::ParseExcel

I get 2 errors when compiling the following code:
#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::ParseExcel;
my $xlsparser = Spreadsheet::ParseExcel->new();
my $xlsbook = $parser->parse('xsl_test.xls');
my $xls = $xls->worksheet(0);
my ( $row_first, $row_last ) = $xls->row_range();
my ( $col_first, $col_last ) = $xls->col_range();
my $csv = '';
for my $row ( $row_first .. $row_last ) { #Step through each row
for my $col ( $col_first .. $col_last ) { #Step through each column
my $cell = $xls->get_cell( $row, $col ); #Get the current cell
next unless $cell;
$csv .= $cell->unformatted(); #Get the cell's raw data -- no border colors or anything like that
if ( $col == $col_last ) {
$csv .= "\n"; #Make a new line at the end of the row
} else {
$csv .= ",";
}
}
}
Errors:
global symbol "$parser" requires explicit package name at line 8
global symbol "$xls" requires explicit package name at line 9
I get the above code from http://www.ehow.com/how_7352636_convert-xls-csv-perl.html, and installed the excel module using: cpan Spreadsheet::ParseExcel Spreadsheet::XLSX Spreadsheet::Read
What's causing the error?
Those errors mean that you are using strict but did not declare some variables with my. For example, you declared $xlsparser, but then you tried to use $parser, which was never declared. The code you copied has errors.
A better place to get code is the source itself: Spreadsheet::ParseExcel
Try:
my $parser = Spreadsheet::ParseExcel->new();
my $xlsbook = $parser->parse('xsl_test.xls');
my $xls = $xlsbook->worksheet(0);

search for a cell from one excel and search in another excel and print if its not there using perl

I'm new to Perl. I have two Excel files, each containing a huge number of rows and just two columns. I want to take each cell from one of the Excel files and check whether it is present in the other file. If it is not, print that cell.
I believe that if I take each cell from the first file, search for it in the other, and run a for loop over all the rows, it will be done.
I have got as far as reading the cell from the first file, but searching for it in the other file and printing it is the issue.
Can anybody help?
I'm not entirely sure what you want, but this might give you some ideas. It's completely untested, though.
use strict;
use Spreadsheet::ParseExcel;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook1 = $parser->parse('Book1.xls');
if (!defined $workbook1) { die $parser->error(), ".\n"; }
my $workbook2 = $parser->parse('Book2.xls');
if (!defined $workbook2) { die $parser->error(), ".\n"; }
my $worksheet1 = $workbook1->worksheet('Sheet1');
my $worksheet2 = $workbook2->worksheet('Sheet1');
my ($row_min1, $row_max1) = $worksheet1->row_range();
my ($col_min1, $col_max1) = $worksheet1->col_range();
for my $row1 ($row_min1 .. $row_max1) {
for my $col1 ($col_min1 .. $col_max1) {
my $cell1 = $worksheet1->get_cell($row1, $col1);
next unless $cell1; # Skip empty cells.
my ($row_min2, $row_max2) = $worksheet2->row_range();
my ($col_min2, $col_max2) = $worksheet2->col_range();
my $found_match = 0;
for my $row2 ($row_min2 .. $row_max2) {
for my $col2 ($col_min2 .. $col_max2) {
my $cell2 = $worksheet2->get_cell($row2, $col2);
next unless $cell2; # Skip empty cells.
if ($cell1->value() eq $cell2->value()) { # or == for numeric data
$found_match = 1;
last; # Perl uses last, not break
}
}
last if $found_match;
}
if (!$found_match) {
print $cell1->value, "\n";
}
}
}
This is mostly from here: http://search.cpan.org/dist/Spreadsheet-ParseExcel/lib/Spreadsheet/ParseExcel.pm
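With a huge number of rows, the nested loops above compare every cell of the first sheet against every cell of the second. One common alternative, sketched here under the assumption that the cell values have already been collected into two plain arrays, is to load the second file's values into a hash and look each value up once. values_missing_from is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: return the values from @$first that never occur
# in @$second, in their original order and without duplicates.
# Undefined (empty) entries are ignored on both sides.
sub values_missing_from {
    my ($first, $second) = @_;
    my %seen = map { $_ => 1 } grep { defined } @{$second};
    my @missing;
    for my $value (grep { defined } @{$first}) {
        next if $seen{$value}++;    # present in $second, or already reported
        push @missing, $value;
    }
    return @missing;
}
```

The two arrays could be filled by walking each worksheet with row_range(), col_range() and get_cell() as in the code above and pushing $cell->value() for every defined cell; the result is then printed with something like print "$_\n" for values_missing_from(\@values1, \@values2).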

Resources