Compare values from one excel workbook to another - excel-formula

I have two Excel workbooks. Workbook1 has a list of URLs; Workbook2 has a list of URLs along with a few more columns.
Workbook1:
COLUMN A
url_list
url1
url2
url3
url
Workbook2:
COLUMN A | COLUMN B | COLUMN C
Key Words | URL | Jan 2015
Website search Engine Optimisation | url1 | 72614
Website search Engine Optimisation | url2 | 20890
Website search Engine Optimisation | url3 | 133968
Engine Optimisation | url7 | 584625
I want to compare the list of URLs in Workbook1 (column A) with the URLs in Workbook2 (column B).
If any URL from Workbook1 is missing in Workbook2, it has to be appended to the end of Workbook2.
For example:
Here url is not present in Workbook2, so it will be added, and Workbook2 will look like this:
Workbook2:
COLUMN A | COLUMN B | COLUMN C
Key Words | URL | Jan 2015
Website search Engine Optimisation | url1 | 72614
Website search Engine Optimisation | url2 | 20890
Website search Engine Optimisation | url3 | 133968
Engine Optimisation | url7 | 584625
 | url |
I am using the PHPExcel library to work with Excel sheets in PHP on Windows 7.
Also, is there a direct Excel formula to do this?
I know I can do this with PHP.
Thanks

I have a similar task and have been working on some code for it. There are no built-in comparison functions, so I read data from two different workbooks (.xlsx files), retrieve specific columns from two worksheets, strip unnecessary characters from the data, and store the values in two different associative arrays. I can then use built-in PHP functions to compare the arrays and pick out the values to write to a new worksheet. I still have more work to do on my task, but I hope this helps someone some day.
<?php
error_reporting(E_ALL);
ini_set('display_errors', TRUE);
ini_set('display_startup_errors', TRUE);
date_default_timezone_set('Europe/London');
define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');
/** Include PHPExcel */
require_once dirname(__FILE__) . '/../Classes/PHPExcel.php';
//set_include_path(get_include_path() . PATH_SEPARATOR . '../../../Classes/');
//include_once 'Lib/PHPExcel.php';
$fileType = 'Excel2007';
$fileName = 'testBook.xlsx';
// Create new PHPExcel object
echo date('H:i:s') , " Create new PHPExcel object" , EOL;
$objPHPExcel = new PHPExcel();
$objPHPExcelXX = new PHPExcel();
$objPHPExcelW = new PHPExcel();
// Read the file
$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReaderXX = PHPExcel_IOFactory::createReader($fileType);
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcelW, 'Excel2007');
$objReader->setReadDataOnly(true);
$objReaderXX->setReadDataOnly(true);
try {
$objPHPExcel = $objReader->load("Gemeinde_Bad_Rothenfelde.xlsx");
$objPHPExcelXX = $objReaderXX->load($fileName);
$objWorksheet = $objPHPExcel->getActiveSheet();
$objWorksheetXX = $objPHPExcelXX->getActiveSheet();
print($objWorksheet->getTitle());
print($objWorksheetXX->getTitle());
//$objColumn = $objWorksheet->getHighestColumn();
//$objOtherCols = $objWorksheet->getHighestColumn();
$highestRow = $objWorksheetXX->getHighestRow();
$gemendeHighest = $objWorksheet->getHighestRow();
}catch(Exception $e) {
die($e->getMessage());
}
print("\n");
$arrayOrtStr = array();
$arrayGemStr = array();
$count = 1;
$i = 0;
//$colOrtXX is column in primus sheet, $colOrts is column in Gemeinde sheet,the numbers are the real column numbers in the sheets
for ($row = 1, $colOrtXX=1, $colOrtsT=7, $colOrtsTeil=2,$colStrXX=3, $colOrt=6,$colStr = 10; $row <= $highestRow; $row++) {
//$cell = $objWorksheet->getCell($objColumn.$row);
//Getting cell values for Primus Sheet (Columns PostOrt,PostOrtsteil,PostStrasse)
$cellOrtXX = $objWorksheetXX->getCellByColumnAndRow($colOrtXX,$row);
$cellStrXX = $objWorksheetXX->getCellByColumnAndRow($colStrXX,$row)->setDataType(PHPExcel_Cell_DataType::TYPE_STRING);
$cellOrtsTeil = $objWorksheetXX->getCellByColumnAndRow($colOrtsTeil,$row);
$valOrtXX = $cellOrtXX->getValue();
$valStrXX = $cellStrXX->getValue();
$valOrtsTeil = $cellOrtsTeil->getValue();
// Get cell values for Gemeinde sheet (Columns Ort and Strasse)
$cellOrt = $objWorksheet->getCellByColumnAndRow($colOrt,$row);
$cellStr = $objWorksheet->getCellByColumnAndRow($colStr,$row)->setDataType(PHPExcel_Cell_DataType::TYPE_STRING);
//$cellOrtsT = $objWorksheet->getCellByColumnAndRow($colOrtsT,$row);
$valOrt = $cellOrt->getValue();
$valStr = $cellStr->getValue();
// array populated for strasse column in gemeinde sheet but numbers stripped off the address
$onlyStr = preg_replace('/[0-9]+/','',$valStr);
$arrayGemStr[$i] = array("Strasse"=>$onlyStr);
// Go through the Strasse column, only pick cells with Ort Bad Rothenfelde..compare and write
if($valOrtXX == "Bad Rothenfelde"){
// Creating associative array with Ortsteil and Strasse from Primus sheet
$arrayOrtStr[$i] = array("OrtsTeil"=>$valOrtsTeil,"Strasse"=>$valStrXX);
}
$i++;
//print_r($array);
}
$ortTeil = array();
$contentFound = array();
$withStr = array();
foreach($arrayOrtStr as $arr) {
$contentFound[] = $arr['Strasse'];
}
foreach($arrayOrtStr as $arr) {
if(in_array($arr['Strasse'], $contentFound)){
$ortTeil[] = $arr["OrtsTeil"];
$withStr[] = $arr["Strasse"];
}
}
echo '<br/>========================================================<br/>';
print_r($ortTeil);
print_r($withStr);
// Write the Excel file to filename some_excel_file.xlsx in the current directory
//$objWriter = new PHPExcel_Writer_Excel2007($objPHPExcelW);
//$objWriter->save('Gemeinde_Bad_.xlsx');

Copy ColumnA (excluding header/s) from Workbook1 and append to ColumnB of Workbook2 then apply Excel's Remove Duplicates to ColumnB of Workbook2. Removing duplicates should delete all entries from your example but you might blank out B2 (or maybe B1) from Workbook2 first to avoid that.

Here is a very simple method.
It is not a "direct formula", but it may work for you.
I will assume your sources are Sheet1 and Sheet2 in the same workbook; it is easy to adapt to your needs.
Steps to follow:
1. Add a helper column in Sheet1: enter the formula =IF(ISNA(MATCH($A2,Sheet2!$B$2:$B$5,0)),ROW(),100000) in B2 and copy it down. This gives the row number of each URL that needs to be copied, and a number larger than any row number (100000 here) for the rest. Replace Sheet2!$B$2:$B$5 with the actual range.
2. Set up a list of indexes N of the URLs to copy: in Sheet2, locate the cell in the row just below the last one (6 in your example) and the column just to the right of the last one (D in your case). Enter the sequence 1, 2, ... from that cell down.
3. Pick the Nth URL to copy: enter the formula =OFFSET(Sheet1!$A$2,SMALL(Sheet1!$B:$B,D6)-2,0) in B6 and copy it down.
Variations on this can be produced.
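For anyone who prefers to do the same thing in code rather than with helper columns, the underlying logic is small. The following is only a sketch in plain JavaScript, assuming the two URL columns have already been read into arrays; the function name findMissingUrls and the sample arrays are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: returns the URLs from `candidates` that are absent
// from `existing`, in their original order and without repeats.
function findMissingUrls(candidates, existing) {
  const seen = new Set(existing);
  const missing = [];
  for (const url of candidates) {
    if (!seen.has(url)) {
      seen.add(url); // remember it so the same URL is not appended twice
      missing.push(url);
    }
  }
  return missing;
}

// Sample data mirroring the question (Workbook1 column A, Workbook2 column B).
const workbook1Urls = ["url1", "url2", "url3", "url"];
const workbook2Urls = ["url1", "url2", "url3", "url7"];

const toAppend = findMissingUrls(workbook1Urls, workbook2Urls);
console.log(toAppend); // [ 'url' ]
```

The returned values are the ones that would be written below the last used row of Workbook2's URL column.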

We are migrating from PHPExcel to PhpSpreadsheet. Here is the snippet I used in my phpunit test to compare 2 excel files using PhpSpreadsheet:
// compare files
$reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();
// no need to read styles, we just care about data
$reader->setReadDataOnly(true);
// load expected file (stored somewhere in the tests directory)
$spreadsheetExpected = $reader->load($expectedFilePath);
// load the generated file
$spreadsheetActual = $reader->load($actualFilePath);
// loop through 3 pages, indices 0, 1, and 2
foreach (range(0, 2) as $sheet) {
// loop through the first 20 rows
foreach (range(1, 20) as $row) {
// loop through first 6 columns
foreach (['A', 'B', 'C', 'D', 'E', 'F'] as $column) {
// build the cell coordinate, e.g. "A1"
$cell = $column . $row;
// get expected cell value
$expected = $spreadsheetExpected->getSheet($sheet)->getCell($cell)->getValue();
// get actual cell value
$actual = $spreadsheetActual->getSheet($sheet)->getCell($cell)->getValue();
// compare values, show the sheet and coordinate in case of failure
$this->assertEquals($expected, $actual, "Mismatch in sheet {$sheet}, cell {$cell}");
}
}
}
Note that this test stops at the first mismatch, so it reports only one differing cell per run.
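If you would rather see every differing cell at once, one option is to collect the mismatches into a list and assert on that list at the end. Here is a rough sketch of the idea in plain JavaScript, assuming each sheet has already been read into a 2D array of values; collectMismatches is a hypothetical helper, and it uses strict comparison, which is stricter than PHPUnit's assertEquals.

```javascript
// Hypothetical helper: compares two 2D arrays of cell values and returns
// every mismatch instead of stopping at the first one.
function collectMismatches(expected, actual) {
  const mismatches = [];
  const rowCount = Math.max(expected.length, actual.length);
  for (let r = 0; r < rowCount; r++) {
    const expRow = expected[r] || []; // a missing row is treated as empty
    const actRow = actual[r] || [];
    const colCount = Math.max(expRow.length, actRow.length);
    for (let c = 0; c < colCount; c++) {
      if (expRow[c] !== actRow[c]) {
        // report 1-based positions, like spreadsheet rows and columns
        mismatches.push({ row: r + 1, column: c + 1, expected: expRow[c], actual: actRow[c] });
      }
    }
  }
  return mismatches;
}

const diff = collectMismatches([[1, 2], [3, 4]], [[1, 5], [3, 4], [6]]);
console.log(diff.length); // 2
```

A test can then assert that the list is empty and print the whole list when it is not.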

Related

match single cell value with column of values for every match return those rows Google-apps-script

I have a spreadsheet with 2 tabbed sheets. I am trying to run a macro so that when the user inputs a name in B2 of the 2nd sheet, it is matched with every instance of that name in the 1st sheet, column B. I then need to copy all of the data that appears in the matched cell's rows and have that pasted in the 2nd sheet starting with cell B3.
I have limited experience with VBA, but none with JS/Google-apps-script. Any help with how to write this would be greatly appreciated! Here is my first shot:
function onSearch() {
// raw data sheet
var original = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Form Responses 2");
// search for student sheet
var filtered = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Student Progress Search");
// retrieving the values in the raw data array of names
var searchColumn = 2;
var lr = original.getLastRow();
var searchRange = original.getRange(2,searchColumn, lr, 1).getValues();
// retrieving the name submitted on search
var inputName = filtered.getRange(2, 2).getValue();
// loop through all the names in the raw data and identify any matches to the search name
for (var i = 0; i < lr; i++){
var dataValue = searchRange[i];
var r = dataValue.getRow();
var line = [[r]];
var paste = filtered.getRange(3, 3);
// if the data is a match, return the value of that cell in the searched sheet
if (dataValue == inputName){ return paste.setValues(line);
}
}
}
The built-in QUERY function may work for you. For example, this formula returns every entry in column B of Sheet1 that contains the text entered in B2:
=QUERY(Sheet1!B:B,"select B where LOWER(B) like LOWER('%" &B2& "%')")
For example, if a user enters 'joe', the function will match any entry containing 'joe', regardless of case.
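If you do want a script that returns the whole matching rows, the core of it is a filter over the values array. This is only a sketch in plain JavaScript, assuming the raw data has already been read into a 2D array (as getValues() returns); findMatchingRows and the sample rows are invented for illustration.

```javascript
// Hypothetical helper: keeps every row whose cell at `columnIndex` contains
// the search term, ignoring case (the same rule as the QUERY formula).
function findMatchingRows(rows, searchTerm, columnIndex) {
  const needle = String(searchTerm).toLowerCase();
  return rows.filter(function (row) {
    return String(row[columnIndex]).toLowerCase().indexOf(needle) !== -1;
  });
}

// Made-up sample data; index 1 corresponds to column B.
const responses = [
  ["2020-01-01", "Joe Smith", 90],
  ["2020-01-02", "Ann Lee", 85],
  ["2020-01-03", "joey Tan", 70]
];

const matches = findMatchingRows(responses, "joe", 1);
console.log(matches.length); // 2
```

In Apps Script the result could then be written to the search sheet with a single setValues() call on a range of matching size, rather than one cell at a time.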

Delete rows after a date has passed automatically for Google Spreadsheets [duplicate]

I'd like to be able to delete an entire row in a Google Spreadsheet if the value entered in, say, column "C" of that row is 0 or blank. Is there a simple script I could write to accomplish this?
Thanks!
I can suggest a simple solution that does not use a script.
Let's say you want to delete rows with empty text in column C.
Sort the data (Data Menu -> Sort sheet by column C, A->Z) in the sheet w.r.t column C, so all your empty text rows will be available together.
Just select those rows all together and right-click -> delete rows.
Then you can re-sort your data according to the column you need.
Done.
function onEdit(e) {
//Logger.log(JSON.stringify(e));
//{"source":{},"range":{"rowStart":1,"rowEnd":1,"columnEnd":1,"columnStart":1},"value":"1","user":{"email":"","nickname":""},"authMode":{}}
try {
var ss = e.source; // Just pull the spreadsheet object from the one already being passed to onEdit
var s = ss.getActiveSheet();
// Conditions are by sheet and a single cell in a certain column
if (s.getName() == 'Sheet1' && // change to your own
e.range.columnStart == 3 && e.range.columnEnd == 3 && // only look at edits happening in col C which is 3
e.range.rowStart == e.range.rowEnd ) { // only look at single row edits which will equal a single cell
checkCellValue(e);
}
} catch (error) { Logger.log(error); }
};
function checkCellValue(e) {
if ( !e.value || e.value == 0) { // Delete if value is zero or empty
e.source.getActiveSheet().deleteRow(e.range.rowStart);
}
}
This only looks at the value from a single cell edit now and not the values in the whole sheet.
I wrote this script to do the same thing for one of my Google spreadsheets. I wanted to be able to run the script after all the data was in the spreadsheet so I have the script adding a menu option to run the script.
/**
* Deletes rows in the active spreadsheet that contain 0 or
* a blank value in column "C".
* For more information on using the Spreadsheet API, see
* https://developers.google.com/apps-script/service_spreadsheet
*/
function readRows() {
var sheet = SpreadsheetApp.getActiveSheet();
var rows = sheet.getDataRange();
var numRows = rows.getNumRows();
var values = rows.getValues();
var rowsDeleted = 0;
for (var i = 0; i <= numRows - 1; i++) {
var row = values[i];
if (row[2] == 0 || row[2] == '') {
sheet.deleteRow((parseInt(i)+1) - rowsDeleted);
rowsDeleted++;
}
}
};
/**
* Adds a custom menu to the active spreadsheet, containing a single menu item
* for invoking the readRows() function specified above.
* The onOpen() function, when defined, is automatically invoked whenever the
* spreadsheet is opened.
* For more information on using the Spreadsheet API, see
* https://developers.google.com/apps-script/service_spreadsheet
*/
function onOpen() {
var sheet = SpreadsheetApp.getActiveSpreadsheet();
var entries = [{
name : "Remove rows where column C is 0 or blank",
functionName : "readRows"
}];
sheet.addMenu("Script Center Menu", entries);
};
(Screenshots not included: the test spreadsheet before, running the script from the menu, and the spreadsheet after running the script.)
I was having a few problems with scripts so my workaround was to use the "Filter" tool.
Select all spreadsheet data
Click filter tool icon (looks like wine glass)
Click the newly available filter icon in the first cell of the column you wish to search.
Select "Filter By Condition" > Set the conditions (I was using "Text Contains" > "word")
This will leave only the rows that contain the word you're searching for. They can be deleted in bulk by selecting them while holding the Shift key, then right click > delete rows.
This is what I managed to make work. You can see that I looped backwards through the sheet so that as a row was deleted the next row wouldn't be skipped. I hope this helps somebody.
function UpdateLog() {
var returnSheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('RetLog');
var rowCount = returnSheet.getLastRow();
for (var i = rowCount; i > 0; i--) {
var rrCell = 'G' + i;
var cell = returnSheet.getRange(rrCell).getValue();
if (cell > 0 ){
// logSheet. (this statement is cut off in the original post, so it is left commented out)
returnSheet.deleteRow(i);
}
}
}
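To see why the direction of the loop matters, here is a small sketch in plain JavaScript that uses an array in place of the sheet. Both function names are made up for illustration; the second one is deliberately wrong, to show the skipped element.

```javascript
// Walks the array from the end, so removing an element never shifts
// the elements that still have to be checked.
function removeWhere(items, shouldRemove) {
  for (let i = items.length - 1; i >= 0; i--) {
    if (shouldRemove(items[i])) {
      items.splice(i, 1);
    }
  }
  return items;
}

// Deliberately buggy forward version: after a removal the next element
// slides into index i and is never examined.
function removeWhereForwardBuggy(items, shouldRemove) {
  for (let i = 0; i < items.length; i++) {
    if (shouldRemove(items[i])) {
      items.splice(i, 1);
    }
  }
  return items;
}

const isPositive = function (value) { return value > 0; };

console.log(removeWhere([0, 5, 7, 0, 3], isPositive));             // [ 0, 0 ]
console.log(removeWhereForwardBuggy([0, 5, 7, 0, 3], isPositive)); // [ 0, 7, 0 ]
```

The same reasoning applies to deleteRow(): deleting row i moves row i + 1 up into position i, so a forward loop steps over it.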
This is quite a simple request. Try this:
function try_It(){
deleteRow(2); //// choose col = 2 for column C
}
function deleteRow(col){ // col is the index of the column to check for 0 or empty
var sh = SpreadsheetApp.getActiveSheet();
var data = sh.getDataRange().getValues();
var targetData = new Array();
for(n=0;n<data.length;++n){
if(data[n][col]!='' && data[n][col]!=0){ targetData.push(data[n])};
}
Logger.log(targetData);
sh.getDataRange().clear();
// nothing to write back if every row was filtered out
if (targetData.length === 0) { return; }
sh.getRange(1,1,targetData.length,targetData[0].length).setValues(targetData);
}
EDIT : re-reading the question I'm not sure if the question is asking for a 'live' on Edit function or a function (like this above) to apply after data has been entered... It's not very clear to me... so feel free to be more accurate if necessary ;)
There is a simpler way:
Use filtering to only show the rows which you want to delete. For example, my column based on which I want to delete rows had categories on them, A, B, C. Through the filtering interface I selected only A and B, which I wanted to delete.
Select all rows and delete them. Doing this, in my example, effectively selected all A and B rows and deleted them; now my spreadsheet does not show any rows.
Turn off the filter. This unhides my C rows. Done!
There is a short way to solve that instead of a script.
Select entire data > Go to menu > click Data tab > select create filter > click on filter next to column header > pop-up will appear then check values you want to delete > click okay and copy the filtered data to a different sheet > FINISH
After reading your question carefully, I came up with this solution:
function onOpen() {
// get active spreadsheet
var ss = SpreadsheetApp.getActiveSpreadsheet();
// create menu
var menu = [{name: "Evaluate Column C", functionName: "deleteRow"}];
// add to menu
ss.addMenu("Check", menu);
}
function deleteRow() {
// get active spreadsheet
var ss = SpreadsheetApp.getActiveSpreadsheet();
// get active/selected row
var activeRow = ss.getActiveRange().getRowIndex();
// get content column C
var columnC = ss.getRange("C"+activeRow).getValue();
// evaluate whether content is blank or 0 (null)
if (columnC == '' || columnC == 0) {
ss.deleteRow(parseInt(activeRow));
}
}
This script will create a menu upon file load and will enable you to delete a row, based on those criteria set in column C, or not.
This simple code did the job for me!
function myFunction() {
var ss = SpreadsheetApp.getActiveSpreadsheet(); // get active spreadsheet
var activeRow = ss.getActiveRange().getRowIndex(); // get active/selected row
var start=1;
var end=650;
var match='';
var match2=0; //Edit this according to your choice.
// iterate bottom-up so that deleting a row does not shift the rows still to be checked
for (var i = end; i >= start; i--) {
var columnC = ss.getRange("C"+i).getValue();
if (columnC ==match || columnC ==match2){ ss.deleteRow(i); }
}
}
The code below handles rows whose date in column G is more than 50 days before today: it copies those rows to a backup sheet and then deletes them from the source sheet.
It deletes the rows in one go rather than one by one, so it runs much faster.
It does not write values back to the sheet, as some solutions suggest (by pushing into an array and copying back to the sheet). With that approach I would lose the formulas contained in those cells.
I run the function every night (scheduled), when no one is using the sheet.
function delete_old(){
//delete > 50 day old records and copy to backup
//run daily from owner login
var ss = SpreadsheetApp.getActiveSpreadsheet();
var bill = ss.getSheetByName("Allotted");
var backss = SpreadsheetApp.openById("..."); //backup spreadsheet
var bill2 = backss.getSheetByName("Allotted");
var today=new Date();
//process allotted sheet (bills)
bill.getRange(1, 1, bill.getMaxRows(), bill.getMaxColumns()).activate();
ss.getActiveRange().offset(1, 0, ss.getActiveRange().getNumRows() - 1).sort({column: 7, ascending: true});
var data = bill.getDataRange().getValues();
var delData = new Array();
for(n=data.length-1; n>=1; n--){ // n>=1 so the first data row (index 1) is backed up too
if(data[n][6] !=="" && data[n][6] < today.getTime()-(50*24*3600*1000) ){ //change the condition as per your situation
delData.push(data[n]);
}//if
}//for
//get first and last row no to be deleted
for(n=1;n<data.length; n++){
if(data[n][6] !=="" && data[n][6] < today.getTime()-(50*24*3600*1000) ){
var strow=n+1 ; //first row
break
}//if
}//for
for(n=data.length-1; n>=1; n--){ // n>=1 so ltrow is set even when only the first data row matches
if(data[n][6] !=="" && data[n][6] < today.getTime()-(50*24*3600*1000) ){
var ltrow=n+1 ; //last row
break
}//if
}//for
var bill2lr=bill2.getLastRow();
bill2.getRange((bill2lr+1),1,delData.length,delData[0].length).setValues(delData);
bill.deleteRows(strow, 1+ltrow-strow);
bill.getRange(1, 1, bill.getMaxRows(), bill.getMaxColumns()).activate();
ss.getActiveRange().offset(1, 0, ss.getActiveRange().getNumRows() - 1).sort({column: 6, ascending: true}); //get back ordinal sorting order as per column F
}//function
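The key step is locating the first and last expired rows once the data is sorted by date, so that they form one contiguous block that can be removed with a single deleteRows() call. Below is a rough sketch of that step in plain JavaScript, assuming the date column has been read into an array; findExpiredBlock and DAY_MS are names invented for this example.

```javascript
const DAY_MS = 24 * 3600 * 1000;

// Hypothetical helper: given a column of dates sorted in ascending order
// (blank cells allowed), returns the zero-based index of the first expired
// entry and how many entries the block spans, or null if none are expired.
function findExpiredBlock(dates, now, maxAgeDays) {
  const cutoff = now.getTime() - maxAgeDays * DAY_MS;
  let first = -1;
  let last = -1;
  dates.forEach(function (value, index) {
    // skip blanks and anything that is not a Date
    if (value instanceof Date && value.getTime() < cutoff) {
      if (first === -1) first = index;
      last = index;
    }
  });
  if (first === -1) return null;
  return { first: first, count: last - first + 1 };
}

const today = new Date(Date.UTC(2020, 5, 30));
const columnG = [
  new Date(Date.UTC(2020, 0, 1)),
  new Date(Date.UTC(2020, 1, 1)),
  new Date(Date.UTC(2020, 5, 20)),
  ""
];

console.log(findExpiredBlock(columnG, today, 50)); // { first: 0, count: 2 }
```

Returning null for the "nothing to delete" case makes it easy to skip both the backup copy and the deletion on nights when no row has expired.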

PHPExcel prevent calculating formula

I'm trying to convert a CSV file to an XLSX file using the PHPExcel library. Once the CSV file is read into a PHPExcel object, and before saving it as an XLSX file, I recalculate and set the column widths based on the content of each column.
$objReader = PHPExcel_IOFactory::createReader('CSV');
$objPHPExcel = $objReader->load("test.csv");
$activesheet = $objPHPExcel->getActiveSheet();
$lastColumn = $activesheet->getHighestColumn(); // get last column with data
$lastColumn++;
for ($column = 'A'; $column != $lastColumn; $column++) { // for each column until last
$activesheet->getColumnDimension($column)->setAutoSize(true); // set autowidth
}
$objWriter = PHPExcel_IOFactory::createWriter($objPHPExcel, 'Excel2007');
$objWriter->save("downloads/test.xlsx"); // the Excel2007 writer produces an .xlsx file
With a bit of research I found that if there are any formulas in the file, the call to setAutoSize() calculates their values to use when calculating the column width.
My problem is that some of my CSV files contain values that begin with = (equal sign) but are not formulas, for example cell values like '===='. This causes the above code to throw the error PHPExcel_Calculation_Exception Formula Error: An unexpected error occured.
Since I know that none of my input CSV files can contain formulas, is there a way to prevent PHPExcel from calculating values for cells whose values begin with an = sign?
After research and given suggestions I ended up iterating through all the cells and rewriting cell values (beginning with = sign), to prevent PHPExcel considering them as formulas. setCellValueExplicit() method instructs PHPExcel to not consider the cell value as a formula in this case.
foreach ($objPHPExcel->getWorksheetIterator() as $worksheet) {
foreach ($worksheet->getRowIterator() as $row) {
$cellIterator = $row->getCellIterator();
$cellIterator->setIterateOnlyExistingCells(true);
foreach ($cellIterator as $cell) {
if (preg_match( '/^=/', $cell->getValue())) {
$cellcoordinate = $cell->getCoordinate();
// store the raw text explicitly as a string so it is not parsed as a formula
$worksheet->setCellValueExplicit($cellcoordinate, $cell->getValue(), PHPExcel_Cell_DataType::TYPE_STRING);
}
}
}
}
It's painful, but I couldn't find a better solution.

Node - exceljs: writing to file breaks formulas in the file

I have an excel (xlsx) file that contains random columns. Some of these columns have formulas mapped to the sum of some cells; for example:
=J8+F9-H9
In my case I have the following three columns:
F: number
H: number
J: =sum of previous row's F and H cell's values.
I aim to get external data and store them cell by cell in this workbook. For this I am using Node module exceljs.
This is my code so far. I am hardcoding values for now (I will be getting them from another file later on).
var workbook = new Excel.Workbook();
var filename = 'Bank Synoptic Journal.xlsx'
workbook
.xlsx
.readFile(filename)
.then(function() {
var worksheet = workbook.getWorksheet('Bank Synoptic');
var row = null;
row = worksheet.getRow(8);
row.getCell('J').value = Math.random();
row.commit();
for(var i=9; i<=305;i++) { //row
row = worksheet.getRow(i);
row.getCell('F').value = Math.random();
row.getCell('H').value = Math.random();
row.commit();
}
})
.then(function() {
return workbook.xlsx.writeFile(filename + '_modified.xlsx');
})
.then(function() {
console.log('Done!');
});
It writes the output to a new Excel file. The problem I am facing is with the cells in column J, i.e. the ones that contain the formulas; these cells break with no consistency:
Some cells keep formulas and do the calculations
Others have no more formulas nor calculations done (have '0' instead of formula)
Recalculations are not done automatically using this injection mechanism
(Snapshots)
What am I missing or doing wrong that leads to this error?
After several trials and errors I moved to Apache POI and so built the script using Java.
I downloaded and included the following JARs in my project:
It manipulates rows/columns and keeps the formulas intact. Once you open the modified excel file all you have to do is refresh (On Windows: ctrl + alt + f9) and it will recalculate.

How to make Google form submission sort into separate sheets?

I can only get this to sort automatically when I input the text manually into the cell. I've tried changing to OnFormSubmit but no luck. How can I rewrite this to have Google Docs automatically sort the form-submitted answers to separate tabs?
function onEdit(event) {
// assumes source data in sheet named Needed
// target sheet of move to named Acquired
// test column with yes/no is col 4 or D
var ss = SpreadsheetApp.getActiveSpreadsheet();
var s = SpreadsheetApp.getActiveSheet();
var r = SpreadsheetApp.getActiveRange();
if(s.getName() == "Inbox" && r.getColumn() == 6 && r.getValue() == "Los Angeles") {
var row = r.getRow();
var numColumns = s.getLastColumn();
var targetSheet = ss.getSheetByName("Los Angeles");
var target = targetSheet.getRange(targetSheet.getLastRow() + 1, 1);
s.getRange(row, 1, 1, numColumns).moveTo(target);
s.deleteRow(row)
}
}
Don't write the sorted sheet data manually. Instead, use a single QUERY formula per sheet. With it you can filter, sort, group, and pivot your data as you like.
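If you still prefer a script, the sorting step amounts to grouping the response rows by the value in one column, with one group per target tab. Here is a minimal sketch in plain JavaScript, assuming the responses have been read into a 2D array; groupRowsByColumn and the sample rows are hypothetical.

```javascript
// Hypothetical helper: groups rows by the value found at `columnIndex`.
// A Map is used so that any cell text is safe to use as a key.
function groupRowsByColumn(rows, columnIndex) {
  const groups = new Map();
  for (const row of rows) {
    const key = String(row[columnIndex]);
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(row);
  }
  return groups;
}

// Made-up form responses; index 1 holds the city.
const submissions = [
  ["Alice", "Los Angeles"],
  ["Bob", "New York"],
  ["Carol", "Los Angeles"]
];

const byCity = groupRowsByColumn(submissions, 1);
console.log(byCity.get("Los Angeles").length); // 2
```

Each key would correspond to a tab name, and each group to the rows that belong on that tab.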
