How to keep the original page rotation in iTextSharp (dll) - Excel

I would like to create a project that reads from Excel, writes to a PDF, and prints that PDF.
From the Excel file, one cell gives the directory where the original PDF is stored on the computer or server, and the next cell contains the text to write at the top of the second (output) PDF.
The problem is this: when the original PDF is horizontal (landscape), my program creates a copy of the original PDF and writes the text from Excel at the top of the copy, but the landscape copy comes out rotated by 270 degrees, which is wrong. For portrait pages the program works fine: the copy is OK and the text at the top of the copy is OK.
Where is the problem in my code?
Code:
public int urediPDF(string inTekst)
{
    if (inTekst != "0")
    {
        string pisava_arialBD = @"..\debug\arial.ttf";
        string oldFile = null;
        string inText = null;
        string indeks = null;
        // split the input string
        string[] vhod = inTekst.Split('#');
        oldFile = vhod[0];
        inText = vhod[1];
        indeks = vhod[2];
        string newFile = @"c:\da\2";
        // open the PDF reader
        PdfReader reader = new PdfReader(oldFile);
        Rectangle size = reader.GetPageSizeWithRotation(reader.NumberOfPages);
        Document document = new Document(size);
        // open the PDF writer
        FileStream fs = new FileStream(newFile + "-" + indeks + ".pdf", FileMode.Create, FileAccess.Write);
        PdfWriter writer = PdfWriter.GetInstance(document, fs);
        //document.Open();
        document.OpenDocument();
        label2.Text = ("Status: " + reader.GetPageRotation(reader.NumberOfPages).ToString());
        // get the direct content for writing into the new PDF
        PdfContentByte cb = writer.DirectContent;
        // choose the font
        BaseFont bf = BaseFont.CreateFont(pisava_arialBD, BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
        cb.SetColorFill(BaseColor.RED);
        cb.SetFontAndSize(bf, 8);
        // write the text into the PDF
        cb.BeginText();
        string text = inText;
        // choose the coordinates for the text (720 degrees rotation = landscape, 90 degrees = portrait)
        if (reader.GetPageRotation(1) == 720) // landscape layout
        {
            cb.ShowTextAligned(1, text, 10, 450, 0);
            cb.EndText();
        }
        else // portrait layout
        {
            cb.ShowTextAligned(1, text + " - pokončen", 10, 750, 0);
            cb.EndText();
        }
        // create the new page and add it to the pdf
        PdfImportedPage page = writer.GetImportedPage(reader, reader.NumberOfPages);
        cb.AddTemplate(page, 0, 0);
        // close the streams and voilà, the file should be changed :)
        document.Close();
        fs.Close();
        writer.Close();
        reader.Close();
    }
    else
    {
        label2.Text = "Status: Končano zapisovanje"; // "Writing finished"
        return 0;
    }
    return 0;
}
(A picture of a mock-up PDF illustrating the problem was attached here.)
As explained many times before (see "ITextSharp include all pages from the input file", "Itext pdf Merge : Document overflow outside pdf (Text truncated) page and not displaying", and so on), you should read chapter 6 of my book iText in Action (you can find the C# version of the examples here).
You are using a combination of Document, PdfWriter and PdfImportedPage to split a PDF. Please tell me who made you do it this way, so that I can curse the person who inspired you (because I've answered this question hundreds of times before, and I'm getting tired of repeating myself). These classes aren't a good choice for that job:
you lose all interactivity,
you need to rotate the content yourself if the page is in landscape,
you need to take the original page size into account,
...
Your problem is similar to this one: "itextsharp: unexpected elements on copied pages". Is there any reason why you didn't read the documentation? If you say "I didn't have the time", please believe me when I say that I have almost 20 years of experience as a developer, and I've never seen "reading documentation" as a waste of time.
Long story short: read the documentation, replace PdfWriter with PdfCopy, replace AddTemplate() with AddPage().
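For reference, here is a minimal sketch of that PdfCopy approach (assuming iTextSharp 5.x; the method name, coordinates and file-path parameters are made up for illustration). PdfCopy.AddPage() keeps the original page as-is, and PdfCopy.PageStamp is one way to put the Excel text on top of the copied page:

// A minimal sketch of the PdfCopy approach (iTextSharp 5.x assumed).
// using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
public static void CopyLastPageWithText(string src, string dest, string overlayText)
{
    PdfReader reader = new PdfReader(src);
    Document document = new Document();
    PdfCopy copy = new PdfCopy(document, new FileStream(dest, FileMode.Create, FileAccess.Write));
    document.Open();
    // PdfCopy keeps the original page size, rotation and interactivity
    PdfImportedPage page = copy.GetImportedPage(reader, reader.NumberOfPages);
    // PageStamp lets us write on top of the copied page before adding it
    PdfCopy.PageStamp stamp = copy.CreatePageStamp(page);
    ColumnText.ShowTextAligned(stamp.GetOverContent(),
        Element.ALIGN_LEFT, new Phrase(overlayText), 10, 750, 0);
    stamp.AlterContents();
    copy.AddPage(page);
    document.Close();
    reader.Close();
}

Because the page is copied as-is, a landscape page keeps its rotation without any manual AddTemplate() math; only the text coordinates may need adjusting per orientation.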

Related

Save xlsx file with Hebrew as txt to load into Photoshop data sets

I have an Excel (xlsx) file with 3 columns of data that is meant to replace data in a Photoshop file (PSD). To do so, I need to load it into Photoshop as a txt file encoded to ANSI, so that Photoshop can read the file and export it a bunch of times, each time with the next row's properties.
However, my Excel file has some Hebrew text that is lost when encoding to ANSI. I tried other encodings, but Photoshop doesn't accept them. How can I still feed Photoshop the Hebrew data? (It's a lot of photos, so I can't do it manually one by one.)
This works for me: I've got a simple text file with some Hebrew text in it.
And from Photoshop:
var myfile = "D:\\temp\\hebrew.txt"; // change this
var text = read_it(myfile);
alert(text);
// השועל החום המהיר קופץ מעל הכלב העצלן.

// function READ IT (filename with path) : returns string
// ----------------------------------------------------------------
function read_it(afilepath)
{
    var theFile = new File(afilepath);
    // read in file
    var words = ""; // text collection string
    var theTextFile = new File(theFile);
    theTextFile.open('r');
    while (!theTextFile.eof)
    {
        var line = theTextFile.readln();
        if (line != null && line.length > 0)
        {
            words += line + "\n";
        }
    }
    theTextFile.close();
    // return string
    return words;
}

Export PDF file from Excel template with Qt and QAxObject

The project I am currently working on exports an Excel file to PDF.
The Excel file is a "template" that allows graphs to be generated. The goal is to fill in some cells of the Excel file so that the graphs are generated, and then to export the file as PDF.
I use Qt in C++ with the QAxObject class. All of the data-writing process works well; it is the PDF export part that doesn't.
The problem is that the generated PDF file also contains the data of the graphs, even though these data are not included in the print area of the Excel template.
The PDF export is done with the "ExportAsFixedFormat" function, whose fifth parameter, "IgnorePrintAreas", controls whether the print area is ignored. Even when I set this parameter to "false" (i.e. do not ignore the print area, and therefore take it into account), the problem remains and the result is the same as if the parameter were set to "true".
I tried varying the other parameters, changing the type of the data passed as parameters, and leaving the parameters out entirely, but the result is always the same.
Here is the link to the "documentation" of the export command "ExportAsFixedFormat":
https://learn.microsoft.com/en-us/office/vba/api/excel.workbook.exportasfixedformat
Here is a simplified version of the sequence of calls executed in the code:
Rapport::Rapport(QObject *parent) : QObject(parent)
{
    // Build the template path from the excel file
    QString pathTemplate = "/ReportTemplate_FR.xlsx";
    QString pathReporter = "/Report";
    this->path = QDir(QDir::currentPath() + pathReporter + pathTemplate);
    QString pathAbsolute(this->path.absolutePath().replace("/", "\\\\"));
    // Create the output pdf file path
    fileName = QString("_" + QDateTime::currentDateTime().toString("yyyyMMdd-HHmmssff") + "_Report");
    QString pathDocument = QStandardPaths::writableLocation(QStandardPaths::DocumentsLocation).append("/").replace("/", "\\\\");
    QString exportName(pathDocument + fileName + ".pdf");
    // Create the QAxObject that is linked to the excel template
    this->excel = new QAxObject("Excel.Application");
    // Create the QAxObject "sheet" that accepts the measurement data
    QAxObject* workbooks = this->excel->querySubObject("Workbooks");
    QAxObject* workbook = workbooks->querySubObject("Add(const QString&)", pathAbsolute);
    QAxObject* sheets = workbook->querySubObject("Worksheets");
    QAxObject* sheet = sheets->querySubObject("Item(int)", 3);
    // Get some measurement data as a list of the inner class Measurement
    QList<Measurement*> actuMeasure = this->getSomeMeasure(); // no need to know how it works...
    // Create a 2-dimensional QVector to be able to place data in the table where we want (specific index)
    QVector<QVector<QVariant>> vCells(actuMeasure.size());
    for (int i = 0; i < vCells.size(); i++)
        vCells[i].resize(6);
    // Fill the 2-dimensional QVector with the measurement data
    int row = 0;
    foreach (Measurement* m, actuMeasure)
    {
        vCells[row][0] = QVariant(m->x);
        vCells[row][1] = QVariant(m->y1);
        vCells[row][2] = QVariant(m->y2);
        vCells[row][3] = QVariant(m->y3);
        vCells[row][4] = QVariant(m->y4);
        vCells[row][5] = QVariant(m->y5);
        row++;
    }
    // Transform the 2-dimensional QVector into a QVariant object
    QVector<QVariant> vvars;
    QVariant var;
    for (int i = 0; i < actuMeasure.size(); i++)
        vvars.append(QVariant(vCells[i].toList()));
    var = QVariant(vvars.toList());
    // Set the QVariant object holding the measurement data on the excel file
    sheet->querySubObject("Range(QString)", "M2:AB501")->setProperty("Value", var);
    // Set the fileName in the page setup (not relevant for this example)
    sheet->querySubObject("PageSetup")->setProperty("LeftFooter", QVariant(fileName));
    // Export to PDF file with options - NOT WORKING !!!
    workbook->dynamicCall("ExportAsFixedFormat(const QVariant&, const QVariant&, const QVariant&, const QVariant&, const QVariant&)", QVariant(0), QVariant(exportName), QVariant(0), QVariant(false), QVariant(false));
    // Close
    workbooks->dynamicCall("Close()");
    this->excel->dynamicCall("Quit()");
}
At this point I really need help finding a way to solve this problem.
I also wonder whether this is a bug in the QAxObject class.
I finally found a solution on another forum.
If anyone needs help, I'll leave the link to the answer.

Excel and Libre Office conflict over Open XML output

Open XML is generating .xlsx files that can be read by Open Office, but not by Excel itself.
With this as my starting point ("Export DataTable to Excel with Open Xml SDK in c#"), I added code to create a .xlsx file. Attempting to open it with Excel, I'm asked if I want to repair the file. Saying yes gets "The workbook cannot be opened or repaired by Microsoft Excel because it's corrupt." After many hours of trying to jiggle the data from my table to make this work, I finally threw up my hands in despair and made a spreadsheet with a single number in the first cell.
Still corrupt.
Renaming it to .zip and exploring shows intact .xml files. On a whim, I took a legit .xlsx file created by Excel, unzipped it, rezipped it without changing the contents, and renamed it back to .xlsx. Excel declared it corrupt. So this is clearly not a content issue, but a file format issue. Giving up on Friday, I sent some of the sample files home and opened them there with Libre Office. There were no issues at all: the file content was correct and Calc had no problem. I'm using Excel for Office 365, 32 bit.
// ignore the bits (var list) that get data from the database. I've reduced this to just the output of a single header line
List<ReportFilingHistoryModel> list = DB.Reports.Report.GetReportClientsFullHistoryFiltered<ReportFilingHistoryModel>(search, client, report, signature);
MemoryStream memStream = new MemoryStream();
using (SpreadsheetDocument workbook = SpreadsheetDocument.Create(memStream, SpreadsheetDocumentType.Workbook))
{
    var workbookPart = workbook.AddWorkbookPart();
    workbook.WorkbookPart.Workbook = new Workbook();
    workbook.WorkbookPart.Workbook.Sheets = new Sheets();
    var sheetPart = workbook.WorkbookPart.AddNewPart<WorksheetPart>();
    var sheetData = new SheetData();
    sheetPart.Worksheet = new Worksheet(sheetData);
    Sheets sheets = workbook.WorkbookPart.Workbook.GetFirstChild<Sheets>();
    string relationshipId = workbook.WorkbookPart.GetIdOfPart(sheetPart);
    uint sheetId = 1;
    if (sheets.Elements<Sheet>().Count() > 0)
    {
        sheetId = sheets.Elements<Sheet>().Select(s => s.SheetId.Value).Max() + 1;
    }
    Sheet sheet = new Sheet() { Id = relationshipId, SheetId = sheetId, Name = "History" };
    sheets.Append(sheet);
    Row headerRow = new Row();
    foreach (var s in "Foo|Bar".Split('|'))
    {
        var cell = new Cell();
        cell.DataType = CellValues.Number;
        cell.CellValue = new CellValue("5");
        headerRow.AppendChild(cell);
    }
    sheetData.AppendChild(headerRow);
}
memStream.Seek(0, SeekOrigin.Begin);
Guid result = DB.Reports.Report.AddClientHistoryList("test.xlsx", memStream.GetBuffer(), "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
return Ok(result);
This should just work. I've noticed other Stack Overflow discussions that point back to the first link I mentioned above. I seem to be doing it right (and Calc concurs). There have been discussions of shared strings and whatnot, but since I'm using plain numbers I shouldn't be having issues. What am I missing here?
In working on this, I went with the notion that some extraneous junk on the end of a .zip file is harmless. 7-Zip, Windows Explorer and Libre Office all seem to agree (as does some other zip program I used at home whose name escapes me). Excel, however, does not. Using the pointer at memStream.GetBuffer() was fine, but using its length was not. (The preceding Seek() was unnecessary.) Limiting the write of the data to a length equal to the current output position keeps Excel from going off the rails.
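In other words, MemoryStream.GetBuffer() returns the stream's whole internal buffer, which is usually larger than the data actually written, and Excel rejects the trailing bytes. A minimal sketch of the fix (keeping the poster's AddClientHistoryList helper exactly as in the question) is to pass only the written bytes, for example via ToArray():

// GetBuffer() returns the full internal buffer (capacity), including unused trailing bytes;
// ToArray() copies exactly memStream.Length bytes, which is what Excel expects.
byte[] exactBytes = memStream.ToArray();
Guid result = DB.Reports.Report.AddClientHistoryList(
    "test.xlsx",
    exactBytes,
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");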

How can I paste the content of a text file 400 times?

It's for a Unity research project. I don't want to have to press Ctrl+V exactly 400 times; I just want to paste it into another .txt file.
This is the text:
http://pastebin.com/m1u4AFAr
Thank you for your help.
Use Unity. Go to that link and copy the text, then run this script in Unity; it will duplicate the text 400 times, save it, and show you where it saved the file. Any text you have in the clipboard will be duplicated 400 times.
void Start()
{
    // Get the clipboard contents
    string fileInfo = GUIUtility.systemCopyBuffer;
    if (fileInfo == null)
    {
        Debug.Log("Clipboard is Empty. Exited");
        return; // exit
    }
    // Duplicate the clipboard text 400 times
    System.Text.StringBuilder crazyfileX400 = new System.Text.StringBuilder();
    for (int i = 0; i < 400; i++)
    {
        crazyfileX400.Append(fileInfo).Append("\r\n");
    }
    string filename = Application.persistentDataPath + "/" + "crazyFile.txt";
    System.IO.File.WriteAllText(filename, crazyfileX400.ToString());
    Debug.Log("File written to " + filename);
}
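If Unity isn't strictly needed, a plain C# console sketch does the same job (the file names here are hypothetical; it reads the pastebin text saved to disk instead of the clipboard):

using System.IO;
using System.Text;

class RepeatFile
{
    static void Main()
    {
        // Hypothetical paths: the pastebin text saved locally, and the output file.
        string source = File.ReadAllText("pastebin.txt");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 400; i++)   // repeat the text 400 times
            sb.Append(source).AppendLine();
        File.WriteAllText("crazyFile.txt", sb.ToString());
    }
}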
Have a look at this open source macro-creation and automation software.
You can use it to write a script that does the job for you. Here's a script for a simple copy and paste:
#c::
Send, {CTRLDOWN}c{CTRLUP}{ALTDOWN}{TAB}{ALTUP}
sleep, 300
Send, {CTRLDOWN}v{CTRLUP}{ENTER}{ALTDOWN}{TAB}{ALTUP}
return
You could adapt it to your situation by adding a loop.

PDF text search and split library

I am looking for a server-side PDF library (or command-line tool) which can:
split a multi-page PDF file into individual PDF files, based on a search of the PDF file's content
Example:
Search for the "Page ???" pattern in the text and split the big PDF into 001.pdf, 002.pdf, ... ???.pdf
A server program will scan the PDF, look for the search pattern, take the page(s) which match the pattern, and save the file to disk.
Integration with PHP / Ruby would be nice; a command-line tool is also acceptable. It will be a server-side (Linux or Win32) batch-processing tool, with no GUI/login. i18n support would be nice but is not required. Thanks~
My company, Atalasoft, has just released some PDF manipulation tools that run on .NET. There is a text-extraction class that you can use to find the text and decide how to split your document, and a very high-level document class that makes the splitting trivial. Suppose you have a Stream to your source PDF and a List, in increasing order, of the starting page of each split; the code to generate your split files then looks like this:
public void SplitPdf(Stream stm, List<int> pageStarts, string outputDirectory)
{
    PdfDocument mainDoc = new PdfDocument(stm);
    int lastPage = mainDoc.Pages.Count - 1;
    for (int i = 0; i < pageStarts.Count; i++) {
        int startPage = pageStarts[i];
        int endPage = (i < pageStarts.Count - 1) ?
            pageStarts[i + 1] - 1 :
            lastPage;
        if (startPage > endPage) throw new ArgumentException("list is not ordered properly", "pageStarts");
        PdfDocument splitDoc = new PdfDocument();
        for (int j = startPage; j <= endPage; j++)
            splitDoc.Pages.Add(mainDoc.Pages[j]);
        string outputPath = Path.Combine(outputDirectory,
            string.Format("{0:D3}.pdf", i + 1));
        splitDoc.Save(outputPath);
    }
}
If you generalize this into a page range struct:
public struct PageRange {
    public int StartPage;
    public int EndPage;
}
where StartPage and EndPage inclusively describe a range of pages, then the code is simpler:
public void SplitPdf(Stream stm, List<PageRange> ranges, string outputDirectory)
{
    PdfDocument mainDoc = new PdfDocument(stm);
    int outputDocCount = 1;
    foreach (PageRange range in ranges) {
        int startPage = Math.Min(range.StartPage, range.EndPage); // assume not in order
        int endPage = Math.Max(range.StartPage, range.EndPage);
        PdfDocument splitDoc = new PdfDocument();
        for (int i = startPage; i <= endPage; i++)
            splitDoc.Pages.Add(mainDoc.Pages[i]);
        string outputPath = Path.Combine(outputDirectory,
            string.Format("{0:D3}.pdf", outputDocCount));
        splitDoc.Save(outputPath);
        outputDocCount++;
    }
}
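A hypothetical usage example of the PageRange overload above (the file paths and page numbers are made up, and Splitter stands for whatever class holds SplitPdf):

// Split input.pdf into two documents: pages 0-2 and pages 3-5 (zero-based, inclusive).
using (FileStream stm = new FileStream(@"C:\temp\input.pdf", FileMode.Open, FileAccess.Read))
{
    List<PageRange> ranges = new List<PageRange> {
        new PageRange { StartPage = 0, EndPage = 2 },
        new PageRange { StartPage = 3, EndPage = 5 }
    };
    new Splitter().SplitPdf(stm, ranges, @"C:\temp\out");
}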
PDFBox is a Java library but it does have some command line tools as well:
http://pdfbox.apache.org/
PDFBox can extract text and also rebuild/split PDFs.
pdfminer plus multi-line pattern matching in Python.
You can use pdfsam to split your file into pages, then use pdftotext (from foolabs.com) to turn these into text, and use Ruby (or grep) to find the strings. Then you have the page ranges and can return the previously generated pages.
