I am trying to use the documents4j Java library in my Android app to convert a DOCX document to PDF, but the output PDF file is corrupted.
The output PDF file is empty (0 bytes).
I am using the code below to convert DOCX to PDF.
String uniqueString = UUID.randomUUID().toString();
File outputFile = new File(Environment.getExternalStorageDirectory() + "/meer_" + uniqueString+".pdf");
File inputWord = new File(input);
try {
InputStream docxInputStream = new FileInputStream(inputWord);
OutputStream outputStream = new FileOutputStream(outputFile);
IConverter converter = LocalConverter.builder().build();
converter.convert(docxInputStream).as(DocumentType.DOCX).to(outputStream).as(DocumentType.PDF).execute();
outputStream.close();
System.out.println("success");
} catch (Exception e) {
e.printStackTrace();
}
if(outputFile.exists()){
openPdf(outputFile);
}
documents4j works by delegating the conversion from the Java application to an instance of MS Word. That instance can run on a server that you reach via HTTP(S), but the local converter cannot work on Android, which is unable to run MS Word.
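Independently of which converter is used, the question's code opens the output whenever the file exists, even if it is 0 bytes long. Below is a minimal sketch of a guard that could run before `openPdf(outputFile)`. The class and method names (`PdfOutputCheck`, `hasPdfHeader`, `looksLikePdf`) are hypothetical and not part of documents4j or Android; the only fact relied on is that a PDF file starts with the ASCII marker `%PDF-`. It does not repair the conversion, it only avoids opening an obviously broken result.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of documents4j: rejects empty or non-PDF output.
class PdfOutputCheck {

    // A PDF file starts with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** True if the given leading bytes begin with the PDF marker. */
    static boolean hasPdfHeader(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** True if the file exists, is long enough, and starts with the PDF marker. */
    static boolean looksLikePdf(File file) {
        if (!file.isFile() || file.length() < PDF_MAGIC.length) {
            return false; // missing, a directory, or too short (e.g. 0 bytes)
        }
        byte[] head = new byte[PDF_MAGIC.length];
        try (InputStream in = new FileInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    return false; // file ended before the marker was complete
                }
                read += n;
            }
        } catch (IOException e) {
            return false; // unreadable files are treated as unusable
        }
        return hasPdfHeader(head);
    }
}
```

With this in place, the final check in the question could become `if (PdfOutputCheck.looksLikePdf(outputFile)) { openPdf(outputFile); }`.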
In my web application I have an image upload module. I want to check whether the uploaded file is an image or some other kind of file. I am using Java on the server side.
The image is read as a BufferedImage in Java and then written to disk with ImageIO.write().
How can I check whether the BufferedImage is really an image or something else?
Any suggestions or links would be appreciated.
I'm assuming that you're running this in a servlet context. If it's affordable to check the content type based on just the file extension, then use ServletContext#getMimeType() to get the mime type (content type). Just check if it starts with image/.
String fileName = uploadedFile.getFileName();
String mimeType = getServletContext().getMimeType(fileName);
if (mimeType != null && mimeType.startsWith("image/")) { // getMimeType() returns null for unknown extensions
// It's an image.
}
The default mime types are defined in the web.xml of the servlet container in question. In Tomcat, for example, it's located in /conf/web.xml. You can extend or override it in the /WEB-INF/web.xml of your webapp as follows:
<mime-mapping>
<extension>svg</extension>
<mime-type>image/svg+xml</mime-type>
</mime-mapping>
But this doesn't protect you from users who fool you by changing the file extension. If you'd like to cover this as well, then you can also determine the mime type based on the actual file content. If it's affordable to check for only BMP, GIF, JPG or PNG types (but not TIF, PSD, SVG, etc), then you can just feed it directly to ImageIO#read() and check that it returns an image. Note that ImageIO#read() returns null (rather than throwing) when no registered reader recognizes the content, and throws IOException only when reading fails.
try (InputStream input = uploadedFile.getInputStream()) {
    if (ImageIO.read(input) != null) {
        // It's an image (only BMP, GIF, JPG and PNG are recognized).
    } else {
        // It's not an image.
    }
} catch (IOException e) {
    // The content could not be read; treat it as not an image.
}
But if you'd like to cover more image types as well, then consider using a 3rd party library which does all the work by sniffing the file headers, for example JMimeMagic or Apache Tika, which support BMP, GIF, JPG, PNG, TIF and PSD (but not SVG). Apache Batik supports SVG. The example below uses JMimeMagic:
try (InputStream input = uploadedFile.getInputStream()) {
String mimeType = Magic.getMagicMatch(input, false).getMimeType();
if (mimeType.startsWith("image/")) {
// It's an image.
} else {
// It's not an image.
}
}
If necessary, you can combine these checks and weigh one against the other.
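To make "sniffing the file headers" concrete, here is a minimal hand-rolled sketch that compares the first bytes of the upload against a few well-known signatures. It is illustrative only: the class and method names are made up, it is not how JMimeMagic or Tika are implemented, and it covers far fewer formats than those libraries.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: a hand-rolled signature check, not the JMimeMagic or Tika API.
class ImageSignatures {

    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] GIF87 = "GIF87a".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] GIF89 = "GIF89a".getBytes(StandardCharsets.US_ASCII);
    // Only two bytes long, so this is a weak hint that can match unrelated content.
    private static final byte[] BMP = "BM".getBytes(StandardCharsets.US_ASCII);

    /** Returns a MIME type guessed from the leading bytes, or null if none matches. */
    static String sniff(byte[] head) {
        if (startsWith(head, PNG)) return "image/png";
        if (startsWith(head, JPEG)) return "image/jpeg";
        if (startsWith(head, GIF87) || startsWith(head, GIF89)) return "image/gif";
        if (startsWith(head, BMP)) return "image/bmp";
        return null;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

One way to combine the checks is to accept an upload only when the extension-based mime type starts with `image/` and `ImageSignatures.sniff(...)` on the first bytes also returns a non-null value.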
That said, you don't necessarily need ImageIO#write() to save the uploaded image to disk. Just writing the obtained InputStream directly to a Path or any OutputStream like FileOutputStream the usual Java IO way is more than sufficient (see also Recommended way to save uploaded files in a servlet application):
try (InputStream input = uploadedFile.getInputStream()) {
Files.copy(input, new File(uploadFolder, fileName).toPath());
}
Unless you'd like to gather some image information like its dimensions and/or want to manipulate it (crop/resize/rotate/convert/etc) of course.
I used org.apache.commons.imaging.Imaging in my case. Below is a sample piece of code to check if an image is a jpeg image or not. It throws ImageReadException if uploaded file is not an image.
try {
//image is InputStream
byte[] byteArray = IOUtils.toByteArray(image);
ImageFormat mimeType = Imaging.guessFormat(byteArray);
if (mimeType == ImageFormats.JPEG) {
return;
} else {
// handle image of different format. Ex: PNG
}
} catch (ImageReadException e) {
//not an image
}
This is built into the JDK and simply requires a stream with support for mark/reset:
byte[] data = ;
InputStream is = new BufferedInputStream(new ByteArrayInputStream(data));
String mimeType = URLConnection.guessContentTypeFromStream(is);
//...close stream
Since Java SE 6 https://docs.oracle.com/javase/6/docs/api/java/net/URLConnection.html
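A self-contained version of the snippet above might look like the following sketch. The wrapper class and method names are made up for illustration; `URLConnection.guessContentTypeFromStream` is the real JDK call. It recognizes only a limited set of formats and returns null for anything else, so a null result means "unknown" rather than "definitely not an image".

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

// Hypothetical wrapper around the JDK's built-in content sniffing.
class ContentTypeGuess {

    /** Returns the guessed MIME type, or null when the JDK does not recognize the bytes. */
    static String guess(byte[] data) {
        // BufferedInputStream supports mark/reset, which guessContentTypeFromStream needs.
        try (InputStream is = new BufferedInputStream(new ByteArrayInputStream(data))) {
            return URLConnection.guessContentTypeFromStream(is);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the guessed MIME type is one of the image types the JDK knows. */
    static boolean isImage(byte[] data) {
        String mimeType = guess(data);
        return mimeType != null && mimeType.startsWith("image/");
    }
}
```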
Try using multipart file instead of BufferedImage
import org.apache.http.entity.ContentType;
...
public void processImage(MultipartFile file) {
if(!Arrays.asList(ContentType.IMAGE_JPEG.getMimeType(), ContentType.IMAGE_PNG.getMimeType(), ContentType.IMAGE_GIF.getMimeType()).contains(file.getContentType())) {
throw new IllegalStateException("File must be an Image");
}
}
I am using Selenium WebDriver + Maven + TestNG.
I am trying to generate an Excel report using the Xl.generateReport method. Below is the code.
public void SendMail() throws Exception
{
String userDirector = System.getProperty("user.dir");
String resultFile = userDirector + "/Reports/";
Xl.generateReport(resultFile,"excel-report.xlsx");
Thread.sleep(3000);
Date dt = new Date();
SimpleDateFormat formatter = new SimpleDateFormat("dd/MM/yyyy");
String strDate= formatter.format(dt);
//authentication info
final String username = "abc1@test.com";
final String password = "pass#123";
String fromEmail = "abc2@test.com";
String toEmail = "abc3@test.com";
String ccEmail= "ravi@test.com";
String ccEMail2="Rajeev@test.com";
Properties properties = new Properties();
properties.put("mail.smtp.auth", "true");
properties.put("mail.smtp.starttls.enable", "true");
properties.put("mail.smtp.host", "smtp.office365.com");
properties.put("mail.smtp.port", "587");
Session session = Session.getInstance(properties, new javax.mail.Authenticator() {
protected PasswordAuthentication getPasswordAuthentication() {
return new PasswordAuthentication(username,password);
}
});
//Start our mail message
MimeMessage msg = new MimeMessage(session);
try {
msg.setFrom(new InternetAddress(fromEmail));
msg.addRecipient(Message.RecipientType.TO, new InternetAddress(toEmail));
//msg.addRecipient(Message.RecipientType.CC, new InternetAddress(ccEmail));
msg.addRecipient(Message.RecipientType.CC, new InternetAddress(ccEMail2));
msg.setSubject("Test Execution Report on "+strDate);
Multipart emailContent = new MimeMultipart();
//Text body part
MimeBodyPart textBodyPart = new MimeBodyPart();
textBodyPart.setText("Hello, Good day! \n"
+ "\n"
+ "All scenarios have been executed. Please find the attached report of the execution. \n"
+ "\n"
+ "Thanks,\n"
+ "Mahesh.");
//Attachment body part.
MimeBodyPart pdfAttachment = new MimeBodyPart();
pdfAttachment.attachFile(resultFile+ "excel-report.xlsx");
//Attach body parts
emailContent.addBodyPart(textBodyPart);
emailContent.addBodyPart(pdfAttachment);
//Attach multi-part to message
msg.setContent(emailContent);
Transport.send(msg);
System.out.println("Sent message");
} catch (MessagingException e) {
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
The code above creates a report and stores it in the Reports folder of my project directory.
It then emails the generated Excel file to my email address.
The code works fine on my local machine.
The problem appears when I put this code on the server.
I build a runner JAR and place it in a folder together with the Reports folder, TestNg.xml, and some other folders used for uploading images and similar.
When I execute the JAR on the server, the Excel report is not generated and no email with the report attached is received. The same JAR also fails on other machines that have both Microsoft Office and Java installed.
I am not sure whether there is a path-related issue when I run the JAR on the server.
The server does not have Microsoft Office installed, but I would like code that works regardless of whether Microsoft Office is installed.
One more detail: I also create an .html report in the same path that is used for the Excel report, and this works both locally and on the server. The HTML report is generated correctly under the Reports folder.
I have also tried changing the Excel report path to a folder on my local machine and to the server folder where I keep the package, but nothing worked.
Please help me find a solution.
Thanks in advance.
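The thread does not contain a confirmed cause. One thing that may be worth ruling out is the path: `user.dir` is the working directory the JVM was started from, which is not necessarily the folder that contains the JAR. The sketch below (all names are hypothetical and not part of Selenium, TestNG or the Xl report class) resolves the Reports folder explicitly, creates it if needed, and checks that the report exists and is non-empty before it is attached.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper names; not part of Selenium, TestNG or the Xl report class.
class ReportPaths {

    /** Resolves <baseDir>/Reports as an absolute path and creates it if it is missing. */
    static Path reportsDir(String baseDir) {
        Path dir = Paths.get(baseDir, "Reports").toAbsolutePath();
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot create " + dir, e);
        }
    }

    /** True only if the report file exists and is not empty. */
    static boolean isReadyToAttach(Path report) {
        try {
            return Files.isRegularFile(report) && Files.size(report) > 0;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Printing the result of `ReportPaths.reportsDir(System.getProperty("user.dir"))` on the server shows where the report is actually expected, and calling `isReadyToAttach` on the resolved `excel-report.xlsx` before `attachFile` separates "report was never generated" from "email could not be sent".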
Problem
I'm trying to send a List to an Excel file by using the LoadFromCollection method.
After I generate an XLSX file, it does not open but if I change the extension to XLS, it appears like this.
ExcelFileError
Code
[HttpGet]
[Route("GetExcel")]
public async Task<HttpResponse> GetExcel()
{
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ClearContent();
HttpContext.Current.Response.ClearHeaders();
HttpContext.Current.Response.Buffer = true;
HttpContext.Current.Response.Cache.SetCacheability(HttpCacheability.NoCache);
var stream = new MemoryStream();
using (var excelPackage = new ExcelPackage())
{
ExcelWorksheet ws = excelPackage.Workbook.Worksheets.Add("SoftwareVersao");
var result = await _versaoAppService.Selecionar();
ws.Cells["A1"].LoadFromCollection(result);
ws.Cells.AutoFitColumns();
excelPackage.SaveAs(stream);
}
HttpContext.Current.Response.BinaryWrite(stream.ToArray());
HttpContext.Current.Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
HttpContext.Current.Response.AddHeader("content-disposition", "attachment;filename=" + "SoftwareVersao.xlsx");
HttpContext.Current.Response.Flush();
HttpContext.Current.ApplicationInstance.CompleteRequest();
return HttpContext.Current.Response;
}
This code generates a corrupted XLSX file. If I change the content type and file name to the lines below, I get the result shown in the image above.
HttpContext.Current.Response.ContentType = "application/vnd.ms-excel";
HttpContext.Current.Response.AddHeader("content-disposition", "attachment;filename=" + "SoftwareVersao.xls");
Attempts
I've made several attempts.
I used HttpResponseMessage instead of HttpResponse
I used a blank template in creating the file
I changed the encoding
I used several code snippets
What am I doing wrong? Is there something I forgot to implement?
Update
I found this question and can now generate the XLSX file, but I have the same problem as the author of that question: Excel has to repair the file.
WebApi using EF EPPlus returning gibberish in exce
I am trying to write a C# Azure Function that downloads and opens an Excel file using the OpenXml-SDK.
Office Interop doesn't work here because Office is not available to the Azure Function.
I am trying to use the OpenXml-SDK to open and read the file, which seems to require a path to the saved file rather than the URL or a Stream downloaded from the remote URL.
Since I don't know of a way to temporarily store the Excel file in Azure Functions, I used Azure File Storage.
I uploaded the Excel file from the URL to Azure File Storage, but I cannot open it with the OpenXml-SDK.
I verified that the Excel file in Azure File Storage works; however, when I try to open the OpenXML SpreadsheetDocument from a MemoryStream, I get an error indicating that the file is corrupt.
If I try to open the SpreadsheetDocument by passing the file Uri (https://learn.microsoft.com/en-us/azure/storage/storage-dotnet-how-to-use-files#develop-with-file-storage), the address exceeds the 260-character limit.
I'm open to using a library other than OpenXML, and ideally I would prefer not to have to store the Excel file.
Open XML SDK works fine in Azure Function. I tested it on my side. Here is the full code.
#r "DocumentFormat.OpenXml.dll"
#r "WindowsBase.dll"
using System.Net;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log)
{
log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");
WebClient client = new WebClient();
byte[] buffer = client.DownloadData("http://amor-webapp-test.azurewebsites.net/Content/hello.xlsx");
MemoryStream stream = new MemoryStream();
stream.Write(buffer, 0, buffer.Length);
stream.Position = 0;
using (SpreadsheetDocument doc = SpreadsheetDocument.Open(stream, false))
{
WorkbookPart workbookPart = doc.WorkbookPart;
SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
SharedStringTable sst = sstpart.SharedStringTable;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
Worksheet sheet = worksheetPart.Worksheet;
var cells = sheet.Descendants<Cell>();
var rows = sheet.Descendants<Row>();
log.Info(string.Format("Row count = {0}", rows.LongCount()));
log.Info(string.Format("Cell count = {0}", cells.LongCount()));
// One way: go through each cell in the sheet
foreach (Cell cell in cells)
{
if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
{
int ssid = int.Parse(cell.CellValue.Text);
string str = sst.ChildElements[ssid].InnerText;
log.Info(string.Format("Shared string {0}: {1}", ssid, str));
}
else if (cell.CellValue != null)
{
log.Info(string.Format("Cell contents: {0}", cell.CellValue.Text));
}
}
}
return req.CreateResponse(HttpStatusCode.OK, "Hello ");
}
To use Open XML, please make sure you have created a bin folder under your function folder and uploaded DocumentFormat.OpenXml.dll and WindowsBase.dll to it.
"File contains corrupted data".
Have you tried another Excel file to check whether the issue is specific to that file? I suggest you create a new, simple Excel file and test your code again.
"It didn't work on my file with the same "File contains corrupted data" message. "
I downloaded your Excel file and found that it is in the older Excel format (.xls).
To fix the exception, you could convert the file to the newer format (.xlsx) or choose another Excel parsing library. ExcelDataReader can handle any version of Excel file. You can install it through NuGet by searching for 'ExcelDataReader'. The following sample code shows how to parse an .xls file. I tested it in an Azure Function and it worked fine.
#r "Excel.dll"
#r "System.Data"
using System.Net;
using System.IO;
using Excel;
using System.Data;
public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log)
{
log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");
WebClient client = new WebClient();
byte[] buffer = client.DownloadData("http://amor-webapp-test.azurewebsites.net/Content/abcdefg.xls");
MemoryStream stream = new MemoryStream();
stream.Write(buffer, 0, buffer.Length);
stream.Position = 0;
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
DataSet result = excelReader.AsDataSet();
for (int i = 0; i < result.Tables.Count; i++)
{
log.Info(result.Tables[i].TableName +" has " + result.Tables[i].Rows.Count + " rows.");
}
return req.CreateResponse(HttpStatusCode.OK, "Hello ");
}
Please add the "Excel.dll" file to the bin folder of your function before executing the code above.
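The "File contains corrupted data" error in this exchange came from handing a legacy .xls file to an .xlsx parser. The two formats can be told apart from their first bytes: an .xlsx file is a ZIP container (starting with `PK\x03\x04`), while an .xls file is an OLE2 compound document (starting with `D0 CF 11 E0 A1 B1 1A E1`). The sketch below is in Java rather than the thread's C#, and its names are hypothetical, but the byte comparison carries over directly and could be used to pick the right reader before parsing.

```java
// Hypothetical helper: distinguishes the container formats used by .xlsx and .xls files.
class ExcelFormatSniffer {

    enum Kind { XLSX_ZIP, XLS_OLE2, UNKNOWN }

    // ZIP local file header signature; says "ZIP container", not specifically "spreadsheet".
    private static final int[] ZIP = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound document signature, used by legacy .xls (and .doc, .ppt).
    private static final int[] OLE2 = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** Classifies the leading bytes of a downloaded file. */
    static Kind sniff(byte[] head) {
        if (startsWith(head, ZIP)) return Kind.XLSX_ZIP;
        if (startsWith(head, OLE2)) return Kind.XLS_OLE2;
        return Kind.UNKNOWN; // e.g. HTML or CSV saved with an Excel extension
    }

    private static boolean startsWith(byte[] data, int[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) { // mask to compare as unsigned bytes
                return false;
            }
        }
        return true;
    }
}
```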
If you do need to save a temporary file, Azure Functions has a %TEMP% environment variable with a path to a temporary folder. This is a folder that is local to the vm that runs your function and will not be persisted.
However, saving the file locally / in Azure Files is unnecessary. You should be able to get the stream from the response to your get request and pass it straight to OpenXML.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(originalExcelUrl);
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream stream = response.GetResponseStream())
{
var doc = SpreadsheetDocument.Open(stream, true);
// etc
}
the file you are trying to open is in a different format than specified by the file extension c# error when trying to open file in excel.
Here is my code
public ActionResult Export(string filterBy)
{
MemoryStream output = new MemoryStream();
StreamWriter writer = new StreamWriter(output, Encoding.UTF8);
var data = City.GetAll().Select(o => new
{
CountryName = o.CountryName,
StateName = o.StateName,
o.City.Name,
Title = o.City.STDCode
}).ToList();
var grid = new GridView { DataSource = data };
grid.DataBind();
var htw = new HtmlTextWriter(writer);
grid.RenderControl(htw);
writer.Flush();
output.Position = 0;
return File(output, "application/vnd.ms-excel", "test.xls");
}
When I try to open the file in Excel, I get this error:
the file you are trying to open is in a different format than
specified by the file extension
After clicking Yes, the file opens properly, but I don't want this message to appear.
I have used ClosedXML to solve the problem.
public static void ExportToExcel(IEnumerable<dynamic> data, string sheetName)
{
XLWorkbook wb = new XLWorkbook();
var ws = wb.Worksheets.Add(sheetName);
ws.Cell(2, 1).InsertTable(data);
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
HttpContext.Current.Response.AddHeader("content-disposition", String.Format(@"attachment;filename={0}.xlsx",sheetName.Replace(" ","_")));
using (MemoryStream memoryStream = new MemoryStream())
{
wb.SaveAs(memoryStream);
memoryStream.WriteTo(HttpContext.Current.Response.OutputStream);
memoryStream.Close();
}
HttpContext.Current.Response.End();
}
Installed ClosedXML in my project using Nuget Package Manager.
the file you are trying to open is in a different format than
specified by the file extension
You are constantly getting that warning message because the file that got created is not an actual excel file. If you will look into the generated file, it's just a bunch of html tags. Remember that a GridView's RenderControl will generate an html table.
To fix your issue, you need to either use a third party tool that creates a real excel file (one tool you might want to use is NPOI) or create a comma-delimited file, or simply a csv file, and return that file.
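For the CSV route, the main detail is quoting: fields that contain a comma, a double quote, or a line break are wrapped in double quotes, and embedded quotes are doubled (the convention described in RFC 4180). Below is a minimal sketch in Java rather than the thread's C#; the class and method names are hypothetical, and it leaves out concerns such as encoding and BOM handling.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds CSV text with RFC 4180-style quoting.
class CsvWriter {

    /** Quotes a single field if it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\""); // double any embedded quotes
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Joins rows with CRLF line endings, with a trailing CRLF after the last row. */
    static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream().map(CsvWriter::escape).collect(Collectors.joining(",")))
                .collect(Collectors.joining("\r\n", "", "\r\n"));
    }
}
```

The resulting text would be returned with a `text/csv` content type and a `.csv` file name, so that the extension matches the actual content and Excel has no reason to show the format warning.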
In case someone else stumbles across this... I needed to convert blobs back into files on the fly in C#. PDFs worked well, but Excel files gave me the same error the OP describes.
This is the code I wrote which handles excel differently from other file types.
Giving excel application/octet-stream with an actual filename solved my issue. Probably not the cleanest way to do it but it was good enough for my purposes.
string theExt = Path.GetExtension(theDoc.documentFileName).ToUpper();
Response.Clear();
if (theExt == ".XLS" || theExt == ".XLSX"){
Response.ContentType = "application/octet-stream";
Response.AddHeader("Content-Disposition", string.Format("inline; filename={0}", theDoc.documentFileName));
}
else{
Response.ContentType = theDoc.documentMimeType;
Response.AddHeader("Content-Disposition", string.Format("inline; filename={0}", theDoc.documentTitle));
}
using (MemoryStream stream = new MemoryStream(theDoc.file))
{
stream.WriteTo(Response.OutputStream);
stream.Close();
};
Response.End();
In case someone needs to export a DataSet as an Excel file with ClosedXML:
DataSet ds = { your data from db }
var xlsx = new XLWorkbook();
var dataTable = ds.Tables[0];
xlsx.Worksheets.Add(dataTable);
xlsx.SaveAs("export.xlsx");