How can I find out a file's locale?

I want to read a file (such as a .txt file) and print it with std::cout.
But if the file's locale doesn't match my system's locale, the output is garbled.
So my question is: how can I find out the file's locale?
If I could get the file's locale, I could change the system's locale to match it, and the text would print correctly.

Read the file using something like:
new InputStreamReader(new FileInputStream(...), encoding)
For the encoding, use whatever the file's source is known to produce. If you don't know it, you have to guess; try UTF-8 as a first test.
Example:
String file = "file location";
String encoding = "utf-8";
try {
new InputStreamReader(new FileInputStream(file), encoding);
} catch (UnsupportedEncodingException e) {
System.out.println("Unknown encoding " + encoding);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
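A guess can at least be checked. The following is a minimal sketch, not a real detector: it decodes the bytes strictly and reports whether the guessed charset produced any malformed sequences. The class and method names (EncodingProbe, decodesCleanly) are made up for illustration. A clean decode does not prove the guess is right, since many single-byte encodings accept any byte sequence; it only rules out obvious mismatches.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingProbe {

    // Returns true if the bytes decode without errors in the given charset.
    // Hypothetical helper: it can reject a wrong guess, not confirm a right one.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "h\u00e9llo".getBytes(StandardCharsets.ISO_8859_1);
        // The UTF-8 bytes pass the UTF-8 check
        System.out.println(decodesCleanly(utf8, StandardCharsets.UTF_8));
        // The ISO-8859-1 bytes contain a lone 0xE9, which is malformed UTF-8
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8));
    }
}
```

If the check fails for UTF-8, the same helper can be tried with the next candidate charset.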

Related

How to skip invalid characters when reading an Excel file

Reading an Excel file using POI failed with the following error:
Caused by: org.xml.sax.SAXParseException; systemId: file://; lineNumber: 105; columnNumber: 147342; An invalid XML character (Unicode: 0xffff) was found in the element content of the document.
at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:178)
at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
In xl/sharedStrings.xml there are <ffff> characters, and they cause this problem.
How can I read the file successfully and simply ignore these invalid characters? For example:
aaa <ffff> bbb ==> aaa bbb
Those invalid characters should not be in the XML, and Excel itself will not put them there. So someone probably did something wrong while creating that file with something other than Excel. That error should be fixed at the source rather than by ignoring the symptoms.
But I know how it feels to be dependent on others' work that will be done in the far future, if ever. So one needs to improvise. In this case that is only possible using ugly low-level methods: because the XML is invalid, parsing it as XML is not possible, so only string replacement will work.
I have shown this already in "APACHE POI EXCEL XmlException: is an invalid XML character, is there any way to preprocess the excel file?". In that case the task was to replace UTF-16 surrogate-pair numeric character references, which are also invalid in XML.
Below I show code that is more flexible, so that other repair actions can be added for /xl/sharedStrings.xml if necessary.
The principle is to use OPCPackage, which represents the *.xlsx ZIP package, to get /xl/sharedStrings.xml out as a text string. Then do the needed replacements and put the repaired /xl/sharedStrings.xml back into the OPCPackage. Finally, create the XSSFWorkbook from that repaired OPCPackage instead of from the corrupt file.
import java.io.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import org.apache.poi.openxml4j.opc.*;
import java.util.regex.Pattern;
import java.util.regex.Matcher;
class RepairSharedStringsTable {
static String removeInvalidXmlCharacters(String string) {
String xml10pattern = "[^"
+ "\u0009\r\n"
+ "\u0020-\uD7FF"
+ "\uE000-\uFFFD"
+ "\ud800\udc00-\udbff\udfff"
+ "]";
string = string.replaceAll(xml10pattern, "");
return string;
}
static void repairSharedStringsTable(OPCPackage opcPackage) {
for (PackagePart packagePart : opcPackage.getPartsByName(Pattern.compile("/xl/sharedStrings.xml"))) {
String sharedStrings = "";
try (BufferedInputStream inputStream = new BufferedInputStream(packagePart.getInputStream());
ByteArrayOutputStream sharedStringsBytes = new ByteArrayOutputStream() ) {
byte[] buffer = new byte[1024];
int length;
while ((length = inputStream.read(buffer)) != -1) {
sharedStringsBytes.write(buffer, 0, length);
}
sharedStrings = sharedStringsBytes.toString("UTF-8");
} catch (Exception ex) {
ex.printStackTrace();
}
System.out.println(sharedStrings);
//sharedStrings = replaceUTF16SurrogatePairs(sharedStrings);
sharedStrings = removeInvalidXmlCharacters(sharedStrings);
//sharedStrings = doSomethingElse(sharedStrings);
System.out.println(sharedStrings);
try (BufferedOutputStream outputStream = new BufferedOutputStream(packagePart.getOutputStream()) ) {
outputStream.write(sharedStrings.getBytes("UTF-8"));
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
public static void main(String[] args) throws Exception {
try (XSSFWorkbook workbook = new XSSFWorkbook(new FileInputStream("./Excel.xlsx"))) {
System.out.println("success");
} catch (Exception ex) {
System.out.println("failed");
ex.printStackTrace();
}
OPCPackage opcPackage = OPCPackage.open(new FileInputStream("./Excel.xlsx"));
repairSharedStringsTable(opcPackage);
opcPackage.flush();
try (XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);
FileOutputStream out = new FileOutputStream("./ExcelRepaired.xlsx");) {
workbook.write(out);
System.out.println("success");
} catch (Exception ex) {
System.out.println("failed");
ex.printStackTrace();
}
}
}
In my case, all of the files below contain invalid characters:
xl/sharedStrings.xml
xl/worksheets/sheet1.xml
xl/worksheets/sheet8.xml
All of these XML parts need to be processed, so the pattern becomes:
opcPackage.getPartsByName(Pattern.compile("(/xl/sharedStrings.xml)|(/xl/worksheets/.+\\.xml)"))
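The character class used by removeInvalidXmlCharacters can also be written as an explicit code point check, which may be easier to read and to extend. This is a sketch of an alternative, not the answer's code: the class and method names (XmlCharFilter, isValidXml10, stripInvalid) are made up for illustration, and the ranges follow the Char production of XML 1.0.

```java
public class XmlCharFilter {

    // True for code points allowed by the XML 1.0 Char production.
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Copies the input, dropping every code point that XML 1.0 forbids.
    // Unpaired surrogates are dropped too, because codePointAt returns
    // them as values in the range 0xD800-0xDFFF.
    static String stripInvalid(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (isValidXml10(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String repaired = stripInvalid("aaa \uFFFF bbb");
        // Only U+FFFF is removed; both surrounding spaces remain
        System.out.println(repaired.equals("aaa  bbb"));
    }
}
```

Note that only the invalid character itself is removed, so the spaces on either side of it remain in the repaired text.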

How to avoid extra line feeds in MIME export?

I'm exporting MIME eMails with the following code:
public String fromRawMime(final Session s, final Document doc) throws NotesException {
final Stream notesStream = s.createStream();
final MIMEEntity rootMime = doc.getMIMEEntity();
// check if it is multi-part or single
if (rootMime.getContentType().equals("multipart")) {
this.printMIME(rootMime, notesStream);
} else {
// We can just write the content into the
// Notes stream to get the bytes
rootMime.getEntityAsText(notesStream);
}
// Write it out
notesStream.setPosition(0);
ByteArrayOutputStream out = new ByteArrayOutputStream();
out.append(notesStream.read());
notesStream.close();
notesStream.recycle();
rootMime.recycle();
return out.toString();
}
// Write out a mime entry to a Stream object, includes sub entries
private void printMIME(final MIMEEntity mimeRoot, final Stream out) throws NotesException {
if (mimeRoot == null) {
return;
}
// Encode binary as base64
if (mimeRoot.getEncoding() == MIMEEntity.ENC_IDENTITY_BINARY) {
mimeRoot.decodeContent();
mimeRoot.encodeContent(MIMEEntity.ENC_BASE64);
}
out.writeText(mimeRoot.getBoundaryStart(), Stream.EOL_NONE);
mimeRoot.getEntityAsText(out);
out.writeText(mimeRoot.getBoundaryEnd(), Stream.EOL_NONE);
if (mimeRoot.getContentType().equalsIgnoreCase("multipart")) {
// Print preamble if it isn't empty
final String preamble = mimeRoot.getPreamble();
if (!preamble.isEmpty()) {
out.writeText(preamble, Stream.EOL_NONE);
}
// Print content of each child entity - recursive calls
// Include recycle of mime elements
MIMEEntity mimeChild = mimeRoot.getFirstChildEntity();
while (mimeChild != null) {
this.printMIME(mimeChild, out);
final MIMEEntity mimeNext = mimeChild.getNextSibling();
// Recycle to ensure we don't bleed memory
mimeChild.recycle();
mimeChild = mimeNext;
}
}
}
The result contains one extra empty line after each line, including in the content that gets added using getEntityAsText. What am I missing to get rid of the extra lines?
The email RFCs require the use of CRLF to terminate text lines.
You are using EOL_NONE, so the writeText method isn't adding anything to the text, but apparently both the CR and LF are being treated as newlines in your output. You may want to try using out.writeText with EOL_PLATFORM instead.
The devil is in the details...
The printMIME function works just fine. Changing the EOL didn't have an impact. However, I later added EOL_PLATFORM for the final result to separate the headers from the content.
The offending code is this:
notesStream.setPosition(0);
ByteArrayOutputStream out = new ByteArrayOutputStream();
out.append(notesStream.read());
notesStream.close();
It turns out that this code seems to interpret whatever line terminator was in the MIME as two line feeds. So the code needs to be changed to:
notesStream.setPosition(0);
String out = notesStream.readText();
notesStream.close();
So instead of an OutputStream I needed a String, and instead of read() I needed readText(). Now it is working happily in my "project castle".
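The doubling described in this thread is what you would expect when CR and LF are each treated as a line terminator. The sketch below is plain Java without the Notes API, and its names (CrlfDemo, countLines) are made up for illustration. It counts the lines in the same CRLF-terminated text under both interpretations.

```java
public class CrlfDemo {

    // Counts terminated lines by splitting on the given terminator regex.
    // The -1 limit keeps trailing empty strings so every terminator counts.
    static int countLines(String text, String terminatorRegex) {
        return text.split(terminatorRegex, -1).length - 1;
    }

    public static void main(String[] args) {
        String mime = "Subject: test\r\nFrom: someone\r\n";
        // CRLF treated as one terminator: two lines
        System.out.println(countLines(mime, "\r\n"));
        // CR and LF treated separately: four lines, every second one empty
        System.out.println(countLines(mime, "[\r\n]"));
    }
}
```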

OpenIMAJ - error reading feature list saved as ascii

Working with OpenIMAJ, I'd like to save feature lists for later use, but I get a java.util.NoSuchElementException: No line found exception (see below) when re-reading the feature file I just saved. I've checked that the text file exists, though I'm not sure whether its full contents are what they ought to be (it's very long).
Any ideas what's wrong?
Thanks in advance!
(My trial code is pasted below).
java.util.NoSuchElementException: No line found
at java.util.Scanner.nextLine(Unknown Source)
at org.openimaj.image.feature.local.keypoints.Keypoint.readASCII(Keypoint.java:296)
at org.openimaj.feature.local.list.LocalFeatureListUtils.readASCII(LocalFeatureListUtils.java:170)
at org.openimaj.feature.local.list.LocalFeatureListUtils.readASCII(LocalFeatureListUtils.java:136)
at org.openimaj.feature.local.list.MemoryLocalFeatureList.read(MemoryLocalFeatureList.java:134)
...
My trial code looks like this:
Video<MBFImage> originalVideo = getVideo();
MBFImage frame = originalVideo.getCurrentFrame().clone();
DoGSIFTEngine engine = new DoGSIFTEngine();
LocalFeatureList<Keypoint> originalFeatureList = engine.findFeatures(frame.flatten());
try {
originalFeatureList.writeASCII(new PrintWriter(new File("featureList.txt")));
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Saved feature list with "+originalFeatureList.size()+" keypoints.");
MemoryLocalFeatureList<Keypoint> loadedFeatureList = null;
try {
loadedFeatureList = MemoryLocalFeatureList.read(new File("featureList.txt"), Keypoint.class);
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Loaded feature list with "+loadedFeatureList.size()+" keypoints.");
I think the problem is that you're not closing the PrintWriter used to save the features, so it hasn't had a chance to actually write the contents. However, you shouldn't really use the LocalFeatureList.writeASCII method directly, as it will not write the header information; use IOUtils.writeASCII instead. Replace:
originalFeatureList.writeASCII(new PrintWriter(new File("featureList.txt")));
with
IOUtils.writeASCII(new File("featureList.txt"), originalFeatureList);
and then it should work. This also deals with closing the file once it's written.
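The buffering issue is not specific to OpenIMAJ. As a minimal sketch in plain Java, with made-up names (WriterCloseDemo, writeAndClose), a try-with-resources block guarantees that the PrintWriter is closed, and therefore flushed, before the file is read back:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;

public class WriterCloseDemo {

    // Writes the text and returns the resulting file length in bytes.
    // Leaving the try block closes the writer, which flushes its buffer.
    static long writeAndClose(File target, String text) {
        try (PrintWriter writer = new PrintWriter(target)) {
            writer.print(text);
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
        return target.length();
    }

    public static void main(String[] args) {
        File target = new File(System.getProperty("java.io.tmpdir"), "writer-close-demo.txt");
        target.deleteOnExit();
        // All five characters are on disk once the writer has been closed
        System.out.println(writeAndClose(target, "hello"));
    }
}
```

Without the close, some or all of the text can still be sitting in the writer's buffer when the file is opened for reading, which would explain a reader running out of lines.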

Convert office document to pdf and display it on the browser

Please see the updated question below (not the top one).
I tried to open a document of any type (especially PDF) on Liferay using this function, but I always get the message "Awt Desktop is not supported!" as printed by the function. How can I enable the AWT Desktop? I searched the internet and found nothing. Can anyone help, please? Thanks.
public void viewFileByAwt(String file) {
try {
File File = new File(getPath(file));
if (File.exists()) {
if (Desktop.isDesktopSupported()) {
Desktop.getDesktop().open(File);
} else {
System.out.println("Awt Desktop is not supported!");
}
} else {
// File does not exist
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
Source: http://www.mkyong.com/java/how-to-open-a-pdf-file-in-java/
UPDATE
As you can see in the code below, both modes (1 for download and 2 for preview) work pretty well, but unfortunately the second mode (preview) works only for PDF.
What I want now is that, when the user clicks the preview button, files other than PDF (limited to the extensions DOC, DOCX, XLS, XLSX, ODT, ODS) are first converted to PDF and then displayed in the browser in the same way as in the code below. Is it possible to do that? If it's too hard to have all of the converters in one function, a separate function for each extension would be fine.
public StreamedContent getFileSelected(final StreamedContent doc, int mode) throws Exception {
//Mode: 1-download, 2-preview
try {
File localfile = new File(getPath(doc.getName()));
FileInputStream fis = new FileInputStream(localfile);
if (mode == 2 && !(doc.getName().substring(doc.getName().lastIndexOf(".") + 1)).matches("pdf")) {
localfile = DocumentConversionUtil.convert(doc.getName(), fis, doc.getName().substring(doc.getName().lastIndexOf(".") + 1), "pdf");
fis = new FileInputStream(localfile.getPath());
}
if (localfile.exists()) {
try {
PortletResponse portletResponse = (PortletResponse) FacesContext.getCurrentInstance().getExternalContext().getResponse();
HttpServletResponse res = PortalUtil.getHttpServletResponse(portletResponse);
if (mode == 1) res.setHeader("Content-Disposition", "attachment; filename=\"" + doc.getName() + "\"");
else if (mode == 2) res.setHeader("Content-Disposition", "inline; filename=\"" + doc.getName() + "\"");
res.setHeader("Content-Transfer-Encoding", "binary");
res.setContentType(getMimeType(localfile.getName().substring(localfile.getName().lastIndexOf(".") + 1)));
res.flushBuffer();
OutputStream out = res.getOutputStream();
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = fis.read(buffer)) != -1) {
out.write(buffer, 0, bytesRead);
buffer = new byte[4096];
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (fis != null)
fis.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
} catch (Exception ex) {
ex.printStackTrace();
}
return null;
}
Liferay is a portal server; its user interface runs in a browser. AWT is the Java 1.0 basis for desktop UIs.
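I don't think AWT is the way to display it: the code runs on the server, not in the user's browser, so Desktop could at best open the file on the server machine.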
Why can't you open the file and stream the bytes to the portlet using the application/pdf MIME type?
First you have to install OpenOffice on your machine:
http://www.liferay.com/documentation/liferay-portal/6.1/user-guide/-/ai/openoffice
After configuring OpenOffice with Liferay, you can use Liferay's DocumentConversionUtil class to convert documents:
DocumentConversionUtil.convert(String id, InputStream is, String sourceExtension, String targetExtension)
The call above returns the converted document as an input stream (the question's code assigns the result to a File, so the return type may differ between Liferay versions). After the conversion you can show the PDF in your browser.
Hope this helps!
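The decision about which files to send through the converter can be kept separate from the streaming code. The sketch below uses only plain Java and does not call Liferay; the names (PreviewHelper, extensionOf, needsConversion) are made up for illustration, and the extension list is the one given in the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class PreviewHelper {

    // Extensions the question wants converted to PDF before preview
    private static final Set<String> CONVERTIBLE = new HashSet<>(
            Arrays.asList("doc", "docx", "xls", "xlsx", "odt", "ods"));

    // Lower-cased extension without the dot, or "" if the name has none
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True if the file should be converted to PDF before it is previewed
    static boolean needsConversion(String fileName) {
        return CONVERTIBLE.contains(extensionOf(fileName));
    }

    public static void main(String[] args) {
        System.out.println(needsConversion("report.DOCX"));
        System.out.println(needsConversion("manual.pdf"));
    }
}
```

Unlike the matches("pdf") test in the question, this comparison ignores case, so a name such as report.PDF is streamed as a PDF rather than being sent to the converter.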

Set file permission in java 5

I am using the code below to upload an image. The problem is that after uploading the image I can't change the file permission. By default the file's permission is set to rw-r--r-- (0644). Is it possible to change the file permission, or to have it set to 0777 by default? It works fine on my local system, but I am not able to change the permission on my Linux server.
<%
try
{
int filesize=0;
String fieldname="",fieldvalue="",filename="",content="",bookid="",bkdescription="";
try {
List<FileItem> items = new ServletFileUpload(new DiskFileItemFactory()).parseRequest(request);
for (FileItem item : items) {
if (item.isFormField()) {
fieldname = item.getFieldName();
fieldvalue = item.getString();
if(fieldname.equals("homeid")){
bookid=fieldvalue;
}
if(fieldname.equals("bkdescription")){
bkdescription=fieldvalue;
}
} else {
try{
fieldname = item.getFieldName();
filename = FilenameUtils.getName(item.getName());
InputStream filecontent = item.getInputStream();
filesize=(int)item.getSize();
filename="literal_"+bookid+".jpg";
if(filesize>0){
byte[] b=new byte[filesize];
int c=0;
File f=new File(getServletConfig().getServletContext().getRealPath("/")+"/imagesX");
String filePah=getServletConfig().getServletContext().getRealPath("/")+"/imagesX";
if(f.isDirectory())
{
String fl[]=f.list();
for(int i=0;i<fl.length;i++)
{
File fd=new File(getServletConfig().getServletContext().getRealPath("/")+"/imagesX/"+fl[i]);
if(fd.getName().equals(filename))
fd.delete();
}
}
if(!f.exists())
{
new File(filePah).mkdir();
f.mkdir();
}
java.io.FileOutputStream fout=new java.io.FileOutputStream(getServletConfig().getServletContext().getRealPath("/")+"/imagesX/"+filename);
while((c = filecontent.read(b)) != -1 )
{
fout.write(b, 0, c);
}
fout.close();
filecontent.close();
}
}catch (Exception e) {
System.out.println("Exception in creation of file :"+e);
}
}
}
} catch (FileUploadException e) {
throw new ServletException("Cannot parse multipart request.", e);
}
}
catch(Exception exp)
{
out.println(exp);
}
%>
You cannot change file permissions through the standard Java 5 API.
Your system's default umask (typically 022) gives new files the permission 0644. It wouldn't be a good idea to change the default umask.
What you need to do is set the permission of your directory to 0777 and then define a default ACL on the directory (for example with setfacl -d), so that all new files created inside it inherit the same permissions.
Here is a link that shows how to go about it:
https://superuser.com/questions/151911/how-to-make-new-file-permission-inherit-from-the-parent-directory
An alternative solution is to change the permissions externally with a system command, chmod.
Example:
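import java.io.BufferedReader;
import java.io.InputStreamReader;

public static void runCmd(String[] cmd) {
    try {
        Process p = Runtime.getRuntime().exec(cmd);
        // Drain the command's output so the process cannot block on a full pipe
        BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
        while (r.readLine() != null) {
            // output is not needed
        }
        r.close();
        // Wait for the command to finish before relying on the new permissions
        p.waitFor();
    } catch (Exception e) {
        e.printStackTrace();
    }
}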
runCmd(new String[] {
"/bin/chmod",
"755",
"/path/to/your/script"
});
P.S. were you also trying to call Java from a stored proc in an Oracle database?
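If moving beyond Java 5 is an option, java.io.File gained setReadable, setWritable and setExecutable in Java 6, so the permissions can be changed without an external command. The sketch below is an illustration only; the names (PermissionDemo, makeWorldAccessible, newScratchFile) are made up, and the result is only a rough equivalent of chmod 777 because these methods distinguish just "owner" and "everybody".

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class PermissionDemo {

    // Rough equivalent of chmod 777, limited to what java.io.File exposes.
    // The second argument (ownerOnly = false) applies the change to everybody.
    static boolean makeWorldAccessible(File file) {
        boolean readable = file.setReadable(true, false);
        boolean writable = file.setWritable(true, false);
        boolean executable = file.setExecutable(true, false);
        return readable && writable && executable;
    }

    // Creates an empty temporary file to experiment on
    static File newScratchFile() {
        try {
            File file = File.createTempFile("upload", ".jpg");
            file.deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File uploaded = newScratchFile();
        makeWorldAccessible(uploaded);
        System.out.println(uploaded.canRead() && uploaded.canWrite());
    }
}
```

In the upload code from the question, such a call would go right after fout.close().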
