Generate a Word document using different languages - apache-poi

I want to create a Word document that uses different languages. In particular, I have a two-language original text where the language changes between English and German for each paragraph. This is what I tried:
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.poi.xwpf.usermodel.XWPFStyles;
public class DocxCreator {
public static void createDocument(File docxOutput) throws IOException {
XWPFDocument doc = new XWPFDocument();
XWPFStyles docStyles = doc.createStyles();
docStyles.setSpellingLanguage("de-DE");
{
XWPFParagraph para = doc.createParagraph();
XWPFRun run = para.createRun();
run.setLanguage("de-DE"); // XXX: this method does not exist
run.setText("Deutsch");
}
{
XWPFParagraph para = doc.createParagraph();
XWPFRun paraRun = para.createRun();
para.setStyle("en-US");
paraRun.setText("English");
}
/* XXX: How do I add the style "en-US" to the document and set its language to "en-US"? */
/* XXX: How do I enable global grammar and spell checking? */
try (FileOutputStream fos = new FileOutputStream(docxOutput)) {
doc.write(fos);
}
}
public static void main(String[] args) throws IOException {
createDocument(new File("multilang.docx"));
}
}

I do not think this is currently supported by POI.
Generally, the language of the text is specified on the XWPFRun (XWPF) / CharacterRun (HWPF) level.
For HWPF (old binary *.doc format) there exists at least a method CharacterRun.getLanguageCode() - but no respective setter.
For XWPF (new *.docx format) I do not see such a thing at all.
The language codes are the same for *.doc and *.docx. A list is available here.

Related

Finding .txt file using Java ( new to S.O.)

I'm a beginner when it comes to coding and I would appreciate help with my code, which attempts to find a text file (info.txt) that exists on my computer. It should then print out the text in that file. This is what I have so far, but all I get is an error that the file is not found...
import java.io.FileReader;
import java.io.BufferedReader;
import java.io.IOException;
public class demo1 {
static String fileName = "info.txt";
public static void main (String [] args) throws IOException{
FileReader fr = new FileReader (fileName);
BufferedReader br = new BufferedReader (fr);
String currentLine;
while ((currentLine = br.readLine ()) != null){
System.out.println(currentLine);
}
}
}
I would greatly appreciate any input and thanks in advance for your help!
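This question has no answer in the thread. One common cause of a file-not-found error with a bare name such as info.txt is that relative paths are resolved against the JVM's working directory, which in an IDE is often the project root rather than the folder holding the source file. The following is only a minimal sketch of how to check that; the class name FindTextFile and its helper methods are made up for illustration, and UTF-8 is assumed as the file encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class FindTextFile {

    // Resolves a possibly relative name against the JVM's working directory.
    public static Path resolve(String fileName) {
        return Paths.get(fileName).toAbsolutePath().normalize();
    }

    // Prints every line of the file and returns true, or reports the
    // absolute path that was tried and returns false if no file is there.
    public static boolean printFile(String fileName) throws IOException {
        Path path = resolve(fileName);
        if (!Files.isRegularFile(path)) {
            System.err.println("Not found: " + path);
            return false;
        }
        // Assumption: the file is UTF-8 encoded text.
        try (BufferedReader br = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String currentLine;
            while ((currentLine = br.readLine()) != null) {
                System.out.println(currentLine);
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the name used in the question when no argument is given.
        printFile(args.length > 0 ? args[0] : "info.txt");
    }
}
```

If the printed path is not where info.txt actually lives, either move the file there or pass its full path as the first argument.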

How can I set background colour of a run (a word in line or a paragraph) in a docx file by using Apache POI?

I want to create a docx file by using Apache POI.
I want to set background colour of a run (i.e. a word or some parts of a paragraph).
How can I do this?
Is it possible via Apache POI or not?
Thanks in advance
Word provides two possibilities for this. A run can have a real background color (shading), and there is also the so-called highlighting setting.
With XWPF, both are only available through the underlying objects CTShd and CTHighlight. CTShd ships with the default poi-ooxml-schemas-3.13-...jar, but CTHighlight requires the full ooxml-schemas-1.3.jar, as mentioned in https://poi.apache.org/faq.html#faq-N10025.
Example:
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTShd;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STShd;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STHighlightColor;
/*
Importing
org.openxmlformats.schemas.wordprocessingml.x2006.main.STHighlightColor
requires the full ooxml-schemas-1.3.jar, as mentioned in https://poi.apache.org/faq.html#faq-N10025
*/
public class WordRunWithBGColor {
public static void main(String[] args) throws Exception {
XWPFDocument doc= new XWPFDocument();
XWPFParagraph paragraph = doc.createParagraph();
XWPFRun run=paragraph.createRun();
run.setText("This is text with ");
run=paragraph.createRun();
run.setText("background color");
CTShd cTShd = run.getCTR().addNewRPr().addNewShd();
cTShd.setVal(STShd.CLEAR);
cTShd.setColor("auto");
cTShd.setFill("00FFFF");
run=paragraph.createRun();
run.setText(" and this is ");
run=paragraph.createRun();
run.setText("highlighted");
run.getCTR().addNewRPr().addNewHighlight().setVal(STHighlightColor.YELLOW);
run=paragraph.createRun();
run.setText(" text.");
// Close the stream once the document has been written.
try (FileOutputStream out = new FileOutputStream("WordRunWithBGColor.docx")) {
doc.write(out);
}
}
}

word to FO conversion using hwpf apache poi

How do I convert a .doc file to FO using the hwpf.converter.WordToFo class? I have tried searching, but I could only find Word to HTML conversion.
I have also read the WordToFO manual on the apache-poi site, but could not get it to work.
Convert Word to HTML with Apache POI
I have tried to convert .doc to .fo using the following code, but after using apache-fop to convert the .fo file to .png, the images present in the Word file are missing.
package word2fo;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import javax.swing.text.Document;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.HWPFDocumentCore;
import org.apache.poi.hwpf.converter.WordToFoConverter;
import org.apache.poi.hwpf.converter.WordToFoUtils;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.WordToHtmlUtils;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.w3c.dom.Node;
public class Doc2Fo{
public static void main(String[] args) throws Exception {
System.out.println("reached 1");
HWPFDocumentCore wordDocument = WordToFoUtils.loadDoc(new FileInputStream("D:\\Magna.doc"));
System.out.println("reached 2");
WordToFoConverter wordToFoConverter = new WordToFoConverter(
DocumentBuilderFactory.newInstance().newDocumentBuilder()
.newDocument());
System.out.println("reached 3");
wordToFoConverter.processDocument(wordDocument);
org.w3c.dom.Document htmlDocument = wordToFoConverter.getDocument();
ByteArrayOutputStream out = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource((Node) htmlDocument);
StreamResult streamResult = new StreamResult(out);
System.out.println("reached 4");
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer;
try {
serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
//serializer.setOutputProperty(OutputKeys.METHOD, "xml-fo");
serializer.transform(domSource, streamResult);
out.close();
String result = new String(out.toByteArray());
System.out.println(result);
} catch (TransformerConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
HWPFDocument hwpfDocument = new HWPFDocument(POIDataSamples.getDocumentInstance().openResourceAsStream(sampleFileName));
WordToFoConverter wordToFoConverter = new WordToFoConverter(XMLHelper.getDocumentBuilderFactory().newDocumentBuilder().newDocument());
wordToFoConverter.processDocument(hwpfDocument);
StringWriter stringWriter = new StringWriter();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(new DOMSource(wordToFoConverter.getDocument()), new StreamResult(stringWriter));
String result = stringWriter.toString();
return result;

Read a String-object out of a .txt file from the res folder of a Blackberry app

I just started to develop a simple Blackberry app which shows a text sequence in a RichTextField on a MainScreen. When I define the String directly in the source code, I have no problem displaying it. But if I try to read it from a .txt file located in the res folder, I get a NullPointerException.
The code below is what I did so far.
package mypackage;
import java.io.IOException;
import java.io.InputStream;
import net.rim.device.api.io.IOUtilities;
import net.rim.device.api.ui.component.RichTextField;
import net.rim.device.api.ui.container.MainScreen;
public final class MyScreen extends MainScreen{
String str = readFile("Testfile.txt");
public MyScreen(){
setTitle("Read Files");
add(new RichTextField(str));
}
public String readFile(String filename){
InputStream is = this.getClass().getResourceAsStream("/"+filename);
try {
byte[] filebytes = IOUtilities.streamToBytes(is);
is.close();
return new String(filebytes);
}
catch (IOException e){
System.out.println(e.getMessage());
}
return "";
}
}
Parts of this code I found in this forum but my problem is that I don't understand when I have to open a connection and when to close it.
And when do I need a Buffer?
And why do I have to convert an InputStream to a byte[] and then the byte[] to a String?
All I need is one method, where I can type in the Filename and get back a String-Object with the text which is in my .txt file.
And of course the method should save resources...
package mypackage;
import java.io.IOException;
import java.io.InputStream;
import net.rim.device.api.io.IOUtilities;
import net.rim.device.api.ui.component.RichTextField;
import net.rim.device.api.ui.container.MainScreen;
public final class MyScreen extends MainScreen {
public MyScreen() throws IOException {
setTitle("Read Files");
add(new RichTextField(readFileToString("Testfile.txt")));
}
public String readFileToString(String path) throws IOException {
InputStream is = getClass().getResourceAsStream("/"+path);
byte[] content = IOUtilities.streamToBytes(is);
is.close();
return new String(content);
}
}
Yes!!! I found a way to solve my problem.
I don't know why my previous code didn't work but this one works...
The only thing I've changed is that I've added the throws IOException instead of surrounding it with a try - catch block...
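As a possible explanation for the questions above: in standard Java, getResourceAsStream returns null when the resource cannot be found, so passing the result on without a check is a likely source of a NullPointerException. A stream only delivers raw bytes, which is why the bytes are collected first and then decoded into a String. The sketch below uses plain Java SE rather than the BlackBerry API (the copy loop stands in for IOUtilities.streamToBytes); the class name ResourceText and its methods are hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ResourceText {

    // Copies the whole stream into memory, then decodes the bytes as text.
    // A null stream (resource not found) is reported as an IOException.
    public static String streamToString(InputStream is) throws IOException {
        if (is == null) {
            throw new IOException("resource not found (stream is null)");
        }
        // try-with-resources closes the stream even if reading fails.
        try (InputStream in = is) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: the file content is UTF-8 encoded.
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Looks the name up at the root of the classpath, as in the question.
    public static String readResource(String name) throws IOException {
        return streamToString(ResourceText.class.getResourceAsStream("/" + name));
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for a real resource file here.
        InputStream demo = new ByteArrayInputStream("Hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(streamToString(demo));
    }
}
```

With this shape, a missing file surfaces as an IOException naming the problem instead of a NullPointerException further down the call chain.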

Formatting text using Apache POI 3.8 (HWPF)

I am trying to insert the following text in the document using Apache POI 3.8:
[Bold][Normal],
but the output document has this:
[Bold][Normal]
The code:
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.*;
import java.io.*;
public class Main {
public static void main(String[] args) throws IOException {
final HWPFDocument doc = new HWPFDocument(new FileInputStream("empty.dot"));
final Range range = doc.getRange();
final CharacterRun cr1 = range.insertAfter("[Bold]");
cr1.setBold(true);
final CharacterRun cr2 = cr1.insertAfter("[Normal]");
cr2.setBold(false);
doc.write(new FileOutputStream("output.doc"));
}
}
What is the correct way of doing this?
I do it like this, using POI 3.11. Note that this uses XWPF (the .docx API) rather than HWPF:
paragraph = doc.createParagraph();
paragraph.setStyle(DOG_HEAD_STYLE);
XWPFRun tmpRun = paragraph.createRun();
tmpRun.setText("non bold text ");
tmpRun = paragraph.createRun();
tmpRun.setBold(true);
tmpRun.setText("bold text");
tmpRun = paragraph.createRun();
tmpRun.setBold(false);
tmpRun.setText(" non bold text again");
