I need to zip the files in a directory using Groovy, but without using Ant.
I have tried two versions of code that I found online.
1) If I comment out the whole InputStream section, a zip file containing all the files is created, but every file in it has size 0.
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream

String zipFileName = ""
String inputDir = "c:/temp"
ZipOutputStream output = new ZipOutputStream(new FileOutputStream(zipFileName))
byte[] buf = new byte[1024]
new File(inputDir).eachFile() { file ->
    println file.toString()
    output.putNextEntry(new ZipEntry(file.getName())) // Create the name of the entry in the ZIP
    InputStream input = file.getInputStream() // Get the data stream to send to the ZIP
    // Stream the document data to the ZIP
    int len
    while ((len = input.read(buf)) > 0) {
        output.write(buf, 0, len)
    }
    output.closeEntry() // End of document in ZIP
}
output.close() // End of all documents - ZIP is complete
2) With this code, the files in the created zip file have the wrong size. The maximum size is 1024 bytes.
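import java.nio.channels.FileChannel
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream

String zipFileName = ""
String inputDir = "c:/temp"
ZipOutputStream zipFile = new ZipOutputStream(new FileOutputStream(zipFileName))
new File(inputDir).eachFile() { file ->
    zipFile.putNextEntry(new ZipEntry(file.getName()))
    def buffer = new byte[1024]
    file.withInputStream { i ->
        // a single read() call returns at most buffer.length (1024) bytes
        def l = i.read(buffer)
        // check whether the file is empty
        if (l > 0) {
            zipFile.write(buffer, 0, l)
        }
    }
    zipFile.closeEntry()
}
zipFile.close()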

I am not sure whether the way I obtain the InputStream is correct. I could create one using new FileInputStream(file).
An improved version of the first example; it uses Java 7's java.nio.file.Files.
import java.nio.file.Files
import java.util.zip.ZipEntry
import java.util.zip.ZipOutputStream

String zipFileName = "c:/"
String inputDir = "c:/temp"
ZipOutputStream output = new ZipOutputStream(new FileOutputStream(zipFileName))
new File(inputDir).eachFile() { file ->
    if (!file.isFile()) {
        return // skip subdirectories
    }
    println file.toString()
    output.putNextEntry(new ZipEntry(file.getName())) // Create the name of the entry in the ZIP
    // Stream the document data to the ZIP.
    // Files.copy(Path, OutputStream) copies the whole file; there is no
    // Files.copy(InputStream, OutputStream) overload.
    Files.copy(file.toPath(), output)
    output.closeEntry() // End of current document in ZIP
}
output.close() // End of all documents - ZIP is complete

based on your own coding, just omit this line.
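For comparison, here is a minimal sketch of the same idea in plain Java (java.util.zip, no Ant). It shows why the second version caps every file at 1024 bytes: read() fills the buffer at most once per call, so it has to be called in a loop until it returns -1. The class name ZipDirSketch and the method name zipDirectory are made up for this illustration, and the sketch assumes the zip file is written outside the directory being zipped.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ZipDirSketch {
    /** Zips the regular files directly inside inputDir into zipFile (no recursion). */
    static void zipDirectory(Path inputDir, Path zipFile) throws IOException {
        try (OutputStream fileOut = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(fileOut);
             DirectoryStream<Path> entries = Files.newDirectoryStream(inputDir)) {
            byte[] buffer = new byte[1024];
            for (Path file : entries) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories
                }
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                try (InputStream in = Files.newInputStream(file)) {
                    int len;
                    // read() returns at most buffer.length bytes per call,
                    // so keep reading until it signals end of stream with -1.
                    while ((len = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, len);
                    }
                }
                zip.closeEntry();
            }
        }
    }
}
```

Calling ZipDirSketch.zipDirectory(Paths.get("c:/temp"), someZipPath) would then write one entry per file; from Groovy the same class can be called unchanged.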


Upload BASE64 binary file to SharePoint document library

I have been looking for ways to upload BASE64 binary files for days and I am stuck.
First of all, I do not know how to convert a BASE64 binary file to an array buffer, a blob, and so on. Everything I find is about BASE64 strings, but I have a BASE64 binary file.
Do you have any solution?
You need to convert the Base64 string to a byte array. C# provides several ways to do this. Following the "Upload large files sample SharePoint Add-in" and "Convert.FromBase64String(String) Method" pages, both on Microsoft Docs, code that meets your requirements looks like this:
// This approach is useful for short files, less than 2 MB:
public void UploadFileContentFromBase64(ClientContext ctx, string libraryName, string fileName, string base64Str)
{
    Web web = ctx.Web;
    // Ensure that target library exists. Create if it is missing.
    // LibraryExists and CreateLibrary are helper methods assumed to be defined elsewhere.
    if (!LibraryExists(ctx, web, libraryName))
    {
        CreateLibrary(ctx, web, libraryName);
    }
    FileCreationInformation newFile = new FileCreationInformation();
    // The next line of code causes an exception to be thrown for files larger than 2 MB.
    newFile.Content = Convert.FromBase64String(base64Str);
    newFile.Url = fileName;
    // Get instances to the given library.
    List docs = web.Lists.GetByTitle(libraryName);
    // Add file to the library.
    Microsoft.SharePoint.Client.File uploadFile = docs.RootFolder.Files.Add(newFile);
    // Nothing is sent to the server until the query is executed.
    ctx.Load(uploadFile);
    ctx.ExecuteQuery();
}
// This other approach lets you upload large files, more than 2 MB:
public void UploadDocumentContentStreamFromBase64(ClientContext ctx, string libraryName, string fileName, string base64Str)
{
    Web web = ctx.Web;
    // Ensure that the target library exists. Create it if it is missing.
    if (!LibraryExists(ctx, web, libraryName))
    {
        CreateLibrary(ctx, web, libraryName);
    }
    byte[] fileContent = Convert.FromBase64String(base64Str);
    using (MemoryStream memStream = new MemoryStream(fileContent))
    {
        FileCreationInformation flciNewFile = new FileCreationInformation();
        // This is the key difference from the first case: using the ContentStream property
        flciNewFile.ContentStream = memStream;
        flciNewFile.Url = fileName;
        flciNewFile.Overwrite = true;
        List docs = web.Lists.GetByTitle(libraryName);
        Microsoft.SharePoint.Client.File uploadFile = docs.RootFolder.Files.Add(flciNewFile);
        // Execute while the stream is still open so its content can be sent.
        ctx.Load(uploadFile);
        ctx.ExecuteQuery();
    }
}
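If what you have is a file whose content is Base64 text (rather than a string already in memory), the decoding step is the same in any language: read the text, decode it to bytes, then hand the bytes to the upload call. The sketch below shows only that decoding step, and in Java rather than C#, so it is an illustration of the idea and not SharePoint code. The class name Base64FileSketch and the method name decodeBase64File are made up here. It uses the MIME decoder from java.util.Base64 because that decoder tolerates the line breaks such files often contain.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class Base64FileSketch {
    /** Reads a file containing Base64 text and returns the decoded bytes. */
    static byte[] decodeBase64File(Path base64File) throws IOException {
        // Base64 text is plain ASCII, so reading it as US-ASCII is sufficient.
        String text = new String(Files.readAllBytes(base64File), StandardCharsets.US_ASCII);
        // The MIME decoder ignores line separators inside the encoded text.
        return Base64.getMimeDecoder().decode(text);
    }
}
```

In the C# code above, the equivalent would be reading the file as text and passing the result to Convert.FromBase64String.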

Modifying the file contents of a zipfile entry

I would like to update the contents of a text file located inside a zip file.
I cannot find out how to do this, and the code below is not working properly.
Many thanks for any help!
import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

String zipFileFullPath = "C:/path/to/myzipfile/"
ZipFile zipFile = new ZipFile(zipFileFullPath)
ZipEntry entry = zipFile.getEntry("someFile.txt")
InputStream input = zipFile.getInputStream(entry)
BufferedReader br = new BufferedReader(new InputStreamReader(input, "UTF-8"))
String s = null
StringBuffer sb = new StringBuffer()
while ((s = br.readLine()) != null) {
    sb.append(s)
}
sb.append("adding some text..")
ZipOutputStream out = new ZipOutputStream(new FileOutputStream(zipFileFullPath))
out.putNextEntry(new ZipEntry("someFile.txt"))
int length
byte[] buffer = new byte[1024]
InputStream fin = new ByteArrayInputStream(sb.toString().getBytes("UTF8"))
while ((length = fin.read(buffer)) > 0) {
    out.write(buffer, 0, length)
}
Just some slight modifications to @Opal's answer. I've just:
used Groovy methods where possible
packaged it in a method
Groovy snippet:
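import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

void updateZipEntry(String zipFile, String zipEntry, String newContent) {
    def zin = new ZipFile(zipFile)
    def tmp = File.createTempFile("temp_${System.nanoTime()}", '.zip')
    tmp.withOutputStream { os ->
        def zos = new ZipOutputStream(os)
        zin.entries().each { entry ->
            def isReplaced = entry.name == zipEntry
            // a fresh ZipEntry avoids reusing the old entry's compressed-size metadata
            zos.putNextEntry(new ZipEntry(entry.name))
            zos << (isReplaced ? newContent.getBytes('UTF8') : zin.getInputStream(entry).bytes)
            zos.closeEntry()
        }
        zos.close()
    }
    zin.close()
    // replace the original archive with the rewritten one
    assert new File(zipFile).delete()
    assert tmp.renameTo(new File(zipFile))
}
updateZipEntry('/tmp/', 'META-INF/web.xml', '<foobar>new content!</foobar>')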
What exactly isn't working? Is there any exception thrown?
As far as I know, it's not possible to modify a zip file in situ. The following script rewrites the whole file and, when it reaches the desired entry, writes the modified content instead.
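import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

def zipIn = new File('')
def zip = new ZipFile(zipIn)
def zipTemp = File.createTempFile('out', '.zip')
def zos = new ZipOutputStream(new FileOutputStream(zipTemp))
def toModify = 'lol.txt'
for (e in zip.entries()) {
    if (e.name != toModify) {
        // copy every other entry unchanged
        zos.putNextEntry(new ZipEntry(e.name))
        zos << zip.getInputStream(e).bytes
    } else {
        // write the new content for the entry being modified
        zos.putNextEntry(new ZipEntry(toModify))
        zos << 'lollol\n'.bytes
    }
    zos.closeEntry()
}
zos.close()
zip.close()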
I wasn't right. It's possible to modify a zip file in situ, but your solution will omit the other files that were zipped. The output file will contain only a single file: the one you wanted to modify. I also suppose that your file was corrupted because close() was never invoked on out.
Below is your script, slightly modified (groovier):
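import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

def zipFileFullPath = ''
def zipFile = new ZipFile(zipFileFullPath)
def entry = zipFile.getEntry('lol.txt')
if (entry) {
    def input = zipFile.getInputStream(entry)
    def br = new BufferedReader(new InputStreamReader(input, 'UTF-8'))
    def sb = new StringBuffer()
    sb << br.text
    sb << 'adding some text..'
    // release the archive before overwriting it
    zipFile.close()
    def out = new ZipOutputStream(new FileOutputStream(zipFileFullPath))
    out.putNextEntry(new ZipEntry('lol.txt'))
    out << sb.toString().getBytes('UTF8')
    out.closeEntry()
    // without close() the resulting zip file is incomplete
    out.close()
}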
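The rewrite-to-a-temporary-file approach from the answers above can also be sketched in plain Java, which may be useful if the code has to run outside Groovy. This is only an illustration under a few assumptions: the names ZipEntryRewriteSketch and replaceEntry are made up, the new content is encoded as UTF-8, and per-entry metadata such as timestamps and comments is not carried over.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

final class ZipEntryRewriteSketch {
    /** Rewrites zipPath so that entryName holds newContent; other entries are copied. */
    static void replaceEntry(Path zipPath, String entryName, String newContent) throws IOException {
        Path tmp = Files.createTempFile("rewrite", ".zip");
        try (ZipFile zin = new ZipFile(zipPath.toFile());
             ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(tmp))) {
            Enumeration<? extends ZipEntry> entries = zin.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // A fresh ZipEntry avoids reusing the old compressed-size metadata.
                zos.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(entryName)) {
                    zos.write(newContent.getBytes(StandardCharsets.UTF_8));
                } else {
                    try (InputStream in = zin.getInputStream(entry)) {
                        in.transferTo(zos); // copy the entry's bytes unchanged
                    }
                }
                zos.closeEntry();
            }
        }
        // Both streams are closed here, so the original can be replaced safely.
        Files.move(tmp, zipPath, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Writing to a temporary file first means the original archive stays intact if anything fails halfway through.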

Replace and delete PDF File

I am using the following piece of code to delete the old PDF and replace it with the new one, but with no result. Is it possible to perform this operation on PDF files? Everywhere on the net I see these functions used for .txt, .xls, .doc and similar file types. Is there anything wrong with my code? Please help.
private void ListFieldNames(string s)
{
    string pdfTemplate = @"z:\TEMP\PDF\PassportApplicationForm_Main_English_V1.0.pdf";
    //var newFile = pdfTemplate;
    string newFile = @"z:\TEMP\PDF\_PassportApplicationForm_Main_English_V1.0.pdf";
    PdfReader pdfReader = new PdfReader(pdfTemplate);
    for (int page = 1; page <= pdfReader.NumberOfPages; page++)
    {
        //ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy();
        PdfReader reader = new PdfReader((string)pdfTemplate);
        //PdfStamper stamper = new PdfStamper(reader, new FileStream(newFile, FileMode.Create));
        using (PdfStamper stamper = new PdfStamper(reader, new FileStream(newFile, FileMode.Create)))
        {
            AcroFields form = stamper.AcroFields;
            var fieldKeys = form.Fields.Keys;
            foreach (string fieldKey in fieldKeys)
            {
                //Replace Address Form field with my custom data
                if (fieldKey.Contains("Surname"))
                {
                    form.SetField(fieldKey, s);
                }
            }
            // set form fields
            //form.SetField("Address", s);
            stamper.FormFlattening = true;
        }
    }
    File.Copy(newFile, pdfTemplate);
}
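Everything looks good to me, just change:
File.Copy(newFile, pdfTemplate);
to:
File.Copy(newFile, pdfTemplate, true);
The third argument tells File.Copy to overwrite the destination file if it already exists.
File.Copy(source, destination) throws an exception if a file with the same name already exists at the destination.
Delete the existing file first.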

How to zip and unzip folders and its sub folders in Silverlight?

I have a Windows Phone application. I am using SharpZipLib to zip folders and their subfolders. Only the folder itself is being zipped; the data inside the folders is not. Can anyone guide me on how to do this?
My code:
private void btnZip_Click(object sender, RoutedEventArgs e)
{
    using (IsolatedStorageFile appStore = IsolatedStorageFile.GetUserStoreForApplication())
    {
        foreach (string filename in appStore.GetFileNames(directoryName + "/" + "*.txt"))
        {
            // ...
        }
        textBlock2.Text = "Created file has Zipped Successfully";
    }
}
public byte[] GetCompressedByteArray(string content)
{
    byte[] compressedResult;
    using (MemoryStream zippedMemoryStream = new MemoryStream())
    {
        using (ZipOutputStream zipOutputStream = new ZipOutputStream(zippedMemoryStream))
        {
            byte[] buffer;
            using (MemoryStream file = new MemoryStream(Encoding.UTF8.GetBytes(content)))
            {
                buffer = new byte[file.Length];
                file.Read(buffer, 0, buffer.Length);
            }
            ZipEntry entry = new ZipEntry(content);
            // an entry must be started before data can be written to the stream
            zipOutputStream.PutNextEntry(entry);
            zipOutputStream.Write(buffer, 0, buffer.Length);
        }
        compressedResult = zippedMemoryStream.ToArray();
    }
    return compressedResult;
}
public void WriteToIsolatedStorage(byte[] compressedBytes)
{
    IsolatedStorageFile appStore = IsolatedStorageFile.GetUserStoreForApplication();
    using (IsolatedStorageFileStream zipTemplateStream = new IsolatedStorageFileStream(ZipFolder+"/"+directoryName + ".zip", FileMode.OpenOrCreate, appStore))
    using (BinaryWriter streamWriter = new BinaryWriter(zipTemplateStream))
    {
        streamWriter.Write(compressedBytes);
    }
}
I think you'll find this guide helpful.
An excerpt from the above link
The ZipFile object provides a method called AddDirectory() that
accepts a parameter directoryName. The problem with this method is
that it doesn't add the files inside the specified directory but
instead just creates a directory inside the zip file. To make this
work, you need to get the files inside that directory by looping thru
all objects in that directory and adding them one at a time. I was
able to accomplish this task by creating a recursive function that
drills through the whole directory structure of the folder you want to
zip. Below is a snippet of the function.
I guess you too are facing the same problem where the folder is added to the zip file, but the contents and sub folders are not zipped.
Hope this helps.
Have a look over here for a code sample on how to use SharpZipLib to zip a root folder including nested folders.
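To illustrate the recursive approach described in the excerpt, here is a minimal sketch of the same idea. Note that it uses Java's java.util.zip rather than SharpZipLib, so it shows the technique and not the SharpZipLib API; the class name RecursiveZipSketch and the method names zipTree and addDirectory are made up for this example. Each file is added under a name relative to the root folder, which is what recreates the sub folder structure inside the archive. Empty folders are not written by this sketch.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class RecursiveZipSketch {
    /** Zips every file below root (including sub folders) into zipFile. */
    static void zipTree(Path root, Path zipFile) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipFile))) {
            addDirectory(zos, root, root);
        }
    }

    /** Adds the files in dir to the zip, calling itself for each sub folder. */
    private static void addDirectory(ZipOutputStream zos, Path root, Path dir) throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(dir)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    addDirectory(zos, root, child); // drill into the sub folder
                } else if (Files.isRegularFile(child)) {
                    // Entry names are relative to root and use '/' as separator.
                    String name = root.relativize(child).toString()
                            .replace(File.separatorChar, '/');
                    zos.putNextEntry(new ZipEntry(name));
                    Files.copy(child, zos);
                    zos.closeEntry();
                }
            }
        }
    }
}
```

The zip file should be written outside the folder being zipped, otherwise the archive would try to include itself.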

How to read Nutch-generated content data in the segment folder using Java

I am trying to read the content data inside the segment folder. I think the content data file is written in a custom format.
I experimented with Nutch's Content class, but it does not recognize the format.
import java.io.IOException;
import org.apache.commons.cli.Options;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.nutch.protocol.Content;
import org.apache.nutch.util.NutchConfiguration;

public class ContentReader {
    public static void main(String[] args) throws IOException {
        // Setup the parser
        Configuration conf = NutchConfiguration.create();
        Options opts = new Options();
        GenericOptionsParser parser = new GenericOptionsParser(conf, opts, args);
        String[] remainingArgs = parser.getRemainingArgs();
        FileSystem fs = FileSystem.get(conf);
        // The first remaining argument is the path to the segment directory
        String segment = remainingArgs[0];
        Path file = new Path(segment, Content.DIR_NAME + "/part-00000/data");
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
        Text key = new Text();
        Content content = new Content();
        // Loop through sequence files
        while (reader.next(key, content)) {
            try {
                System.out.write(content.getContent(), 0, content.getContent().length);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        reader.close();
    }
}
has a map reduce implementation that reads content data in the segment directory.
Spark/Scala code to read data from the segment's content folder.
This is how I read from the content folder in my project.
I have created a case class Page, which holds the data read from the content folder:
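// metadata gets a default value so that Page(url) can be constructed with the URL alone
case class Page(var url: String, var title: String = null,
                var contentType: String = null, var rawHtml: String = null, var language: String = null,
                var metadata: Map[String, String] = Map.empty)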
Code to read from content folder
import org.apache.commons.lang3.StringUtils
import org.apache.hadoop.io.{Text, Writable}
import org.apache.nutch.crawl.{CrawlDatum, Inlinks}
import org.apache.nutch.parse.ParseText
import org.apache.nutch.protocol.Content
import scala.util.Try

// spark, path.contentLocation, LOG and charsetPattern are assumed to be defined elsewhere in the project.
val contentDF = spark.sparkContext.sequenceFile(path.contentLocation, classOf[Text], classOf[Writable])
  .map { case (x, y) => (x.toString, extract(y.asInstanceOf[Content])) }

/** Converts a Content object to a Page. */
def extract(content: Content): Page = {
  val parsed = Page(content.getUrl)
  try {
    var charset: String = getCharsetFromContentType(content.getContentType)
    if (StringUtils.isBlank(charset)) {
      charset = "UTF-8"
    }
    parsed.rawHtml = Try(new String(content.getContent, charset)).getOrElse(new String(content.getContent, "UTF-8"))
    parsed.contentType = Try(content.getMetadata.get("Content-Type")).getOrElse("text/html")
    // parsed.isHomePage = Boolean.valueOf(content.getMetadata.get("isHomePage"))
    parsed.metadata = content.getMetadata.names().map(name => (name, content.getMetadata.get(name))).toMap
    Try {
      if (StringUtils.isNotBlank(content.getMetadata.get("Content-Language")))
        parsed.language = content.getMetadata.get("Content-Language")
      else if (StringUtils.isNotBlank(content.getMetadata.get("language")))
        parsed.language = content.getMetadata.get("language")
      else parsed.language = content.getMetadata.get("lang")
    }
  } catch {
    case e: Exception =>
      LOG.error("ERROR while extracting data from Content ", e)
  }
  parsed
}

/** Gets the charset from an HTML Content-Type value. */
def getCharsetFromContentType(contentType: String): String = {
  var result: String = "UTF-8"
  Try {
    if (StringUtils.isNotBlank(contentType)) {
      val m = charsetPattern.matcher(contentType)
      // assumes charsetPattern captures the charset name in group 1
      result = if (m.find) m.group(1).trim.toUpperCase else "UTF-8"
    }
  }
  result
}
