How to use the CMU Sphinx Segmenter class


I am trying to use the Segmenter class in CMU Sphinx to get the times at which speech is recognized in a sound file. However, I can't get it to compile and run. Does something need to be configured before Segmenter can be used? Sorry, I am new to CMU Sphinx.
------------------------------------Here is the code for the segmenter------------------------------
package edu.cmu.sphinx.tools.endpoint;
import java.io.File;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Scanner;
import edu.cmu.sphinx.frontend.Data;
import edu.cmu.sphinx.frontend.FrontEnd;
import edu.cmu.sphinx.frontend.util.AudioFileDataSource;
import edu.cmu.sphinx.frontend.util.WavWriter;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import edu.cmu.sphinx.util.props.ConfigurationManagerUtils;
public class Segmenter {
public static void main(String[] argv) throws MalformedURLException,
IOException {
String configFile = null;
String inputFile = null;
String inputCtl = null;
String outputFile = null;
boolean noSplit = false;
for (int i = 0; i < argv.length; i++) {
if (argv[i].equals("-c")) {
configFile = argv[++i];
}
if (argv[i].equals("-i")) {
inputFile = argv[++i];
}
if (argv[i].equals("-ctl")) {
inputCtl = argv[++i];
}
if (argv[i].equals("-o")) {
outputFile = argv[++i];
}
if (argv[i].equals("-no-split")) {
noSplit = true; // Boolean.parseBoolean("-no-split") always returned false
}
}
if ((inputFile == null && inputCtl == null) || outputFile == null) {
System.out
.println("Usage: java -cp lib/batch.jar:lib/sphinx4.jar edu.cmu.sphinx.tools.endpoint.Segmenter "
+ "[ -config configFile ] -name frontendName "
+ "< -i input File -o outputFile | -ctl inputCtl -i inputFolder -o outputFolder >");
System.exit(1);
}
URL configURL;
if (configFile == null)
configURL = Segmenter.class.getResource("frontend.config.xml");
else
configURL = new File(configFile).toURI().toURL();
ConfigurationManager cm = new ConfigurationManager(configURL);
if (noSplit) {
ConfigurationManagerUtils.setProperty(cm, "wavWriter",
"captureUtterances", "false");
}
if (inputCtl != null) {
ConfigurationManagerUtils.setProperty(cm, "wavWriter",
"isCompletePath", "true");
}
if (inputCtl == null)
processFile(inputFile, outputFile, cm);
else
processCtl(inputCtl, inputFile, outputFile, cm);
}
static private void processFile(String inputFile, String outputFile,
ConfigurationManager cm) throws MalformedURLException, IOException {
FrontEnd frontend = (FrontEnd) cm.lookup("endpointer");
AudioFileDataSource dataSource = (AudioFileDataSource) cm
.lookup("audioFileDataSource");
System.out.println(inputFile);
dataSource.setAudioFile(new File(inputFile), null);
WavWriter wavWriter = (WavWriter) cm.lookup("wavWriter");
wavWriter.setOutFilePattern(outputFile);
frontend.initialize();
Data data = null;
do {
data = frontend.getData();
} while (data != null);
}
static private void processCtl(String inputCtl, String inputFolder,
String outputFolder, ConfigurationManager cm)
throws MalformedURLException, IOException {
Scanner scanner = new Scanner(new File(inputCtl));
while (scanner.hasNext()) {
String fileName = scanner.next();
String inputFile = inputFolder + "/" + fileName + ".wav";
String outputFile = outputFolder + "/" + fileName + ".wav";
processFile(inputFile, outputFile, cm);
}
scanner.close();
}
}
--------------------------and this is the error when I try to compile it---------------------------
ant -f C:\Users\Gerard\Documents\NetBeansProjects\sphinx4-5prealpha\nbproject\ide-file-targets.xml -Drun.class=edu.cmu.sphinx.tools.endpoint.Segmenter run-selected-file-in-sphinx4
run-selected-file-in-sphinx4:
Usage: java -cp lib/batch.jar:lib/sphinx4.jar edu.cmu.sphinx.tools.endpoint.Segmenter [ -config configFile ] -name frontendName < -i input File -o outputFile | -ctl inputCtl -i inputFolder -o outputFolder >
C:\Users\Gerard\Documents\NetBeansProjects\sphinx4-5prealpha\nbproject\ide-file-targets.xml:38: Java returned: 1
BUILD FAILED (total time: 0 seconds)
------------------------and this is ide-file-targets.xml line 38----------------------------------
<java classname="${run.class}" failonerror="true" fork="true">

In the common case, see the TranscriberDemo. If you need to time-align long audio, see the AlignerDemo.
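
Note that the Ant output in the question is not a compiler error. The class compiled and ran; its own main method printed the usage text and called System.exit(1) because no input and output arguments were passed, and Ant reports the non-zero exit code as a failed build. A minimal sketch of that check follows. The class name SegmenterArgs and its methods are hypothetical and not part of Sphinx; only the condition itself is taken from the code in the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Sphinx: mirrors the argument check in Segmenter.main.
final class SegmenterArgs {

    // Collects "-flag value" pairs; "-no-split" is treated as a bare switch without a value.
    static Map<String, String> parse(String[] argv) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i < argv.length; i++) {
            if (argv[i].equals("-no-split")) {
                opts.put("-no-split", "true");
            } else if (argv[i].startsWith("-") && i + 1 < argv.length) {
                opts.put(argv[i], argv[++i]);
            }
        }
        return opts;
    }

    // Same condition as in the question: an input (-i or -ctl) and an output (-o) are required.
    static boolean isUsable(Map<String, String> opts) {
        boolean hasInput = opts.containsKey("-i") || opts.containsKey("-ctl");
        return hasInput && opts.containsKey("-o");
    }

    public static void main(String[] argv) {
        if (!isUsable(parse(argv))) {
            System.out.println("Usage: ... -i inputFile -o outputFile");
            System.exit(1); // a forked <java failonerror="true"> task reports this as BUILD FAILED
        }
        System.out.println("arguments OK");
    }
}
```

With the NetBeans target from the question, the arguments would have to reach the forked JVM, for example as nested `<arg value="..."/>` elements inside the `<java>` task; where exactly to add them depends on the project's Ant file.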

How to read a compressed (gzip) file without extension in Spark

I am new to Spark and have a fun task in hand where I have to read a bunch of files from S3, which have some xml content in them.
These files are compressed (Gzip) but do not have that extension.
I read some questions on this here where people suggest extending the default codec in Spark and forcing a different extension.
But in my case, there is no extension and the files are named in some 16 digit UUID format such as 2c7358ca472ad91057da84adfba.

You can use newAPIHadoopFile (instead of textFile) with a custom/modified TextInputFormat which forces the use of the GzipCodec.
Instead of calling sparkContext.textFile,
// gzip compressed but no .gz extension:
sparkContext.textFile("s3://mybucket/uuid")
we can use the underlying sparkContext.newAPIHadoopFile which allows us to specify how to read the input:
import org.apache.hadoop.mapreduce.lib.input.GzipInputFormatWithoutExtention
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
sparkContext
.newAPIHadoopFile(
"s3://mybucket/uuid",
classOf[GzipInputFormatWithoutExtention], // This is our custom reader
classOf[LongWritable],
classOf[Text],
new Configuration(sparkContext.hadoopConfiguration)
)
.map { case (_, text) => text.toString }
The usual way of calling newAPIHadoopFile would be with TextInputFormat. This is the part which wraps how the file is read and where the compression codec is chosen based on the file extension.
Let's call it GzipInputFormatWithoutExtention and implement it as follows, as an extension of TextInputFormat (this is a Java file; put it in src/main/java/org/apache/hadoop/mapreduce/lib/input):
package org.apache.hadoop.mapreduce.lib.input;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import com.google.common.base.Charsets;
public class GzipInputFormatWithoutExtention extends TextInputFormat {
public RecordReader<LongWritable, Text> createRecordReader(
InputSplit split,
TaskAttemptContext context
) {
String delimiter =
context.getConfiguration().get("textinputformat.record.delimiter");
byte[] recordDelimiterBytes = null;
if (null != delimiter)
recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
// Here we use our custom `GzipWithoutExtentionLineRecordReader`
// instead of `LineRecordReader`:
return new GzipWithoutExtentionLineRecordReader(recordDelimiterBytes);
}
@Override
protected boolean isSplitable(JobContext context, Path file) {
return false; // gzip isn't a splittable codec (as opposed to bzip2)
}
}
In fact we have to go one level deeper and also replace the default LineRecordReader (Java) with our own (let's call it GzipWithoutExtentionLineRecordReader).
As it's quite difficult to inherit from LineRecordReader, we can copy LineRecordReader (in src/main/java/org/apache/hadoop/mapreduce/lib/input) and slightly modify (and simplify) the initialize(InputSplit genericSplit, TaskAttemptContext context) method by forcing the usage of the Gzip codec:
(the only changes compared to the original LineRecordReader have been given a comment explaining what's happening)
package org.apache.hadoop.mapreduce.lib.input;
import java.io.IOException;
import org.apache.hadoop.io.compress.*;
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Seekable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@InterfaceAudience.LimitedPrivate({"MapReduce", "Pig"})
@InterfaceStability.Evolving
public class GzipWithoutExtentionLineRecordReader extends RecordReader<LongWritable, Text> {
private static final Logger LOG =
LoggerFactory.getLogger(GzipWithoutExtentionLineRecordReader.class);
public static final String MAX_LINE_LENGTH =
"mapreduce.input.linerecordreader.line.maxlength";
private long start;
private long pos;
private long end;
private SplitLineReader in;
private FSDataInputStream fileIn;
private Seekable filePosition;
private int maxLineLength;
private LongWritable key;
private Text value;
private boolean isCompressedInput;
private Decompressor decompressor;
private byte[] recordDelimiterBytes;
public GzipWithoutExtentionLineRecordReader(byte[] recordDelimiter) {
this.recordDelimiterBytes = recordDelimiter;
}
public void initialize(
InputSplit genericSplit,
TaskAttemptContext context
) throws IOException {
FileSplit split = (FileSplit) genericSplit;
Configuration job = context.getConfiguration();
this.maxLineLength = job.getInt(MAX_LINE_LENGTH, Integer.MAX_VALUE);
start = split.getStart();
end = start + split.getLength();
final Path file = split.getPath();
// open the file and seek to the start of the split
final FileSystem fs = file.getFileSystem(job);
fileIn = fs.open(file);
// This line is modified to force the use of the GzipCodec:
// CompressionCodec codec = new CompressionCodecFactory(job).getCodec(file);
CompressionCodecFactory ccf = new CompressionCodecFactory(job);
CompressionCodec codec = ccf.getCodecByClassName(GzipCodec.class.getName());
// This part has been extremely simplified as we don't have to handle
// all the different codecs:
isCompressedInput = true;
decompressor = CodecPool.getDecompressor(codec);
if (start != 0) {
throw new IOException(
"Cannot seek in " + codec.getClass().getSimpleName() + " compressed stream"
);
}
in = new SplitLineReader(
codec.createInputStream(fileIn, decompressor), job, this.recordDelimiterBytes
);
filePosition = fileIn;
if (start != 0) {
start += in.readLine(new Text(), 0, maxBytesToConsume(start));
}
this.pos = start;
}
private int maxBytesToConsume(long pos) {
return isCompressedInput
? Integer.MAX_VALUE
: (int) Math.max(Math.min(Integer.MAX_VALUE, end - pos), maxLineLength);
}
private long getFilePosition() throws IOException {
long retVal;
if (isCompressedInput && null != filePosition) {
retVal = filePosition.getPos();
} else {
retVal = pos;
}
return retVal;
}
private int skipUtfByteOrderMark() throws IOException {
int newMaxLineLength = (int) Math.min(3L + (long) maxLineLength,
Integer.MAX_VALUE);
int newSize = in.readLine(value, newMaxLineLength, maxBytesToConsume(pos));
pos += newSize;
int textLength = value.getLength();
byte[] textBytes = value.getBytes();
if ((textLength >= 3) && (textBytes[0] == (byte)0xEF) &&
(textBytes[1] == (byte)0xBB) && (textBytes[2] == (byte)0xBF)) {
LOG.info("Found UTF-8 BOM and skipped it");
textLength -= 3;
newSize -= 3;
if (textLength > 0) {
textBytes = value.copyBytes();
value.set(textBytes, 3, textLength);
} else {
value.clear();
}
}
return newSize;
}
public boolean nextKeyValue() throws IOException {
if (key == null) {
key = new LongWritable();
}
key.set(pos);
if (value == null) {
value = new Text();
}
int newSize = 0;
while (getFilePosition() <= end || in.needAdditionalRecordAfterSplit()) {
if (pos == 0) {
newSize = skipUtfByteOrderMark();
} else {
newSize = in.readLine(value, maxLineLength, maxBytesToConsume(pos));
pos += newSize;
}
if ((newSize == 0) || (newSize < maxLineLength)) {
break;
}
LOG.info("Skipped line of size " + newSize + " at pos " +
(pos - newSize));
}
if (newSize == 0) {
key = null;
value = null;
return false;
} else {
return true;
}
}
@Override
public LongWritable getCurrentKey() {
return key;
}
@Override
public Text getCurrentValue() {
return value;
}
public float getProgress() throws IOException {
if (start == end) {
return 0.0f;
} else {
return Math.min(1.0f, (getFilePosition() - start) / (float)(end - start));
}
}
public synchronized void close() throws IOException {
try {
if (in != null) {
in.close();
}
} finally {
if (decompressor != null) {
CodecPool.returnDecompressor(decompressor);
decompressor = null;
}
}
}
}
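
What "forcing the codec" amounts to can be seen without Hadoop or Spark. The following is only a plain-Java sketch with hypothetical names (ForcedGzipDemo and its methods are not part of the answer): it wraps the raw bytes in a GZIPInputStream unconditionally instead of deciding from a file name, which is the same idea the custom record reader applies through GzipCodec.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not part of the Spark/Hadoop solution above.
final class ForcedGzipDemo {

    // Produces gzip-compressed bytes, standing in for a file without a .gz extension.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Always decompresses with gzip; no file name or extension is consulted.
    static List<String> readLinesForcingGzip(InputStream raw) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(raw), StandardCharsets.UTF_8))) {
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("<a/>\n<b/>\n");
        System.out.println(readLinesForcingGzip(new ByteArrayInputStream(compressed)));
    }
}
```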

Java read bytes from Socket on Linux

I'm trying to send a file from my Windows machine to my Raspberry Pi 2, and I have a client and a server. The client should be able to send a zip file over the network to the server on my Linux machine. I know my client and server work on Windows: when I run both on Windows and connect using 127.0.0.1, it works perfectly. But when sending to my Pi, nothing gets sent over the socket. Any suggestions?
Server:
package zipsd;
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.net.ServerSocket;
import java.net.Socket;
public class Main {
public static void main(String[] args) throws Exception {
if (args.length < 3)
System.out.println("Usage: zipsd <port> <directory> <password>");
else {
int port = Integer.parseInt(args[0]);
String directory = args[1];
String password = args[2];
System.out.println("zipsd: starting server on port " + port);
System.out.println("zipsd: directory = " + directory);
ServerSocket ss = new ServerSocket(port);
System.out.println("zipsd: listening...");
while (true) {
try {
Socket client = ss.accept();
System.out
.println("zipsd: from " + client.getInetAddress());
InputStream input = client.getInputStream();
BufferedReader in = new BufferedReader(
new InputStreamReader(input));
PrintStream out = new PrintStream(client.getOutputStream());
String pwdAttempt = in.readLine();
if (pwdAttempt != null) {
if (!pwdAttempt.equals(password)) {
out.println("[SERVER] zipsd: invalid password");
} else {
out.println("[SERVER] zipsd: authenticated");
String zipName = in.readLine();
if (zipName != null) {
File zipFile = new File(directory + "/"
+ zipName);
try {
FileOutputStream fos = new FileOutputStream(
zipFile);
BufferedOutputStream bos = new BufferedOutputStream(
fos);
byte[] data = new byte[1024 * 1024 * 50];
int count;
while((count = input.read(data)) > 0)
bos.write(data, 0, count);
for(int i = 0; i < 200; i++) //to see if data gets sent, it just prints 0's :(
System.out.println(data[i]);
System.out.println("Got zip file " + zipName);
bos.flush();
fos.close();
bos.close();
out.close();
in.close();
client.close();
} catch (Exception e) {
out.println("[SERVER] zipsd: error in transfer.");
}
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
}
Client:
package zipsend;
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintStream;
import java.net.Socket;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
public class Main {
public static class ZipUtils {
public static void zipFolder(final File folder, final File zipFile)
throws IOException {
zipFolder(folder, new FileOutputStream(zipFile));
}
public static void zipFolder(final File folder,
final OutputStream outputStream) throws IOException {
try (ZipOutputStream zipOutputStream = new ZipOutputStream(
outputStream)) {
processFolder(folder, zipOutputStream, folder.getPath()
.length() + 1);
zipOutputStream.flush();
zipOutputStream.finish();
zipOutputStream.close();
}
}
private static void processFolder(final File folder,
final ZipOutputStream zipOutputStream, final int prefixLength)
throws IOException {
for (final File file : folder.listFiles()) {
if (file.isFile()) {
final ZipEntry zipEntry = new ZipEntry(file.getPath()
.substring(prefixLength));
zipOutputStream.putNextEntry(zipEntry);
try (FileInputStream inputStream = new FileInputStream(file)) {
byte[] buf = new byte[(int) file.length() + 1];
int read = 0;
while ((read = inputStream.read(buf)) != -1) {
zipOutputStream.write(buf, 0, read);
}
}
zipOutputStream.flush();
zipOutputStream.closeEntry();
} else if (file.isDirectory()) {
processFolder(file, zipOutputStream, prefixLength);
}
}
}
}
public static void main(String[] args) throws Exception {
if(args.length < 4)
System.out.println("Usage: zipsend <folder> <ip> <port> <password>");
else {
String toZip = args[0];
String ip = args[1];
int port = Integer.parseInt(args[2]);
String pwd = args[3];
File folderToZip = new File(toZip);
if(!folderToZip.exists()) {
System.out.println("[ERROR] invalid folder name");
System.exit(1);
}
System.out.print("[INFO] connecting... ");
Socket s = new Socket(ip, port);
System.out.println("OK.");
System.out.println("[INFO] authenticating... ");
BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
PrintStream out = new PrintStream(s.getOutputStream());
out.println(pwd);
System.out.println(in.readLine());
System.out.println();
System.out.print("[INFO] zipping " + toZip + "... ");
File zipFile = new File(System.getProperty("user.dir") + "\\" + folderToZip.getName() + ".zip");
ZipOutputStream zout = new ZipOutputStream(new FileOutputStream(zipFile));
ZipUtils.processFolder(folderToZip, zout, folderToZip.getPath().length() + 1);
zout.close();
System.out.println("OK.");
//Transfer file
out.println(zipFile.getName());
byte[] data = new byte[(int)zipFile.length()];
FileInputStream fis = new FileInputStream(zipFile);
BufferedInputStream bis = new BufferedInputStream(fis);
System.out.println("[INFO] sending zip file... ");
OutputStream os = s.getOutputStream();
int count;
while((count = bis.read(data)) > 0) {
os.write(data, 0, count);
}
os.flush();
os.close();
fis.close();
bis.close();
s.close();
System.out.println("[INFO] done. Sent " + Files.size(zipFile.toPath()) + " bytes.");
zipFile.delete();
}
}
}

InputStream input = client.getInputStream();
BufferedReader in = new BufferedReader(new InputStreamReader(input));
Your problem is here. You can't use multiple inputs on a socket when one or more of them is buffered. The buffered input stream/reader will read-ahead and 'steal' data from the other stream. You need to change your protocol so you can use the same stream for the life of the socket at both ends. For example, use DataInputStream, with readUTF() for the file name and read() for the data: at the sender, use DataOutputStream, with writeUTF() for the filename and write() for the data.
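
A minimal sketch of such a protocol is shown below. The class and method names are hypothetical, and the length prefix written with writeInt() is an assumption added here so the receiver knows where the data ends; the answer itself only names writeUTF()/readUTF() and write()/read(). In-memory streams stand in for the socket streams so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Illustrative names; only the use of DataInputStream/DataOutputStream comes from the answer.
final class ZipTransferProtocol {

    static final class Received {
        final String password;
        final String fileName;
        final byte[] content;

        Received(String password, String fileName, byte[] content) {
            this.password = password;
            this.fileName = fileName;
            this.content = content;
        }
    }

    // Sender side: strings and file bytes all go through one DataOutputStream.
    static void send(OutputStream socketOut, String password, String fileName, byte[] content) {
        try {
            DataOutputStream out = new DataOutputStream(socketOut);
            out.writeUTF(password);
            out.writeUTF(fileName);
            out.writeInt(content.length); // assumption: length prefix, not part of the answer
            out.write(content);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side: one DataInputStream for the life of the connection, no second reader.
    static Received receive(InputStream socketIn) {
        try {
            DataInputStream in = new DataInputStream(socketIn);
            String password = in.readUTF();
            String fileName = in.readUTF();
            byte[] content = new byte[in.readInt()];
            in.readFully(content); // blocks until exactly content.length bytes have arrived
            return new Received(password, fileName, content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In a real client/server these would be socket.getOutputStream() and socket.getInputStream().
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        send(wire, "secret", "folder.zip", new byte[] {0x50, 0x4b, 0x03, 0x04});
        Received r = receive(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(r.fileName + ": " + r.content.length + " bytes");
    }
}
```

Because nothing reads ahead of what the protocol asks for, no file bytes can be swallowed by a buffered reader, which is what happened in the server from the question.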

Using ANTLR in Java causes OOM

I'm trying to parse a big log file (30 MB) with ANTLR.
But it crashes with an OutOfMemoryError or becomes very slow as the parser runs.
As far as I know,
1. The lexer scans the text and yields tokens
2. The parser consumes tokens according to the given rules
Tokens that have already been consumed should be collected by the GC, but it seems they are not.
Can you tell me what the problem is?
(grammar or code)
The minimized grammar and code are below.
LogParser.g
grammar LogParser;
options {
language = Java;
}
rule returns [Line result]
:
stamp WS text NL
{
result = new Line();
result.setStamp(Integer.parseInt($stamp.text));
result.setText($text.text + $NL.text);
}
;
stamp
:
DIGIT+
;
text
:
CHAR+
;
DIGIT
:
'0'..'9'
;
CHAR
:
'A'..'Z'
;
WS
:
' '
;
NL
:
'\r'? '\n'
;
Test.java
import java.io.IOException;
import org.antlr.runtime.ANTLRFileStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;
public class Test {
public static void main(String[] args) {
try {
CharStream input = new ANTLRFileStream("aaa.txt");
LogParserLexer lexer = new LogParserLexer(input);
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
LogParserParser parser = new LogParserParser(tokenStream);
int count = 0;
while (true) {
count++;
parser.rule();
parser.setBacktrackingLevel(0);
if (0 == count % 1000)
System.out.println(count);
}
} catch (IOException e) {
e.printStackTrace();
} catch (RecognitionException e) {
e.printStackTrace();
}
}
}
Line.java
public class Line {
private Integer stamp;
private String text;
public Integer getStamp() {
return stamp;
}
public void setStamp(Integer stamp) {
this.stamp = stamp;
}
public String getText() {
return text;
}
public void setText(String text) {
this.text = text;
}
@Override
public String toString() {
return String.format("%010d %s", stamp, text);
}
}
aaa.txt, randomly generated contents. Its size is about 30 MB.
0925489881 BIWRSAZLQTOGJUAVTRWV
0182726517 WWVNRKGGXPKPYBDIVUII
1188747525 NZONXSYIWHMMOLTVPKVC
1605284429 RRLYHBBQKLFDLTRHWCTK
1842597100 UFQNIADNPHQYTEEJDKQN
0338698771 PLFZMKAGLGWTHZXNNZEU
1971850686 TDGYOCGOMNZUFNGOXLPM
1686341878 NTYUXJSVQYXTBZAFLJJD
0849759139 YRXZSVWSZDBJPSNSWNJH
:
:
:
Sample generator
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Random;
public class EntryPoint {
/**
* @param args
*/
public static void main(String[] args) {
FileWriter fw = null;
try {
int size = 20;
String formatLength = Integer.toString(Integer.MAX_VALUE);
String pattern = "%0" + formatLength.length() + "d ";
Random random = new Random();
File file = new File("aaa.txt");
fw = new FileWriter(file);
while (true) {
int nextInt = random.nextInt(Integer.MAX_VALUE);
StringBuilder sb = new StringBuilder();
sb.append(String.format(pattern, nextInt));
for (int i = 0; i < size; i++) {
sb.append((char) ('A' + random.nextInt(26)));
}
fw.append(sb);
fw.append(System.getProperty("line.separator"));
if (file.length() > 30000000)
break;
}
fw.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Result with JavaSE-1.6 (jre6), Windows 7 64-bit, VM argument "-Xmx256M"
85000
86000
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.antlr.runtime.Lexer.emit(Lexer.java:160)
at org.antlr.runtime.Lexer.nextToken(Lexer.java:91)
at org.antlr.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:133)
at org.antlr.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:127)
at org.antlr.runtime.CommonTokenStream.consume(CommonTokenStream.java:67)
at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:106)
at LogParserParser.text(LogParserParser.java:190)
at LogParserParser.rule(LogParserParser.java:65)
at Test.main(Test.java:21)

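I believe UnbufferedTokenStream is what you want. The stack trace shows CommonTokenStream going through BufferedTokenStream.fetch, and that buffered stream keeps every token it has read, so consumed tokens are never released to the GC. You might need to unbuffer the char stream too, since ANTLRFileStream loads the whole file into memory.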
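
If the real log format is as simple as the minimized grammar (digits, a space, uppercase letters, a newline), a different technique avoids token buffering altogether: reading the file one line at a time in plain Java. This is not the ANTLR approach suggested above, only a sketch of an alternative; the class StreamingLogReader and its methods are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical plain-Java alternative to the ANTLR parser; names are illustrative.
final class StreamingLogReader {

    // Same shape as the question's Line class, reduced to two fields.
    static final class Line {
        final int stamp;
        final String text;

        Line(int stamp, String text) {
            this.stamp = stamp;
            this.text = text;
        }
    }

    // Parses "<digits> <A-Z letters>", the format the minimized grammar accepts.
    static Line parseLine(String raw) {
        if (!raw.matches("[0-9]+ [A-Z]+")) {
            throw new IllegalArgumentException("Unexpected line: " + raw);
        }
        int space = raw.indexOf(' ');
        return new Line(Integer.parseInt(raw.substring(0, space)), raw.substring(space + 1));
    }

    // Reads one line at a time, so only the current line has to stay in memory.
    static long countLines(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            long count = 0;
            String raw;
            while ((raw = reader.readLine()) != null) {
                Line line = parseLine(raw); // process the line here, then let it go out of scope
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = "0925489881 BIWRSAZLQTOGJUAVTRWV\n0182726517 WWVNRKGGXPKPYBDIVUII\n";
        System.out.println(countLines(new StringReader(sample)));
    }
}
```

For the 30 MB file, the StringReader would be replaced by a FileReader on aaa.txt. Unlike the grammar action in the question, this sketch does not append the line terminator to the text.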

download XSD with all imports

I use the GML (3.1.1) XSDs in the XSD for my application. I want to download all GML 3.1.1 XSDs as, for example, a zip file. In other words: the base XSD is here, and I want to download it together with all of its imports as a zip file or something similar. Is there an application that supports that?
I've found this downloader, but it doesn't work for me (I think it does not support the relative paths in imports that occur in gml.xsd 3.1.1). Any ideas?

QTAssistant's XSR (I am associated with it) has an easy to use function that allows one to automatically import and refactor XSD content as local files from all sorts of sources. In the process it'll update schema location references, etc.
I've made a simple screen capture of the steps involved in achieving a task like this which should demonstrate its usability.

Based on mschwehl's solution, I made an improved class to do the fetching. It fits the question well. See https://github.com/mfalaize/schema-fetcher

You can achieve this using SOAP UI.
Follow these steps:
1. Create a project using the WSDL.
2. Choose your interface and open it in the interface viewer.
3. Navigate to the 'WSDL Content' tab.
4. Use the last icon under the 'WSDL Content' tab: 'Export the entire WSDL and included/imported files to a local directory'.
5. Select the folder where you want the XSDs to be exported to.
Note: SOAPUI will remove all relative paths and will save all XSDs to the same folder.

I have written a simple Java main class that does the job and rewrites the schema locations to relative URLs.
package dl;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class SchemaPersister {
private static final String EXPORT_FILESYSTEM_ROOT = "C:/export/xsd";
// some caching of the http-responses
private static Map<String,String> _httpContentCache = new HashMap<String,String>();
public static void main(String[] args) {
try {
new SchemaPersister().doIt();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private void doIt() throws Exception {
// // if you need an in-house proxy
// final String authUser = "xxxxx";
// final String authPassword = "xxxx";
//
// System.setProperty("http.proxyHost", "xxxxx");
// System.setProperty("http.proxyPort", "xxxx");
// System.setProperty("http.proxyUser", authUser);
// System.setProperty("http.proxyPassword", authPassword);
//
// Authenticator.setDefault(
// new Authenticator() {
// public PasswordAuthentication getPasswordAuthentication() {
// return new PasswordAuthentication(authUser, authPassword.toCharArray());
// }
// }
// );
//
Set <SchemaElement> allElements = new HashSet<SchemaElement>() ;
// URL url = new URL("file:/C:/xauslaender-nachrichten-administration.xsd");
URL url = new URL("http://www.osci.de/xauslaender141/xauslaender-nachrichten-bamf-abh.xsd");
allElements.add ( new SchemaElement(url));
for (SchemaElement e: allElements) {
System.out.println("processing " + e);
e.doAll();
}
System.out.println("done!");
}
class SchemaElement {
private URL _url;
private String _content;
public List <SchemaElement> _imports ;
public List <SchemaElement> _includes ;
public SchemaElement(URL url) {
this._url = url;
}
public void checkIncludesAndImportsRecursive() throws Exception {
InputStream in = new ByteArrayInputStream(downloadContent() .getBytes("UTF-8"));
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(in);
List<Node> includeNodeList = null;
List<Node> importNodeList = null;
includeNodeList = getXpathAttribute(doc,"/*[local-name()='schema']/*[local-name()='include']");
_includes = new ArrayList <SchemaElement> ();
for ( Node element: includeNodeList) {
Node sl = element.getAttributes().getNamedItem("schemaLocation");
if (sl == null) {
System.out.println(_url + " defines one include but no schemaLocation");
continue;
}
String asStringAttribute = sl.getNodeValue();
URL url = buildUrl(asStringAttribute,_url);
SchemaElement tmp = new SchemaElement(url);
tmp.setSchemaLocation(asStringAttribute);
tmp.checkIncludesAndImportsRecursive();
_includes.add(tmp);
}
importNodeList = getXpathAttribute(doc,"/*[local-name()='schema']/*[local-name()='import']");
_imports = new ArrayList <SchemaElement> ();
for ( Node element: importNodeList) {
Node sl = element.getAttributes().getNamedItem("schemaLocation");
if (sl == null) {
System.out.println(_url + " defines one import but no schemaLocation");
continue;
}
String asStringAttribute = sl.getNodeValue();
URL url = buildUrl(asStringAttribute,_url);
SchemaElement tmp = new SchemaElement(url);
tmp.setSchemaLocation(asStringAttribute);
tmp.checkIncludesAndImportsRecursive();
_imports.add(tmp);
}
in.close();
}
private String schemaLocation;
private void setSchemaLocation(String schemaLocation) {
this.schemaLocation = schemaLocation;
}
// http://stackoverflow.com/questions/10159186/how-to-get-parent-url-in-java
private URL buildUrl(String asStringAttribute, URL parent) throws Exception {
if (asStringAttribute.startsWith("http")) {
return new URL(asStringAttribute);
}
if (asStringAttribute.startsWith("file")) {
return new URL(asStringAttribute);
}
// relative URL
URI parentUri = parent.toURI().getPath().endsWith("/") ? parent.toURI().resolve("..") : parent.toURI().resolve(".");
return new URL(parentUri.toURL().toString() + asStringAttribute );
}
public void doAll() throws Exception {
System.out.println("READ ELEMENTS");
checkIncludesAndImportsRecursive();
System.out.println("PRINTING DEPENDENCIES");
printRecursive(0);
System.out.println("GENERATE OUTPUT");
patchAndPersistRecursive(0);
}
public void patchAndPersistRecursive(int level) throws Exception {
File f = new File(EXPORT_FILESYSTEM_ROOT + File.separator + this.getXDSName() );
System.out.println("FILENAME: " + f.getAbsolutePath());
if (_imports.size() > 0) {
for (int i = 0; i < level; i++) {
System.out.print(" ");
}
System.out.println("IMPORTS");
for (SchemaElement kid : _imports) {
kid.patchAndPersistRecursive(level+1);
}
}
if (_includes.size() > 0) {
for (int i = 0; i < level; i++) {
System.out.print(" ");
}
System.out.println("INCLUDES");
for (SchemaElement kid : _includes) {
kid.patchAndPersistRecursive(level+1);
}
}
String contentTemp = downloadContent();
for (SchemaElement i : _imports ) {
if (i.isHTTP()) {
contentTemp = contentTemp.replace(
"<xs:import schemaLocation=\"" + i.getSchemaLocation() ,
"<xs:import schemaLocation=\"" + i.getXDSName() );
}
}
for (SchemaElement i : _includes ) {
if (i.isHTTP()) {
contentTemp = contentTemp.replace(
"<xs:include schemaLocation=\"" + i.getSchemaLocation(),
"<xs:include schemaLocation=\"" + i.getXDSName() );
}
}
FileOutputStream fos = new FileOutputStream(f);
fos.write(contentTemp.getBytes("UTF-8"));
fos.close();
System.out.println("File written: " + f.getAbsolutePath() );
}
public void printRecursive(int level) {
for (int i = 0; i < level; i++) {
System.out.print(" ");
}
System.out.println(_url.toString());
if (this._imports.size() > 0) {
for (int i = 0; i < level; i++) {
System.out.print(" ");
}
System.out.println("IMPORTS");
for (SchemaElement kid : this._imports) {
kid.printRecursive(level+1);
}
}
if (this._includes.size() > 0) {
for (int i = 0; i < level; i++) {
System.out.print(" ");
}
System.out.println("INCLUDES");
for (SchemaElement kid : this._includes) {
kid.printRecursive(level+1);
}
}
}
String getSchemaLocation() {
return schemaLocation;
}
/**
* removes http:// and replaces / with _
* @return the local file name for this schema
*/
private String getXDSName() {
String tmp = schemaLocation;
// Root on local file system -- just grab the last part of it
if (tmp == null) {
tmp = _url.toString().replaceFirst(".*/([^/?]+).*", "$1");
}
if ( isHTTP() ) {
tmp = tmp.replace("http://", "");
tmp = tmp.replace("/", "_");
} else {
tmp = tmp.replace("/", "_");
tmp = tmp.replace("\\", "_");
}
return tmp;
}
private boolean isHTTP() {
return _url.getProtocol().startsWith("http");
}
private String downloadContent() throws Exception {
if (_content == null) {
System.out.println("reading content from " + _url.toString());
if (_httpContentCache.containsKey(_url.toString())) {
this._content = _httpContentCache.get(_url.toString());
System.out.println("Cache hit! " + _url.toString());
} else {
System.out.println("Download " + _url.toString());
Scanner scan = new Scanner(_url.openStream(), "UTF-8");
if (isHTTP()) {
this._content = scan.useDelimiter("\\A").next();
} else {
this._content = scan.useDelimiter("\\Z").next();
}
scan.close();
if (this._content != null) {
_httpContentCache.put(_url.toString(), this._content);
}
}
}
if (_content == null) {
throw new NullPointerException("Content of " + _url.toString() + " is null");
}
return _content;
}
private List<Node> getXpathAttribute(Document doc, String path) throws Exception {
List <Node> returnList = new ArrayList <Node> ();
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
{
XPathExpression expr = xpath.compile(path);
NodeList nodeList = (NodeList) expr.evaluate(doc, XPathConstants.NODESET );
for (int i = 0 ; i < nodeList.getLength(); i++) {
Node n = nodeList.item(i);
returnList.add(n);
}
}
return returnList;
}
@Override
public String toString() {
if (_url != null) {
return _url.toString();
}
return super.toString();
}
}
}
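
The name-flattening step in getXDSName() can be tried on its own. The following is a minimal sketch, not the poster's class: XsdNameSketch and flatten are hypothetical names, and the boolean parameter stands in for the isHTTP() check.

```java
public class XsdNameSketch {
    // Turns a schema location into a flat file name, following the same
    // replacements as getXDSName() above. The 'http' flag is an assumption
    // that replaces the original isHTTP() call on the URL.
    static String flatten(String schemaLocation, boolean http) {
        String tmp = schemaLocation;
        if (http) {
            tmp = tmp.replace("http://", "");
        } else {
            tmp = tmp.replace("\\", "_");
        }
        return tmp.replace("/", "_");
    }

    public static void main(String[] args) {
        // Remote schema: scheme removed, slashes flattened
        System.out.println(flatten("http://example.org/schemas/base.xsd", true));
        // Relative local schema: only the separators are flattened
        System.out.println(flatten("../common/types.xsd", false));
    }
}
```

One thing this makes visible: an https:// location is not matched by the "http://" replacement, so the scheme and its colon would remain in the resulting name. If the schemas are served over HTTPS, that case probably needs its own replacement.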

I created a Python tool that recursively downloads XSDs, including ones referenced through relative paths in import tags (e.g. <import schemaLocation="../../../../abc):
https://github.com/n-a-t-e/xsd_download
After downloading the schema, you can use xmllint to validate an XML document against it.
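
If you would rather stay in Java than call xmllint, the standard javax.xml.validation API can do the same check. This is a rough sketch of that alternative, not part of the tool above; XsdValidateSketch and isValid are hypothetical names, and the schema and documents are tiny inline samples.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XsdValidateSketch {
    // Returns true if 'xml' validates against 'xsd'; both are passed as text
    // here only to keep the sketch self-contained.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"note\" type=\"xs:string\"/>"
                + "</xs:schema>";
        System.out.println(isValid(xsd, "<note>hi</note>"));
        System.out.println(isValid(xsd, "<memo>hi</memo>"));
    }
}
```

For schemas downloaded to disk, passing new StreamSource(new File(...)) instead of a StringReader gives the parser a base location, which should let relative import and include paths resolve.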

I am using org.apache.xmlbeans.impl.tool.SchemaResourceManager from the XMLBeans project. This class is quick and easy to use.
For example:
SchemaResourceManager manager = new SchemaResourceManager(new File(dir));
manager.process(schemaUris, emptyArray(), false, true, true);
manager.writeCache();
This class has a main method that documents the different options available.

How to Switch to OTA Provisioning option in Wireless toolkit J2ME?

I am new to J2ME and have to fix a J2ME project.
Below is my login form. I don't know what to do with it, but it gives me the error "javax.microedition.io.ConnectionNotFoundException: TCP open". After searching on Google, I found a hint that the code has to be run with the "OTA Provisioning" option.
I don't know how to do that. I have WTK version 2.5.2 for CLDC. Can anybody advise me on this?
package model;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import javax.microedition.rms.RecordEnumeration;
import javax.microedition.rms.RecordStore;
import view.Dialogs;
public class Login {
public static RecordStore rs; // Record store
static final String REC_STORE = "db_Login"; // Name of record store
static RecordEnumeration re;
private String login,password;
private int c;
public static String s1,s2,s3;
public Login()
{
if(LoginSrv.st1.equals("Invalid") && LoginSrv.st2.equals("User"))
{
System.out.println("Im from Invalid User");
new Dialogs();
}
else
{
login = LoginSrv.st1;
password = LoginSrv.st2;
c= LoginSrv.it1;
saveRecord();
}
}
public Login(String log,String pas, String ctr)
{
s1 = log;
s2 = pas;
s3 = ctr;
}
public void saveRecord()
{
try
{
rs = RecordStore.openRecordStore(REC_STORE, true );
re = rs.enumerateRecords(null, null, false);
ByteArrayOutputStream baosdata = new ByteArrayOutputStream();
DataOutputStream daosdata = new DataOutputStream(baosdata);
daosdata.writeUTF(login);
daosdata.writeUTF(password);
daosdata.writeInt(c);
byte[] record = baosdata.toByteArray();
rs.addRecord(record, 0, record.length);
System.out.println("Login record added");
}
catch(Exception e)
{
System.out.println("Im from model Login create record"+e);
}
}
public static Login getRecord()
{
try
{
rs = RecordStore.openRecordStore(REC_STORE, true );
re = rs.enumerateRecords(null, null, false);
if(re.hasNextElement())
{
byte data[] = rs.getRecord(re.nextRecordId());
ByteArrayInputStream strmBytes = new ByteArrayInputStream(data);
DataInputStream strmDataType = new DataInputStream(strmBytes);
String log = strmDataType.readUTF();
String pass = strmDataType.readUTF();
// saveRecord() stores this field with writeInt, so it must be read with readInt
String counter = String.valueOf(strmDataType.readInt());
return(new Login(log,pass,counter));
}
return null;
}
catch(Exception e)
{
System.out.println("Im from model Login loadRecord"+e);
return null;
}
}
}
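
saveRecord() writes two UTF strings followed by an int, so any reader has to consume the fields in that same order and with matching types. A minimal sketch of that round trip with plain java.io streams, outside RMS; RecordRoundTrip, encode and decode are hypothetical names used only for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class RecordRoundTrip {
    // Serializes the three fields in the same order as saveRecord()
    static byte[] encode(String login, String password, int counter) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(login);
            out.writeUTF(password);
            out.writeInt(counter);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the fields back: readUTF, readUTF, then readInt to mirror the writes
    static String decode(byte[] record) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
            String login = in.readUTF();
            String password = in.readUTF();
            int counter = in.readInt();
            return login + "/" + password + "/" + counter;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(encode("alice", "secret", 3)));
    }
}
```

Reading the last field with readUTF instead would interpret the first two bytes of the int as a string length, which typically ends in an EOFException or garbage data.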
-----------------------LoginSrv code ---------------------------------------
/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/
package model;
import com.sun.lwuit.Dialog;
import controller.AppConstants;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
/**
*
* @author sandipp
*/
public class LoginSrv {
private ServCon srv;
public static String st1,st2,log,pas;
public static int it1;
public LoginSrv(String s1, String s2)
{
log = s1;
pas = s2;
it1=0;
try
{
ByteArrayOutputStream baosdata = new ByteArrayOutputStream();
DataOutputStream daosdata = new DataOutputStream(baosdata);
daosdata.writeUTF(s1);
daosdata.writeUTF(s2);
srv = new ServCon(new AppConstants().str1,null,baosdata.toByteArray(),false,false,null);
ByteArrayOutputStream obj = (ByteArrayOutputStream)srv.startTransfer();
byte[] record = obj.toByteArray();
ByteArrayInputStream instr = new ByteArrayInputStream(record);
DataInputStream indat = new DataInputStream(instr);
if(srv.getRc() == 200)
{
String su = indat.readUTF();
if(su.equals("successfull"))
{
st1 =indat.readUTF();
st2 =indat.readUTF();
}
else
{
Dialog.show("Error",su , null,Dialog.TYPE_INFO,null,5000);
}
}
else
{
Dialog.show("Error",srv.getRc() + " " + srv.getRm(), null,Dialog.TYPE_INFO,null,5000);
}
}
catch(Exception e)
{
System.out.println("Im from LoginSrv constructor:"+e);
}
}
}


how to use cmu sphinx segmenter class - cmusphinx

In the common case, see the TranscriberDemo. If you need to time-align long audio, see the AlignerDemo.
