Markov-Text generating

I've been looking into generating text.
What I've learned so far is that I will have to use word-level Markov text generation. I've found a few examples of it on this site (here).
Knowing this probably wouldn't work, I tried it anyway and copied it into Processing, which gave errors about not finding the correct libraries.
Has anyone out there done this, or can anyone point me in a good direction to find out more about text generation with Processing? Or is there even somebody who wants to do a collab, this being open source and whatnot?
What I want isn't much different from the example on the site, except that the chain should be word-based instead of letter-based and the database should be built from words I put in. That last part could later be switched to another source, which I'm still brainstorming about; it could really be anything that contains words. If you have any ideas, please feel free to contribute.
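To make the word-based idea concrete, here is a minimal sketch in plain Java (not Processing). The class name `WordMarkovSketch`, its methods, and the sample sentences are all made up for illustration; it assumes words are separated by whitespace and that each input line is one starting point for the chain.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper class, not taken from any linked example.
class WordMarkovSketch {
    // Maps each word to every word that followed it in the input.
    private final Map<String, List<String>> followers = new HashMap<>();
    // Words that opened an input line; used as starting points.
    private final List<String> starts = new ArrayList<>();
    private final Random rng;

    WordMarkovSketch(long seed) {
        rng = new Random(seed);
    }

    // Record the word-to-next-word transitions of one line of text.
    void addLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return; // nothing to learn from a blank line
        }
        String[] words = trimmed.split("\\s+");
        starts.add(words[0]);
        for (int i = 0; i + 1 < words.length; i++) {
            followers.computeIfAbsent(words[i], k -> new ArrayList<>()).add(words[i + 1]);
        }
    }

    // Walk the chain from a random start word, for at most maxWords words.
    String generate(int maxWords) {
        if (starts.isEmpty() || maxWords <= 0) {
            return "";
        }
        String word = starts.get(rng.nextInt(starts.size()));
        StringBuilder out = new StringBuilder(word);
        for (int i = 1; i < maxWords; i++) {
            List<String> options = followers.get(word);
            if (options == null) {
                break; // dead end: this word was never followed by anything
            }
            word = options.get(rng.nextInt(options.size()));
            out.append(' ').append(word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        WordMarkovSketch chain = new WordMarkovSketch(42L);
        chain.addLine("the cat sat on the mat");
        chain.addLine("the dog sat on the rug");
        System.out.println(chain.generate(8));
    }
}
```

Because a word that occurs twice is stored twice in its predecessor's list, picking a random list entry automatically follows the frequencies in the source text.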
I'll edit this post when I learn more from other forums, so that when there's a solution I can pass it on to others.
EDIT: SOLUTION (CLICK-BASED GENERATING)
// required imports for Processing
import java.util.Hashtable;
import java.util.Vector;

String inputFile = "Sonnet51.txt";
Markov markovChain1;
String sentence = "";

void setup() {
  size(900, 500);
  background(0);
  markovChain1 = new Markov();
  // load text and feed it to the chain line by line
  String[] input = loadStrings(inputFile);
  for (String line : input) {
    markovChain1.addWords(line);
    println(line);
  }
  // generate a sentence!
  sentence = markovChain1.generateSentence();
  println("-------------");
}

void draw() {
  background(0);
  // noLoop();
  fill(255);
  text(sentence, 19, 190);
  fill(2, 255, 2);
  text("Please press mouse", 19, height-33);
}

void mousePressed() {
  // generate a new sentence on every click
  sentence = markovChain1.generateSentence();
  println(sentence);
}

// ==========================================

class Markov {
  // maps each word to the list of words that followed it in the input
  Hashtable<String, Vector<String>> markovChain =
    new Hashtable<String, Vector<String>>();

  Markov() {
    markovChain.put("_start", new Vector<String>());
    markovChain.put("_end", new Vector<String>());
  }

  void addWords(String line) {
    line = line.trim();
    if (line.length() == 0) {
      return; // skip blank lines
    }
    String[] words = line.split("\\s+");
    for (int i=0; i<words.length; i++) {
      if (i == 0) {
        // the first word of a line is a possible sentence start
        markovChain.get("_start").add(words[i]);
      }
      if (i == words.length-1) {
        // the last word of a line has no successor
        markovChain.get("_end").add(words[i]);
      } else {
        Vector<String> suffix = markovChain.get(words[i]);
        if (suffix == null) {
          suffix = new Vector<String>();
          markovChain.put(words[i], suffix);
        }
        suffix.add(words[i+1]);
      }
    }
  }

  String generateSentence() {
    Vector<String> startWords = markovChain.get("_start");
    if (startWords.isEmpty()) {
      return ""; // no text loaded yet
    }
    String nextWord = startWords.get(int(random(startWords.size())));
    String newPhrase = nextWord;
    // keep following the chain until a word ends with a period
    while (nextWord.charAt(nextWord.length()-1) != '.') {
      Vector<String> wordSelection = markovChain.get(nextWord);
      if (wordSelection == null) {
        break; // dead end: this word never had a successor
      }
      // random(n) returns a float in [0, n), so every index can be picked
      nextWord = wordSelection.get(int(random(wordSelection.size())));
      newPhrase += " " + nextWord;
    }
    return newPhrase;
  }
} // class
Use the following text (saved as Sonnet51.txt) as input for the generator.
Thus can my love excuse the slow offence
Of my dull bearer when from thee I speed
From where thou art why should I haste me thence
Till I return of posting is no need
O! what excuse will my poor beast then find
When swift extremity can seem but slow
Then should I spur though mounted on the wind.
In winged speed no motion shall I know.
Then can no horse with my desire keep pace.
Therefore desire of perfectst love being made.
Shall neigh no dull flesh in his fiery race;
But love for love thus shall excuse my jade.
Since from thee going, he went wilful-slow
Towards thee Ill run, and give him leave to go.
It works completely, and now I can start adapting it to generate bigger texts. If anybody has ideas, let me know, but this case is solved for me.
Thanks to ChrisIr from Processing forum.

The RiTa library already does this if you want to take a look at it. Or just use it. http://rednoise.org/rita/

I think that code you're using is making things more complicated. This Java example is much clearer, and should work "out of the box" in Processing – just copy/paste!
Here's the Processing-ified version that should work, though I think it might need some tweaking.
// required imports for Processing
import java.util.Hashtable;
import java.util.Vector;

String inputFile = "Sonnet51.txt";
Markov markovChain;

void setup() {
  markovChain = new Markov();
  // load text
  String[] input = loadStrings(inputFile);
  for (String line : input) {
    markovChain.addWords(line);
  }
  // generate a sentence!
  String sentence = markovChain.generateSentence();
  println(sentence);
}

class Markov {
  // maps each word to the list of words that followed it in the input
  Hashtable<String, Vector<String>> markovChain = new Hashtable<String, Vector<String>>();

  Markov() {
    markovChain.put("_start", new Vector<String>());
    markovChain.put("_end", new Vector<String>());
  }

  void addWords(String line) {
    line = line.trim();
    if (line.length() == 0) {
      return; // skip blank lines
    }
    String[] words = line.split("\\s+");
    for (int i=0; i<words.length; i++) {
      if (i == 0) {
        // the first word of a line is a possible sentence start
        markovChain.get("_start").add(words[i]);
      }
      if (i == words.length-1) {
        // the last word of a line has no successor
        markovChain.get("_end").add(words[i]);
      } else {
        Vector<String> suffix = markovChain.get(words[i]);
        if (suffix == null) {
          suffix = new Vector<String>();
          markovChain.put(words[i], suffix);
        }
        suffix.add(words[i+1]);
      }
    }
  }

  String generateSentence() {
    Vector<String> startWords = markovChain.get("_start");
    if (startWords.isEmpty()) {
      return ""; // no text loaded yet
    }
    String nextWord = startWords.get(int(random(startWords.size())));
    String newPhrase = nextWord;
    // keep following the chain until a word ends with a period
    while (nextWord.charAt(nextWord.length()-1) != '.') {
      Vector<String> wordSelection = markovChain.get(nextWord);
      if (wordSelection == null) {
        break; // dead end: this word never had a successor
      }
      nextWord = wordSelection.get(int(random(wordSelection.size())));
      newPhrase += " " + nextWord;
    }
    return newPhrase;
  }
}

Related

I'm getting Resource Leaks

Whenever I run this code, it works pretty smoothly, until the while loop runs through once. It will go back and ask for the name again, and then skip String b = sc.nextLine();, and print the next line, instead.
    static Scanner sc = new Scanner(System.in);

    static public void main(String [] argv) {
        Name();
    }

    static public void Name() {
        boolean again = false;
        do
        {
            System.out.println("What is your name?");
            String b = sc.nextLine();
            System.out.println("Ah, so your name is " + b +"?\n" +
                    "(y//n)");
            int a = getYN();
            System.out.println(a + "! Good.");
            again = askQuestion();
        } while(again);
    }

    static public boolean askQuestion() {
        System.out.println("Do you want to try again?");
        int answer = sc.nextInt();
        if (answer == 1) {
            return true;
        }
        else {
            return false;
        }
    }

    static int getYN() {
        switch (sc.nextLine().substring(0, 1).toLowerCase()) {
            case "y":
                return 1;
            case "n":
                return 0;
            default:
                return 2;
        }
    }
}
Also, I'm trying to create this program in a way that I can ask three questions (like someone's Name, Gender, and Age, maybe more like race and whatnot), and then bring all of those answers back. Like, at the very end, say, "So, your name is + name +, you are + gender +, and you are + age + years old? Yes/No." Something along those lines. I know there's a way to do it, but I don't know how to save those responses anywhere, and I can't grab them since they only occur in the instance of the method.

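Don't read text with nextLine() AFTER using nextInt() on the same Scanner! nextInt() leaves the line terminator in the buffer, so the following nextLine() returns an empty string instead of waiting for input. A dedicated method for reading ints is recommended.
Alternatively, you could always read the answer with nextLine() and parse the String yourself.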
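A minimal sketch of the "read a line, then parse it" approach, assuming the answer is typed on its own line. The class `LineInputSketch` and its helper methods are made-up names, and a fixed string stands in for System.in so the example runs without typing anything.

```java
import java.util.Scanner;

// Hypothetical helper class for illustration.
class LineInputSketch {
    // Parse one line of input as an int; return fallback if it is not a number.
    static int parseIntOrDefault(String line, int fallback) {
        try {
            return Integer.parseInt(line.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Read a whole line and parse it, so no newline is left in the buffer.
    static int readInt(Scanner sc, int fallback) {
        return sc.hasNextLine() ? parseIntOrDefault(sc.nextLine(), fallback) : fallback;
    }

    public static void main(String[] args) {
        // A fixed input string stands in for System.in here.
        Scanner sc = new Scanner("1\nAlice\n");
        int again = readInt(sc, 0);
        String name = sc.nextLine(); // reads "Alice", not an empty string
        System.out.println(again + " " + name);
    }
}
```

Since every read consumes a complete line, the next nextLine() call always starts on fresh input.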
Also, using Scanner this way is not good practice. You could organize the questions in an array and read the answers in a loop with a single Scanner instance, like this:
import java.util.Scanner;

public class a {
    // static fields keep the answers available to every method
    private static String InputName;
    private static String Sex;
    private static String Age;
    private static String input;
    static Scanner sc;

    static public void main(String [] argv) {
        Name();
    }

    static public void Name() {
        sc = new Scanner(System.in);
        // questions are asked in this order; setVariable() stores each answer
        String[] questions = {"Name?","Age","Sex?"};
        int a = 0;
        System.out.println(questions[a]);
        while (sc.hasNext()) {
            input = sc.next();
            setVariable(a, input);
            if(input.equalsIgnoreCase("no")){
                sc.close();
                break;
            }
            else if(a>questions.length -1)
            {
                // all questions answered and the user wants another round
                a = 0;
            }
            else{
                a++;
            }
            if(a>questions.length -1){
                System.out.println("Fine " + InputName
                        + " so you are " + Age + " years old and " + Sex + "." );
                Age = null;
                Sex = null;
                InputName = null;
                System.out.println("Loop again?");
            }
            if(!input.equalsIgnoreCase("no") && a<questions.length){
                System.out.println(questions[a]);
            }
        }
    }

    // store the answer to question number a in the matching field
    static void setVariable(int a, String Field) {
        switch (a) {
            case 0:
                InputName = Field;
                return;
            case 1:
                Age = Field;
                return;
            case 2:
                Sex = Field;
                return;
        }
    }
}
Pay attention to the static fields, which store your info until you set them to null or empty; you can use them for the final confirmation.
Hope this helps!

How to use string tokenizer on an array?

I see that you can't use StringTokenizer on an array because you can't convert String() to String[]. After a while I realized that if the inputFromFile method reads the file line by line, I can tokenize it line by line. I just don't know how to do it so that it returns the tokenized version.
I'm assuming that at the line = in.readLine(); line I should put StringTokenizer token = new StringTokenizer(line, ","); but it doesn't seem to be working.
Any help? (I have to tokenize the commas).
public class Project1 {
    private static int inputFromFile(String filename, String[] wordArray) {
        TextFileInput in = new TextFileInput(filename);
        int lengthFilled = 0;
        String line = in.readLine();
        while (lengthFilled < wordArray.length && line != null) {
            wordArray[lengthFilled++] = line;
            line = in.readLine();
        }// while
        if (line != null) {
            System.out.println("File contains too many Strings.");
            System.out.println("This program can process only "
                    + wordArray.length + " Strings.");
            System.exit(1);
        } // if
        in.close();
        return lengthFilled;
    } // method inputFromFile

    public static void main(String[] args) {
        String[] numArray = new String[100];
        inputFromFile("input1.txt", numArray);
        for (int i = 0; i < numArray.length; i++) {
            if (numArray[i] == null) {
                break;
            }
            System.out.println(numArray[i]);
        }// for
        for (int i=0;i<numArray.length;i++)
        {
            Integer.parseInt(numArray[i]);
        }
    }// main
}// project1

This is what I meant:
while (lengthFilled < wordArray.length && line != null) {
    String[] tokens = line.split(",");
    if (tokens.length == 0) {
        // line consists only of commas: add the whole line as it is
        wordArray[lengthFilled++] = line;
    } else {
        // add each token to wordArray, stopping once the array is full
        for (int i = 0; i < tokens.length && lengthFilled < wordArray.length; i++) {
            wordArray[lengthFilled++] = tokens[i];
        }
    }
    line = in.readLine();
}// while
There can be other approaches as well. For instance, you can use a StringBuilder to read everything into one big string and then split it on your required tokens. The above logic is just to point you in the right direction.
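A rough sketch of the StringBuilder approach. It assumes a standard BufferedReader in place of the TextFileInput class from the question (whose API is not shown in full), and the class and method names (`SplitAllSketch`, `readTokens`) are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class for illustration.
class SplitAllSketch {
    // Join all non-blank lines with commas, then split the result once.
    static String[] readTokens(Reader source) {
        StringBuilder all = new StringBuilder();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                if (all.length() > 0) {
                    all.append(','); // a line break also separates tokens
                }
                all.append(trimmed);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (all.length() == 0) {
            return new String[0];
        }
        // split on commas, ignoring spaces around them
        return all.toString().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        // A StringReader stands in for the input file here.
        String[] tokens = readTokens(new StringReader("1,2,3\n4, 5\n"));
        System.out.println(Arrays.toString(tokens)); // prints [1, 2, 3, 4, 5]
    }
}
```

To read a real file, a `java.io.FileReader` could be passed in instead of the StringReader.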

J2ME read text file into String array

Could you please point out where is the bug in my code?
I have a simple text file with the following data structure:
something1
something2
something3
...
The result is a String[] in which every element is the last element of the file. I can't find the mistake, but it goes wrong somewhere around the line.setLength(0); call.
Any ideas?
public String[] readText() throws IOException {
    InputStream file = getClass().getResourceAsStream("/questions.txt");
    DataInputStream in = new DataInputStream(file);
    StringBuffer line = new StringBuffer();
    Vector lines = new Vector();
    int c;
    try {
        while( ( c = in.read()) != -1 ) {
            if ((char)c == '\n') {
                if (line.length() > 0) {
                    // debug
                    //System.out.println(line.toString());
                    lines.addElement(line);
                    line.setLength(0);
                }
            }
            else{
                line.append((char)c);
            }
        }
        if(line.length() > 0){
            lines.addElement(line);
            line.setLength(0);
        }
        String[] splitArray = new String[lines.size()];
        for (int i = 0; i < splitArray.length; i++) {
            splitArray[i] = lines.elementAt(i).toString();
        }
        return splitArray;
    } catch(Exception e) {
        System.out.println(e.getMessage());
        return null;
    } finally {
        in.close();
    }
}

I see one obvious error: you're storing the same StringBuffer instance multiple times in the Vector, and you clear that same StringBuffer instance with setLength(0). I'm guessing you want to do something like this:
StringBuffer s = new StringBuffer();
Vector v = new Vector();
...
String bufferContents = s.toString();
v.addElement(bufferContents);
s.setLength(0);
// now it's ok to reuse s
...
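To see the difference in isolation, here is a small self-contained comparison. It is written for desktop Java (it uses generics, which a J2ME compiler may not accept), and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Vector;

// Hypothetical demo class for illustration.
class BufferAliasSketch {
    // Buggy pattern: the Vector holds the same StringBuffer object every time.
    static String[] collectShared(String[] words) {
        StringBuffer buf = new StringBuffer();
        Vector<StringBuffer> v = new Vector<StringBuffer>();
        for (int i = 0; i < words.length; i++) {
            buf.append(words[i]);
            v.addElement(buf);  // stores a reference, not a copy
            buf.setLength(0);   // also empties what was just stored
        }
        return toStrings(v);
    }

    // Fixed pattern: snapshot the contents with toString() before clearing.
    static String[] collectCopies(String[] words) {
        StringBuffer buf = new StringBuffer();
        Vector<String> v = new Vector<String>();
        for (int i = 0; i < words.length; i++) {
            buf.append(words[i]);
            v.addElement(buf.toString()); // an independent, immutable String
            buf.setLength(0);             // safe: the String is unaffected
        }
        return toStrings(v);
    }

    private static String[] toStrings(Vector<?> v) {
        String[] out = new String[v.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = v.elementAt(i).toString();
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"a", "b"};
        System.out.println(Arrays.toString(collectShared(words))); // prints [, ]
        System.out.println(Arrays.toString(collectCopies(words))); // prints [a, b]
    }
}
```

In the shared version every element reflects whatever the single buffer contains at the end, which is why all entries come out identical.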

If your problem is to read the contents of the file into a String[], then you could use the FileUtils class from Apache Commons IO, read the lines into a list, and then convert that to an array.
List<String> fileContentsInList = FileUtils.readLines(new File("filename"));
String[] fileContentsInArray = new String[fileContentsInList.size()];
fileContentsInArray = (String[]) fileContentsInList.toArray(fileContentsInArray);
In the code that you have specified, rather than setting length to 0, you can reinitialize the StringBuffer.

Java ME Utility Functions

JavaME is quite sparse on features. Please list your favourite utility functions for making using it more like using proper Java, one per answer. Try to make your answers specific to Java ME.

Small Logging Framework
MicroLog
http://microlog.sourceforge.net/site/

Splitting a string
static public String[] split(String str, char c)
{
    int l = str.length();
    // count the separators to size the result array
    int count = 0;
    for (int i = 0; i < l; i++)
    {
        if (str.charAt(i) == c)
        {
            count++;
        }
    }
    int first = 0;
    int segment = 0;
    String[] array = new String[count + 1];
    for (int i = 0; i < l; i++)
    {
        if (str.charAt(i) == c)
        {
            array[segment++] = str.substring(first, i);
            first = i + 1; // skip the separator itself
        }
    }
    // whatever follows the last separator (possibly an empty string)
    array[segment] = str.substring(first);
    return array;
}

Read a line from a reader. See also this question.
import java.io.IOException;
import java.io.Reader;

public class LineReader{
    private Reader in;
    // one character of look-ahead; -1 means empty
    private int bucket=-1;

    public LineReader(Reader in){
        this.in=in;
    }

    public boolean hasLine() throws IOException{
        if(bucket!=-1)return true;
        bucket=in.read();
        return bucket!=-1;
    }

    //Read a line, removing any \r and \n. Buffers the string
    public String readLine() throws IOException{
        int tmp;
        StringBuffer out=new StringBuffer();
        //Read in data
        while(true){
            //Check the bucket first. If empty read from the input stream
            if(bucket!=-1){
                tmp=bucket;
                bucket=-1;
            }else{
                tmp=in.read();
                if(tmp==-1)break;
            }
            //If new line, then discard it. If we get a \r, we need to look ahead so can use bucket
            if(tmp=='\r'){
                int nextChar=in.read();
                if(nextChar!='\n')bucket=nextChar;//Ignores \r\n, but not \r\r
                break;
            }else if(tmp=='\n'){
                break;
            }else{
                //Otherwise just append the character
                out.append((char) tmp);
            }
        }
        return out.toString();
    }
}

Audible Audio (.aa) file spec?

Does anyone know of a good resource on the Audible Audio (.aa) file spec?
I'm trying to write a program that can use them. If no one knows of a resource, are there any tips on reverse engineering the spec myself? I opened a file in a hex editor and poked around; it looks like an MP3 but with a ton more header info.

This site provides some more info in regards to where certain chunks of data reside within the .aa file.
http://wiki.multimedia.cx/index.php?title=Audible_Audio

I have done some research into the Audible header to create a player for my car radio/computer. Basically there is a block of 3700 characters at the beginning of the file that encompasses a number of fields of interest, such as Title, Author, Narrator, etc. I have some limited parsing code in C# to display some of the basic info from the .aa file, as follows:
private void ParseFields(string fileName)
{
    string aaHeader;
    string tryDate;
    if (fileName == "") return;
    // read the first 3700 characters, which hold the header fields
    using (StreamReader sr = new StreamReader(fileName))
    {
        char[] buff = new char[3700];
        sr.Read(buff, 0, buff.Length);
        aaHeader = new string(buff);
    }
    try
    {
        _author = GetParsedItem(aaHeader, "author");
    }
    catch
    {
        _author = "?";
    }
    try
    {
        _title = GetParsedItem(aaHeader, "short_title");
    }
    catch
    {
        _title = "???";
    }
    try
    {
        _narrator = GetParsedItem(aaHeader, "narrator");
    }
    catch
    {
        _narrator = "?";
    }
    try
    {
        _description = GetParsedItem(aaHeader, "description");
    }
    catch
    {
        _description = "???";
    }
    try
    {
        _longDescription = GetParsedItem(aaHeader, "long_description");
    }
    catch
    {
        _longDescription = "";
    }
    try
    {
        tryDate = GetParsedItem(aaHeader, "pubdate");
        if (tryDate != "")
            _pubDate = Convert.ToDateTime(tryDate);
        else
            _pubDate = DateTime.Today;
    }
    catch
    {
        _pubDate = DateTime.Today;
    }
}

private string GetParsedItem(string buffer, string fieldName)
{
    if (buffer.Contains(fieldName))
    {
        // the value starts right after the field name and ends at the next NUL
        int pos = buffer.IndexOf(fieldName);
        pos += fieldName.Length;
        int posEnd = buffer.IndexOf('\0',pos);
        //if the value for the field is empty, skip it and look for another
        if (pos == posEnd)
        {
            pos = buffer.IndexOf(fieldName, posEnd);
            pos += fieldName.Length;
            posEnd = buffer.IndexOf('\0', pos);
        }
        return buffer.Substring(pos, posEnd - pos);
    }
    else
        return "(not found - " + fieldName + ")";
}
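For anyone not working in C#, the same lookup could be sketched in Java roughly as follows. This only mirrors the logic of GetParsedItem above (field name, then the value, then a NUL character); that layout is taken from the C# code, not from any spec, and the class name, method name, and sample header string are made up.

```java
// Hypothetical helper class; mirrors the C# lookup logic, not an official format description.
class AaHeaderFieldSketch {
    // Return the text between fieldName and the next NUL, or null if no value is found.
    static String fieldValue(String header, String fieldName) {
        int from = 0;
        while (true) {
            int pos = header.indexOf(fieldName, from);
            if (pos < 0) {
                return null; // field name does not occur (again)
            }
            int start = pos + fieldName.length();
            int end = header.indexOf('\0', start);
            if (end < 0) {
                return null; // no terminator after the field name
            }
            if (end > start) {
                return header.substring(start, end);
            }
            from = start; // empty value: look for a later occurrence
        }
    }

    public static void main(String[] args) {
        // Made-up header fragment: an empty "author" entry, then a filled one.
        String header = "author\0authorJane Doe\0";
        System.out.println(fieldValue(header, "author"));   // prints Jane Doe
        System.out.println(fieldValue(header, "narrator")); // prints null
    }
}
```

Returning null instead of throwing keeps the caller free of the try/catch blocks used in the C# version, at the cost of a null check.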

I think there is no spec. Have a look at Wikipedia/Audible.com:
quote:
[...]
Audible introduced one of the first digital audio players in 1997.
The following year it published a Web site from which audio files in its
proprietary .aa format could be downloaded. Audible holds a number of patents
in this area.
[...]
summary: proprietary/patents
