I'm trying to write a parser in arcsecond, but it keeps cutting off the strings right before the commas.
I want to turn a custom data format into JSON, and the way it's currently structured, each line up until the first space is the key and everything after that is the value.
The problem is that my code stops reading the line when it hits a comma. I've checked it in a regex checker and it works fine there.
Relevant code:
const key = regex(/^[\t]?[A-Za-z]{1,20}[0-9]?/);
const alphaSpacesPunctuation = regex(/^[A-Za-z ,.:;?!'"]{3,300}/);
const value = choice([alphaUnderscore, alphaSpacesPunctuation, alphaOnly, pstring, integer, decimal, expo, triplet, quotes, textureName, regionName]);
const kvParser = choice([ sequenceOf([key, regex(/^[ ]{1,2}/), value]), empty, openBracket, closedBracket ]);
The line it's looking at is
About For over twenty years, she has lived in Canada.
It's separating "About" from the rest like it should, but stops right after "years". It does the same thing for a few other value strings, too. All the other punctuation marks get picked up as normal.
What might be causing this? Am I missing something super obvious?
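One thing worth ruling out, sketched here in Python rather than arcsecond: the regex on its own does match the whole line, so the truncation may come from the order of the alternatives passed to choice, which returns the result of the first parser that succeeds. The definition of alphaUnderscore is not shown in the question, so the `letters_and_spaces` pattern below is a hypothetical stand-in for an earlier alternative, and `first_match` is only an imitation of ordered choice, not arcsecond's API.

```python
import re

line_value = "For over twenty years, she has lived in Canada."

# The pattern from the question, unchanged apart from Python string quoting.
alpha_spaces_punctuation = re.compile(r"^[A-Za-z ,.:;?!'\"]{3,300}")

# Hypothetical stand-in for an earlier alternative such as alphaUnderscore,
# whose real definition is not shown in the question.
letters_and_spaces = re.compile(r"^[A-Za-z_ ]+")


def first_match(patterns, text):
    """Imitate ordered choice: return the match of the first pattern that succeeds."""
    for pattern in patterns:
        found = pattern.match(text)
        if found:
            return found.group(0)
    return None


full = alpha_spaces_punctuation.match(line_value).group(0)
truncated = first_match([letters_and_spaces, alpha_spaces_punctuation], line_value)
reordered = first_match([alpha_spaces_punctuation, letters_and_spaces], line_value)

print(full)       # the regex alone consumes the whole line
print(truncated)  # an earlier, narrower alternative stops before the comma
print(reordered)  # listing the broader pattern first consumes the whole line
```

If something like this is happening, moving the broader pattern ahead of the narrower ones in the choice list would be the thing to try.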
About twelve years ago, I wrote a small VB.NET application that loads strings from files. These strings may contain one or more of the following characters: à, è, é, ì, ò, ù, ä, ö. The application uses a special custom font (JazzText Extended) that does not have those special characters. Yet, I somehow managed to make the application display words correctly in that font, and twelve years later, I have no idea how - thanks for not leaving a line of comment, past me!
The program has the following routine:
Private Sub SetWord(ByVal word() As String)
Dim nword(3) As String
nword(0) = word(0)
nword(1) = word(1)
nword(2) = word(2)
For i As Integer = 0 To 2
nword(i) = nword(i).Replace("à", "")
nword(i) = nword(i).Replace("é", "")
nword(i) = nword(i).Replace("è", "")
nword(i) = nword(i).Replace("ì", "ê")
nword(i) = nword(i).Replace("ò", "")
nword(i) = nword(i).Replace("ù", "")
nword(i) = nword(i).Replace("ä", "")
nword(i) = nword(i).Replace("ö", "")
Next
lblItaWord.Text = nword(0).ToUpper
lblEngWord.Text = nword(1).ToUpper
lblFinWord.Text = nword(2).ToUpper
End Sub
What it does is take an array that contains three words and, for each of those three words, check whether it contains any of the special characters. If it does, it replaces them with... something, makes the words all caps, and then assigns each of them to one of three labels.
In Visual Studio, the replacement characters look like empty strings. I had to put the cursor in between the quotation marks to realise that it was in fact not an empty string and there was an invisible character there. Here on SO... I'm not sure what you'll see. You might see just a square, or some other weird character. (The ê character is an exception, it seems to display in the same way everywhere.)
If you copypaste any of the invisible/square characters to Google and search for it, you'll get a different representation that uses two characters—for example, the first one translates to ‡. Using this pair in place of the invisible/square character in the Replace method does not produce the correct result. FYI, the encoding I use to read the files (the default one used by IO.StreamReader if you don't specify any encoding) works fine: if I use a more standard font, all special characters display correctly without using the SetWord sub at all.
Now, I have absolutely no idea how those characters, whatever they may be, manage to make the app display the words correctly when the font I use does not have those characters. I have no idea how I found out about this trick, either. Right now, my problem is that I would like to replace those squares/invisible characters with something intelligible, and I have no idea how. Any ideas?
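A first step that might help is to make the invisible characters visible by listing their code points. The sketch below is in Python and only illustrates the idea; the real replacement characters cannot be recovered from the post, so U+0087 is a purely hypothetical stand-in, and `describe_characters` is an illustrative helper name. In VB.NET, `AscW` gives the same numeric value for a character.

```python
import unicodedata


def describe_characters(text):
    """Return (code point, Unicode name) pairs so invisible characters become visible."""
    described = []
    for ch in text:
        # Control characters have no Unicode name, so fall back to a label.
        name = unicodedata.name(ch, "<unnamed, likely a control character>")
        described.append((f"U+{ord(ch):04X}", name))
    return described


# Hypothetical sample: U+0087 stands in for "something invisible" between two letters.
sample = "a\u0087b"
for code_point, name in describe_characters(sample):
    print(code_point, name)
```

Once the code points are known, the string literals in the routine could be written with an explicit character code (for example via `ChrW`) instead of a pasted invisible character.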
I'm building a command parser and I've successfully managed to split strings into separate words and get it all working, but the one thing I'm a bit stumped at is how to remove all punctuation from the string. Users will input characters like , . ! ? often, but with those characters there, it doesn't recognize the word, so any punctuation will need to be removed.
So far I've tested this:
func process_command(_input: String) -> String:
    var words:Array = _input.replace("?", "").to_lower().split(" ", false)
It works fine and successfully removes question marks, but I want it to remove all punctuation. Hoping this will be a simple thing to solve! I'm new to Godot so still learning how a lot of the stuff works in it.
You could remove unwanted characters by putting them in an array and then doing what you are already doing:
var str_result = input
var unwanted_chars = [".",",",":","?" ] #and so on
for c in unwanted_chars:
    str_result = str_result.replace(c,"")
I am not sure what you want to achieve in the long run, but parsing strings can be easier with regular expressions. So if you want to search strings for specific patterns, you should look into this:
regex
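To give a feel for the regular-expression route, here is a minimal sketch in Python (not GDScript); Godot's own RegEx class would need its own syntax. The helper names `strip_punctuation` and `to_words` are made up for illustration, and the pattern assumes that everything other than letters, digits, underscores and whitespace should be dropped, which also removes apostrophes.

```python
import re


def strip_punctuation(text):
    """Remove every character that is not a letter, digit, underscore, or whitespace."""
    return re.sub(r"[^\w\s]", "", text)


def to_words(text):
    """Lowercase the cleaned text and split on whitespace, dropping empty pieces."""
    return strip_punctuation(text).lower().split()


print(to_words("Open the door, quickly!"))
```

A single pattern replaces the whole list of individual replace calls, at the cost of being a little harder to read.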
Given some input, which I'll just write here as example:
var input := "Hello, It's me!!"
We want to get a modified version where we have filtered the characters:
var output := ""
We need to know what we will filter. For example:
var deny_list := [",", "!"]
We could have a list of things we accept instead; you would just flip the conditional later on.
And then we can iterate over the string, for each character decide if we want to keep it, and if so add it to the output:
for position in input.length():
    var current_character := input[position]
    if not deny_list.has(current_character):
        output += current_character
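The accept-list variant mentioned above is the same loop with the conditional flipped. A minimal sketch in Python rather than GDScript, where `keep_allowed` and the contents of `allow_list` are illustrative assumptions (letters and spaces only):

```python
def keep_allowed(text, allow_list):
    """Build a new string containing only the characters found in allow_list."""
    output = ""
    for current_character in text:
        # Flipped conditional: keep the character only if it is explicitly allowed.
        if current_character in allow_list:
            output += current_character
    return output


# Assumed accept list: ASCII letters plus the space character.
allow_list = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ")

print(keep_allowed("Hello, It's me!!", allow_list))
```

An accept list tends to be safer for a command parser, since any character nobody thought of is dropped by default.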
So I have to go through a bunch of code to get some data from an iframe. The iframe has a lot of data, but in there is a property called '_name'. The first key of _name is 'extension_id' and its value is a big long string. The JSON object is enclosed in apostrophes. I have tried removing the apostrophes, but instead of 'extension_id_output' I still get a single curly bracket. The JSON object looks something like this:
Frame {
...
...
_name: '{"extension_id":"a big huge string that I need"} "a bunch of other stuff":"this is a valid json object as confirmed by jsonlint", "globalOptions":{"crev":"1.2.50"}}}'
}
It's a whole big ugly paragraph, but I really just need the extension_id. So this is the code I'm currently using after attempt 100 or whatever.
var frames = await page.frames();
// I'm using puppeteer for this part but I don't think that's relevant overall.
var thing = frames[1]._name;
console.log(frames[1])
// console.log(thing)
thing.replace(/'/g, '"')
// this is to remove the apostrophes from the outside of the object. I thought that would change things before. it does not. still outputs a single {
JSON.parse(thing)
console.log(thing[0])
Instead of getting "a big huge string that I need", or whatever is written in extension_id, I get a {. That's it. I think that is because the whole object starts with a curly bracket. This is confirmed to me because console.log(thing[2]) prints e. So what's going on? jsonlint says this is a valid JSON object, but maybe it's just a big string and I should be doing some kind of split to grab what's between the first : and the first ,. I'm really not sure.
For two reasons:
object[0] doesn't return the value of an object's "first property"; it returns the value of the property with the name "0", if any (there probably isn't one in your object); and
Because it's JSON, and when you're dealing with JSON in JavaScript code, you are by definition dealing with a string. (More here.) If you want to deal with the object that the JSON describes, parse it.
Here's an example of parsing it and getting the value of the extension_id property from it:
const parsed = JSON.parse(frames[1]._name);
console.log(parsed.extension_id); // The ID
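The difference between indexing the JSON text and indexing the parsed object can be seen in a few lines. This is a sketch in Python rather than JavaScript, and `raw` is a hypothetical, well-formed stand-in for the frame's _name value, not the real data:

```python
import json

# Hypothetical, well-formed stand-in for the frame's _name value.
raw = '{"extension_id":"abc123","globalOptions":{"crev":"1.2.50"}}'

# Indexing the text itself returns single characters.
print(raw[0])  # first character of the text
print(raw[2])  # third character of the text

# Parsing turns the text into a dictionary that can be looked up by key.
parsed = json.loads(raw)
print(parsed["extension_id"])
```

The same reasoning applies to the question's code: the result of the parse has to be kept in a variable and used, because the original string is not changed by parsing it.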
I am trying to make changes to the data in a CSV file:
This is the sample data:
DATE status code value value2
"2016-01-26","Subscription All","119432660","1315529431362550","0.0080099833517888"
"2016-01-26","Subscription All","119432664","5836995058433524","0.033825584764444"
"2016-01-26","Subscription All","119432664","8287300074499777","0.076913377834744"
"2016-01-26","Subscription All","119432664","14870697739968326","0.0074188355187426"
My code used to format the data:
CSVReader reader = new CSVReader(new FileReader(new File(fileToChange)), CSVParser.DEFAULT_SEPARATOR, CSVParser.NULL_CHARACTER, CSVParser.NULL_CHARACTER, 1)
info "Read all rows at once"
List<String[]> allRows = reader.readAll();
CSVWriter writer = new CSVWriter(new FileWriter(fileToChange), CSVWriter.DEFAULT_SEPARATOR, CSVWriter.NO_QUOTE_CHARACTER)
writer.writeAll(allRows)
writer.close()
The output I get is this, with extra quotes added instead of removed.
""2016-01-26"",""Subscription All"",""119432660"",""1315529431362550"",""0.0080099833517888""
""2016-01-26"",""Subscription All"",""119432664"",""5836995058433524"",""0.033825584764444""
""2016-01-26"",""Subscription All"",""119432664"",""8287300074499777"",""0.076913377834744""
""2016-01-26"",""Subscription All"",""119432664"",""14870697739968326"",""0.0074188355187426""
I want to remove the quotes.
Please can someone help.
Also, is it possible to change the date format to yyyymmdd instead of yyyy-mm-dd?
allRows.each { String[] theLine ->
String newDate = theLine[0].replaceAll('-', '')
String newline = theLine.eachWithIndex { String s, int i -> return i > 0 ? s : newDate}
writer.writeLine(newline)
}
Thanks
When you instantiated your CSVReader you told it to treat no characters as quotes, therefore it read the existing quotes as data and did not remove them.
When you told CSVWriter not to add any quotes, it honored your request. However, the input data contained quote characters, and the convention for including quotes inside a string in CSV is to double the quotes. Thus the string value
ABC"DEF
gets coded in CSV as
"ABC""DEF"
So the result you see is the combination of not removing the quotes on input (you told it not to) and then doubling the quotes on output.
To solve this, change the input option from NULL_CHARACTER to DEFAULT_QUOTE_CHARACTER. However, be aware that if any of your data actually contains embedded quotes or commas, the resulting output will not be valid CSV.
Also I think this might be a valid bug report against OpenCSV. I believe that OpenCSV needs to inform you if it is about to generate invalid CSV when you told it to omit quotes, probably via a runtime exception. Although I suppose they might argue that you chose to work without a net and should accept whatever you get. Personally I go for the "principle of least surprise", which IMHO would be not to double quotes when the output is unquoted.
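The two effects described above (quotes kept as data on input, embedded quotes doubled on output) are general CSV behaviour and can be reproduced outside OpenCSV. The following is a sketch using Python's csv module, not OpenCSV's API, so option names differ; `read_rows` and `encode_row` are illustrative helpers, and the last lines show one possible way to get the yyyymmdd date asked about in the question.

```python
import csv
import io

sample = '"2016-01-26","Subscription All","119432660"\n'


def read_rows(text, honor_quotes):
    """Parse CSV text, treating '"' either as a quote character or as plain data."""
    quoting = csv.QUOTE_MINIMAL if honor_quotes else csv.QUOTE_NONE
    return list(csv.reader(io.StringIO(text), quoting=quoting))


def encode_row(fields):
    """Write one row with the csv module's default quoting and return it as text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue().rstrip("\n")


print(read_rows(sample, honor_quotes=False))  # quote characters stay in the fields
print(read_rows(sample, honor_quotes=True))   # quote characters are stripped
print(encode_row(['ABC"DEF']))                # embedded quote is doubled

# With quotes handled on input, the date can be reformatted before writing.
row = read_rows(sample, honor_quotes=True)[0]
row[0] = row[0].replace("-", "")
print(encode_row(row))
```

Because none of the sample fields contain a comma or a quote once parsed, the default writer emits them without any quotes at all.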
Because the quote character in your CSVReader is set to CSVParser.NULL_CHARACTER, " is treated as a normal character that is part of the token being read. This causes your array to contain data in the form:
["2016-01-26", "Subscription All", "119432660", "1315529431362550", "0.0080099833517888"]
rather than:
[2016-01-26, Subscription All, 119432660, 1315529431362550, 0.0080099833517888]
So try changing the option from CSVParser.NULL_CHARACTER to either
'"'
or
CSVParser.DEFAULT_QUOTE_CHARACTER (which also stores '"').
CsvToBean csvToBean = new CsvToBeanBuilder(new StringReader(csv))
    .withMappingStrategy(strategy)
    .withIgnoreLeadingWhiteSpace(true)
    .withSeparator(',')
    .withIgnoreEmptyLine(true)
    .withQuoteChar('\'')
    .withQuoteChar('"')
    .build();
Ok, so I'm creating a Flash HUD in AS2 that runs on the Surface and connects to our server.
As it stands now, I'm having to hard code the IP addresses for the Surface to connect to, and I'm trying to get past this.
I have 4 text fields for the user to enter the 4 fields of IP address data. My issue at the moment is that if I set the String variable literally, it works fine. But if I dynamically create the string, instead of outputting on one line, it outputs each of the 4 strings separately.
Here's my code:
var newIP1 = getIP.IPtext.IP1.text; //grabbing the data from the UI
var newIP2 = getIP.IPtext.IP2.text;
var newIP3 = getIP.IPtext.IP3.text;
var newIP4 = getIP.IPtext.IP4.text;
var ipArray = new Array(newIP1,newIP2,newIP3,newIP4); //setting the array
trace (ipArray.join(".")); // output the string, replacing the commas with a period
//output:
//10
//.255
//.255
//.22
//If I do this it works fine
var IPstr = "10.255.255.22";
trace(IPstr);
// output: 10.255.255.22
I appreciate any help on this, thanks in advance.
Your code looks good and should work as expected.
One thing to check is whether a carriage return or newline character is being added to the text from each input box. Checking the length of each input string would show whether an invisible character is there.
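The length check can be sketched as follows, in Python rather than ActionScript. The field values are hypothetical: each one is assumed to carry a trailing carriage return, which would explain output that breaks after every part of the address. `describe` is an illustrative helper name.

```python
def describe(field):
    """Return a field's length and its escaped form, exposing hidden characters."""
    return len(field), repr(field)


# Hypothetical input: each text box is assumed to return a trailing carriage return.
raw_fields = ["10\r", "255\r", "255\r", "22\r"]
for field in raw_fields:
    print(describe(field))

# Stripping surrounding whitespace before joining gives a single-line address.
cleaned = [field.strip() for field in raw_fields]
print(".".join(cleaned))
```

If the lengths come out one larger than the visible text, removing the stray character from each field before the join should give the expected single-line string.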