Using multiple delimiters with scanner - Java - java.util.scanner

I'm trying to use both tabs and newlines as delimiters to read from a .txt file. What I have at the moment is:
Scanner fileScanner = new Scanner(new FileReader("propertys.txt"));
fileScanner.useDelimiter("[\\t\\n]");
I've tried:
fileScanner.useDelimiter("\\t|\\n");
and
fileScanner.useDelimiter("[\\t|\\n]");
I've got no idea what's going wrong, I've searched around a lot and it looks like one of those should be working. Clearly I'm doing something wrong.

fileScanner.useDelimiter("\t|\n");
should work.
If you have two slashes "\n" the first acts as an escape and it won't work right.

For the regular expression used as a parameter in useDelimiter method, you should use newline as \n instead of \\n and tab as \t instead of \\t. From Java Pattern class: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html.
A part from that, I think you should define you regular expression like, for example, this:
fileScanner.useDelimiter("\\s*[\t\n]\\s*");
to limit strings (\\s) between newline or tab characters.

Related

How to split a string before a dot in Lua?

I need to do a simple split of a string.
The string is "That.Awkward.Moment.2014.720p.BluRay.x264.YIFY.srt"
I just need "That.Awkward.Moment.2014.720p.BluRay.x264.YIFY" without ".srt"
I tried this and is wrong:
print(string.match("That.Awkward.Moment.2014.720p.BluRay.x264.YIFY.srt", '^.-.s'))
How would I do it?
Since regular matching is greedy, you just need to match anything until you see . (don't forget to escape it):
print(string.match("That.Awkward.Moment.2014.720p.BluRay.x264.YIFY.srt", '(.+)%.(.+)'))
will print
That.Awkward.Moment.2014.720p.BluRay.x264.YIFY srt

Add parenthesis to vim string

I'm having a bit of trouble using parenthesis in a vim string. I just need to add a set of parenthesis around 3 digits, but I can't seem to find where I'm suppose to correctly place them. So for example; I would have to place them around a phone number such as: 2015551212.
Right now I have a strings that separates the numbers and puts a hyphen between them. For example; 201 555-1212. So I just need the parenthesis. The final result should look like: (201) 555-1212
The string I have so far is this: s/\(\d\{3}\)\(\d\{3}\)/\1 \2-/g
How might I go about doing this?
Thanks
Just add the parens around the \1 in your replacement.
s/\(\d\{3\}\)\(\d\{3\}\)/(\1) \2-/g
If you want to go in reverse, and change "(800) 555-1212" to "8005551212", you can use something like this:
s/(\(\d\d\d\))\ \(\d\d\d\)-\(\d\d\d\d\)/\1\2\3/g
Instead of the \d\d\d, you could use \d\{3\}, but that is more trouble to type.

linux bash replace placeholder with unknown text which can contain any characters

If I want to replace for example the placeholder {{VALUE}} with another string which can contain any characters, what's the best way to do it?
Using sed s/{{VALUE}}/$(value)/g might fail if $(value) contains a slash...
oldValue='{{VALUE}}'
newValue='new/value'
echo "${var//$oldValue/$newValue}"
but oldValue is not a regexp but works like a glob pattern, otherwise :
echo "$var" | sed 's/{{VALUE}}/'"${newValue//\//\/}"'/g'
Sed also works like 's|something|someotherthing|g' (or with other delimiters for that matter), but if you can't control the input string, you'll have to use some function to escape it before passing it to sed..
The question asked basically duplicates How can I escape forward slashes in a user input variable in bash?, Escape a string for sed search pattern, Using sed in a makefile; how to escape variables?, Use slashes in sed replace, and many other questions. “Use a different delimiter” is the usual answer. Pianosaurus's answer and Ben Blank's answer list characters (backslash and ampersand) that need to be escaped in the shell, besides whatever character is used as an alternate delimiter. However, they don't address the quoting-a-quote problem that will occur if your “string which can contain any characters” contains a double quote. The same kind of problem can affect the ${parameter/pattern/string} shell variable expansion mentioned in a previous answer.
Some other questions besides the few mentioned above suggest using awk, and that is usually a good approach to changes that are more complicated than are easy to do with sed. Also consider perl and python. Besides single- and double-quoted strings, python has u'...' unicode quoting, r'...' raw quoting,ur'...' quoting, and triple quoting with ''' or """ delimiters. The question as stated doesn't provide enough context for specific awk/perl/python solutions.

Embedding text in AS2, like HEREDOC or CDATA

I'm loading a text file into a string variable using LoadVars(). For the final version of the code I want to be able to put that text as part of the actionscript code and assign it to the string, instead of loading it from an external file.
Something along the lines of HEREDOC syntax in PHP, or CDATA in AS3 ( http://dougmccune.com/blog/2007/05/15/multi-line-strings-in-actionscript-3/ )
Quick and dirty solutions I've found is to put the text into a text object in a movieclip and then get the value, but I dont like it
Btw: the text is multiline, and can include single quotes and double quotes.
Thanks!
I think in AS2 the only way seems to do it dirty. In AS3 you can embed resources with the Embed tag, but as far as I know not in AS2.
If it's a final version and it means you don't want to edit the text anymore, you could escape the characters and use \n as a line break.
var str = "\'one\' \"two\"\nthree";
trace(str);
outputs:
'one' "two"
three
Now just copy the text into your favourite text editor and change every ' and " to \' and \", also the line breaks to \n.
Cristian, anemgyenge's solution works when you realize it's a single line. It can be selected and replaced in a simple operation.
Don't edit the doc in the code editor. Edit the doc in a doc editor and create a process that converts it to a long string (say running it through a quick PHP script). Take the converted string and paste it in over the old string. Repeat as necessary.
It's way less than ideal from a long-term management perspective, especially if code maintenance gets handed off without you handing off the parser, but it gets around some of your maintenance issues.
Use a new pair of quotes on each line and add a space as the word delimiter:
var foo = "Example of string " +
"spanning multiple lines " +
"using heredoc syntax."
There is a project which may help that adds partial E4X support to ActionScript 2:
as24x
As well as a project which adds E4X support to Haxe, which can compile to a JavaScript target:
E4X Macro for Haxe

New lines in tab delimited or comma delimtted output

I am looking for some best practices as far as handling csv and tab delimited files.
For CSV files I am already doing some formatting if a value contains a comma or double quote but what if the value contains a new line character? Should I leave the new line intact and encase the value in double quotes + escape any double quotes within the value?
Same question for tab delimited files. I assume the answer would be very similar if not the same.
Usually you keep \n unaltered while exploiting the fact that the newline char will be enclosed in a " " string. This doesn't create ambiguities but it's really ugly if you have to take a look to the file using a normal texteditor.
But it is how you should do since you don't escape anything inside a string in a CSV except for the double quote itself.
#Jack is right, that your best bet is to keep the \n unaltered, since you'll expect it inside of double-quotes if that is the case.
As with most things, I think consistency here is key. As far as I know, your values only need to be double-quoted if they span multiple lines, contain commas, or contain double-quotes. In some implementations I've seen, all values are escaped and double-quoted, since it makes the parsing algorithm simpler (there's never a question of escaping and double-quoting, and the reverse on reading the CSV).
This isn't the most space-optimized solution, but makes reading and writing the file a trivial affair, for both your own library and others that may consume it in the future.
For TSV, if you want lossless representation of values, the "Linear TSV" specification is worth considering: http://paulfitz.github.io/dataprotocols/linear-tsv/index.html
For obvious reasons, most such conventions adhere to the following at a minimum:
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash
Some tools add \0 for NUL.

Resources