Parse file with space delimiter


I have a file with Chinese content that I need to parse. Each post has some odd delimiter between fields; I am trying to isolate the fields but cannot identify the delimiter.
Dim stringSplitter() as string = {" "}
Try
sampleResults = entry.Split(stringSplitter,StringSplitOptions.RemoveEmptyEntries)
.....
A sample of the post content:
108087006686338t.qq.com/GAOCHUANG8899homeGAOCHUANG8899homehttp://t.qq.com/p/t/1080870066863382012-03-22 04:49:46
The separator starts after the first set of digits: 108087006686338 DELIMITER t.qq.com/GAOCHUANG8899home. I initially thought I could split it using JSON, but this is definitely not JSON format.
Sorry, the delimiters disappear when I paste the original into this post. The delimiter looks like a rectangular block.
EDIT:
OK, using a hex editor I identified the character's hex value as 01. It looks like a period, but a period has the value 2E. Does this mean anything to anyone?
EDIT:
Rephrasing the question: can I split a string based on a hex value? If the value is "01", how would I split the string on that value?
EDIT:
Final answer:
Dim hvalue As Char = ChrW(1) ' character with code 1 (hex 01)
Dim stringSplitter() As String = {hvalue.ToString()}

Let's say you have input $input and a delimiter with ASCII code 01.
Perl:
my $input = ...
my @output = split(chr(01), $input);
print "$_\t" for @output; # print all items
The code above splits your $input into the @output array, so you can then access items via
$output[0] # first item
$output[1] # second item
...
$#output + 1 # number of items
VB.NET (Visual Studio 2010):
Dim hvalue As Char = ChrW(1) ' character with code 1 (hex 01)
Dim stringSplitter() As String = {hvalue.ToString()}
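For comparison, the same split on the character with code 1 can be sketched in Python. This is only an illustration: the sample record is rebuilt from assumed field boundaries (the real boundaries are not visible in the post), and split_fields is a hypothetical helper name.

```python
# Hypothetical record: the field boundaries below are assumed for illustration.
DELIMITER = "\x01"  # the control character with code 1, same as chr(1)


def split_fields(entry):
    """Split one record on the 0x01 character, dropping empty fields."""
    return [field for field in entry.split(DELIMITER) if field]


sample = DELIMITER.join([
    "108087006686338",
    "t.qq.com/GAOCHUANG8899",
    "2012-03-22 04:49:46",
])
fields = split_fields(sample)
print(fields)
```

Dropping empty fields mirrors StringSplitOptions.RemoveEmptyEntries in the question's code.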

Related

Getting all words out of a string

I have this ArrayList that receives data dynamically from a database:
val deviceNameList = arrayListOf<String>()
Getting index 0 of the ArrayList, i.e. deviceNameList[0], prints a string in this format:
[Peter, James]
How can I list all the names in deviceNameList[0] individually?

Assuming your input string is [Peter, James], you could try removing the square brackets at both ends, then regex splitting on comma followed by optional whitespace.
String input = "[Peter, James]";
String[] names = input.substring(1, input.length()-1).split(",\\s*");
System.out.println(Arrays.toString(names));
This prints:
[Peter, James]
Note that Java itself places square brackets around the array contents in Arrays.toString. They are not part of the actual data.
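The same idea, stripping the surrounding brackets and then splitting on a comma followed by optional whitespace, can be sketched in Python. The function name split_names is made up for this example, and it assumes the input always starts with [ and ends with ].

```python
import re


def split_names(text):
    """Return the individual names from a string such as "[Peter, James]"."""
    inner = text.strip()[1:-1]  # assumes the first and last characters are brackets
    if not inner:
        return []
    return re.split(r",\s*", inner)


names = split_names("[Peter, James]")
for name in names:
    print(name)
```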

Regular expression for splitting a comma with the string

I have to split string data on commas.
This is the Excel data:
Please find the Excel data
string strCurrentLine = "\"Himalayan Salt Body Scrub with Lychee Essential Oil from Majestic Pure, All Natural Scrub to Exfoliate & Moisturize Skin, 12 oz\",SKU_27,\"Tombow Dual Brush Pen Art Markers, Portrait, 6-Pack\",SKU_27,My Shopify Store 1,Valid,NonInventory";
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] lstColumnValues = CSVParser.Split(strCurrentLine);
I have attached the image. The problem is that I used the Regex to split the string on commas, but I need the output to look like SKU_27, because string[0] and string[2] contain the forward and backward slashes. I need the output to look like string[1], with the forward and backward slashes removed.

The file seems to be a CSV file. A properly formatted CSV uses double quotes to wrap strings that contain commas, such as
id, name, date
1,"Some text, that includes comma", 2020/01/01
If you simply split the string on commas, you will get the 2nd column with its double quotes.

I'm not sure whether you are asking how to remove the double-quotes from lstColumnValues[0] and lstColumnValues[2], or add them to lstColumnValues[1].
To remove the double-quotes, just use Replace:
string myString = lstColumnValues[0].Replace("\"", "");
If you need to add them:
string myString = $"\"{lstColumnValues[1]}\"";
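As a point of comparison, a CSV parser handles both the quoted commas and the surrounding double quotes in one step. Here is a minimal Python sketch using the standard csv module on the line from the question; parse_csv_line is a hypothetical helper name.

```python
import csv

line = (
    '"Himalayan Salt Body Scrub with Lychee Essential Oil from Majestic Pure, '
    'All Natural Scrub to Exfoliate & Moisturize Skin, 12 oz",SKU_27,'
    '"Tombow Dual Brush Pen Art Markers, Portrait, 6-Pack",SKU_27,'
    'My Shopify Store 1,Valid,NonInventory'
)


def parse_csv_line(text):
    """Parse a single CSV line; quoted commas stay inside their field."""
    return next(csv.reader([text]))


columns = parse_csv_line(line)
for column in columns:
    print(column)
```

The parsed fields no longer carry the wrapping double quotes, so no separate Replace step is needed.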

how to modify textfile using U-SQL

I have a large file of around 130 MB containing 10 A characters on each line, with a \t after the 10th "A" character. I want to extract this text file and then change all As to Bs. Can anyone help with a code snippet?
This is what I have written so far:
USE DATABASE imodelanalytics;
@searchlog =
EXTRACT characters string
FROM "/iModelAnalytics/Samples/Data/dummy.txt"
USING Extractors.Text(delimiter: '\t', skipFirstNRows: 1);
@modify =
SELECT characters AS line
FROM @searchlog;
OUTPUT @modify
TO "/iModelAnalytics/Samples/Data/B.txt"
USING Outputters.Text();
I'm new to this, so any suggestions will be helpful! Thanks

Assuming every field is AAAAAAAAAA, you could write:
@modify = SELECT "BBBBBBBBBB" AS characters FROM @searchlog;
If only some fields are all As, you would handle it in the SELECT clause:
@modify =
SELECT (characters == "AAAAAAAAAA" ? "BBBBBBBBBB" : characters) AS characters
FROM @searchlog;
If there are other characters around the AAAAAAAAAA, you would use more of the C# string functions to find and replace them in a similar pattern.
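The per-row logic can be sketched outside U-SQL as well. The Python below is only an illustration of the two cases described above (an exact match on ten As, and As mixed with other characters); the function names and sample rows are made up for this example.

```python
def replace_exact(field):
    """Swap the field only when it is exactly ten A characters."""
    return "BBBBBBBBBB" if field == "AAAAAAAAAA" else field


def replace_all_as(field):
    """Replace every A with B, whatever else surrounds it."""
    return field.replace("A", "B")


rows = ["AAAAAAAAAA", "AAAAACAAAA", "xAAAAAAAAAAx"]
exact = [replace_exact(row) for row in rows]
general = [replace_all_as(row) for row in rows]
print(exact)
print(general)
```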

python3 replace ' in a string

I am trying to clean text strings containing any ' or &#39 (the real entity ends with a ;, but if I add it here you will just see ' again, because the entity is also decoded by Stack Overflow). The string content contains ', and when it does there is an error.
When I insert the string into my database I get this error:
psycopg2.ProgrammingError: syntax error at or near "s"
LINE 1: ...tment and has commenced a search for mr. whitnell's
The original string looks like this:
...a search for mr. whitnell&#39s...
To remove the ' and &#39 ; I use:
stripped_content = stringcontent.replace("'","")
stripped_content = stringcontent.replace("&#39 ;","")
Any advice is welcome. Best regards

When you call replace("&#39 ;","") it literally searches for occurrences of "&#39 ;" in the string. You need to convert "&#39 ;" to its character equivalent. Try this:
s = "That's how we 'roll"
r = s.replace(chr(int('&#39'[2:])), "")
With chr(int('&#39'[2:])) you get the ' character.
Output:
Thats how we roll
Note
If you run s.replace(chr(int('&#39'[2:])), "") without saving the result in a variable, your original string is not affected.
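An alternative sketch, assuming the text contains the standard HTML entity for the apostrophe: the standard library's html.unescape decodes the entity first, after which a single replace removes every apostrophe. The helper name strip_apostrophes is hypothetical.

```python
import html


def strip_apostrophes(text):
    """Decode HTML entities, then remove every apostrophe."""
    decoded = html.unescape(text)  # turns "&#39;" into "'"
    return decoded.replace("'", "")


cleaned = strip_apostrophes("a search for mr. whitnell&#39;s")
print(cleaned)
```

Because the result of each step is passed on to the next, the second replacement does not discard the first one, which is what happens when both calls start from the original stringcontent.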

groovy - drop first line from string

I have to drop out the first line (UNA:+.? ') from the following input string:
UNA:+.? '
UNB+UNOA:2+422207530:9+8713381197918:14+20141212:1555+1082746344'
UNH+1+ORDERS:D:97A:UN'
BGM+220+105961-44+9'
DTM+137:20140121:102'
NAD+BY+0048003479::91'
NAD+SE+0000805406::91'
NAD+DP+0048003479::91'
CUX+2:USD+9'
PIA+1+M1PL05883LOT:BP::92'
PIA+1+927700077001:VP::91'
PRI+AAA:9:::1:PCE'
SCC+1'
QTY+21:10000:PCE'
DTM+2:11022014:102'
PIA+1+M1PL05883LOT:BP::92'
PIA+1+927700080201:VP::91'
PRI+AAA:9:::1:PCE'
SCC+1'
QTY+21:20000:PCE'
DTM+2:04022014:102'
UNS+S'
UNT++1'
UNZ+1+10596144'

@Jerry has the right answer...
Assuming your string is in a variable input, you can do:
String output = input.split('\n') // Split into an array based on newline
.drop(1) // Drop the first element
.join('\n') // Join back into a string separated by newline
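The same operation can be sketched in Python, assuming lines are separated by "\n". Here drop_first_line is a hypothetical helper that cuts at the first newline instead of splitting and rejoining every line; the sample is a shortened version of the input above.

```python
def drop_first_line(text):
    """Return the text without its first line (empty if there is only one line)."""
    _, _, rest = text.partition("\n")  # everything after the first newline
    return rest


sample = "UNA:+.? '\nUNB+UNOA:2+422207530:9'\nUNH+1+ORDERS:D:97A:UN'"
trimmed = drop_first_line(sample)
print(trimmed)
```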
