SPLIT results with separator - string

I'm trying to split a string (separated with the HTML break tag), without deleting the break tag. I think it's pretty messy to add a break as string after splitting, so is there any function/possibility to keep the separator while "splitting"?
Example:
<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>
Expected result:
<HTML><BODY><p>some text<br/>
some more text</p></BODY></HTML>

As far as I know SPLIT removes the separator from the results and it doesn't seem like you can change that.
But you could create your own separator by first replacing your <br/> tag with <br/> plus an arbitrary string that is highly unlikely to ever appear in your HTML source, and then split the HTML using this arbitrary string as a separator instead.
TYPES:
  BEGIN OF t_result,
    segment(2000) TYPE c,
  END OF t_result.

DATA:
  source     TYPE string,
  separator  TYPE string,
  brtag      TYPE string,
  repl       TYPE string,
  result_tab TYPE STANDARD TABLE OF t_result,
  result_row TYPE t_result.

brtag = '<br/>'.
separator = '|***SEP***|'.
CONCATENATE brtag separator INTO repl.
source = '<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>'.

" Append the custom separator to every break tag, then split on it.
REPLACE ALL OCCURRENCES OF brtag IN source WITH repl.
SPLIT source AT separator INTO TABLE result_tab.

LOOP AT result_tab INTO result_row.
  WRITE: / result_row-segment.
ENDLOOP.
Output of that example report:
<HTML><BODY><p>some text<br/>
some more text</p></BODY></HTML>
The caveat of this solution is that your custom separator, if not chosen with some care, might appear in your HTML source on its own. I therefore would choose an arbitrary string with a special character or two that would be encoded in HTML (like umlauts) and therefore not appear in your source.
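For readers who don't work in ABAP, the same replace-then-split idea can be sketched in Python. This is only an illustration of the technique, not part of the original report: the function name split_keep_separator and the sentinel default are made up for the example, and the explicit check for the sentinel is one possible way to guard against the caveat mentioned above.

```python
def split_keep_separator(text, sep, sentinel="|***SEP***|"):
    """Split text after each sep, keeping sep at the end of each piece."""
    if sentinel in text:
        # The caveat from the answer: the sentinel must not occur in the input.
        raise ValueError("sentinel already occurs in the text")
    # Append the sentinel to every separator, then split on the sentinel only.
    return text.replace(sep, sep + sentinel).split(sentinel)


html = "<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>"
for segment in split_keep_separator(html, "<br/>"):
    print(segment)
```

Raising an error when the sentinel is already present is a design choice for the sketch; a real program might instead pick a different sentinel and retry.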

Just use the REPLACE statement: replace <br/> with <br/> followed by CR_LF.
CR_LF refers to the carriage return/line feed character pair (CL_ABAP_CHAR_UTILITIES=>CR_LF).
In more complex cases you can use regular expressions in ABAP.
CLASS ztest_so DEFINITION PUBLIC CREATE PUBLIC.
  PUBLIC SECTION.
    METHODS t1.
ENDCLASS.

CLASS ztest_so IMPLEMENTATION.
  METHOD t1.
    DATA: my_break  TYPE string,
          my_string TYPE string
            VALUE '<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>'.

    my_break = '<br/>' && cl_abap_char_utilities=>cr_lf.
    REPLACE ALL OCCURRENCES OF '<br/>' IN my_string WITH my_break IN CHARACTER MODE.

    "check my_string in the debugger :)
    "<HTML><BODY><p>some text<br/>
    "some more text</p></BODY></HTML>
  ENDMETHOD.
ENDCLASS.
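As a rough illustration of the "more complex cases" where a regular expression helps, here is a small Python sketch that appends CR LF after several spellings of the break tag. The pattern and the name add_line_breaks are assumptions made for this example; which tag variants count as a break is a guess, not something stated in the answer.

```python
import re

# Assumed definition of "break tag": <br>, <br/> or <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def add_line_breaks(html):
    """Return html with CR LF inserted after every break tag."""
    # The matched tag is kept as written and only the line break is appended.
    return BR_TAG.sub(lambda m: m.group(0) + "\r\n", html)


print(add_line_breaks("<p>some text<br/>some more text</p>"))
```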


Regular expression for splitting a comma with the string

I have to split string data on commas.
This is a sample line from the Excel data:
string strCurrentLine = "\"Himalayan Salt Body Scrub with Lychee Essential Oil from Majestic Pure, All Natural Scrub to Exfoliate & Moisturize Skin, 12 oz\",SKU_27,\"Tombow Dual Brush Pen Art Markers, Portrait, 6-Pack\",SKU_27,My Shopify Store 1,Valid,NonInventory";
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] lstColumnValues = CSVParser.Split(strCurrentLine);
The problem is that I used the Regex to split the string on commas, but I need the output to look like SKU_27. string[0] and string[2] still contain the surrounding \" characters. I need them in the same form as string[1], with those characters removed.
The file seems to be a CSV file. A properly formatted CSV file uses double quotes to wrap strings that contain a comma, such as
id, name, date
1,"Some text, that includes comma", 2020/01/01
If you simply split the string on commas, the 2nd column will still include its double quotes.
I'm not sure whether you are asking how to remove the double-quotes from lstColumnValues[0] and lstColumnValues[2], or add them to lstColumnValues[1].
To remove the double-quotes, just use Replace:
string myString = lstColumnValues[0].Replace("\"", "");
If you need to add them:
string myString = $"\"{lstColumnValues[1]}\"";
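If a CSV-aware parser is an option, it handles both the embedded commas and the surrounding quotes in one step. The sketch below uses Python's standard csv module rather than the C# Regex from the question, so it is a comparison of approach, not a drop-in fix; parse_csv_line is a name invented for the example.

```python
import csv


def parse_csv_line(line):
    """Parse one CSV line into a list of fields, with wrapping quotes removed."""
    # csv.reader accepts any iterable of lines, so a one-element list works.
    return next(csv.reader([line]))


line = (
    '"Himalayan Salt Body Scrub with Lychee Essential Oil from Majestic Pure, '
    'All Natural Scrub to Exfoliate & Moisturize Skin, 12 oz",SKU_27,'
    '"Tombow Dual Brush Pen Art Markers, Portrait, 6-Pack",SKU_27,'
    "My Shopify Store 1,Valid,NonInventory"
)

for field in parse_csv_line(line):
    print(field)
```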

Display the specific part of the string in PostgreSQL 9.3

I have a string to modify as per the requirements.
For example:
The given string is:
str1 varchar = '123,456,789';
I want to show the string as:
'456,789'
Note: The first part (delimited) with comma, I want to remove from string and show the rest of string.
In SQL Server I used STUFF() function.
SELECT STUFF('123,456,789',1,4,'');
Result:
456,789
Question: Is there any string function in PostgreSQL 9.3 version to do the same job?
You can use regular expressions:
select substring('123,456,789' from ',(.*)$');
The comma matches the first comma found in the string. The part inside the brackets (.*) is returned from the function. The symbol $ means the end of the string.
An alternative solution without regular expressions:
select str, substring(str from position(',' in str)+1 for length(str)) from
(select '123,456,789'::text as str) as foo;
You could first turn the string to array and return second and third cell:
select array_to_string((regexp_split_to_array('123,456,789', ','))[2:3], ',')
Or you could use substring-function with regular expressions (pattern matching):
SELECT substring('123,456,789' from '[0-9]+,([0-9]+,[0-9]+)')
[0-9]+ means one or more digits
parentheses tell to return that part from the string
Both solutions work on your specific string.
Your SQL Server example indicates you just want to remove the first 4 characters, which makes the rest of your question seem misleading because it completely ignores what's in the string. Only the position matters.
Be that as it may, the simple and cheap way to cut off leading characters is with right():
SELECT right('123,456,789', -4);
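The difference between "cut at the first comma" and "cut the first 4 characters" only shows up when the first field is not three characters long. The following Python sketch illustrates that distinction outside the database; the function names are hypothetical, and it is not PostgreSQL code.

```python
def drop_first_field(s, sep=","):
    """Return everything after the first separator (content-based)."""
    return s.partition(sep)[2]


def drop_first_chars(s, n):
    """Return s without its first n characters (position-based)."""
    return s[n:]


# Same result when the first field happens to be three characters long.
print(drop_first_field("123,456,789"))
print(drop_first_chars("123,456,789", 4))

# Different results as soon as the first field has another length.
print(drop_first_field("12,456,789"))
print(drop_first_chars("12,456,789", 4))
```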

rstrip() has no effect on string

Trying to use rstrip() at its most basic level, but it does not seem to have any effect at all.
For example:
string1='text&moretext'
string2=string1.rstrip('&')
print(string2)
Desired Result:
text
Actual Result:
text&moretext
Using Python 3, PyScripter
What am I missing?
someString.rstrip(c) removes all occurrences of c at the end of the string. Thus, for example
'text&&&&'.rstrip('&') == 'text'
Perhaps you want
'&'.join(string1.split('&')[:-1])
This splits the string on the delimiter "&" into a list of strings, removes the last one, and joins the rest again with the delimiter "&". Thus, for example
'&'.join('Hello&World'.split('&')[:-1]) == 'Hello'
'&'.join('Hello&Python&World'.split('&')[:-1]) == 'Hello&Python'
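A short sketch of the alternatives may help here. It contrasts rstrip with str.partition and str.rpartition, which cut at the first and last delimiter respectively; the helper names before_first and before_last are made up for the example, and which of the two you want depends on how strings with several "&" should be handled.

```python
def before_first(s, sep="&"):
    """Return the part of s before the first sep (all of s if sep is absent)."""
    return s.partition(sep)[0]


def before_last(s, sep="&"):
    """Return the part of s before the last sep (empty string if sep is absent)."""
    return s.rpartition(sep)[0]


string1 = "text&moretext"
print(string1.rstrip("&"))      # unchanged: there is no trailing '&' to strip
print(before_first(string1))
print(before_last("Hello&Python&World"))
```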

Reading from a string using sscanf in Matlab

I'm trying to read a string in a specific format:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>
This is one example of the string, and what I want to extract is the name of the team.
I've tried something like this,
houseteam = sscanf(str, '%s');
but it does not work, why?
You can use regexprep to do this. Although your post asks about sscanf, the comments indicate you'd like to see it done with regexprep. It takes two nested regexprep calls to retrieve the team name (i.e. RealSociedad), given that str is in the format you provided:
str = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>';
houseteam = regexprep(regexprep(str, '^<a(.*)">', ''), '</a>$', '')
This looks very intimidating, but let's break this up. First, look at this statement:
regexprep(str, '^<a(.*)">', '')
How regexprep works is you specify the string you want to analyze, the pattern you are searching for, then what you want to replace this pattern with. The pattern we are looking for is:
^<a(.*)">
This says you are looking for patterns where the string starts with <a. After this, (.*)"> performs a greedy match: it finds the longest sequence of characters up to the characters ">. As such, the regular expression matches the following string:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">
We then replace this with a blank string. As such, the output of the first regexprep call will be this:
RealSociedad</a>
We want to get rid of the </a> string, and so we would make another regexprep call where we look for the </a> at the end of the string, then replace this with the blank string yet again. The pattern you are looking for is thus:
</a>$
The dollar sign ($) symbolizes that this pattern should appear at the end of the string. If we find such a pattern, we will replace it with the blank string. Therefore, what we get in the end is:
RealSociedad
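The same two-step substitution can be sketched in Python with re.sub, which may make the individual steps easier to try out. This is an illustrative port, not MATLAB code; strip_anchor is a hypothetical name, and the input is assumed to have exactly the format discussed above.

```python
import re


def strip_anchor(s):
    """Remove a leading <a...> tag and a trailing </a> from s."""
    # Step 1: drop everything from the leading <a up to the last "> (greedy match).
    without_open = re.sub(r'^<a(.*)">', "", s)
    # Step 2: drop the closing tag at the end of the string.
    return re.sub(r"</a>$", "", without_open)


link = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>'
print(strip_anchor(link))
```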
Found a solution. So, %s stops when it finds a space.
str = regexprep(str, '<', ' <');
str = regexprep(str, '>', '> ');
houseteam = sscanf(str, '%*s %s %*s');
This puts spaces around the tags, so my desired string becomes a separate whitespace-delimited token that the middle %s can read.
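The whitespace trick translates directly to Python, where str.split plays the role of the %s conversions. The sketch below assumes, as the sscanf version does, that neither the tag nor the team name contains whitespace; middle_token is a name chosen for the example.

```python
def middle_token(s):
    """Pad < and > with spaces, then return the second whitespace-separated token."""
    spaced = s.replace("<", " <").replace(">", "> ")
    # tokens: opening tag, team name, closing tag
    tokens = spaced.split()
    return tokens[1]


link = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>'
print(middle_token(link))
```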

How to write properties file in Groovy without escape characters and single quotes?

I have something like:
def newProps = new Properties()
def fileWriter = new OutputStreamWriter(new FileOutputStream(propsFile,true), 'UTF-8')
def lineSeparator = System.getProperty("line.separator")
newProps.setProperty('SFTP_USER_HASH', userSftpHome.toString())
newProps.setProperty('GD_SFTP_URI', sftpHost.toString())
fileWriter.write(lineSeparator)
newProps.store(fileWriter, null)
fileWriter.close()
The problem is that the store() method escapes ":" or "=" characters with a backslash (\). I don't want that because I store some passwords and tokens there and need to copy those values strictly in the key=value format.
Also, when I use the configSlurper, it stores the values with single quotes, like:
key='value'
Is there any solution for that? Saving in unescaped key=value format to properties file in Groovy?
You could do this:
def newProps = new Properties()
newProps.setProperty('SFTP_USER_HASH', 'woo')
newProps.setProperty('GD_SFTP_URI', 'ftp://woo.com')
propsFile.withWriterAppend( 'UTF-8' ) { fileWriter ->
fileWriter.writeLine ''
newProps.each { key, value ->
fileWriter.writeLine "$key=$value"
}
}
However, as long as you read the properties back in with load, there should be no need for this, as load un-escapes any escaped characters.
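The idea of bypassing store() and writing the lines by hand is language-independent. As a minimal sketch in Python (not Groovy), assuming keys and values contain no line breaks, it could look like this; write_plain_properties and the sample values are invented for the example.

```python
import io


def write_plain_properties(props, stream):
    """Write each entry as an unescaped key=value line to an open text stream."""
    stream.write("\n")  # blank line before the appended entries
    for key, value in props.items():
        # No escaping is applied, so ':' and '=' in the value are written as-is.
        stream.write(f"{key}={value}\n")


buffer = io.StringIO()
write_plain_properties({"GD_SFTP_URI": "ftp://woo.com"}, buffer)
print(buffer.getvalue())
```

A file written this way is only safe to read back with a parser that splits on the first "=" and does no un-escaping of its own.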
The JDK's built-in Properties class does that escaping by design. According to the docs:
Then every entry in this Properties table is written out, one per
line. For each entry the key string is written, then an ASCII =, then
the associated element string. For the key, all space characters are
written with a preceding \ character. For the element, leading space
characters, but not embedded or trailing space characters, are written
with a preceding \ character. The key and element characters #, !, =,
and : are written with a preceding backslash to ensure that they are
properly loaded.
You can, however, override this behavior by subclassing the Properties class. You'd need to override the load and store methods and do the reading and writing yourself. It would be pretty straightforward; pretty good examples found here: Link
