I want to replace this, a combination of element and text:
w<hi rend="superscript">hi</hi>ch
with this text string (eliminating the element):
which
I have this ArrayList that receives data dynamically from a database:
val deviceNameList = arrayListOf<String>()
Getting index 0 of the ArrayList, i.e. deviceNameList[0], prints a string in this format:
[Peter, James]
How can I list all the names in deviceNameList[0] individually?
Assuming your input string is [Peter, James], you could try removing the square brackets at both ends, then regex splitting on comma followed by optional whitespace.
String input = "[Peter, James]";
String[] names = input.substring(1, input.length()-1).split(",\\s*");
System.out.println(Arrays.toString(names));
This prints:
[Peter, James]
Note that Java itself places square brackets around the array contents in Arrays.toString. They are not part of the actual data.
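The same idea (drop the surrounding brackets, then split on a comma followed by optional whitespace) can be sketched in Python for illustration. The function name split_bracketed_list is hypothetical, and the sketch assumes the names themselves contain no commas.

```python
import re


def split_bracketed_list(text):
    """Split a string such as "[Peter, James]" into its individual names."""
    inner = text.strip()
    # Drop the surrounding square brackets if both are present.
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # An empty list "[]" yields no names at all.
    if not inner:
        return []
    # Split on a comma followed by optional whitespace.
    return re.split(r",\s*", inner)


for name in split_bracketed_list("[Peter, James]"):
    print(name)
```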
I just scraped text data from a website, and that data contains numbers, special characters and punctuation. After splitting the data I tried to keep only plain text, but I'm still getting spaces, numbers and special characters. How can I remove all of those and keep only the text?
url = 'www.example.com'
html = urllib.request.urlopen(url).read().decode('utf-8')
text = get_text(html)
extracted_data = text.split()
refined_data = []
SYMBOLS = '{}()[].,:;+-*/&|<>=~0123456789'
for i in extracted_data:
    if i not in SYMBOLS:
        refined_data.append(i)
print("\n", "$" * 50, "HEYAAA we got arround: ", len(refined_data), " of keywords! Here are they: ","$" * 50, "\n")
print(type(refined_data))
output:
1.My
2.system
3.showing
4.error
5.404
6.I
7.don't
8.understand
9.why
10. it
11. showing ,
12.like
13.this?
14.53251
15.$45
extracted_data is the result of string.split()
Called without arguments, string.split() splits your text on any run of whitespace.
The not in operator checks whether i (the entire string) is contained in a sequence. Your sequence here is a single string, so the check is a substring test against SYMBOLS; for a one-character token it behaves like a lookup in a list of the individual characters.
So is 'system' in SYMBOLS? Asked again: does the string 'system' occur anywhere inside SYMBOLS? No, it does not. Therefore the body of your if statement is executed and the token is appended to your result.
Is '53251' in SYMBOLS? No, it is not. Therefore it is appended.
And so on.
Such a comparison is not what you need. You should be using str.strip(), which removes the given characters from both ends of each token.
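A minimal sketch of the str.strip() approach follows. It assumes that string.punctuation plus string.digits is an acceptable stand-in for the SYMBOLS string in the question (SYMBOLS lacks characters such as ? and $), and the helper name clean_tokens is hypothetical.

```python
import string

# Assumption: strip every ASCII punctuation character and digit,
# which is a wider set than the SYMBOLS string in the question.
STRIP_CHARS = string.punctuation + string.digits


def clean_tokens(tokens, strip_chars=STRIP_CHARS):
    """Strip unwanted characters from both ends of each token, dropping empty results."""
    cleaned = []
    for token in tokens:
        stripped = token.strip(strip_chars)
        # Tokens made up only of digits or punctuation become empty and are skipped.
        if stripped:
            cleaned.append(stripped)
    return cleaned


text = "My system showing error 404 I don't understand why it showing , like this? 53251 $45"
print(clean_tokens(text.split()))
```

Note that strip() only touches the ends of a token, so the apostrophe inside don't is kept.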
I have to split string data on commas.
This is the Excel data:
string strCurrentLine="\"Himalayan Salt Body Scrub with Lychee Essential Oil from Majestic Pure, All Natural Scrub to Exfoliate & Moisturize Skin, 12 oz\",SKU_27,\"Tombow Dual Brush Pen Art Markers, Portrait, 6-Pack\",SKU_27,My Shopify Store 1,Valid,NonInventory";
Regex CSVParser = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string[] lstColumnValues = CSVParser.Split(strCurrentLine);
I have attached the image. The problem is that I used the Regex to split the string on commas, but I need the output to look like SKU_27, because string[0] and string[2] contain the forward and backward slash. I need the output to look like string[1], with the forward and backward slash removed.
The file seems to be a CSV file. To be properly formatted, a CSV file uses double quotes to wrap strings that contain a comma, such as
id, name, date
1,"Some text, that includes comma", 2020/01/01
If you simply split the string on commas, you will get the 2nd column with its double quotes still attached.
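To illustrate the difference between a plain comma split and CSV-aware parsing, here is a small sketch in Python (not the C# from the question). It uses the standard csv module; the function names naive_split and csv_split are hypothetical.

```python
import csv
import io


def naive_split(line):
    """Plain split on commas: quoted fields are broken apart and keep their quotes."""
    return line.split(",")


def csv_split(line):
    """CSV-aware split: a quoted field stays whole and loses its wrapping quotes."""
    return next(csv.reader(io.StringIO(line)))


line = '1,"Some text, that includes comma",2020/01/01'
print(naive_split(line))
print(csv_split(line))
```

A CSV parser handles the quoting rules itself, so no separate step is needed to strip the double quotes afterwards.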
I'm not sure whether you are asking how to remove the double-quotes from lstColumnValues[0] and lstColumnValues[2], or add them to lstColumnValues[1].
To remove the double-quotes, just use Replace:
string myString = lstColumnValues[0].Replace("\"", "");
If you need to add them:
string myString = $"\"{lstColumnValues[1]}\"";
I'm trying to split a string (separated with the HTML break tag), without deleting the break tag. I think it's pretty messy to add a break as string after splitting, so is there any function/possibility to keep the separator while "splitting"?
Example:
<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>
Expected result:
<HTML><BODY><p>some text<br/>
some more text</p></BODY></HTML>
As far as I know SPLIT removes the separator from the results and it doesn't seem like you can change that.
But you could create your own separator by first replacing your <br/> tag with <br/> plus an arbitrary string that is highly unlikely to ever appear in your HTML source, and then split the HTML using this arbitrary string as a separator instead.
types:
begin of t_result,
segment(2000) type c,
end of t_result.
DATA:
source type string,
separator type string,
brtag type string,
repl type string,
result_tab type standard table of t_result,
result_row TYPE t_result.
brtag = '<br/>'.
separator = '|***SEP***|'.
concatenate brtag separator into repl.
source = '<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>'.
replace all occurrences of brtag in source with repl.
split source at separator into table result_tab.
LOOP AT result_tab INTO result_row.
WRITE:
result_row-segment.
ENDLOOP.
Output of that example report:
<HTML><BODY><p>some text<br/>
some more text</p></BODY></HTML>
The caveat of this solution is that your custom separator, if not chosen with some care, might appear in your HTML source on its own. I therefore would choose an arbitrary string with a special character or two that would be encoded in HTML (like umlauts) and therefore not appear in your source.
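The placeholder technique described above is not specific to ABAP. A minimal sketch in Python, for illustration only, with the hypothetical helper name split_keeping_tag and the same separator string as in the report:

```python
BR_TAG = "<br/>"
SEPARATOR = "|***SEP***|"  # arbitrary marker, assumed not to occur in the HTML


def split_keeping_tag(source, tag=BR_TAG, separator=SEPARATOR):
    """Split source after every tag while keeping the tag in the result."""
    # Guard against the caveat above: the marker must not already be in the source.
    if separator in source:
        raise ValueError("separator already occurs in the source")
    # Append the marker to each tag, then split on the marker only.
    return source.replace(tag, tag + separator).split(separator)


html = "<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>"
for segment in split_keeping_tag(html):
    print(segment)
```

The explicit check turns the caveat into an error instead of a silently wrong split.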
Just use the REPLACE command: replace <br/> with <br/>CR_LF.
CR_LF refers to the carriage return and line feed characters.
In more complex cases you can use regular expressions in ABAP.
class ZTEST_SO definition public create public .
public section.
methods t1.
ENDCLASS.
CLASS ZTEST_SO IMPLEMENTATION.
METHOD T1.
data: my_break type string,
my_string type string
value '<HTML><BODY><p>some text<br/>some more text</p></BODY></HTML>'.
my_break = '<br/>' && CL_ABAP_CHAR_UTILITIES=>CR_LF.
replace all occurrences of '<br/>' in my_string with my_break in character mode.
"check my_string in the debugger :)
"<HTML><BODY><p>some text<br/>
"some more text</p></BODY></HTML>
ENDMETHOD.
ENDCLASS.
I have some text files. These files contain strings like this (part of the text):
<abbr class="word p1">dd</abbr>
<img src"D:\Images\1.png">
<abbr class="word p1">dd</abbr>
<img src"D:\ticket\t\1.png">
In each text file the PNG name in D:\Images\1.png is different, but it is always a number (from 1 to 114), for example 1, 2, 3, 10, ...
I want to replace this text D:\Images\[number].png with a specific text, for example:
string newtext=Replace("D:\Images\[number].png","Something");
How can I do this?
Thanks.
Use a regular expression:
string newtext = Regex.Replace(text, @"(D:\\Images\\)\d+(\.png)", "$1Something$2");
It will replace the full match, including D:\Images\ and .png, so $1 and $2 put back what was caught by the parentheses, so that Something only replaces the digits.
Use regular expressions, which are mostly represented by the Regex class. See these links:
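The capture-group idea can also be sketched in Python, for illustration only. The function name replace_image_number is hypothetical; a replacement function is used instead of a replacement template so that backslashes in the inserted text need no escaping.

```python
import re

# Group 1 captures the folder prefix, group 2 the file extension;
# only the digits between them are replaced.
IMAGE_PATTERN = re.compile(r"(D:\\Images\\)\d+(\.png)")


def replace_image_number(text, replacement="Something"):
    """Replace the numeric file name in D:\\Images\\<number>.png with replacement."""
    return IMAGE_PATTERN.sub(
        lambda match: match.group(1) + replacement + match.group(2), text
    )


sample = r'<img src"D:\Images\12.png">'
print(replace_image_number(sample))
```

Paths outside D:\Images\, such as D:\ticket\t\1.png, do not match the pattern and are left unchanged.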
http://www.codeproject.com/Articles/93804/Using-Regular-Expressions-in-C-NET
http://msdn.microsoft.com/en-us/library/ms228595%28v=vs.80%29.aspx