python3 replace ' in a string - string

I am trying to clean text strings containing any ' or &#39 ; (written here with a space before the ; because otherwise Stack Overflow decodes it and you would just see ' again). The string content contains ' and when it does there is an error.
When I insert the string into my database I get this error:
psycopg2.ProgrammingError: syntax error at or near "s"
LINE 1: ...tment and has commenced a search for mr. whitnell's
The original string looks like this:
...a search for mr. whitnell&#39s...
To remove the ' and &#39 ; I use:
stripped_content = stringcontent.replace("'","")
stripped_content = stringcontent.replace("&#39 ;","")
Any advice is welcome, best regards

When you call replace("&#39 ;","") it searches the string for literal occurrences of "&#39 ;". You need to convert "&#39 ;" to its character equivalent. Try this:
s = "That's how we 'roll"
r = s.replace(chr(int('&#39'[2:])), "")
print(r)
Here chr(int('&#39'[2:])) evaluates to the ' character.
Output:
Thats how we roll
Note
str.replace returns a new string. If you run s.replace(chr(int('&#39'[2:])), "") without saving the result in a variable, your original string is not affected.
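As a hedged sketch of an alternative to stripping the apostrophe: decode the entity with html.unescape and pass the text to the database as a bound parameter, so the driver takes care of quoting. The sample text, the table name articles, and the use of sqlite3 as a stand-in for the PostgreSQL connection are assumptions for illustration; with psycopg2 the placeholder is %s rather than ?.

```python
import html
import sqlite3

raw = "a search for mr. whitnell&#39;s car"  # assumed sample text

# html.unescape turns entities such as &#39; back into the characters they encode.
decoded = html.unescape(raw)

# str.replace returns a new string, so chain the calls (or reassign the same
# variable) to keep both replacements.
stripped = raw.replace("&#39;", "").replace("'", "")

# With a bound parameter the apostrophe never becomes part of the SQL text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (content TEXT)")
conn.execute("INSERT INTO articles (content) VALUES (?)", (decoded,))
stored = conn.execute("SELECT content FROM articles").fetchone()[0]

print(decoded)
print(stripped)
print(stored == decoded)
```

If the apostrophe really should be dropped, the chained replace shown for stripped keeps both substitutions, which two separate assignments from the same source string would not.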

Related

nodejs how to replace ; with ',' to make an sql query

I have a query that looks like this:
INSERT INTO table VALUES ('47677;2019;2019;10T-1001-10010AS;A05;International;TieLineKoman-KosovoB;L_KOM-KOSB;2018;NULL;NULL;;NULL;Tieline;NULL;10XAL-KESH-----J;0;3')
that is produced by parsing a csv file.
The query is not in a valid form, I have to replace all semicolons with the string ',' (comma inside single quotes). What I want to get is:
('47677','2019','2019','10T-1001-10010AS','A05','International','TieLineKoman-KosovoB','L_KOM-KOSB','2018','NULL','NULL','','NULL','Tieline','NULL','10XAL-KESH-----J','0','3')
I have tried to do this in many different ways, but I end up with backslashes added to my string. This is what I get:
"INSERT INTO AllocatedEICDetail VALUES ('47677\\',\\'2019\\',\\'2019\\',\\'10T-1001-10010AS\\',\\'A05\\',\\'International\\',\\'TieLineKoman-KosovoB\\',\\'L_KOM-KOSB\\',\\'2018\\',\\'NULL\\',\\'NULL\\',\\'\\',\\'NULL\\',\\'Tieline\\',\\'NULL\\',\\'10XAL-KESH-----J\\',\\'0\\',\\'3')"
Any ideas how to do this properly without having the backslashes added?
Thank you!
// the string you have
const string = '47677;2019;2019;10T-1001-10010AS;A05;International;TieLineKoman-KosovoB;L_KOM-KOSB;2018;NULL;NULL;;NULL;Tieline;NULL;10XAL-KESH-----J;0;3';
// the string you need: every ; becomes ',' (quote, comma, quote)
const targetString = string.replace(/;/g, "','");
The first argument to replace is a small regex between forward slashes, here just ;. The g (global) flag makes it replace every instance, and the second argument is the text to put in its place. Wrap the result in (' and ') to complete the VALUES list.
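The same idea can be sketched by splitting the row into fields instead of patching the string. This is a Python illustration rather than the Node.js code from the question; the shortened sample row and the variable names are assumptions. It builds both the quoted literal form asked for and a placeholder list that could be used with bound parameters.

```python
row = "47677;2019;10T-1001-10010AS;;NULL"  # shortened, assumed sample of the CSV line

# Split once on the separator; empty fields are kept as empty strings.
fields = row.split(";")

# Literal form: joining with ',' puts a quote-comma-quote between fields,
# and the outer (' and ') close the first and last value.
values_clause = "('" + "','".join(fields) + "')"

# Bound form: one placeholder per field, with the values passed separately.
placeholders = "(" + ",".join("?" for _ in fields) + ")"

print(values_clause)
print(placeholders)
```

The placeholder form avoids building quotes by hand, which matters if a field can itself contain a quote.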

how to modify textfile using U-SQL

I have a large file of around 130 MB in which each line contains 10 A characters followed by a \t after the 10th A. I want to extract this text file and then change all the A's to B's. Can anyone help with a code snippet?
This is what I have written so far:
USE DATABASE imodelanalytics;
@searchlog =
EXTRACT characters string
FROM "/iModelAnalytics/Samples/Data/dummy.txt"
USING Extractors.Text(delimiter: '\t', skipFirstNRows: 1);
@modify =
SELECT characters AS line
FROM @searchlog;
OUTPUT @modify
TO "/iModelAnalytics/Samples/Data/B.txt"
USING Outputters.Text();
I'm new to this, so any suggestions will be helpful! Thanks
Assuming every field is AAAAAAAAAA, you could write:
@modify = SELECT "BBBBBBBBBB" AS characters FROM @searchlog;
If only some of the fields are all As, you would do it in the SELECT clause:
@modify =
SELECT (characters == "AAAAAAAAAA" ? "BBBBBBBBBB" : characters) AS characters
FROM @searchlog;
If there are other characters around the AAAAAAAAAA, you would use more of the C# string functions to find and replace them in a similar pattern.
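The two cases above can be sketched outside U-SQL as plain string logic. This is a Python illustration only, and the function names are hypothetical: the first mirrors the conditional expression (swap only an exact match), the second covers As surrounded by other characters.

```python
def replace_exact(characters: str) -> str:
    # Mirrors the conditional: only a field that is exactly ten As is swapped.
    return "BBBBBBBBBB" if characters == "AAAAAAAAAA" else characters


def replace_anywhere(characters: str) -> str:
    # Swaps every A, regardless of what surrounds it.
    return characters.replace("A", "B")


print(replace_exact("AAAAAAAAAA"))
print(replace_exact("xAAAAAAAAAA"))
print(replace_anywhere("xAAAAAAAAAAy"))
```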

Find index of a specific character in a string then parse the string

I have strings which look like this: [NAME LASTNAME/NAME.LAST@emailaddress/123456678]. I want to parse strings in this format so that I only get NAME LASTNAME. My pseudocode idea is to find the index of the first /, then take the substring from index 1 up to that index. I want this as a VBScript.
Your way should work. You can also Split() your string on / and just grab the first element of the resulting array:
Const SOME_STRING = "John Doe/John.Doe@example.com/12345678"
WScript.Echo Split(SOME_STRING, "/")(0)
Output:
John Doe
Edit, with respect to comments.
If your string contains the [, you can still Split(). Just use Mid() to grab the first element starting at character position 2:
Const SOME_STRING = "[John Doe/John.Doe@example.com/12345678]"
WScript.Echo Mid(Split(SOME_STRING, "/")(0), 2)
Your idea is good. You should also find the index of "[", which makes the script more robust and flexible. The code below always returns the text between the first occurrence of "[" and the first occurrence of "/".
var = "[John Doe/John.Doe@example.com/12345678]"
WScript.Echo Mid(var, (InStr(var,"[")+1),InStr(var,"/")-InStr(var,"[")-1)
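For comparison, the index-based approach can be sketched in Python (not VBScript; the function name extract_name is hypothetical). It finds the first "[" and the first "/" after it and slices between them, assuming both delimiters are present.

```python
def extract_name(text: str) -> str:
    # str.index raises ValueError if a delimiter is missing.
    start = text.index("[") + 1
    end = text.index("/", start)
    return text[start:end]


print(extract_name("[John Doe/John.Doe@example.com/12345678]"))
```

Searching for "/" only from start onward keeps a slash that appears before the bracket from being picked up.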

Reading from a string using sscanf in Matlab

I'm trying to read a string in a specific format:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>
This is one example of the string, and what I want to extract is the name of the team.
I've tried something like this,
houseteam = sscanf(str, '%s');
but it does not work. Why?
You can use regexprep for this. Your post asks about sscanf, but from the comments you would like to see it done with regexprep. It takes two nested regexprep calls. Given that str is in the format you have provided, you can retrieve the team name (i.e. RealSociedad) like so:
str = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>';
houseteam = regexprep(regexprep(str, '^<a(.*)">', ''), '</a>$', '')
This looks very intimidating, but let's break this up. First, look at this statement:
regexprep(str, '^<a(.*)">', '')
How regexprep works is you specify the string you want to analyze, the pattern you are searching for, then what you want to replace this pattern with. The pattern we are looking for is:
^<a(.*)">
This says you are looking for a pattern at the beginning of the string that starts with <a. After this, (.*)"> performs a greedy match: it finds the longest sequence of characters up to the characters ">. As such, the regular expression matches the following string:
<ahref="/teams/spain/real-sociedad-de-futbol/2028/">
We then replace this with a blank string. As such, the output of the first regexprep call will be this:
RealSociedad</a>
We want to get rid of the </a> string, and so we would make another regexprep call where we look for the </a> at the end of the string, then replace this with the blank string yet again. The pattern you are looking for is thus:
</a>$
The dollar sign ($) symbolizes that this pattern should appear at the end of the string. If we find such a pattern, we will replace it with the blank string. Therefore, what we get in the end is:
RealSociedad
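The two-step removal can be sketched in Python with re.sub, which takes the same pattern, replacement, and input as regexprep but in a different argument order. This is an illustration, not MATLAB; it assumes the input is the opening tag shown in the walkthrough followed by RealSociedad</a>.

```python
import re

s = '<ahref="/teams/spain/real-sociedad-de-futbol/2028/">RealSociedad</a>'

# Step 1: strip everything from the leading <a up to the last "> (greedy match).
without_open = re.sub(r'^<a(.*)">', "", s)

# Step 2: strip the closing tag anchored at the end of the string.
team = re.sub(r"</a>$", "", without_open)

print(without_open)
print(team)
```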
Found a solution. %s stops when it finds a space, so I put a space before every < and after every >:
str = regexprep(str, '<', ' <');
str = regexprep(str, '>', '> ');
houseteam = sscanf(str, '%*s %s %*s');
This puts spaces around the desired string, so sscanf can skip the tags with %*s and read the name with %s.

Is there a way to add quotes to a multi paragraph string

I wrote the following line:
string QuoteTest2 = "Benjamin Netnayahu,\"BB\", said that: \"Israel will not fall\"";
This example went well, but what can I do in case I want to write a multi paragraph string including quotes?
The following example shows that putting '@' before the string doesn't cut it:
string QuoteTest2 = @"Benjamin Netnayahu,\"BB\", said that: \"Israel will not fall\"";
The string ends at the second quote and the rest just gives me errors. What should I do?
In a verbatim (@) string, escape a double quote by doubling it: ""
e.g.
string QuoteTest2 = @"Benjamin Netnayahu,""BB"", said that: ""Israel will not fall""";
