How to remove newlines from a group of n consecutive lines in Sublime Text

I have a text file where each line contains a UUID in single quotes followed by a comma. A sample of this file would look like the following:
'527a34922f3472506d93f393c1dd5cac',
'7bdce3215c3007ccfb3449702234a2b4',
'b74d228b5c6dbfd95ac989eb7b4837ac',
'59c7694db4effe03984d05b43c46c1ce',
'b038091601beb11c00d28d8ea277cecb',
'c3b489c4b7526adb36b049c76c75835d',
'cfdf54d36262c474103fba5486f3fa48',
'10d3d4c4aa0f162d5ab3a403010c2202',
'1103abf37755c8477f0177478a0f91cd',
...
I'm using Sublime Text 4, and I need to group every 3 lines and merge them into a single line. So the output I'm expecting is:
'527a34922f3472506d93f393c1dd5cac', '7bdce3215c3007ccfb3449702234a2b4', 'b74d228b5c6dbfd95ac989eb7b4837ac',
'59c7694db4effe03984d05b43c46c1ce', 'b038091601beb11c00d28d8ea277cecb', 'c3b489c4b7526adb36b049c76c75835d',
'cfdf54d36262c474103fba5486f3fa48', '10d3d4c4aa0f162d5ab3a403010c2202', '1103abf37755c8477f0177478a0f91cd',
...
How do I achieve that? I can select every group of 3 lines using this pattern: ((.*\n){1, 3}), but I'm not able to perform the grouping on each selection.

You can capture the body of each line and match its trailing newline; do this for 3 lines, and replace the match with the three captured bodies separated by spaces, ending with a newline:
Find:
(.*)\n(.*)\n(.*)\n
Replace:
\1 \2 \3\n
Demo: https://regex101.com/r/lwmRCQ/1
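If you'd rather do the grouping outside Sublime Text, here is a minimal Python sketch of the same idea (the file names uuids.txt and uuids_grouped.txt are placeholders, not from the question):
with open("uuids.txt") as src, open("uuids_grouped.txt", "w") as dst:
    lines = [line.rstrip("\n") for line in src]
    for i in range(0, len(lines), 3):
        # join up to 3 consecutive lines with a single space, then restore the newline
        dst.write(" ".join(lines[i:i + 3]) + "\n")
Changing the step of range adjusts the group size, which the three-capture-group pattern cannot do without editing the regex.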

Related

Groovy replace using Regex

I have a variable which contains raw data as shown below.
I want to replace the commas inside double quotes with nothing/blank.
I used replaceAll(',',''), but all the other commas get replaced as well.
So I need a regex that identifies patterns like "123,456,775" and replaces the commas inside them with blanks.
var = '
2/5/2023,25,"717,990","18,132,406"
2/4/2023,27,"725,674","19,403,116"
2/3/2023,35,"728,501","25,578,008"
1/31/2023,37,"716,580","26,358,186"
2/1/2023,37,"720,466","26,494,010"
1/30/2023,37,"715,685","26,517,878"
2/2/2023,37,"723,545","26,603,765" '
I tried replaceAll, but it did not work.
If you just want to replace "," with "", you have to escape the quotes; this will do:
var.replaceAll(/\",\"/, /\"\"/)
If you want to replace the commas inside the number strings, e.g. turn "725,674" into "725674", you will have to use a regex with capture groups, like this:
var.replaceAll(/(\"\d+),(\d+\")/, /$1$2/)
That pattern will have to change for three groupings: for values like "18,132,406" you will need three capture groups.
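If the number of comma groups varies, another option is to strip every comma inside each double-quoted field with a callback. Here is a rough Python sketch of that idea (one sample row taken from the question; the callback approach is a suggestion of mine, not part of the answer above):
import re

row = '2/5/2023,25,"717,990","18,132,406"'
# remove all commas inside each double-quoted field, however many there are
cleaned = re.sub(r'"[^"]*"', lambda m: m.group(0).replace(",", ""), row)
print(cleaned)  # 2/5/2023,25,"717990","18132406"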

Strip characters to the left of a specific character in a pandas column

I have the following data:
key German
0 0:- Profile 1
1 1:- Archetype Realist*in
2 2:- RIASEC Code: R- Realistic
3 3:- Subline Deine Stärke? Du bleibst dir selber treu.
4 4:- Copy Dein Erfolg basiert auf deiner praktischen Ver...
In the "Key" column I would like to remove the numbers and the colon-dash which follows. This order is always the same (from the left). So for the first row I would like to remove "0:- " and just leave "Profile 1". I am struggling to find the correct regex to do what I want. Originally I tried the following:
df_json['key'] = df_json['key'].map(lambda x: x.strip(':- ')[1])
However, this approach is too restrictive since there can be multiple words in the field.
I would like to use pd.Series.str.replace(), but I can't figure out the correct regex to achieve the desired results. Any help would be greatly appreciated.
With your shown samples, please try the following, using the replace function of Pandas. A simple explanation would be: apply replace to the German column of the dataframe and use the regex ^[0-9]+:-\s+ to replace the matched prefix with an empty string.
df['German'].replace(r'(^[0-9]+:-\s+)', '', regex=True)
Explanation:
^[0-9]+: match one or more digits at the start of the value.
:-\s+: match a colon, followed by -, followed by 1 or more whitespace occurrences.
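For illustration, a small runnable sketch of that call (the sample values are borrowed from the question's key strings; the column name follows the answer above):
import pandas as pd

df = pd.DataFrame({'German': ['0:- Profile 1', '1:- Archetype Realist*in']})
# strip the leading "<digits>:- " prefix from each value
df['German'] = df['German'].replace(r'(^[0-9]+:-\s+)', '', regex=True)
print(df['German'].tolist())  # ['Profile 1', 'Archetype Realist*in']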
What about just using pandas.Series.str.partition instead of regular expressions:
df['German'] = df['German'].str.partition()[2]
This would split the series on the 1st space only and grab the trailing part. As an alternative to partition you could also just split:
df['German'] = df['German'].str.split(' ', n=1).str[1]
If regex is a must for you, maybe use a lazy quantifier to match up to the 1st space character:
df['German'] = df['German'].replace('^.*? +','', regex=True)
Where:
^ - Start line anchor.
.*? - Any 0+ (lazy) characters other than newline, up to;
' +' - 1+ literal space characters (note the leading space).
Here is an online demo
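As a quick sketch, partition and the lazy-regex replace give the same result on a couple of made-up sample values:
import pandas as pd

s = pd.Series(['0:- Profile 1', '2:- RIASEC Code: R- Realistic'])
print(s.str.partition()[2].tolist())                  # ['Profile 1', 'RIASEC Code: R- Realistic']
print(s.replace(r'^.*? +', '', regex=True).tolist())  # same output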
You need
df_json['key'] = df_json['key'].str.replace(r'^\d+:-\s*', '', regex=True)
See the regex demo and the regex graph.
Details:
^ - start of string
\d+ - one or more digits
: - a colon
- - a hyphen
\s* - zero or more whitespaces
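A short runnable check of this answer on values shaped like the question's key column (the dataframe here is a stand-in I constructed, not the asker's data):
import pandas as pd

df_json = pd.DataFrame({'key': ['0:- Profile 1', '3:- Subline Deine Stärke?']})
# remove the leading digits, the ":-", and any whitespace that follows
df_json['key'] = df_json['key'].str.replace(r'^\d+:-\s*', '', regex=True)
print(df_json['key'].tolist())  # ['Profile 1', 'Subline Deine Stärke?']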
Extract the non-whitespace \S and non-digit \D characters which sit immediately to the right of the unwanted prefix:
df['GermanFiltered'] = df['German'].str.extract(r"((?<=^\d\:\-\s)\S+\D+)")

Match on lines that have one string and NOT another

I am able to match strings in lines of a file like so:
re.search(r"\b10/100/1000\b", line) and re.search(r"notco*", line):
However, I need to be able to match lines that have one string, UNLESS they have another.
Example: Match pattern of '40G' unless the line also contains the pattern 'Po'
Just negate the second search:
re.search("40G",line) and not re.search("Po",line)
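For illustration, the same condition can also be packed into a single pattern with a negative lookahead; the sample lines below are invented:
import re

lines = ["Eth1/1  40G  connected", "Po10    40G  trunk", "Eth1/2  10G  notconnect"]

# two-step test, exactly as above
wanted = [l for l in lines if re.search("40G", l) and not re.search("Po", l)]
print(wanted)  # ['Eth1/1  40G  connected']

# single-pattern equivalent: match 40G only when Po is absent from the line
wanted = [l for l in lines if re.search(r"^(?!.*Po).*40G", l)]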
If there's no need for regex, then... there's no need for regex; use in:
"40G" in line and "Po" not in line

how to modify textfile using U-SQL

I have a large file of around 130 MB containing 10 "A" characters in each line, with a \t after the 10th "A". I want to extract this text file and then change all the A's to B's. Can anyone help with a code snippet?
This is what I have written so far:
USE DATABASE imodelanalytics;
@searchlog =
    EXTRACT characters string
    FROM "/iModelAnalytics/Samples/Data/dummy.txt"
    USING Extractors.Text(delimiter: '\t', skipFirstNRows: 1);
@modify =
    SELECT characters AS line
    FROM @searchlog;
OUTPUT @modify
    TO "/iModelAnalytics/Samples/Data/B.txt"
    USING Outputters.Text();
I'm new to this, so any suggestions will be helpful! Thanks
Assuming every field is AAAAAAAAAA, then you could write:
@modify = SELECT "BBBBBBBBBB" AS characters FROM @searchlog;
If only some of them are all As, then you would do it in the SELECT clause:
@modify =
    SELECT (characters == "AAAAAAAAAA" ? "BBBBBBBBBB" : characters) AS characters
    FROM @searchlog;
If there are other characters around the AAAAAAAAAA then you would use more of the C# string functions to find them and replace them in a similar pattern.
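If you want to sanity-check the expected output before running the U-SQL job, a small local Python sketch of the same A-to-B replacement (the input and output paths are placeholders I made up) could be:
with open("dummy.txt") as src, open("B.txt", "w") as dst:
    for line in src:
        # turn every A into a B, leaving tabs and any other characters untouched
        dst.write(line.replace("A", "B"))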

how to search on the folded range in gvim?

Here is the material to test the search function:
Move the cursor to line 5.
Press zfG to fold from line 5 until the end of the file (please see attachment 1).
When I input ?case, I get the result shown in attachment 2.
Now let me add a line containing only the single word case as the first line, and fold line 6 until the end of the file as before. When I input ?case, I only get line 1; why can't the search find line 7?
How can I search on the folded lines?
The following is the raw material to test:
Note the use of ( ) with the pipe symbol to specify the 'or' condition
/[0-9]*/ Matches if there are zero or more numbers in the line
/^[^#]/ Matches if the first character is not a # in the line
Notes:
1. Regular expressions are case sensitive
2. Regular expressions are to be used where pattern is specified
You can jump back to the beginning of the file by typing any one of the following commands
