Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 10 days ago.
I want to use a regular expression to extract each substring that lies between a closing square bracket and the next opening square bracket, without the surrounding blank spaces. A string can contain multiple bracketed groups.
Example
Input
str1 = '[abc] xyz [zas] bad [ras] kbc'
Output
[xyz, bad, kbc]
One approach here would actually be to use a regex replacement to strip off the [...] terms. Then, split on space to get a list of words/terms you want to keep.
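import re
str1 = '[abc] xyz [zas] bad [ras] kbc'
# Replace each [...] group (and the whitespace around it) with one space, then split on whitespace.
words = re.sub(r'\s*\[.*?\]\s*', ' ', str1).split()
print(words)  # ['xyz', 'bad', 'kbc']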
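If a segment between brackets may contain several words and should stay together as one item, a variation is to split on the bracketed groups instead of on whitespace. This is a minimal sketch, assuming the brackets are not nested; the function name text_between_brackets is made up for illustration.

```python
import re

def text_between_brackets(s):
    """Return the non-empty, stripped pieces of s that lie outside [...] groups."""
    # Split wherever a [...] group occurs (assumes no nested brackets).
    pieces = re.split(r'\[[^\]]*\]', s)
    # Trim surrounding whitespace and drop pieces that end up empty.
    return [p.strip() for p in pieces if p.strip()]

print(text_between_brackets('[abc] xyz [zas] bad [ras] kbc'))  # ['xyz', 'bad', 'kbc']
```

With an input such as '[a] foo bar [b]' this keeps 'foo bar' as a single entry, whereas splitting on whitespace would produce two entries.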
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 months ago.
I want to remove a set of matching characters from the end of a string in a shell script. It should work on all Linux flavours, ideally without using tools like sed or awk.
I found some examples on the web, but all of them are about removing a single type of character.
Below is a set of examples that shows what I am trying to achieve.
Please help.
1. Input : test_-
Output: test
2. Input: test-_-
Output: test
3. Input: test1__-
Output: test1
I want to remove all the "hyphen" and "underscore" characters from the end of the string.
Since you are tagging this zsh:
Assuming that your string is stored in a variable input, you can do:
if [[ $input =~ ^((.*[^-_])) ]]
then
output=$MATCH
fi
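The .* is greedy, so the match extends as far as possible while still ending in a character that is neither a hyphen nor an underscore; any trailing hyphens and underscores are left out of the match.
In bash, this works similarly, except that you have to set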
output=${BASH_REMATCH[1]}
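The greedy-match idea is not specific to zsh or bash. As a rough illustration only (Python rather than shell, so it does not meet the question's wish to avoid external tools), the same pattern can be checked against the sample inputs. The helper name strip_trailing is hypothetical.

```python
import re

def strip_trailing(s):
    """Drop trailing '-' and '_' characters using the greedy pattern from the answer."""
    # .* is greedy, so group 1 ends at the last character that is neither '-' nor '_'.
    m = re.match(r'^(.*[^-_])', s)
    # No match means the string is empty or consists only of '-' and '_'.
    return m.group(1) if m else ''

for sample in ['test_-', 'test-_-', 'test1__-']:
    print(sample, '->', strip_trailing(sample))
```

In Python itself, s.rstrip('-_') gives the same result more directly; the regex form is shown only because it mirrors the shell pattern.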
Supposing your data is in a file, like
test_-
test-_-
test1__-
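you can use grep. Note that this pattern keeps only a run of lowercase letters followed by digits, which is enough for the sample data: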
grep -oP '[a-z]*[0-9]*' data.txt
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I have data like this:
9372127603
9372130412
9372140
9372175041
937218
9372190908
9372191764
I need output like this:
9372127603
9372130412
9372175041
9372190908
9372191764
How can I achieve this in Sublime Text?
You can do this via regex Find and Replace. Open Find → Replace…, make sure the regex option is on, enter ^\d{1,9}\n into the Find: field, and make sure the Replace: field is empty.
Explanation:
^ beginning of line
\d any digit
{1,9} match preceding between 1 and 9 times, inclusive
\n newline character
Once you've done that, hit Replace All and any number fewer than 10 digits long will be removed.
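To sanity-check the pattern outside the editor, the same substitution can be sketched in Python. This assumes the data is a newline-terminated string and that Python's re module handles this particular pattern the same way as Sublime Text's regex engine; the variable names are illustrative.

```python
import re

data = (
    "9372127603\n9372130412\n9372140\n9372175041\n"
    "937218\n9372190908\n9372191764\n"
)

# re.MULTILINE makes ^ match at the start of every line, not just the start of the string.
cleaned = re.sub(r'^\d{1,9}\n', '', data, flags=re.MULTILINE)
print(cleaned)
```

Because the pattern ends in \n, a short number on the very last line is only removed if that line is followed by a newline.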
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I've tried looking, but all I came across was how to replace data inside brackets.
What I've been trying to do is convert
Hello Hello <Hello> hello <hello>
Into
Abc Abc <Hello> abc <hello>
My first thought is to add > at the start and < at the end, and then use a regex to replace only the strings between > and <. Though I have little to no idea how to use regex, I think I could manage that after some searching. Still, is there a neater way to do this?
Thanks.
Assuming the brackets are well-formed and aren't nested, match word characters that aren't inside brackets by using negative lookahead for [^<>]*>.
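import re
text = 'Hello Hello <Hello> hello <hello>'  # renamed from input to avoid shadowing the built-in
# Match a word only if the next angle bracket after it is not a closing '>'.
print(re.sub(r'\w+(?![^<>]*>)', 'Abc', text))  # Abc Abc <Hello> Abc <hello>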
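The substitution above always inserts Abc, so the lowercase hello outside the brackets also becomes Abc. If the replacement should follow the capitalization of the word it replaces, as in the expected output of the question, one option is to pass a function to re.sub. This is a minimal sketch, assuming only the case of the first letter matters; replace_outside_brackets is a hypothetical name.

```python
import re

def replace_outside_brackets(text, word='abc'):
    """Replace every word outside <...> with `word`, copying the first letter's case."""
    def repl(match):
        # Capitalize the replacement when the original word starts with an uppercase letter.
        return word.capitalize() if match.group(0)[0].isupper() else word
    # Same lookahead as above: skip words that are followed by a closing '>'.
    return re.sub(r'\w+(?![^<>]*>)', repl, text)

print(replace_outside_brackets('Hello Hello <Hello> hello <hello>'))
# Abc Abc <Hello> abc <hello>
```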
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I want to get a word between two strings, like this:
local str = "Hello Stackoverflow guys"
Suppose the word between the two strings (Stackoverflow) is unknown and I want to extract it.
Is there a function for this?
You can use string patterns with captures for this.
https://www.lua.org/manual/5.3/manual.html#6.4.1
string.match("Hello Stackoverflow guys", "Hello (%a+) guys")
Returns any word of at least 1 letter that is between "Hello " and " guys".
In this case it's "Stackoverflow".
You can use different patterns of course to include numbers or other characters. Whatever you consider a word.
Of course it is also possible to get the second word without specifying "Hello " and " guys" or whatever. Just read the manual.
If you don't know the words, you can do
string.match("Hello Stackoverflow guys", "%s+(%S+)")
This finds the first run of whitespace and captures the run of non-whitespace characters that follows it.
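For readers more familiar with regular expressions than with Lua patterns, the two calls above correspond roughly to the following Python sketch. The mapping is an approximation: in Lua, what %a counts as a letter depends on the current locale, while [A-Za-z] here is ASCII-only. The variable names are illustrative.

```python
import re

s = "Hello Stackoverflow guys"

# Counterpart of "Hello (%a+) guys": capture the letters between the two known words.
between = re.search(r'Hello ([A-Za-z]+) guys', s)
# Counterpart of "%s+(%S+)": capture the first run of non-whitespace that follows whitespace.
second = re.search(r'\s+(\S+)', s)

print(between.group(1), second.group(1))  # Stackoverflow Stackoverflow
```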
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
If the first column is 3549, 2152, or 4701, then remove the entry.
Sample data:
18106|1.0.4.0/22
3549|1.0.10.0/24
5413|1.0.0.0/16
2152|1.4.0.0/16
3549|1.0.8.0/22
4701|1.0.0.0/8
Expected output:
18106|1.0.4.0/22
5413|1.0.0.0/16
How to achieve this?
For your pattern to match only on the first field you have to anchor the expression to the start of the line:
grep -v -E '^(3549|2152|4701)\|'
The ^ marks the beginning of the line (and $ would mark the end of the line).
The -E activates extended regular expressions, so you don't have to backslash-escape the pipes and parentheses used for grouping and alternation, and the -v inverts the search (returning only lines that do not match).
After the ^, the parentheses with pipe symbols mark the alternatives (3549, 2152 or 4701). The \| stands for the literal pipe character that ends your first field; it needs the backslash so it isn't treated as another alternation.
Be careful to use single quotes around the pattern, because otherwise the shell itself will interpret some of the special characters.
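The same filter can be sketched in Python by comparing the first pipe-separated field against a set of blocked values. This is an illustration, not a replacement for the grep one-liner; drop_by_first_field and the inline sample list are assumptions made for the example.

```python
def drop_by_first_field(lines, blocked):
    """Keep only the lines whose first '|'-separated field is not in blocked."""
    return [line for line in lines if line.split('|', 1)[0] not in blocked]

sample = [
    "18106|1.0.4.0/22",
    "3549|1.0.10.0/24",
    "5413|1.0.0.0/16",
    "2152|1.4.0.0/16",
    "3549|1.0.8.0/22",
    "4701|1.0.0.0/8",
]

for line in drop_by_first_field(sample, {"3549", "2152", "4701"}):
    print(line)
```

Comparing the whole first field plays the same role as anchoring with ^ and the trailing \| in the grep pattern: a value such as 13549 or 35490 is not dropped by accident.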