grep specific part out of a line of text - text

This is my first question here, so please bear with me.
I have a large text file from which I need only one specific part of one line. I can grep the line, but I do not know how to get that specific part out of it.
Here is my text line (stored in output.txt):
><source src="https://download.foobar.com/content/mp4/web01/2017/05/08/24599/mp4_web01.mp4" type="video/mp4" data-label="Laag - 360p" /><source src="https://download.foobar.com/content/mp4/web02/2017/05/08/24599/mp4_web02.mp4" type="video/mp4" data-label="Hoog - 720p" /><source src="https://download.foobar.com/content/mp4/web03/2017/05/08/24599/mp4_web03.mp4" type="video/mp4" data-label="Normaal - 480p" /></video></div></div>
the part I need to extract from this line is:
https://download.foobar.com/content/mp4/web02/2017/05/08/24599/mp4_web02.mp4
Now I can do a grep like this but that gives me back three lines:
grep -Po '><source src="\K[^"]+' output.txt
gives me:
https://download.foobar.com/content/mp4/web01/2017/05/08/24599/mp4_web01.mp4
https://download.foobar.com/content/mp4/web02/2017/05/08/24599/mp4_web02.mp4
https://download.foobar.com/content/mp4/web03/2017/05/08/24599/mp4_web03.mp4
I would like to get only the line I am looking for, without adding an extra sed command to remove the first and third lines of the results.
How can I grep the input line and get back only the intended line? I only need the link to the mp4_web02.mp4 file.
Can anyone help me get this into one grep command?
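One possible approach, sketched under the assumption that GNU grep with -P support is available (the question already uses it): anchor the match on the file name itself, so only the src value ending in mp4_web02.mp4 can match. The temporary file below is only a stand-in for output.txt.

```shell
# Stand-in for output.txt, holding the line from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
><source src="https://download.foobar.com/content/mp4/web01/2017/05/08/24599/mp4_web01.mp4" type="video/mp4" data-label="Laag - 360p" /><source src="https://download.foobar.com/content/mp4/web02/2017/05/08/24599/mp4_web02.mp4" type="video/mp4" data-label="Hoog - 720p" /><source src="https://download.foobar.com/content/mp4/web03/2017/05/08/24599/mp4_web03.mp4" type="video/mp4" data-label="Normaal - 480p" /></video></div></div>
EOF

# \K drops 'src="' from the reported match. [^"]* cannot cross a closing
# quote, so the match has to be one single src value ending in mp4_web02.mp4.
url=$(grep -Po 'src="\K[^"]*mp4_web02\.mp4(?=")' "$sample")
echo "$url"
rm -f "$sample"
```

If the file name is not known in advance, matching on the data-label attribute instead would follow the same idea, but that variant is not shown here.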

Related

2g is not working in sed to skip the first occurrence of a match

I am trying to replace some text in an XML file using sed. I am able to replace my text, but I want to skip the first occurrence. I am using 2g, but it is not working. No error is displayed, but no change happens to the file.
My XML file:
<file-min-size>10830</file-min-size>
<rotate-log>true</rotate-log>
<file-min-size>25600</file-min-size>
<rotate-log>true</rotate-log>
<file-min-size>32300</file-min-size>
<rotate-log>true</rotate-log>
<file-min-size>13456</file-min-size>
<rotate-log>true</rotate-log>
My expected output :
<file-min-size>10830</file-min-size>
<rotate-log>true</rotate-log>
<file-min-size>25600</file-min-size>
<rotate-log>true insertvalue</rotate-log>
<file-min-size>32300</file-min-size>
<rotate-log>true insertvalue</rotate-log>
<file-min-size>13456</file-min-size>
<rotate-log>true insertvalue</rotate-log>
I am using the below sed command.
sed -i 's#</rotate-log>#insertvalue</rotate-log>#2g' myfile.xml
The above command is not working. If I remove 2g, the text is replaced. I want to skip the first occurrence. Any help?
Also, when I run the command a second time, the values are inserted again. Is there a way to check and replace only if the value is not already present?
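The 2g flag counts matches within each line, not across the whole file, and no line here contains a second match, so nothing is replaced. With GNU sed, you may use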
sed -i '/<\/rotate-log>/{:A;n;s#</rotate-log># insertvalue</rotate-log>#;bA}' file
The command finds the line with </rotate-log> and then
:A - sets a label A
n - prints the current pattern space and reads the next line into it
s#</rotate-log># insertvalue</rotate-log># - replaces </rotate-log> with " insertvalue</rotate-log>" (the # characters are only the delimiters of the s command)
bA - goes to A label (reads the next line, replaces, goes on).
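The question also asks how to avoid inserting the value again on a second run. A minimal sketch of one way to do that, assuming GNU sed: guard the substitution with /insertvalue/! so lines that already contain the text are skipped. The function name add_insertvalue is hypothetical, and the script is spread over several lines only for readability.

```shell
# Hypothetical helper: reads XML on stdin, writes the edited text to stdout.
add_insertvalue() {
  sed '/<\/rotate-log>/{
    :A
    n
    /insertvalue/!s#</rotate-log># insertvalue</rotate-log>#
    bA
  }'
}

input='<file-min-size>10830</file-min-size>
<rotate-log>true</rotate-log>
<file-min-size>25600</file-min-size>
<rotate-log>true</rotate-log>
<file-min-size>32300</file-min-size>
<rotate-log>true</rotate-log>'

# Running the helper on its own output should change nothing further.
once=$(printf '%s\n' "$input" | add_insertvalue)
twice=$(printf '%s\n' "$once" | add_insertvalue)
printf '%s\n' "$once"
```

The guard is a plain text check, so it would also skip a line that happens to contain insertvalue for some other reason.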

Assign new variable from each line of a text file

What I'm basically trying to do is automatically detect if there is text in a line, and if so create a new variable containing the text in said line , within a script. If there is no text in a line then the variable doesn't get created. I can do this manually by opening the file -
$ cat file.txt
sometxt
somemoretext
evenmoretext
...
then adding to my script the appropriate lines -
TXT=file.txt
VAR1=$(sed -n 1p $TXT)
VAR2=$(sed -n 2p $TXT)
...
but this is a pain, since I have to count how many lines there are in total, then copy and paste each line, assigning the variables and changing 'VAR1' to 'VAR2' and '1p' to '2p'. There has to be an easier way. Thanks
@JNevil thanks for pointing me in the right direction.
Here's what ended up working for me -
for var_name in $(cat links.txt); do
wget <servername.com>$var_name
done
Still don't know how to use curl, but this worked fine!
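A for loop over $(cat ...) splits on any whitespace, so a line containing spaces would be broken into several words. A while/read loop is a common alternative that handles one whole line at a time and can skip empty lines explicitly. The sketch below uses a temporary file as a hypothetical stand-in for links.txt and only echoes each line where the real command (such as wget) would go.

```shell
# Hypothetical stand-in for links.txt, including one empty line.
links_file=$(mktemp)
printf '%s\n' 'sometxt' '' 'somemoretext' 'evenmoretext' > "$links_file"

count=0
last=""
# IFS= and -r keep each line intact; the || clause also catches a final
# line that has no trailing newline.
while IFS= read -r line || [ -n "$line" ]; do
  [ -n "$line" ] || continue      # skip empty lines
  count=$((count + 1))
  last=$line
  echo "line $count: $line"       # the real per-line command would go here
done < "$links_file"
rm -f "$links_file"
```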

finding number of occurences in large text file in linux

I have a 17 GB txt file and I cannot seem to load it in vim. I researched the solutions provided here; however, I do not understand them very well, and I am not good with Linux or Perl.
I understand I would have to use grep or something.
grep -oP "/^2" file
I have got as far as this command, but I cannot find a way to output the number of occurrences without printing all the lines to the screen.
I would like to find the number of lines that start with the digit 2 in the file and output that number to the shell.
If you want to continue using PCRE:
grep -cP '^2' file
Using grep's "basic regular expressions":
grep -c '^2' file
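A small illustration of what -c reports, using a made-up four-line sample file rather than the 17 GB original: it prints the number of matching lines instead of the lines themselves.

```shell
# Made-up sample data; only two lines start with the digit 2.
sample=$(mktemp)
printf '%s\n' '2017 first' '1999 other' '23 second' 'x2 not counted' > "$sample"

# -c counts matching lines; ^ anchors the match to the start of the line.
count=$(grep -c '^2' "$sample")
echo "$count"
rm -f "$sample"
```

Quoting the pattern keeps the shell from interpreting ^ in shells where it is special.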

funky file name output from shell/bash?

So, I'm making a small script to do an entire task for me. The task is to get the output of inxi -Fn into a text file and then use a part of the dmidecode output, in my example the Address (0xE0000), as the file name of the txt.
My script goes as follows and does work; I have tested it. The only little issue I have is that the file name of the txt appears as "? 0xE0000.txt".
My question is: why am I getting a question mark followed by a space in the name?
#!/bin/bash
directory=$(pwd)
name=$(dmidecode|grep -i Address|sed 's/Address://')
inxi -Fn > $directory/"$name".txt
The quotes in "$name".txt are there to avoid an "ambiguous redirect" error I got when running the script.
Update @Just Somebody
root@server:/home/user/Desktop# dmidecode | sed -n 's/Address://p'
0xE0000
root@server:/home/user/Desktop#
Solution
The use of |sed -n 's/^.*Address:.*0x/0x/p' got rid of the "? " in 0xE0000.txt
A big thanks to everyone!
You've got a nonprinting char in there. Try:
dmidecode |grep -i Address|sed 's/Address://'| od -c
to see exactly what you're getting.
UPDATE: comments indicate there's a tab char in there that needs to be cleaned out.
UPDATE 2: the leading tab is before the word Address. Try:
name=$(dmidecode |grep -i Address|sed 's/^.*Address:.*0x/0x/')
or as @just_somebody points out:
name=$(dmidecode|sed -n 's/^.*Address:.*0x/0x/p')
UPDATE 3
This changes the substitution regex to replace
^ (start of line) followed by .* (any characters (including tab!)) followed by Address: followed by .* (any characters (including space!)) followed by 0x (which are always at the beginning of the address since it's in hex)
with
0x (because you want that as part of the result)
If you want to learn more, read about sed regular expressions and substitutions.
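To see the substitution at work without running dmidecode, here is a sketch on a simulated line. The exact layout (a leading tab, then "Address: 0xE0000") is an assumption based on the tab described in this answer, not captured dmidecode output.

```shell
# Simulated dmidecode line: leading tab, label, space, hex value.
line=$(printf '\tAddress: 0xE0000')

# ^.*Address: swallows the leading tab, .*0x swallows the space,
# and the replacement puts the 0x prefix back.
name=$(printf '%s\n' "$line" | sed -n 's/^.*Address:.*0x/0x/p')
printf '%s.txt\n' "$name"
```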

How can I replace a specific line by line number in a text file?

I have a 2GB text file on my linux box that I'm trying to import into my database.
The problem I'm having is that the script that is processing this rdf file is choking on one line:
mismatched tag at line 25462599, column 2, byte 1455502679:
<link r:resource="http://www.epuron.de/"/>
<link r:resource="http://www.oekoworld.com/"/>
</Topic>
=^
I want to replace the </Topic> with </Line>. I can't do a search/replace on all lines, but I do have the line number, so I'm hoping there's some easy way to replace just that one line with the new text.
Any ideas/suggestions?
sed -i yourfile.xml -e '25462599s!</Topic>!</Line>!'
sed -i '25462599 s|</Topic>|</Line>|' nameoffile.txt
The tool for editing text files in Unix is called ed (as opposed to sed, which, as the name implies, is a stream editor).
ed was originally intended as an interactive editor, but it can also easily be scripted. The way ed works is that all commands take an address parameter. The way to address a specific line is just the line number, and the way to change the addressed line(s) is the s command, which takes the same regexp that sed would. So, to change the 42nd line, you would write something like 42s/old/new/.
Here's the entire command:
FILENAME=/path/to/whereever
LINENUMBER=25462599
ed -- "${FILENAME}" <<-HERE
${LINENUMBER}s!</Topic>!</Line>!
w
q
HERE
The advantage of this is that ed is standardized by POSIX, while the -i flag to sed is a non-standard extension: GNU and BSD sed both provide it, but with different syntax, and it is not available on every system.
Use "head" to get the first 25462598 lines and use "tail" to get the remaining lines (starting at 25462600), then put the replacement line between them. Though... for a 2GB file this will likely take a while.
Also are you sure the problem is just with that line and not somewhere previous (ie. the error looks like an XML parse error which might mean the actual problem is someplace else).
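The head/tail idea can be sketched on a tiny file. The four-line sample and n=3 below are made up for illustration; for the file in the question, n would be 25462599.

```shell
# Made-up sample file standing in for the 2GB original.
file=$(mktemp)
printf '%s\n' 'one' 'two' '</Topic>' 'four' > "$file"
n=3   # number of the line to replace

# Lines before n, then the replacement, then everything after n.
result=$(
  head -n "$((n - 1))" "$file"
  printf '%s\n' '</Line>'
  tail -n "+$((n + 1))" "$file"
)
printf '%s\n' "$result"
rm -f "$file"
```

On a real file the three commands would be redirected to a new file instead of being captured in a variable.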
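My shell script:
#!/bin/bash
# Usage: script.sh LINE_NUMBER NEW_CONTENT FILE
# Prints FILE to stdout with line LINE_NUMBER replaced by NEW_CONTENT.
awk -v line="$1" -v new_content="$2" '{
if (NR == line) {
print new_content;
} else {
print $0;
}
}' "$3"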
Arguments:
first: the line number you want to change
second: the text you want instead of the original line contents
third: the file name
This script prints its output to stdout, so you need to redirect it. Example:
./script.sh 5 "New fifth line text!" file.txt
You can improve it, for example, by checking that all your arguments have the expected values.
