Clean letters and characters in files, leaving only numbers, using bash (Linux)

I am reading files and doing something like:
cat file | sed s/\ //g | awk '$0 !~ /[^0-9]/'
With this line I want to strip out anything that is not a number.
But I have a problem: when the file is not sorted the command works fine, but with a sorted file it doesn't work; the output is empty.
Can anyone help me?
Using grep -o '[0-9]+' doesn't work either, because:
I have a file like:
311435ll3e
kk13322;.
erre433
The output is:
311435
3
13322
433
That 3 ends up on a separate line; the output I need is:
3114353
13322
433

As a general rule, there is no reason to have both awk and sed appearing in the same pipe, due to a large overlap of capability, and frequently the same is true of awk/grep/sed combinations.
If you just want to suppress the non-digit characters within lines of characters, use (e.g.) sed -e 's/[^0-9]//g' file. To do it in place with no backup, use sed -i -e 's/[^0-9]//g' file; to do it in place with a backup in a .bak file, use sed -i.bak -e 's/[^0-9]//g' file.
To suppress blank lines, you can append | egrep -v '^$' after the sed, but it's more efficient to just use sed's d command to delete the pattern space and start the next cycle if the pattern space is empty. For example,
sed -e 's/[^0-9]//g; /^$/d' file
does a d if the line is empty after substitution.
The form suggested in 1_CR's comment,
sed -e 's/[^0-9]//g' -e '/./!d'
is an alternative. That form tests if the line has at least one character in it, and if so does not do a d.
If you want to suppress everything in the file that's not digits, use tr -cd 0-9 < file. This suppresses line feeds also.
Note, the form tr -cd [0-9] < file or tr -cd '[0-9]' < file is not correct; it will fail to suppress ] and [ characters because tr will regard them as part of SET1.
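For a quick check, both forms can be run against the sample file from the first question (file.txt here is just a scratch name for those three lines):
printf '%s\n' '311435ll3e' 'kk13322;.' 'erre433' > file.txt
sed -e 's/[^0-9]//g; /^$/d' file.txt
# -> 3114353
#    13322
#    433
# which matches the output asked for in the question
tr -cd 0-9 < file.txt
# -> 311435313322433 (one run of digits, newlines removed too)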

Related

Eliminate multiple spaces from a file and modify the original file

Specify the command that removes multiple spaces from a text file, leaving a single space in their place. Extra requirement: the original file must be modified.
I managed to pull out these three commands:
awk '{$2=$2};1' filename.txt
tr -s '[:space:]' < filename.txt > filename.new && mv filename.new filename.txt
sed -i 's/\s\+/ /g' filename.txt
Not sure if using a temporary file is the best way to do the trick. Is there a more efficient way to solve the problem? It doesn't matter whether it is tr/sed/awk or anything else; you can post all of them.
Example input:
I'm just giving      spaces
Output:
I'm just giving spaces
Edit: Still looking for more answers
I'd use ed over the non-standard sed -i (and the non-portable RE in your example) if you want to alter the original file:
printf "%s\n" '1,$s/[[:space:]]\{2,\}/ /g' w | ed -s filename.txt
or with perl:
perl -pi -e 's/\s{2,}/ /g' filename.txt
The {2,} regular expression construct (\{2,\} for POSIX Basic Regular Expressions like sed and ed use) matches 2 or more of the previous token.
Both of these match any whitespace characters, not just space, because that's how your examples work. If the goal is to only compress multiple spaces, not spaces + tabs, switch out the [[:space:]] and \s for just a single space.
(Anything that modifies a file "in place", be it ed, sed -i, perl -i, or a regular editor, has a good chance that it's going to be using a temporary file under the hood, by the way. They just handle it for you so you don't have to do it manually like with your tr example.)
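As a quick sanity check of the perl form on the sample line (filename.txt here is just a scratch file for the demonstration):
printf "I'm just giving     spaces\n" > filename.txt
perl -pi -e 's/\s{2,}/ /g' filename.txt
cat filename.txt
# -> I'm just giving spaces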

How to delete a line given in a variable with sed?

I am attempting to use sed to delete a line, read from user input, from a file whose name is stored in a variable. Right now all sed does is print the line and nothing else.
This is a code snippet of the command I am using:
FILE="/home/devosion/scripts/files/todo.db"
read DELETELINE
sed -e "$DELETELINE"'d' "$FILE"
Is there something I am missing here?
Edit: Switching out the -e option with -i fixed my woes!
You need to delimit the search.
#!/bin/bash
read -r Line
sed "/$Line/d" file
Will delete any line containing the typed input.
Bear in mind that sed matches on regex though and any special characters will be seen as such.
For example, searching for 1* will actually delete lines matching any number of 1's (including zero, so it deletes every line), not lines containing a literal 1 followed by a star.
Also bear in mind that when the variable expands, it cannot contain the delimiters or the command will break or have unexpected results.
For example if "$Line" contained "/hello" then the sed command will fail with
sed: -e expression #1, char 4: extra characters after command.
You can either escape the / in this case or use different delimiters.
Personally I would use awk for this:
awk -vLine="$Line" '!index($0,Line)' file
Which searches for an exact string and has none of the drawbacks of the sed command.
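For example, with the 1* case mentioned above, the awk version only drops lines that literally contain 1* (file is a scratch name here):
printf '%s\n' 'keep 111' 'drop 1*' > file
Line='1*'
awk -v Line="$Line" '!index($0,Line)' file
# -> keep 111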
You might have success with grep instead of sed
read -p "Enter a regex to remove lines: " filter
grep -v "$filter" "$file"
Storing in-place is a little more work:
tmp=$(mktemp)
grep -v "$filter" "$file" > "$tmp" && mv "$tmp" "$file"
or, with sponge (apt install moreutils)
grep -v "$filter" "$file" | sponge "$file"
Note: try to get out of the habit of using ALLCAPSVARS: one day you'll accidentally use PATH=... and then wonder why your script is broken.
I found this; it allows a range deletion using variables:
#!/bin/bash
lastline=$(whatever you need to do to find the last line)   # or any variation
lines="1,$lastline"
sed -i "$lines"'d' yourfile
This keeps it all in one utility.
Please try this:
sed -i "${DELETELINE}d" "$FILE"
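Putting those pieces together, one possible sketch that prompts for a line and removes exact whole-line matches, writing the result back through a temporary file (the todo.db path is the one from the question):
#!/bin/bash
file="/home/devosion/scripts/files/todo.db"
read -r -p "Line to delete: " line
tmp=$(mktemp)
# keep every line that is not an exact match of the typed text
awk -v line="$line" '$0 != line' "$file" > "$tmp" && mv "$tmp" "$file"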

Linux - Get a line from one file and corresponding line from a second file and pass into cp command

I have two .txt files.
'target.txt' is a list of target files.
'destination.txt' is a list (on corresponding lines) of destinations.
I'd like to create a command that does the following:
cp [line 1 from target.txt] [line 1 from destination.txt]
For each line of the files.
paste target.txt destination.txt | sed -e 's/^/cp /' > cp.cmds
Then, after inspecting cp.cmds for correctness, you can just run it as a shell script.
sh cp.cmds
The paste command merges two files by concatenating corresponding lines.
paste target.txt destination.txt | while read target dest; do
cp $target $dest
done
This will not work if any of the filenames contain spaces, though. If that's a requirement, I would use awk to read the first file into an array, then when reading the second file print a cp command with the corresponding lines and quotes around them, and pipe this to sh to execute it.
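Here is one possible sketch of that awk approach; it assumes the file names contain no double quotes, and you can redirect the output to a file for inspection instead of piping to sh, as with cp.cmds above:
awk 'NR==FNR { tgt[FNR] = $0; next }                 # first file: remember each target line
     { printf "cp \"%s\" \"%s\"\n", tgt[FNR], $0 }   # second file: emit a quoted cp command
    ' target.txt destination.txt | sh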
To handle whitespace in the filenames:
paste -d\\n target.txt destination.txt | xargs -d\\n -n2 -x cp
paste -d\\n interleaves lines of the argument files
xargs -d\\n -n2 reads two complete lines at a time and applies them as two arguments at the end of the command line. The -d flag disables all special processing of quotes, apostrophes and backslashes in the input lines, as well as the eof character (by default _).
The -d command-line option to xargs is a GNU extension. If you are stuck with a POSIX-standard xargs, you can use the following alternative, courtesy of the Open Group (see example 2, near the end of the page):
paste -d\\n target.txt destination.txt |
sed 's/[^[:alnum:]]/\\&/g' |
xargs -E "" -n 2 -x cp
The sed command backslash-escapes every non-alphanumeric character
xargs -E "" disables the end-of-file character handling.
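To see what that escaping step does, feed it a name containing a space:
printf 'my file.txt\n' | sed 's/[^[:alnum:]]/\\&/g'
# prints: my\ file\.txt   (the space and dot are escaped, so xargs keeps the name as one argument)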

How to remove lines from text file not starting with certain characters (sed or grep)

How do I delete all lines in a text file which do not start with the characters #, & or *? I'm looking for a solution using sed or grep.
Deleting lines:
With grep
From http://lowfatlinux.com/linux-grep.html :
The grep command selects and prints lines from a file (or a bunch of files) that match a pattern.
I think you can do something like this:
grep -v '^[#&*]' yourFile.txt > output.txt
You can also use sed to do the same thing (check http://lowfatlinux.com/linux-sed.html ):
sed '/^[#&*]/d' yourFile.txt > output.txt
It's up to you to decide which to use.
Filtering lines:
My mistake, I understood you wanted to delete those lines. But if you want to delete all the other lines instead (i.e., keep only the lines starting with the specified characters), then grep is the way to go:
grep '^[#&*]' yourFile.txt > output.txt
sed -n '/^[#&*].*/p' input.txt > output.txt
This should work.
sed -ni '/^[#&*].*/p' input.txt
This one will edit the input file directly, so be careful.
egrep '^(&|#|\*)' input.txt > output.txt
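As a quick check of the filtering form on a throwaway sample (input.txt is just a scratch file here):
printf '%s\n' '# comment' '& entry' 'plain line' '* item' > input.txt
grep '^[#&*]' input.txt
# -> # comment
#    & entry
#    * item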

Delete whitespace at the beginning of each line of a file, using bash

How can I delete the whitespace at the beginning of each line of a file, using bash?
For instance, file1.txt. Before:
   gg g
  gg g
     t ttt
after:
gg g
gg g
t ttt
sed -i 's/ //g' your_file will do it, modifying the file in place.
To delete only the whitespace at the beginning of each line, use sed -i 's/^ *//' your_file
In the first expression, we replace all spaces with nothing.
In the second one, we only replace at the beginning of the line, using the ^ anchor.
tr (delete all space characters):
$ tr -d ' ' <input.txt >output.txt
$ mv output.txt input.txt
sed (delete leading spaces):
$ sed -i 's/^ *//' input.txt
You can use perl -i for in-place replacement:
perl -p -e 's/^ *//' file
To delete the whitespace at the start of a line only when the line matches a pattern, use the following command.
For example, say your foo.in looks like this:
   This is a test
Lolll
blaahhh
   This is a testtt
After issuing the following command:
sed -e '/This/s/ *//' < foo.in > foo.out
foo.out will be:
This is a test
Lolll
blaahhh
This is a testtt
"Whitespace" can include both spaces AND tabs. The solutions presented to date will only match and operate successfully on spaces; they will fail if the whitespace takes the form of a tab.
The below has been tested on the OP's specimen data set with both spaces AND tabs, matching successfully & operating on both:
sed 's/^[[:blank:]]*//g' yourFile
After testing, supply the -i switch to sed to make the changes persistent.
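A one-line check that [[:blank:]] matches a tab as well as spaces (printf expands \t to a literal tab):
printf '\t  gg g\n' | sed 's/^[[:blank:]]*//'
# -> gg g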
