concatenating all files horizontally and only a specific column

In linux, is there a way to concatenate all the files in a directory that end with .out into one file? It would be even better if the final output file had them horizontally next to one another, rather than vertically. Even further, is it possible to only get the 6th column from each file (each column separated by a space).
I have been doing this in PowerShell and was wondering whether Linux can do it too.
I know I can use
paste *.out > total.out
but how do I just paste the 6th column, which are separated by spaces?

Using bash and awk with temporary files to filter the sixth column of each *.out file.
#!/bin/bash
declare -a TEMPS
for name in *.out; do
# One temporary file per input file, holding only its sixth column
TEMPS+=("$(mktemp "$name.XXXXXXXX")")
awk '{ print $6 }' "$name" > "${TEMPS[-1]}"
done
# Join the temporary files side by side, separated by spaces
paste -d ' ' "${TEMPS[@]}"
# Remove tmp files
rm "${TEMPS[@]}"
Output using the example files from @daniel
6 18 30
12 24 36
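The same result can also be produced in a single awk pass, without temporary files. The following is only a sketch: the function name join_col6 is made up for illustration, and it assumes every input file has the same number of lines and that fields are separated by whitespace.

```shell
# join_col6 FILE...: print field 6 of every file side by side, one file per column.
# Assumption: all files have the same number of lines.
join_col6() {
    awk '
        FNR == 1 { nfiles++ }      # first line of a file: start a new output column
        {
            # Append field 6 of the current file to the row with the same line number.
            row[FNR] = (nfiles == 1) ? $6 : row[FNR] " " $6
            if (FNR > rows) rows = FNR
        }
        END { for (i = 1; i <= rows; i++) print row[i] }
    ' "$@"
}
```

It could be called as join_col6 *.out > total.txt. Writing to a name that does not end in .out keeps the result from being picked up by the *.out glob on a later run.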

Save this script as a .sh file, then run it in your directory. This method uses sponge which you can install in Ubuntu with sudo apt-get install moreutils
saveColumn6.sh
# Make total.out a blank file
rm -f total.out
> total.out
# Go through every file ending in '.out'
for i in *.out
do
# total.out itself matches *.out, so skip it
[ "$i" = total.out ] && continue
# cut out field 6, paste it onto total.out as a new column, and rewrite the file.
cut -d ' ' -f6 "$i" | paste -d' ' total.out - | sponge total.out
done
Here are the input files I used to test this.
in0.out
1 2 3 4 5 6
7 8 9 10 11 12
in1.out
13 14 15 16 17 18
19 20 21 22 23 24
in2.out
25 26 27 28 29 30
31 32 33 34 35 36
Here is the output file I received
total.out
6 18 30
12 24 36
Note that there is a leading space on each line of the new file; I couldn't figure out how to get rid of that.
As Nightcrawler mentioned, Linux isn't the relevant component. You're looking for bash, the command-line shell used by many GNU/Linux based systems.
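The leading space most likely comes from the first iteration: paste joins the still-empty total.out with the first column, so every line starts with an empty field followed by the delimiter. One possible way around it, sketched below, is to write the first column directly and only paste from the second file onwards. The function name build_total is hypothetical, and a plain temporary file plus mv stands in for sponge.

```shell
# build_total OUT FILE...: paste field 6 of each FILE into OUT, left to right.
build_total() {
    local out=$1 f first=1
    shift
    for f in "$@"; do
        [ "$f" = "$out" ] && continue   # never read the output file as an input
        if [ "$first" -eq 1 ]; then
            # First column: write it directly, so no empty leading field is created.
            cut -d ' ' -f6 "$f" > "$out"
            first=0
        else
            # Later columns: paste onto the existing file via a temporary copy.
            cut -d ' ' -f6 "$f" | paste -d ' ' "$out" - > "$out.tmp" &&
                mv "$out.tmp" "$out"
        fi
    done
}
```

Called as build_total total.out *.out, this skips total.out if a previous run left it in the directory.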

Related

Copying files with wildcard * why isn't it working?

There are 3 txt files called
1.txt 2.txt 3.txt
I want to batch copy with the name
1.txt.cp 2.txt.cp 3.txt.cp
using the wildcard *
I entered the command cp *.txt *.txt.cp
but it wasn't working...
cp : target *.txt.cp : is not a directory
what was the problem???

Use: for i in *.txt; do cp "$i" "$i.cp"; done
Example:
$ ls -l *.txt
-rw-r--r-- 1 halley halley 20 out 27 08:14 1.txt
-rw-r--r-- 1 halley halley 25 out 27 08:14 2.txt
-rw-r--r-- 1 halley halley 33 out 27 08:15 3.txt
$ ls -l *.cp
ls: could not access '*.cp': File or directory does not exist
$ for i in *.txt; do cp "$i" "$i.cp"; done
$ ls -l *.cp
-rw-r--r-- 1 halley halley 20 out 27 08:32 1.txt.cp
-rw-r--r-- 1 halley halley 25 out 27 08:32 2.txt.cp
-rw-r--r-- 1 halley halley 33 out 27 08:32 3.txt.cp
$ for i in *.txt; do diff "$i" "$i.cp"; done
$

If you are used to the MS/Windows CMD shell, it is important to note that Unix systems handle wildcards very differently. MS/Windows has kept the MS/DOS rule that wildcards are not interpreted by the shell but are passed to the command. The command sees the wildcard characters and can treat the second * as marking where the match from the first should go, which makes copy ab.* cd.* sensible.
In Unix (and derivatives like Linux) the shell is in charge of handling the wildcards, and it replaces any word containing one with all the possible matches. The good news is that the command does not have to care about that. The downside is that if the current folder contains ab.txt ab.md5 cd.jpg, a command copy ab.* cd.* will be translated into copy ab.txt ab.md5 cd.jpg, which is probably not what you would expect...
The underlying reason is that Unix shells are much more versatile than the good old CMD.EXE inherited from MS/DOS and have simple-to-use for and if compound commands. Just look at @Halley Oliveira's answer for the syntax for your use case.
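To see what a command actually receives after the shell has expanded the wildcards, a small helper like the one below can be useful. The name show_args is made up for this illustration; the commented session assumes bash with default settings and a directory containing only ab.txt, ab.md5 and cd.jpg.

```shell
# show_args: print how many arguments were received, and each one in brackets.
show_args() {
    printf '%d args:' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

# Example session:
#   $ show_args ab.* cd.*
#   3 args: [ab.md5] [ab.txt] [cd.jpg]
#   $ show_args *.txt *.txt.cp
#   2 args: [ab.txt] [*.txt.cp]
```

The second call shows why cp *.txt *.txt.cp fails: a pattern that matches nothing is passed through literally, so cp sees the existing .txt files followed by one literal *.txt.cp, which it takes as the target.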

sed permission denied on temporary file

With sed I try to replace the value 0.1.233... On the command line there is no problem; however, when putting this command in a shell script, I get an error:
sed: couldn't open temporary file ../project/cas-dp-ap/sedwi3jVw: Permission denied
I don't understand where this temporary sedwi file comes from.
Do you have any idea why I have this temporary file and how I can get around it?
$(sed -i "s/$current_version/$version/" $PATHPROJET$CREATE_PACKAGE/Chart.yaml)
++ sed -i s/0.1.233/0.1.234/ ../project/cas-dp-ap/Chart.yaml
sed: couldn't open temporary file ../project/cas-dp-ap/sedwi3jVw: Permission denied
+ printf 'The version has been updated to : 0.1.234 \n\n \n\n'
The version has been updated to : 0.1.234
+ printf '***********************************'

sed -i performs "in-place editing", but the edit isn't really done in place. What happens is more like:
create a temporary file
run sed on original file and put changes into temporary file
delete original file
rename temporary file as original
For example, if we look at the inode of an edited file we can see that it is changed after sed has run:
$ echo hello > a
$ ln a b
$ ls -lai a b
19005916 -rw-rw-r-- 2 jhnc jhnc 6 Jan 31 12:25 a
19005916 -rw-rw-r-- 2 jhnc jhnc 6 Jan 31 12:25 b
$ sed -i 's/hello/goodbye/' a
$ ls -lai a b
19005942 -rw-rw-r-- 1 jhnc jhnc 8 Jan 31 12:25 a
19005916 -rw-rw-r-- 1 jhnc jhnc 6 Jan 31 12:25 b
$
This means that your script has to be able to create files in the folder where it is doing the "in-place" edit.
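The steps above can be imitated by hand. This is only a rough sketch of the idea, not how sed is actually implemented: the function name edit_in_place and the sedXXXXXX template are made up, and unlike sed -i it does not copy the permissions of the original file.

```shell
# edit_in_place SED_EXPR FILE: a rough imitation of what "sed -i" does.
edit_in_place() {
    local expr=$1 file=$2 tmp
    # 1. Create a temporary file next to the original. This is the step that
    #    fails with "Permission denied" when the directory is not writable.
    tmp=$(mktemp "$(dirname "$file")/sedXXXXXX") || return 1
    # 2. Write the edited text into the temporary file.
    if sed "$expr" "$file" > "$tmp"; then
        # 3. Rename the temporary file over the original (new inode).
        mv -f "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

For example, edit_in_place 's/hello/goodbye/' a rewrites a in the same way as the session shown above, and it needs write permission on the directory for the same reason.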

The proper syntax is identical on the command line and in a script. If you used $(...) at the prompt then you would have received the same error.
sed -i "s/$current_version/$version/" "$PATHPROJET$CREATE_PACKAGE/Chart.yaml"
(Notice also the quoting around the file name. Probably your private variables should use lower case.)
The syntax
$(command)
takes the output from command and tries to execute it as a command. Usually you would use this construct -- called a command substitution -- to interpolate the output of a command into a string, like
echo "Today is $(date)"
(though date +"Today is %c" is probably a better way to do that particular thing).
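A small sketch of the difference, with made-up function names. In the first function the substitution stands alone, so its output is run as a command; in the second it is used the usual way, as a value. Since sed -i prints nothing on success, a bare $(sed -i ...) expands to nothing and no further command is run, which is presumably why the script in the question carried on after the error.

```shell
# Bare substitution: the captured text "echo hello" is run as a new command.
run_captured_output() {
    $(printf 'echo hello')
}

# Usual form: the captured text is stored and interpolated into a string.
describe_line_count() {
    local count
    count=$(wc -l < "$1")
    printf 'lines: %d\n' "$count"
}
```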

Remove lines with the character from the file

I have output like this
'< Jan 20 Sep> This is the sample out put
This is Sample
>
'< Jan 21 Sep> This is the sample out put
This is Known Errors
>
So I need to remove the lines that consist only of the special character >. Lines that contain > along with other text must be kept.
I would like to have the output below
'< Jan 20 Sep> This is the sample out put
This is Sample
'< Jan 21 Sep> This is the sample out put
This is Known Errors

You can use sed
sed '/^>$/d' myfile
If the output is good for you, you can use the -i flag to overwrite your file:
sed -i '/^>$/d' myfile

grep -Fvx '>' <myfile >myfile_with_offending_lines_removed
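Both commands above remove only lines that are exactly >. If the real file might have spaces or a carriage return around the marker (an assumption, since the question does not say), a slightly more tolerant pattern could look like this sketch. The function name drop_marker_lines is made up.

```shell
# drop_marker_lines [FILE...]: delete lines that contain only ">" plus optional
# surrounding whitespace; reads standard input when no file is given.
drop_marker_lines() {
    sed '/^[[:space:]]*>[[:space:]]*$/d' "$@"
}
```

Lines such as '< Jan 20 Sep> This is the sample out put are kept, because they contain other text besides the >.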

Delete everything after a certain character on each word using sed

I want to create a script using sed to achieve the following:
This:
22/0,01 1/1,05 11/0,01 35/6,04 6/0,03 3/0,04
To:
22 1 11 35 6 3
I want to remove everything after "/" on each word.

Just remove everything from / up to a space:
$ sed 's#/[^ ]*##g' file
22 1 11 35 6 3
Note I am using # as delimiter to avoid having to escape the /.

If everything after the / consists only of known characters, then this is rather easy:
sed -e 's|/[0-9,]*||g'
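The question asks for sed, but when the words are already in a shell variable the same trimming can be done with parameter expansion alone. This is a sketch with a made-up function name; it relies on unquoted word splitting, so it assumes the words contain no glob characters such as * or ?.

```shell
# strip_after_slash LINE: print each word of LINE cut off at its first "/".
strip_after_slash() {
    local word out=
    for word in $1; do            # unquoted on purpose: split the line into words
        # ${word%%/*} removes the longest suffix that starts with "/".
        out="$out${out:+ }${word%%/*}"
    done
    printf '%s\n' "$out"
}
```

For instance, strip_after_slash '22/0,01 1/1,05 11/0,01' prints 22 1 11.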

Add blank line after every result in grep

my grep command looks like this
zgrep -B bb -A aa "pattern" *
I would like to have output as:
file1:line1
file1:line2
file1:line3
file1:pattern
file1:line4
file1:line5
file1:line6
</blank line>
file2:line1
file2:line2
file2:line3
file2:pattern
file2:line4
file2:line5
file2:line6
The problem is that it's hard to tell where the lines for the first result end and the lines for the second result start.
Note that although man grep says that "--" is added between contiguous groups of matches, this works only when multiple matches are found in the same file, whereas in my search (as above) I am searching multiple files.
Also note that adding a blank line after every bb+aa+1 lines won't work, because a file may have fewer than bb lines before the pattern.

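Pipe the grep output through awk, which prints a blank line whenever the file name before the first ':' changes. Note that context lines from -A/-B are printed as file-line rather than file:line, so this works as written only for output without context lines.
awk -F: 'NR > 1 && f != $1 { print "" } { f = $1; print }'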

Pipe | any output to:
sed G
Example:
ls | sed G
If you man sed you will see
G Appends a newline character followed by the contents of the hold space to the pattern space.

The problem is that it's hard to tell where the lines for the first result end and the lines for the second result start.
Note that although man grep says that "--" is added between contiguous groups of matches, this works only when multiple matches are found in the same file, whereas in my search (as above) I am searching multiple files.
If you don't mind a -- in lieu of a </blank line>, add the -0 parameter to your grep/zgrep command. This should allow for the -- to appear even when searching multiple files. You can still use the -A and -B flags as desired.

You can also use the --group-separator parameter, with an empty value, so it'd just add a new-line.
some-stream | grep --group-separator=

I can't test it with the -A and -B parameters so I can't say for sure, but you could try using sed G as mentioned here on Unix StackEx. You'll lose coloring though, if that's important.

There is no option for this in grep and I don't think there is a way to do it with xargs or tr (I tried), but here is a for loop that will do it (for f in *; do grep -H "1" "$f" && echo; done):
[ 11:58 jon@hozbox.com ~/test ]$ for f in *; do grep -H "1" "$f" && echo; done
a:1
b:1
c:1
d:1
[ 11:58 jon@hozbox.com ~/test ]$ ll
-rw-r--r-- 1 jon people 2B Nov 25 11:58 a
-rw-r--r-- 1 jon people 2B Nov 25 11:58 b
-rw-r--r-- 1 jon people 2B Nov 25 11:58 c
-rw-r--r-- 1 jon people 2B Nov 25 11:58 d
The -H is to display file names for grep matches. Change the * to your own file glob/path expansion string if necessary.
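The loop above can be wrapped into a function that also handles context lines and prints the blank line only between files, not after the last one. This is a sketch under assumptions: the function name grep_with_gaps is made up, three lines of context each way is an arbitrary choice, and -H, -A and -B are assumed to be available as in GNU grep. For compressed files, zgrep could probably be substituted for grep.

```shell
# grep_with_gaps PATTERN FILE...: grep each file with context and print a blank
# line between the results of different files (not after the last one).
grep_with_gaps() {
    local pattern=$1 f matches first=1
    shift
    for f in "$@"; do
        # -H prints the file name; files without a match are skipped entirely.
        matches=$(grep -H -B 3 -A 3 -e "$pattern" -- "$f") || continue
        [ "$first" -eq 1 ] || echo      # separator before every result but the first
        first=0
        printf '%s\n' "$matches"
    done
}
```

Because each file is searched separately, a file with fewer than three lines before the match causes no problem: its block is simply shorter.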

Try with -C 2; when printing context, I see that grep separates the groups of output it finds.
