Hello: I have a number of files, test-MR3000-1.nt through test-MR4000-1.nt, where the number in the name increases by 100 (i.e. I have 11 files):
$ ls test-MR*
test-MR3000-1.nt test-MR3300-1.nt test-MR3600-1.nt test-MR3900-1.nt
test-MR3100-1.nt test-MR3400-1.nt test-MR3700-1.nt test-MR4000-1.nt
test-MR3200-1.nt test-MR3500-1.nt test-MR3800-1.nt
and also a file called resonancia.kumac which contains the string XXXX on a couple of lines.
$ head resonancia.kumac
close 0
hist/delete 0
vect/delete *
h/file 1 test-MRXXXX-1.nt
sigma MR=XXXX
I want to execute a bash script which substitutes the string XXXX in the file with the set of numbers obtained from the command ls *MR* | cut -b 8-11.
I found a post with some suggestions and tried my own code:
for i in `ls *MR* | cut -b 8-11`; do
    sed -e "s/XXXX/$i/" resonancia.kumac >> proof.kumac
done
However, in the substitution the numbers come out surrounded by single quotes (e.g. '3000').
Q: What should I do to avoid the single quotes around the numbers? Thank you.
This is a reproducer for the environment described:
for ((i=3000; i<=4000; i+=100)); do
    touch test-MR${i}-1.nt
done
cat >resonancia.kumac <<'EOF'
close 0
hist/delete 0
vect/delete *
h/file 1 test-MRXXXX-1.nt
sigma MR=XXXX
EOF
This is a script which will run inside that environment:
content="$(<resonancia.kumac)"
for f in *MR*; do
    substring=${f:7:4}    # characters 8-11 of the name: the four-digit number
    echo "${content//XXXX/$substring}"
done >proof.kumac
...and the output begins like so:
close 0
hist/delete 0
vect/delete *
h/file 1 test-MR3000-1.nt
sigma MR=3000
There are no quotes anywhere in this output; the problem described is not reproduced.
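For what it's worth, a sed-based loop closer to the original attempt also produces quote-free output here; this is just a sketch, using parameter expansion in place of cut:
for f in test-MR*-1.nt; do
    num=${f#test-MR}    # strip the leading "test-MR"
    num=${num%-1.nt}    # strip the trailing "-1.nt"
    sed "s/XXXX/$num/" resonancia.kumac
done > proof.kumac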
Or, if Perl is an option:
#!/usr/bin/perl
@ls = glob('*MR*');
open (FILE, 'resonancia.kumac') || die("not good\n");
@cont = <FILE>;
$f = shift(@ls);
$f =~ /test-MR([0-9]*)-1\.nt/;
$nr = $1;
@out = ();
foreach $l (@cont){
    if($l =~ s/XXXX/$nr/){
        $f = shift(@ls);
        $f =~ /test-MR([0-9]*)-1\.nt/;
        $nr = $1;
    }
    push @out, $l;
}
close FILE;
open FILE, '>resonancia.kumac' or die("not good\n");
print FILE @out;
That would replace the first XXXX with the number from the first filename, the next XXXX with the number from the next filename, and so on, which is what the question seemed to ask before it was edited.
I have a file that contains lines like these:
@SRR4293695.199563512 199563512
CAAAANCATTCGTAGACGACCTGCTCTGTNGNTACCNTCAANAGATCNGAAGAGCACACGTCTGAACTCCAGTCAC
+SRR4293695.199563512 199563512
A.AA<#FF)FFFFFFF<<<<FF7FFFFFF#.#<FF<#FFFF#FF<A<#FFFFFFFAFFFFFFAAAFFFFF<FFFF.
@SRR4293695.199563513 199563513
CTAAANCATTCGTAGACGACCTGCTT
+SRR4293695.199563513 199563513
<AAAA#FFFFFF<FFFFFFFFFFFFF
@SRR4293695.199563514 199563514
CCAACNTCATAGAGGGACAAGTGGCGATCNGNC
+SRR4293695.199563514 199563514
AAAAA#<F.F<<FA.F7AA.)<FAFA..7#.#A
@SRR4293695.199563515 199563515
TCGCGNCCTCAGATCAGACGTGGCGA
+SRR4293695.199563515 199563515
AAAAA#FFFFFF<FFFFFFFFFFFFF
@SRR4293695.199563516 199563516
TGACCNGGGTCCGGTGCGGAGAGCCCTTC
+SRR4293695.199563516 199563516
AAAAA#FAFFFF<F.FFAA.F)FFFFFAF
@SRR4293695.199563517 199563517
AAATGNTCATCGACACTTCGAACGCACT
+SRR4293695.199563517 199563517
AA)AA#F<FFFFFFAFFFFF<)FFFAFF
@SRR4293695.199563518 199563518
TCGTANCCAATGAGGTCTATCCGAGGCGCN
+SRR4293695.199563518 199563518
AAAAA#<FAAFFFF.FFFFFFFA.FFFFF#
@SRR4293695.199563519 199563519
AAAACNATTCGTAGACGNCCTGCTTNTGTNGNCACCNTNANNANNTCNGNAGAGCNCACNTCTGAACTCNAGTCAC
+SRR4293695.199563519 199563519
AAAAA#FFFFFFFFFFF#FFFFFFF#FF<#F#F.FF#7#F##F##A)#A#FF<F)#AAF#<FFFFAFF<#<FFFFF
@SRR4293695.199563520 199563520
GAAGCNGCACAGCTGGCNTTGGAGCNGANNCNGTAGNCNCNNTNNATNGNTCGGNNGAGNACACGTCTGNACTCCA
+SRR4293695.199563520 199563520
AAAAA#FFFFFFFFFFF#FFFFFFF#FF##A#FFFF#F#F##<##FF#F#FFFF##FFF#FFFFFFFFF#FFFFFF
@SRR4293695.199563521 199563521
TGGTCNGTGGGGAGTCGNCGCCTGCNTANNANTGTANGNANNANNAANANATCGNNAGANCACACGTCTNAACTCC
+SRR4293695.199563521 199563521
AAAAA#FFFFFFFFFFF#FFFFFFF#FF##F#FFFF#F#F##A##FF#A#FFFF##<FF#FFFFFFFFF#F<FFFF
@SRR4293695.199563522 199563522
TCGTANCCAATGAGGTCTATCCGAGGCGCN
+SRR4293695.199563522 199563522
AAAAA#<FAAFFFF.FFFFFFFA.FFFFF#
Then, I would like to filter these lines according to a condition: considering the length of the even-numbered lines, if that length is > 34, then that line and the preceding line must be removed.
I already wrote an algorithm: a while loop reads the file two lines at a time, checks the condition, and retains only pairs whose second line is at most 34 characters long. The problem is that it is taking quite some time.
inputFile=$1
outputFile=$2
while read first_line; read second_line
do
    lread=${#second_line}
    if [[ "$lread" -le 34 ]] ; then
        echo "$first_line" >> "$outputFile"
        echo "$second_line" >> "$outputFile"
    fi
done < "$inputFile"
# This is for the last two lines, in case the file does not end with a newline
lread=${#second_line}
if [[ "$lread" -le 34 ]] ; then
    echo "$first_line" >> "$outputFile"
    echo "$second_line" >> "$outputFile"
fi
I was wondering if there is another, quicker way.
The expected output:
@SRR4293695.199563513 199563513
CTAAANCATTCGTAGACGACCTGCTT
+SRR4293695.199563513 199563513
<AAAA#FFFFFF<FFFFFFFFFFFFF
@SRR4293695.199563514 199563514
CCAACNTCATAGAGGGACAAGTGGCGATCNGNC
+SRR4293695.199563514 199563514
AAAAA#<F.F<<FA.F7AA.)<FAFA..7#.#A
@SRR4293695.199563515 199563515
TCGCGNCCTCAGATCAGACGTGGCGA
+SRR4293695.199563515 199563515
AAAAA#FFFFFF<FFFFFFFFFFFFF
@SRR4293695.199563516 199563516
TGACCNGGGTCCGGTGCGGAGAGCCCTTC
+SRR4293695.199563516 199563516
AAAAA#FAFFFF<F.FFAA.F)FFFFFAF
@SRR4293695.199563517 199563517
AAATGNTCATCGACACTTCGAACGCACT
+SRR4293695.199563517 199563517
AA)AA#F<FFFFFFAFFFFF<)FFFAFF
@SRR4293695.199563518 199563518
TCGTANCCAATGAGGTCTATCCGAGGCGCN
+SRR4293695.199563518 199563518
AAAAA#<FAAFFFF.FFFFFFFA.FFFFF#
@SRR4293695.199563522 199563522
TCGTANCCAATGAGGTCTATCCGAGGCGCN
+SRR4293695.199563522 199563522
AAAAA#<FAAFFFF.FFFFFFFA.FFFFF#
Thanks in advance!
Here's an awk solution:
awk '!last { last = $0; next } length($0)<=34 { print last; print } { last = "" }' YOURFILE
last holds each odd-numbered line; when the following even-numbered line is at most 34 characters, both are printed. The output matches your expected output.
sed method:
sed -n 'h;n;/.\{35,\}/!{x;G;p}' inputfile > outputfile
h;n: each odd-numbered line goes into the hold buffer, then the next line is fetched.
The resulting even-numbered lines are checked for length. If they're not over 34 characters (the address /.\{35,\}/ matches lines of 35 or more characters, and ! negates it), the hold buffer is exchanged with the pattern space and then appended to it (x;G), so that both lines are in the pattern space, and printed.
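Another possible approach, sketched here on the assumption that standard paste and awk are available, is to join each pair of lines with a tab, filter on the length of the second field, and split the survivors back into two lines:
paste - - < inputfile | awk -F'\t' 'length($2) <= 34 { print $1; print $2 }' > outputfile
This avoids the shell read loop entirely, which is where most of the original script's time goes.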
Say I want to search for "ERROR" within a bunch of log files.
I want to print one line for every file that contains "ERROR".
In each line, I want to print the log file path on the left-most edge while the number of "ERROR" on the right-most edge.
I tried using:
printf "%-50s %d" $filePath $errorNumber
...but it's not perfect, since the console width can vary greatly and the file path can sometimes be quite long.
It's just for visual appeal, but I simply can't manage to get it right.
Can anyone help me to solve this problem?
Using bash and printf:
printf "%-$(( COLUMNS - ${#errorNumber} ))s%s" \
"$filePath" "$errorNumber"
How it works:
$COLUMNS is the shell's terminal width.
printf does left alignment by putting a - after the %. So printf "%-25s%s\n" foo bar prints "foo", then 22 spaces, then "bar".
bash uses the # as a parameter length variable prefix, so if x=foo, then ${#x} is 3.
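Putting the pieces together, a minimal end-to-end sketch; the log glob and the grep -c count are assumptions (grep -c counts matching lines, not individual occurrences), and ${COLUMNS:-80} falls back to 80 columns when the shell hasn't set COLUMNS, as in non-interactive scripts:
for filePath in /var/log/*.log; do
    errorNumber=$(grep -c ERROR "$filePath")
    [ "$errorNumber" -gt 0 ] || continue
    # the padded line fills the terminal width exactly, so the wrap supplies the newline
    printf "%-$(( ${COLUMNS:-80} - ${#errorNumber} ))s%s" "$filePath" "$errorNumber"
done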
Fancy version: suppose the two variables are longer than will fit on one line; if so, print them on as many lines as are needed:
printf "%-$(( COLUMNS * ( 1 + ( ${#filePath} + ${#errorNumber} ) / COLUMNS ) \
- ${#errorNumber} ))s%s" "$filePath" "$errorNumber"
Generalized to a function. Syntax is printfLR foo bar, or printfLR < file:
printfLR() { if [ "$1" ] ; then echo "$@" ; else cat ; fi |
while read l r ; do
printf "%-$(( ( 1 + ( ${#l} + ${#r} ) / COLUMNS ) \
* COLUMNS - ${#r} ))s%s" "$l" "$r"
done ; }
Test with:
# command line args
printfLR foo bar
# stdin
fortune | tr -s ' \t' '\n\n' | paste - - | printfLR
I have a .txt input file that is the product of a printf, defining each line as POV(n)="sequenceX,yearY":
cat output.PA
POV01="SEQ010,FY15"
POV02="SEQ010,FY16"
POV03="SEQ020,FY15"
POV04="SEQ020,FY16"
How can I source this file so that each POV is exported with the sequence and FY values for the given line?
export POV(n)="$seq,$fy"
The printf I have used to get to this point is as follows:
cat step1
counter=0
while read -r seq fy; do
    printf 'POV%02d="%s,%s"\n' "${counter}" "${seq}" "${fy}"
    (( counter = counter + 1 ))
done <test_scenario_02.txt > output.PA
If I source output.PA I get the following:
$ ./step2
POV00=YEAR,
POV01=SEQ010,FY15
POV02=SEQ010,FY16
POV03=SEQ020,FY15
POV04=SEQ020,FY16
POV05=SEQ030,FY15
POV06=SEQ030,FY16
POV07=SEQ030,FY15
POV08=SEQ030,FY16
POV09=SEQ040,FY15
POV10=SEQ040,FY16
POV11=SEQ050,FY15
POV12=SEQ050,FY16
$ cat step2
. ./output.PA
set | grep "^POV"
It is not at all clear what you want, but it seems like you are trying to create an array variable that holds all the values in output.PA. You probably don't need to do that, but this should work:
$ pov=($(sed -e 's/[^"]*"//' -e 's/"$//' output.PA))
$ echo ${pov[0]}
SEQ010,FY15
$ echo ${pov[1]}
SEQ010,FY16
$ echo ${pov[2]}
SEQ020,FY15
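If the values could ever contain whitespace or glob characters, the unquoted $(...) would split badly; mapfile (bash 4+) is a sketch of a more robust way to fill the same array:
$ mapfile -t pov < <(sed -e 's/[^"]*"//' -e 's/"$//' output.PA)
$ echo "${pov[0]}"
SEQ010,FY15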
I have a bunch of files in a directory with no pattern in their names at all. All I know is that they are all JPEG files. How do I rename them so that they have some sort of sequence in their names?
I know that in Windows all you do is select all the files and rename them to the same name, and Windows automatically adds sequence numbers to compensate for the duplicate file names.
I want to be able to do that in Linux (Fedora), but it seems you can only do it in the terminal. Please help, I am lost.
What is the command for doing this?
The best way to do this is to run a loop in the terminal, going from picture to picture and renaming each with a number that increases by one on every iteration.
You can do this with:
n=1
for i in *.jpg; do
    p=$(printf "%04d.jpg" "${n}")
    mv "${i}" "${p}"
    let n=n+1
done
Just enter it into the terminal line by line.
If you want to put a custom name in front of the numbers, you can put it before the percent sign in the third line.
If you want to change the number of digits in the names' number, just replace the '4' in the third line (don't change the '0', though).
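For example, with a hypothetical prefix holiday-, the third line would become:
p=$(printf "holiday-%04d.jpg" "${n}")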
I will assume that:
There are no spaces or other weird control characters in the file names
All of the files in a given directory are jpeg files
That in mind, to rename all of the files to 1.jpg, 2.jpg, and so on:
N=1
for a in ./* ; do
    mv "$a" "${N}.jpg"
    N=$(( N + 1 ))
done
If there are spaces in the file names:
find . -type f | awk 'BEGIN{N=1}
{print "mv \"" $0 "\" " N ".jpg"
N++}' | sh
Should be able to rename them.
The point being: Linux/UNIX has a lot of tools which can automate a task like this, but they have a bit of a learning curve.
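As an illustration of that, here is a null-delimited sketch (it assumes, as above, that every regular file in the directory is a JPEG) that survives spaces, quotes, and even newlines in the names:
n=1
find . -maxdepth 1 -type f -print0 |
while IFS= read -r -d '' f; do
    mv -- "$f" "$(printf '%04d.jpg' "$n")"
    n=$((n+1))
done
Note that read -d '' and -print0 are bash and GNU/BSD find features respectively, so this is less portable than the plain loop above.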
Create a script containing:
#!/bin/sh
filePrefix="$1"
sequence=1
for file in $(ls -tr *.jpg) ; do
    renamedFile="$filePrefix$sequence.jpg"
    echo "renaming \"$file\" to \"$renamedFile\""
    mv "$file" "$renamedFile"
    sequence=$(($sequence+1))
done
exit 0
If you named the script, say, RenameSequentially then you could issue the command:
./RenameSequentially Images-
This would rename all *.jpg files in the directory to Images-1.jpg, Images-2.jpg, etc., in order from oldest to newest. Tested in the OS X command shell.
I wrote a Perl script a long time ago to do pretty much what you want:
#
# reseq.pl renames files to a new named sequence of filenames
#
# Usage: reseq.pl newname [-n seq] [-p pad] fileglob
#
use strict;
my $newname = $ARGV[0];
my $seqstr = "01";
my $seq = 1;
my $pad = 2;
shift @ARGV;
if ($ARGV[0] eq "-n") {
    $seqstr = $ARGV[1];
    $seq = int $seqstr;
    shift @ARGV;
    shift @ARGV;
}
if ($ARGV[0] eq "-p") {
    $pad = $ARGV[1];
    shift @ARGV;
    shift @ARGV;
}
my $filename;
my $suffix;
for (@ARGV) {
    $filename = sprintf("${newname}_%0${pad}d", $seq);
    if (($suffix) = m/.*\.(.*)/) {
        $filename = "$filename.$suffix";
    }
    print "$_ -> $filename\n";
    rename ($_, $filename);
    $seq++;
}
You specify a common prefix for the files, a beginning sequence number and a padding factor.
For example:
$ reseq.pl abc -n 1 -p 2 *.jpg
Will rename all matching files to abc_01.jpg, abc_02.jpg, abc_03.jpg...
I want to extract the lines before and after a matched pattern.
E.g. if the file contents are as follows:
absbasdakjkglksagjgj
sajlkgsgjlskjlasj
hello
lkgjkdsfjlkjsgklks
klgdsgklsdgkldskgdsg
I need to find 'hello' and display the lines before and after it.
The output should be:
sajlkgsgjlskjlasj
hello
lkgjkdsfjlkjsgklks
This is possible with GNU tools, but I need a method that works in AIX / ksh, where no GNU software is installed.
sed -n '/hello/{x;G;N;p;};h' filename
The trailing h saves each line in the hold space; on a match, x retrieves the previously saved line, G appends the matching line (now in the hold space), N appends the following line, and p prints all three.
I've found it is generally less frustrating to build the GNU coreutils once and benefit from many more features: http://www.gnu.org/software/coreutils/
Since you'll have Perl on the machine, you could use the following code, though you'd probably do better to install the GNU utilities. It has the options -b n1 for lines before the match and -f n2 for lines following it. It works with PCRE matches, so if you want case-insensitive matching, add an i after the regex instead of using a -i option. I haven't implemented -v or -l; I didn't need those.
#!/usr/bin/env perl
#
# @(#)$Id: sgrep.pl,v 1.7 2013/01/28 02:07:18 jleffler Exp $
#
# Perl-based SGREP (special grep) command
#
# Print lines around the line that matches (by default, 3 before and 3 after).
# By default, include file names if more than one file to search.
#
# Options:
# -b n1 Print n1 lines before match
# -f n2 Print n2 lines following match
# -n Print line numbers
# -h Do not print file names
# -H Do print file names
use warnings;
use strict;
use constant debug => 0;
use Getopt::Std;
my(%opts);
sub usage
{
print STDERR "Usage: $0 [-hnH] [-b n1] [-f n2] pattern [file ...]\n";
exit 1;
}
usage unless getopts('hnf:b:H', \%opts);
usage unless @ARGV >= 1;
if ($opts{h} && $opts{H})
{
print STDERR "$0: mutually exclusive options -h and -H specified\n";
exit 1;
}
my $op = shift;
print "# regex = $op\n" if debug;
# print file names if -h omitted and more than one argument
$opts{F} = (defined $opts{H} || (!defined $opts{h} and scalar @ARGV > 1)) ? 1 : 0;
$opts{n} = 0 unless defined $opts{n};
my $before = (defined $opts{b}) ? $opts{b} + 0 : 3;
my $after = (defined $opts{f}) ? $opts{f} + 0 : 3;
print "# before = $before; after = $after\n" if debug;
my @lines = (); # Accumulated lines
my $tail = 0; # Line number of last line in list
my $tbp_1 = 0; # First line to be printed
my $tbp_2 = 0; # Last line to be printed
# Print lines from @lines in the range $tbp_1 .. $tbp_2,
# leaving $leave lines in the array for future use.
sub print_leaving
{
my ($leave) = @_;
while (scalar(@lines) > $leave)
{
my $line = shift @lines;
my $curr = $tail - scalar(@lines);
if ($tbp_1 <= $curr && $curr <= $tbp_2)
{
print "$ARGV:" if $opts{F};
print "$curr:" if $opts{n};
print $line;
}
}
}
# General logic:
# Accumulate each line at end of @lines.
# ** If current line matches, record range that needs printing
# ** When the line array contains enough lines, pop line off front and,
# if it needs printing, print it.
# At end of file, empty line array, printing requisite accumulated lines.
while (<>)
{
# Add this line to the accumulated lines
push @lines, $_;
$tail = $.;
printf "# array: N = %d, last = $tail: %s", scalar(#lines), $_ if debug > 1;
if (m/$op/o)
{
# This line matches - set range to be printed
my $lo = $. - $before;
$tbp_1 = $lo if ($lo > $tbp_2);
$tbp_2 = $. + $after;
print "# $. MATCH: print range $tbp_1 .. $tbp_2\n" if debug;
}
# Print out any accumulated lines that need printing
# Leave $before lines in array.
print_leaving($before);
}
continue
{
if (eof)
{
# Print out any accumulated lines that need printing
print_leaving(0);
# Reset for next file
close ARGV;
$tbp_1 = 0;
$tbp_2 = 0;
$tail = 0;
@lines = ();
}
}
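Assuming the script is saved as sgrep.pl, a usage sketch for the one-line-of-context case in the question would be:
perl sgrep.pl -b 1 -f 1 hello filename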
I had a situation where I was stuck with a slow telnet session on a tablet, believe it or not, and I couldn't easily write a Perl script with that keyboard. I came up with this hacky maneuver that worked in a pinch using AIX's limited grep. It won't work well if your grep returns hundreds of lines, but if you just need one line and one or two above/below it, this can do it. First I ran this:
cat -n filename |grep criteria
By including the -n flag, I see the line number of the data I'm seeking, like this:
2543 my crucial data
Since cat prints the line number with spaces before and after it, I could grep for the line number right before it like this:
cat -n filename |grep " 2542 "
I ran this a couple of times to get lines 2542 and 2544, which bookended line 2543. Like I said, it's definitely fallible, for instance if you have reams of data that might contain " 2542 " all over the place, but just to grab a couple of quick lines, it worked well.
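As a footnote: once the line number is known, plain POSIX sed can print the whole range in one command (2542 and 2544 here are just the numbers from the example above):
sed -n '2542,2544p' filename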