Sort lines by length on Linux

I want to sort data from the shortest line to the longest. The data contains
spaces, letters, digits, "-" and ",".
I tried sort -n, but it did not do the job. Many thanks for any help.
Here is the data:
0086
0086-
0086---
0086-------
0086-1358600966
0086-18868661318
00860
00860-13081022659
00860-131111111
00860-13176880028
00860-13179488252
00860-18951041771
00861
008629-83023520
0086000
0086010-61281306
and the result I want is
0086
0086-
00860
00861
0086000
0086---
0086-------
0086-1358600966
00860-131111111
008629-83023520
0086-18868661318
0086010-61281306
00860-13081022659
00860-13176880028
00860-13179488252
00860-18951041771
I do not care which characters the lines contain; I only want them ordered from short to long. Two lines of the same length may appear in either order. Many thanks.

Perl one-liner
perl -0777 -ne 'print join("\n", map {$_->[1]} sort {$a->[0] <=> $b->[0]} map {[length, $_]} split /\n/), "\n"' file
Explanation on demand.
With GNU awk, it's very simple:
gawk '
{len[$0] = length($0)}
END {
PROCINFO["sorted_in"] = "@val_num_asc"
for (line in len) print line
}
' file
See https://www.gnu.org/software/gawk/manual/html_node/Controlling-Scanning.html#Controlling-Scanning

Try this; it may help:
awk '{ print length($0) " " $0; }' "$file" | sort -n | cut -d ' ' -f 2-
Add the -r option to sort to reverse the order (longest line first).
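One detail of the length-prefix pipeline: with a plain sort -n, lines of equal length are ordered by sort's last-resort whole-line comparison, not by their position in the input. Below is a minimal sketch of a variant that keeps input order for ties. The function name sort_by_length is made up for illustration, and -s (stable sort) is assumed to be available; it is a GNU/BSD sort option, not a POSIX one.

```shell
# Hypothetical helper: sort lines from shortest to longest,
# keeping the input order for lines of equal length.
sort_by_length() {
    # 1. decorate: prefix every line with its length and one space
    # 2. sort numerically on the first field only; -s keeps ties in input order
    # 3. undecorate: drop the length prefix again
    awk '{ print length($0) " " $0 }' "$@" |
        sort -s -k1,1n |
        cut -d ' ' -f 2-
}
```

Usage: sort_by_length file, or pipe data into it with no arguments.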

Using awk:
#!/usr/bin/awk -f
(l = length($0)) && !($0 in nextof) {
if (l in start) {
nextof[$0] = start[l]
} else {
if (!max || l > max) max = l
if (!min || l < min) min = l
nextof[$0] = 0
}
start[l] = $0
++count[l]
}
END {
for (i = min; i <= max; ++i) {
if (j = count[i]) {
t = start[i]
print t
while (--j) {
t = nextof[t]
print t
}
}
}
}
Usage:
awk -f script.awk file
Output:
0086
00861
00860
0086-
0086000
0086---
0086-------
008629-83023520
00860-131111111
0086-1358600966
0086010-61281306
0086-18868661318
00860-18951041771
00860-13179488252
00860-13176880028
00860-13081022659
Another version, which keeps lines of equal length in their input order:
#!/usr/bin/awk -f
(l = length($0)) && !($0 in nextof) {
if (l in start) {
nextof[lastof[l]] = $0
} else {
if (!max || l > max) max = l
if (!min || l < min) min = l
start[l] = $0
}
lastof[l] = $0
++count[l]
}
END {
for (i = min; i <= max; ++i) {
if (j = count[i]) {
t = start[i]
print t
while (--j) {
t = nextof[t]
print t
}
}
}
}
Output:
0086
0086-
00860
00861
0086---
0086000
0086-------
0086-1358600966
00860-131111111
008629-83023520
0086-18868661318
0086010-61281306
00860-13081022659
00860-13176880028
00860-13179488252
00860-18951041771
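For comparison, here is a shorter sketch of the same bucket-per-length idea that appends each line to a string keyed by its length instead of maintaining linked lists. It is not the answerer's code: the name bucket_by_length is made up, and unlike the scripts above it keeps duplicate lines and empty lines.

```shell
# Hypothetical helper: group lines into one bucket per length,
# then print the buckets from the shortest length to the longest.
bucket_by_length() {
    awk '
    {
        l = length($0)
        # append to the bucket for this length, preserving input order
        if (l in bucket) bucket[l] = bucket[l] ORS $0
        else             bucket[l] = $0
        if (l > max) max = l
    }
    END {
        for (i = 0; i <= max; i++)
            if (i in bucket) print bucket[i]
    }' "$@"
}
```

The loop in END walks every length from 0 to the maximum, so it suits data like the sample here, where line lengths are small.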


Split string to fixed length chunks and write in separate line in Raku

I have a file test.txt:
Stringsplittingskills
I want to read this file and write to another file, out.txt, with three characters on each line, like this:
Str
ing
spl
itt
ing
ski
lls
What I did
my $string = "test.txt".IO.slurp;
my $start = 0;
my $elements = $string.chars;
# open file in writing mode
my $file_handle = "out.txt".IO.open: :w;
while $start < $elements {
my $line = $string.substr($start,3);
if $line.chars == 3 {
$file_handle.print("$line\n")
} elsif $line.chars < 3 {
$file_handle.print("$line")
}
$start = $start + 3;
}
# close file handle
$file_handle.close
This runs fine when the length of the string is not a multiple of 3. When the string length is a multiple of 3, it inserts an extra newline at the end of the output file. How can I avoid inserting a newline at the end in that case?
I tried another, shorter approach:
my $string = "test.txt".IO.slurp;
my $file_handle = "out.txt".IO.open: :w;
for $string.comb(3) -> $line {
$file_handle.print("$line\n")
}
It still suffers from the same issue.
I looked here and here but am still unable to solve it.
spurt "out.txt", "test.txt".IO.comb(3).join("\n")
Another approach using substr-rw.
subset PositiveInt of Int where * > 0;
sub break( Str $str is copy, PositiveInt $length )
{
my $i = $length;
while $i < $str.chars
{
$str.substr-rw( $i, 0 ) = "\n";
$i += $length + 1;
}
$str;
}
say break("12345678", 3);
Output
123
456
78
The correct answer is of course to use .comb and .join.
That said, this is how you might fix your code.
You could change the if line to check if it is at the end, and use else.
if $start+3 < $elements {
$file_handle.print("$line\n")
} else {
$file_handle.print($line)
}
Personally I would change it so that only the addition of \n is conditional.
while $start < $elements {
my $line = $string.substr($start,3);
$file_handle.print( $line ~ ( "\n" x ($start+3 < $elements) ));
$start += 3;
}
This works because < returns either True or False.
Since True == 1 and False == 0, the x operator repeats the \n at most once.
'abc' x 1; # 'abc'
'abc' x True; # 'abc'
'abc' x 0; # ''
'abc' x False; # ''
If you were very cautious you could use x+?.
(Which is actually 3 separate operators.)
'abc' x 3; # 'abcabcabc'
'abc' x+? 3; # 'abc'
infix:« x »( 'abc', prefix:« + »( prefix:« ? »( 3 ) ) );
I would probably use loop if I were going to structure it like this.
loop ( my $start = 0; $start < $elements ; $start += 3 ) {
my $line = $string.substr($start,3);
$file_handle.print( $line ~ ( "\n" x ($start+3 < $elements) ));
}
Or instead of adding a newline to the end of each line, you could add it to the beginning of every line except the first.
while $start < $elements {
my $line = $string.substr($start,3);
my $nl = "\n";
# clear $nl the first time through
once $nl = "";
$file_handle.print($nl ~ $line);
$start = $start + 3;
}
Three one-liner solutions for the command line follow.
Using comb and batch (retains incomplete set of 3 letters at end):
~$ echo 'StringsplittingskillsX' | perl6 -ne '.join.put for .comb.batch(3);'
Str
ing
spl
itt
ing
ski
lls
X
Simplifying (no batch, only comb):
~$ echo 'StringsplittingskillsX' | perl6 -ne '.put for .comb(3);'
Str
ing
spl
itt
ing
ski
lls
X
Alternatively, using comb and rotor (discards incomplete set of 3 letters at end):
~$ echo 'StringsplittingskillsX' | perl6 -ne '.join.put for .comb.rotor(3);'
Str
ing
spl
itt
ing
ski
lls

How can I reverse print the characters of a string in each cell using AWK?

Beth 45 4.00 0 0 .072
Danny 33 3.75 ^0 0 .089
The above is the file I want to process.
I want to write an AWK script that reverses the characters of the string in every cell.
Here is the code:
BEGIN { OFS = "\t\t" }
function reverse_print(str)
{
s = "";
N = length(str);
for (i = 1; i <= N; i++)
a[i] = substr(str, i, 1);
for (i = N; i >= 1; i--)
s = s a[i];
return s;
}
{
for (i = 1; i <= NF; i++)
$i = reverse_print($i) ;
print;
}
END {}
However, it does not work: the program hangs.
I have found that if I drop the loop and handle each field one by one, as follows,
BEGIN { OFS = "\t\t" }
function reverse_print(str)
{
s = "";
N = length(str);
for (i = 1; i <= N; i++)
a[i] = substr(str, i, 1);
for (i = N; i >= 1; i--)
s = s a[i];
return s;
}
{
$1 = reverse_print($1) ;
$2 = reverse_print($2) ;
$3 = reverse_print($3) ;
$4 = reverse_print($4) ;
$5 = reverse_print($5) ;
$6 = reverse_print($6) ;
print;
}
END {}
it works well.
Here is my desired output:
hteB 54 00.4 0 0 270.
ynnaD 33 57.3 0^ 0 980.
I have thought hard but still cannot figure out what I did wrong in the loop.
Can anyone tell me why?
You're using the same variable i inside and outside of the function. Use a different variable in either location or change the function definition to reverse_print(str, i) to make the i used within the function local to that function rather than the same global variable being used in the calling code.
You should also make s and N function local:
function reverse_print(str, i, s, N)
but in fact the code should be written as:
$ cat tst.awk
BEGIN { OFS = "\t\t" }
function reverse_print(fwd, rev, i, n)
{
n = length(fwd)
for (i = n; i >= 1; i--)
rev = rev substr(fwd, i, 1);
return rev
}
{
for (i = 1; i <= NF; i++)
$i = reverse_print($i)
print
}
$ awk -f tst.awk file
hteB 54 00.4 0 0 270.
ynnaD 33 57.3 0^ 0 980.
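To see why the original loop never finishes, it helps to look at the value of i after a call. The sketch below uses two made-up functions, clobber and safe, that only run the countdown loop; the first leaves i global, the second declares it as an extra parameter, which makes it local.

```shell
# Hypothetical demo: a loop counter shared with the caller versus a local one.
demo_global_i() {
    awk '
    function clobber(str) {            # i is NOT in the parameter list: global
        for (i = length(str); i >= 1; i--) { }
    }
    function safe(str,    i) {         # extra parameter i: local to safe()
        for (i = length(str); i >= 1; i--) { }
    }
    BEGIN {
        i = 5; clobber("abc"); print "after clobber: i = " i
        i = 5; safe("abc");    print "after safe: i = " i
    }'
}
demo_global_i
# prints:
# after clobber: i = 0
# after safe: i = 5
```

In the question's script the function resets i to 0 on every call, so the caller's i++ only ever reaches 1 and the condition i <= NF stays true.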
Could you please try the following. (This was tested with GNU awk only; as Ed's comment points out, splitting on an empty separator is undefined behavior in POSIX awk.)
awk '
BEGIN{
OFS="\t\t"
}
{
for(i=1;i<=NF;i++){
num=split($i,array,"")
for(j=num;j>0;j--){
val=(j<num?val:"") array[j]
}
printf "%s%s",val,(i<NF?OFS:ORS)}
val=""
}' Input_file
There is a rev command in Linux: rev - reverse lines characterwise.
You can reverse a string from awk by piping it to rev and reading the result back with getline (gensub, used below, is a GNU awk extension):
#reverse-fields.awk
{
for (i = 1; i <= NF; i = i + 1) {
# command line
cmd = "echo '" $i "' | rev"
# read output into revfield
cmd | getline revfield
# remove leading new line
a = gensub(/^[\n\r]+/, "", "1", revfield)
# print reversed field
printf("%s", a)
# print tab
if (i != NF) printf("\t")
# close command
close(cmd)
}
# print new line
print ""
}
$ awk -f reverse-fields.awk emp.data
0 00.4 hteB
0 57.3 naD
01 00.4 yhtaK
02 00.5 kraM
22 05.5 yraM
81 52.4 eisuS

Average of multiple files without considering missing values

I want to calculate the average of 15 files: ifile1.txt, ifile2.txt, ..., ifile15.txt. Every file has the same number of rows and columns, but some values are missing. Part of the data looks like this:
ifile1.txt    ifile2.txt    ifile3.txt
3 ? ? ? .     1 2 1 3 .     4 ? ? ? .
1 ? ? ? .     1 ? ? ? .     5 ? ? ? .
4 6 5 2 .     2 5 5 1 .     3 4 3 1 .
5 5 7 1 .     0 0 1 1 .     4 3 4 0 .
. . . . .     . . . . .     . . . . .
I would like to produce a new file that shows the average of these 15 files, ignoring the missing values.
ofile.txt
2.66 2 1 3 . (i.e. average of 3 1 4, average of ? 2 ? and so on)
2.33 ? ? ? .
3 5 4.33 1.33 .
3 2.67 4 0.66 .
. . . . .
This question is similar to my earlier question, "Average of multiple files in shell", where the script was
awk 'FNR == 1 { nfiles++; ncols = NF }
{ for (i = 1; i < NF; i++) sum[FNR,i] += $i
if (FNR > maxnr) maxnr = FNR
}
END {
for (line = 1; line <= maxnr; line++)
{
for (col = 1; col < ncols; col++)
printf " %f", sum[line,col]/nfiles;
printf "\n"
}
}' ifile*.txt
But I have not been able to modify it.
Use this:
paste ifile*.txt | awk '{n=f=0; for(i=1;i<=NF;i++){if($i*1){f++;n+=$i}}; print n/f}'
paste shows all the files side by side.
awk then calculates one average per pasted line (across all files and columns):
n=f=0; resets the variables to 0.
for(i=1;i<=NF;i++) loops through all the fields.
if($i*1) is true when the field has a non-zero numeric value; "?" evaluates to 0 and is skipped, and so is a literal 0.
f++;n+=$i increments f (the number of fields counted) and adds the field to the sum n.
print n/f prints the average n/f.
awk '
{
for (i = 1;i <= NF;i++) {
Sum[FNR,i]+=$i
Count[FNR,i]+=$i!="?"
}
}
END {
for( i = 1; i <= FNR; i++){
for( j = 1; j <= NF; j++) printf "%s ", Count[i,j] != 0 ? Sum[i,j]/Count[i,j] : "?"
print ""
}
}
' ifile*
This assumes the files are well formed (no trailing blank lines, and so on).
awk 'FNR == 1 { nfiles++; ncols = NF }
{ for (i = 1; i < NF; i++)
if ( $i != "?" ) { sum[FNR,i] += $i ; count[FNR,i]++ ;}
if (FNR > maxnr) maxnr = FNR
}
END {
for (line = 1; line <= maxnr; line++)
{
for (col = 1; col < ncols; col++)
if ( count[line,col] > 0 ) printf " %f", sum[line,col]/count[line,col];
else printf " ? " ;
printf "\n" ;
}
}' ifile*.txt
The only change is the check for '?'.
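As a worked check of the count-per-cell approach, here is a small sketch that averages each cell over all input files and divides by the number of non-missing values only. The function name avg_ignoring_missing and the two-decimal output format are assumptions made for illustration; "?" is taken to be the only missing-value marker.

```shell
# Hypothetical helper: per-cell average over several files, skipping "?" cells.
avg_ignoring_missing() {
    awk '
    {
        for (i = 1; i <= NF; i++)
            if ($i != "?") { sum[FNR, i] += $i; cnt[FNR, i]++ }
        if (FNR > rows) rows = FNR
        if (NF > cols) cols = NF
    }
    END {
        for (r = 1; r <= rows; r++) {
            for (c = 1; c <= cols; c++) {
                sep = (c < cols) ? " " : "\n"
                # a cell with no numeric value in any file stays "?"
                if ((r, c) in cnt) printf "%.2f%s", sum[r, c] / cnt[r, c], sep
                else               printf "?%s", sep
            }
        }
    }' "$@"
}
```

With the first two rows and four columns of the three sample files, this prints 2.67 2.00 1.00 3.00 and then 2.33 ? ? ?. Usage: avg_ignoring_missing ifile*.txt > ofile.txt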

how to iterate over two sets of data?

I'm trying to create my own program to do a recursive listing: each line corresponds to the full path of a single file. The tricky part I'm working on now is: I don't want bind mounts to trick my program into listing files twice.
So I already have a program that produces the right output except that if /foo is bind mounted to /bar then my program incorrectly lists
/foo/file
/bar/file
I need the program to list just what's below (EDIT: even if it was asked to list the contents of /foo)
/bar/file
One approach I thought of is to mount | grep bind | awk '{print $1 " " $3}' and then iterate over this to sed every line of the output, then sort -u.
My question is how do I iterate over the original output (a bunch of lines) and the output from mount (another bunch of lines)? (or is there a better approach) This needs to be POSIX (EDIT: and work with /bin/sh)
Place the 'mount | grep bind' command into the AWK within a BEGIN block and store the data.
Something like:
PROG | awk 'BEGIN{
# Define the data you want to store
# Assign to global arrays
command = "mount | grep bind";
while ((command | getline) > 0) {
count++;
mount[count] = $1;
mountPt[count] = $3
}
}
# Assuming input is line-by-line and that mountPt is the value
# that is undesired
{
replaceLine=0
for (i=1; i<=count; i++) {
idx = index($1, mountPt[i]);
if (idx == 1) {
replaceLine = 1;
break;
}
}
if (replaceLine == 1) {
sub(mountPt[i], mount[i], $1);
}
if (printed[$1] != 1) {
print $1;
}
printed[$1] = 1;
} '
Where I assume your current program, PROG, outputs to stdout.
find YourPath -print > YourFiles.txt
mount > Bind.txt
awk 'FNR == NR && $0 ~ /bind/ {
Bind[ $1] = $3
if( ( ThisLevel = split( $3, Unused, "/") - 1 ) > Level) Level = ThisLevel
}
FNR != NR && $0 !~ /^ *$/ {
RealName = $0
for( ThisLevel = Level; ThisLevel > 0; ThisLevel--){
match( $0, "(/[^/]*){" ThisLevel "}" )
UnBind = Bind[ substr( $0, 1, RLENGTH) ]
if( UnBind !~ /^$/) {
RealName = UnBind substr( $0, RLENGTH + 1)
ThisLevel = 0
}
}
if( ! File[ RealName]++) print RealName
}
' Bind.txt YourFiles.txt
The bind table (Bind.txt) is loaded into an array first, and each path is then looked up in it by exact path comparison.
Bind.txt and YourFiles.txt could be replaced by direct redirections, so that everything runs as one command with no temporary files.
The first part of the awk script has to be adapted if bind paths contain spaces (assumed not to be the case here).
Each file path is rewritten as it is read if it matches a known bind relation.
A path is printed only if it has not been printed before.
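The general pattern for iterating over two sets of lines in POSIX awk is to pass both as files and use NR == FNR to recognize the first one. The sketch below assumes a mapping file with one "from to" pair per line (for example, produced from the mount output) and a list file with one path per line. The function name rewrite_prefixes and the file layout are made up for illustration, paths are assumed to contain no spaces, and the mapping file must not be empty for the NR == FNR test to work.

```shell
# Hypothetical helper: $1 = mapping file ("from to" per line), $2 = path list.
rewrite_prefixes() {
    awk '
    NR == FNR { n++; from[n] = $1; to[n] = $2; next }   # first file: load pairs
    {
        path = $0
        for (k = 1; k <= n; k++) {
            len = length(from[k])
            # match only on a whole path component, so /foo does not hit /foobar
            if (substr(path, 1, len) == from[k] &&
                (length(path) == len || substr(path, len + 1, 1) == "/")) {
                path = to[k] substr(path, len + 1)
                break
            }
        }
        if (!seen[path]++) print path                   # drop duplicates
    }' "$1" "$2"
}
```

With a mapping line "/foo /bar", the inputs /foo/file and /bar/file both come out as a single /bar/file, which replaces the sed and sort -u steps from the question.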

how to check if awk array is empty

I am brand new to AWK and am trying to determine whether my array is empty so I can print a message if it is. I am used to length functions for this kind of check, but AWK does not seem to have them. Here is my working code; I just want to print a different message if there is nothing in the array after parsing all my data.
#add to array if condition is met
if ($2 == "SOURCE" && $4 == "RESTRICTED"){
sourceAndRestricted[$3]++;
}
#print out array
for (var in sourceAndRestricted){
printf "\t\t"var"\n"
}
I've tried something like this, but it is not working. Suggestions?
for (var in sourceAndRestricted){
if (var > 1){
printf "\t\t"var"\n"
}
else {
print "NONE"
}
}
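Check it with the length() function (length of an array is supported by GNU awk and several other implementations, but not by every awk):
if ( length(sourceAndRestricted) > 0 ) {
for (var in sourceAndRestricted)
printf "\t\t%s\n", var
}
else {
print "NONE"
}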
$ cat tst.awk
function isEmpty(arr, idx) {for (idx in arr) return 0; return 1}
BEGIN {
map[3] = 27
print isEmpty(map)
delete map[3]
print isEmpty(map)
}
$ awk -f tst.awk
0
1
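Applied to the question's data, the isEmpty function could be used in an END block roughly as follows. This is a sketch under assumptions: the wrapper name list_restricted_sources is made up, and the input is assumed to be whitespace-separated with the name in the third field, as in the question's snippet.

```shell
# Hypothetical wrapper: list restricted sources, or NONE if there are none.
list_restricted_sources() {
    awk '
    function isEmpty(arr,    idx) { for (idx in arr) return 0; return 1 }
    $2 == "SOURCE" && $4 == "RESTRICTED" { seen[$3]++ }
    END {
        if (isEmpty(seen)) {
            print "NONE"
        } else {
            for (name in seen) printf "\t\t%s\n", name
        }
    }' "$@"
}
```

Because isEmpty relies only on for (idx in arr), it works in any POSIX awk and does not depend on length() accepting an array.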
