I'm using the following to output the result of an upload speed test:
wput 10MB.zip ftp://user:pass@host 2>&1 | grep '\([0-9.]\+[KM]/s\)'
which returns:
18:14:38 (10MB.zip) - '10.49M/s' [10485760]
Transfered 10,485,760 bytes in 1 file at 10.23M/s
I'd like to have the result 10.23M/s (i.e. the speed) echoed, and a comparison result:
if speed=>5 MB/s then echo "pass" else echo "fail"
So, the final output would be:
PASS 7 M/s
23/01/2013
Ideally I'd like it all done on a single line. So far I've got:
wput 100M.bin ftp://test:test@0.0.0.0 2>&1 | grep -o '\([0-9.]\+[KM]/s\)$' | awk ' { if (($1 > 5) && ($2 == "M/s")) { printf("FAST %s\n ", $0); }}'
However, it doesn't output anything. If I remove
&& ($2 == "M/s"))
it works, but I obviously want it to output only above 5M/s, and as it is it would still echo FAST if it was over 1K/s. Can someone tell me what I've missed?
Using awk:
# Over 5M/s
$ cat pass
18:14:38 (10MB.zip) - '10.49M/s' [10485760]
Transfered 10,485,760 bytes in 1 file at 10.23M/s
$ awk 'END{f="FAIL "$NF;p="PASS "$NF;if($NF~/K\/s/){print f;exit};gsub(/M\/s/,"");print(int($NF)>5?p:f)}' pass
PASS 10.23M/s
# Under 5M/s
$ cat fail
18:14:38 (10MB.zip) - '3.49M/s' [10485760]
Transfered 10,485,760 bytes in 1 file at 3.23M/s
$ awk 'END{f="FAIL "$NF;p="PASS "$NF;if($NF~/K\/s/){print f;exit};gsub(/M\/s/,"");print(int($NF)>5?p:f)}' fail
FAIL 3.23M/s
# Also Handle K/s
$ cat slow
18:14:38 (10MB.zip) - '3.49M/s' [10485760]
Transfered 10,485,760 bytes in 1 file at 8.23K/s
$ awk 'END{f="FAIL "$NF;p="PASS "$NF;if($NF~/K\/s/){print f;exit};gsub(/M\/s/,"");print(int($NF)>5?p:f)}' slow
FAIL 8.23K/s
Not sure where you get 7 M/s from?
According to @Rubens, you can use grep -o with your regex to show only the speed; just append $ to anchor it at the end of the line:
wput 10MB.zip ftp://user:pass@host 2>&1 | grep -o '\([0-9.]\+[KM]/s\)$'
With perl you can easily do the remaining stuff
use strict;
use warnings;
while (<>) {
if (m!\s+((\d+\.\d+)([KM])/s)$!) {
if ($2 > 5 && $3 eq 'M') {
print "PASS $1\n";
} else {
print "FAIL $1\n";
}
}
}
and then call it:
wput 10MB.zip ftp://user:pass@host 2>&1 | perl script.pl
This is an answer to the question update.
With the awk program, you haven't split the speed into a numeric value and a unit. It is just one string, so $2 is empty and the comparison with "M/s" never succeeds.
Because a fast speed is greater than 5 M/s, you can ignore K/s and extract the speed by splitting at the character M. Then you have the number in $1 and can compare it:
wput 100M.bin ftp://test:test@0.0.0.0 2>&1 | grep -o '[0-9.]\+M/s$' | awk -F 'M' '{ if ($1 > 5) { printf("FAST %s\n", $0); }}'
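If you would rather keep K/s results and report them as failures instead of filtering them out, one option is to normalise the unit before comparing. The following is only a sketch: classify_speed is a made-up helper name, the 5 M/s threshold (inclusive) is taken from the question, and treating 1 M/s as 1024 K/s is an assumption.

```shell
# Hypothetical helper (not from the thread): classify one speed string,
# e.g. "10.23M/s" or "8.23K/s", against a 5 M/s threshold.
# Assumption: 1 M/s is treated as 1024 K/s.
classify_speed() {
    printf '%s\n' "$1" | awk -v min_mb=5 '
        {
            value = $0 + 0                        # numeric prefix, e.g. 10.23
            unit = substr($0, length($0) - 2, 1)  # the K or M before "/s"
            if (unit == "K")
                value = value / 1024              # convert K/s to M/s
            verdict = (value >= min_mb) ? "PASS" : "FAIL"
            print verdict, $0
        }'
}

classify_speed '10.23M/s'
classify_speed '8.23K/s'
```

You would feed it the last speed that grep -o extracts from the wput output.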
I am writing a function in a Bash shell script that should return the lines of CSV files (with headers) that have more commas than the header. This can happen because some values inside these files contain commas. For quality control, I must identify these lines so I can clean them up later. What I have currently:
#!/bin/bash
get_bad_lines () {
local correct_no_of_commas=$(head -n 1 $1/$1_0_0_0.csv | tr -cd , | wc -c)
local no_of_files=$(ls $1 | wc -l)
for i in $(seq 0 $(( ${no_of_files}-1 )))
do
# Check that the file exist
if [ ! -f "$1/$1_0_${i}_0.csv" ]; then
echo "File: $1_0_${i}_0.csv not found!"
continue
fi
# Search for error-lines inside the file and print them out
echo "$1_0_${i}_0.csv has over $correct_no_of_commas commas in the following lines:"
grep -o -n '[,]' "$1/$1_0_${i}_0.csv" | cut -d : -f 1 | uniq -c | awk '$1 > $correct_no_of_commas {print}'
done
}
get_bad_lines products
get_bad_lines users
The output of this program is currently the comma count and line number for every line in every file,
and I suspect this is because the input $1 (the folder name, i.e. products and users) conflicts with the reference to $1 in the call to awk (where I want the first column, i.e. the count of commas for that line in the current file in the loop).
Is this the issue? And if so, could it be solved by referring to either the first column or the folder name by a different variable name, instead of using $1 for both?
Example, current output:
5 6667
5 6668
5 6669
5 6670
(It should only show the lines of that file that have more than 5 commas.)
I also tried declaring a variable in the call to awk (as in the accepted answer to "Awk field variable clash with function argument"), with the same effect:
get_bad_lines () {
local table_name=$1
local correct_no_of_commas=$(head -n 1 $table_name/${table_name}_0_0_0.csv | tr -cd , | wc -c)
local no_of_files=$(ls $table_name | wc -l)
for i in $(seq 0 $(( ${no_of_files}-1 )))
do
# Check that the file exist
if [ ! -f "$table_name/${table_name}_0_${i}_0.csv" ]; then
echo "File: ${table_name}_0_${i}_0.csv not found!"
continue
fi
# Search for error-lines inside the file and print them out
echo "${table_name}_0_${i}_0.csv has over $correct_no_of_commas commas in the following lines:"
grep -o -n '[,]' "$table_name/${table_name}_0_${i}_0.csv" | cut -d : -f 1 | uniq -c | awk -v table_name="$table_name" '$1 > $correct_no_of_commas {print}'
done
}
You can do the whole job in awk:
get_bad_lines () {
find "$1" -maxdepth 1 -name "$1_0_*_0.csv" | while read -r my_file ; do
awk -v table_name="$1" '
NR==1 { num_comma=gsub(/,/, ""); }
/,/ { if (gsub(/,/, ",", $0) > num_comma) wrong_array[wrong++]=NR":"$0;}
END { if (wrong > 0) {
print(FILENAME" has over "num_comma" commas in the following lines:");
for (i=0;i<wrong;i++) { print(wrong_array[i]); }
}
}' "${my_file}"
done
}
As for why your original awk command failed to print only the lines with too many commas: you are using the shell variable correct_no_of_commas inside a single-quoted awk program ('$1 > $correct_no_of_commas {print}'). The shell therefore performs no substitution, and awk reads $correct_no_of_commas as is. awk looks up the variable correct_no_of_commas, which is undefined in the awk script and so evaluates to an empty string (0 in a numeric context). The condition becomes $1 > $0, which compares the count in $1 with the full input line. The full line has the form <padding><count> <line_number>, which does not look like a number, so awk falls back to a string comparison; the leading whitespace typically sorts before any digit, so $1 > $correct_no_of_commas is always true.
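If you want to keep the original grep pipeline, the usual way round this is to hand the shell value to awk with -v, so that awk sees it as one of its own variables. A minimal sketch, assuming a grep that supports -o as in the question; the function name lines_over_comma_limit and the awk variable limit are made up for illustration:

```shell
# Hypothetical helper: print "comma_count line_number" for every line of
# file $1 that contains more than $2 commas.
lines_over_comma_limit() {
    grep -o -n ',' "$1" | cut -d : -f 1 | uniq -c |
        awk -v limit="$2" '$1 > limit { print $1, $2 }'
}

# Usage inside the loop from the question might look like:
#   lines_over_comma_limit "$1/$1_0_${i}_0.csv" "$correct_no_of_commas"
```

Inside the single quotes, $1 and $2 are awk fields and limit is an awk variable; the function arguments are only expanded in the double-quoted parts, so the two meanings of $1 never clash.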
You can identify all the bad lines with a single awk command (ENDFILE requires GNU awk):
awk -F, 'FNR==1{print FILENAME; headerCount=NF;} NF>headerCount{print} ENDFILE{print "#######\n"}' /path/here/*.csv
If you also want the line number to be printed, use this:
awk -F, 'FNR==1{print FILENAME"\nLine#\tLine"; headerCount=NF;} NF>headerCount{print FNR"\t"$0} ENDFILE{print "#######\n"}' /path/here/*.csv
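If GNU awk is not available, a variant without ENDFILE is possible by prefixing each bad line with its file name and line number instead of printing a separator per file. This is only a sketch; report_bad_lines is a made-up name, and like the commands above it simply counts comma-separated fields, so it makes no attempt to understand quoted CSV values.

```shell
# Hypothetical helper: for each CSV file given, print "file:line_number:line"
# for every line that has more comma-separated fields than that file's header.
report_bad_lines() {
    awk -F, '
        FNR == 1 { header_fields = NF; next }   # first line of each file is its header
        NF > header_fields { print FILENAME ":" FNR ":" $0 }
    ' "$@"
}

# Usage: report_bad_lines /path/here/*.csv
```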
I have a scenario where I want to hash some columns of a CSV file.
How can I do that with the data below?
ID|NAME|CITY|AGE
1|AB1|BBC|12
2|AB2|FGD|17
3|AB3|ASD|18
4|AB4|SDF|19
5|AB5|ASC|22
The NAME and AGE columns should get hashed with random values,
like the output below:
ID|NAME|CITY|AGE
1|68b329da9111314099c7d8ad5cb9c940|BBC|77bAD9da9893er34099c7d8ad5cb9c940
2|69b32fga9893e34099c7d8ad5cb9c940|FGD|68bAD9da989yue34099c7d8ad5cb9c940
3|46b329da9893e3403453d8ad5cb9c940|ASD|60bfgD9da9893e34099c7d8ad5cb9c940
4|50Cd29da9893e34099c7d8ad5cb9c940|SDF|67bAD9da98973e34099c7d8ad5cb9c940
5|67bAD9da9893e34099c7d8ad5cb9c940|ASC|67bAD9da11893e34099c7d8ad5cb9c940
When I tested the code below, it gave me the same value for the NAME column on every line; it should give randomized values:
awk '{
tmp="echo " $2 " | openssl md5 | cut -f2 -d\" \""
tmp | getline cksum
close(tmp)
$2=cksum
print
}' < sample.csv
Output:
68b329da9893e34099c7d8ad5cb9c940
68b329da9893e34099c7d8ad5cb9c940
68b329da9893e34099c7d8ad5cb9c940
68b329da9893e34099c7d8ad5cb9c940
68b329da9893e34099c7d8ad5cb9c940
68b329da9893e34099c7d8ad5cb9c940
You may use it like this:
awk 'function hash(s, cmd, hex, line) {
cmd = "openssl md5 <<< \"" s "\""
if ( (cmd | getline line) > 0)
hex = line
close(cmd)
return hex
}
BEGIN {
FS = OFS = "|"
}
NR == 1 {
print
next
}
{
print $1, hash($2), $3, hash($4)
}' file
ID|NAME|CITY|AGE
1|d44aec35a11ff6fa8a800120dbef1cd7|BBC|2737b49252e2a4c0fe4c342e92b13285
2|157aa4a48373eaf0415ea4229b3d4421|FGD|4d095eeac8ed659b1ce69dcef32ed0dc
3|ba3c08d4a65f1baa1d7220a6802b5710|ASD|cf4278314ef8e4b996e1b798d8eb92cf
4|69be622e1c0d417ceb9b8fb0aa9dc574|SDF|3bb50ff8eeb7ad116724b56a820139fa
5|427872b1ac3a22dc154688ddc2050516|ASC|2fc57d6f63a9ee7e2f21a26fa522e3b6
You have to specify | as input and output field separators. Otherwise $2 is not what you expect, but an empty string.
awk -F '|' -v "OFS=|" 'FNR==1 { print; next } {
tmp="echo " $2 " | openssl md5 | cut -f2 -d\" \""
tmp | getline cksum
close(tmp)
$2=cksum
print
}' sample.csv
prints
ID|NAME|CITY|AGE
1|d44aec35a11ff6fa8a800120dbef1cd7|BBC|12
2|157aa4a48373eaf0415ea4229b3d4421|FGD|17
3|ba3c08d4a65f1baa1d7220a6802b5710|ASD|18
4|69be622e1c0d417ceb9b8fb0aa9dc574|SDF|19
5|427872b1ac3a22dc154688ddc2050516|ASC|22
Example using GNU datamash to do the hashing and some awk to rearrange the columns it outputs:
$ datamash -t'|' --header-in -f md5 2,4 < input.txt | awk 'BEGIN { FS=OFS="|"; print "ID|NAME|CITY|AGE" } { print $1, $5, $3, $6 }'
ID|NAME|CITY|AGE
1|1109867462b2f0f0470df8386036243c|BBC|c20ad4d76fe97759aa27a0c99bff6710
2|14da3a611e2f8953d76b6fb7866b01d1|FGD|70efdf2ec9b086079795c442636b55fb
3|710a24b9eac0692b1adaabd07726211a|ASD|6f4922f45568161a8cdf4ad2299f6d23
4|c4d15b255ef3c6a89d1fe2e6a26b8eda|SDF|1f0e3dad99908345f7439f8ffabdffc4
5|96b24a28173a75cc3c682e25d3a6bd49|ASC|b6d767d2f8ed5d21a44b0e5886680cb9
Note that the MD5 hashes in this answer are different from the ones in the other answers (at the time of writing); that's because they use approaches that add a trailing newline to the strings being hashed, producing incorrect results if you want the exact hash:
$ echo AB1 | md5sum
d44aec35a11ff6fa8a800120dbef1cd7 -
$ echo -n AB1 | md5sum
1109867462b2f0f0470df8386036243c -
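Since echo -n is not portable across shells, printf '%s' is a common way to feed a string to the hasher without the trailing newline. A small sketch, assuming md5sum from GNU coreutils is installed; md5_of_string is a made-up helper name:

```shell
# Hypothetical helper: MD5 of exactly the characters in $1, with no trailing newline.
md5_of_string() {
    printf '%s' "$1" | md5sum | cut -d ' ' -f 1   # drop the "  -" file name column
}

md5_of_string AB1
```

For AB1 this gives the same hash as the echo -n example above.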
You might consider using a language that has support for md5 included, or at least cache the md5 results (I assume that the city and age have a limited domain, which is smaller than the number of lines).
Perl has support for md5 out of the box:
perl -M'Digest::MD5 qw(md5_hex)' -F'\|' -le 'if (2..eof) {
$F[$_] = md5_hex($F[$_]) for (1,3);
print join "|",@F
} else { print }'
online demo: https://ideone.com/xg6cxZ (to my surprise ideone has perl available in bash)
Digest::MD5 is a core module, any perl installation should have it
-M'Digest::MD5 qw(md5_hex)' - this loads the md5_hex function
-l handle line endings
-F'\|' - autosplit fields on | (this implies -a and -n)
2..eof - range operator (or flip-flop as some want to call it) - true between line 2 and end of the file
$F[$_] = md5_hex($F[$_]) - replace field $_ with its md5 sum
for (1,3) - statement modifier runs the statement for 1 and 3 aliasing $_ to them
print join "|",@F - print the modified fields
else { print } - this handles the header
Note about speed: on my machine this processes ~100,000 lines in about 100 ms, compared with an awk variant of this answer that does 5000 lines in ~1 minute 14 seconds (I wasn't patient enough to wait for 100,000 lines)
$ time perl -M'Digest::MD5 qw(md5_hex)' -F'\|' -le 'if (2..eof) { $F[$_] = md5_hex($F[$_]) for (1,3);print join "|",@F } else { print }' <sample2.txt > out4.txt
real 0m0.121s
user 0m0.118s
sys 0m0.003s
$ time awk -F'|' -v OFS='|' -i md5.awk '{ print $1,md5($2),$3,md5($4) }' <(head -5000 sample2.txt) >out2.txt
real 1m14.205s
user 0m50.405s
sys 0m35.340s
md5.awk defines the md5 function as such:
$ cat md5.awk
function md5(str, cmd, l, hex) {
cmd= "/bin/echo -n "str" | openssl md5 -r"
if ( ( cmd | getline l) > 0 )
hex = substr(l,1,32)
close(cmd)
return hex
}
I'm using /bin/echo because there are some variants of shell where echo doesn't have -n
I'm using -n mostly because I want to be able to compare the results with the perl results
substr(l,1,32) - on my machine openssl md5 doesn't return just the sum, it also includes the file name - see: https://ideone.com/KGMWPe - substr keeps only the relevant part (awk strings are 1-indexed, so the start position is 1; a start of 0 would return only 31 characters in gawk)
I'm using a separate file because it seems much cleaner, and because I can switch between function implementations fairly easy
As I was saying in the beginning, if you really want to use awk, at least cache the result of the openssl tool.
$ cat md5memo.awk
function md5(str, cmd, l, hex) {
if (cache[str])
return cache[str]
cmd= "/bin/echo -n "str" | openssl md5 -r"
if ( ( cmd | getline l) > 0 )
hex = substr(l,1,32)
close(cmd)
cache[str] = hex
return hex
}
With the above caching, the results improve dramatically:
$ time awk -F'|' -v OFS='|' -i md5memo.awk '{ print $1,md5($2),$3,md5($4) }' <(head -5000 sample2.txt) >outmemo.txt
real 0m0.192s
user 0m0.141s
sys 0m0.085s
$ time awk -F'|' -v OFS='|' -i md5memo.awk '{ print $1,md5($2),$3,md5($4) }' <sample2.txt >outmemof.txt
real 0m0.281s
user 0m0.222s
sys 0m0.088s
However, your mileage may vary: sample2.txt has 100000 lines, with 5 different values for $2 and 40 different values for $4. Real life data may vary!
Note: I just realized that my awk implementation doesn't handle headers, but you can get that from the other answers
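For completeness, here is a sketch that combines the caching idea with header handling in plain awk (no -i include file). It is not the code that was timed above: it assumes md5sum from GNU coreutils instead of openssl, the function name hash_name_and_age is made up, and field values containing a single quote would break the shell command it builds.

```shell
# Hypothetical helper: hash columns 2 and 4 of |-separated input, keep the header,
# and call the external md5sum only once per distinct value.
hash_name_and_age() {
    awk -F '|' -v OFS='|' '
        function md5(str,    cmd, line) {
            if (str in cache)              # already hashed: skip the external call
                return cache[str]
            cmd = "printf %s '\''" str "'\'' | md5sum"
            if ((cmd | getline line) > 0)
                cache[str] = substr(line, 1, 32)   # md5sum prints "<hash>  -"
            close(cmd)
            return cache[str]
        }
        NR == 1 { print; next }            # pass the header through unchanged
        { print $1, md5($2), $3, md5($4) }
    ' "$@"
}

# Usage: hash_name_and_age sample.csv
```

Testing with str in cache rather than cache[str] avoids creating an empty array entry on every lookup.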
I have a file example.txt with about 3000 lines with a string in each line. A small file example would be:
>cat example.txt
saudifh
sometestPOIFJEJ
sometextASLKJND
saudifh
sometextASLKJND
IHFEW
foo
bar
I want to check all repeated lines in this file and output them. The desired output would be:
>checkRepetitions.sh
found two equal lines: index1=1 , index2=4 , value=saudifh
found two equal lines: index1=3 , index2=5 , value=sometextASLKJND
I made a script checkRepetitions.sh:
#!/bin/bash
size=$(cat example.txt | wc -l)
for i in $(seq 1 $size); do
i_next=$((i+1))
line1=$(cat example.txt | head -n$i | tail -n1)
for j in $(seq $i_next $size); do
line2=$(cat example.txt | head -n$j | tail -n1)
if [ "$line1" = "$line2" ]; then
echo "found two equal lines: index1=$i , index2=$j , value=$line1"
fi
done
done
However, this script is very slow: it takes more than 10 minutes to run. In Python it takes less than 5 seconds... I tried to store the file in memory by doing lines=$(cat example.txt) and doing line1=$(cat $lines | cut -d',' -f$i) but this is still very slow...
If you do not want to use awk (a good tool for the job, since it parses the input only once),
you can run through the lines several times. Sorting is expensive, but this solution avoids the loops you tried.
grep -Fnxf <(uniq -d <(sort example.txt)) example.txt
With uniq -d <(sort example.txt) you find all lines that occur more than once. grep then searches for these lines (option -f) as complete lines (-x), as fixed strings rather than regular expressions (-F), and shows the line number of each match (-n).
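The same pipeline can be written without process substitution, which is a bash feature. A sketch, assuming a grep that reads the pattern list from standard input when given -f - (GNU grep documents this); show_repeats is a made-up name:

```shell
# Hypothetical helper: print "line_number:line" for every line of file $1
# whose content occurs more than once in that file.
show_repeats() {
    sort "$1" | uniq -d | grep -Fnxf - "$1"
}

# Usage: show_repeats example.txt
```

With the example.txt from the question this should list lines 1, 3, 4 and 5. The output is ordered by line number rather than grouped by value, so it is not in the exact format the question asks for.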
See "Why is using a shell loop to process text considered bad practice?" for some of the reasons why your script is so slow.
$ cat tst.awk
{ val2hits[$0] = val2hits[$0] FS NR }
END {
for (val in val2hits) {
numHits = split(val2hits[val],hits)
if ( numHits > 1 ) {
printf "found %d equal lines:", numHits
for ( hitNr=1; hitNr<=numHits; hitNr++ ) {
printf " index%d=%d ,", hitNr, hits[hitNr]
}
print " value=" val
}
}
}
$ awk -f tst.awk file
found 2 equal lines: index1=1 , index2=4 , value=saudifh
found 2 equal lines: index1=3 , index2=5 , value=sometextASLKJND
To give you an idea of the performance difference between a bash script that's written to be as efficient as possible and an equivalent awk script:
bash:
$ cat tst.sh
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac
# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0
while IFS= read -r line; do
(( ++lineNum ))
if [[ ${lines[$line]} ]]; then
printf 'Content previously seen on line %s also seen on line %s: %s\n' \
"${lines[$line]}" "$lineNum" "$line"
fi
lines[$line]=$lineNum
done < "$1"
$ time ./tst.sh file100k > ou.sh
real 0m15.631s
user 0m13.806s
sys 0m1.029s
awk:
$ cat tst.awk
lines[$0] {
printf "Content previously seen on line %s also seen on line %s: %s\n", \
lines[$0], NR, $0
}
{ lines[$0]=NR }
$ time awk -f tst.awk file100k > ou.awk
real 0m0.234s
user 0m0.218s
sys 0m0.016s
There are no differences in the output of both scripts:
$ diff ou.sh ou.awk
$
The above is using 3rd-run timing to avoid caching issues and being tested against a file generated by the following awk script:
awk 'BEGIN{for (i=1; i<=10000; i++) for (j=1; j<=10; j++) print j}' > file100k
When the input file had zero duplicate lines (generated by seq 100000 > nodups100k) the bash script executed in about the same amount of time as it did above while the awk script executed much faster than it did above:
$ time ./tst.sh nodups100k > ou.sh
real 0m15.179s
user 0m13.322s
sys 0m1.278s
$ time awk -f tst.awk nodups100k > ou.awk
real 0m0.078s
user 0m0.046s
sys 0m0.015s
To demonstrate a relatively efficient (within the limits of the language and runtime) native-bash approach, which you can see running in an online interpreter at https://ideone.com/iFpJr7:
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac
# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0
while IFS= read -r line; do
lineNum=$(( lineNum + 1 ))
if [[ ${lines[$line]} ]]; then
printf 'found two equal lines: index1=%s, index2=%s, value=%s\n' \
"${lines[$line]}" "$lineNum" "$line"
fi
lines[$line]=$lineNum
done <example.txt
Note the use of while read to iterate line-by-line, as described in BashFAQ #1: How can I read a file line-by-line (or field-by-field)?; this permits us to open the file only once and read through it without needing any command substitutions (which fork off subshells) or external commands (which need to be individually started up by the operating system every time they're invoked, and are likewise expensive).
The other part of the improvement here is that we're reading the whole file only once -- implementing an O(n) algorithm -- as opposed to running O(n^2) comparisons as the original code did.
I have tried this:
dirs=$1
for dir in $dirs
do
ls -R $dir
done
Like this?:
$ cat > foo
this
nope
$ cat > bar
neither
this
$ sort *|uniq -c
1 neither
1 nope
2 this
and weed out the ones with just 1s:
... | awk '$1>1'
2 this
Use sort with uniq to find the duplicate lines.
#!/bin/bash
dirs=("$@")
for dir in "${dirs[@]}" ; do
cat "$dir"/*
done | sort | uniq -c | sort -n | tail -n1
uniq -c will prepend the number of occurrences to each line
sort -n will sort the lines by the number of occurrences
tail -n1 will only output the last line, i.e. the maximum. If you want to see all the lines with the same number of duplicates, add the following instead of tail:
perl -ane 'if ($F[0] == $n) { push @buff, $_ }
else { @buff = $_ }
$n = $F[0];
END { print for @buff }'
You could use awk. If you just want to "count the duplicate lines", we could infer that you're after "all lines which have appeared earlier in the same file". The following would produce these counts:
#!/bin/sh
for file in "$@"; do
if [ -s "$file" ]; then
awk '$0 in a {c++} {a[$0]} END {printf "%s: %d\n", FILENAME, c}' "$file"
fi
done
The awk script first checks to see if the current line is stored in the array a, and if it does, increments a counter. Then it adds the line to its array. At the end of the file, we print the total.
Note that this might have problems on very large files, since the entire input file needs to be read into memory in the array.
Example:
$ printf 'foo\nbar\nthis\nbar\nthat\nbar\n' > inp.txt
$ awk '$0 in a {c++} {a[$0]} END {printf "%s: %d\n", FILENAME, c}' inp.txt
inp.txt: 2
The word 'bar' appears three times in the file, thus there are two duplicates.
To aggregate multiple files, you can just feed multiple files to awk:
$ printf 'foo\nbar\nthis\nbar\n' > inp1.txt
$ printf 'red\nblue\ngreen\nbar\n' > inp2.txt
$ awk '$0 in a {c++} {a[$0]} END {print c}' inp1.txt inp2.txt
2
For this, the word 'bar' appears twice in the first file and once in the second file -- a total of three times, thus we still have two duplicates.
How can I save the contents of the variable sum in this operation?
$ seq 1 5 | awk '{sum+=$1} end {print sum; echo "$sum" > test_file}'
It looks like you're confusing BASH syntax and Awk. Awk is a programming language, and it has very different syntax from BASH.
$ seq 1 5 | awk '{ sum += $1 } END { print sum }'
15
You want to capture that 15 into a file:
$ seq 1 5 | awk '{ sum += $1 } END { print sum }' > test_file
That is using the shell's redirection. The > appears outside of the Awk program where the shell has control, and redirects standard out into the file test_file.
You can also redirect inside Awk. This is Awk's own redirection, although it uses the same syntax as the shell:
$ seq 1 5 | awk '{ sum += $1 } END { print sum > "test_file" }'
Note that the file name has to be quoted, or Awk will assume that test_file is a variable, and you'll get some error about redirecting to a null file name.
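The fact that an unquoted name is treated as an Awk variable can also be used on purpose: pass the file name in with -v and redirect to that variable. A minimal sketch; sum_to_file and out are made-up names, and note that -v interprets backslash escapes in the value, so this assumes a path without backslashes.

```shell
# Hypothetical helper: sum the first column of standard input and
# write the total to the file named by $1.
sum_to_file() {
    awk -v out="$1" '{ sum += $1 } END { print sum > out }'
}

# Usage: seq 1 5 | sum_to_file test_file
```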
To write your output into a file, you have to redirect to "test_file" like this:
$ seq 5 | awk '{sum+=$1} END{print sum > "test_file"}'
$ cat test_file
15
Your version was not working because you were not quoting test_file, so for awk it was considered a variable. And as you have not defined it beforehand, awk couldn't redirect properly. David W's answer explains it pretty well.
Note also that seq 5 is equivalent to seq 1 5.
In case you want to save the result into a variable, you can use the var=$(command) syntax:
$ sum=$(seq 5 | awk '{sum+=$1} END{print sum}')
$ echo $sum
15
echo won't work in the awk command. Try this:
seq 1 5 | awk '{sum+=$1} END {print sum > "test_file"}'
You don't need awk for this. You can say:
$ seq 5 | paste -sd+ | bc > test_file
$ cat test_file
15
This question is tagged with bash so here is a pure bash solution:
for ((i=1; i<=5; i++)); do ((sum+=i)); done; echo "$sum" > 'test_file'
Or this one:
for i in {1..5}; do ((sum+=i)); done; echo "$sum" > 'test_file'
http://sed.sourceforge.net/grabbag/scripts/add_decs.sed
#! /bin/sed -f
# This is an alternative approach to summing numbers,
# which works a digit at a time and hence has unlimited
# precision. This time it is done with lookup tables,
# and uses only 10 commands.
G
s/\n/-/
s/$/-/
s/$/;9aaaaaaaaa98aaaaaaaa87aaaaaaa76aaaaaa65aaaaa54aaaa43aaa32aa21a100/
:loop
/^--[^a]/!{
# Convert next digit from both terms into analog form
# and put the two groups next to each other
s/^\([0-9a]*\)\([0-9]\)-\([^-]*\)-\(.*;.*\2\(a*\)\2.*\)/\1-\3-\5\4/
s/^\([^-]*\)-\([0-9a]*\)\([0-9]\)-\(.*;.*\3\(a*\)\3.*\)/\1-\2-\5\4/
# Back to decimal, but keeping the carry in analog form
# \2 matches an `a' if there are at least ten a's, else nothing
#
# 1------------- 3- 4----------------------
# 2 5----
s/-\(aaaaaaaaa\(a\)\)\{0,1\}\(a*\)\([0-9b]*;.*\([0-9]\)\3\5\)/-\2\5\4/
b loop
}
s/^--\([^;]*\);.*/\1/
h