Extract values from a text file in Linux

I have a log file generated by sqlldr, and I was wondering if I can write a shell script to extract the following values from the log below on Linux. Thanks
Table_name: TEST
Row_load: 100
Row_fail: 10
Date_run: Feb 07, 2014
Table TEST:
100 Rows successfully loaded.
10 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 14486
Total logical records rejected: 0
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 3
Total stream buffers loaded by SQL*Loader load thread: 0
Run began on Fri Feb 07 12:21:24 2014
Run ended on Fri Feb 07 12:21:31 2014
Elapsed time was: 00:00:06.88
CPU time was: 00:00:00.11

If the structure of your log file is always the same, then you can do the following with awk:
awk '
NR==1 { sub(/:/,x); print "Table_name: "$NF}
NR==2 { print "Row_load: " $1}
NR==3 { print "Row_fail: " $1}
/Run ended/ { print "Date_run: "$5 FS $6","$8}' file
Output
$ awk '
NR==1 { sub(/:/,x); print "Table_name: "$NF}
NR==2 { print "Row_load: " $1}
NR==3 { print "Row_fail: " $1}
/Run ended/ { print "Date_run: "$5 FS $6","$8}' file
Table_name: TEST
Row_load: 100
Row_fail: 10
Date_run: Feb 07,2014
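If the header lines can shift position, a pattern-based variant avoids hard-coded line numbers (a sketch, keyed on the phrasing shown in the log above):
awk '
/^Table/                        { sub(/:$/, "", $2); print "Table_name: " $2 }
/successfully loaded/           { print "Row_load: " $1 }
/not loaded due to data errors/ { print "Row_fail: " $1 }
/Run ended/                     { print "Date_run: " $5 FS $6 ", " $8 }' file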

Related

Convert a text into time format using bash script

I am new to shell scripting. I have a tab-separated file, e.g.:
0018803 01 1710 2050 002571
0018951 01 1934 2525 003277
0019362 02 2404 2415 002829
0019392 01 2621 2820 001924
0019542 01 2208 2413 003434
0019583 01 1815 2134 002971
Here, the 3rd and 4th columns represent Start Time and End Time.
I want to convert these two columns into a proper time format so that I can get a 6th column with the exact time difference between column 4 and column 3 in hours and minutes.
The column 6 results would be 3:40, 5:51, 00:11, 1:59, 2:05, and 3:19.
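For example, for the first row: 2050 is 20×60+50 = 1250 minutes past midnight and 1710 is 17×60+10 = 1030 minutes, so the difference is 220 minutes, i.e. 3:40.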
One way with awk:
$ cat test.awk
# split an HHMM field into hour (h[0]) and minute (h[1])
function f(h, x) {
    h[0] = substr(x,1,2)+0
    h[1] = substr(x,3,2)+0
}
{
    f(start, $3);
    f(end, $4);
    # borrow an hour when the minute difference would be negative
    span = end[1] - start[1] >= 0 \
        ? sprintf("%d:%02d", end[0]-start[0], end[1]-start[1]) \
        : sprintf("%d:%02d", end[0]-start[0]-1, 60+end[1]-start[1]);
    print $0 OFS span
}
Then run the awk file as follows:
$ awk -f test.awk input_file
Edit: per @glenn jackman's suggestion, the code can be simplified (refer to @Kamil Cuk's method):
# convert an HHMM field to total minutes
function g(x) {
    return substr(x,1,2)*60 + substr(x,3,2)
}
{
    span = g($4) - g($3)
    printf("%s%s%d:%02d\n", $0, OFS, int(span/60), span%60)
}
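Saved as, say, simple.awk (a hypothetical file name), it runs the same way:
$ awk -f simple.awk input_file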
A simple bash solution using arithmetic expansion (the 10# prefix forces base 10, so fields like 08 or 09 are not misread as invalid octal numbers):
while IFS='' read -r l; do
    IFS=' ' read -r _ _ st et _ <<<"$l"
    d=$(( (10#${et:0:2} * 60 + 10#${et:2:2}) - (10#${st:0:2} * 60 + 10#${st:2:2}) ))
    printf "%s %02d:%02d\n" "$l" "$((d/60))" "$((d%60))"
done < input_file_path
will output:
0018803 01 1710 2050 002571 03:40
0018951 01 1934 2525 003277 05:51
0019362 02 2404 2415 002829 00:11
0019392 01 2621 2820 001924 01:59
0019542 01 2208 2413 003434 02:05
0019583 01 1815 2134 002971 03:19
Here is one in GNU awk using the time functions mktime (to convert to epoch seconds) and strftime (to print the difference in the desired HH:MM format):
$ awk -v OFS="\t" '{
    dt3 = "1970 01 01 " substr($3,1,2) " " substr($3,3,2) " 00"
    dt4 = "1970 01 01 " substr($4,1,2) " " substr($4,3,2) " 00"
    # the trailing 1 asks strftime for UTC -- thanks @glennjackman :)
    print $0, strftime("%H:%M", mktime(dt4)-mktime(dt3), 1)
}' file
Output ($6 only):
03:40
05:51
00:11
01:59
02:05
03:19

bash - How to break a long string by word count into multiple strings

I would like to break a long string by word count, displaying it and breaking again every time a certain number of words is reached.
For example.
I have a string :
value="Aug 04 03:49:00.082205 ALERT IPX-NG dropped -- total: 4693845, count: 39254, rate: 1.88% ; OUTPUT QUEUE frampedd: active=1, delivered=73265000210 Aug 04 09:43:00.795817 ALERT IPX-NG dropped -- total: 4765909, count: 72064, rate: 1.91% ; OUTPUT QUEUE frampedd: active=0, delivered=74220627600"
my expected output is :
Aug 04 03:49:00.082205 ALERT IPX-NG dropped -- total: 4693845, count: 39254, rate: 1.88% ; OUTPUT QUEUE frampedd: active=1, delivered=73265000210
Aug 04 09:43:00.795817 ALERT IPX-NG dropped -- total: 4765909, count: 72064, rate: 1.91% ; OUTPUT QUEUE frampedd: active=0, delivered=74220627600
I couldn't use character count, as the lengths vary, so the best choice is to use word count.
EDIT:
Hi guys, I tried using the sed command and it seems to work!
sed -E 's/([^[:space:]]{1,}[[:space:]]{1,}){19}/&\n/'
Thanks to those who helped. You can give me better suggestions if there are any :D .. I would hope for a pure bash command, as I'm unable to install any other extension on the server.
If the word count per record doesn't change, you can use xargs:
$ value="Aug 04 03:49:00.082205 ALERT IPX-NG dropped -- total: 4693845, count: 39254, rate: 1.88% ; OUTPUT QUEUE frampedd: active=1, delivered=73265000210 Aug 04 09:43:00.795817 ALERT IPX-NG dropped -- total: 4765909, count: 72064, rate: 1.91% ; OUTPUT QUEUE frampedd: active=0, delivered=74220627600"
$ xargs -n 19 <<< "$value"
Aug 04 03:49:00.082205 ALERT IPX-NG dropped -- total: 4693845, count: 39254, rate: 1.88% ; OUTPUT QUEUE frampedd: active=1, delivered=73265000210
Aug 04 09:43:00.795817 ALERT IPX-NG dropped -- total: 4765909, count: 72064, rate: 1.91% ; OUTPUT QUEUE frampedd: active=0, delivered=74220627600
The xargs man page says this about the -n flag:
-n max-args, --max-args=max-args
       Use at most max-args arguments per command line.  Fewer than
       max-args arguments will be used if the size (see the -s
       option) is exceeded, unless the -x option is given, in which
       case xargs will exit.
In AWK:
awk '{gsub(/^value="|"$/,""); for(i=1;i<=NF;i++) printf "%s%s",$i,(i%19?" ":"\n")}' file
or, if your string does not in fact start with value=", you can lose the gsub:
awk '{for(i=1;i<=NF;i++) printf "%s%s",$i,(i%19?" ":"\n")}' file
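Since the question asks for pure bash, here is a minimal sketch using array slicing (assuming records are always exactly 19 words, as above):
# split the string into an array of words, then print 19 words per line
read -ra words <<< "$value"
for ((i=0; i<${#words[@]}; i+=19)); do
    echo "${words[@]:i:19}"
done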

how to extract numbers in the same location from many log files

I have a file test1.log:
04/15/2016 02:22:46 PM - kneaddata.knead_data - INFO: Running kneaddata v0.5.1
04/15/2016 02:22:46 PM - kneaddata.utilities - INFO: Decompressing gzipped file ...
Input Reads: 69766650 Surviving: 55798391 (79.98%) Dropped: 13968259 (20.02%)
TrimmomaticSE: Completed successfully
04/15/2016 02:32:04 PM - kneaddata.utilities - DEBUG: Checking output file from Trimmomatic : /home/liaoming/kneaddata_v0.5.1/WGC066610D/WGC066610D_kneaddata.trimmed.fastq
04/15/2016 05:32:31 PM - kneaddata.utilities - DEBUG: 55798391 reads; of these:
55798391 (100.00%) were unpaired; of these:
55775635 (99.96%) aligned 0 times
17313 (0.03%) aligned exactly 1 time
5443 (0.01%) aligned >1 times
0.04% overall alignment rate
and other files in the same format but with different contents, like test2.log, test3.log, up to test60.log.
I would like to extract two numbers from each of these files. For example, in test1.log the two numbers would be 55798391 and 55775635.
So the final generated file counts.txt would be something like this:
test1 55798391 55775635
test2 51000000 40000000
.....
test60 5000000 30000000
awk to the rescue!
$ awk 'FNR==9{f=$1} FNR==10{print FILENAME,f,$1}' test{1..60}.log
if not in the same directory, either call within a loop or create the file list and pipe to xargs awk
$ for i in {1..60}; do awk ... test$i/test$i.log; done
$ for i in {1..60}; do echo test$i/test$i.log; done | xargs awk ...
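To match the requested counts.txt format exactly, you could also strip the .log suffix from FILENAME (a sketch, keeping the answer's line-number logic):
$ awk 'FNR==9{f=$1} FNR==10{name=FILENAME; sub(/\.log$/,"",name); print name,f,$1}' test{1..60}.log > counts.txt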

how to restore a dump of all memcache keys stored in a file using memcached-tool correctly?

Step 1. taking dump of memcached keys from localhost
shub@S04:/usr/share/memcached/scripts$ ./memcached-tool localhost:11211 dump > /tmp/backup.log
Dumping memcache contents
Number of buckets: 1
Number of items : 4
Dumping bucket 1 - 4 total items
Step 2. restoring dump to one of the internal server
shub@S04:/usr/share/memcached$ nc 10.0.2.182 11112 < /tmp/test.log
STORED
STORED
STORED
STORED
Step 3. But when I ran stats, I only found 1 item, whereas 4 items were restored in the above command.
shub@S04:/usr/share/memcached/scripts$ echo "stats items" | nc 10.0.2.182 11112
STAT items:1:number 1
STAT items:1:age 588
STAT items:1:evicted 0
STAT items:1:evicted_nonzero 0
STAT items:1:evicted_time 0
STAT items:1:outofmemory 0
STAT items:1:tailrepairs 0
STAT items:1:reclaimed 24
STAT items:1:expired_unfetched 24
STAT items:1:evicted_unfetched 0
END
So I want a command that will restore the complete dump; I think it is overwriting data in the same slab.
The dump you exported looks like this:
add key 32 timestamp 135\r\n
value\r\n
When you restore this data to a new memcached instance, the original expiration timestamp may already be in the past, so the items expire immediately. You must change the timestamp to 0 (which means never expire) or to some timestamp in the future.
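For example (a sketch, assuming the dump uses the standard add <key> <flags> <exptime> <bytes> format shown above), you could zero the expiration field before replaying the file:
# replace the 4th field (exptime) of every add line with 0, then replay
sed -E 's/^(add [^ ]+ [0-9]+) [0-9]+/\1 0/' /tmp/backup.log > /tmp/restore.log
nc 10.0.2.182 11112 < /tmp/restore.log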

How to get file creation date/time in Bash/Debian?

I'm using Bash on Debian GNU/Linux 6.0. Is it possible to get the file creation date/time? Not the modification date/time.
ls -lh a.txt and stat -c %y a.txt both only give the modification time.
Unfortunately your quest won't be possible in general, as POSIX defines only 3 distinct time values to be stored for each of your files (see Base Definitions section 4.8, File Times Update):
Each file has three distinct associated timestamps: the time of last
data access, the time of last data modification, and the time the file
status last changed. These values are returned in the file
characteristics structure struct stat, as described in <sys/stat.h>.
EDIT: As mentioned in the comments below, depending on the filesystem used, the metadata may contain the file creation date. Note, however, that storing such information is non-standard. Depending on it may lead to portability problems when moving to another filesystem, even if the filesystem you currently use happens to store it.
ls -i file #output is for me 68551981
debugfs -R 'stat <68551981>' /dev/sda3 # /dev/sda3 is the disk on which the file exists
#results - crtime value
[root@loft9156 ~]# debugfs -R 'stat <68551981>' /dev/sda3
debugfs 1.41.12 (17-May-2010)
Inode: 68551981 Type: regular Mode: 0644 Flags: 0x80000
Generation: 769802755 Version: 0x00000000:00000001
User: 0 Group: 0 Size: 38973440
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 76128
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x526931d7:1697cce0 -- Thu Oct 24 16:42:31 2013
atime: 0x52691f4d:7694eda4 -- Thu Oct 24 15:23:25 2013
mtime: 0x526931d7:1697cce0 -- Thu Oct 24 16:42:31 2013
**crtime: 0x52691f4d:7694eda4 -- Thu Oct 24 15:23:25 2013**
Size of extra inode fields: 28
EXTENTS:
(0-511): 352633728-352634239, (512-1023): 352634368-352634879, (1024-2047): 288392192-288393215, (2048-4095): 355803136-355805183, (4096-6143): 357941248-357943295, (6144-9514): 357961728-357965098
mikyra's answer is good; the facts are just as described there.
[jason#rh5 test]$ stat test.txt
File: `test.txt'
Size: 0 Blocks: 8 IO Block: 4096 regular empty file
Device: 802h/2050d Inode: 588720 Links: 1
Access: (0664/-rw-rw-r--) Uid: ( 500/ jason) Gid: ( 500/ jason)
Access: 2013-03-14 01:58:12.000000000 -0700
Modify: 2013-03-14 01:58:12.000000000 -0700
Change: 2013-03-14 01:58:12.000000000 -0700
If you want to track which file was created first, you can structure your file names by appending the system date when you create a series of files.
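For example (a hypothetical naming scheme):
touch "report_$(date +%Y%m%d_%H%M%S).txt"   # e.g. report_20130314_015812.txt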
Note that if you've got your filesystem mounted with noatime for performance reasons, then the atime will likely show the creation time. Given that noatime results in a massive performance boost (by removing a disk write for every time a file is read), it may be a sensible configuration option that also gives you the results you want.
Creation date/time is normally not stored. So no, you can't.
You can find creation time - aka birth time - using stat and also match using find.
We have these files showing last modified time:
$ ls -l --time-style=long-iso | sort -k6
total 692
-rwxrwx---+ 1 XXXX XXXX 249159 2013-05-31 14:47 Getting Started.pdf
-rwxrwx---+ 1 XXXX XXXX 275799 2013-12-30 21:12 TheScienceofGettingRich.pdf
-rwxrwx---+ 1 XXXX XXXX 25600 2015-05-07 18:52 Thumbs.db
-rwxrwx---+ 1 XXXX XXXX 148051 2015-05-07 18:55 AsAManThinketh.pdf
You can find files created within a certain time frame using find, as below.
Clearly, the filesystem knows about the birth time of a file:
$ find -newerBt '2014-06-13' ! -newerBt '2014-06-13 12:16:10' -ls
20547673299906851 148 -rwxrwx--- 1 XXXX XXXX 148051 May 7 18:55 ./AsAManThinketh.pdf
1407374883582246 244 -rwxrwx--- 1 XXXX XXXX 249159 May 31 2013 ./Getting\ Started.pdf
We can confirm this using stat:
$ stat -c "%w %n" * | sort
2014-06-13 12:16:03.873778400 +0100 AsAManThinketh.pdf
2014-06-13 12:16:04.006872500 +0100 Getting Started.pdf
2014-06-13 12:16:29.607075500 +0100 TheScienceofGettingRich.pdf
2015-05-07 18:32:26.938446200 +0100 Thumbs.db
The stat man page explains %w:
%w time of file birth, human-readable; - if unknown
ls -i menus.xml
94490 menus.xml
Here the number 94490 is the file's inode number.
Then do a:
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg-root 4.0G 3.4G 408M 90% /
tmpfs 1.9G 0 1.9G 0% /dev/shm
/dev/sda1 124M 27M 92M 23% /boot
/dev/mapper/vg-var 7.9G 1.1G 6.5G 15% /var
This finds the device of the filesystem the file is on: menus.xml is on '/', which is '/dev/mapper/vg-root'. Then run:
debugfs -R 'stat <94490>' /dev/mapper/vg-root
The output may be like the one below:
debugfs -R 'stat <94490>' /dev/mapper/vg-root
debugfs 1.41.12 (17-May-2010)
Inode: 94490 Type: regular Mode: 0644 Flags: 0x0
Generation: 2826123170 Version: 0x00000000
User: 0 Group: 0 Size: 4441
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 16
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5266e438 -- Wed Oct 23 09:46:48 2013
atime: 0x5266e47b -- Wed Oct 23 09:47:55 2013
mtime: 0x5266e438 -- Wed Oct 23 09:46:48 2013
Size of extra inode fields: 4
Extended attributes stored in inode body:
selinux = "unconfined_u:object_r:usr_t:s0\000" (31)
BLOCKS:
(0-1):375818-375819
TOTAL: 2
Where you can see the change time:
ctime: 0x5266e438 -- Wed Oct 23 09:46:48 2013
Note that ctime is the inode status-change time, not the creation time; this output has no crtime line at all, so the creation time is not available for this inode.
stat -c %w a.txt
%w returns the file creation (birth) date if it is available, which is rare.
As @mikyra explained, the creation date/time is not stored anywhere.
All the methods above are nice, but if you just want to quickly get the last modification date, you can type:
ls -lit /path
With the -t option, you list all files in /path ordered by last modification date.
If you really want to achieve that you can use a file watcher like inotifywait.
You watch a directory and save information about file creations in a separate file outside that directory.
while true; do
    # wait for the next write/move/create event in the watched directory
    change=$(inotifywait -e close_write,moved_to,create .)
    # strip the "./ EVENT " prefix, leaving just the file name
    change=${change#./ * }
    # ".*" is a placeholder: put the file name you care about here
    if [ "$change" = ".*" ]; then ./scriptToStoreInfoAboutFile; fi
done
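One design caveat: restarting inotifywait once per event can drop events that arrive between iterations. Monitor mode keeps a single watch open instead (a sketch, assuming GNU inotify-tools; the log path is hypothetical):
inotifywait -m -e close_write,moved_to,create . |
while read -r dir event file; do
    # record a timestamp per event in a file outside the watched directory
    printf '%s %s %s\n' "$(date '+%F %T')" "$event" "$file" >> /var/log/creation-times.log
done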
As no creation time is stored, you can build your own system based on inotify.
Cited from https://unix.stackexchange.com/questions/50177/birth-is-empty-on-ext4/131347#131347, the following shell script will get the creation time:
get_crtime() {
    for target in "$@"; do
        inode=$(stat -c %i "${target}")
        fs=$(df "${target}" | tail -1 | awk '{print $1}')
        crtime=$(sudo debugfs -R 'stat <'"${inode}"'>' "${fs}" 2>/dev/null | grep -oP 'crtime.*--\s*\K.*')
        printf "%s\t%s\n" "${target}" "${crtime}"
    done
}
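A possible invocation (hypothetical paths; the second column is empty when debugfs reports no crtime):
get_crtime /etc/hostname /etc/hosts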
even better:
lsct ()
{
    debugfs -R 'stat <'`ls -i "$1" | (read a b; echo -n $a)`'>' `df "$1" | (read a; read a b; echo "$a")` 2>/dev/null | grep --color=auto crtime | (read a b c d; echo $d)
}
lsct /etc
Wed Jul 20 19:25:48 2016
Another trick to add to your arsenal is the following:
$ grep -r "Copyright" /<path-to-source-files>/src
Generally speaking, when someone changes a file they should claim credit in the “Copyright” header. Examine the results for dates, file names, contributors, and contact emails.
example grep result:
/<path>/src/someobject.h: * Copyright 2007-2012 <creator's name> <creator's email>(at)<some URL>>
