Force Linux sort to use lexicographic order

I generated a text file with pseudo-random numbers like this:
-853340442 1130519212 -2070936922
-707168664 -2076185735 -2135012102
166464098 1928545126 5768715
1060168276 -684694617 395859713
-680897578 -2095893176 1457930442
299309402 192205833 1878010157
-678911642 2062673581 -1801057195
795693402 -631504846 2117889796
448959250 547707556 -1115929024
168558507 7468411 1600190097
-746131117 1557335455 73377787
-1144524558 2143073647 -2044347857
1862106004 -193937480 1596949168
-1193502513 -920620244 -365340967
-677065994 500654963 1031304603
Now I try to put it in order using the Linux sort command:
sort prng >prngsorted
The result is not what I expected:
1060168276 -684694617 395859713
-1144524558 2143073647 -2044347857
-1193502513 -920620244 -365340967
166464098 1928545126 5768715
168558507 7468411 1600190097
1862106004 -193937480 1596949168
299309402 192205833 1878010157
448959250 547707556 -1115929024
-677065994 500654963 1031304603
-678911642 2062673581 -1801057195
-680897578 -2095893176 1457930442
-707168664 -2076185735 -2135012102
-746131117 1557335455 73377787
795693402 -631504846 2117889796
-853340442 1130519212 -2070936922
Apparently, sort tries to parse the strings and extract numbers for sorting, and it seems to ignore the minus signs.
Is it possible to force sort to be a bit dumber and just compare lines lexicographically? The result should look like this:
-1144524558 2143073647 -2044347857
-1193502513 -920620244 -365340967
-677065994 500654963 1031304603
-678911642 2062673581 -1801057195
-680897578 -2095893176 1457930442
-707168664 -2076185735 -2135012102
-746131117 1557335455 73377787
-853340442 1130519212 -2070936922
1060168276 -684694617 395859713
166464098 1928545126 5768715
168558507 7468411 1600190097
1862106004 -193937480 1596949168
299309402 192205833 1878010157
448959250 547707556 -1115929024
795693402 -631504846 2117889796
Note: I tried the -d option, but it did not help.
Note 2: Should I perhaps use another utility instead of sort?

The sort command takes your locale settings into account, and many locales ignore dashes for collation.
You can get the sorting you expect with:
LC_COLLATE=C sort filename
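For example, with the prng file from the question (LC_ALL=C works just as well and overrides every other LC_* variable):
LC_COLLATE=C sort prng > prngsorted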

A custom sort with the help of awk:
$ awk '{print ($1<0?"-":"+") "\t" $0}' file | sort -k1,1 -k2 | cut -f2-
-1144524558 2143073647 -2044347857
-1193502513 -920620244 -365340967
-677065994 500654963 1031304603
-678911642 2062673581 -1801057195
-680897578 -2095893176 1457930442
-707168664 -2076185735 -2135012102
-746131117 1557335455 73377787
-853340442 1130519212 -2070936922
1060168276 -684694617 395859713
166464098 1928545126 5768715
168558507 7468411 1600190097
1862106004 -193937480 1596949168
299309402 192205833 1878010157
448959250 547707556 -1115929024
795693402 -631504846 2117889796
Sort by sign only first, then do a regular sort on the rest, and remove the sign column afterwards.

Related

How to batch rename files by deleting content between certain characters in Linux

I have many files named like this:
BH_Undetermined_S0_L001_R1_001.fastq.gz__1.fq.gz
BH_Undetermined_S0_L001_R1_001.fastq.gz__merged.fq.gz
BH_Undetermined_S0_L001_R2_001.fastq.gz__2.fq.gz
BHos1_S1_L001_R1_001.fastq.gz__1.fq.gz
BHos1_S1_L001_R1_001.fastq.gz__merged.fq.gz
BHos1_S1_L001_R2_001.fastq.gz__2.fq.gz
BHos2_S2_L001_R1_001.fastq.gz__1.fq.gz
BHos2_S2_L001_R1_001.fastq.gz__merged.fq.gz
BHos2_S2_L001_R2_001.fastq.gz__2.fq.gz
ChLJ1511Da_HNVMLBCXX_L2_1.fq.gz__1.fq.gz
ChLJ1511Da_HNVMLBCXX_L2_1.fq.gz__merged.fq.gz
ChLJ1511Da_HNVMLBCXX_L2_2.fq.gz__2.fq.gz
ChLJ1511Db_HNVMLBCXX_L2_1.fq.gz__1.fq.gz
ChLJ1511Db_HNVMLBCXX_L2_1.fq.gz__merged.fq.gz
ChLJ1511Db_HNVMLBCXX_L2_2.fq.gz__2.fq.gz
ML-3_H7VFTALXX_L2_1.fq.gz__1.fq.gz
ML-3_H7VFTALXX_L2_1.fq.gz__merged.fq.gz
ML-3_H7VFTALXX_L2_2.fq.gz__2.fq.gz
T2S170523_H23HKDMXX_L1_1.fq.gz__1.fq.gz
T2S170523_H23HKDMXX_L1_1.fq.gz__merged.fq.gz
T2S170523_H23HKDMXX_L1_2.fq.gz__2.fq.gz
T4S170523_H23HKDMXX_L1_1.fq.gz__1.fq.gz
I want to batch rename them by deleting the content between the first '_' and the '__', so that they look like this:
BH_1.fq.gz
BH_merged.fq.gz
BH_2.fq.gz
BHos1_1.fq.gz
BHos1_merged.fq.gz
BHos1_2.fq.gz
BHos2_1.fq.gz
BHos2_merged.fq.gz
BHos2_2.fq.gz
ChLJ1511Da_1.fq.gz
ChLJ1511Da_merged.fq.gz
ChLJ1511Da_2.fq.gz
ChLJ1511Db_1.fq.gz
ChLJ1511Db_merged.fq.gz
ChLJ1511Db_2.fq.gz
ML-3_H7VFTALXX_1.fq.gz
ML-3_H7VFTALXX_merged.fq.gz
ML-3_H7VFTALXX_2.fq.gz
T2S170523_1.fq.gz
T2S170523_merged.fq.gz
T2S170523_2.fq.gz
T4S170523_1.fq.gz
How can I do this?
Thank you very much!
You can use the Perl-based rename (available as file-rename on Mint 19). Some systems ship with a different version of rename (rename.ul), so make sure you get the right one.
file-rename 's/_.*__/_/' BH_Undetermined_S0_L001_R1_001.fastq.gz__1.fq.gz \
    BH_Undetermined_S0_L001_R1_001.fastq.gz__merged.fq.gz \
    BH_Undetermined_S0_L001_R2_001.fastq.gz__2.fq.gz ...
You can preview the rename operation without executing it (-n: no action, -v: verbose):
file-rename -n -v 's/_.*__/_/' BH_Undetermined_S0_L001_R1_001.fastq.gz__1.fq.gz ...
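If the Perl rename is not available at all, a plain bash loop with mv and parameter expansion applies the same rule (a minimal sketch, assuming it runs in the directory containing the files and that everything to rename matches *__*.fq.gz; put echo in front of mv first for a dry run):
for f in *__*.fq.gz; do
    # keep the part before the first '_' plus the part after '__'
    mv -- "$f" "${f%%_*}_${f##*__}"
done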

Sort list python3

I would like to order this list.
From:
01104D-BB'42
01104D-BB42
01104D-BB43
01104D-CC'42
01104D-CC'72
01104D-CC32
01104D-CC42
01104D-CC62
01104D-CC72
01104D-DD'74
01104D-DD'75
01104D-DD'76
01104D-DD'77
01104D-DD'78
01104D-DD75
01104D-DD76
01104D-DD77
01104D-DD78
01104D-EE'102
01104D-EE'12
01104D-EE'2
01104D-EE'32
01104D-EE'42
01104D-EE'52
01104D-EE'53
01104D-EE'72
01104D-EE'82
01104D-EE'92
01104D-EE102
01104D-EE12
01104D-EE2
01104D-EE3
01104D-EE32
01104D-EE42
01104D-EE52
01104D-EE62
01104D-EE72
01104D-EE82
01104D-EE83
01104D-EE92
01104D-EE93
To:
01104D-BB42
01104D-BB43
01104D-BB'42
01104D-CC32
01104D-CC42
01104D-CC62
01104D-CC72
01104D-CC'42
01104D-CC'72
01104D-DD75
01104D-DD76
01104D-DD77
01104D-DD78
01104D-DD'74
01104D-DD'75
01104D-DD'76
01104D-DD'77
01104D-DD'78
01104D-EE102
01104D-EE12
01104D-EE2
01104D-EE3
01104D-EE32
01104D-EE42
01104D-EE52
01104D-EE62
01104D-EE72
01104D-EE82
01104D-EE83
01104D-EE92
01104D-EE93
01104D-EE'102
01104D-EE'12
01104D-EE'2
01104D-EE'32
01104D-EE'42
01104D-EE'52
01104D-EE'53
01104D-EE'72
01104D-EE'82
01104D-EE'92
Can you help me?
thanks
I'm guessing here, because you haven't explained how you want the sort to be done. But it looks like you want the character ' to sort after the digits 0-9, while ASCII order puts it before the digits. If that is correct, then you need to substitute a different character for ' in the sort key. A good choice is ~, because it is the last printable ASCII character.
If your data is in mylist, then
mylist.sort(key=lambda a: a.replace("'","~"))
will sort it in the order I'm guessing you want.
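For example, with a few entries from the list above:
mylist = [
    "01104D-BB'42",
    "01104D-BB42",
    "01104D-BB43",
    "01104D-CC'42",
    "01104D-CC32",
]
mylist.sort(key=lambda a: a.replace("'", "~"))
print(mylist)
# ['01104D-BB42', '01104D-BB43', "01104D-BB'42", '01104D-CC32', "01104D-CC'42"]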

Entering text in a file at specific locations by identifying whether a number is integer or real in Linux

I have an input file like the one below:
46742 1 48276 48343 48199 48198
46744 1 48343 48344 48200 48199
46746 1 48344 48332 48201 48200
48283 3.58077402e+01 -2.97697746e+00 1.50878647e+02
48282 3.67231688e+01 -2.97771595e+00 1.50419488e+02
48285 3.58558188e+01 -1.98122787e+00 1.50894850e+02
Each segment where the 2nd entry is an integer (like 1) runs for thousands of lines, and then a segment where the 2nd entry is a real number (like 3.58077402e+01) starts.
Before anything begins, I have to insert text like this:
*Revolved
*Gripped
*Crippled
46742 1 48276 48343 48199 48198
46744 1 48343 48344 48200 48199
46746 1 48344 48332 48201 48200
*Cracked
*Crippled
48283 3.58077402e+01 -2.97697746e+00 1.50878647e+02
48282 3.67231688e+01 -2.97771595e+00 1.50419488e+02
48285 3.58558188e+01 -1.98122787e+00 1.50894850e+02
So I need to insert specific text at those locations. It is worth mentioning that the file is space-delimited, not tab-delimited, and that the lines starting with * have to start at the very left of the line with no leading spaces. The format of the rest of the file should be preserved too.
Any suggestions with sed or awk would be highly appreciated!
The text at the beginning could be entered directly, so that is not the main problem since it is the start of the file; the problematic part is the second block of lines, i.e. detecting that the second entry has turned into a real number.
An awk solution with fixed strings:
awk 'BEGIN{print "*Revolved\n*Gripped\n*Crippled"}
     index($2,"+") && !pr {print "*Cracked\n*Crippled"; pr=1} 1' yourfile
index($2,"+") && !pr : true when a + character is found in the second field (the exponent of the real numbers) and the pr flag has not been set yet; the trailing 1 prints every input line unchanged.
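Note that this detection assumes the exponent is always written with a +; if a negative exponent (e.g. 1.5e-02) could appear in the second column, a slightly more defensive variant along the same lines (a sketch, assuming the second column of the integer segment never contains a decimal point) is:
awk 'BEGIN{print "*Revolved\n*Gripped\n*Crippled"}
     index($2,".") && !pr {print "*Cracked\n*Crippled"; pr=1} 1' yourfile > newfile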

List of SYNTAX for logstash's grok

The syntax for a grok pattern is %{SYNTAX:SEMANTIC}. How do I generate a list of all available SYNTAX keywords? I know that I can use the grok debugger to discover patterns from text, but is there a list which I can scan through?
They are in Git and included somewhere in the distribution, but it's probably easiest to view them online:
https://github.com/elasticsearch/logstash/blob/v1.4.0/patterns/grok-patterns
The grok patterns files are now in the logstash-patterns-core repository.
Assuming you have a clone of it in the logstash-patterns-core directory on your filesystem, you can issue a command like this one to list all SYNTAX keywords:
$ find ./logstash-patterns-core/patterns -type f -exec awk '{print $1}' {} \; | grep "^[^#\ ]" | sort
As of commit 6655856, the output of the command (aka the list of SYNTAX keywords) looks like this (remember though that this list is not static):
BACULA_CAPACITY
BACULA_DEVICE
BACULA_DEVICEPATH
BACULA_HOST
BACULA_JOB
BACULA_LOG_ALL_RECORDS_PRUNED
BACULA_LOG_BEGIN_PRUNE_FILES
BACULA_LOG_BEGIN_PRUNE_JOBS
BACULA_LOG_CANCELLING
BACULA_LOG_CLIENT_RBJ
BACULA_LOG_DIFF_FS
BACULA_LOG_DUPLICATE
BACULA_LOG_ENDPRUNE
BACULA_LOG_END_VOLUME
BACULA_LOG_FATAL_CONN
BACULA_LOG_JOB
BACULA_LOG_JOBEND
BACULA_LOGLINE
BACULA_LOG_MARKCANCEL
BACULA_LOG_MAX_CAPACITY
BACULA_LOG_MAXSTART
BACULA_LOG_NEW_LABEL
BACULA_LOG_NEW_MOUNT
BACULA_LOG_NEW_VOLUME
BACULA_LOG_NO_AUTH
BACULA_LOG_NO_CONNECT
BACULA_LOG_NOJOBS
BACULA_LOG_NOJOBSTAT
BACULA_LOG_NOOPEN
BACULA_LOG_NOOPENDIR
BACULA_LOG_NOPRIOR
BACULA_LOG_NOPRUNE_FILES
BACULA_LOG_NOPRUNE_JOBS
BACULA_LOG_NOSTAT
BACULA_LOG_NOSUIT
BACULA_LOG_PRUNED_FILES
BACULA_LOG_PRUNED_JOBS
BACULA_LOG_READYAPPEND
BACULA_LOG_STARTJOB
BACULA_LOG_STARTRESTORE
BACULA_LOG_USEDEVICE
BACULA_LOG_VOLUME_PREVWRITTEN
BACULA_LOG_VSS
BACULA_LOG_WROTE_LABEL
BACULA_TIMESTAMP
BACULA_VERSION
BACULA_VOLUME
BASE10NUM
BASE16FLOAT
BASE16NUM
BIND9
BIND9_TIMESTAMP
BRO_CONN
BRO_DNS
BRO_FILES
BRO_HTTP
CATALINA_DATESTAMP
CATALINALOG
CISCO_ACTION
CISCO_DIRECTION
CISCOFW104001
CISCOFW104002
CISCOFW104003
CISCOFW104004
CISCOFW105003
CISCOFW105004
CISCOFW105005
CISCOFW105008
CISCOFW105009
CISCOFW106001
CISCOFW106006_106007_106010
CISCOFW106014
CISCOFW106015
CISCOFW106021
CISCOFW106023
CISCOFW106100
CISCOFW106100_2_3
CISCOFW110002
CISCOFW302010
CISCOFW302013_302014_302015_302016
CISCOFW302020_302021
CISCOFW304001
CISCOFW305011
CISCOFW313001_313004_313008
CISCOFW313005
CISCOFW321001
CISCOFW402117
CISCOFW402119
CISCOFW419001
CISCOFW419002
CISCOFW500004
CISCOFW602303_602304
CISCOFW710001_710002_710003_710005_710006
CISCOFW713172
CISCOFW733100
CISCO_INTERVAL
CISCOMAC
CISCO_REASON
CISCOTAG
CISCO_TAGGED_SYSLOG
CISCOTIMESTAMP
CISCO_XLATE_TYPE
CLOUDFRONT_ACCESS_LOG
COMBINEDAPACHELOG
COMMONAPACHELOG
COMMONMAC
CRON_ACTION
CRONLOG
DATA
DATE
DATE_EU
DATESTAMP
DATESTAMP_EVENTLOG
DATESTAMP_OTHER
DATESTAMP_RFC2822
DATESTAMP_RFC822
DATE_US
DAY
ELB_ACCESS_LOG
ELB_REQUEST_LINE
ELB_URI
ELB_URIPATHPARAM
EMAILADDRESS
EMAILLOCALPART
EXIM_DATE
EXIM_EXCLUDE_TERMS
EXIM_FLAGS
EXIM_HEADER_ID
EXIM_INTERFACE
EXIM_MSGID
EXIM_MSG_SIZE
EXIM_PID
EXIM_PROTOCOL
EXIM_QT
EXIM_REMOTE_HOST
EXIM_SUBJECT
GREEDYDATA
HAPROXYCAPTUREDREQUESTHEADERS
HAPROXYCAPTUREDRESPONSEHEADERS
HAPROXYDATE
HAPROXYHTTP
HAPROXYHTTPBASE
HAPROXYTCP
HAPROXYTIME
HOSTNAME
HOSTPORT
HOUR
HTTPD20_ERRORLOG
HTTPD24_ERRORLOG
HTTPDATE
HTTPD_COMBINEDLOG
HTTPD_COMMONLOG
HTTPDERROR_DATE
HTTPD_ERRORLOG
HTTPDUSER
INT
IP
IPORHOST
IPV4
IPV6
ISO8601_SECOND
ISO8601_TIMEZONE
JAVACLASS
JAVACLASS
JAVAFILE
JAVAFILE
JAVALOGMESSAGE
JAVAMETHOD
JAVASTACKTRACEPART
JAVATHREAD
LOGLEVEL
MAC
MAVEN_VERSION
MCOLLECTIVE
MCOLLECTIVEAUDIT
MCOLLECTIVEAUDIT
MINUTE
MONGO3_COMPONENT
MONGO3_LOG
MONGO3_SEVERITY
MONGO_LOG
MONGO_QUERY
MONGO_SLOWQUERY
MONGO_WORDDASH
MONTH
MONTHDAY
MONTHNUM
MONTHNUM2
NAGIOS_CURRENT_HOST_STATE
NAGIOS_CURRENT_SERVICE_STATE
NAGIOS_EC_DISABLE_HOST_CHECK
NAGIOS_EC_DISABLE_HOST_NOTIFICATIONS
NAGIOS_EC_DISABLE_HOST_SVC_NOTIFICATIONS
NAGIOS_EC_DISABLE_SVC_CHECK
NAGIOS_EC_DISABLE_SVC_NOTIFICATIONS
NAGIOS_EC_ENABLE_HOST_CHECK
NAGIOS_EC_ENABLE_HOST_NOTIFICATIONS
NAGIOS_EC_ENABLE_HOST_SVC_NOTIFICATIONS
NAGIOS_EC_ENABLE_SVC_CHECK
NAGIOS_EC_ENABLE_SVC_NOTIFICATIONS
NAGIOS_EC_LINE_DISABLE_HOST_CHECK
NAGIOS_EC_LINE_DISABLE_HOST_NOTIFICATIONS
NAGIOS_EC_LINE_DISABLE_HOST_SVC_NOTIFICATIONS
NAGIOS_EC_LINE_DISABLE_SVC_CHECK
NAGIOS_EC_LINE_DISABLE_SVC_NOTIFICATIONS
NAGIOS_EC_LINE_ENABLE_HOST_CHECK
NAGIOS_EC_LINE_ENABLE_HOST_NOTIFICATIONS
NAGIOS_EC_LINE_ENABLE_HOST_SVC_NOTIFICATIONS
NAGIOS_EC_LINE_ENABLE_SVC_CHECK
NAGIOS_EC_LINE_ENABLE_SVC_NOTIFICATIONS
NAGIOS_EC_LINE_PROCESS_HOST_CHECK_RESULT
NAGIOS_EC_LINE_PROCESS_SERVICE_CHECK_RESULT
NAGIOS_EC_LINE_SCHEDULE_HOST_DOWNTIME
NAGIOS_EC_PROCESS_HOST_CHECK_RESULT
NAGIOS_EC_PROCESS_SERVICE_CHECK_RESULT
NAGIOS_EC_SCHEDULE_HOST_DOWNTIME
NAGIOS_EC_SCHEDULE_SERVICE_DOWNTIME
NAGIOS_HOST_ALERT
NAGIOS_HOST_DOWNTIME_ALERT
NAGIOS_HOST_EVENT_HANDLER
NAGIOS_HOST_FLAPPING_ALERT
NAGIOS_HOST_NOTIFICATION
NAGIOSLOGLINE
NAGIOS_PASSIVE_HOST_CHECK
NAGIOS_PASSIVE_SERVICE_CHECK
NAGIOS_SERVICE_ALERT
NAGIOS_SERVICE_DOWNTIME_ALERT
NAGIOS_SERVICE_EVENT_HANDLER
NAGIOS_SERVICE_FLAPPING_ALERT
NAGIOS_SERVICE_NOTIFICATION
NAGIOSTIME
NAGIOS_TIMEPERIOD_TRANSITION
NAGIOS_TYPE_CURRENT_HOST_STATE
NAGIOS_TYPE_CURRENT_SERVICE_STATE
NAGIOS_TYPE_EXTERNAL_COMMAND
NAGIOS_TYPE_HOST_ALERT
NAGIOS_TYPE_HOST_DOWNTIME_ALERT
NAGIOS_TYPE_HOST_EVENT_HANDLER
NAGIOS_TYPE_HOST_FLAPPING_ALERT
NAGIOS_TYPE_HOST_NOTIFICATION
NAGIOS_TYPE_PASSIVE_HOST_CHECK
NAGIOS_TYPE_PASSIVE_SERVICE_CHECK
NAGIOS_TYPE_SERVICE_ALERT
NAGIOS_TYPE_SERVICE_DOWNTIME_ALERT
NAGIOS_TYPE_SERVICE_EVENT_HANDLER
NAGIOS_TYPE_SERVICE_FLAPPING_ALERT
NAGIOS_TYPE_SERVICE_NOTIFICATION
NAGIOS_TYPE_TIMEPERIOD_TRANSITION
NAGIOS_WARNING
NETSCREENSESSIONLOG
NONNEGINT
NOTSPACE
NUMBER
PATH
POSINT
POSTGRESQL
PROG
QS
QUOTEDSTRING
RAILS3
RAILS3FOOT
RAILS3HEAD
RAILS3PROFILE
RCONTROLLER
REDISLOG
REDISMONLOG
REDISTIMESTAMP
RPROCESSING
RT_FLOW1
RT_FLOW2
RT_FLOW3
RT_FLOW_EVENT
RUBY_LOGGER
RUBY_LOGLEVEL
RUUID
S3_ACCESS_LOG
S3_REQUEST_LINE
SECOND
SFW2
SHOREWALL
SPACE
SQUID3
SYSLOG5424BASE
SYSLOG5424LINE
SYSLOG5424PRI
SYSLOG5424PRINTASCII
SYSLOG5424SD
SYSLOGBASE
SYSLOGBASE2
SYSLOGFACILITY
SYSLOGHOST
SYSLOGLINE
SYSLOGPAMSESSION
SYSLOGPROG
SYSLOGTIMESTAMP
TIME
TIMESTAMP_ISO8601
TOMCAT_DATESTAMP
TOMCATLOG
TTY
TZ
UNIXPATH
URI
URIHOST
URIPARAM
URIPATH
URIPATHPARAM
URIPROTO
URN
USER
USERNAME
UUID
WINDOWSMAC
WINPATH
WORD
YEAR
If you have installed Logstash as a package, the pattern files are shipped with it under /opt/logstash (the exact path depends on the version).
You can locate them with these commands:
# find / -name patterns
/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.5/patterns
/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.5/lib/logstash/patterns
Just browse to the directory
# cd /opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.5/patterns
And here you have the whole list of pattern files:
aws exim haproxy
linux-syslog mongodb rails
bacula firewalls java mcollective nagios redis
bro grok-patterns junos mcollective-patterns postgresql ruby
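Once you know a keyword, you can look up its definition the same way; for example, for NUMBER (using the logstash-patterns-core clone from the answer above; adjust the path if your checkout or package install lives elsewhere):
$ grep -rh '^NUMBER ' ./logstash-patterns-core/patterns
NUMBER (?:%{BASE10NUM})
The line may appear more than once if the pattern is defined in several files.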

Add a number to each line of a file in bash

I have some files in Linux with lines like:
2013/08/16,name1,,5000,8761,09:00,09:30
2013/08/16,name1,,5000,9763,10:00,10:30
2013/08/16,name1,,5000,8866,11:00,11:30
2013/08/16,name1,,5000,5768,12:00,12:30
2013/08/16,name1,,5000,11764,13:00,13:30
2013/08/16,name2,,5000,2765,14:00,14:30
2013/08/16,name2,,5000,4765,15:00,15:30
2013/08/16,name2,,5000,6765,16:00,16:30
2013/08/16,name2,,5000,12765,17:00,17:30
2013/08/16,name2,,5000,25665,18:00,18:30
2013/08/16,name2,,5000,45765,09:00,10:30
2013/08/17,name1,,5000,33765,10:00,11:30
2013/08/17,name1,,5000,1765,11:00,12:30
2013/08/17,name1,,5000,34765,12:00,13:30
2013/08/17,name1,,5000,12765,13:00,14:30
2013/08/17,name2,,5000,1765,14:00,15:30
2013/08/17,name2,,5000,3765,15:00,16:30
2013/08/17,name2,,5000,7765,16:00,17:30
My column separator is "," and in the third column (currently empty, hence the ",,"), I need the entry number within the same day. For example, date 2013/08/16 has 11 lines and date 2013/08/17 has 7 lines, so I need to add the numbers like this:
2013/08/16,name1,1,5000,8761,09:00,09:30
2013/08/16,name1,2,5000,9763,10:00,10:30
2013/08/16,name1,3,5000,8866,11:00,11:30
2013/08/16,name1,4,5000,5768,12:00,12:30
2013/08/16,name1,5,5000,11764,13:00,13:30
2013/08/16,name2,6,5000,2765,14:00,14:30
2013/08/16,name2,7,5000,4765,15:00,15:30
2013/08/16,name2,8,5000,6765,16:00,16:30
2013/08/16,name2,9,5000,12765,17:00,17:30
2013/08/16,name2,10,5000,25665,18:00,18:30
2013/08/16,name2,11,5000,45765,09:00,10:30
2013/08/17,name1,1,5000,33765,10:00,11:30
2013/08/17,name1,2,5000,1765,11:00,12:30
2013/08/17,name1,3,5000,34765,12:00,13:30
2013/08/17,name1,4,5000,12765,13:00,14:30
2013/08/17,name2,5,5000,1765,14:00,15:30
2013/08/17,name2,6,5000,3765,15:00,16:30
2013/08/17,name2,7,5000,7765,16:00,17:30
I need to do it in bash. How can I do it?
This awk one-liner works too:
awk -F, 'sub(/,,/, ","++a[$1]",")1' file
Output:
2013/08/16,name1,1,5000,8761,09:00,09:30
2013/08/16,name1,2,5000,9763,10:00,10:30
2013/08/16,name1,3,5000,8866,11:00,11:30
2013/08/16,name1,4,5000,5768,12:00,12:30
2013/08/16,name1,5,5000,11764,13:00,13:30
2013/08/16,name2,6,5000,2765,14:00,14:30
2013/08/16,name2,7,5000,4765,15:00,15:30
2013/08/16,name2,8,5000,6765,16:00,16:30
2013/08/16,name2,9,5000,12765,17:00,17:30
2013/08/16,name2,10,5000,25665,18:00,18:30
2013/08/16,name2,11,5000,45765,09:00,10:30
2013/08/17,name1,1,5000,33765,10:00,11:30
2013/08/17,name1,2,5000,1765,11:00,12:30
2013/08/17,name1,3,5000,34765,12:00,13:30
2013/08/17,name1,4,5000,12765,13:00,14:30
2013/08/17,name2,5,5000,1765,14:00,15:30
2013/08/17,name2,6,5000,3765,15:00,16:30
2013/08/17,name2,7,5000,7765,16:00,17:30
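The sub() call replaces the first ",," on each line with "," followed by a per-date counter keyed on the first field (the date) and another ","; the trailing 1 prints every line. An equivalent, arguably clearer variant (a sketch that simply overwrites the third field, assuming that is the column to fill) is:
awk 'BEGIN{FS=OFS=","} {$3=++cnt[$1]} 1' file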
