convert string ddmmyyyy to yymmdd using linux - linux

I am struggling to convert a textfile which contains the string:
10.04.2015 12:00:15
to
150410
The date will always be the date format but is taken from a textfile so as far as I know I can't use a date command and I think I must use awk but I don't know it well and am struggling - advice would be appreciated?

You can use bash parameter expansion to extract substrings
$ date="10.04.2015 12:00:15"
$ newdate=${date:8:2}${date:3:2}${date:0:2}
$ echo $newdate
150410

The following sed command should be good enough:
sed -r 's/([0-9]+)\.([0-9]+)\.[0-9]{2}([0-9]{2}) ([0-9]+:?){3}/\3\2\1/g' input
or awk:
awk -F'[.: ]' '{print substr($3,3),$2,$1}' OFS='' input
After all I prefer the latter.

alternative answer, using awk:
DATE="10.04.2015 12:00:15"
echo $DATE | awk '{print substr($0,9,2) substr($0,4,2) substr($0,1,2)}'
Some explanation if you want to do modifications later:
$0 is the input text on which substr operates
the first number is where the substring starts (counting from 1)
the second number is the length of the substring
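The substr positions described above can be wrapped into a small filter. This is only a sketch: the function name to_yymmdd is made up for illustration, and it assumes each date of interest starts its line in the form DD.MM.YYYY. Lines that do not look like that are skipped rather than converted.

```shell
# Hypothetical helper: read "DD.MM.YYYY HH:MM:SS" lines on stdin, print YYMMDD.
to_yymmdd() {
    awk '
        # Only convert lines that begin with DD.MM.YYYY
        /^[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9]/ {
            # substr(string, start, length); positions are 1-based
            print substr($0, 9, 2) substr($0, 4, 2) substr($0, 1, 2)
        }
    '
}
```

Usage would be `to_yymmdd < input`, which prints 150410 for the sample line in the question.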

Related

Replacing characters in each line on a file in linux

I have a file with different word in each line.
My goal is to replace the first character to a capital letter and replace the 3rd character to "#".
For example: football will be exchanged to Foo#ball.
I tried thinking about using awk and sed. It didn't help me since (to my knowledge) sed needs an exact character input and awk can print the desired character but not change it.
With GNU sed and two s commands:
echo 'football' | sed -E 's/(.)/\U\1/; s/(...)./\1#/'
Output:
Foo#ball
See: 3.3 The s Command, 5.7 Back-references and Subexpressions and 5.9.2 Upper/Lower case conversion
This might work for you (GNU sed):
sed 's/\(...\)./\u\1#/' file
With bash you can use parameter expansions alone to accomplish the task. For example, if you read each line into the variable line, you can do:
line="${line^}" # change football to Football (capitalize 1st char)
line="${line:0:3}#${line:4}" # make 4th character '#'
Example Input File
$ cat file
football
soccer
baseball
Example Use/Output
$ while read -r line; do line="${line^}"; echo "${line:0:3}#${line:4}"; done < file
Foo#ball
Soc#er
Bas#ball
While shell is typically slower, when use is limited to builtins, it doesn't fall too far behind.
(note: your question says 3rd character, but your example replaces the 4th character with '#')
With GNU awk for the 3rd arg to match():
$ echo 'football' | awk 'match($0,/(.)(..).(.*)/,a){$0=toupper(a[1]) a[2] "#" a[3]} 1'
Foo#ball
Cyrus' or Potong's answers are the preferred ones (for Linux or other systems with GNU sed, because of \U or \u).
This is just an additional solution with awk, because you mentioned it and also used the awk tag:
$ echo 'football'|awk '{a=substr($0,1,1);b=substr($0,2,2);c=substr($0,5);print toupper(a)b"#"c}'
Foo#ball
This is a very simple solution without regular expressions. It also works with non-GNU awk.
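A sketch of the same substr approach with a guard for short words. The name cap_and_mask is hypothetical, and leaving words of fewer than four characters unmasked is an assumption, since the question does not say what should happen to them.

```shell
# Hypothetical helper: capitalize the first character of each line and
# replace the 4th character with "#", as in the question's example.
cap_and_mask() {
    awk '{
        first = toupper(substr($0, 1, 1))
        if (length($0) >= 4)
            # characters 2-3, then "#", then everything from character 5
            print first substr($0, 2, 2) "#" substr($0, 5)
        else
            # assumption: words too short to have a 4th character are only capitalized
            print first substr($0, 2)
    }'
}
```

For example, `cap_and_mask < file` turns football into Foo#ball and leaves a two-letter word such as it as It.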
This should work with any version of awk:
awk '{
for(i=1;i<=NF;i++){
# Note that string indexes start at 1 in awk!
# Keep character 2, put "#" in place of character 3, resume at character 4.
$i=toupper(substr($i,1,1)) substr($i,2,1) "#" substr($i,4)
}
print
}' file
Note: If a word is less than 3 characters long, like it, it will be printed as It#
If your data is in the file 'd', this was tested with GNU sed:
sed -E 's/^(\w)(\w\w)\w/\U\1\E\2#/' d

How To Delete All Words Before X Characters

I'm using code from this question, How To Delete All Words After X Characters, and I'm having trouble keeping (not deleting) all the words after 30 characters.
Original code:
awk 'BEGIN{FS=OFS="" } length>30{i=30; while($i~/\w/) i++; NF=i-1; }1'
My attempt:
awk 'BEGIN{FS=OFS="" } length>30{i=30; while($i~/\w/) i++; NF=i+1; }1'
Basically, I understand I need to change the NF which was NF=i-1 so I tried changing it to NF=i+1 but obviously I'm only getting one field. How can I specify NF to print the rest of the line?
Sample data:
StackOverflow Users Are Brilliant And Hard Working
#character 30 ---------------^
Desired output:
And Hard Working
If you could please help me keep the rest of the line by using NF, I would really appreciate your positive input and support.
It is much easier using gnu grep:
grep -oP '^.{30}\w*\W*\K.*' file
And Hard Working
Where \K is used for resetting matched information.
RegEx Breakup:
^: Start
.{30}: Match first 30 characters
\w*: followed by 0 or more word characters
\W*: followed by 0 or more non-word characters
\K: reset matched information so far
.*: Match anything after this position
Using awk you can use this solution:
awk '{sub(/^.{30}[_[:alnum:]]*[[:blank:]]*/, "")} 1' file
And Hard Working
Finally a sed solution:
sed -E 's/^.{30}[_[:alnum:]]*[[:blank:]]*//' file
And Hard Working
Another awk solution:
awk '{print substr($0, index(substr($0,30),FS)+30)}'
Find the index of the first delimiter at or after the 30th character, then take the substring that starts just after it.
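The index/substr idea can be sketched as a small function that also covers lines with no blank after the cut position. The name words_after and its optional position argument are made up for illustration, and printing an empty line in that case is an assumption about the desired behaviour.

```shell
# Hypothetical helper: print the words that start after character position n.
words_after() {
    # $1: character position to cut at (defaults to 30)
    awk -v n="${1:-30}" '{
        tail = substr($0, n)      # text from position n onwards
        pos  = index(tail, " ")   # first blank at or after position n
        if (pos > 0)
            print substr($0, n + pos)
        else
            print ""              # no further word starts after position n
    }'
}
```

With the sample data, `words_after < file` prints And Hard Working, and `words_after 5` would cut after the word containing the 5th character instead.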
I can't imagine why you're considering anything related to NF for this, since you're not doing anything with fields; you're just splitting each line at a blank char. It sounds like this is all you need for both questions, using GNU awk for gensub():
$ awk '{print gensub(/(.{30}\S*)\s+(.*)/,"\\1",1)}' file
StackOverflow Users Are Brilliant
$ awk '{print gensub(/(.{30}\S*)\s+(.*)/,"\\2",1)}' file
And Hard Working
or it's briefer using GNU sed:
$ sed -E 's/(.{30}\S*)\s+(.*)/\1/' file
StackOverflow Users Are Brilliant
$ sed -E 's/(.{30}\S*)\s+(.*)/\2/' file
And Hard Working
With the use of NF, you can try
awk '{for(i=1;i<=NF;i++){a+=length($i)+1;if(a>30){for(j=i+1;j<=NF;j++)b=b $j" ";print b;exit}}}'
cut -c30- file | cut -d' ' -f2-
This keeps only the words that start after the 30th character (index >= 31).

How to use grep and sed in order to replace the substring after searching some specific string?

I want to know how to use two 'grep' and 'sed' utilities or something else in order to replace the substring. I will explain what I want to do below.
We have the file 'test.txt' with the following string:
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='AA5', A6='keyword_A'
After searching 'keyword_A' using grep, I want to replace the value of A5 with other string, for example, "NEW".
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
I tried to use two commands like
grep keyword_A test.txt | sed -e 's/blabla/blabla/'
After trying everything I know, I gave up.
Please let me know the right solution.
First, you never need grep and sed together. Sed has a full regular-expression search engine, so it can do the line selection that grep would do. This command will read test.txt, change the lines that you've indicated, and print the entire result on standard output:
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/g" < test.txt
If you want to store the results back into the file test.txt, use the -i (in-place editing) switch to sed:
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/g" -i.bak test.txt
If you want to select only the indicated lines, modify those, and print only those lines to standard out, use a combination of the p (print) command and the -n (no output) switch.
sed "/keyword_A/s/A5{ATTR}='[A-Z0-9]*'/A5{ATTR}='NEW'/gp" -n test.txt
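To see that the /keyword_A/ address limits the substitution to matching lines, here is a minimal sketch. The function name replace_a5 is hypothetical, and it assumes that neither the keyword nor the new value contains characters special to sed such as / or &.

```shell
# Hypothetical helper: on lines matching a keyword, replace the value of A5{ATTR}.
replace_a5() {
    # $1: keyword that selects the lines to change
    # $2: new value for A5{ATTR}
    # [^']* matches whatever the old value was, up to the closing quote
    sed "/$1/s/A5{ATTR}='[^']*'/A5{ATTR}='$2'/"
}
```

Called as `replace_a5 keyword_A NEW < test.txt`, it changes only the lines that contain keyword_A and passes all other lines through untouched.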
Using grep+sed is always the wrong approach. Here's one way to do it with GNU awk:
$ awk '/keyword_A/{ $0=gensub(/(A5({[^}]+})?=\047)[^\047]+/,"\\1NEW",1) } 1' file
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
Using a couple variables you could define the keyword and replacement ( if they change at all ):
q="keyword_A"
r="NEW"
Then with sed:
sed -r "s/^(.+\{.+\}=')(.+)('.+"${q}".+)$/\1"${r}"\3/" file
Result:
A1='AA1', A2='AA2', A3='AA3', A4='AA4', A5{ATTR}='NEW', A6='keyword_A'
A5="NEW"
A6="keyword_A"
# with sed
sed "s/='[^']*\(',[[:blank:]]*A6='${A6}'\)/='${A5}\1/" YourFile
# with awk
awk -F "'" -v A5="${A5}" -v A6="${A6}" '
BEGIN { OFS="\047" }
$12 == A6 { $10 = A5; $0 = $0 }
7
' YourFile
The sed version anchors the change on the end of the string; the awk version uses ' as the field separator instead of the traditional whitespace.
The awk version assumes that no value contains a ' (otherwise the escaping would have to be handled).
We can just directly replace the fifth column when the string keyword_A is found, as shown below (the leading space in the new value keeps the ", " spacing of the original line):
awk -F, 'BEGIN{OFS=",";}/keyword_A/{$5=" A5{ATTR}='"'"NEW"'"'"}1' filename
Couple of slight alternatives:
sed -r "/keyword_A/s/(A5[^']*')[^']*/\1NEW/"
awk -F"'" '/keyword_A/{$10 = "NEW"}1' OFS="'"
Of course the negative with awk is afterwards you would have to rename the new file.
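The rename step mentioned above can be sketched like this. The function name edit_in_place is made up, mktemp is assumed to be available, and note that replacing the file with mv may not preserve the original file's permissions or ownership.

```shell
# Hypothetical helper: apply the awk edit and move the result over the original.
edit_in_place() {
    # $1: file to edit
    tmp=$(mktemp) || return 1
    # Field 10 (split on ') is the value of A5 in the sample line
    awk -F"'" -v OFS="'" '/keyword_A/ { $10 = "NEW" } 1' "$1" > "$tmp" &&
        mv "$tmp" "$1"
}
```

The mv only runs if awk succeeds, so a failed edit leaves the original file as it was.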

Cut number from string

I want to cut several numbers from a .txt file to add them up later. Here is an excerpt from the .txt file:
anonuser pts/25 127.0.0.1 Mon Nov 16 17:24 - crash (10+23:07)
I want to get the "10" before the "+" and I only want the number, nothing else. This number should be written to another .txt file. I used this code, but it only works if the number has one digit:
awk ' /^'anonuser' / {split($NF,k,"[(+0:)][0-9][0-9]");print k[1]} ' log2.txt > log3.txt
With GNU grep:
grep -Po '\(\K[^+]*' file > new_file
Output to new_file:
10
See: PCRE Regex Spotlight: \K
What if you use the match() function in awk?
$ awk '/^anonuser/ && match($NF,/^\(([0-9]*)/,a) {print a[1]}' file
10
How does this work?
/^anonuser/ && match() {print a[1]} if the line starts with anonuser and the pattern is found, print it.
match($NF,/^\(([0-9]*)/,a) in the last field ((10+23:07)), look for the string ( + digits and capture these in the array a[].
Note also that this approach allows you to store the values you capture, so that you can then sum them as you indicate in the question.
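Summing the captured values could look like the sketch below. It avoids the GNU-only third argument to match() so that it should run on other awks too; the function name sum_days is hypothetical, and it assumes the last field always has the form (D+HH:MM) or (HH:MM).

```shell
# Hypothetical helper: add up the number before "+" in the last field
# for every line belonging to the given user.
sum_days() {
    # $1: user name to select (defaults to anonuser)
    awk -v user="${1:-anonuser}" '
        $1 == user {
            d = $NF
            sub(/^\(/, "", d)            # drop the opening parenthesis
            if (d ~ /^[0-9]+\+/) {       # only entries that have a day count
                sub(/\+.*/, "", d)       # keep the digits before "+"
                total += d
            }
        }
        END { print total + 0 }          # prints 0 when nothing matched
    '
}
```

Run as `sum_days < log2.txt > log3.txt` to write the total to another file.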
The following uses the same approach as the OP, and has a couple of advantages, e.g. it does not require anything special, and it is quite robust (with respect to assumptions about the input) and maintainable:
awk '/^anonuser/ {split($NF,k,/[+]/); gsub(/[^0-9]/,"",k[1]); print k[1]}'
For anything more complex use awk, but for a simple task sed is easy enough:
sed -r '/^anonuser/{s/.*\(([0-9]+)\+.*/\1/}'
This finds the number between a ( and a + sign.
I am not sure about the format in the file.
Can you use simple cut commands?
cut -d"(" -f2 log2.txt| cut -d"+" -f1 > log3.txt

Pick a specific value in a program output (Bash)

I'm running LIBSVM in a Linux terminal, called by a C program. I need to pick a value out of the output, but the format is the following:
Accuracy = 80% (24/30) (classification)
I need to pick only the "80" value as an integer. I tried with sed and came to this command:
sed 's/[^0-9^'%']//g' 'f' >> f
This is filtering all integers in the output and, thus, isn't working yet, so I need help. Thanks in advance
Try grep in PCRE mode (-P), printing only the matched parts (-o), with a lookahead assertion:
$ echo "Accuracy = 80% (24/30) (classification)" | grep -Po '[0-9]+(?=%)'
80
The regexp:
[0-9] # match a digit
+ # one or more times
(?=%) # assert that the digits are followed by a %
It is very trivial with awk. Identify the column you need and strip the '%' sign from it. The /^Accuracy/ regex ensures that the action is only performed on the lines starting with Accuracy. You don't need it if your file only contains one line.
awk '/^Accuracy/{sub(/%/,"");print $3}' inputFile
Alternatively, you can set space and % as field separators and do
awk -F'[ %]' '/^Accuracy/{print $3}' inputFile
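Since the goal is to use the value as an integer, a sketch of capturing it in a shell variable may help. The function name get_accuracy and the threshold of 50 are made up for illustration, and the input line is the one from the question.

```shell
# Hypothetical helper: print the number between "= " and "%" on the Accuracy line.
get_accuracy() {
    # Space and % both act as field separators, so field 3 is the bare number
    awk -F'[ %]' '/^Accuracy/ { print $3; exit }'
}

acc=$(echo 'Accuracy = 80% (24/30) (classification)' | get_accuracy)

# The captured value can now be used in integer comparisons
if [ "$acc" -ge 50 ]; then
    echo "accuracy is $acc"
fi
```

In practice the echo would be replaced by the program that produces the LIBSVM output.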
If you want to do it with sed then you can try something like:
sed '/^Accuracy/s/.* \(.*\)%.*/\1/' inputFile
This might work for you (GNU sed):
sed -nr '/^Accuracy = ([^%]*)%.*/s//\1/p' file

Resources