How to remove values from text file from different position - linux

I have a file containing different values:
30,-4,098511E-02
30,05,-4,098511E-02
41,9,15,54288
I need to remove the trailing values from each line, but they start at a different position on each line. The result should be:
30
30,05
41,9
I tried to do it with sed to remove the last value but my problem is when I encounter the 41,9,15,54288 it does not work. Any idea if there is a way to do it?
I tried this
echo "30,-4,098511E-02" | sed 's/,.*/,/'

Using sed
$ sed -E 's/(([0-9]+,?){1,2}),[0-9-].*/\1/' input_file
30
30,05
41,9

I would do it using perl, like this:
#!/usr/bin/perl
use strict;
use warnings;
# Set the paths here, or leave them empty to be prompted for them.
my $inputPath  = '';   # e.g. '/Users/myuser/Desktop/inputs/a.txt'
my $outputPath = '';   # e.g. '/Users/myuser/Desktop/outputs/a_result.txt'
if ($inputPath eq "") {
    print "Enter the full path of your input file: ";
    $inputPath = <STDIN>;
    chomp $inputPath;
}
if ($outputPath eq "") {
    print "Enter the full path of your output file: ";
    $outputPath = <STDIN>;
    chomp $outputPath;
}
open my $info, '<', $inputPath or die "Could not open $inputPath: $!";
open my $out, '>', $outputPath or die "Could not open $outputPath: $!";
while (my $line = <$info>) {
    chomp $line;
    # How the regex reads, using 30,05,-4,098511E-02 as the example:
    # [0-9]+      one or more digits                         30
    # ,?          optionally followed by a comma             30,
    # (...){1,2}  that group, one or two times               30,05
    # ,           the comma in front of the part to drop     30,05,
    # [0-9-]      a digit or a minus sign                    30,05,-
    # .*          any characters up to the end of the line   30,05,-4,098511E-02
    # Only the outer parentheses capture, so $1 holds the part to keep.
    if ($line =~ m{((?:[0-9]+,?){1,2}),[0-9-].*}) {
        print $out "$1\n"; # Print the kept part to the file
    } else {
        print "not found!\n";
    }
}
close $info;
close $out or die "Could not close $outputPath: $!";
I wrote the explanations of my regex in the comments of my code.

Related

perl program for reading file contents

I want to write a perl program for opening a file, reading its content and then printing the number of lines, words and characters there are. I also want to print the number of times a specific word appeared in the file. Here is what I have done:
#! /usr/bin/perl
open( FILE, "test1.txt" ) or die "could not open file $!";
my ( $line, $words, $chars ) = ( 0, 0, 0 );
while (<FILE>) {
$line++;
$words += scalar( split( /\s+/, $_ ) );
$chars += length($_);
print $_;
}
$chars -= $words;
print(
"Total number of lines in the file:= $line \nTotal number of words in the file:= $words \nTotal number of chars in the file:= $chars\n"
);
As you can clearly see, I don't have any provision for taking user input of the words whose occurrence is to be counted. Because I don't know how to do it. Please help with counting of the number of occurrence part. Thank you
I guess you're doing this for learning purposes, so here is a readable version of your problem (there might be a thousand others, because it's Perl). If not, there's wc on the Linux command line.
Note that I'm using three-argument open; it's generally better to do that.
For counting single words you'll most probably need a hash. I also used <<HERE docs, because they are nicer for formatting. If you have any doubts, just look in the perldoc and ask your questions.
#!/usr/bin/env perl
use warnings; # Always use this
use strict; # ditto
my ($chars,$word_count ,%words);
{
open my $file, '<', 'test.txt'
or die "couldn't open `test.txt':\n$!";
while (<$file>){
foreach (split){
$word_count++;
$words{$_}++;
$chars += length;
}
}
} # $file is now closed
print <<THAT;
Total number of lines: $.
Total number of words: $word_count
Total number of chars: $chars
THAT
# Now to your questioning part:
my $prompt= <<PROMPT.'>>';
Please enter the words you want the occurrences for. (CTRL+D ends the program)
PROMPT
print $prompt;
while(<STDIN>){
chomp; # get rid of the newline
print "$_ ".(exists $words{$_}?"occurs $words{$_} times":"doesn't occur")
." in the file\n",$prompt;
}
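For comparison, the wc route mentioned above can be sketched in the shell. This is only an illustration under assumptions: count_word is a made-up helper name, a "word" means a whitespace-separated field that matches exactly, and the sample text is invented.

```shell
# count_word WORD [FILE...]: hypothetical helper, counts whitespace-separated
# fields that are exactly WORD (reads stdin when no file is given).
count_word() {
    word=$1
    shift
    # Appending "" forces a string comparison even for numeric-looking words.
    awk -v w="$word" '{ for (i = 1; i <= NF; i++) if (($i "") == (w "")) n++ }
                      END { print n + 0 }' "$@"
}

# Invented sample text, standing in for test.txt.
sample='the cat and the hat
the end'

lines=$(printf '%s\n' "$sample" | wc -l)
words=$(printf '%s\n' "$sample" | wc -w)
# Left unquoted on purpose: some wc implementations pad their output.
echo lines: $lines words: $words
echo "the: $(printf '%s\n' "$sample" | count_word the)"
```

Unlike the hash in the Perl version, this rescans the input for every word asked about, which is fine for a handful of lookups.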

How can I get Perl string to keep its original formatting after editing it?

I am attempting to write code that will encrypt letters with a basic cyclic shift cipher while leaving any character that is not a letter alone. I am trying to do this through the use of a sub that finds the new value for each of the letters. When I run the code now, it formats the result so there is a single space between every encrypted letter instead of keeping the original formatting. I also cannot get the result to be only in lowercase letters.
sub encrypter {
my $letter = shift @_;
if ($letter =~ m/^[a-zA-Z]/) {
$letter =~ y/N-ZA-Mn-za-m/A-Za-z/;
return $letter;
}
else {
return lc($letter);
}
}
print "Input string to be encrypted: ";
my $input = <STDIN>;
chomp $input;
print "$input # USER INPUT\n";
my @inputArray = split (//, $input);
my $i = 0;
my @encryptedArray;
for ($i = 0; $i <= $#inputArray; $i++) {
$encryptedArray[$i] = encrypter($inputArray[$i]);
}
print "@encryptedArray # OUTPUT\n";
The problem is how you are printing the array.
Change this line:
print "@encryptedArray # OUTPUT\n";
to:
print join("", @encryptedArray) . " # OUTPUT\n";
Here is an example that illustrates the problem.
#!/usr/bin/perl
my @array = ("a","b","c","d");
print "@array # OUTPUT\n";
print join("", @array) . " # OUTPUT\n";
Output:
$ perl test.pl
a b c d # OUTPUT
abcd # OUTPUT
According to the Perl documentation on print:
The current value of $, (if any) is printed between each LIST item.
The current value of $\ (if any) is printed after the entire LIST has
been printed.
So two other ways to do it would be:
#!/usr/bin/perl
my @array = ("a","b","c","d");
$,="";
print @array, " #OUTPUT\n";
or
#!/usr/bin/perl
my @array = ("a","b","c","d");
$"="";
print "@array #OUTPUT\n";
The perlvar documentation explains $" and $,.
The spaces in your output come from $" (the list separator), because you print the array with print "@encryptedArray", which is equivalent to print join($", @encryptedArray). You could therefore disable them with
local $" = '';
or you could join @encryptedArray yourself before you print it, just as suggested by @Matt.
Note that there is no need for such complexity. tr/// (also known as y///) will convert the whole string for you, like this:
use strict;
use warnings;
print "Input string to be encrypted: ";
chomp(my $input = <STDIN>);
print "$input # USER INPUT\n";
(my $encrypted = $input) =~ tr/N-ZA-Mn-za-m/A-Za-z/;
print "$encrypted # OUTPUT\n";
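The same transliteration is available outside Perl as the standalone tr utility, which also processes the whole input at once and leaves characters outside the listed ranges untouched. A small sketch; rot13 is a hypothetical helper name, and LC_ALL=C is an assumption made here so that the ranges are interpreted as plain ASCII:

```shell
# rot13 is a hypothetical helper name; it filters stdin to stdout.
rot13() {
    # Same two sets as in the Perl tr/// above: each letter moves 13 places.
    LC_ALL=C tr 'N-ZA-Mn-za-m' 'A-Za-z'
}

printf '%s\n' 'Hello, World!' | rot13
# prints: Uryyb, Jbeyq!
```

Because the shift is 13, running the output through rot13 a second time restores the original text.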

grep lines before and after in aix/ksh shell

I want to extract lines before and after a matched pattern.
eg: if the file contents are as follows
absbasdakjkglksagjgj
sajlkgsgjlskjlasj
hello
lkgjkdsfjlkjsgklks
klgdsgklsdgkldskgdsg
I need to find hello and display the lines before and after 'hello'.
the output should be
sajlkgsgjlskjlasj
hello
lkgjkdsfjlkjsgklks
This is possible with GNU grep, but I need a method that works in the AIX / ksh shell, where no GNU tools are installed.
sed -n '/hello/{x;G;N;p;};h' filename
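The sed command above keeps the previous line in the hold space and pulls in the next line with N when it sees a match. If awk reads more easily, the same idea can be sketched as follows, assuming an awk that supports -v (POSIX awk does); context1 is a made-up name and the logic is an illustration, not a tested AIX recipe:

```shell
# context1 PATTERN [FILE...]: hypothetical helper that prints every matching
# line with one line of context on each side (reads stdin without files).
context1() {
    pat=$1
    shift
    awk -v pat="$pat" '
    {
        if ($0 ~ pat) {
            # Show the line before, unless it was already printed.
            if (NR > 1 && last < NR - 1) print prev
            print
            last = NR
            after = 1          # remember to print the next line too
        } else if (after) {
            print
            last = NR
            after = 0
        }
        prev = $0              # remembered for a match on the next line
    }' "$@"
}

printf '%s\n' absbasdakjkglksagjgj sajlkgsgjlskjlasj hello \
    lkgjkdsfjlkjsgklks klgdsgklsdgkldskgdsg | context1 hello
```

Note that awk applies escape processing to values passed with -v, so a pattern containing backslashes needs them doubled.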
I've found it is generally less frustrating to build the GNU coreutils once, and benefit from many more features http://www.gnu.org/software/coreutils/
Since you'll have Perl on the machine, you could use the following code, but you'd probably do better to install the GNU utilities. This has options -b n1 for lines before and -f n2 for lines following the match. It works with PCRE matches, so if you want case-insensitive matching, put (?i) at the start of the regex instead of using a -i option. I haven't implemented -v or -l; I didn't need those.
#!/usr/bin/env perl
#
# @(#)$Id: sgrep.pl,v 1.7 2013/01/28 02:07:18 jleffler Exp $
#
# Perl-based SGREP (special grep) command
#
# Print lines around the line that matches (by default, 3 before and 3 after).
# By default, include file names if more than one file to search.
#
# Options:
# -b n1 Print n1 lines before match
# -f n2 Print n2 lines following match
# -n Print line numbers
# -h Do not print file names
# -H Do print file names
use warnings;
use strict;
use constant debug => 0;
use Getopt::Std;
my(%opts);
sub usage
{
print STDERR "Usage: $0 [-hnH] [-b n1] [-f n2] pattern [file ...]\n";
exit 1;
}
usage unless getopts('hnf:b:H', \%opts);
usage unless @ARGV >= 1;
if ($opts{h} && $opts{H})
{
print STDERR "$0: mutually exclusive options -h and -H specified\n";
exit 1;
}
my $op = shift;
print "# regex = $op\n" if debug;
# print file names if -h omitted and more than one argument
$opts{F} = (defined $opts{H} || (!defined $opts{h} and scalar @ARGV > 1)) ? 1 : 0;
$opts{n} = 0 unless defined $opts{n};
my $before = (defined $opts{b}) ? $opts{b} + 0 : 3;
my $after = (defined $opts{f}) ? $opts{f} + 0 : 3;
print "# before = $before; after = $after\n" if debug;
my @lines = (); # Accumulated lines
my $tail = 0; # Line number of last line in list
my $tbp_1 = 0; # First line to be printed
my $tbp_2 = 0; # Last line to be printed
# Print lines from @lines in the range $tbp_1 .. $tbp_2,
# leaving $leave lines in the array for future use.
sub print_leaving
{
my ($leave) = @_;
while (scalar(@lines) > $leave)
{
my $line = shift @lines;
my $curr = $tail - scalar(@lines);
if ($tbp_1 <= $curr && $curr <= $tbp_2)
{
print "$ARGV:" if $opts{F};
print "$curr:" if $opts{n};
print $line;
}
}
}
# General logic:
# Accumulate each line at end of @lines.
# ** If current line matches, record range that needs printing
# ** When the line array contains enough lines, pop line off front and,
# if it needs printing, print it.
# At end of file, empty line array, printing requisite accumulated lines.
while (<>)
{
# Add this line to the accumulated lines
push @lines, $_;
$tail = $.;
printf "# array: N = %d, last = $tail: %s", scalar(@lines), $_ if debug > 1;
if (m/$op/o)
{
# This line matches - set range to be printed
my $lo = $. - $before;
$tbp_1 = $lo if ($lo > $tbp_2);
$tbp_2 = $. + $after;
print "# $. MATCH: print range $tbp_1 .. $tbp_2\n" if debug;
}
# Print out any accumulated lines that need printing
# Leave $before lines in array.
print_leaving($before);
}
continue
{
if (eof)
{
# Print out any accumulated lines that need printing
print_leaving(0);
# Reset for next file
close ARGV;
$tbp_1 = 0;
$tbp_2 = 0;
$tail = 0;
@lines = ();
}
}
I had a situation where I was stuck with a slow telnet session on a tablet, believe it or not, and I couldn't write a Perl script very easily with that keyboard. I came up with this hacky maneuver that worked in a pinch for me with AIX's limited grep. This won't work well if your grep returns hundreds of lines, but if you just need one line and one or two above/below it, this could do it. First I ran this:
cat -n filename |grep criteria
By including the -n flag, I see the line number of the data I'm seeking, like this:
2543 my crucial data
Since cat gives the line number 2 spaces before and 1 space after, I could grep for the line number right before it like this:
cat -n filename |grep " 2542 "
I ran this a couple of times to give me lines 2542 and 2544 that bookended line 2543. Like I said, it's definitely fallible, for example if you have reams of data that might have " 2542 " all over the place, but just to grab a couple of quick lines, it worked well.
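A variation on the same idea that avoids grepping for the number itself: let grep -n report the line number, then let sed print that range of lines. This is a sketch only; around_first_match is a hypothetical helper name and it handles just the first match:

```shell
# around_first_match PATTERN FILE: hypothetical helper that prints the first
# matching line together with the line before and the line after it.
around_first_match() {
    pattern=$1
    file=$2
    # grep -n prefixes each match with "LINENO:"; keep the first number only.
    n=$(grep -n -e "$pattern" "$file" | head -n 1 | cut -d: -f1)
    [ -n "$n" ] || return 1          # no match at all
    start=$((n - 1))
    [ "$start" -ge 1 ] || start=1    # the match was on line 1
    # sed -n 'A,Bp' prints only lines A through B.
    sed -n "${start},$((n + 1))p" "$file"
}
```

It would be called as around_first_match criteria filename. Since sed selects by line position, stray text such as " 2542 " elsewhere in the data cannot cause false hits.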

Replace character with other character in a text file using perl

I am having a problem parsing the output from a text file. I want to add a pipe symbol between the entries so that I can do a multiple search similar to egrep. The text file is as follows:
service entered the stopped state,critical
service entered the running state,clear
Code:
open(my $data, '<', $Config_File) or die "Could not open '$Config_File': $!";
my $reg_exp;
my $severity;
my @fields=();
while (my $line = <$data>)
{
chomp $line;
if(!$line =~ /^$/)
{
@fields = split "," , $line;
$reg_exp = $fields[0];
$severity = $fields[1];
print $reg_exp;
}
}
#print $fields[0];
#last unless defined $line;
close($data);
expected output
service entered the stopped state|service entered the running state
You are not far off, you just need to actually concatenate the strings. The simplest way would be to push the $fields[0] to an array, and wait until the input is done to print it. I.e.:
my @data;
while (my $line = <$data>) {
next if $line =~ /^$/; # no need to chomp
my @fields = split /,/, $line;
push @data, $fields[0];
}
print join("|", @data), "\n";
I sense that you are trying to achieve something else with this code, and that this is a so-called XY-problem.
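Since the stated goal is an egrep-style alternation, the same collect-then-join step can also be sketched with standard command-line tools. join_patterns is a hypothetical helper name, and the input is assumed to have the comma-separated layout shown in the question:

```shell
# join_patterns is a hypothetical helper name; it reads the config on stdin.
join_patterns() {
    # 1. drop empty lines, 2. keep the text before the first comma,
    # 3. join all remaining lines with "|".
    grep -v '^$' | cut -d, -f1 | paste -s -d '|' -
}

printf '%s\n' 'service entered the stopped state,critical' \
    'service entered the running state,clear' | join_patterns
```

The resulting string could then be handed to grep -E as the pattern.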

My array is showing empty after I insert huge data into it in perl

#!/usr/bin/perl -w
################################################################################
##Get_Duration.pl
#
# This is a perl script which is used to parse the audio files
# present in the device and build's the xml containing all the
# track i.e both audio and video files duration
#
# The xml file is created in the name of ParsedMetadataInformation.xml
# in <ATAF Path>/tmp/ directory.
#
#
# CHANGE HISTORY
# --------------------------------------------------------------------------
use strict;
use warnings;
use Env;
use File::Find;
use XML::TreePP;
use Data::Dumper;
my $data;
if (not defined $ATAF){
print "=====================================================\n";
print "ERROR: ATAF Path is not set.\n";
print "(Example: export ATAF=/home/roopa/ATAF)\n";
print "=====================================================\n";
exit 1;
}
print "Enter the Absolute path for the device to be scanned\n";
print "(Example: /media/RACE_1.6A)\n";
$DB::single=1;
my @metadataInfo = ();
print "Enter Path:";
my $configDir = <STDIN>;
chomp $configDir;
my @configFiles;
find( sub {push @configFiles, "$File::Find::name$/" if (/\.mp3|\.wma|\.wav|\.ogg| \.flac| \.m4a|\.mp4|\.avi|\.mpg|\.mpeg|\.mov|\.wmv|\.m4b$/i)}, $configDir);
chomp @configFiles;
if (!@configFiles){
print "=====================================================\n";
print "ERROR: No Files Found!!!\n";
print "=====================================================\n";
exit -1;
}
my $tpp = XML::TreePP->new();
my $metadataHashTree1 = ();
print "=====================================================\n";
print "Extracting the Metadata Information\n";
print "=====================================================\n";
foreach my $file (@configFiles){
print "Currently in: $file\n";
(my $fileName = $file) =~ s/^.*\///g;
$file =~ s/([\!\$\^\*\&\(\)\|\}\{\[\]\:\"\;\'\?\>\<\,\=\`\s])/\\$1/g;
@metadataInfo = (`ffmpeg -i $fileName`);
my $size= scalar (@metadataInfo);
#chomp @metadataInfo;
foreach my $eachfile (@metadataInfo){
if ($eachfile =~ m/^Duration: /i){
$eachfile =~ m/Duration:(.*?),/;
$data= $1;
$metadataHashTree1->{$fileName}->{'Duration'}=$data;
}
}
}
print "=====================================================\n";
print "Building XML tree\n";
print "=====================================================\n\n";
my $xml = $tpp->write($metadataHashTree1);
sleep 5;
print "=====================================================================\n";
print "Writing the XML tree in <ATAF Path>/tmp/ParsedMetadataInformation.xml\n";
print "=====================================================================\n\n";
open (FILEHANDLE, ">$ATAF/tmp/ParsedDurationInformation.xml") or die "ERROR: $!\n";
print FILEHANDLE $xml;
close FILEHANDLE;
sleep 5;
print "=====================================================\n";
print "Successfully Completed!!!\n";
print "=====================================================\n\n";
########################################################################################
In the above program I am trying to get the duration of a track using the ffmpeg command and save the output in @metadataInfo. But the array size shows 0 if I try to print it using the command
$size= scalar (@metadataInfo);
"$File::Find::name$/"
should be
$File::Find::name
Appending $/ makes no sense.
You don't convert the file name to a shell literal.
`ffmpeg -i $fileName`
should be
use String::ShellQuote qw( shell_quote );
my $cmd = shell_quote('ffmpeg', '-i', $fileName);
`$cmd`
This will handle problems such as spaces in the file name.
You don't check if the backticks succeeded. What's the value of $?? And if that's -1, what's the value of $!?
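One more thing worth checking alongside $?: ffmpeg -i writes its stream information, including the Duration line, to standard error, while backticks (like shell command substitution) capture only standard output. The sketch below shows the effect with a stand-in function, since ffmpeg may not be installed; fake_probe and its sample line are invented for the illustration:

```shell
# fake_probe is a hypothetical stand-in for "ffmpeg -i FILE": it writes its
# report to standard error, not standard output.
fake_probe() {
    echo "  Duration: 00:03:25.10, start: 0.000000, bitrate: 128 kb/s" >&2
}

# Command substitution sees standard output only, so this stays empty.
stdout_only=$(fake_probe 2>/dev/null)

# Redirecting standard error into standard output makes the text capturable.
both=$(fake_probe 2>&1)

printf 'stdout only: [%s]\n' "$stdout_only"
printf 'with 2>&1:   [%s]\n' "$both"
```

If that is the cause here, appending 2>&1 to the command inside the backticks would likely be enough for the array to fill.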
