I'm new to Perl and am reading a CSV file with it. The first column of the CSV is time (a float). I've read the CSV and displayed its contents successfully. Next, I want to use the CSV data for some computations, so I need the time column as an array (or any other data structure). When I read the time column and store it in an array, it is stored as strings. I want a numeric array for arithmetic computations.
I've tried adding 0, multiplying by 1, and using sprintf before storing the value in the array, but I'm encountering errors.
use v5.30.0;
use strict;
use warnings;
my $file = $ARGV[0] or die;
open(my $data, '<',$file) or die;
my @timeArray;
while(my $line = <$data>){
    chomp $line;
    my @words = split ",",$line;
    #my $temp=$words[1]*1;
    my $temp=sprintf "%.6f",$words[1];
    push @timeArray,$temp;
}
Error:
Argument ""67.891947295"" isn't numeric in multiplication (*) at 3.pl line 12, <$data> line 19556.
and
Argument ""67.840034174"" isn't numeric in sprintf at 3.pl line 13, <$data> line 19555.
Also, why is the argument shown in doubled quotes ("" "")?
It's a good idea to handle data like that with the proper module, because there are several important details that you didn't take care of. Examples:
Column values may be enclosed in quotes. This is what happens in your file: the field itself contains the quote characters, and Perl's warning wraps the value in its own quotes, which is why you see ""67.891947295"".
The first row may contain the header names of each column
The last record in the file may or may not have an ending line break
Etc.
Read RFC 4180 for more information.
There are lots of modules that can parse the CSV format, for example Text::CSV. It's very easy to install, and once you use it, your string-to-double problem will disappear.
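To illustrate why the quotes matter, here is a minimal sketch in Python rather than Perl, chosen only for illustration. The sample data and the function name read_time_column are hypothetical, and the time column is assumed to be the first one, as described in the question. A CSV-aware parser strips the enclosing quotes before the numeric conversion, whereas a plain split on commas leaves them inside the field.

```python
import csv
import io


def read_time_column(handle, column=0, has_header=False):
    """Return one CSV column as a list of floats."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the header row, if the file has one
    # Blank lines come back as empty rows, so they are skipped.
    return [float(row[column]) for row in reader if row]


# Hypothetical sample: the time values are quoted, as in the warnings above.
sample = '"67.840034174",1\n"67.891947295",2\n'

times = read_time_column(io.StringIO(sample))
print(times)

# A naive split keeps the quote characters inside the field,
# so a direct numeric conversion of this string would fail.
naive_field = sample.splitlines()[0].split(",")[0]
print(repr(naive_field))
```

The same idea applies in Perl: let the CSV module remove the quotes, then use the field in arithmetic as usual.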
I have got a DateTime object that is in a specific locale (with a DateTime::Locale object attached to it). I want to write a date string to an XLS file using Spreadsheet::WriteExcel, but I want the output that's visible to user of the Excel file to be of the same locale as the one attached to my DateTime object.
There is some documentation on this matter within Spreadsheet::WriteExcel. It's possible to set formats using a combination of a string, $wb->add_format() and $ws->write_date_time().
I can get locale information from DateTime via DateTime::Locale by looking at the CLDR patterns. There are also the named formats, which are easier to use. Something like $locale->date_format_short is actually quite nice.
use DateTime::Locale;
say DateTime::Locale->load('en_GB')->date_format_short; # dd/MM/y
say DateTime::Locale->load('en_US')->date_format_short; # M/d/yy
The problem is that Excel does not know what a single y means. My workaround has been to replace a single y with yy, as that seems to be roughly equivalent.
Excel also doesn't like upper case letters in the format. I have no idea how it distinguishes between minutes and months, but it works.
This example seems to work, but I am sure it's not perfect.
use strict;
use warnings;
use Spreadsheet::WriteExcel;
use DateTime;
use DateTime::Locale;
my $workbook = Spreadsheet::WriteExcel->new("date_time.xls");
my $worksheet = $workbook->add_worksheet();
# Write the column headers
$worksheet->write('A1', 'Formatted date');
$worksheet->write('B1', 'Format');
$worksheet->write('C1', 'Locale');
my $row = 0;
for my $locale (qw/en_GB en_US de_DE ko_KR/) {
    $row++;
    my $format_string =
        lc(DateTime::Locale->load($locale)->date_format_short)
        =~ s{
            (?<!y) # 2. not preceded by a y
            y      # 1. a single y
            (?!y)  # 3. not followed by a y
        }{yy}xr;   # 4. replaced with two y
    my $format = $workbook->add_format(num_format => $format_string);
    $worksheet->write_date_time($row, 0, DateTime->now->datetime, $format);
    $worksheet->write($row, 1, $format_string);
    $worksheet->write($row, 2, $locale);
}
This produces the following Excel file.
They all work, but the code is smelly. Is there something I've overlooked? Maybe someone has written a more correct converter for these format strings that I've not seen yet.
Please note that DateTime::Format::Excel is not helpful as it only works the other way around, turning Excel dates into DateTime objects.
I want to read a .csv file with large strings with SAS. This is my file tmp.csv in comma separated values format
1,1005725,[(B42.ND761).B437]1-8-1-1-1-3-3-3-2-2/RT0658,5S3563A/RT0658,,,5S3563A,RT0658
2,09VL101,20347 PL6 O94 E98-1-0/K9616LM,19058/K9616LM,19058,,19058,K9616LM
3,09VL102,20351 PL6-1-0/K9616LM 19060/K9616LM,,19060,,19060,K9616LM
4,09VL103,20347 PL6 O94 E98-2-0/K9962LM,AID19058A/K9962LM,19058,,AID19058A,K9962LM
5,09VL105,,V4649A/F0001LM,,,V4649A,F0001LM
I've used this code, but it hasn't worked.
DATA datos;
INFILE "C:\Users\UserName\Documents\tmp.csv" DLM="," DSD MISSOVER;
INPUT Num Code :$7. Pedigree : $44. LineCode : $17. FemaleCode $5. MaleCode $ NFemale $9. NMale $7. ;
RUN;
This should be the result
Correct Data
I think Joe has the right idea - your variable lengths are messed up. I was able to produce the desired result using your code but with some renaming and resizing of your variables.
DATA datos;
INFILE "C:\Users\UserName\Documents\tmp.csv" DLM="," DSD MISSOVER;
INPUT a:$1. b:$7. c:$44. d:$17. e:$5. f:$9. g:$7.;
RUN;
I know it seems like you are saving typing by putting the informats in the input statement, but I think it is much easier to define the variables first and then write the input statement. Especially when reading from a delimited file. If you define the variables in the same order that you want to read them you can even just use a variable list in the INPUT statement.
DATA datos;
INFILE "C:\Users\UserName\Documents\tmp.csv" DSD TRUNCOVER;
LENGTH Num 8 Code $7 Pedigree $44 LineCode $17 FemaleCode $5 MaleCode $8 NFemale $9 NMale $7 ;
INPUT Num -- NMale ;
RUN;
Also, it is generally better to use the TRUNCOVER option instead of MISSOVER on the INFILE statement. Most of the time you do not want SAS to set the value to missing when you ask it to read 7 characters and there are only 3 available on the line. You would prefer to have SAS use the 3 characters that are available. It won't make a difference on delimited input, but if you use formatted input without the : modifier you can miss data.
I want to write a function that loads a text file and plots its content with time. I have 20 text files so I want to be able to choose from them.
My current, non-working code:
TextFile is a generic variable
text123.txt is the actual name of one of the files I want to load
function []= PlotText(TextFile)
text(1,:)=load('text123.txt') ;
t=0:10;
plot(t,text)
end
I appreciate any help!!
Use importdata instead of load, with the appropriate delimiter. I assume you used a tab.
filename = 'num.txt';
delimiterIn = '\t';
text = importdata(filename,delimiterIn)
t=1:10;
plot(t,text);
Firstly, you can also use dlmread if your file contains only numeric data separated by the same symbol (called a delimiter) such as a comma (,), semicolon (;), space ( ), or tab ( ). This would look like:
function []= PlotText(TextFile)
text(1,:)=dlmread('text123.txt');
t=0:10;
plot(t,text)
end
Keep in mind that your code is written in a way that expects the contents of text123.txt to have 11 values in a single row. Also, if you are using multiple files, then I suggest having the file name be another input to the function:
function []= PlotText(TextFile,filename)
text(1,:)=load(filename) ;
t=0:10;
plot(t,text)
end
I don't know if Matlab can do this, but I want to store some strings in a 4×3 matrix, each element in the matrix is a string.
test_string_01 test_string_02 test_string_03
test_string_04 test_string_05 test_string_06
test_string_07 test_string_08 test_string_09
test_string_10 test_string_11 test_string_12
Then, I want to write this matrix into a plain text file, either comma or space delimited.
test_string_01,test_string_02,test_string_03
test_string_04,test_string_05,test_string_06
test_string_07,test_string_08,test_string_09
test_string_10,test_string_11,test_string_12
It seems the matrix data type is not capable of storing strings. I looked at cell arrays. I tried to use dlmwrite() or csvwrite(), but both of them only accept matrices. I also tried cell2mat() first, but that way all the letters in the strings are comma separated, like
t,e,s,t,_,s,t,r,i,n,g,_,0,1,t,e,s,t,_,s,t,r,i,n,g,_,0,2,t,e,s,t,_,s,t,r,i,n,g,_,0,3
So is there any way to achieve this?
It is possible to shorten yuk's solution a bit.
strings = {
'test_string_01','test_string_02','test_string_03'
'test_string_04','test_string_05','test_string_06'
'test_string_07','test_string_08','test_string_09'
'test_string_10','test_string_11','test_string_12'};
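fid = fopen('output.txt','w');
fmtString = [repmat('%s\t',1,size(strings,2)-1),'%s\n'];
% {:} expands a cell array column by column, so transpose first
% to feed fprintf the strings in row order.
stringsT = strings.';
fprintf(fid,fmtString,stringsT{:});
fclose(fid);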
Cell array is the way to store strings.
I agree it's a pain to save strings into a text file, but you can do it with this code:
strings = {
'test_string_01','test_string_02','test_string_03'
'test_string_04','test_string_05','test_string_06'
'test_string_07','test_string_08','test_string_09'
'test_string_10','test_string_11','test_string_12'};
fid = fopen('output.txt','w');
for row = 1:size(strings,1)
fprintf(fid, repmat('%s\t',1,size(strings,2)-1), strings{row,1:end-1});
fprintf(fid, '%s\n', strings{row,end});
end
fclose(fid);
Substitute \t with , to get a CSV file.
You can also store cell array of strings into Excel file with XLSWRITE (requires COM interface, so it's on Windows only):
xlswrite('output.xls',strings)
In most cases you can use an empty delimiter ('') and get Matlab to save a string to a file with dlmwrite.
For example,
output=('my_first_String');
dlmwrite('myfile.txt',output,'delimiter','')
will save a file named myfile.txt containing my_first_String.
I have multiple folders, each containing multiple txt files. I need to extract data (just a single value: value --->554) from one particular txt file in each folder (individual_values.txt).
No 100 Value 555 level match 0.443 top level 0.443 bottom 4343
There will be many folders containing a txt file with the same name but a different value. Can all these values be copied to Excel, one below the other?
I have to extract a value from the txt file shown above. A file with the same name is located inside each of the different folders. All I want to do is extract this value from every such file and paste the values into Excel or a txt file, one below the other, one per row.
E.g. the above is one text file; here I have to get the value 555, and similarly the other values from the other files.
555
666
666
776
Yes.
(you might want to clarify your question )
Your question isn't very clear, I imagine you want to know how this can be done.
You probably need to write a script that traverses the folders, reads the individual files, parses them for the value you want, and generates a Comma Separated Values (CSV) file. CSV files can easily be imported to Excel.
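The steps described above could be sketched roughly as follows, in Python. This assumes every target file is named individual_values.txt (as in the question) and contains a line of the form "... Value 555 ..." with an integer value. The function names collect_values and write_csv are hypothetical, and only the first match in each file is kept.

```python
import csv
import os
import re

# Matches "Value 555" and captures the number (assumed to be an integer).
VALUE_RE = re.compile(r"\bValue\s+(\d+)")


def collect_values(root, filename="individual_values.txt"):
    """Walk root and return the first Value found in each matching file."""
    values = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a predictable order
        if filename in filenames:
            with open(os.path.join(dirpath, filename), encoding="utf-8") as fh:
                match = VALUE_RE.search(fh.read())
            if match:
                values.append(match.group(1))
    return values


def write_csv(values, out_path):
    """Write one value per row so Excel shows them one below the other."""
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows([v] for v in values)
```

Calling write_csv(collect_values(top_folder), "values.csv"), where top_folder is the directory that holds all the folders, would produce a single-column file that Excel can open directly.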
There are two or three basic methods you can use to get data into an Excel spreadsheet.
You can use OLE wrappers to manipulate Excel.
You can write the file in a binary form.
You can use Excel's import methods to take delimited text in as a spreadsheet.
I chose the last option because (1) it is the simplest, and (2) your problem, as stated, does not call for anything more complex. The solution below outputs a tab-delimited text file that Excel can easily import.
In Perl:
use IO::File;
my @field_names = split m|/|, 'No/Value/level match/top level/bottom';
#' # <-- catch runaway quote
my $input = IO::File->new( '<data.txt' );
die 'Could not open data.txt for input!' unless $input;
my @data_rows;
while ( my $line = <$input> ) {
    # Capture name/value pairs straight into a hash.
    my %fields = $line =~ /(level match|top level|bottom|Value|No)\s+(\d+\S*)/g;
    push @data_rows, \%fields if exists $fields{Value};
}
$input->close();
my $tab_file = IO::File->new( '>data.tab' );
die 'Could not open data.tab for output!' unless $tab_file;
$tab_file->print( join( "\t", @field_names ), "\n" );
# Iterate over the rows collected above.
foreach my $data_ref ( @data_rows ) {
    $tab_file->print( join( "\t", @$data_ref{@field_names} ), "\n" );
}
$tab_file->close();
NOTE: Excel's text processing is really quite neat. Try opening the text below (replacing the \t with actual tabs) -- or even copying and pasting it:
1\t2\t3\t=SUM(A1:C1)
I chose C# because I thought it would be fun to use a recursive lambda. This will create a CSV file containing the matches to the regex pattern.
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

string root_path = @"c:\Temp\test";
string match_filename = "test.txt";
Func<string,string,StringBuilder, StringBuilder> getdata = null;
getdata = (path,filename,content) => {
    // Collect the value from every matching file in this folder.
    Directory.GetFiles(path)
        .Where(f=>
            Path.GetFileName(f)
                .Equals(filename,StringComparison.OrdinalIgnoreCase))
        .Select(f=>File.ReadAllText(f))
        .Select(c=> Regex.Match(c, @"value[\s\t]*(\d+)",
            RegexOptions.IgnoreCase))
        .Where(m=>m.Success)
        .Select(m=>m.Groups[1].Value)
        .ToList()
        .ForEach(m=>content.AppendLine(m));
    // Recurse into the subfolders.
    Directory.GetDirectories(path)
        .ToList()
        .ForEach(d=>getdata(d,filename,content));
    return content;
};
File.WriteAllText(
    Path.Combine(root_path, "data.csv"),
    getdata(root_path, match_filename, new StringBuilder()).ToString());
No.
just making sure you have a 50/50 chance of getting the right answer
(assuming it was a question answerable by Yes and No) hehehe
File_not_found
Gotta have all three binary states for the response.