Issue with Spaces when Scripting Scala with Process - string

I am writing an automated script to pngcrush my images (for a website I am working on), using Scala as the scripting language. Everything is going well, except for a problem with spaces when I execute the command. I read that you need to use
Seq(a,b,c,d)
where a,b,c,d are strings (that are meant to be separated by a single space) to deal with how Scala/Java handle Strings
The relevant code I have for generating the command to be executed is here. The result variable contains literal path to every filename
for (fileName <- result) {
val string = Seq("pngcrush","brute","-d","\"" + folder.getPath + "/\"","-e",fileName.getName) ++ fileName.getCanonicalPath.replace(" ","\\ ").split(" ").toSeq
I then use
string!
To execute the command. The problem is that the filename for the last section of the command (after the "-e " flag) isn't executed properly because it cannot deal with the directories that have spaces. An example output is shown below
List(pngcrush, brute, -d, "/tmp/d75f7d89-9ed5-4ff9-9181-41ae2fd82da8/", -e, users_off.png, /Users/mdedetrich/3dot/blublocks/src/main/webapp/img/sidebar/my\, group/users_off.png)
And if I run reduceLeft to get the spaces back I obviously get what the proper string is.
pngcrush brute -d "/tmp/1eaca157-0e14-430c-b0a4-677491d70583/" -e users_off.png /Users/mdedetrich/3dot/blublocks/src/main/webapp/img/sidebar/my\ group/users_off.png
Which is what the correct command should be (running the string manually in terminal works fine). However when I attempt to run this through Scala script, I get this
Could not find file: users_off.png
Could not find file: /Users/mdedetrich/3dot/blublocks/src/main/webapp/img/sidebar/my\
Could not find file: group/users_off.png
CPU time decoding 0.000, encoding 0.000, other 0.000, total 0.000 seconds
Any idea what I am doing incorrectly? It seems to be a problem with Scala not parsing strings that have spaces (and splitting it with Seq is not working either). I have tried both using a literal string with spaces and Seq, neither of which seem to work.

Why are you doing this:
replace(" ","\\ ").split(" ")
That is what's splitting the argument, not Process. Why don't you just use the following?
val string = Seq("pngcrush",
"brute",
"-d","\"" + folder.getPath + "/\"",
"-e",fileName.getName,
fileName.getCanonicalPath)
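To make the point easy to check, here is a minimal sketch in Python rather than Scala (an analogue of the Seq-based call, not the code from the question). It assumes a hypothetical path containing a space and uses a tiny child process that prints the arguments it receives. When a command is given as a list of arguments, each element reaches the program as exactly one argument, so no escaping of spaces is needed; escaping and then splitting on spaces is what breaks the path apart.

```python
import subprocess
import sys

# Hypothetical path containing a space, standing in for the question's file.
path_with_space = "/tmp/my group/users_off.png"

# A tiny child program that prints each argument it receives on its own line.
child = "import sys; print('\\n'.join(sys.argv[1:]))"


def received_args(args):
    """Run the child with the given argument list and return what it saw."""
    out = subprocess.run(
        [sys.executable, "-c", child, *args],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


# Passed as one list element, the path arrives as a single argument.
intact = received_args(["users_off.png", path_with_space])

# Escaping the space and then splitting on it (as in the question)
# turns the one path into two separate arguments.
broken = received_args(path_with_space.replace(" ", "\\ ").split(" "))

print(intact)
print(broken)
```

The second call reproduces the shape of the "Could not find file" errors: one argument ending in a backslash and another starting with "group/".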

Related

Passing a DOS/Windows path in python

I run this code:
source_path="c:\temp\\"
dest_path="c:\\temp2\\"
for Name in UserNames:
run_win_cmd("robocopy "+ source_path + Name + "* "+ dest_path)
Name in UserNames gives a name like JBlackstone, and I get the following:
b'ERROR : Invalid Parameter #2 : "emp\\Flastone*"\r\n'
Complete with the b'. No matter how I format the backslashes for the command line, it ends up wrong. Here it read \temp as a tab followed by emp. If I double the backslashes (\\temp\\), it puts double backslashes in the output. If I don't, it reads them as escape characters. I am using run_win_cmd to call the code.
Suggestions would be much appreciated.
I just rewrote it using Popen directly, which appears to be the way most people do this in Python now. I also piped stdout and stderr to a file. Doing that formatted the text correctly.
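For reference, a short sketch of what the string literals in the question actually contain. The variable names below are made up for illustration. In a normal Python string "\t" is a tab, so "c:\temp\\" does not contain the text \temp at all. Doubling the backslash or using a raw string gives the intended path, and the doubled backslashes seen in the b'...' error are only how repr displays a single backslash.

```python
# "\t" is parsed as a TAB character, so this is: c: <TAB> emp \
source_bad = "c:\temp\\"

# Each "\\" in a normal literal is one real backslash.
source_ok = "c:\\temp\\"

# A raw string keeps backslashes as written; a raw literal cannot end
# in a single backslash, so the trailing one is appended separately.
source_raw = r"c:\temp" + "\\"

# repr() shows backslashes doubled, although the string holds single ones.
shown = repr(source_ok)

print(source_ok)   # the actual characters
print(shown)       # the escaped display form
```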

Strings containing " - " always break onto newline with ruamel.yaml

I'm fairly new to YAML, within a Python 3.7 project, and decided to use ruamel.yaml to get me started. I intend to use it to store metadata associated with some video files.
I am creating YAML files with the following code:
data[filename] = [{'video': video_path},
{'key_frame': frame_path},
{'processed': get_timestamp()}]
yaml.dump(data, file_handle)
The created YAML file looks like this:
video.mp4:
- video: /Users/xyz/video.mp4
- key_frame: /Users/xyz/imgOutput/frame
- Trigger.jpg
- processed: '2018-07-26 17:09:06'
The issue is that the key_frame is a file called "frame - Trigger.jpg". However, the line always breaks at the " - " (i.e. space-dash-space) in the filename. The result is a file that, for something meant to be human-readable, looks very wrong. In fact, it's processed correctly when it's read back in (using yaml.open), and treated as a single string filename as it should be. It's just the formatting in the YAML file that's wrong.
Any thoughts on the cause? Is this expected behaviour? I've tried many different ways of quoting the string in case that's it (which doesn't make a difference - even quoted it will split over the line), but fundamentally it does work, from a code sense - but as YAML's big selling point is human-readable files, it'd be nice to understand what's causing it and how to fix it.
In YAML plain scalars (i.e. the ones without single or double quotes) can be wrapped to an indented newline on whitespace. That is what's happening.
To reproduce this is difficult as your question is quite incomplete, but some things can be easily seen from the output:
data is a dict
filename, video_path, and frame_path are defined as strings.
file_handle is probably some file stream opened for writing.
Others are less easily deduced:
get_timestamp() doesn't return a datetime.datetime() instance as one would expect from its name, but a string representation thereof. To prevent this string from being interpreted as a timestamp, it has to be quoted.
you are using the default YAML() instance (which equals typ='rt'), as the non-default ones would write the leaf mappings in flow style ( - {video: /Users/xyz/video.mp4}, etc.)
With that and the appropriate imports you can make a functioning program:
import datetime
import sys
import ruamel.yaml
yaml = ruamel.yaml.YAML(typ='rt')
def get_timestamp():
return datetime.datetime(2018, 7, 26, 17, 9, 6).isoformat(sep=' ', timespec='seconds')
data = {}
filename = 'video.mp4'
video_path = '/Users/xyz/video.mp4'
frame_path = '/Users/xyz/imgOutput/frame - Trigger.jpg'
file_handle = sys.stdout
data[filename] = [{'video': video_path},
{'key_frame': frame_path},
{'processed': get_timestamp()}]
yaml.dump(data, file_handle)
and this outputs:
video.mp4:
- video: /Users/xyz/video.mp4
- key_frame: /Users/xyz/imgOutput/frame - Trigger.jpg
- processed: '2018-07-26 17:09:06'
So we forgot something and that is:
yaml.width = 24 # range from 24-38 inclusive
with that you get your output:
video.mp4:
- video: /Users/xyz/video.mp4
- key_frame: /Users/xyz/imgOutput/frame
- Trigger.jpg
- processed: '2018-07-26 17:09:06'
so just remove the yaml.width = line and you should be all set.
Next time please provide a minimal, but complete, functioning program that actually produces the output.
My guess is that your frame_path is much longer than you show here, and that you don't have a user xyz. That pushes the line over the default width (defined in the emitter to be 80) and causes the plain scalar to wrap. Just set yaml.width = 4096 or whatever is necessary for your scalar length and nesting depth.
When in doubt about whether the YAML output is correct, read it back in (using YAML(typ='safe').load(input_stream)); it should produce the original data.
Can you try str(frame_path)?
data[filename] = [{'video': video_path},
{'key_frame': str(frame_path)},
{'processed': get_timestamp()}]
There is nothing special about the dash. If the string is longer than a certain threshold it will break at the first space after that. The examples you gave do not reproduce this behaviour for me, but longer strings do.
The generated YAML is valid. Any string, if quoted or not, can be broken up to several lines.
Maybe you can adjust the threshold in ruamel. I can't find anything in the documentation, though.
(See also my article Strings in YAML)

linux - running system command in R and then writing output to a file

I have R code as below. Below code resides in a file called 'iot.R'. I am executing it in Linux.
I want to print content of variable 'fileinformation' to a file mentioned by file=fileConn...
I thought that the 3rd line will solve the issue, but it is not giving the required output :(
fileinformation = system(paste("file", filenames[1]))
#print(fileinformation)
cat(print(fileinformation),"\r\n","\r\n", file=fileConn)
When I run the file, I get the result below. It prints to my screen rather than writing to the file :(
> source('iot.R')
CH7Data_20130401T135010.csv: ASCII text, with CRLF line terminators
[1] 0
--------------------update1
I also tried the command below, but didn't get the expected result.
cat(capture.output(fileinformation),"\r\n","\r\n", file=fileConn)
You need to set the intern argument to TRUE in your call to system. For instance:
fileinformation<-system("file cinzia_2.gif",intern=TRUE)
fileinformation
#[1] "cinzia_2.gif: GIF image data, version 89a, 640 x 640"
I tried this on a file on my own PC, of course. With intern set to TRUE, the return value of system becomes the console output of the command rather than its exit status. Then, when you call cat, you don't need to wrap fileinformation in print; a simple cat(fileinformation,"\r\n","\r\n", file=fileConn) will suffice.
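The same distinction exists in other languages. As a rough Python analogue (not R code, with a stand-in child command instead of the file utility, and a made-up output path), a command run without capturing prints to the terminal and hands back only its exit status, which is where the [1] 0 in the question comes from. Capturing the output gives a string that can then be written to a file.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for `file <name>`: a child process that prints one line.
cmd = [sys.executable, "-c", "print('example.csv: ASCII text')"]

# Not captured: output goes straight to the terminal and only the
# exit status comes back (like system() without intern=TRUE).
status = subprocess.run(cmd).returncode

# Captured: the output is returned as a string (like intern=TRUE).
captured = subprocess.run(
    cmd, capture_output=True, text=True, check=True
).stdout

# Write the captured text to a file (hypothetical location).
out_path = os.path.join(tempfile.mkdtemp(), "fileinfo.txt")
with open(out_path, "w") as fh:
    fh.write(captured + "\n")

with open(out_path) as fh:
    written = fh.read()
```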
Just a comment, as I don't have enough rep to comment in the normal way, but can't you use
write.table
to save the output to a file? It may be easier.

learnyounode 'My First I/O' example

This program puzzles me. The goal of this program is to count the number of newlines in a file and output it in command prompt. Learnyounode then runs their own check on the file and sees if their answer matches your answer.
So I start with the answer :
var fs = require('fs');
var filename = process.argv[2];
file = fs.readFileSync(filename);
contents = file.toString();
console.log(contents.split('\n').length - 1);
learnyounode verifies that this program correctly counts the number of new lines. But when I change the program to any of the following, it doesn't print out the same number as learnyounode prints out.
file = fs.readFileSync('C:/Nick/test.txt');
file = fs.readFileSync('test.txt');
Shouldn't nodejs readFileSync be able to input an address and read it correctly?
Lastly, this program is supposed to print out the # of newlines in a program. Why does both the correct program and learnyounode print out the same number that is different from the amount of newlines everytime I run this program?
For example, the number of newlines in test.txt is 3. But running this program prints out a different number everytime, like 45, 15, 2, etc. Yet at the same time, it is verified as a correct program by learnyounode because both their answers match! What is going on?
EDIT:
test.txt looks like this
ok
testing
123
So, I tried your program on my local machine and your program works fine. I am not an expert on learnyounode. I just tried it after your question but I think I understand how it works. As such, here are the answers to your questions:
Shouldn't nodejs readFileSync be able to input an address and read it correctly?
This method from nodejs is working fine. You can try printing the contents of the file and you'll see that there are no problems.
Why does both the correct program and learnyounode print out the same number that is different from the amount of newlines everytime I run this program.
learnyounode is running your program with a different filename as input each time. It verifies the output of your program by running its own copy of correct code against the same file.
But when I change the program to any of the following, it doesn't print out the same number as learnyounode prints out.
That is because at this point, your code is processing a fixed file whereas learnyounode is still processing different files on each iteration.
This tripped me up too. If you read the learnyounode instructions closely they explicitly say...
"The full path to the file to read will be provided as the first command-line argument."
This means they are providing the path to their own file.
When you use process.argv[2], this is passing in the 3rd array item (the learnyounode test txt file) into your script. If you run a console.log(process.argv); you'll see the full array object looks something like this:
[ '/usr/local/bin/node',
'/Users/user/pathstuff/learnyounode/firstio.js',
'/var/folders/41/p2jvc80j26l7nty0sk0zs1z40000gn/T/_learnyounode_1613.txt' ]
The reason the validation numbers begin to mismatch when you substitute your own text file for theirs is that your file always has 3 lines, whereas their unit tests keep passing in files of different lengths via process.argv.
Hope that helps.
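To see why the number changes from run to run, here is a small Python model of the situation. It is not learnyounode itself: the run_with_lines helper and the one-line "solution" are stand-ins. A harness writes a temporary file of a different length each time and passes its path as the first command-line argument, so a program that reads its argument reports a different count per run, while a program with a hard-coded filename would always report the same one.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the exercise solution: count the newlines in the file
# whose path is given as the first command-line argument.
solution = "import sys; print(open(sys.argv[1]).read().count('\\n'))"


def run_with_lines(n):
    """Write a temp file with n lines and run the solution on its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("line\n" * n)
    try:
        out = subprocess.run(
            [sys.executable, "-c", solution, f.name],
            capture_output=True, text=True, check=True,
        ).stdout
        return int(out)
    finally:
        os.remove(f.name)


# The harness picks a different file each run, so the counts differ.
counts = [run_with_lines(n) for n in (45, 15, 2)]
print(counts)
```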
When you use process.argv[2] in learnyounode, the argument is provided by learnyounode automatically, so verification prints a different number of lines (45, 15, 2, etc.) on different runs.
If you remember the second challenge "BABYSTEPS" carefully this was given:
learnyounode will be supplying arguments to your program when you run
learnyounode verify program.js so you don't need to supply them yourself.
That's why the line counts differ each time program.js is verified.
There are two different ways.
If you run the program like:
node program_name.js
then you need to add the path to a text file:
node program_name.js text_file.txt
In this case make sure that the files are in the same directory.
Or you can run it with the command:
learnyounode program_name.js
and then a default text file will be provided by learnyounode. You can see the content of this text file by using
console.log(buffer)
Problem statement says
The full path to the file to read will be provided as the first
command-line argument.
So you have to pass the path/to/file as an argument.
Remember process.argv
You should use the following to execute .js files
node program_name.js /path/to/text_file_name
rather than
learnyounode run program_name.js /path/to/text_file_name
With this method, Node.js will run your program with the file you specify on the command line.
I hope this answer helps. :)

Issue with filepath name, possible corrupt characters

Perl and html, CGI on Linux.
Issue with file path name, being passed in a form field, to a CGI on server.
The issue is with the Linux file path, not the PC side.
I am using 2 programs,
1) A program written years ago: dynamic html generated in a perl program, and presented to the user as a form. I modified it by inserting the code needed to allow the user to select a file from their PC, to be placed on the Linux machine.
Because this program already knew the filepath, needed on the linux side, I pass this filepath in a hidden form field, to program 2.
2) CGI program on Linux side, to run when form on (1) is posted.
Strange issue.
The filepath that I pass, has a very strange issue.
I can extract it using
my $filepath = $query->param("serverfpath");
The above does populate $filepath with what looks like exactly the correct path.
But it fails, and not in a way that takes me to the file open error block, but such that the call to the CGI script gives an error.
However, if I populate $filepath with EXACTLY the same string, via hard coding it, it works, and my file successfully uploads.
For example:
$fpath1 = $query->param("serverfpath");
$fpath2 = "/opt/webhost/ims/DOCURVC/data";
A comparison of $fpath1 and $fpath2 reveals that they are exactly equal.
A length check of $fpath1 and $fpath2 reveals that they are exactly the same length.
I have tried many methods of cleaning the data in $fpath1.
I chomp it.
I remove any non standard characters.
$fpath1 =~ s/[^A-Za-z0-9\-\.\/]//g;
and this:
my $safe_filepath_characters = "a-zA-Z0-9_.-/";
$fpath1 =~ s/[^$safe_filepath_characters]//g;
But no matter what I do, using $fpath1 causes an error, using $fpath2 works.
What could be wrong with the data in the $fpath1, that would cause it to successfully compare to $fpath2, yet not be equal, visually look exactly equal, show as having the exact same length, but not work the same?
For the below file open block.
$upload_dir = $fpath1
causes complete failure of CGI to load, as if it can not find the CGI (which I know is sometimes caused by syntax error in the CGI script).
$upload_dir = $fpath2
I get a successful file upload
$upload_dir = ""
The call to the cgi does not fail, it executes the else block of the below if, as expected.
here is the file open block:
if (open ( UPLOADFILE, ">$upload_dir/$filename" ))
{
binmode UPLOADFILE;
while ( <$upload_filehandle> )
{
print UPLOADFILE;
}
close UPLOADFILE;
$msgstr="Done with Upload: upload_dir=$upload_dir filename=$filename";
}
else
{
$msgstr="ERROR opening for upload: upload_dir=$upload_dir filename=$filename";
}
What other tests should I be performing on $fpath1, to find out why it does not work the same as its hard-coded equivalent $fpath2
I did try character replacement, a single character at a time, from $fpath2 to $fpath1.
Even doing this with a single character, caused $fpath1 to have the same error as $fpath2, although the character looked exactly the same.
Is your CGI perhaps running perl with the -T (taint mode) switch (e.g., #!/usr/bin/perl -T)? If so, any value coming from untrusted sources (such as user input, URIs, and form fields) is not allowed to be used in system operations, such as open, until it has been untainted by using a regex capture. Note that using s/// to modify it in-place will not untaint the value.
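# Only assign $1 after a successful match; on a failed match $1 keeps an old value.
if ($fpath1 =~ /^([A-Za-z0-9\-\.\/]*)$/) {
    $fpath1 = $1;    # the captured value is untainted
} else {
    die "Illegal character in fpath1";
}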
should work if taint mode is your issue.
But it fails, and not in a way that takes me to the file open error block, but such that the call to the CGI script gives an error.
Premature end of script headers? Try running the CGI from the command line:
perl your_upload_script.cgi serverfpath=/opt/webhost/ims/DOCURVC/data
