OpenTextFile vs Open file - excel

I've been using the Open method when dealing with files. I just found out about OpenTextFile and CreateTextFile. What is the difference between them and the Open method? Is one faster than the other? Which is better?
Dim fs, f
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.OpenTextFile("c:\testfile.txt", 1, TristateFalse)
f.Close
Dim line as String
Open "c:\testfile.txt" For Input as #1
Line Input #1, line
Close #1

Overall, Open is faster. However, it can only read files up to about 2 GB and cannot read Linux EOL indicators. OpenTextFile, on the other hand, creates a TextStream that can read bigger files and Linux EOL indicators, but it is approximately 4 times slower than Open.

Python - How do I separate data into multiple lines

I have two strings that I want to put into a txt file, but when I try to write them, they both end up on the first line. I want the strings to be on separate lines. How do I do that?
Here is the writing part of my code:
saveFile = open('points.txt', 'w')
saveFile.write(str(jakesPoints))
saveFile.write(str(alexsPoints))
saveFile.close()
If jakesPoints was 10 and alexsPoints was 12, then the text file would be
1012
but I want it to be
10
12
You can use a newline character (\n) to move to a new line. For your example:
with open('points.txt', 'w') as saveFile:
    saveFile.write("{}\n".format(jakesPoints))
    saveFile.write("{}\n".format(alexsPoints))
The other things to note:
It is helpful to open files using with - this will take care of opening and closing the file automatically (which is typically preferred over trying to remember to .close()).
The {}.format() section is used to convert your numbers to a string and add the newline character. I found that https://pyformat.info/ explains the string formatters pretty well and highlights all the main advantages.
with open('points.txt', 'w') as saveFile:
    saveFile.write(str(jakesPoints))
    saveFile.write("\n")
    saveFile.write(str(alexsPoints))
See the difference between w and a used in open(). Also see join().
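The join() suggestion can be sketched as follows. This is only a minimal example, assuming the scores are collected in a list; the helper name write_lines and the temporary file location are illustrative and not part of the original answers.

```python
import os
import tempfile


def write_lines(path, values):
    """Write each value on its own line (hypothetical helper)."""
    # str.join() puts "\n" between the items; the trailing "\n" ends the last line.
    text = "\n".join(str(v) for v in values) + "\n"
    with open(path, "w") as save_file:
        save_file.write(text)


jakesPoints, alexsPoints = 10, 12
# A temporary directory is used here so the sketch does not touch real files.
points_path = os.path.join(tempfile.mkdtemp(), "points.txt")
write_lines(points_path, [jakesPoints, alexsPoints])

# Read the file back to check the result.
with open(points_path) as f:
    saved = f.read()
```

Building the whole text first and writing it once keeps the newline handling in a single place, which helps when the number of values grows.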

Replace CRLF with LF in Python 3.6

I've searched the web and tried a number of different things I read there, but I can't seem to get the desired result.
I'm using Windows 7 and Python 3.6.
I'm connecting to an Oracle db with cx_oracle and creating a text file with the query results. The file that is created (which I'll call my_file.txt to make it easy) has 3688 lines in it, all ending with CRLF, which need to be converted to the Unix LF.
If I run python crlf.py my_file.txt, it is all converted correctly and there are no issues, but that means I need to run another command manually, which I do not want to do.
So I tried adding the code below to my file.
filename = "NameOfFileToBeConverted"
fileContents = open(filename,"r").read()
f = open(filename,"w", newline="\n")
f.write(fileContents)
f.close()
This does convert the majority of the CRLF to LF, but at line 3501 there is a NUL character repeated 3500 times on the one line, followed by a row of data from the database, and the line ends with CRLF. Every line from there on still has the CRLF.
So with that not working, I removed it and then tried
import subprocess
subprocess.Popen("crlf.py "+ filename, shell=True)
I also tried using
import os
os.system("crlf.py "+ filename)
The "+ filename" in the two examples above is just providing the filename that is created during the data extract.
I don't know what else to try from here.
Convert Line Endings in-place (with Python 3)
Windows to Linux/Unix
Here is a short script for directly converting Windows line endings (\r\n also called CRLF) to Linux/Unix line endings (\n also called LF) in-place (without creating an extra output file):
# replacement strings
WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'
# relative or absolute file path, e.g.:
file_path = r"c:\Users\Username\Desktop\file.txt"
with open(file_path, 'rb') as open_file:
    content = open_file.read()
content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
with open(file_path, 'wb') as open_file:
    open_file.write(content)
Linux/Unix to Windows
Just swap the line endings to content.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING).
Code Explanation
Important: Binary Mode We need to make sure that we open the file both times in binary mode (mode='rb' and mode='wb') for the conversion to work.
When opening files in text mode (mode='r' or mode='w' without b), the platform's native line endings (\r\n on Windows and \r on old Mac OS versions) are automatically converted to Python's Unix-style line endings: \n. So the call to content.replace() couldn't find any line endings to replace.
In binary mode, no such conversion is done.
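The difference between the two modes can be seen in a small experiment. This is a sketch only; the file name crlf_demo.txt and the temporary directory are made up for illustration. It relies on the default newline handling of Python 3's open(), which translates \r\n to \n when reading in text mode.

```python
import os
import tempfile

# Hypothetical scratch file, created in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "crlf_demo.txt")

# Write raw bytes so the file really contains Windows line endings.
with open(demo_path, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

# Text mode with the default newline setting: "\r\n" arrives as "\n".
with open(demo_path, "r") as f:
    as_text = f.read()

# Binary mode: the bytes come back exactly as stored on disk.
with open(demo_path, "rb") as f:
    as_bytes = f.read()
```

After this runs, as_text no longer contains any \r\n for replace() to find, while as_bytes still does.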
Binary Strings In Python 3, string literals are Unicode text (str) unless declared otherwise. But we open our files in binary mode, which reads and writes bytes. Therefore we need to add b in front of our replacement strings to tell Python to handle them as bytes, too.
Raw Strings On Windows the path separator is a backslash \ which we would need to escape in a normal Python string with \\. By adding r in front of the string we create a so-called raw string, which doesn't need any escaping. So you can directly copy/paste the path from Windows Explorer.
Alternative We open the file twice to avoid the need of repositioning the file pointer. We also could have opened the file once with mode='rb+' but then we would have needed to move the pointer back to start after reading its content (open_file.seek(0)) and truncate its original content before writing the new one (open_file.truncate(0)).
Simply opening the file again in write mode does that automatically for us.
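The single-open alternative described above could look roughly like this. It is a sketch rather than the script from the answer: the function name crlf_to_lf_in_place and the sample file are assumptions made for the example.

```python
import os
import tempfile


def crlf_to_lf_in_place(file_path):
    """Convert CRLF to LF with a single open() call (illustrative helper)."""
    with open(file_path, "rb+") as open_file:
        content = open_file.read()
        content = content.replace(b"\r\n", b"\n")
        open_file.seek(0)      # move the pointer back to the start
        open_file.truncate(0)  # discard the original content
        open_file.write(content)


# Small demonstration on a scratch file in a temporary directory.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "wb") as f:
    f.write(b"a\r\nb\r\n")

crlf_to_lf_in_place(sample_path)

with open(sample_path, "rb") as f:
    converted = f.read()
```

The two extra calls are the price of keeping one file handle; reopening the file in 'wb' mode does the truncation implicitly.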
Cheers and happy programming,
winklerrr

How to keep special characters when renaming a file using AppleScript

I've got an AppleScript file to replace filenames which are supplied through a CSV file. While I've got the script to work, it has issues with encoding of the strings/filenames.
The finding of files works perfectly. Renaming it to something like Malmö results in a very weird encoded string.
The CSV originates from Microsoft Excel, and I suspect it is not properly UTF-8 encoded. Now I'm stuck on how to handle the encoding properly (or how to convert it). As far as I know, it has the default Excel encoding, ISO 8859-1.
set theFile to (choose file with prompt "Select the CSV file")
set thePath to (choose folder with prompt "Select directory") as string
set theCSVData to paragraphs of ((read theFile))
set {oldTID, my text item delimiters} to {my text item delimiters, ";"}
repeat with thisLine in theCSVData
set {oldFileName, newFileName} to text items of thisLine
if length of oldFileName > 0 then
set oldFile to thePath & oldFileName
set newFile to newFileName
tell application "System Events"
if exists file oldFile then
set name of file oldFile to newFile
end if
end tell
end if
end repeat
So my question is: how do I fix the encoding issue, either by reading the file properly or by re-encoding it first (through AppleScript)?
I really think the filenames are encoded using UTF-16LE, but I could be wrong. In any case, there is a utility available in Terminal named iconv (see man iconv) that you can experiment with and perhaps use to re-encode the filename. If you receive the output from a do shell script as text or Unicode text, you get back UTF-16, so it won't be corrupted after the conversion (before you use it as a filename).
Use:
set theCSVData to paragraphs of ((read theFile as «class utf8»))
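If the file turns out not to be UTF-8 (the question suspects ISO 8859-1), the other option raised in the question is to re-encode it before the script reads it. A minimal Python sketch of that step follows. The source encoding is an assumption taken from the question, and the helper name reencode_file and the sample file names are hypothetical.

```python
import os
import tempfile


def reencode_file(src_path, dst_path, src_encoding="iso-8859-1"):
    """Rewrite src_path as UTF-8 (hypothetical helper; src_encoding is an assumption)."""
    # newline="" leaves the original line endings untouched in both directions.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Demonstration with a one-line CSV containing "Malmö" stored as ISO 8859-1.
work_dir = tempfile.mkdtemp()
latin1_csv = os.path.join(work_dir, "names_latin1.csv")
utf8_csv = os.path.join(work_dir, "names_utf8.csv")
with open(latin1_csv, "wb") as f:
    f.write(b"old.txt;Malm\xf6.txt\r\n")

reencode_file(latin1_csv, utf8_csv)

with open(utf8_csv, "rb") as f:
    utf8_bytes = f.read()
```

The resulting file could then be read with the as «class utf8» form shown above.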

opening a text file, converting it to string, and then outputing the string to another text file, produces unwanted \n's as well as some other stuff

I've asked this before, but the results were not fruitful. I didn't know whether I should have bumped it, so I just made a new question.
My code for opening the text file and converting it to a stringstream:
OpenFileDialog^ failas = gcnew OpenFileDialog();
failas->Filter = "Text Files|*.txt";
if( failas->ShowDialog() != System::Windows::Forms::DialogResult::OK )
{
return;
}
MessageBox::Show( failas->FileName );
String^ str = failas->FileName;
StreamReader ^strm = gcnew StreamReader(str);
String ^ST1=strm->ReadToEnd();
strm->Close();
string st1 = marshal_as<string,String ^>(ST1);
stringstream SS(st1);
If I output SS or st1, instead of getting something like:
a
a
a
I get:
a

a

a
The thing is that if I open the file in Notepad, it looks as intended (no blank lines between the lines), but if I load it anywhere else, it still has the blank lines.
I understand this has something to do with the way Windows saves text files, but I have no idea how to remove the additional \n when I use ReadToEnd.
Any ideas?
You're reading the input file using .Net methods, converting it to an unmanaged C++ stringstream, and then presumably writing the output file using unmanaged C++ methods.
In C++, many methods will automatically handle Windows vs. Unix line endings: fprintf(outfile, "Some text\n"); will actually write the bytes "Some text\r\n" to disk, if the file was opened in text mode.
You didn't say how you're writing the output file, but I think what's happening is that you're using fopen or similar in text mode. When you read from the input file, it contained CR-LF (\r\n) character sequences, and those character sequences were copied to the String^ ST1. They were still there when you copied the characters to the stringstream.
When you wrote the "\r\n" using fwrite or similar, it converted the \n to \r\n, resulting in \r\r\n sequences. This is not a standard line-ending on any platform, so that's why different editors are displaying it differently. You can confirm this by looking at the output file in a binary editor (rename to *.bin and open in Visual Studio): I expect you'll see bytes 0d 0d 0a at the end of each line.
To fix this, there are a couple of things you can do:
You could read the file using unmanaged methods. Since you apparently want to manipulate and write using unmanaged, you can stay in unmanaged-land for the whole operation. Let the unmanaged APIs convert the \r\n on disk to \n in memory, and convert back to \r\n when written back to disk.
You could remove the \r characters from the string after reading in .Net, before writing in unmanaged C++. This would be a simple call to String::Replace.
You could open the file for writing in unmanaged C++ as a binary file, rather than text. This will turn off the line-ending conversion, and output exactly the characters you have in your string. Just be sure to use \r\n line endings if you manipulate the data before writing it.
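The doubling and the second fix can be reproduced in a few lines of Python. This is only an illustration under stated assumptions: Python's open(..., newline="\r\n") stands in for a C text-mode stream on Windows, and the string below stands in for what ReadToEnd returned. It is not the original C++ code.

```python
import os
import tempfile

out_path = os.path.join(tempfile.mkdtemp(), "out.txt")

# Stand-in for the managed string: the CR-LF pairs from disk are still present.
text = "a\r\na\r\na\r\n"

# newline="\r\n" mimics Windows text-mode output: every "\n" becomes "\r\n".
with open(out_path, "w", newline="\r\n") as f:
    f.write(text)
with open(out_path, "rb") as f:
    doubled = f.read()

# Fix: remove the "\r" characters before writing in text mode.
with open(out_path, "w", newline="\r\n") as f:
    f.write(text.replace("\r", ""))
with open(out_path, "rb") as f:
    fixed = f.read()
```

Here doubled ends each line with the bytes 0d 0d 0a, matching the prediction above, while fixed ends each line with a plain 0d 0a.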

POSIX path in applescript from list not opening. Raw path works

I'm really confused by this one. All I want is for the files in the list to open. Here's my code:
set FilesList to {"Users/XXXXXX/Documents/07PictureArt-Related/THINGS THAT HELP/Tutorials/artNotesfromFeng.rtf", "Users/XXXXXXXX/Documents/07PictureArt-Related/THINGS THAT HELP"}
repeat with theFiles in FilesList
delay 0.1
tell application "Finder" to open POSIX file (theFiles)
end repeat
So, how come THAT won't work, but this will???
tell application "Finder" to open POSIX file "Users/XXXXXX/Documents/07PictureArt-Related/THINGS THAT HELP/Tutorials/artNotesfromFeng.rtf"
I'm thinking it might have to do with maybe the list is making it a string, and when I plug it directly into the open command, it LOOKS like a string, but it's not really...I don't know
For now I guess I just have to brute force it, and make a new statement for each file.
Thanks
Not sure what is going on there; I agree it's confusing.
An alternate is to use the shell 'open' command instead.
repeat with filePath in FilesList
do shell script "open " & quoted form of filePath
end repeat
The shell seems happier with POSIX paths. The trick is to send in the 'quoted form' of your POSIX paths.
--
EDIT:
Putting into a var first works too.
repeat with theFiles in FilesList
set f to POSIX file theFiles
tell application "Finder" to open f
end repeat
It seems the Finder is causing the coercion to POSIX file problem.
Either of these works:
set l to {"/bin", "/etc"}
repeat with f in l
tell application "Finder" to open (POSIX file (contents of f) as alias)
end repeat
set l to {"/bin", "/etc"}
repeat with f in l
POSIX file f
tell application "Finder" to open result
end repeat
Try running this script:
set l to {"/bin", "/etc"}
repeat with f in l
f
end repeat
The result at the end is item 2 of {"/bin", "/etc"}, which is a reference rather than the string itself. contents of returns the target of the reference object.
I don't know why POSIX file f doesn't work inside a tell application "Finder" block, or why POSIX file f as alias does work.
