I've been using docjure to write to Excel files. Mostly I want to append rows to already existing files, usually one at a time. When I do this without agents/futures, I load the file, use add-rows! to add the data, and then rewrite the file like this:
(defn append
  "data is in the same format as create-workbook, i.e. [[\"n\" \"m\"] [1 2] [3 4]]"
  [filename data]
  (let [workbook (load-workbook filename)
        sheet    (select-sheet "Sheet1" workbook)]
    (add-rows! sheet data)
    (save-workbook! filename workbook)))
I make a lot of calls to append, so I found this: http://blakesmith.me/2012/05/25/understanding-clojure-concurrency-part-2.html, which shows you how to use agents to write to a file using future.
First of all, I'm using FileOutputStream instead of FileWriter, which should still work. But whereas in the tutorial's example you just append strings to the end of the file with .write and then close it, I need to rewrite the whole file every time I "append" (I think?), since a .xlsx workbook contains more than just characters.
I don't really know how to set this up since with the tutorial's logging example, write-out returns the updated instance of the BufferedWriter and I don't know what the equivalent of that would be.
My other option would be to add the data to the vector concurrently (load the file once and keep returning new vectors [[\"n\" \"m\"] [1 2] [3 4]] with the data added), but I'm planning on doing ~10,000-100,000 of these calls, and that seems like a lot to keep track of... although, to be fair, reading and writing all the data that many times is probably not great either.
If you have any suggestions on how I can do this, I'd appreciate it. I'd be willing to make calls to Apache POI itself too, if there's a better way to append with that. Thanks.
--- UPDATE ---
I just rewrote the logger example with the file as the agent instead of the output stream, and it seems to work. I'll let you know if it ends up working with docjure/Apache POI.
(import '(java.io File BufferedWriter FileWriter))

(def logfile (agent (File. "blah.txt")))

(defn write-out [file msg]
  (with-open [out (BufferedWriter. (FileWriter. file true))]
    (.write out msg))
  file)
--- UPDATE 2 ---
I got an analogous version written with docjure, but unfortunately, because opening the file happens within write-out, and that happens during each future (I don't see a way around this if I use a File as the agent, and I don't see another way to do it), most of the futures read the empty file and write their row to that. Since they all run in parallel, the end result is that most of them overwrite each other.
Ultimately I decided to just add each row vector to an overall data vector and write once. I can do that with just pmap, so it's a lot neater. The one downside is that if something goes wrong, none of the data is written to the file at all, but the upside is that the time it takes to write is reduced since there's only one write call. Also, I would otherwise have been loading a large amount of data into memory every time, which takes time. Memory usage is the same either way.
If anyone still wants to answer this, I'd still be interested, but the method in my first update does not work (each future reads in an empty file and uses that to append to). I'll post that code in case it helps anyone though (a docjure version of the aforementioned tutorial):
(import '(java.io File FileOutputStream))

(def file (agent (File. "blah.xlsx")))

(defn write-out [file workbook]
  (with-open [out (FileOutputStream. file)]
    (.write workbook out))
  file)

(defn write-workbook [file data]
  (let [filename (.getPath @file)  ; deref the agent to get the File
        workbook (try (load-workbook filename)
                      (catch Exception e (create-workbook "Sheet1" [])))
        sheet    (select-sheet "Sheet1" workbook)]
    (add-rows! sheet data)
    (send file write-out workbook)))

(defn test [file]
  (write-workbook file [["n" "m"]])
  (dotimes [i 5]
    (future (write-workbook file [[i (inc i)]]))))
Thanks
The issue that I am having is that I am able to read the information from the files, but when I try to convert them from a string to an integer, I get an error. I also have issues where the min/max prints as the entire file's contents.
I have tried using if/then statements as well as using different variables for each line in the file.
file=input("Which file do you want to get the data from?")
f=open('data3.txt','r')
sent='-999'
line=f.readline().rstrip('\n')
while len(line)>0:
    lines=f.read().strip('\n')
    value=int(lines)
    if value>value:
        max=value
        print(max)
    else:
        min=value
        print(min)
total=sum(lines)
print(total)
I expect the code to find the min/max of the numbers in the file as well as the sum and average of the numbers. The results of processing the file then have to be written to a different file. Instead, I get various errors saying that Python is unable to convert a str to an int, or the entire file's contents are printed instead of the expected results.
does the following work?
lines = list(open('fileToRead.txt'))
intLines = [int(i) for i in lines]
maxValue = max(intLines)
minvalue = min(intLines)
sumValue = sum(intLines)
print("MaxValue : {0}".format( maxValue))
print("MinValue : {0}".format(minvalue))
print("Sum : {0}".format(sumValue))
print("Avergae : {0}".format(sumValue/len(intLines)))
and this is how my fileToRead.txt is formatted (just a simple one, in fact)
10
20
30
40
5
1
I am reading the file contents into a list. Then I create a new list (it could be combined with the previous step as part of some refactoring) which holds the list of ints. Once I have the list of ints, it's easier to calculate max and min on it.
Note that some of the variables are not named properly. Also, reading the whole file in one go (like I have done here) might be a bad idea if the file is too large. In that case you should not read the whole file at once; instead, read it line by line, parse the ints, and add them to a list of ints, as sketched below. Once you are done reading the file, close it. You can then start your calculations based on the list of ints you have obtained.
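For larger files, a line-by-line version of the same idea might look roughly like this (same example file as above; it skips blank lines and assumes every non-blank line is an integer):

int_lines = []
with open('fileToRead.txt') as f:
    for line in f:
        line = line.strip()
        if line:                         # skip blank lines
            int_lines.append(int(line))

print("Max     : {0}".format(max(int_lines)))
print("Min     : {0}".format(min(int_lines)))
print("Sum     : {0}".format(sum(int_lines)))
print("Average : {0}".format(sum(int_lines) / len(int_lines)))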
Please let me know if this resolves your query.
Thanks
I'm trying to create a file and append all of the calculated content to it, but when I run the script only the very last iteration is written to the file and nothing else.
My code is on Pastebin; it's too long to paste here, and I feel like you would need to see exactly how the iteration happens.
To summarize it: go through an array of model numbers; if the model number matches, call the function that calculates that MAC_ADDRESS; when the calculation is done, store all the content in the file.
I have tried two possible routes and both have failed, giving the same result. There is no error in the code (it runs), but it just doesn't store the content in the file properly: there should be 97 different APs, and it's storing only 1.
The difference between the first and second attempts:
1) I open/create the file at the beginning of the script and close it at the very end.
2) I open/create and close the file per iteration.
First Attempt:
https://pastebin.com/jCpLGMCK
#Beginning of code
File = open("All_Possibilities.txt", "a+")
#End of code
File.close()
Second Attempt:
https://pastebin.com/cVrXQaAT
#Per function
File = open("All_Possibilities.txt", "a+")
#per function
File.close()
If I'm not supposed to reference other websites, please let me know and I'll just paste the code in this post.
Rather than close(), please use with:
with open('All_Possibilities.txt', 'a') as file_out:
file_out.write('some text\n')
The documentation explains that you don't need + to append writes to a file.
You may want to add some debugging console print() statements, or use a debugger like pdb, to verify that the write() statement actually ran, and that the variable you were writing actually contained the text you thought it did.
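As a rough sketch of how that might fit into your loop (the model list and the calculate_mac() helper are hypothetical stand-ins, since the Pastebin code isn't reproduced here):

models = ['AP-303', 'AP-305', 'AP-515']           # hypothetical model numbers

with open('All_Possibilities.txt', 'a') as file_out:
    for model in models:
        mac_address = calculate_mac(model)        # hypothetical helper that builds the MAC_ADDRESS
        file_out.write('{0} {1}\n'.format(model, mac_address))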
You have several loops that could be a one-liner using readlines().
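For example, a pattern like this (a generic illustration, not your exact code):

# Explicit loop version...
lines = []
with open('some_input.txt') as f:    # hypothetical file name
    for line in f:
        lines.append(line)

# ...collapses to one line with readlines():
with open('some_input.txt') as f:
    lines = f.readlines()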
Please do this:
$ pip install flake8
$ flake8 *.py
That is, please run the flake8 lint utility against your source code,
and follow the advice that it offers you.
In particular, it would be much better to name your identifier file than to name it File.
The initial capital letter means something to humans reading your code -- it is
used when naming classes, rather than local variables. Good luck!
Suppose I have a file named test.txt and it currently has the number 6 inside of it. I want to use a variable such as x = 4, add the two numbers together, and save the result back in the file.
var1 = 4.0
f=open(test.txt)
balancedata = f.read()
newbalance = float(balancedata) + float(var1)
f.write(newbalance)
print(newbalance)
f.close()
It's probably simpler than you're trying to make it:
variable = 4.0
with open('test.txt') as input_handle:
balance = float(input_handle.read()) + variable
with open('test.txt', 'w') as output_handle:
print(balance, file=output_handle)
Make sure 'test.txt' exists before you run this code and has a number in it, e.g. 0.0 -- you can also modify the code to deal with creating the file in the first place if it's not already there.
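For example, handling the missing-file case might look something like this (a sketch that assumes a starting balance of 0.0):

import os

variable = 4.0

# Start from 0.0 if test.txt doesn't exist yet.
if os.path.exists('test.txt'):
    with open('test.txt') as input_handle:
        balance = float(input_handle.read())
else:
    balance = 0.0

balance += variable

with open('test.txt', 'w') as output_handle:
    print(balance, file=output_handle)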
Files only read and write strings (or bytes for files opened in binary mode). You need to convert your float to a string before you can write it to your file.
Probably str(newbalance) is what you want, though you could customize how it appears using format if you want. For instance, you could round the number to two decimal places using format(newbalance, '.2f').
Also note that you can't write to a file opened only for reading, so you probably need to either use mode 'r+' (which allows both reading and writing) combined with a f.seek(0) call (and maybe f.truncate() if the length of the new numeric string might be shorter than the old length), or close the file and reopen it in 'w' mode (which will truncate the file for you).
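A rough sketch of the 'r+' route described above, using the same variable names as the question:

var1 = 4.0

with open('test.txt', 'r+') as f:
    balancedata = f.read()
    newbalance = float(balancedata) + float(var1)
    f.seek(0)                    # rewind to the start of the file
    f.write(str(newbalance))     # files only accept strings, so convert first
    f.truncate()                 # drop any leftover characters from the old value

print(newbalance)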
I am creating several text files programmatically using VB6, and then concatenating them all together into a single file.
I write text to the files using
Print #lngFileHandle, Text
so there should be a CR/LF even after the very last line of text in each file.
Then I append all these "subfiles" together into another text file that was opened this way:
Open strFileName For Append As #lngFileHandle
Strangely, my final resulting file looks good EXCEPT that the very last line of the last file being appended is only partially there.
The last few lines look like this in the file I'm reading FROM:
<Name> Referral for Service Home Delivered Meals</Name>
<Name> Referral for Service Adult Day Care/Health</Name>
<Name> Referral for Service Congregate Meals</Name>
but after being read in from that file and output to the final file, they look like this:
<Name> Referral for Service Home Delivered Meals</Name>
<Name> Referral for Service Adult Day Care/Health</Name>
<Name> Referral for Service Congr
The code I'm using to read in this particular "subfile" and output it to the final file is:
With mobjNewEntriesLog
Do While Not .IsEOF
strOutput = .ReadLine
mobjMainLog.PrintLine strOutput
Loop
End With
The .IsEOF function is as follows:
Public Function IsEOF() As Boolean
    If blnOpened Then
        IsEOF = EOF(lngFileHandle)
    Else
        IsEOF = True
    End If
End Function
It would make more sense to me if I wasn't getting the last line at ALL; getting just PART of it is what I don't get.
Anybody see anything that would make the last line only print partially to the final file?
TIA.
Ensure you are closing your file as this may be required to flush out any data that is pending to be written.
VB6 file numbers are not file handles, so don't call them that. They are indexes into a file descriptor table in the runtime where the actual handle, mode, buffer length, buffer, pointers, etc. are stored.
The Close statement is not synchronous, but a "lazy close" that may not have flushed all data and updated the EOF pointer of the file by the time you turn around and try to read it again. This behavior is intentional as far as I can determine, perhaps for performance reasons.
A Reset statement can be used to force all open files closed, and it is synchronous. This isn't always practical, however it may be fine in your case. Easy enough to try: add a Reset before you re-open any of your files to concatenate them.
I have another puzzling problem.
I need to read .xls files with RODBC. Basically I need a matrix of all the cells in one sheet, and then use greps and strsplits etc. to get the data out. As each sheet contains multiple tables in a different order, and some text fields with other options in between, I need something that functions like readLines(), but for Excel sheets. I believe RODBC is the best way to do that.
The core of my code is the following function:
.read.info.default <- function(file, sheet){
    fc <- odbcConnectExcel(file)  # file connection
    tryCatch({
        x <- sqlFetch(fc,
                      sqtable = sheet,
                      as.is = TRUE,
                      colnames = FALSE,
                      rownames = FALSE
        )
    },
    error = function(e) {stop(e)},
    finally = close(fc)
    )
    return(x)
}
Yet, whatever I tried, it always takes the first row of the mentioned sheet as the variable names of the returned data frame. No clue how to get that solved. According to the documentation, colnames=FALSE should prevent that.
I'd like to avoid the xlsReadWrite package. Edit: also the gdata package; the client doesn't have Perl on the system and won't install it.
Edit:
I gave up and went with read.xls() from the xlsReadWrite package. Apart from the column-name problem, it turned out RODBC can't really read cells with special characters like slashes. A date in the format "dd/mm/yyyy" just gave NA.
Looking at the source code of sqlFetch, sqlQuery and sqlGetResults, I realized the problem is more than likely in the drivers. Somehow the first line of the sheet is seen as some column feature instead of an ordinary cell. So instead of colnames, they're equivalent to DB field names. And that's an option you can't set...
Can you use the Perl-based solution in the gdata package instead? That happens to be portable too...