Start loop at specific line of text file in Groovy

I am using Groovy and I am trying to alter a text file at a specific line, without looping through all of the previous lines. Is there a way to state the line of a text file that you wish to alter?
For instance
Text file is:
1
2
3
4
5
6
I would like to say
Line(3) = p
and have it change the text file to:
1
2
p
4
5
6
I DO NOT want to loop through the lines to change the value, i.e. I do not want to use an .eachLine { line -> ... } method.
Thank you in advance, I really appreciate it!

I don't think you can skip lines and traverse like this. You could do the skip by using java.io.RandomAccessFile, but instead of lines you would be specifying the number of bytes.
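To make the byte-offset idea concrete, here is a minimal sketch in Python (Java's RandomAccessFile.seek works the same way). It assumes fixed-width lines, since a byte offset only maps cleanly to a line number when every record has the same length; the file name data.txt is just for the demo.

```python
import os
import tempfile

def overwrite_line(path, n, text, width):
    """Overwrite line n (1-based) in place, assuming every line occupies
    exactly `width` bytes, so the record starts at byte (n - 1) * width."""
    # pad/trim the text to the record width, keeping the trailing newline
    record = text.ljust(width - 1)[: width - 1] + "\n"
    with open(path, "r+b") as f:
        f.seek((n - 1) * width)          # jump straight to the record
        f.write(record.encode("ascii"))  # no earlier lines are read

# demo: six one-character lines, so each record is 2 bytes ("1\n", "2\n", ...)
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as f:
    f.write("1\n2\n3\n4\n5\n6\n")

overwrite_line(path, 3, "p", width=2)
print(open(path).read().split())  # ['1', '2', 'p', '4', '5', '6']
```

With variable-length lines this trick does not apply directly, which is exactly why the answers below fall back on readLines() or a bucketed index.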

Try using readLines() on the file. It will store all your lines in a list. To change the content at line n, change the element at index n - 1 in the list, then join the list items and write the result back.
Something like this will do
//We can call this the DefaultFileHandler
file = new File('data.txt')  // 'file' was not declared in the original; a binding variable is visible inside line()
lineNumberToModify = 3
textToInsert = "p"
line( lineNumberToModify, textToInsert )

def line(num, text) {
    def list = file.readLines()
    list[num - 1] = text
    file.setText(list.join("\n"))
}
EDIT: For extremely large files it is better to have a custom implementation, maybe something along the lines of what Tim Yates suggested in the comment on your question.
The above readLines() can easily process up to 100,000 lines of text in under a second, so you can do something like this:
if (file size < 10 MB)
    use DefaultFileHandler()
else
    use CustomFileHandler()
//CustomFileHandler
- Split the large file into buckets of acceptable size.
  Ex: Bucket 1 (lines 1-100000), Bucket 2 (lines 100001-200000), etc.
- If lineNumberToModify falls in a bucket's range,
  insert into that line in the bucket.
There is no hard and fast rule for how you implement your CustomFileHandler, as it depends entirely on the use case. If you need to perform the operation multiple times on the same file, you can do the complete bucket split first, keep the buckets in memory, and use them for the following operations. If it is a one-time operation, you can avoid splitting everything up front, deal only with the bucket you need, and process the others later on an on-demand basis.
And even within the buckets you can add your own intelligence to speed up the job. Say you want to modify the 99,999th line of a bucket holding lines 1-100000: you can exploit Groovy's negative indexing to address it from the end,
def list = file.readLines()
list[-2] = "some text"  // second-to-last line
file.setText(list.join("\n"))

Related

Want to optimize a "phone number generator" code with loop reduction

Description of the program:
1. It makes unique random phone numbers, as many as you ask for: if you pass 100, it makes 100 phone numbers.
2. It creates text files based on the range you pass to it: if you ask for 100 text files each containing 100 phone numbers, the numbers should be unique both within each file and across the files still to be made.
While it creates the phone numbers it sorts them like below, if that makes sense:
This format to expect in the text files :
1909911304
1987237347
........... and so on.............
This is the method responsible to do so:
(Note: I use the make_numbers method as the core of the operation; num_doc_amount is the method that should actually be used.)
def make_numbers(self):
    """dont use this method:this method supports num_doc_amount method"""
    # sorry for this amount of loops it was inevitable to make the code work
    for number_of_files in range(self.amount_numbs):
        # this loop maintains the pi_digits.txt making(txt)
        number_of_files += 1
        if number_of_files == self.amount_files:
            sys.exit()
        for phone_numbers in range(self.amount_numbs):
            # This loop maintains the amount of phone numbers in each pi_digits.txt
            file = open(f"{self.directory}\\{number_of_files}.{self.format}", 'w')
            for numbers in range(self.amount_numbs):
                # This loop is parallel to the previous one and
                # writes that each number is which one from the
                # whole amount of numbers
                file.write(f"{numbers + 1}. - {self.first_fourz}{choice(nums)}"
                           f"{choice(nums)}{choice(nums)}{choice(nums)}"
                           f"{choice(nums)}{choice(nums)}{choice(nums)}\n")

def num_doc_amount(self):
    """first make an instance and then you can use this method."""
    os.mkdir(f"{self.directory}")  # makes the folder
    for num_of_txt_files in range(self.amount_files):
        # This loop is for number of text files.
        num_of_txt_files += 1
        self.make_numbers()
Note that:
1. The only problem I have is those parallel loops running together; I don't know if the code can be simplified. (Please let me know if it can be.)
2. The code works and has no error.
If there is any way to simplify this code, please help me. Thank you.
1. The only problem I have is those parallel loops running together; I don't know if the code can be simplified. (Please let me know if it can be.)
Even if that is not the only problem, there are indeed unnecessarily many loops in the code above. It takes no more than two: one loop over the files and one over the numbers; see below.
2. The code works and has no error.
That is false, since you want all the phone numbers to be unique: the code has no provision to ensure that the written numbers are unique. The easiest fix is to generate all the unique numbers once at the start.
def num_doc_amount(self):
    """first make an instance and then you can use this method."""
    os.mkdir(self.directory)  # makes the folder
    un = sample(range(10**7), self.amount_files*self.amount_numbs)  # all the unique numbers
    # This loop is for the number of text files:
    for num_of_txt_file in range(self.amount_files):
        with open("%s/%s.%s" % (self.directory, 1+num_of_txt_file, self.format), 'w') as file:
            # This loop maintains the amount of phone numbers in each .txt:
            for number in range(self.amount_numbs):
                # writes one number from the whole amount of numbers
                file.write("%s. - %s%07d\n" % (1+number, self.first_fourz, un.pop()))

Reading values from a file and outputting each number, largest/smallest numbers, sum, and average of numbers from the file

The issue that I am having is that I am able to read the information from the files, but when I try to convert them from a string to an integer, I get an error. I also have issues where the min/max prints as the entire file's contents.
I have tried using if/then statements as well as using different variables for each line in the file.
file=input("Which file do you want to get the data from?")
f=open('data3.txt','r')
sent='-999'
line=f.readline().rstrip('\n')
while len(line)>0:
    lines=f.read().strip('\n')
    value=int(lines)
    if value>value:
        max=value
        print(max)
    else:
        min=value
        print(min)
total=sum(lines)
print(total)
I expect the code to find the min/max of the numbers in the file as well as the sum and average of the numbers in the file. The results from processing the file then have to be written to a different file. My results have consisted of various errors saying that Python is unable to convert a str to an int, as well as the entire file's contents being printed instead of the expected results.
Does the following work?
lines = list(open('fileToRead.txt'))
intLines = [int(i) for i in lines]
maxValue = max(intLines)
minvalue = min(intLines)
sumValue = sum(intLines)
print("MaxValue : {0}".format( maxValue))
print("MinValue : {0}".format(minvalue))
print("Sum : {0}".format(sumValue))
print("Average : {0}".format(sumValue/len(intLines)))
and this is how my fileToRead.txt is formulated (just a simple one, in fact):
10
20
30
40
5
1
I am reading the file contents into a list. Then I create a new list (this can be merged with the previous step as part of some refactoring) which holds all the ints. Once I have the list of ints, it's easier to calculate max and min on it.
Note that some of the variables are not named properly. Also, reading the whole file in one go (as I have done here) can be a bad idea if the file is too large. In that case you should never read the whole file at once; instead, read it line by line, parse the ints as you go, and close the file when you are done. You can then run your calculations on the values you have collected.
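A minimal sketch of that line-by-line approach: the running min, max, sum and count are updated per line, so the whole file never sits in memory. The demo writes its own small data3.txt first.

```python
# sample data for the demo, one integer per line
with open("data3.txt", "w") as f:
    f.write("10\n20\n30\n40\n5\n1\n")

smallest, largest, total, count = None, None, 0, 0
with open("data3.txt") as f:
    for line in f:                # streams one line at a time
        line = line.strip()
        if not line:              # skip blank lines
            continue
        value = int(line)
        smallest = value if smallest is None else min(smallest, value)
        largest = value if largest is None else max(largest, value)
        total += value
        count += 1

print(smallest, largest, total, total / count)  # min, max, sum, average
```

The same loop structure also fixes the comparison bug in the question's code, where value was compared against itself instead of against the running min/max.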
Please let me know if this resolves your query.
Thanks

Python3 - How to write a number to a file using a variable and sum it with the current number in the file

Suppose I have a file named test.txt and it currently has the number 6 inside of it. I want to use a variable such as x=4 then write to the file and add the two numbers together and save the result in the file.
var1 = 4.0
f=open(test.txt)
balancedata = f.read()
newbalance = float(balancedata) + float(var1)
f.write(newbalance)
print(newbalance)
f.close()
It's probably simpler than you're trying to make it:
variable = 4.0
with open('test.txt') as input_handle:
    balance = float(input_handle.read()) + variable
with open('test.txt', 'w') as output_handle:
    print(balance, file=output_handle)
Make sure 'test.txt' exists before you run this code and has a number in it, e.g. 0.0 -- you can also modify the code to deal with creating the file in the first place if it's not already there.
Files only read and write strings (or bytes for files opened in binary mode). You need to convert your float to a string before you can write it to your file.
Probably str(newbalance) is what you want, though you could customize how it appears using format if you want. For instance, you could round the number to two decimal places using format(newbalance, '.2f').
Also note that you can't write to a file opened only for reading, so you probably need to either use mode 'r+' (which allows both reading and writing) combined with a f.seek(0) call (and maybe f.truncate() if the length of the new numeric string might be shorter than the old length), or close the file and reopen it in 'w' mode (which will truncate the file for you).
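Putting those two notes together, a sketch of the single-handle variant with mode 'r+' might look like this (the seed value 6 mirrors the question's test.txt):

```python
# seed the file for the demo, mirroring the question's starting balance
with open('test.txt', 'w') as f:
    f.write('6')

var1 = 4.0
# 'r+' allows both reading and writing through one handle
with open('test.txt', 'r+') as f:
    newbalance = float(f.read()) + var1   # read() leaves the cursor at EOF
    f.seek(0)                             # rewind to the start of the file
    f.write(format(newbalance, '.2f'))    # floats must be converted to str
    f.truncate()                          # drop any leftover old characters

print(open('test.txt').read())  # 10.00
```

The truncate() call matters when the new numeric string is shorter than the old contents; without it, stale trailing characters would remain in the file.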

Read nth line in Node.js without reading entire file

I'm trying to use Node.js to get a specific line for a binary search in a 48 Million line file, but I don't want to read the entire file to memory. Is there some function that will let me read, say, line 30 million? I'm looking for something like Python's linecache module.
Update for how this is different: I would like to not read the entire file to memory. The question this is identified as a duplicate of reads the entire file to memory.
You should use readline module from Node’s standard library. I deal with 30-40 million rows files in my project and this works great.
If you want to do this in a less verbose manner and don't mind using a third-party dependency, use the nthline package:
const nthline = require('nthline')
, filePath = '/path/to/100-million-rows-file'
, rowNumber = 42
nthline(rowNumber, filePath)
.then(line => console.log(line))
According to the documentation, you can use fs.createReadStream(path[, options]), where:
options can include start and end values to read a range of bytes from the file instead of the entire file.
Unfortunately, you have to approximate the desired position/line, as there seems to be no seek-like function in Node.js.
EDIT
The above solution works well with lines that have fixed length.
A newline character is nothing more than a character like all the others, so looking for newlines is no different from looking for occurrences of, say, the character a: you have to scan for them.
Because of that, if you have lines of variable length, the only viable approach is to load the lines one at a time into memory and discard those you are not interested in.
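Since the question mentions Python's linecache, here is the "read one line at a time and discard" idea sketched in Python; Node's readline module streams lines the same way. The file rows.txt is created only for the demo.

```python
from itertools import islice

def nth_line(path, n):
    """Return line n (0-based) of a file, or None if the file is shorter.
    Earlier lines are read and discarded one at a time, so memory usage
    stays at a single line regardless of file size."""
    with open(path) as f:
        return next(islice(f, n, n + 1), None)

# demo file with a few lines
with open("rows.txt", "w") as f:
    f.write("".join(f"row {i}\n" for i in range(100)))

print(nth_line("rows.txt", 42))  # row 42
```

Note this is still O(n) per lookup; for the repeated probes of a binary search, it is worth building a byte-offset index of line starts once and seeking directly afterwards.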

Wolfram Mathematica import data from multiple files

I have a lot of files, each of which contains data.
I can happily import one file into Mathematica, but there are more than 500 files.
I do it so:
Import["~/math/third_ks/mixed_matrices/1.dat", "Table"];
aaaa = %
(*OUTPUT - some data, I can access them!*)
All I want is to make a loop (I can do that), but then I cannot change the file name, 1.dat; I want it to vary.
I tried the following solution: I generated part of the possible names and wrote them to a separate file.
Import["~/math/third_ks/mixed_matrices/generate_name_of_files.dat", "Table"];
aaaa = %
Output: {{"~/math/third_ks/mixed_matrices/0.dat"}, \
{"~/math/third_ks/mixed_matrices/1.dat"}, ......
All that I want to do is Table[a = Import[aaaa[[i]]], {i, 1, 500}],
but the function Import accepts only String objects as file names/paths.
You can use FileNames to collect the names of the data files you want to import, with the usual wildcards.
And then just map the Import statement over the list of filenames.
data will then contain a list comprising the data from each file as a separate element.
data = Import[#, "Table"] & /@ FileNames["~/math/third_ks/mixed_matrices/*.dat"];
It's a bit hard to work out what is going on without the file of filenames. However, I think you might be able to solve your problem by using Flatten on the list of filenames to make it a vector of String objects that can be passed to Import. Currently your list is an n*1 matrix, where each row is a List containing a String, not a vector of Strings.
Incidentally, you could use Map (/@) instead of Table in this instance.
Thank you for your response.
It happened so that I got two solutions in the same time.
I think it would be not fair to forget about second way.
aaaa = "~/math/third_ks/mixed_matrices/" <> ToString[#] <> ".dat" & /@ Range[0, 116];
(*This generates the list of file names.
Output:
{"~/math/third_ks/mixed_matrices/0.dat", \
"~/math/third_ks/mixed_matrices/1.dat", \
"~/math/third_ks/mixed_matrices/2.dat", ...and so on, up to 116*)
Table[Import[aaaa[[i]], "Table"], {i, 1, 117}];
(*and it just imports the data from the files*)
bbbb = %; (*here we have all the data, voila!*)
Incidentally, it's not my solution. It was suggested by a friend of mine:
https://stackoverflow.com/users/1243244/light-keeper
