Chop a file in Julia

I have opened a file in Julia:
output_file = open(path_to_file, "a")
And I would like to chop the last six characters of the file.
I thought I could do it with chop, i.e., chop(output_file; tail = 6), but it seems to work only with the String type and not with IOStream. How should I do this?
julia> rbothpoly(0, 1, [5], 2, 30, "html")
ERROR: MethodError: no method matching chop(::IOStream; tail=6)
Closest candidates are:
chop(::AbstractString; head, tail) at strings/util.jl:164
Stacktrace:
[1]
[...] ERROR STACKTRACE [...]
[3] top-level scope at REPL[37]:1
I am new to IOStream, discovering them today.

In your case, because you're doing a single write to the end of the file and not doing any further read or other operations, you can also edit the file in-place like this:
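function choppre(fname = "data/endinpre.html")
    linetodelete = "</pre>\n"
    linelength = ncodeunits(linetodelete)  # byte count, since seek works in bytes
    open(fname, "r+") do f
        readuntil(f, linetodelete)         # leaves the position just past the match
        seek(f, position(f) - linelength)  # step back to the start of the match
        write(f, " "^linelength)
    end
end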
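This overwrites the text we wish to chop off with an equal number of space characters. I'm not sure if there's a way to simply delete the line (instead of overwriting it with ' '); when the text to remove sits at the very end of the file, Base's truncate(f, n), which resizes a file to exactly n bytes, may be worth trying.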

I have found what I wanted here; adapted to my problem, it becomes:
(tmppath, tmpio) = mktemp()
open(output_filename, "r") do io
    for line in eachline(io, keep=true) # keep so the new line isn't chomped
        if line == "</pre>\n"
            line = "\n"
        end
        write(tmpio, line)
    end
end
close(tmpio)
mv(tmppath, output_filename, force=true)
chmod(output_filename, 0o777)
close(output_file)
Maybe my question could be marked as duplicate!

Related

I am trying to append a line of Text to a list

Seq = []
Head = []
for line in range (0, len(text)):
    if line in '>':
        Head.append(line)
    else:
        Seq.append(line)
I am trying to append the header of FASTA sequences and the nucleotide sequence and separate them on a list.
I don't know how to say that if line has '>', add to Head, else add to Seq

The line: line in '>' is testing whether line can be found inside the string '>'. You need to swap them around to '>' in line. This will test if the string '>' can be found inside line. If you are trying to test if the first character of line is '>', use line[0] == '>'.
Also, when using range the start defaults to zero, so you could say for x in range(len(text)). Note, however, that range yields integer indices rather than the lines themselves, so the test needs either text[x] or a loop over text directly.
Final code:
Seq = []
Head = []
for line in text:  # assuming text is a list of lines; iterate over them, not over indices
    if '>' in line:
        Head.append(line)
    else:
        Seq.append(line)
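
To make the direction of the membership test concrete, here is a minimal sketch. The function name split_fasta and the sample records are made up for illustration, and it assumes the input is an iterable of lines in which every header starts with '>'.

```python
def split_fasta(lines):
    """Separate FASTA header lines from sequence lines."""
    heads, seqs = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            heads.append(line)
        else:
            seqs.append(line)
    return heads, seqs


# Hypothetical sample input, not taken from the question.
sample = [">seq1 example\n", "ACGT\n", "TTGA\n", ">seq2 example\n", "GGCC\n"]
heads, seqs = split_fasta(sample)
print(heads)
print(seqs)

# The direction of the membership test matters:
print(">seq1" in ">")  # is ">seq1" a substring of ">"?
print(">" in ">seq1")  # is ">" a substring of ">seq1"?
```

startswith is used here instead of '>' in line because it only matches a '>' in the first position, which is what identifies a FASTA header.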

Having Issues Concatenating Strings into list without \n - Python3

I am currently having some issues trying to append strings into a new list. However, when I get to the end, my list looks like this:
['MDAALLLNVEGVKKTILHGGTGELPNFITGSRVIFHFRTMKCDEERTVIDDSRQVGQPMH\nIIIGNMFKLEVWEILLTSMRVHEVAEFWCDTIHTGVYPILSRSLRQMAQGKDPTEWHVHT\nCGLANMFAYHTLGYEDLDELQKEPQPLVFVIELLQVDAPSDYQRETWNLSNHEKMKAVPV\nLHGEGNRLFKLGRYEEASSKYQEAIICLRNLQTKEKPWEVQWLKLEKMINTLILNYCQCL\nLKKEEYYEVLEHTSDILRHHPGIVKAYYVRARAHAEVWNEAEAKADLQKVLELEPSMQKA\nVRRELRLLENRMAEKQEEERLRCRNMLSQGATQPPAEPPTEPPAQSSTEPPAEPPTAPSA\nELSAGPPAEPATEPPPSPGHSLQH\n']
I'd like to remove the newlines somehow. I looked at other questions on here and most suggest using .rstrip; however, after adding that to my code, I get the same output. What am I missing here? Apologies if this question has been asked.
My input also looks like this (I took the first 3 lines):
sp|Q9NZN9|AIPL1_HUMAN Aryl-hydrocarbon-interacting protein-like 1 OS=Homo sapiens OX=9606 GN=AIPL1 PE=1 SV=2
MDAALLLNVEGVKKTILHGGTGELPNFITGSRVIFHFRTMKCDEERTVIDDSRQVGQPMH
IIIGNMFKLEVWEILLTSMRVHEVAEFWCDTIHTGVYPILSRSLRQMAQGKDPTEWHVHT
from sys import argv
protein = argv[1] #fasta file
sequence = '' #string linker
get_line = False #False = not the sequence
Uniprot_ID = []
sequence_list =[]
with open(protein) as pn:
    for line in pn:
        line.rstrip("\n")
        if line.startswith(">") and get_line == False:
            sp, u_id, name = line.strip().split('|')
            Uniprot_ID.append(u_id)
            get_line = True
            continue
        if line.startswith(">") and get_line == True:
            sequence.rstrip('\n')
            sequence_list.append(sequence) #add the amino acids onto the list
            sequence = '' #resets the str
        if line != ">" and get_line == True: #if the first line is not a fasta ID and is it a sequence?
            sequence += line
print(sequence_list)

Per documentation, rstrip removes trailing characters – the ones at the end. You probably misunderstood others' use of it to remove \ns because typically those would only appear at the end.
To replace a character with something else in an entire string, use replace instead.
These commands do not modify your string! They return a new string, so if you want to change something 'in' a current string variable, assign the result back to the original variable:
>>> line = 'ab\ncd\n'
>>> line.rstrip('\n')
'ab\ncd' # note: this is the immediate result, which is not assigned back to line
>>> line = line.replace('\n', '')
>>> line
'abcd'

When I asked this question I didn't take my time in looking at the documentation and understanding my code. After looking, I realized two things:
My code isn't actually getting what I am interested in.
For the specific question I asked, I could have simply used line.strip() to remove the '\n'.
from sys import argv
protein = argv[1] #fasta file, as in the question
Uniprot_ID = [] #as in the question
sequence_list = [] #as in the question
sequence = '' #string linker
get_line = False #False = not the sequence
uni_seq = {}
"""this block of code takes a uniprot FASTA file and creates a
dictionary with the key as the uniprot id and the value as a sequence"""
with open (protein) as pn:
    for line in pn:
        if line.startswith(">"):
            if get_line == False:
                sp, u_id, name = line.strip().split('|')
                Uniprot_ID.append(u_id)
                get_line = True
            else:
                # a new header: store the sequence collected for the previous id
                uni_seq[u_id] = sequence
                sequence_list.append(sequence)
                sp, u_id, name = line.strip().split('|')
                Uniprot_ID.append(u_id)
                sequence = ''
        else:
            if get_line == True:
                sequence += line.strip() # removes the newline space
# store the last record, which has no following header to trigger it
uni_seq[u_id] = sequence
sequence_list.append(sequence)
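
The same idea can be written more compactly. The sketch below is an illustrative variant rather than the code above: the function name read_uniprot_fasta and the sample records are hypothetical, and it assumes every header looks like '>db|ACCESSION|name ...' so that the accession is the second '|'-separated field.

```python
def read_uniprot_fasta(lines):
    """Map accession -> sequence for '>db|ACCESSION|name ...' records."""
    sequences = {}
    current_id = None
    parts = []
    for line in lines:
        line = line.strip()  # drops the trailing newline and stray spaces
        if not line:
            continue
        if line.startswith(">"):
            if current_id is not None:
                sequences[current_id] = "".join(parts)
            # Assumed header layout: >sp|ACCESSION|ENTRY_NAME description
            current_id = line.split("|")[1]
            parts = []
        elif current_id is not None:
            parts.append(line)
    if current_id is not None:
        # The last record has no following header, so store it here.
        sequences[current_id] = "".join(parts)
    return sequences


# Hypothetical two-record input with made-up accessions and sequences.
sample = [
    ">sp|P00001|DEMO1_HUMAN Demo protein one\n",
    "MDAALLLN\n",
    "VEGVKK\n",
    ">sp|P00002|DEMO2_HUMAN Demo protein two\n",
    "IIIGNMFK\n",
]
uni_seq = read_uniprot_fasta(sample)
print(uni_seq)
```

Collecting the pieces in a list and joining them once avoids repeated string concatenation, and stripping each line as it is read means no newline ever reaches the stored sequence.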

Reading in a file of one-word lines in python

Just curious if there's a cleaner way to do this. I have a list of words in a file, one word per line.
I want to read them in and pass each word to a function.
I've currently got this:
f = open(fileName,"r");
lines = f.readlines();
count = 0
for i in lines:
    count += 1
    print("--{}--".format(i.rstrip()))
    if count > 100:
        return
Is there a way to read them in faster without using rstrip on each line?

with open(fileName) as f:
    lines = (line for _, line in zip(range(100), f.readlines()))
    for line in lines:
        print('--{}--'.format(line.rstrip()))
This is how I would do it. Note the context manager (the with/as statement) and the generator expression, which gives us only the first 100 lines.

Similar to Patrick's answer:
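with open(filename, "r") as f:
    for i, line in enumerate(f):
        if i >= 100:
            break
        print("--{}--".format(line[:-1]))
If you don't want to call .strip() and know the length of the line terminator, you can slice it off with [:-1]. This assumes every line, including the last one, ends with a single-character terminator.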
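
Another way to stop after a fixed number of lines without reading the whole file is itertools.islice. The following is a sketch only: the helper name first_words is made up, and io.StringIO stands in for a real open file.

```python
import io
from itertools import islice


def first_words(stream, limit=100):
    """Yield at most `limit` lines from `stream` with the line ending removed."""
    for line in islice(stream, limit):
        # rstrip("\r\n") removes only trailing CR/LF characters, so a final
        # line without a newline keeps all of its characters (unlike [:-1]).
        yield line.rstrip("\r\n")


# io.StringIO stands in for a file opened with open(fileName).
fake_file = io.StringIO("alpha\nbeta\ngamma")
words = list(first_words(fake_file, limit=2))
print(words)
```

Because islice consumes the file lazily, only the lines that are actually needed are read, whereas readlines() loads every line first.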

Problems with File in Lua

I want to know how to delete a specific line from a file.
example : file.txt
facebook
twitter
orkut
msn
Suppose I want to delete line 3; the file would then be:
facebook
twitter
msn
I do not want to just blank out the line; the remaining lines need to close up so that no empty lines are left in the file!

Load the file contents, manipulate them in memory, then write the new contents back to the file.
In this case you can load the file contents line by line using io.lines, store the ones you want in an array and leave out the ones you don't, then turn the array back into a string with table.concat.
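
The load, filter, and write-back steps look like this when sketched in Python rather than Lua, purely as an illustration of the approach. The function name delete_line is made up, and the demo writes to a temporary directory instead of a real file.

```python
import tempfile
from pathlib import Path


def delete_line(path, line_number):
    """Remove the 1-based `line_number` from the text file at `path`."""
    p = Path(path)
    lines = p.read_text().splitlines()          # load everything into memory
    kept = [line for i, line in enumerate(lines, start=1) if i != line_number]
    p.write_text("".join(line + "\n" for line in kept))  # overwrite the file


# Demo on a throwaway file holding the example from the question.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "file.txt"
    target.write_text("facebook\ntwitter\norkut\nmsn\n")
    delete_line(target, 3)
    result = target.read_text()
print(result)
```

Since the unwanted line is never written back, the remaining lines close up and no empty line is left behind.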

You could look for a specific item by matching a string:
function func(file, toDelete)
    local t = {}
    local tt = {}
    for line in io.lines(file) do
        table.insert(t, line)
    end
    for c, r in ipairs(t) do  -- ipairs keeps the lines in order
        -- string.sub(r, 4) drops a three-character prefix such as "1. "
        if string.sub(r, 4) ~= toDelete then
            table.insert(tt, string.sub(r, 4))
        end
    end
    local nfile = io.open(file, "w+")
    for a, b in ipairs(tt) do
        nfile:write(a .. ". " .. b .. "\n")
    end
    nfile:close()  -- flush the output and release the handle
end
or by looking for the number:
function func(file, num)
    local t = {}
    local tt = {}
    for line in io.lines(file) do
        table.insert(t, line)
    end
    for c, r in ipairs(t) do  -- c is the line number, r the line text
        if c ~= num then
            table.insert(tt, string.sub(r, 4))
        end
    end
    local nfile = io.open(file, "w+")
    for a, b in ipairs(tt) do
        nfile:write(a .. ". " .. b .. "\n")
    end
    nfile:close()  -- flush the output and release the handle
end
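Careful: this overwrites the original file!
EDIT:
The code above assumes each line starts with a three-character numeric prefix such as "1. ". Without the numbers in front, as in the example above, you don't need the string.sub calls or the prefix added when writing.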

remove the item in string

How do I remove the other stuff in the string and return a list that is made of the remaining strings? This is what I have written. Thanks in advance!!!
def get_poem_lines(poem):
    r""" (str) -> list of str
    Return the non-blank, non-empty lines of poem, with whitespace removed
    from the beginning and end of each line.
    >>> get_poem_lines('The first line leads off,\n\n\n'
    ... + 'With a gap before the next.\nThen the poem ends.\n')
    ['The first line leads off,', 'With a gap before the next.', 'Then the poem ends.']
    """
    list=[]
    for line in poem:
        if line == '\n' and line == '+':
            poem.remove(line)
    s = poem.remove(line)
    for a in s:
        list.append(a)
    return list

split and strip might be what you need:
s = 'The first line leads off,\n\n\n With a gap before the next.\nThen the poem ends.\n'
print([line.strip() for line in s.split("\n") if line])
['The first line leads off,', 'With a gap before the next.', 'Then the poem ends.']
I'm not sure where the + fits in; if it is involved somehow, either strip it or remove it with str.replace. Also avoid using list as a variable name, since it shadows the built-in list.
Lastly, strings have no remove method. You can use .replace, but since strings are immutable you will need to reassign poem to the return value of replace, i.e. poem = poem.replace("+", "").
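
Putting split and strip together, a complete version of the function might look like the sketch below. It is one possible reading of the docstring, not the asker's intended solution. It treats the + in the doctest as ordinary string concatenation between the two literals, so no + character is ever part of the poem.

```python
def get_poem_lines(poem):
    """Return the non-blank lines of poem, stripped of surrounding whitespace."""
    stripped = (line.strip() for line in poem.splitlines())
    # Filtering after stripping also drops lines that contain only spaces.
    return [line for line in stripped if line]


poem = ('The first line leads off,\n\n\n'
        + 'With a gap before the next.\nThen the poem ends.\n')
lines = get_poem_lines(poem)
print(lines)
```

Stripping before filtering differs slightly from the one-liner above, which filters first and would keep a whitespace-only line as an empty string.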

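You can read all non-empty lines like this:
list_m = [line for line in file if line not in ["\n", "\r\n"]]
Without looking at your input sample, I am assuming that you simply want the trailing white space removed. In that case, note that str.replace does not interpret regular expressions, so use re.sub instead:
import re
for x in range(len(list_m)):
    # remove a single space that sits directly before the newline
    list_m[x] = re.sub(r"[ ](?=\n)", "", list_m[x])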
