Python Download Latest CSV from FTP server [duplicate] - python-3.x

I am using ftplib to connect to an FTP site. I want to get the most recently uploaded file and download it. I am able to connect to the FTP server and list the files; I have also put them in a list and converted the date field. Is there any function/module which can find the most recent date and output the whole line from the list?
#!/usr/bin/env python
import ftplib
import os
import socket
import sys
import time

HOST = 'test'

def main():
    try:
        f = ftplib.FTP(HOST)
    except (socket.error, socket.gaierror) as e:
        print('cannot reach %s' % HOST)
        return
    print("Connected to ftp server")
    try:
        f.login('anonymous', 'al@ge.com')
    except ftplib.error_perm:
        print('cannot login anonymously')
        f.quit()
        return
    print("logged on to the ftp server")
    data = []
    f.dir(data.append)
    for line in data:
        # the first two fields of each listing line hold the date and time
        datestr = ' '.join(line.split()[0:2])
        orig_date = time.strptime(datestr, '%d-%m-%y %H:%M%p')
    f.quit()
    return

if __name__ == '__main__':
    main()
RESOLVED:
data = []
f.dir(data.append)
datelist = []
filelist = []
for line in data:
    col = line.split()
    datestr = ' '.join(line.split()[0:2])
    date = time.strptime(datestr, '%m-%d-%y %H:%M%p')
    datelist.append(date)
    filelist.append(col[3])
combo = zip(datelist, filelist)
who = dict(combo)
for key in sorted(who.keys(), reverse=True):
    print("%s: %s" % (key, who[key]))
    filename = who[key]
    print("file to download is %s" % filename)
    try:
        f.retrbinary('RETR %s' % filename, open(filename, 'wb').write)
    except ftplib.error_perm:
        print("Error: cannot read file %s" % filename)
        os.unlink(filename)
    else:
        print("***Downloaded*** %s " % filename)
    return  # exit after the first (newest) entry
f.quit()
return
One problem: is it possible to retrieve just the first element from the dictionary? What I did here is have the for loop run only once and exit, which gives me the first sorted value. That works, but I don't think it is good practice to do it this way.
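A cleaner alternative to the run-once loop (a minimal sketch, not from the original post): since only the newest entry is needed, max() over (date, name) tuples avoids both the dictionary and the early return, and also keeps files that share a timestamp from overwriting each other in the dict.

# Minimal sketch: pick the newest (date, filename) pair directly.
# Assumes datelist/filelist were built as in the RESOLVED code above.
latest_date, latest_file = max(zip(datelist, filelist))
print("file to download is %s" % latest_file)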

For those looking for a full solution for finding the latest file in a folder:
MLSD
If your FTP server supports MLSD command, a solution is easy:
entries = list(ftp.mlsd())
entries.sort(key=lambda entry: entry[1]['modify'], reverse=True)
latest_name = entries[0][0]
print(latest_name)
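If you need an actual datetime rather than the raw fact string, the modify fact is a YYYYMMDDHHMMSS timestamp (per RFC 3659), so it can be parsed like this (a small sketch added here, assuming the server does not append fractional seconds):

from datetime import datetime

# 'modify' looks like '20180618153100' and already sorts correctly as a string
latest_mtime = datetime.strptime(entries[0][1]['modify'], '%Y%m%d%H%M%S')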
LIST
If you need to rely on the obsolete LIST command, you have to parse the proprietary listing it returns.
Common *nix listing is like:
-rw-r--r-- 1 user group 4467 Mar 27 2018 file1.zip
-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip
With a listing like this, this code will do:
from dateutil import parser
# ...
lines = []
ftp.dir("", lines.append)

latest_time = None
latest_name = None
for line in lines:
    tokens = line.split(maxsplit=9)
    time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
    time = parser.parse(time_str)
    if (latest_time is None) or (time > latest_time):
        latest_name = tokens[8]
        latest_time = time
print(latest_name)
This is a rather fragile approach.
MDTM
A more reliable, but far less efficient, approach is to use the MDTM command to retrieve the timestamps of individual files/folders:
names = ftp.nlst()

latest_time = None
latest_name = None
for name in names:
    time = ftp.voidcmd("MDTM " + name)
    if (latest_time is None) or (time > latest_time):
        latest_name = name
        latest_time = time
print(latest_name)
For an alternative version of the code, see the answer by @Paulo.
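Note that voidcmd here returns the full response line, e.g. 213 20180618153112, so the constant "213 " prefix keeps plain string comparison in timestamp order. To turn the reply into a datetime (a sketch added here, again assuming no fractional seconds in the reply):

from datetime import datetime

resp = ftp.voidcmd("MDTM " + name)   # e.g. '213 20180618153112'
mtime = datetime.strptime(resp[4:].strip(), '%Y%m%d%H%M%S')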
Non-standard -t switch
Some FTP servers support a proprietary non-standard -t switch for NLST (or LIST) command.
lines = ftp.nlst("-t")
latest_name = lines[-1]
See How to get files in FTP folder sorted by modification time.
Downloading found file
No matter what approach you use, once you have the latest_name, you download it as any other file:
with open(latest_name, 'wb') as f:
    ftp.retrbinary('RETR ' + latest_name, f.write)
See also
Get the latest FTP folder name in Python
How to get FTP file's modify time using Python ftplib

Why don't you use the following dir option?
ftp.dir('-t',data.append)
With this option the file listing is time ordered from newest to oldest. Then just retrieve the first file in the list to download it.
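A minimal sketch of that approach (hedged: -t is non-standard and server-dependent, and the filename position assumes a typical *nix-style 9-field listing):

data = []
ftp.dir('-t', data.append)                   # newest entries first on servers that support -t
latest_name = data[0].split(maxsplit=8)[-1]  # the name is the last field of the listing line
with open(latest_name, 'wb') as f:
    ftp.retrbinary('RETR ' + latest_name, f.write)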

With NLST, as shown in Martin Prikryl's answer, you can use the sorted built-in:
from ftplib import FTP

ftp = FTP(host="127.0.0.1", user="u", passwd="p")
ftp.cwd("/data")
file_name = sorted(ftp.nlst(), key=lambda x: ftp.voidcmd(f"MDTM {x}"))[-1]

If you have all the dates as time.struct_time (strptime will give you this) in a list, then all you have to do is sort the list.
Here's an example :
#!/usr/bin/python
import time

dates = [
    "Jan 16 18:35 2012",
    "Aug 16 21:14 2012",
    "Dec 05 22:27 2012",
    "Jan 22 19:42 2012",
    "Jan 24 00:49 2012",
    "Dec 15 22:41 2012",
    "Dec 13 01:41 2012",
    "Dec 24 01:23 2012",
    "Jan 21 00:35 2012",
    "Jan 16 18:35 2012",
]

def main():
    datelist = []
    for date in dates:
        date = time.strptime(date, '%b %d %H:%M %Y')
        datelist.append(date)
    print(datelist)
    datelist.sort()
    print(datelist)

if __name__ == '__main__':
    main()
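If the goal is to keep each date paired with its filename while sorting, tuples compare element by element, so struct_time/name pairs order chronologically. A small extension of the same idea (assuming a filelist parallel to datelist, as in the question):

# Newest first; ties on the date fall back to comparing filenames.
pairs = sorted(zip(datelist, filelist), reverse=True)
newest_date, newest_file = pairs[0]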

I don't know how your FTP server formats its listing, but your example was not working for me. I changed some lines in the date-sorting part:
import sys
import ftplib
import os
import socket
import time

# Connect to the ftp
ftp = ftplib.FTP(ftpHost)
ftp.login(yourUserName, yourPassword)

data = []
datelist = []
filelist = []
ftp.dir(data.append)
for line in data:
    col = line.split()
    # *nix listings put the date in fields 5-7 and the name in field 8
    datestr = ' '.join(line.split()[5:8])
    date = time.strptime(datestr, '%b %d %H:%M')
    datelist.append(date)
    filelist.append(col[8])

combo = zip(datelist, filelist)
who = dict(combo)
for key in sorted(who.keys(), reverse=True):
    print("%s: %s" % (key, who[key]))
    filename = who[key]
    print("file to download is %s" % filename)
    try:
        ftp.retrbinary('RETR %s' % filename, open(filename, 'wb').write)
    except ftplib.error_perm:
        print("Error: cannot read file %s" % filename)
        os.unlink(filename)
    else:
        print("***Downloaded*** %s " % filename)
ftp.quit()

Related

DNS Python Script to Modify Zone File(s) getting relativity error and inserting escape characters

Relatively new to Python and programming in general, but I would like to automate a DNS migration I'm working on for a large number of domains.
I shamelessly stole some initial framework from the AgileTesting blog.
In its current state the script
fails to save, with errors
inserts escape characters into the new name server records
I can comment out the rdata.rname = respname line in the soarr(respname) function, so I know it's something specific there. I am not sure how to drill down into the issue based on the error.
As for the escape characters, I feel like it's something simple, but my brain is mushy, so I'm just including it as a minor problem.
#!/bin/python3
import re,sys
import dns.zone
from dns.exception import DNSException
from dns.rdataclass import *
from dns.rdatatype import *

script, filename, nameservers = sys.argv
sourcefile = open(filename,"r")

def soarr(respname):
    for (name, ttl, rdata) in zone.iterate_rdatas(SOA):
        serial = rdata.serial
        old_name = rdata.rname
        new_serial = serial + 1
        print("Changing SOA serial from %d to %d" % (serial, new_serial))
        print("Changing responsible name from %s to %s" % (old_name, respname))
        rdata.serial = new_serial
        rdata.rname = respname
        rdata.expire = 3600
        print(rdata.rname)

def nsrr(nameserver):
    NS_add = "@"
    target = dns.name.Name((nameserver,))
    print("Adding record of type NS:", NS_add)
    rdataset = zone.find_rdataset(NS_add, rdtype=NS, create=True)
    rdata = dns.rdtypes.ANY.NS.NS(IN, NS, target)
    rdataset.add(rdata, ttl=3600)
    print(rdata)

def savefile(domain):
    print("debug", domain)
    new_zone_file = "new.%s.hosts" % domain
    print("Writing modified zone to file %s" % new_zone_file)
    zone.to_file(new_zone_file, domain)

for domainitem in sourcefile:
    domainitem = domainitem.rstrip()
    print("Processing %s." % domainitem)
    zone_file = '%s.hosts' % domainitem
    zone = dns.zone.from_file(zone_file, domainitem)
    # Updating the SOA record, responsible name, lowering TTL and incrementing serial of the zone file.
    soarr('systems.example.com')
    # Adding name servers to the zone file.
    if nameservers == 'customer':
        nsrr('ns01.example.com')
    if nameservers == 'core':
        nsrr("ns01.example2.com")
    if nameservers == 'internal':
        nsrr("ns01.int.example2.com")
    # Save the file as a new file.
    savefile(domainitem)
The intent is to cycle through a list of domains from a file, open the appropriate zone file, manipulate the zone and save the changes to a newly named file.
The error on the save failure:
Traceback (most recent call last):
  File "./zonefile.py", line 62, in <module>
    savefile(domainitem)
  File "./zonefile.py", line 36, in savefile
    zone.to_file(new_zone_file,domain)
  File "/usr/local/lib/python3.6/site-packages/dns/zone.py", line 531, in to_file
    relativize=relativize)
  File "/usr/local/lib/python3.6/site-packages/dns/node.py", line 51, in to_text
    s.write(rds.to_text(name, **kw))
  File "/usr/local/lib/python3.6/site-packages/dns/rdataset.py", line 218, in to_text
    **kw)))
  File "/usr/local/lib/python3.6/site-packages/dns/rdtypes/ANY/SOA.py", line 62, in to_text
    rname = self.rname.choose_relativity(origin, relativize)
AttributeError: 'str' object has no attribute 'choose_relativity'
As mentioned, commenting out that single line lets the file save. In the saved file, the NS entries show escape characters:
@ 3600 IN NS ns01\.example\.com
Both of the issues you've encountered - the AttributeError and the escape characters - are because you're not creating your dns.name.Names correctly.
To create a dns.name.Name from a str, it's best to call dns.name.from_text().
Example: name = dns.name.from_text('example.com')
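The distinction matters because dns.name.Name((nameserver,)) builds a name with a single label whose text contains literal dots, and those dots get escaped as \. when the zone is written out. A quick illustration (a sketch, not from the original answer, assuming dnspython is installed):

import dns.name

bad = dns.name.Name(('ns01.example.com',))   # one label that contains dots
print(bad)                                   # ns01\.example\.com

good = dns.name.from_text('ns01.example.com')
print(good)                                  # ns01.example.com.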
Specifically, in your nsrr function, the second line should be changed to
target = dns.name.from_text(nameserver)
And in your soarr function, you fix it with:
rdata.rname = dns.name.from_text(respname)
Here's a copy of the changes I made (also some minor indentation changes as well).
#!/bin/python3
import re
import sys
import dns.zone
from dns.exception import DNSException
from dns.rdataclass import *
from dns.rdatatype import *

script, filename, nameservers = sys.argv
sourcefile = open(filename,"r")

def soarr(respname):
    for (name, ttl, rdata) in zone.iterate_rdatas(SOA):
        serial = rdata.serial
        old_name = rdata.rname
        new_serial = serial + 1
        print("Changing SOA serial from %d to %d" % (serial, new_serial))
        print("Changing responsible name from %s to %s" % (old_name, respname))
        rdata.serial = new_serial
        rdata.rname = respname
        rdata.expire = 3600
        print(rdata.rname)

def nsrr(nameserver):
    NS_add = "@"
    target = dns.name.from_text(nameserver)
    print("Adding record of type NS:", NS_add)
    rdataset = zone.find_rdataset(NS_add, rdtype=NS, create=True)
    rdata = dns.rdtypes.ANY.NS.NS(IN, NS, target)
    rdataset.add(rdata, ttl=3600)
    print(rdata)

def savefile(domain):
    print("debug", domain)
    new_zone_file = "new.%s.hosts" % domain
    print("Writing modified zone to file %s" % new_zone_file)
    zone.to_file(new_zone_file, domain)

for domainitem in sourcefile:
    domainitem = domainitem.rstrip()
    print("Processing %s." % domainitem)
    zone_file = '%s.hosts' % domainitem
    zone = dns.zone.from_file(zone_file, domainitem)
    # Updating the SOA record, responsible name, lowering TTL and incrementing serial of the zone file.
    soarr(dns.name.from_text('systems.example.com'))
    # Adding name servers to the zone file.
    if nameservers == 'customer':
        nsrr('ns01.example.com')
    if nameservers == 'core':
        nsrr("ns01.example2.com")
    if nameservers == 'internal':
        nsrr("ns01.int.example2.com")
    # Save the file as a new file.
    savefile(domainitem)

Change order in filenames in a folder

I need to rename a bunch of files in a specific folder. They all end with a date and time, for example "hello 2019-05-22 1310.txt", and I want the date and time to come first in each filename so I can sort the files. With my code I get an error, and it won't find the directory where the files are located. What is wrong with the code?
import os
import re
import shutil

dir_path = r'C:\Users\Admin\Desktop\Testfiles'
comp = re.compile(r'\d{4}-\d{2}-\d{2}')

for file in os.listdir(dir_path):
    if '.' in file:
        # index of the last dot separates name from extension
        index = [i for i, v in enumerate(file, 0) if v == '.'][-1]
        name = file[:index]
        ext = file[index+1:]
    else:
        ext = ''
        name = file
    data = comp.findall(name)
    if len(data) != 0:
        date = comp.findall(name)[0]
        rest_name = ' '.join(comp.split(name)).strip()
        new_name = '{} {}{}'.format(date, rest_name, '.' + ext)
        print('changing {} to {}'.format(name, new_name))
        shutil.move(os.path.join(dir_path, name), os.path.join(dir_path, new_name))
    else:
        print('file {} is not changed'.format(name))
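One thing worth noting: the shutil.move call passes name (the filename with its extension stripped) as the source, so the source path does not exist, which would produce a file-not-found error. For reference, a minimal sketch of the intended rename; this is an assumption-laden rewrite, not the asker's code, and it presumes dir_path exists and each name carries one YYYY-MM-DD HHMM stamp:

import os
import re

dir_path = r'C:\Users\Admin\Desktop\Testfiles'
stamp = re.compile(r'\d{4}-\d{2}-\d{2} \d{4}')

for name in os.listdir(dir_path):
    base, ext = os.path.splitext(name)   # splits off '.txt' safely
    m = stamp.search(base)
    if not m:
        continue
    rest = (base[:m.start()] + base[m.end():]).strip()
    new_name = '{} {}{}'.format(m.group(), rest, ext)
    os.rename(os.path.join(dir_path, name), os.path.join(dir_path, new_name))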

python 3 tab-delimited file adds column file.write

I'm writing string entries to a tab-delimited file in Python 3. The code that I use to save the content is:
savedir = easygui.diropenbox()
savefile = input("Please type the filename (including extension): ")
file = open(os.path.join(savedir, savefile), "w", encoding="utf-8")
file.write("Number of entities not found: " + str(missing_count) + "\n")
sep = "\t"
for entry in entities:
    file.write(entry[0] + "\t")
    for item in entry:
        file.write(sep.join(item[0]))
        file.write("\t")
    file.write("\n")
file.close()
The file saves properly. There are no errors or warnings sent to the terminal. When I open the file, I find an extra column has been saved to the file.
Query | Extra | Name
Abu-Jamal, Mumia | A | Mumia Abu-Jamal
Anderson, Walter | A | Walter Inglis Anderson
Anderson, Walter | A | Walter Inglis Anderson
I've added vertical bars between the tabs for clarity; they don't normally appear there. As well, I have removed a few columns at the end. The column between the vertical bars is not supposed to be there. The document that is saved to file is longer than three lines. On each line, the extra column is the first letter of the Query column. Hence, we have A's in these three examples.
entry[0] corresponds exactly to the value in the Query column.
sep.join(item[0]) corresponds exactly to columns 3+.
Any idea why I would be getting this extra column?
Edit: I'm adding the full code for this short script.
# =============================================================================
# Code to query DBpedia for named entities.
#
# =============================================================================
import requests
import xml.etree.ElementTree as et
import csv
import os
import easygui
import re

# =============================================================================
# Default return type is XML. Others: json.
# Classes are: Resource (general), Place, Person, Work, Species, Organization
# but don't include resource as one of the
# =============================================================================
def urlBuilder(query, queryClass="unknown", returns=10):
    prefix = 'http://lookup.dbpedia.org/api/search/KeywordSearch?'
    #Selects the appropriate QueryClass for the url
    if queryClass == 'place':
        qClass = 'QueryClass=place'
    elif queryClass == 'person':
        qClass = 'QueryClass=person'
    elif queryClass == 'org':
        qClass = 'QueryClass=organization'
    else:
        qClass = 'QueryClass='
    #Sets the QueryString
    qString = "QueryString=" + str(query)
    #sets the number of returns
    qHits = "MaxHits=" + str(returns)
    #full url
    dbpURL = prefix + qClass + "&" + qString + "&" + qHits
    return dbpURL

#takes a xml doc as STRING and returns an array with the name and the URI
def getdbpRecord(xmlpath):
    root = et.fromstring(xmlpath)
    dbpRecord = []
    for child in root:
        temp = []
        temp.append(child[0].text)
        temp.append(child[1].text)
        if child[2].text is None:
            temp.append("Empty")
        else:
            temp.append(findDates(child[2].text))
        dbpRecord.append(temp)
    return dbpRecord

#looks for a date with pattern: 1900-01-01 OR 01 January 1900 OR 1 January 1900
def findDates(x):
    pattern = re.compile(r'\d{4}-\d{2}-\d{2}|\d{2}\s\w{3,9}\s\d{4}|\d{1}\s\w{3,9}\s\d{4}')
    returns = pattern.findall(x)
    if len(returns) > 0:
        return ";".join(returns)
    else:
        return "None"

#%%
# =============================================================================
# Build and send get requests
# =============================================================================
print("Please select the CSV file that contains your data.")
csvfilename = easygui.fileopenbox("Please select the CSV file that contains your data.")
lookups = []
name_list = csv.reader(open(csvfilename, newline=''), delimiter=",")
for name in name_list:
    lookups.append(name)

#request to get the max number of returns from the user.
temp = input("Specify the maximum number of returns desired: ")
if temp.isdigit():
    maxHits = temp
else:
    maxHits = 10

queries = []
print("Building queries. Please wait.")
for search in lookups:
    if len(search) == 2:
        queries.append([search[0], urlBuilder(query=search[0], queryClass=search[1], returns=maxHits)])
    else:
        queries.append([search, urlBuilder(query=search, returns=maxHits)])

responses = []
print("Gathering responses. Please wait.")
for item in queries:
    response = requests.get(item[1])
    data = response.content.decode("utf-8")
    responses.append([item[0], data])

entities = []
missing_count = 0
for item in responses:
    temp = []
    if len(list(et.fromstring(item[1]))) > 0:
        entities.append([item[0], getdbpRecord(item[1])])
    else:
        missing_count += 1

print("There are " + str(missing_count) + " entities that were not found.")
print("Please select the destination folder for the results of the VIAF lookup.")
savedir = easygui.diropenbox("Please select the destination folder for the results of the VIAF lookup.")
savefile = input("Please type the filename (including extension): ")
file = open(os.path.join(savedir, savefile), "w", encoding="utf-8")
file.write("Number of entities not found: " + str(missing_count) + "\n")
sep = "\t"
for entry in entities:
    file.write(entry[0] + "\t")
    for item in entry:
        file.write(sep.join(item[0]))
        file.write("\t")
    file.write("\n")
file.close()
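One likely culprit, reading the write loop above: entry is [query, records], so for item in entry visits the query string itself on its first pass, and sep.join(item[0]) over that string yields just its first character, which matches the stray letter seen in the output. A sketch of the loop with that assumption applied (note it also writes every record, where the original only reached item[0] of the record list):

for entry in entities:
    file.write(entry[0] + "\t")
    for record in entry[1]:            # iterate the records only, not the query string
        file.write(sep.join(record))   # join the record's fields
        file.write("\t")
    file.write("\n")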

python3 pexpect strange behaviour

I have a single-threaded program that simply executes commands over ssh and looks for output over ssh. But after a while I start getting extraordinarily strange behaviour:
ssh_cmd = 'ssh %s@%s %s' % (user, addr, options)
ssh = pexpect.spawn(ssh_cmd, timeout=60)
lgsuc = ['(?i)(password)']
for item in loginsuccess:
    lgsuc.append(item)
retval = ssh.expect(lgsuc)

for cmd in cmdlist:
    time.sleep(0.1)
    # this is regex to match anything. Essentially clears the buffer so you
    # don't get an invalid match from before
    ssh.expect(['(?s).*'])
    ssh.sendline(cmd)
    foundind = ssh.expect([re.escape("root#")], 30)  # very slow
    # repr-escape all the weird stuff so madness doesn't happen with ctrl chars
    rettxt = repr(ssh.before.decode("us-ascii") + "root:#")
    print("We Found:" + rettxt)
And it will be fine for about 20 commands or so, then madness occurs. Assume the right echo is blablabla each time:
We found 'blablabla \r\n\r\n[edit]\r\nroot#'
We found 'blablabla \r\n\r\n[edit]\r\nroot#'
We found 'blablabla \r\n\r\n[edit]\r\nroot#'
... about 20 commands...
We found 'bl\r\nroot#' # here it just missed part of the string in the middle
We found 'alala\r\nroot#'
Here is the remainder of the echo from the previous command!!! The echo of the current command will show up some time later, and it gets worse and worse. The strange thing is that it lands in the middle of the returned byte array.
Now there are some weird control codes coming back from this device, so if I replace:
rettxt = repr(ssh.before.decode("us-ascii") + "root:#")
with
rettxt = repr(ssh.before.decode("us-ascii") + "root:#")
then
print("We Found:" + rettxt)
returns:
root#e Found lala
Anyway, there is really strange stuff going on with pexpect and the buffers, and I can't figure out what it is, so any help would be appreciated. I should mention I never get the timeout; the device always responds. Also, the total number of "root:#" occurrences in the log file exceeds the total number of lines sent.
If I go through and remove all control codes, the output looks cleaner, but the problem still persists; it's as if pexpect can't handle control codes in its buffer correctly. Any help is appreciated.
UPDATE: minimal verifiable example
OK, I have been able to recreate PART of the problem in an isolated Ubuntu environment sshing into itself.
First, I need to create four commands that can be run on a host target, so put the following four files in ~/. I did this on Ubuntu.
~/check.py
#!/usr/bin/python3
import time
import io
#abcd^H^H^H^H^MABC
#mybytes = b'\x61\x62\x63\x64\x08\x08\x08\x0D\x41\x42\x43'
#abcdACB
mybytes = b'\x61\x62\x63\x64\x41\x42\x43'
f = open('test.txt', 'wb')
#time.sleep(1)
f.write(mybytes)
print(mybytes.decode('ascii'))
f.close()
~/check2.py
#!/usr/bin/python3
import time
import io
#0123^H^H^H^H^MABC
mybytes = b'\x30\x31\x32\x33\x08\x0D\x0D\x08\x08\x08\x08\x0D\x41\x42\x43'
f = open('test2.txt', 'wb')
#time.sleep(0.1)
f.write(mybytes)
print(mybytes.decode('ascii'))
f.close()
~/check3.py
#!/usr/bin/python3
import time
import io
#789:^H^H^H^H^DABC
mybytes = b'\x37\x38\x39\x3A\x08\x08\x08\x08\x08\x08\x08\x0D\x0D\x41\x42\x43'
f = open('test3.txt', 'wb')
#time.sleep(0.1)
f.write(mybytes)
print(mybytes.decode('ascii'))
f.close()
And lastly check4.py. Sorry, it took a weird combination for the problem to show back up:
#!/usr/bin/python3
import time
import io
#abcd^H^H^H^HABC
mybytes = b'\x61\x62\x63\x64\x08\x08\x08\x0D\x41\x42\x43'
f = open('test.txt', 'wb')
time.sleep(4)
f.write(mybytes)
print(mybytes.decode('ascii'))
f.close()
Notice that the last one has a bigger sleep; this is to trigger the pexpect timeout. Though in my actual testing this doesn't occur: I have commands that take over 6 minutes to return any text, so this might be part of it. OK, and the final file to run everything. It might look ugly, but I did a massive trim so I could post it here:
#! /usr/bin/python3
#import paramiko
import time
import sys
import xml.etree.ElementTree as ET
import xml
import os.path
import traceback
import re
import datetime
import pexpect
import os
import os.path

ssh = None
respFile = None

#Error Codes:
DEBUG = True
NO_ERROR = 0
EXCEPTION_THROWS = 1
RETURN_TEXT_NEVER_FOUND = 2

LOG_CONSOLE = 1
LOG_FILE = 2
LOG_BOTH = 3

def log(out, dummy=None):
    print(str(out))  # was print(str(log)), which printed the function object

def connect(user, addr, passwd):
    global ssh
    fout = open('session.log', 'wb')
    #options = '-q -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oPubkeyAuthentication=no'
    options = ' -oUserKnownHostsFile=/dev/null '
    #options = ''
    #REPLACE WITH YOUR LOCAL USER NAME
    #user = 'user'
    #REPLACE WITH YOUR LOCAL PASSWORD
    #passwd = '123TesT321'
    #addr = '127.0.0.1'
    ssh_cmd = 'ssh %s@%s %s' % (user, addr, options)
    public = None
    private = None
    retval = 0
    try:
        ssh = pexpect.spawn(ssh_cmd, timeout=60)
        ssh.logfile = fout
        #the most common prompts
        loginsuccess = [re.escape("#"), re.escape("$")]
        lgsuc = ['(?i)(password)', re.escape("connecting (yes/no)? ")]
        for item in loginsuccess:
            lgsuc.append(item)
        retval = ssh.expect(lgsuc)
    except pexpect.TIMEOUT as exc:
        log("Server never connected to SSH tunnel")
        return 0
    print('where here ret val = ' + str(retval))
    try:
        if(retval > 1):
            return 1
        elif(retval == 1):
            hostkey = ssh.before.decode("utf-8")
            ssh.sendline("yes")
            log("Warning! new host key was added to the database: " + hostkey.split("\n")[1])
            lgsuc = ['password: ']
            for item in loginsuccess:
                lgsuc.append(item)
            retval = ssh.expect(lgsuc)
            if(retval > 0):
                return 1
            else:
                if(public is not None):
                    log("Warning public key authentication failed trying password if available...")
        else:
            if public is not None:
                log("Warning public key authentication failed trying password if available...")
        if(passwd is None):
            log("No password and certificate authentication failed...")
            return 0
        ssh.sendline(passwd)
        login = ['password: ']
        for item in loginsuccess:
            login.append(item)
        retval = ssh.expect(login)
    except pexpect.TIMEOUT as exc:
        log("Server Never responded with expected login prompt: " + str(lgsuc))
        return 0
    #return 0
    if retval > 0:
        retval = 1
    if retval == 0:
        log("Failed to connect to IP:" + addr + " User:" + user + " Password:" + passwd)
    return retval

def disconnect():
    log("Disconnecting...")
    global ssh
    if ssh is not None:
        ssh.close()
    else:
        log("Something weird happened with the SSH client while closing the session. Shouldn't really matter", False)

def RunCommand(cmd, waitTXT, timeout=5):
    global ssh
    Error = 0
    if DEBUG:
        print('Debugging: cmd: ' + cmd + '. timeout: ' + str(timeout) + '. len of txt tags: ' + str(len(waitTXT)))
    if(type(waitTXT) is str):
        waitTXT = [re.escape(waitTXT)]  # was re.excape, a typo
    elif(not hasattr(waitTXT, '__iter__')):
        waitTXT = [re.escape(str(waitTXT))]
    else:
        cnter = 0
        for TXT in waitTXT:
            waitTXT[cnter] = re.escape(str(TXT))
            cnter += 1
    #start = time.time()
    #print("type3: "+str(type(ssh)))
    #time.sleep(1)
    #this is regex to match anything. Essentially clears the buffer so you don't get an invalid match from before
    ssh.expect(['(?s).*'])
    ssh.sendline(cmd)
    print("Debugging: sent: " + cmd)
    #GoOn = True
    rettxt = ""
    try:
        foundind = ssh.expect(waitTXT, timeout)
        allbytes = ssh.before
        newbytes = bytearray()
        for onebyte in allbytes:
            if onebyte > 31:
                newbytes.append(onebyte)
        allbytes = bytes(newbytes)
        rettxt = repr(allbytes.decode("us-ascii") + waitTXT[foundind])
        #rettxt = ssh.before + waitTXT[foundind]
        if DEBUG:
            print("Debugging: We found " + rettxt)
    except pexpect.TIMEOUT as exc:
        if DEBUG:
            txtret = ""
            for strtxt in waitTXT:
                txtret += strtxt + ", "
            print("ERROR Debugging: we timed out waiting for text:" + txtret)
        pass
    return (rettxt, Error)

def CloseAndExit():
    disconnect()
    global respFile
    if respFile is not None and '_io.TextIOWrapper' in str(type(respFile)):
        if not respFile.closed:
            respFile.close()

def main(argv):
    try:
        cmds = ['~/check.py', '~/check2.py', '~/check3.py', '~/check2.py', '~/check3.py', '~/check.py', '~/check2.py', '~/check3.py', '~/check2.py', '~/check3.py', '~/check4.py', '~/check3.py', '~/check.py', '~/check2.py']
        ##CHANGE THESE TO MATCH YOUR SSH HOST
        ret = connect('user', '127.0.0.1', 'abcd1234')
        for cmd in cmds:
            cmdtxt = str(cmd)
            #rett = RunCommand(ssh, "ls", "root", 0, 5)
            strlen = (170 - (len(cmdtxt))) / 2
            dashval = ''
            starval = ''
            tcnt = 0
            while(tcnt < strlen):
                dashval += '-'
                starval += '*'
                tcnt += 1
            if DEBUG:
                print(dashval + cmdtxt + dashval)
            #checkval = ['ABC']
            #REPLACE THE FOLLOWING LINE WITH YOUR TARGET PROMPT
            checkval = ['user-virtual-machine:~$']
            rett = RunCommand(cmdtxt, checkval, 2)
            if DEBUG:
                print(starval + cmdtxt + starval)
    except Exception as e:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        fname = os.path.split(exc_tb.tb_frame.f_code.co_filename)[1]
        print(exc_type, fname, exc_tb.tb_lineno)
        print(traceback.format_exc())
    CloseAndExit()
    #disconnect()
    #respFile.close()

main(sys.argv)
Make sure that all four check scripts and the main Python script are executable (via sudo chmod 774 or similar). In the connect call in main, set your username, IP address, and password to point at the target that has the check scripts, and make sure they are in your ~/ directory.
Once you run this, you can look at session.log, and at least on mine there is some weird stuff going on with the buffer:
~/check4.py^M
~/check3.py
~/check3.py^M
abcd^H^H^H^MABC^M
^[]0;user@user-virtual-machine: ~^Guser@user-virtual-machine:~$ ~/check.py
~/check3.py^M
789:^H^H^H^H^H^H^H^M^MABC^M
^[]0;user@user-virtual-machine: ~^Guser@user-virtual-machine:~$ ~/check.py~/check2.py
Unfortunately, it's not as corrupt as my actual problem, but I have several hundred commands on an embedded custom Linux kernel that I obviously can't recreate for everyone. Anyway, any help is greatly appreciated. Hopefully these examples work for you; I am on Ubuntu 16.04 LTS. Also, make sure to replace 'user-virtual-machine:~$' with whatever your target login prompt looks like.
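For what it's worth, one suspect is the ssh.expect(['(?s).*']) "buffer clear": it matches whatever happens to be in the buffer at that instant, so output that arrives a moment later survives into the next command's before. A hedged sketch of a drain loop that keeps reading until the buffer stays quiet (an assumption, not a confirmed fix):

import pexpect

def drain(ssh, quiet=0.2):
    # Read until no new output arrives for `quiet` seconds.
    while True:
        try:
            ssh.read_nonblocking(size=4096, timeout=quiet)
        except pexpect.TIMEOUT:
            return  # buffer stayed quiet; safe to send the next command
        except pexpect.EOF:
            return  # session closed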

MySQL date range pull using Python: TypeError: unhashable type: 'bytearray'

I am trying to figure out how to pull data from a table that has a column called 'sent_time' where the datetime falls between two datetimes. I was finally able to figure out how to use the dateutil parser to input the two dates for the date-range pull. My problem now is that I'm getting this error:
Traceback (most recent call last):
  File "C:\Python34\timerange.py", line 75, in <module>
    worksheet.write(r,0,row[0])
  File "C:\Python34\lib\site-packages\xlsxwriter\worksheet.py", line 64, in cell_wrapper
    return method(self, *args, **kwargs)
  File "C:\Python34\lib\site-packages\xlsxwriter\worksheet.py", line 436, in write
    return self.write_string(row, col, *args)
  File "C:\Python34\lib\site-packages\xlsxwriter\worksheet.py", line 64, in cell_wrapper
    return method(self, *args, **kwargs)
  File "C:\Python34\lib\site-packages\xlsxwriter\worksheet.py", line 470, in write_string
    string_index = self.str_table._get_shared_string_index(string)
  File "C:\Python34\lib\site-packages\xlsxwriter\sharedstrings.py", line 128, in _get_shared_string_index
    if string not in self.string_table:
TypeError: unhashable type: 'bytearray'
It's the bytearray that has me puzzled. Could you tell me what I'm doing wrong and how I can fix it?
I want to give you all the information I have, with all the other files and what I'm shooting for, to see if you can replicate it and actually get it working, just to check that it's not my system or some configuration I have.
I have a database with one table. Let's call it 'table1'. The table is broken down with columns like this:
sent_time | delivered_time | id1_active | id2_active | id3_active | id1_inactive | id2_inactive | id3_inactive | location_active | location_inactive ... lots more
Let's say that these are two or more customers delivering goods to and from each other. Each customer has three IDs.
I created a 'config.ini' file to make my life a bit easier:
[mysql]
host = localhost
database = db_name
user = root
password = blahblah
I created a 'python_mysql_dbconfig.py':
from configparser import ConfigParser

def read_db_config(filename='config.ini', section='mysql'):
    """ Read database configuration file and return a dictionary object
    :param filename: name of the configuration file
    :param section: section of database configuration
    :return: a dictionary of database parameters
    """
    # create parser and read ini configuration file
    parser = ConfigParser()
    parser.read(filename)
    # get section, default to mysql
    db = {}
    if parser.has_section(section):
        items = parser.items(section)
        for item in items:
            db[item[0]] = item[1]
    else:
        raise Exception('{0} not found in the {1} file'.format(section, filename))
    return db
This is the code that I'm working on right now. Could you take a look?
# Establish a MySQL connection
from mysql.connector import MySQLConnection, Error
from python_mysql_dbconfig import read_db_config

db_config = read_db_config()
conn = MySQLConnection(**db_config)
cursor = conn.cursor(raw=True)

#to export to excel
import xlsxwriter
from xlsxwriter.workbook import Workbook
#to get the csv converter functions
import os
import subprocess
import glob
#to get the datetime functions
import datetime
from datetime import datetime
import dateutil.parser

#creates the path needed for output files
path = 'C:/Python34/output_files/'

#creates the workbook
output_filename = input('output filename:')
workbook = xlsxwriter.Workbook(path + output_filename + '.xlsx')
worksheet = workbook.add_worksheet()

#formatting definitions
bold = workbook.add_format({'bold': True})
date_format = workbook.add_format({'num_format': 'yyyy-mm-dd hh:mm:ss'})
timeShape = '%Y-%m-%d %H:%M:%S'

#actual query
query = (
    "SELECT sent_time, delivered_time, OBJ, id1_active, id2_active, id3_active, id1_inactive, id2_inactive, id3_inactive, location_active, location_inactive FROM table1 "
    "WHERE sent_time BETWEEN %s AND %s"
)

userIn = dateutil.parser.parse(input('start date:'))
userEnd = dateutil.parser.parse(input('end date:'))

# Execute sql Query
cursor.execute(query, (userIn, userEnd))
result = cursor.fetchall()

#sets up the header row
worksheet.write('A1','sent_time',bold)
worksheet.write('B1', 'delivered_time',bold)
worksheet.write('C1', 'customer_name',bold)
worksheet.write('D1', 'id1_active',bold)
worksheet.write('E1', 'id2_active',bold)
worksheet.write('F1', 'id3_active',bold)
worksheet.write('G1', 'id1_inactive',bold)
worksheet.write('H1', 'id2_inactive',bold)
worksheet.write('I1', 'id3_inactive',bold)
worksheet.write('J1', 'location_active',bold)
worksheet.write('K1', 'location_inactive',bold)
worksheet.autofilter('A1:K1') #dropdown menu created for filtering

#print into client to see that you have results
print(" sent_time ", " delivered_time ", "OBJ", "\t id1_active ", " id2_active ", " id3_active ", "\t", " id1_inactive ", " id2_inactive ", " id3_inactive ", "\tlocation_active", "\tlocation_inactive")
for row in result:
    print(*row, sep='\t')

# Create a For loop to iterate through each row in the XLS file, starting at row 2 to skip the headers
for r, row in enumerate(result, start=1): #where you want to start printing results inside workbook
    for c, col in enumerate(row):
        worksheet.write_datetime(r, 0, row[0], date_format)
        worksheet.write_datetime(r, 1, row[1], date_format)
        worksheet.write(r, 2, row[2])
        worksheet.write(r, 3, row[3])
        worksheet.write(r, 4, row[4])
        worksheet.write(r, 5, row[5])
        worksheet.write(r, 6, row[6])
        worksheet.write(r, 7, row[7])
        worksheet.write(r, 8, row[8])
        worksheet.write(r, 9, row[9])
        worksheet.write(r, 10, row[10])

#close out everything and save
cursor.close()
workbook.close()
conn.close()

#print number of rows and bye-bye message
print("- - - - - - - - - - - - -")
rows = len(result)
print("I just imported " + str(rows) + " rows from MySQL!")
print("")
print("Good to Go!!!")
print("")

#CONVERTS JUST CREATED FILE TO CSV
# set path to folder containing xlsx files
out_path = 'C:/Python34/csv_files'
os.chdir(path)
# find the file with extension .xlsx
xlsx = glob.glob(output_filename + '.xlsx')
# create output filenames with extension .csv
csvs = [x.replace('.xlsx', '.csv') for x in xlsx]
# zip into a list of tuples
in_out = zip(xlsx, csvs)
# loop through each file, calling the in2csv utility from subprocess
for xl, csv in in_out:
    out = open(csv, 'w')
    command = 'c:/python34/scripts/in2csv %s\\%s' % (path, xl)
    proc = subprocess.Popen(command, stdout=out)
    proc.wait()
    out.close()

print('XLSX and CSV files named ' + output_filename + ' were created')
You've disabled type conversion with cursor = conn.cursor(raw=True). Remove the raw=True so the driver stops giving you raw bytearrays for all column types.
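In other words, a minimal sketch of the fix:

# cursor = conn.cursor(raw=True)  # raw=True returns bytearrays, which xlsxwriter cannot hash
cursor = conn.cursor()             # let the connector convert types (datetime, str, ...) normally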
