I am learning how to use ReadyAPI and trying to validate a response's XML content against a schema file. My code is pretty simple, but I am receiving "Could not find named-arg compatible constructor". I have confirmed that the schema file exists and that there is an appropriate response in XML format. I have peppered the code with logs but still don't have any idea where to focus.
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import com.eviware.soapui.support.XmlHolder
// retrieve schema file
def xsdSchema = "/schema/v2_0/qm.xsd"
log.info xsdSchema.toString()
// get the XML Response
def groovyUtils = new com.eviware.soapui.support.GroovyUtils( context )
def response = groovyUtils.getXmlHolder( 'Get Test Case Result#ResponseAsXML' )
log.info response
// create validation objects
def factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
def schema = factory.newSchema(new StreamSource(xsdSchema))
def validator = schema.newValidator()
log.info factory.toString()
log.info schema.toString()
log.info validator.toString()
// Validate the response against the schema
assert validator.validate(new StreamSource(new StringReader(response))) == null // This is line 28
The output of my logs is as follows:
Wed Jan 25 13:15:43 EST 2023: INFO: /schema/v2_0/qm.xsd
Wed Jan 25 13:15:43 EST 2023: INFO: com.eviware.soapui.support.XmlHolder@79c0a307
Wed Jan 25 13:15:44 EST 2023: INFO: org.apache.xerces.jaxp.validation.XMLSchemaFactory@a4effe4
Wed Jan 25 13:15:44 EST 2023: INFO: org.apache.xerces.jaxp.validation.XMLSchema@1a8a805d
Wed Jan 25 13:15:44 EST 2023: INFO: org.apache.xerces.jaxp.validation.ValidatorImpl@201dce24
Full Error message:
groovy.lang.GroovyRuntimeException: Could not find named-arg compatible constructor. Expecting one of:
java.io.StringReader(com.eviware.soapui.support.XmlHolder)
java.io.StringReader()
error at line: 28
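For reference, the constructor list in the error suggests that the value passed to new StringReader(...) is the XmlHolder itself rather than a String. A minimal sketch of how the validation call could be adjusted, assuming XmlHolder's getXml() method returns the response document as a string:
// Hypothetical adjustment: extract the XML text from the XmlHolder first.
def responseXml = response.getXml()   // getXml() assumed from the SoapUI XmlHolder API
validator.validate(new StreamSource(new StringReader(responseXml)))
log.info "Response validated against ${xsdSchema}"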
I am in the process of moving a Python process to Spark. In Python we are using ftplib to connect and download a file to an EC2 instance. Once the file is downloaded, we upload it to S3. We are transitioning to serverless infrastructure and would like to load the file in Spark via AWS Glue and then use multi-part upload to move it to S3. I have tried to just run the current code on a larger Glue instance type, but the machine still runs out of memory (20 GB file).
Old Python code:
"""
This script will get the backup file
"""
import sys
from datetime import datetime
import re
import ftplib
from retry import retry
import shutil
import os
from tools.python.s3_functions import s3_upload
from python_scripts.get import *
def get_ftp_connector(path, user, password):
ftp = ftplib.FTP_TLS(path)
ftp.login(user, password)
ftp.prot_p()
return ftp
def get_ftp_files_list(ftp, dir):
ftp.cwd(dir)
files = ftp.nlst()
print(str("-".join(files)))
if "filecompleted.txt" not in files:
print("Failed to find filescompleted.txt file in ftp server.")
raise Exception("Failed to find filescompleted.txt file in ftp server.")
    regex_str = r'Backup_File_Mask_Goes_here([\d]{8}).bak'
find_date_regex = re.compile(regex_str)
searched = [(f, find_date_regex.match(f)) for f in files if find_date_regex.match(f)]
searched = \
[(file_name, datetime.strptime(regex_result.groups()[0], '%Y%m%d')) for file_name, regex_result in searched]
searched = sorted(searched, key=lambda elem: elem[1], reverse=True)
if not searched:
print("Failed to find appropriate file in ftp server.")
raise Exception("Failed to find appropriate file in ftp server.")
return searched[0]
class FtpUploadTracker:
size_written = 0
total_size = 0
last_shown_percent = "X"
def __init__(self, total_size, bk_file):
self.total_size = total_size
self.bk_file = bk_file
self.output_file = open(self.bk_file, 'wb')
self.start_time = datetime.now()
def handle(self, block):
self.size_written += len(block)
percent_complete = str(round((self.size_written / self.total_size) * 100, 1))
self.output_file.write(block)
time_elapsed = (datetime.now() - self.start_time).total_seconds()
speed = round(self.size_written / (1000 * 1000 * time_elapsed), 2)
msg = "{percent}% complete # average speed of {speed}MB/s : total run time {minutes}m".\
format(percent=percent_complete, speed=speed, minutes=round(time_elapsed/60))
if time_elapsed > 600 and speed < 1:
print("Zombie connection, failing dl.")
raise Exception("Zombie connection, failing dl.")
if self.last_shown_percent != percent_complete:
self.last_shown_percent = percent_complete
print(msg)
def close(self):
self.output_file.close()
@retry(tries=4, delay=300)
def retrieve_db():
"""
This function will retrieve via FTP the backup
:return: None
"""
ftp = get_ftp_connector(FTP_PATH, FTP_USER, FTP_PASSWORD)
# return back the most recent entry
file_name, file_date = get_ftp_files_list(ftp, 'database')
file_epoch = (file_date - datetime(1970, 1, 1)).total_seconds()
new_file_name = "backup_{epoch}.bak".format(epoch=str(int(file_epoch)))
if os.path.exists(DATAFILEPATH):
shutil.rmtree(DATAFILEPATH)
if not os.path.exists(DATAFILEPATH):
os.makedirs(DATAFILEPATH)
    temp_backup_file_location = os.path.join(DATAFILEPATH, new_file_name)
print("Found file {file_name}, and downloading it to {loc}".
format(file_name=file_name, loc=temp_backup_file_location))
ftp_handler = FtpUploadTracker(ftp.size(file_name), temp_backup_file_location)
ftp.retrbinary("RETR " + file_name, ftp_handler.handle)
ftp.quit()
ftp_handler.close()
print("Finished download. Uploading to S3.")
s3_upload(DATAFILEPATH, new_file_name, bucket, "db_backup")
os.remove(temp_backup_file_location)
def main():
try:
retrieve_db()
except Exception as e:
print("Failed to download backup after 4 tries with error {e}.".format(e=e))
return 1
return 0
if __name__ == "__main__":
rtn = main()
sys.exit(rtn)
New Spark code (in progress): The username has a | character, which made me encode the URI. When I run the code, I get a connection refused error. I am able to use the same connection info from Python.
from pyspark import SparkContext
from pyspark import SparkFiles
import urllib
sc = SparkContext()
ftp_path = "ftp://Username:password#ftplocation.com/path_to_file"
file_path_clean = urllib.parse.urlencode(ftp_path, safe='|')
print(f"file_path_clean: {file_path_clean}")
sc.addFile(ftp_path)
filename = SparkFiles.get(ftp_path.split('/')[-1])
print(f"filename: {filename}")
rdd = sc.textFile("file://" + filename)
print("We got past rdd = sc.textFile(file:// + filename)")
rdd.take(10)
rdd.collect()
print(rdd)
There are three ways to approach the problem:
Use a mounted file system backed by FTP and write to it from Spark.
Use a Spark to SFTP connector such as spark-sftp.
Write the files with Spark somewhere else and copy to SFTP as a separate step. Due to the various reliability issues with SFTP and the fact that Spark leaves partial output behind during failed write operations, this is the path that we've taken. We write terabytes to SFTP endpoints using code that looks like the following in Scala. I hope it can be helpful for your Python work.
import java.io.Closeable
import java.net.{SocketException, SocketTimeoutException, UnknownHostException}
import javax.net.ssl.SSLException

/** Defines some high-level operations for interacting with remote file protocols like FTP, SFTP, etc.
 * Note: FInfo, BlockingRetry, Backoff, Retry, Recovery, and Ignored are project-specific helpers that are not shown here.
 */
trait RemoteFileOperations extends Closeable {
var backoff: BlockingRetry.Backoff = Backoff.linear(3000)
var retry: BlockingRetry.Retry = Retry.maxRetries(3)
var recover: Recovery = recoverable(this)
var ignore: Ignored = nonRecoverable
def listFiles(path: String = ""): Seq[FInfo]
def uploadFile(localPath: String, remoteDirectory: String): Unit
def downloadFile(localPath: String, remotePath: String): Unit
def deleteAll(path: String): Unit
def connect(): Unit = {}
def disconnect(): Unit = {}
def reconnect(): Unit = {
disconnect()
connect()
}
override def close(): Unit = disconnect()
/** Wraps a block of code and allows it to be retried when [[recoverable()]] conditions
* are met. [[BlockingRetry.retry()]] is called with the var fields
* [[backoff]], [[retry]], [[recover]], and [[ignore]], which can all be reconfigured.
*/
def retryable[A](f: => A): A = {
BlockingRetry.retry(retry, backoff, recover, ignore) {
f
}
}
def recoverable(fileOp: RemoteFileOperations): Recovery = {
case (_: SocketTimeoutException, _: Int) =>
fileOp.reconnect()
None
}
def nonRecoverable: Ignored = {
case _: UnknownHostException |
_: SSLException |
_: SocketException |
_: IllegalStateException =>
}
}
class SSHJClient(host: String, username: String, password: String) extends RemoteFileOperations {
import net.schmizz.keepalive.KeepAliveProvider
import net.schmizz.sshj.connection.ConnectionException
import net.schmizz.sshj.sftp.SFTPClient
import net.schmizz.sshj.transport.verification.PromiscuousVerifier
import net.schmizz.sshj.xfer.FileSystemFile
import net.schmizz.sshj.{DefaultConfig, SSHClient}
override def listFiles(path: String): Seq[FInfo] = {
import collection.JavaConverters._
retryable {
sftpSession(sftp => {
sftp.ls(path).asScala
.filter(f => f.getName != "." && f.getName != "..")
.map(f => FInfo(f.getPath, f.getParent, f.isDirectory, f.getAttributes.getSize, f.getAttributes.getMtime))
})
}
}
override def uploadFile(localPath: String, remoteDirectory: String): Unit = {
retryable {
sftpSession(sftp => {
sftp.getFileTransfer.setPreserveAttributes(false)
sftp.put(new FileSystemFile(localPath), remoteDirectory)
})
}
}
override def downloadFile(localPath: String, remotePath: String): Unit = {
retryable {
sftpSession(sftp => {
sftp.getFileTransfer.setPreserveAttributes(false)
sftp.get(remotePath, new FileSystemFile(localPath))
})
}
}
override def deleteAll(path: String): Unit =
throw new UnsupportedOperationException("#deleteAll is unsupported for SSHJClient")
private def sftpSession[A](f: SFTPClient => A): A = {
val defaultConfig = new DefaultConfig()
defaultConfig.setKeepAliveProvider(KeepAliveProvider.KEEP_ALIVE)
val ssh = new SSHClient(defaultConfig)
try {
// This is equivalent to StrictHostKeyChecking=no which is disabled since we don't usually know
// the SSH remote host key ahead of time.
ssh.addHostKeyVerifier(new PromiscuousVerifier())
ssh.connect(host)
ssh.authPassword(username, password)
val sftp = ssh.newSFTPClient()
try {
f(sftp)
} finally {
sftp.close()
}
} finally {
ssh.disconnect()
}
}
override def recoverable(fileOp: RemoteFileOperations): Recovery = {
super.recoverable(fileOp).orElse {
case (e: ConnectionException, _: Int) =>
println(s"Recovering session from exception: $e")
None
}
}
}
Is there a way we can retrieve artefacts in date descending order?
I currently have the below script as an example:
import org.sonatype.nexus.repository.storage.Asset
import org.sonatype.nexus.repository.storage.Query
import org.sonatype.nexus.repository.storage.StorageFacet
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
def request = new JsonSlurper().parseText(args)
assert request.groupId: 'groupId parameter is required'
assert request.repoName: 'repoName parameter is required'
assert request.startDate: 'startDate parameter is required, format: yyyy-mm-dd'
log.info("Gathering Asset list for repository: ${request.repoName} as of startDate: ${request.startDate}")
def repo = repository.repositoryManager.get(request.repoName)
StorageFacet storageFacet = repo.facet(StorageFacet)
def tx = storageFacet.txSupplier().get()
tx.begin()
Iterable<Asset> assets = tx.
findAssets(Query.builder()
.where('group = ').param(request.groupId)
.and('last_updated > ').param(request.startDate)
.build(), [repo])
def urls = assets.collect { "/repository/${repo.name}/${it.name()}" }
tx.commit()
def result = JsonOutput.toJson([
assets : urls,
since : request.startDate,
repoName: request.repoName
])
return result
with:
Query.builder()
.where('group = ').param(request.groupId)
.and('last_updated > ').param(request.startDate)
.build()
def urls = assets.collect { "/repository/${repo.name}/${it.name()}" }
Is there a way we can change the above script to retrieve things in date-descending order?
You can simply add a suffix to the query:
Query.builder()
.where('group = ').param(request.groupId)
.and('last_updated > ').param(request.startDate)
.suffix('order by last_updated desc')
.build()
def urls = assets.collect { "/repository/${repo.name}/${it.name()}" }
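Applied to the original script, the findAssets call would then look like this (the same query as above, just with the ordering suffix added):
Iterable<Asset> assets = tx.findAssets(Query.builder()
    .where('group = ').param(request.groupId)
    .and('last_updated > ').param(request.startDate)
    .suffix('order by last_updated desc')
    .build(), [repo])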
Nexus uses OrientDB behind the scenes. You can find query examples here:
https://orientdb.com/docs/2.0/orientdb.wiki/Tutorial-SQL.html
I am trying to publish test data to the cometd-demo server channel /members/hello/. The handshake is done, I can get a subscribed message on the callback and can get the published message on the publish() callback, but I can't get that published message on the subscribe() listener.
Groovy Script:
import org.cometd.bayeux.Message;
import org.cometd.bayeux.Message.Mutable
import org.cometd.bayeux.client.ClientSessionChannel;
import org.cometd.bayeux.client.ClientSessionChannel.MessageListener;
import org.cometd.client.BayeuxClient
import org.cometd.client.transport.ClientTransport
import org.cometd.client.transport.LongPollingTransport
import org.eclipse.jetty.client.HttpClient as MyHttpClient
ClientSessionChannel.MessageListener mylistener = new Mylistener();
def myurl = "http://localhost:8080/cometd/"
MyHttpClient httpClient = new MyHttpClient();
httpClient.start()
Map<String, Object> options = new HashMap<String, Object>();
ClientTransport transport = new LongPollingTransport(options, httpClient);
BayeuxClient client = new BayeuxClient(myurl, transport)
client.handshake(30000)
def channel = client.getChannel("/members/hello/")
channel.subscribe(mylistener,mylistener)
while (true)
{
sleep(5000)
channel.publish( 'hai' )
}
class Mylistener implements ClientSessionChannel.MessageListener {
public void onMessage(ClientSessionChannel channel, Message message) {
println message
}
}
While running this script I can't get the published data in the listener, even though the JVM is kept alive by the while loop. What am I missing?
You have specified an incorrect channel path in:
def channel = client.getChannel("/members/hello/")
A channel path cannot end with / - it should be /members/hello.
Also double-check that you are using the correct URL. I've used a very simple CometD server application (https://github.com/wololock/dojo-jetty9-primer) that uses the /dojo-jetty9-primer/ context path, so in my case the URL to the CometD server was:
def url = "http://localhost:8080/dojo-jetty9-primer/cometd/"
You can also simplify your script to something like this:
import org.cometd.bayeux.Message
import org.cometd.bayeux.client.ClientSessionChannel
import org.cometd.client.BayeuxClient
import org.cometd.client.transport.LongPollingTransport
import org.eclipse.jetty.client.HttpClient
final String url = "http://localhost:8080/dojo-jetty9-primer/cometd/"
final HttpClient httpClient = new HttpClient()
httpClient.start()
final BayeuxClient client = new BayeuxClient(url, new LongPollingTransport([:], httpClient))
client.handshake()
client.waitFor(1000, BayeuxClient.State.CONNECTED)
final ClientSessionChannel channel = client.getChannel("/members/hello")
channel.subscribe(new MyListener())
while (true) {
sleep(1000)
channel.publish("test")
}
class MyListener implements ClientSessionChannel.MessageListener {
    @Override
void onMessage(ClientSessionChannel channel, Message message) {
println "[${new Date()}] Received message from channel (${channel.id}): ${message}"
}
}
In particular, the client.handshake(30000) part of your script can be simplified - you don't have to wait 30 seconds here.
When you run it you will see a new message showing up in the console every second:
[Mon Feb 19 10:15:02 CET 2018] Received message from channel (/members/hello): [data:test, channel:/members/hello]
[Mon Feb 19 10:15:03 CET 2018] Received message from channel (/members/hello): [data:test, channel:/members/hello]
[Mon Feb 19 10:15:04 CET 2018] Received message from channel (/members/hello): [data:test, channel:/members/hello]
[Mon Feb 19 10:15:05 CET 2018] Received message from channel (/members/hello): [data:test, channel:/members/hello]
[Mon Feb 19 10:15:06 CET 2018] Received message from channel (/members/hello): [data:test, channel:/members/hello]
Hope it helps.
I am trying to run this script from Groovy [SoapUI]. I am not getting any errors, but the SQL command is not returning any results. Am I missing anything crucial here?
import groovy.sql.Sql
import java.sql.*
import com.jcraft.jsch.JSch
import com.jcraft.jsch.Session
// ssh login
String sshHost = 'test.com'
String sshUser = 'test'
String sshPass = 'test'
int sshPort = 22
// database login
targetHost = 'localhost'
targetUser = 'test'
targetPass = 'test'
targetPort = 3306
lport = 4328
JSch jsch = new JSch();
Session session = jsch.getSession(sshUser, sshHost, sshPort);
session.setPassword(sshPass);
session.setConfig("StrictHostKeyChecking", "no");
System.out.println("Establishing Connection...");
session.connect();
int assigned_port = session.setPortForwardingL(lport, targetHost, targetPort);
Connection con = null;
String driver = "org.mariadb.jdbc.Driver";
String connectionString = "jdbc:mariadb://" + targetHost +":" + lport + "/";
con = DriverManager.getConnection(connectionString, targetUser, targetPass);
Statement st = con.createStatement();
String sql = "select * from SS_System.tblcompanies where companyid=495555"
st.executeQuery(sql);
st.close()
session.disconnect()
Also, after adding a bunch of log.info statements I am getting the following response:
Sun Nov 13 21:39:30 EST 2016:INFO:com.jcraft.jsch.Session@4e6b3063
Sun Nov 13 21:39:31 EST 2016:INFO:null
Sun Nov 13 21:39:31 EST 2016:INFO:4336
Sun Nov 13 21:39:31 EST 2016:INFO:jdbc:mysql://localhost:4336/
Sun Nov 13 21:39:31 EST 2016:INFO:org.mariadb.jdbc.MariaDbConnection@14f67389
Sun Nov 13 21:39:31 EST 2016:INFO:org.mariadb.jdbc.MariaDbStatement@401b321f
Sun Nov 13 21:39:31 EST 2016:INFO:org.mariadb.jdbc.internal.queryresults.resultset.MariaSelectResultSet@74b9f5af
Perhaps you should do something with the query result:
// instead of this
st.executeQuery(sql)
// do something like
java.sql.ResultSet rs = st.executeQuery(sql);
and then iterate over the results as described here: https://docs.oracle.com/javase/tutorial/jdbc/basics/processingsqlstatements.html#processing_resultset_objects
while (rs.next()) {
String value = rs.getString("COLUMN_NAME");
log.info("COLUMN_NAME:"+value)
}
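Alternatively, since the script already imports groovy.sql.Sql, a more Groovy-idiomatic sketch would be the following (it reuses the tunnel, credentials, and driver from the question; logging the companyid column is only an illustration):
import groovy.sql.Sql
// Connect through the same forwarded local port opened by the JSch tunnel above.
def db = Sql.newInstance("jdbc:mariadb://localhost:${lport}/SS_System",
        targetUser, targetPass, "org.mariadb.jdbc.Driver")
try {
    // eachRow iterates the ResultSet for you, one row at a time.
    db.eachRow("select * from tblcompanies where companyid = 495555") { row ->
        log.info("companyid: ${row.companyid}")
    }
} finally {
    db.close()
}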