How can I run Oban jobs (I am using a Cron job plugin with Oban) on localhost so that I can see it is working?
My Oban jobs just run a query and then create a file at a location. I want to be able to test this on localhost, or whichever method is the correct one for testing this.
defmodule ObanJobOne do
  def job(_) do
    """
    some sql query
    """
    |> IO.inspect()
    # |> do some mapping
    # |> create file
  end
end
How can I run ObanJobOne so that I can see the results of the SQL query with the IO.inspect and also see the file that got created?
Oban jobs are just modules. You can use iex -S mix to test them locally:
YourWorker.perform(%Oban.Job{args: %{"id" => 1}})
If you want to test the queue itself, use:
%{id: 1}
|> YourWorker.new()
|> Oban.insert()
Oban also supports ExUnit with Oban.Testing.
To download artifacts from a run, you need the run id. I get the run id from the UI as shown below.
Run id from the UI
But when I set the run name parameter, the run id is not visible in the UI. How do I find the run id of a particular run in MLflow?
The run id in mlflow is a randomly generated id. I had the same problem because I wrote an mlflow decorator that needed access to the run id after the run was finished in order to set tags.
The question is what you want to do once you have the run id; depending on that, the approach would need extra information.
If you only want to get access to your latest run:
Use the mlflow.list_run_infos() function and pass in the experiment_id, which you can get with mlflow's get_experiment_by_name function (I assume you know your experiment's name). Here is the signature of list_run_infos:
def list_run_infos(
    self,
    experiment_id: str,
    run_view_type: int = ViewType.ACTIVE_ONLY,
    max_results: int = SEARCH_MAX_RESULTS_DEFAULT,
    order_by: Optional[List[str]] = None,
    page_token: Optional[str] = None,
)
Then you get back a list of RunInfo objects. But please read on:
You may have multiple run objects in your experiment (that happens with several runs, or with child runs spawned from a parent run, e.g. a grid search with sklearn).
Loop through each RunInfo object returned by list_run_infos() and look at its end_time property, which is a UNIX timestamp. Whether you have a parent run or a single run, the highest UNIX timestamp in end_time will always belong to your last run (unless you used several estimators in a loop in your experiment, which would require some refactoring). That is how you identify the appropriate RunInfo object.
Only then can you access the object's run_id property.
Here you can see the class of the run info object from mlflow; keep in mind that you also need the experiment id again.
class mlflow.entities.RunInfo(
    run_uuid,
    experiment_id,
    user_id,
    status,
    start_time,
    end_time,
    lifecycle_stage,
    artifact_uri=None,
    run_id=None,
)
In case you need the specific code:
import mlflow

last_parent_run = set()
exp_id = mlflow.get_experiment_by_name("your_exp_name").experiment_id
# Collect (end_time, run_id) pairs for every run in the experiment
for item in mlflow.list_run_infos(exp_id):
    last_parent_run.add((item.end_time, item.run_id))
And then, of course, look for the maximum entry in your set.
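For example, a small sketch (it assumes every run in the set has already finished, so end_time is not None):

# Tuples compare element-wise, so max() picks the pair with the latest end_time
latest_end_time, latest_run_id = max(last_parent_run)
print(latest_run_id)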
If you have any further questions just ask; I already tested my decorator with this and it works fine and keeps the main code clean of mlflow statements, although it's a little bit hacky to access the run_id after the run.
I have a test plan with several thread groups that write summary report results to the same csv file hosted on a server. This works fine using a network drive (z:) and setting resultcollector.action_if_file_exists=APPEND in jmeter.properties.
Finally I have a tearDown Thread Group that inserts the csv data into a SQL Server database (the previously used network drive is hosted on that server at c:\jmeter\results.csv) and then deletes the csv.
The problem is that when I run the full test plan I always get this error: "Cannot bulk load because the file "c:\jmeter\results.csv" could not be opened. Operating system error code 32"
The strange thing is that if I start the tearDown Thread Group alone it works fine: it does the bulk insert into SQL Server and then deletes the csv.
I started with JMeter 2 days ago, so I'm sure I am misunderstanding something :S
Summary Report Config
JDBC Request
BeanShell PostProcessor that deletes csv
Test plan Structure
It happens because the Summary Report (as well as other listeners) keeps the file(s) open until the test ends, so you need to trigger this "close" event somehow.
Since JMeter 3.1 you're supposed to use JSR223 Test Elements and the Groovy language for scripting, so replace the Beanshell PostProcessor with a JSR223 PostProcessor and use the following code:
import org.apache.jmeter.reporters.ResultCollector
import org.apache.jorphan.collections.SearchByClass

// Get hold of the test plan tree via the engine's private "test" field
def engine = ctx.getEngine()
def test = engine.getClass().getDeclaredField('test')
test.setAccessible(true)
def testPlanTree = test.get(engine)

// Find every ResultCollector (listener) in the test plan and close its file writer
SearchByClass<ResultCollector> listenerSearch = new SearchByClass<>(ResultCollector.class)
testPlanTree.traverse(listenerSearch)
Collection<ResultCollector> listeners = listenerSearch.getSearchResults()
listeners.each { listener ->
    def files = listener.files
    files.each { file ->
        file.value.pw.close()
    }
}

new File('z:/result.csv').delete()
More information on Groovy scripting in JMeter: Apache Groovy - Why and How You Should Use It
Using AWS boto3, I am trying to run start_query and get the results using the query id, but it didn't work as expected in my Python script. It returns the expected JSON output for start_query and I am able to fetch the queryId. But if I try to fetch the query results using the queryId, it returns an empty JSON.
import boto3

client = boto3.client('logs')

executeQuery = client.start_query(
    logGroupName='LOGGROUPNAME',
    startTime=STARTDATE,
    endTime=ENDDATE,
    queryString='fields status',
    limit=10000
)

getQueryId = executeQuery.get('queryId')

getQueryResults = client.get_query_results(
    queryId=getQueryId
)
It returns the response of get_query_results as:
{'results': [], 'statistics': {'recordsMatched': 0.0, 'recordsScanned': 0.0, 'bytesScanned': 0.0}, 'status': 'Running'}
But if I try using the AWS CLI with the queryId generated from the script, it returns the JSON output as expected.
Can anyone tell me why it didn't work from the boto3 Python script but worked in the CLI?
Thank you.
The query status is Running in your example. It's not in Complete status yet.
Running queries is not instantaneous. You have to wait a bit for the query to complete before you can get results.
You can use describe_queries to check whether your query has completed or not. You can also check whether the logs service has dedicated waiters in boto3 for the results; they would save you from polling the describe_queries API in a loop while waiting for your queries to finish.
When you do this in the CLI, there is probably more time between starting the query and asking for its results, so the query has time to complete.
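For illustration, a rough polling sketch along those lines (it continues from the script in the question; the attempt count and 5-second interval are arbitrary choices, and a real script should also handle the Failed/Cancelled/Timeout statuses):

import time

# Poll describe_queries until our queryId reports Complete, then fetch the results
for _ in range(60):
    queries = client.describe_queries(logGroupName='LOGGROUPNAME')['queries']
    query = next((q for q in queries if q['queryId'] == getQueryId), None)
    if query is not None and query['status'] == 'Complete':
        break
    time.sleep(5)  # still Scheduled/Running; wait before checking again

getQueryResults = client.get_query_results(queryId=getQueryId)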
The other issue you might be encountering is that the syntax for the queryString in the API is different from a query you would type into the CloudWatch console.
Console query syntax example:
{ $.foo = "bar" && $.baz > 0 }
API syntax for same query:
filter foo = "bar" and baz > 0
Source: careful reading and extrapolation from the official documentation plus some trial-and-error.
My logs are in JSON format. YMMV.
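For instance, that API-style query would be passed to start_query like this (a hypothetical filter, reusing the placeholders from the question):

# Hypothetical example: the same filter passed as queryString to start_query
response = client.start_query(
    logGroupName='LOGGROUPNAME',
    startTime=STARTDATE,
    endTime=ENDDATE,
    queryString='filter foo = "bar" and baz > 0 | fields @timestamp, @message',
)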
Not sure if this problem is resolved. I was facing the same issue with the AWS Java SDK, but when I terminate the thread performing the executeQuery call and perform the get_query_results call from a new thread with the old queryId, it seems to work fine.
Adding a sleep will work here. If the query takes longer than the sleep time, it will still show the Running status. You can write a loop that checks for the Complete status; if the status is still Running, sleep again for some seconds and retry. You can also put a retry count on this, as in the sketch after the pseudocode below.
Sample pseudocode:
define a sleep function (let's say SleepFunc())
loop until the retry count is reached:
    check if the status is Complete
    if yes, break
    else call SleepFunc()
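A concrete version of that pseudocode might look like this (the retry count and sleep interval are arbitrary; get_query_results itself returns the status field, so it doubles as the status check):

import time

def wait_for_query(client, query_id, retries=30, delay=2):
    # Retry until the query reports Complete, sleeping between attempts
    for _ in range(retries):
        response = client.get_query_results(queryId=query_id)
        if response['status'] == 'Complete':
            return response['results']
        time.sleep(delay)
    raise TimeoutError('query %s did not complete in time' % query_id)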
I would like to include Postgres interaction in my integration tests, i.e. not mock the database part, and I need help figuring out the best way to do the test cleanup.
My setup is NodeJS, Postgres, Sequelize, Karma+Mocha. Currently, before running the tests a new database is created and migrated; after each test I run a raw query that truncates all the tables; and after all test cases are finished the test database is dropped. As you probably guessed, the execution time for running tests like this is pretty slow.
I was wondering if there is a way to speed the process up. Is there an in-memory psql database that I could use for my test cases (I've searched for one for a while but couldn't find one), or something like that?
To be more precise, I'm looking for a way to clear the database after a test wrote something to it, in a way that does not require truncating all the tables after every test case.
Incorporated https://stackoverflow.com/a/12082038/2018521 into my cleanup:
afterEach(async () => {
  await db.sequelize.query(`
    DO
    $func$
    BEGIN
      EXECUTE
        (SELECT 'TRUNCATE TABLE ' || string_agg(oid::regclass::text, ', ') || ' RESTART IDENTITY CASCADE'
           FROM pg_class
          WHERE relkind = 'r' -- only tables
            AND relnamespace = 'public'::regnamespace
        );
    END
    $func$;
  `);
});
Truncate now runs almost instantly.
Forgive me if I don't understand Elixir really well, as I am new to it...
I'm using quantum-elixir as a cron API to dynamically create cron jobs. When someone POSTs to a route I save the cron job details into my Ecto repo and then simultaneously create a Quantum job with Quantum.add_job.
In development, when I stop my server and restart it, I have to re-add all my cron jobs because they don't survive a restart. That got me thinking that if my application were to crash, I would lose all the cron jobs. (I'm thinking about scenarios where I host the app on Google Compute Engine and for whatever reason need to reset the compute instance, e.g. upgrades on the box, etc.)
So I was wondering: what is the appropriate way to restart my app while keeping these cron jobs?
Right now I have the following:
worker(Task, [MyApp.RebootTask, :reboot, []], restart: :transient)
in the start function of my application module.
Is this the right approach? What other considerations do I need to factor in?
Any guidance is greatly appreciated
I query my db and create a list with the job definition for every item:
%Quantum.Job{
  name: job_name,
  overlap: false,
  run_strategy: %Quantum.RunStrategy.Random{nodes: :cluster},
  schedule: Crontab.CronExpression.Parser.parse!(schedule),
  task: task,
  state: :active,
  timezone: "Europe/Zurich"
}
To have the jobs started at application startup, I do something like this
defmodule Alerts.Scheduler do
  use Quantum.Scheduler, otp_app: :alerts
  require Logger

  @environment_blacklist [:test]

  def init(opts) do
    case Enum.member?(@environment_blacklist, Mix.env()) or IEx.started?() do
      true ->
        IO.inspect(opts)
        opts

      false ->
        delete_all_jobs()
        opts_with_jobs = get_startup_config(opts)
        opts_with_jobs |> IO.inspect()
        opts_with_jobs
    end
  end

  def get_startup_config(opts) do
    job_definition = Alerts.Business.Alerts.get_all_alert_jobs_config()
    (opts |> List.delete(List.keyfind(opts, :jobs, 0))) ++ [jobs: job_definition]
  end
end
In my application start function:
def start(_type, _args) do
  [
    Alerts.Repo,
    AlertsWeb.Endpoint |> supervisor([]),
    if(Mix.env() != :test, do: Alerts.Scheduler),
    Alerts.VersionSupervisor |> supervisor([])
  ]
  |> Supervisor.start_link(strategy: :one_for_one, name: Alerts.Supervisor)
end
It doesn't look like Quantum persists dynamically-added cronjobs, since the more typical approach is to define your cronjobs (named or otherwise) in your config.exs.
Since you're already storing the job details with Ecto, it's just a matter of reading those details and re-adding them when your application starts. As you're already using Quantum, the following in config/config.exs ought to do the trick:
config :quantum, cron: [
  "@reboot": &MyApp.some_function_to_read_and_readd_my_cronjobs/0
]