In Circus, the process and socket manager by mozilla, what is a singleton? - mozilla

When configuring watchers, what would be the purpose of including both of these settings under a watching:
singleton = True
numprocess = 1
The documentation states that setting singleton has the following effect:
singleton:
If set to True, this watcher will have at the most one process. Defaults to False.
I read that as negating the need to specify numprocesses however in the github repository they provide an example:
https://github.com/circus-tent/circus/blob/master/examples/example6.ini
Included here as well, where they specify both:
[circus]
check_delay = 5
endpoint = tcp://127.0.0.1:5555
pubsub_endpoint = tcp://127.0.0.1:5556
stats_endpoint = tcp://127.0.0.1:5557
httpd = True
debug = True
httpd_port = 8080
[watcher:swiss]
cmd = ../bin/python
args = -u flask_app.py
warmup_delay = 0
numprocesses = 1
singleton = True
stdout_stream.class = StdoutStream
stderr_stream.class = StdoutStream
So I would assume they do something different and in some way work together?

numprocess is the initial number of process for a given watcher. In the example you provided it is set to 1, but a user can typically add more processes as needed.
singleton would only allow a maxiumum of 1 process running for a given watcher, so it would forbid you from increment the number of processes dynamically.
The code below from circus test suite describes it well ::
#tornado.testing.gen_test
def test_singleton(self):
# yield self._stop_runners()
yield self.start_arbiter(singleton=True, loop=get_ioloop())
cli = AsyncCircusClient(endpoint=self.arbiter.endpoint)
# adding more than one process should fail
yield cli.send_message('incr', name='test')
res = yield cli.send_message('list', name='test')
self.assertEqual(len(res.get('pids')), 1)
yield self.stop_arbiter()

Related

Parameter aliasing

when implementing Origen::Parameters, I understood the importance of defining a 'default' set. But, in essence, my real default is named something different. So I implemented a hack of a parameter alias:
Origen.top_level.define_params :default do |params|
params.tconds.override = 1
params.tconds.override_lev_equ_set = 1
params.tconds.override_lev_spec_set = 1
params.tconds.override_levset = 1
params.tconds.override_seqlbl = 'my_pattern'
params.tconds.override_testf = 'tm_3'
params.tconds.override_tim_spec_set = 'bist_xxMhz'
params.tconds.override_timset = '1,1,1,1,1,1,1,1'
params.tconds.site_control = 'parallel:'
params.tconds.site_match = 2
end
Origen.top_level.define_params :cpu_mbist_hr, inherit: :default do |params|
# way of aliasing parameter names
end
Is there a proper method of parameter aliasing that is just not documented?
There is no other way to do this currently, though I would be open to a PR to enable something like:
default_params = :cpu_mbist_hr
If you don't want them to be called :default in this case though, then maybe you don't really want them to be the default anyway.
e.g. adding this immediately after you define them would effectively give you an alternative default and would do pretty much the same job as the proposed API above:
# self is required here to help Ruby know that you are calling the params= API
# and not defining a local variable called params
self.params = :cpu_mbist_hr

tensorflow workers start before the main thread could initialize variables

I am trying to use tf.train.batch to run enqueue images in multiple threads. When the number of threads is 1, the code works fine. But when I set a higher number of threads I receive an error:
Failed precondition: Attempting to use uninitialized value Variable
[[Node: Variable/read = Identity[T=DT_INT32, _class=["loc:#Variable"], _device="/job:localhost/replica:0/task:0/cpu:0"](Variable)]]
The main thread has to run for some time under one second to index the database of folders and put it into tensor.
I tried to use sess.run([some_image]) before running tf.train.bath loop. In that case workers fail in the background first with the same error, and after that I receive my images.
I tried to use time.sleep(), but it does not seem to be possible to delay the workers.
I tried adding a dependency to the batch:
g = tf.get_default_graph()
with g.control_dependencies([init_one,init_two]):
example_batch = tf.train.batch([my_image])
where init_one, and init_two are tf.initialize_all(variables) and tf.initialize_local_variables()
the most relevant issue I could find is at: https://github.com/openai/universe-starter-agent/issues/44
Is there a way I could ask the synchronize worker threads with the main thread so that they don't race first and die out ?
A similar and easy to reproduce error with variable initialization happens when set up the epoch counter to anything that is not None Are there any potential solutions ? I've added the code needed to reproduce the error below:
def index_the_database(database_path):
"""indexes av4 database and returns two tensors of filesystem path: ligand files, and protein files"""
ligand_file_list = []
receptor_file_list = []
for ligand_file in glob(os.path.join(database_path, "*_ligand.av4")):
receptor_file = "/".join(ligand_file.split("/")[:-1]) + "/" + ligand_file.split("/")[-1][:4] + '.av4'
if os.path.exists(receptor_file):
ligand_file_list.append(ligand_file)
receptor_file_list.append(receptor_file)
index_list = range(len(ligand_file_list))
return index_list,ligand_file_list, receptor_file_list
index_list,ligand_file_list,receptor_file_list = index_the_database(database_path)
ligand_files = tf.convert_to_tensor(ligand_file_list,dtype=tf.string)
receptor_files = tf.convert_to_tensor(receptor_file_list,dtype=tf.string)
filename_queue = tf.train.slice_input_producer([ligand_files,receptor_files],num_epochs=10,shuffle=True)
serialized_ligand = tf.read_file(filename_queue[0])
serialized_receptor = tf.read_file(filename_queue[1])
image_one = tf.reduce_sum(tf.exp(tf.decode_raw(serialized_receptor,tf.float32)))
image_batch = tf.train.batch([image_one],100,num_threads=100)
init_two = tf.initialize_all_variables()
init_one = tf.initialize_local_variables()
sess = tf.Session()
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess,coord=coord)
sess.run([init_one])
sess.run([init_two])
while True:
print "next"
sess.run([image_batch])

Torch - Multithreading to load tensors into a queue for training purposes

I would like to use the library threads (or perhaps parallel) for loading/preprocessing data into a queue but I am not entirely sure how it works. In summary;
Load data (tensors), pre-process tensors (this takes time, hence why I am here) and put them in a queue. I would like to have as many threads as possible doing this so that the model is not waiting or not waiting for long.
For the tensor at the top of the queue, extract it and forward it through the model and remove it from the queue.
I don't really understand the example in https://github.com/torch/threads enough. A hint or example as to where I would load data into the queue and train would be great.
EDIT 14/03/2016
In this example "https://github.com/torch/threads/blob/master/test/test-low-level.lua" using a low level thread, does anyone know how I can extract data from these threads into the main thread?
Look at this multi-threaded data provider:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua
It runs this file in the thread:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L18
by calling it here:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L30-L43
And afterwards, if you want to queue a job into the thread, you provide two functions:
https://github.com/soumith/dcgan.torch/blob/master/data/data.lua#L84
The first one runs inside the thread, and the second one runs in the main thread after the first one completes.
Hopefully that makes it a bit more clear.
If Soumith's examples in the previous answer are not very easy to use, I suggest you build your own pipeline from scratch. I provide here an example of two synchronized threads : one for writing data and one for reading data:
local t = require 'threads'
t.Threads.serialization('threads.sharedserialize')
local tds = require 'tds'
local dict = tds.Hash() -- only local variables work here, and only tables or tds.Hash()
dict[1] = torch.zeros(4)
local m1 = t.Mutex()
local m2 = t.Mutex()
local m1id = m1:id()
local m2id = m2:id()
m1:lock()
local pool = t.Threads(
1,
function(threadIdx)
end
)
pool:addjob(
function()
local t = require 'threads'
local m1 = t.Mutex(m1id)
local m2 = t.Mutex(m2id)
while true do
m2:lock()
dict[1] = torch.randn(4)
m1:unlock()
print ('W ===> ')
print(dict[1])
collectgarbage()
collectgarbage()
end
return __threadid
end,
function(id)
end
)
-- Code executing on master:
local a = 1
while true do
m1:lock()
a = dict[1]
m2:unlock()
print('R --> ')
print(a)
end

How to use dcmtk/dcmprscp in Windows

How can I use dcmprscp to receive from SCU Printer a DICOM file and save it, I'm using dcmtk 3.6 & I've some trouble to use it with the default help, this's what I'm doing in CMD:
dcmprscp.exe --config dcmpstat.cfg --printer PRINT2FILE
each time I receive this messagebut (database\index.da) don't exsist in windows
W: $dcmtk: dcmprscp v3.6.0 2011-01-06 $
W: 2016-02-21 00:08:09
W: started
E: database\index.dat: No such file or directory
F: Unable to access database 'database'
I try to follow some tip, but the same result :
http://www.programmershare.com/2468333/
http://www.programmershare.com/3020601/
and this's my printer's PRINT2FILE config :
[PRINT2FILE]
hostname = localhost
type = LOCALPRINTER
description = PRINT2FILE
port = 20006
aetitle = PRINT2FILE
DisableNewVRs = true
FilmDestination = MAGAZINE\PROCESSOR\BIN_1\BIN_2
SupportsPresentationLUT = true
PresentationLUTinFilmSession = true
PresentationLUTMatchRequired = true
PresentationLUTPreferSCPRendering = false
SupportsImageSize = true
SmoothingType = 0\1\2\3\4\5\6\7\8\9\10\11\12\13\14\15
BorderDensity = BLACK\WHITE\150
EmptyImageDensity = BLACK\WHITE\150
MaxDensity = 320\310\300\290\280\270
MinDensity = 20\25\30\35\40\45\50
Annotation = 2\ANNOTATION
Configuration_1 = PERCEPTION_LUT=OEM001
Configuration_2 = PERCEPTION_LUT=KANAMORI
Configuration_3 = ANNOTATION1=FILE1
Configuration_4 = ANNOTATION1=PATID
Configuration_5 = WINDOW_WIDTH=256\WINDOW_CENTER=128
Supports12Bit = true
SupportsDecimateCrop = false
SupportsTrim = true
DisplayFormat=1,1\2,1\1,2\2,2\3,2\2,3\3,3\4,3\5,3\3,4\4,4\5,4\6,4\3,5\4,5\5,5\6,5\4,6\5,6
FilmSizeID = 8INX10IN\11INX14IN\14INX14IN\14INX17IN
MediumType = PAPER\CLEAR FILM\BLUE FILM
MagnificationType = REPLICATE\BILINEAR\CUBIC
The documentation of the "dcmprscp" tool says:
The dcmprscp utility implements the DICOM Basic Grayscale Print
Management Service Class as SCP. It also supports the optional
Presentation LUT SOP Class. The utility is intended for use within the
DICOMscope viewer.
That means, it is usually not run from the command line (as most of the other DCMTK tools) but started automatically in the background by DICOMscope.
Anyway, I think the error message is clear:
E: database\index.dat: No such file or directory
F: Unable to access database 'database'
Did you check whether there is a subdirectory "database" and whether the "index.dat" file exists in this directory? If you should ask why there is a need for a "database" then please read the next paragraph of the documentation:
The dcmprscp utility accepts print jobs from a remote Print SCU.
It does not create real hardcopies but stores print jobs in the local
DICOMscope database as a set of Stored Print objects (one per page)
and Hardcopy Grayscale images (one per film box N-SET)

Is it possible to specify an id when creating an issue on GitLab?

I intend to transfer issues from Redmine to GitLab using this script
https://github.com/sdslabs/redmine-to-gitlab/blob/master/issue-tranfer.py
It works, but I would like to keep the issues ids during the transition. By default GitLab just starts from #1 and increases. I tried adding "newissue['iid']=issue['id']" and variations to the parameters, but apparently GitLab simply does not permit assigning an id. Anyone knows if there's a way?
"issue" is the data acquired from redmine:
newissue = {}
newissue['id'] = pro['id']
newissue['title'] = issue['subject']
newissue['description'] = issue["description"]
if 'assigned_to' in issue:
auser = con.finduserbyname(issue['assigned_to']['name'])
if(auser):
newissue['assignee_id'] = auser['id']
print newissue
if ('fixed_version' in issue):
newissue['milestone_id'] = issue['fixed_version']['id']
newiss = post('/projects/' + str(pro['id']) + '/issues', newissue)
and this is the "post" function
def post( url, load = {}):
load['private_token'] = conf.token
r = requests.post(conf.base_url + url, params = load, verify = conf.sslverify)
return r.json()
The API does not allow you to specify an issue ID at creation time. The ID is intended to be sequential. The only way you could potentially accomplish this task is to interact with the database directly. If you choose this route I caution you to be extremely careful, and have backups.

Resources