Domino 10 Java XPages application crashing the server after some time with PANIC: semaphore invalid or not allocated

I have a Java XPages Domino application running on my server that serves as an API for handling the Rooms & Resources database remotely (its main role is obtaining reservations for a set of rooms and updating them periodically).
Everything was fine during testing, but once I put the app on my production server, I got a crash after some time:
Domino version: Release 10.0.1FP3 August 09, 2019
OS Version: Windows/2016 10.0 [64-bit]
Error Message = PANIC: semaphore invalid or not allocated
SharedDPoolSize = 33554432
FaultRecovery = 0x00010012
Cleanup Script Timeout= 600
Crash Limits = 3 crashes in 5 minutes
StaticHang = Virtual Thread [ nHTTP: 0674: 0011] (Native thread [ nHTTP: 0674: 145c]) (0x674/0x11/0x170000145C)
ConfigFileSem = ( SEM:#0:0x1520000010D) n=0, wcnt=-1, Users=-1, Owner=[ : 0000]
FDSem = ( RWSEM:#53:0x410f) rdcnt=-1, refcnt=0 Writer=[ : 0000], n=53, wcnt=-1, Users=0, Owner=[ : 0000]
<## ------ Notes Data -> OS Data -> Semaphores -> SEM Info (Time 10:34:34) ------ ##>
SpinLockIterations = 1500
FirstFreeSem = 819
SemTableSize = 827
############################################################
### thread 46/89: [ nHTTP: 0674: 145c] FATAL THREAD (Panic)
### FP=0xF3AC3F61E8, PC=0x7FFFC6DD5AC4, SP=0xF3AC3F61E8
### stkbase=0xF3AC400000, total stksize=1048576, used stksize=40472
### EAX=0x00000004, EBX=0x00000000, ECX=0x00001c6c, EDX=0x00000000
### ESI=0x000927c0, EDI=0x00001c6c, CS=0x00000033, SS=0x0000002b
### DS=0x00000000, ES=0x00000000, FS=0x00000000, GS=0x00000000 Flags=0x1700000246
############################################################
[ 1] 0x7FFFC6DD5AC4 ntdll.ZwWaitForSingleObject+20 (10,0,0,F3AC3F6300)
[ 2] 0x7FFFC3464ABF KERNELBASE.WaitForSingleObjectEx+143 (10,F3AC3F69B0,7FFF00000000,1c6c)
#[ 3] 0x7FFFB326DAD0 nnotes.OSRunExternalScript+1808 (5,0,424,0)
#[ 4] 0x7FFFB3269E9C nnotes.FRTerminateWindowsResources+1532 (5,23B45D80D50,0,1)
#[ 5] 0x7FFFB326BA23 nnotes.OSFaultCleanupExt+1395 (0,7f60,0,F3AC3F7C70)
#[ 6] 0x7FFFB326B4A7 nnotes.OSFaultCleanup+23 (7f60,7FFFB3DE7E30,0,200000000)
#[ 7] 0x7FFFB32D6D76 nnotes.OSNTUnhandledExceptionFilter+390 (F3AC3F7B50,7FFFB485A818,F3AC3F7C70,FFFFEB865BDB003)
#[ 8] 0x7FFFB326E70A nnotes.Panic+1066 (5dc,125851500347E41,7FF786A7B4A0,23B1D91F9A8)
#[ 9] 0x7FFFB329FDD6 nnotes.OSLockSemInt+70 (23B1D91F9A4,145c,7FF786A84578,7FF786A84578)
#[10] 0x7FFFB32A04ED nnotes.OSLockWriteSem+77 (23B1D92AA18,7FF786A84578,23B14EA41B0,7FF786A84578)
#[11] 0x7FFFAC74DDC1 nlsxbe.ANDatabase::ANDRemoveCalendar+33 (23B1D92AA18,7FF786A84578,0,23B18FFCBA8)
#[12] 0x7FFFAC881CBB nlsxbe.ANCalendar::`scalar deleting destructor'+91 (7FF786A84578,23B1BB6FC78,0,1)
#[13] 0x7FFFAC7FFAF7 nlsxbe.Java_lotus_domino_local_NotesBase_RecycleObject2+471 (23B159C7A00,23B1BB6FC78,23B1BB6FC70,0)
#[14] 0x7FFFAC7FF91A nlsxbe.Java_lotus_domino_local_NotesBase_RecycleObject+42 (23B159C7A00,23B1BB6FC78,23B1BB6FC70,23B159C7A00)
Most of the operations rely on searching for a room by its Internet address in the $Rooms view of names.nsf, then opening the room's Rooms & Resources database and getting all reservation documents for that specific room. Sometimes (although very rarely) I also open a user's calendar and create/update reservations.
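A minimal sketch of that lookup flow, assuming the view is keyed by the address and that the room document carries the path to its Rooms & Resources database; all names here are illustrative, not the application's actual code, and exception handling is elided:
Database namesDb = session.getDatabase(session.getServerName(), "names.nsf");
View rooms = namesDb.getView("$Rooms");
// getDocumentByKey matches against the first sorted column; the
// assumption here is that this column holds the Internet address.
Document roomDoc = rooms.getDocumentByKey(roomAddress, true);
if (roomDoc != null) {
    // "MailFile" is an assumed item name for the resource database path.
    String rnrDbPath = roomDoc.getItemValueString("MailFile");
    // ... open the RnR database and collect the reservation documents ...
}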
At first I thought it was caused by a memory leak of some sort, so I went through all the code and recycle()d everything I could find (and I did find some places with obvious handle leaks), but it didn't help at all.
What bothers me is that the crashes happened at almost identical times of day (4 days apart, several minutes after 10 AM).
What can be the cause of this crash? I'm not good at reading dump data, but I can see that the first call in the fatal call stack is a RecycleObject, followed by some calendar-related frames.
I have no idea where I should look in my code; why would recycle() even crash the server? Does the ANCalendar in the stack suggest that I shouldn't look at the code that accesses the database directly, but rather at the code that opens users' calendars?
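For context on why recycle() ordering can matter: in the lotus.domino back-end classes, recycling a parent also frees its children on the native side, so touching (or re-recycling) a child after its parent is gone can crash in native code rather than throw a clean NotesException. A minimal sketch of the usual safe pattern, children before parents (identifiers illustrative, exception handling elided):
Database db = session.getDatabase(session.getServerName(), "names.nsf");
View view = db.getView("$Rooms");
Document doc = view.getFirstDocument();
try {
    // ... work with doc ...
} finally {
    // Recycle in reverse order of creation: child first, parent last.
    if (doc != null) doc.recycle();
    if (view != null) view.recycle();
    if (db != null) db.recycle();
    // The current XPages session is managed by the runtime and should
    // not be recycled by application code.
}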
Update
Studying the crash logs, I managed to find the place where the crash occurred. It's my appointment creation code, which uses NotesCalendar.createEntry() on the user's calendar. The code is like this:
Session session = reDatabase.getParent();
Name nnOrganizer = session.createName(session.getEffectiveUserName());
String organizerEmail = "";
DirectoryNavigator nav = session.getDirectory().lookupNames("$Users", nnOrganizer.getCommon(), "InternetAddress");
if (nav.findFirstMatch() && !nav.getFirstItemValue().isEmpty()) {
    organizerEmail = (String) nav.getFirstItemValue().get(0);
}
Recycler.recycle(nav);
Name nnResource = session.createName(roomName);
DbDirectory dir = session.getDbDirectory(session.getServerName());
Database mdb = dir.openMailDatabase();
NotesCalendar cal = session.getCalendar(mdb);
String dStart = DateUtil.formatICalendar(dtStart);
String dEnd = DateUtil.formatICalendar(dtEnd);
String iCalEntry = "BEGIN:VCALENDAR\n";
// Rest of iCalendar string goes here
iCalEntry += "END:VEVENT\n" +
             "END:VCALENDAR\n";
cal.setAutoSendNotices(true);
String apptUNID = "";
try {
    NotesCalendarEntry entry = cal.createEntry(iCalEntry);
    Document doc = entry.getAsDocument();
    apptUNID = doc.getItemValueString("ApptUNID");
    Recycler.recycle(doc, entry);
} catch (NotesException ex) {
    System.out.println("Couldn't create appointment!: " + ex.toString());
    throw ex;
} finally {
    Recycler.recycle(mdb, cal, nnOrganizer, nnResource, dir, reDatabase, session);
}
return apptUNID; // return the UNID of the created entry (if any)
Considering the fatal call stack starts with a RecycleObject call, is there anything wrong with my recycling here? Can I recycle the calendar entry right after creating it? It's still somewhat confusing to me, and this code works fine on my test server. Is there anything wrong with it?
This is the last code that gets executed when creating an appointment; an HTTP response with the apptUNID is sent directly after calling the above function.
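One thing worth noting about the finally block above, purely as a hypothesis: assuming the Recycler.recycle(...) helper processes its arguments left to right, mdb is recycled before cal, even though cal was obtained from mdb via session.getCalendar(mdb). The fatal stack (the ANCalendar destructor calling ANDatabase::ANDRemoveCalendar) is at least consistent with the calendar's teardown touching database state that is already gone. A sketch of a child-before-parent ordering to try, leaving the runtime-managed session and the caller-owned reDatabase alone:
} finally {
    // Children before parents; Recycler.recycle is assumed to recycle
    // its arguments left to right.
    Recycler.recycle(cal, mdb, dir, nnResource, nnOrganizer);
    // Intentionally not recycling reDatabase or session here: the caller
    // owns reDatabase, and an XPages-managed session must not be recycled.
}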

Related

Stuck in API XAxiDma_BdRingFromHw: why doesn't the S2MM block descriptor's Completed bit get set?

I am working on a Zynq 7Z030 and I am trying to receive data into DDR from the PL side. I am using the AXI DMA SG poll code provided as an example by Xilinx in the SDK (xaxidma_example_sg_poll.c).
After Configuring DMA -> Setting up the RX channel -> Starting DMA, I enter the function CheckDmaResult, where I call the XAxiDma_BdRingFromHw API:
while ((ProcessedBdCount = XAxiDma_BdRingFromHw(RxRingPtr,
                                                XAXIDMA_ALL_BDS,
                                                &BdPtr)) == 0) {
}
This API calls Xil_DCacheInvalidateRange, which returns, but the block descriptor status always remains 0, resulting in XAxiDma_BdRingFromHw looping forever. The Completed bit never gets set.
This happens even though I see S2MM TREADY go high and data being received in the ILA (Integrated Logic Analyzer on the FPGA/PL end).
main
    ....
    Status1 = CheckDmaResult(&AxiDma);
    ....
-> static int CheckDmaResult(XAxiDma * AxiDmaInstPtr)
    ....
    while ((ProcessedBdCount =
            XAxiDma_BdRingFromHw(RxRingPtr,
                                 XAXIDMA_ALL_BDS,
                                 &BdPtr)) == 0) {
    }
    ....
-> XAxiDma_BdRingFromHw(XAxiDma_BdRing * RingPtr, int BdLimit,
                        XAxiDma_Bd ** BdSetPtr)
    ....
    while (BdCount < BdLimit) {
        /* Read the status */
        XAXIDMA_CACHE_INVALIDATE(CurBdPtr);
        BdSts = XAxiDma_BdRead(CurBdPtr, XAXIDMA_BD_STS_OFFSET);
        BdCr = XAxiDma_BdRead(CurBdPtr, XAXIDMA_BD_CTRL_LEN_OFFSET);
        /* If the hardware still hasn't processed this BD then we are
         * done
         */
        if (!(BdSts & XAXIDMA_BD_STS_COMPLETE_MASK)) {
            break;
        }
    .....
Could someone please suggest possible reasons or directions I should consider to solve this problem? Any and every suggestion would be a great help.
Thanks in advance!
The problem was with the board (ESD damage).
The DDR started receiving data as soon as the board was changed. In addition, the following needed to be ticked in the debug configuration settings:
Under Target Setup:
Reset entire system
Program FPGA
Under the Application tab:
Download Application
Stop at 'main'
with the correct corresponding .elf file specified in the 'Application' field.

Energy Market Simulation using multiprocessing in Python

I have a long program (430 lines) which is used to simulate an energy market following specific guidelines.
There are four different processes: Home, Market, Weather, External.
Each process has a specific task, listed below:
Home has a production and a consumption float value plus a trade policy integer, and calculates energy exchanges with each other home (multiple Home processes are created for the simulation).
Market calculates the current energy price based on production, consumption, and external factors.
Weather determines random variables of temperature and season to be used by Market.
External is a child process of Market and provides random external factors I have created.
I have an issue in my code where I create a new thread to display the results of each day of the simulation (days pass every 2 seconds), but I feel my code doesn't launch the thread properly, and I'm quite lost as to where exactly the issue is occurring and why. I have used various print(".") statements to show where the program goes and identify where it doesn't, and I still can't see why the thread doesn't launch properly.
I am on Windows, not Linux; if this could be the issue, please tell me. I will show a code snippet below of where the issue seems to be; the full code, as well as a PDF explaining in more detail how the project should run, are at a GitHub link (430 lines of code doesn't seem reasonable to post here).
def terminal(number_of_homes, market_queue, home_counter, clock_ready, energy_exchange_queue, console_connection, console_connection_2):
    day = 1
    while clock_ready.wait(1.5 * delay):
        req1, req2, req3, req4 = ([] for i in range(4))
        for i in range(number_of_homes):
            a = market_queue.get()
            req1.append(a)
        req1 = sort(req1)
        for i in range(number_of_homes):
            a = market_queue.get()
            req2.append(a)
        req2 = sort(req2)
        for i in range(number_of_homes):
            a = market_queue.get()
            req3.append(a)
        req3 = sort(req3)
        req1 = req1 + req2 + req3
        for i in range(energy_exchange_queue.qsize()):
            b = energy_exchange_queue.get()
            req4.append(b)
        req4 = sort(req4)
        thread = threading.Thread(target = console_display, args = (number_of_homes, req1, day, req4, console_connection.recv(), console_connection_2.recv()))
        thread.start()
        thread.join()
        day += 1
Github link: https://github.com/MaxMichel2/Energy-Market-Project

GML room_goto() Error, Expecting Number

I'm trying to make a game that chooses a room from a pool of rooms using GML, but I get the following error:
FATAL ERROR in action number 3 of Create Event for object obj_control:
room_goto argument 1 incorrect type (5) expecting a Number (YYGI32)
at gml_Object_obj_control_CreateEvent_3 (line 20) - room_goto(returnRoom)
pool = ds_list_create()
ds_list_insert(pool, 0, rm_roomOne)
ds_list_insert(pool, 1, rm_roomTwo)
ds_list_insert(pool, 2, rm_roomThree)
ds_list_insert(pool, 3, rm_roomFour)

var returnIndex;
var returnRoom;
returnIndex = irandom(ds_list_size(pool))
returnRoom = ds_list_find_value(pool, returnIndex)

if (ds_list_size(pool) == 0) {
    room_goto(rm_menu_screen)
} else {
    room_goto(returnRoom)
}
I don't understand why the error message says it's expecting a number.
This is weird indeed... I think this should actually work, but I have no GameMaker around to test. One thing worth double-checking: irandom(n) is inclusive of n, so irandom(ds_list_size(pool)) can return an index one past the end of the list, in which case ds_list_find_value returns something that isn't a room.
For now you can also solve this using choose. This saves you the list (and saves memory, because you're not cleaning up the list by deleting it, so it stays resident):
room_goto(choose(rm_roomOne, rm_roomTwo, rm_roomThree, rm_roomFour));
choose basically does exactly what you're looking for, though it might not be the best way to go if you're re-using the group of items.

Firebase... Add/Update Firebase Using node.js Script

I have arbitrary JSON that is sensibly laid out like this:
[
  {
    "id": 100,
    "name": "Buckeye, AZ",
    "status": "OPEN",
    "address": {
      "street": "416 S Watson RD",
      "city": "Buckeye"
      ...
    }
  }
]
I've written a node.js script like this as a proof of concept (the reason I'm using Node is that the JS API seems better supported than REST or Ruby for this; I could be wrong):
http = require('http')
Firebase = require('firebase')

all_sites_url = "http://supercharge.info/service/supercharge/allSites"
firebase_url = "https://tesla-supercharger.firebaseio.com/"

http.get(all_sites_url, (res) ->
  body = ""
  res.on "data", (chunk) ->
    body += chunk
    return
  res.on "end", ->
    response = JSON.parse(body)
    all_sites = response
    send_to_firebase(response)
    return
  return
).on "error", (e) ->
  console.log "Got error: ", e
  return

send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  for charger in response
    console.log charger
    new_child = firebase_ref.push()
    new_child.set {id: charger.id, data: charger}, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        console.log "Data saved successfully"
The result is a Firebase-generated unique id for each entry, which has an id child and a data child; the data child has the expected information like name, status, etc.
What I'd prefer is to use the id as the key. E.g., for an id of 100:

- 100
  - name
  - address
    - street
    - city
    - etc.

So my first question is how to accomplish this, or whether it is even sensible.
After the first time around, this data (call it the data from the external server) will be there, and a mobile app will have added some fields of its own that are not present in the server data. The next time I fetch data from the external server, I want to update the things that have changed that the server would know about, like status, without tampering with things only the mobile devices would know about, like remote_observations.
I know I may seem a bit dense here, but I'm trying to put together a sensible data model that can be updated from that server by a cron job and incrementally updated from a bunch of mobile devices.
Any help is much appreciated.
UPDATE: I have found that this works for getting the structure I want:
send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  for charger in response
    firebase_ref.child(charger.id).update charger, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        responses_pending += 1
        console.log "Data saved successfully : #{responses_pending} pending"

  firebase_ref.on 'value', ->
    console.log "value received rp=#{responses_pending}"
    process.exit() if (responses_pending -= 1) < 1
So the code I settled on is this:
http = require('http')
Firebase = require('firebase')

firebase_url = '/path/to/your/firebase'

# code to get JSON of the form:
#   {
#     "id": 100,
#     "name": "Buckeye, AZ",
#     "status": "OPEN",
#     "address": {
#       "street": "416 S Watson RD",
#       "city": "Buckeye",
#       "state": "AZ",
#       "zip": "85326",
#       "country": "USA"
#     },
#     ... etc.
#   }

# Asynchronous get of JSON hash from some server or other.
get_my_fine_JSON().on 'complete', (response) ->
  send_to_firebase(response)

send_to_firebase = (response) ->
  firebase_ref = new Firebase(firebase_url)
  length = response.length
  for charger in response
    firebase_ref.child(charger.id).update charger, (error) ->
      if error
        console.log "Data could not be saved #{error}"
      else
        console.log "Data saved successfully"
      process.exit() if (length -= 1) is 0
Discussion:
The idea was to have a Firebase structure like this:

- 100
  - address
    - street: "123 Main Street"
    - etc.
That's reason 1 why id is pulled up to be the primary key. Reason 2 is so that I can uniquely identify an object pulled off the external server as the "same" one in my Firebase and apply any updates necessary.
Epiphany 1: Update is more like upsert. If the key is there, whatever hash you supply replaces matching values. If it's not there, then Firebase happily adds it. Which is way cool because it covers both the push and patch cases.
Epiphany 2: This process will hang waiting for events if nothing tells it to stop. That's why the countdown index, length, is decremented until the code has upserted (for lack of a better term) each item.
Observation 1: Doing this in node.js is super fast compared with REST using Python or Ruby. And this upsert stuff is wicked cool if I'm understanding it right.
Observation 2: There isn't a ton of wisdom out there as of this writing regarding writing node shell scripts to do this kind of stuff. Maybe it's a good idea, maybe a bad one. I don't know.
Observation 3: Because of the asynchronous nature of node and the Firebase Javascript API (both GOOD THINGs), terminating a process before the last bit is done can be tricky because your process has to hang on just long enough to complete its last request/response with Firebase. This is, as mentioned before, done in the completion handler of the update. Otherwise we wouldn't necessarily be complete when the process exited.
Caveat 1: Related to observation 2, this could be a bad idea, but I haven't been able to find resources that speak to the problem.
Caveat 2: This could be a horrid abuse or misunderstanding of the Firebase update API. I am reporting observed behavior in the limited case of my specific data. YMMV.
Caveat 3: I'm hoping the process lifetime is as I suggest it is in observation 3.
A note to the decaffeinated: The Javascript for this is so trivially different that it shouldn't be too tough to translate. Or go to js2coffee and paste the Coffeescript into the right pane to get real Javascript in the left pane that you can tune.

Perl Tk GUI freezes

I have a Perl Tk application wherein I create many objects and update the Tk GUI display with information from those objects. I need to add a large number of jobs (say 30k) to the tree in the GUI. If I add all the jobs in one go, the GUI freezes.
Below is the code snippet:
sub Importjobs
{
    #================= start creation of objects =============================
    my JobList $self = shift;
    my $exportedJobList = shift;
    # third parameter whether to clear the list
    $self->clear () unless shift;
    my $noOfProcsToBeAdded = shift || 3000;
    my $cellCollection = Tasks::CellCollection::instance ();
    my $calcActionsPathHash = $cellCollection->caPathCAHash ();
    my $collectionCellNames = $cellCollection->allCellNames ();
    my @importedJobs = ();
    # if the given job list is empty, add import job list to it
    push @{$self->_importJobList()}, @$exportedJobList;
    $exportedJobList = [];
    # do not import new jobs if the previous jobs are still being created
    foreach my $taskGenJob (@{$self->getTaskGenJobObjs()}) {
        goto FINISH if TaskGenJobState::CREATE == $taskGenJob->state();
    }
    # now get each job and add it into the imported jobs till the noOfJobs exceeds $noOfJobsToBeAdded
    while (my $jobDescription = shift @{$self->_importJobList()}) {
        my $taskInstantiation = $jobDescription->{'taskInstantiation'};
        my $caPath = $taskInstantiation->{'calcActionPath'};
        my $errMsgPrefix = 'Error importing ' . join ('-', $task, $command, $method, $caPath);
        my @calcActionList;
        if (defined $caPath) {
            my $calcAction = $calcActionsPathHash->{ $caPath };
            unless ($calcAction) {
                my $errMsg = $errMsgPrefix . ": the calcAction is not defined within the current CellCollection : " . $caPath;
                $logger4Perl->error ($errMsg);
                next;
            }
            push @calcActionList, $calcAction;
        } else {
            my @mList;
            if (not defined $method) {
                push @mList, @{$task->getMethods(cellCollection => $cellCollection, command => $command)};
                $method = join(' ', @mList);
            } elsif ($method eq $task_desc::default_method) {
                @mList = ($task_desc::default_method);
            } else {
                @mList = sort (grep { $_ } split(/\s+|__/, $method));
            }
            foreach my $m (@mList) {
                push(@calcActionList, @{$cellCollection->findCalcActions($task, $command, $m)});
            }
        }
        foreach my $calcAction (@calcActionList) {
            my TaskGenJob $job = TaskGenJob->new ();
            $logger4Perl->info ("Adding $caPath");
            push (@importedJobs, $job);
            my $noOfProcsBeingAdded = $job->calculateNoOfJobExecObjs();
            $noOfProcsToBeAdded -= $noOfProcsBeingAdded;
        }
        last if 1 > $noOfProcsToBeAdded;
    }
    #================= End creation of objects =============================
    # Below function updates the GUI display
    $self->addJobs (\@importedJobs);
    #================= Mechanism which I am using so that GUI will be active after certain time limit =============================
  FINISH:
    if (@{$self->_importJobList()}) {
        $self->parentDlg()->parentWnd()->after(60000,
            sub {
                $GuiTasksAppl::mainDlg->Disable();
                $self->importJobList([], 'noclear', 200);
                $GuiTasksAppl::mainDlg->Enable();
            });
    }
}
Currently the way I am doing it is to add, say, 3000 jobs using the $noOfProcsToBeAdded variable, stay idle for some time, and repeat the process after a while. During this idle period, a different process handles the jobs already in the GUI.
Can someone propose a better approach than this? I'm expecting ideas around threading and shared memory.
First, if the GUI freezes (and never unfreezes) during your large 30k update, then you might have found a Tk bug, since that shouldn't happen. However, if it's merely unresponsive for a period of time, then it makes sense to mitigate the delay.
In the past, I've used either Tk::repeat() or Tk::after() to drive my UI update routine. The user interface doesn't typically need to be updated at a high rate, so every few hundred milliseconds can be a reasonable delay; the right rate is largely determined by how responsive an interface you need. Then, during the job import step, append references to a list for the UI update routine and periodically call $MW->update(). The update routine doesn't necessarily need to process the full list during each call, but you don't want the processing to get too far behind.
I'd also recommend some visual indicator to identify that the update is still in-progress.
If ImportJobs is computationally expensive, obviously one could perform multi-process / multi-threading tricks to exploit multiple processors on the system. But that'll add a bit of complexity and testing effort.