oracle temporary ora 12505 error after linux startup - linux

I am experiencing a very strange behavior with oracle, maybe somebody can help me, let me summarize it real quick:
My OS of choice is debian linux, I am using Oracle XE 11.0.2.0. On linux startup, I run a script file which is located under /etc/init.d/. I added the following line to make oracle start on system start:
/etc/init.d/oracle-xe start
Right after this line , I run my application from the script, my application heavily relies on the oracle db, therefore once oracle starts, I am positive that my application will run ok. Unfortunately my assumption seems wrong.Here's why: I set up similar set up in 3 machines, in 2 of them I see weird behavior, after system start oracle db is not responding to connection requests, Even though oracle-xe start command completed executing.
My observation is the following, if I run my application right after oracle-xe start is executed, I receive ora-12505 errors at least for a minute: "TNS listener does not currently know of SID" . After a minute everything stabilizes, and my application starts working ok. 1 minute without a db on system startup is not acceptable for me performance-wise, therefore I am trying to solve this problem.
Surprisingly it does not happen in one of the other linux boxes I have here, I am not quite sure what is different on that box. I compared ora files, but couldn't find any difference, it seems like a wild goose chase...
I would be so grateful if anybody has experienced and solved ths problem before and shares that valuable solution with me.

I think I found the problem, looks like I am starting oracle-xe instance before I assign network interfaces an IP address, in that case it takes some time for oracle to receive connections, that requires me to set static ip on the linux boxes, which is something I don't want. Is there a solution so that I can still assign IP addresses later on?

Related

Using a fixed server ID with OOkla speedtest-cli

Some time ago, I set up a Linux task to run speedtest-cli every 30 minutes to figure out a network issue. The task used the "--server ID" argument to get the speed to the same server each time. I used it for a while then forgot about it. Today I go back to revisit this only to find out that the API seems to have changed. Now proving the --list argument does not print a list of hundreds of servers, but of only the few (~10) nearest you. In my case, the servers it reports seems to change at least daily. Requesting speedtest to any server ID not reported in the list gives a failure. Has anyone figured out a way to get a periodic speedtest to a fixed server using speedtest-cli or any other tool?
If you are still looking for a solution, here is my suggestion.
While this does not use speedtest-cli (which no longer is supported and you should look at Ookla SpeedTest command line client instead) I believe this is what you are looking for, I'm running this in a Debian VM but if you have access to a RPi and can dedicate to this task, you may want to check this out.
https://github.com/geerlingguy/internet-pi
You can modify the docker-compose to hard code the server ID of your choice. You can get this from the Ookla SpeedTest command line client.
You would need to run the command:
speedtest -L
Good Luck!

Determining Website Crash Time on Linux Server

2.5 months ago, I was running a website on a Linux server to do a user study on 3 variations of a tool. All 3 variations ran on the same website. While I was conducting my user study, the website (i.e., process hosting the website) crashed. In my sleep-deprived state, I unfortunately did not record when the crash happened. However, I now need to know a) when the crash happened, and b) for how long the website was down until I brought it back up. I only have a rough timeframe for when the crash happened and for long it was down, but I need to pinpoint this information as precisely as possible to do some time-on-task analyses with my user study data.
The server runs Linux 16.04.4 LTS (GNU/Linux 4.4.0-165-generic x86_64) and has been minimally set up to run our website. As such, it is unlikely that any utilities aside from those that came with the OS have been installed. Similarly, no additional setup has likely been done. For example, I tried looking at a history of commands used in hopes that HISTTIMEFORMAT was previously set so that I could see timestamps. This ended up not being the case; while I can now see timestamps for commands, setting HISTTIMEFORMAT is not retroactive, meaning I can't get accurate timestamps for the commands I ran 2.5 months ago. That all being said, if you have an idea that you think might work, I'm willing to try (as long as it doesn't break our server)!
It is also worth mentioning that I currently do not know if it's possible to see a remote desktop or something of the like; I've been just ssh'ing in and use the terminal to interact with the server.
I've been bouncing ideas off with friends and colleagues, and we all feel that there must be SOMETHING we could use to pinpoint when the server went down (e.g., network activity logs showing spikes around the time that the user study began as well as when the website was revived, a log of previous/no longer running processes, etc.). Unfortunately, none of us know about Linux logs or commands to really dig deep into this very specific issue.
In summary:
I need a timestamp for either when the website crashed or when it was revived. It would be nice to have both (or otherwise determine for how long the website was down for), but this is not completely necessary
I'm guessing only a "native" Linux command will be useful since nothing new/special has been installed on our server. Otherwise, any additional command/tool/utility will have to be retroactive.
It may or may not be possible to get a remote desktop working with the server (e.g., to use some tool that has a GUI you interact with to help get some information)
Myself and my colleagues have that sense of "there must be SOMETHING we could use" between various logs or system information, such at network activity, process start times, etc., but none of us know enough about Linux to do deep digging without some help
Any ideas for what I can try to help figure out at least when the website crashed (if not also for how long it was down)?
A friend of mine pointed me to the journalctl command, which apparently maintains timestamps of past commands separately from HISTTIMEFORMAT and keeps logs that for me went as far back as October 7. It contained enough information for me to determine both when I revived my Node js server as well as when my Node js server initially went down

ActivePERL on Windows running multiple instances of script simultaneously?

I've been unable to find anything which really answers the question that I have, so if anyone can shed some light on the matter, I'd be grateful.
I'm not a Unix/Linux guy, so I'm running ActivePerl on Windows NT.
The scenario is this:
webscript.cgi calls background.pl to do some dirty work while the user continues browsing the site, using system($cmd). This works fine and all, but what I am wondering is THIS:
What happens if MULTIPLE calls are made within seconds of each other as a result of users actions to run background.pl? Will multiple instances of background.pl run simultaneously? Will one instance have to complete before the next can begin? Will any subsequent instances called simply fail? Or, will my machine begin to smoke and then perhaps explode? (chuckle)
Again, this is running in a Windows environment so I'm not sure if the rules with ActivePerl are a bit different than running in a Unix environment. Thanks to anyone who might have some information about this!
The web server doesn't know anything about the process running background.pl, so it does what it always does. It runs webscript.cgi which launches background.pl.
Now, if webscript.cgi waits for background.pl to complete, you could run into a situation where the web server stops accepting requests because all of its workers are running webscript.cgi. It will resume once a script ends.
All of this is very easy to test.
Will one instance have to complete before the next can begin?
No.
Will any subsequent instances called simply fail?
No.
Or, will my machine begin to smoke and then perhaps explode? (chuckle)
A poorly configured server could indeed be brought down by trying to run too many programs at once.

Cygwin intermittently loses it's mapped drives in /cygdrive

So, I have a collection of Windows Server 2016 virtual machines that are used to run some tests in pairs. To perform these tests, I copy a selection of scripts and files from the network on to the machine, before performing the tests.
I'm basically using a selection of scripts that have existed around here since before my time and whilst i would like to use other methods, so much of our infrastructure relies on these scripts that overhauling the system would be a colossal task.
First up, i sort out the mapped drives with
net use X: \\network\location1 /user:domain\user password
net use Y: \\network\location2 /user:domain\user password
and so on
Soon after, i use rsync to copy files from a location in /cygdrive/y/somewhere to /cygdrive/c/somewhere_else
During the rsync, i will get errors that "files have vanished" (I'm currently unable to post the exact error, I will edit this later to include this). When i check what's currently in the /cygdrive directory, all i see is /cygdrive/c and everything else has disappeared.
I've tried making a symbolic link to /cygdrive/y in a different location, I've tried including persistent:yes on the net use command, I've changed the power settings on the network card to not sleep. None of these work.
I'm currently looking into the settings for the virtual machines themselves at this point, but I have some doubts as we have other virtual windows machines that do not seem to have this issue.
Has anyone has heard of anything similar and/or knows of a decent method to troubleshoot this?
Right, so I've been working on this all day and finally noticed a positive change, but since my systems are in VMware's vCloud, this may not work for some people. It's was simply a matter of having the VM turned off and upgrading the Virtual Hardware Version to the latest version. I have noticed with this though, that upon a restart, one of the first messages that comes up mentions that the computer is "disabling group policies".
I did a bit of research into this and found out that Windows 8 and 10 (no mention of any Windows Server machines) both automatically update Group Policies in the background, disconnecting and reconnecting mapped drives to recreate them.
It's possible that changing the Group Policy drive from "recreate" to "update" should fix this issue, and that the Virtual Hardware update happened to resolve this in a similar manner.

QSslSocket timeouts in Ubuntu Server, but not in Desktop

We have problem with our Qt based production server for our business application. When total SSL connections increases with time, some clients does not manage to connect at all.
QSslSocket::waitForEncrypted() starts to fail with no QSslError, regardless of that timeout where set. There are more then ~100 active connections when this problem starts to kick in.
So there are ~170 connections, twice of threads, and "lsof" mentions a little more then 1000 opened files (we had to increase file "ulimit" for that..).
It does not look like it's clients problem, since IPs that are failing and reconnecting changes with time (some "leaps in" with success, but then other don't).
As mentioned, this happens in Ubuntu Server (Zentyal 10.04 and "vanilla" 9.10), but does NOT in Ubuntu Desktop 9.10.
Everything runs inside VMWare ESX 4.1, systems there tested with same resources attached. System loads stays below 1.0. Daemon runs with root permissions.
It looks like it's something with "server"/"desktop" kernel or other configuration differences, but I couldn't tell what exactly could make SSL connection not to handshake... in "server editions"...
We are using Qt 4.5.3 compiled by ourselves.
EDIT: after all it's the same on any Linux I tried. It feels like it's some kind socket limit per process, witch is about 1016 - other_opened_files. I'll try to create new question about that.
EDIT 2: It's select and FD_SETSIZE limit problem...
Problem is with fact that Qt uses select() which is limited with FD_SETSIZE macro for maximum selected sockets/files. I had to change FD_SETSIZE value inside /usr/include/bits/typesizes.h before compiling libQtNetwork and libQtCore.

Resources