Oracle Sort Order - What may cause it to change - linux

Disclaimer: I know that it is bad to not use an 'ORDER BY' in SQL when sorted data is required.
I am currently supporting a Pro*C program which is having a wierd-problem.
One of the possible causes of the wierd-problem may be that the original developers (from a long time ago) have not used ORDER BY in their SQL even though the program logic depends on it!
The program has been working fine all these years and started showing problems only recently.
We are trying to pin the wierd-problem to the ORDER BY mistake (there are other cause candidates like a recent port from Solaris to Linux which took place).
What shadowy things on the database end should we look at that may have changed the old sort order? Things like data files etc?
Anybody have any experience with Pro*C on Solaris magically sorting the result-set?
Thanks!

Since you know that the program cares about the order in which results are returned and you know that the query that is submitted is missing an ORDER BY clause, is there a reason that you don't just fix the problem rather than looking to try to figure out whether the actual order of results may have changed? If you fix the known ORDER BY problem and the "weird problem" you have disappears, that would provide some pretty good evidence that the "weird problem" is, in fact, caused by the missing ORDER BY.
Unfortunately, there are lots of things that might have caused the order of results to change many of which may be impossible to track down. The most obvious cause would be a change in the execution plan. That, in turn, may have been caused either because statistics changed or because statistics didn't change enough or because of a patch or because of an initialization parameter change or because of a client configuration change among other things. If you are licensed to use the AWR (Automatic Workload Repository), you might be able to find evidence that the plan has changed by looking to see if there are multiple PLAN_HASH_VALUE values for the SQL_ID in DBA_HIST_SQLSTAT over different days. If there are, you'd still have to try to figure out whether the different plans actually caused the results to be returned in a different order. Beyond a query plan change, though, there are dozens of other possible causes. The physical order of data on disk may have changed because someone reorganized the table or because someone moved data files around on the disk or because the SAN automatically rebalanced something by moving data around. Some data may have been cached (or may not have been cached) in general in the past that is now cached. An Oracle patch may have been applied.

I suggest that change your physical table with view and make your required order in that view.
example
TABLE_NOT_SORTED --> rename to --> PHYS_TABLE_NOT_SORTED
CREATE VIEW TABLE_NOT_SORTED
AS
SELECT * FROM PHYS_TABLE_NOT_SORTED
ORDER BY DESIRED_COLUMNS
For response to comment:
According to this question and Ask Tom's Answer, it seems that since Oracle does not guarantee a default sorting if you do not use "ORDER BY", they are free to change it. They are absolutely right of course. If you need sorting, use Order By.
Other than that we can not say anything about your code or default ordering.

Related

No depot VRP - roadside assistance

I am researching a problem that is pretty unique.
Imagine a roadside assistance company that wants to dynamically route its vehicles. Hence for each packet of new incidents wants to create routes that will satisfy them, according to some constraints (time constraints, road accessibility, vehicle - incident matching).
The company has an heterogeneous fleet of vehicle (motorbikes for easy cases, up to tow trucks for the hard cases) and each incident states it's uniqueness (we know if it wants just fuel, or needs towing).
There is no depot, only the vehicles roaming on the streets.
The objective is to dynamically create routes on the way, having in mind the minimization of time and the total traveled distance.
Have you ever met such a problem? Do you have any idea in which VRP variant it belongs?
I have seen two previous questions but unfortunately they don't fit with my problem.
The respected optaplanner - VRP but with no depot and Does optaplanner out of box support VRP with multiple trips and no depot, which are both open VRPs.
Unfortunately I don't have code right now, as I am still modelling the way I will approach this problem.
I am really sorry for creating a suggestion question and not a real one.
Thank you so much in advance.
It's a rich dynamic/realtime vehicle routing problem. You won't find an exact name for your problem, as when VRPs get too complex they don't fit inside any of the standard categories.
It's clearly a dynamic/realtime problem (the terms are used interchangeably) as you would typically only find out about roadside breakdowns at short notice.
Sometimes you're servicing a broken down car, which would be a single stop (so a vehicle routing problem). Sometimes you're towing a car, which would be a pick-up delivery problem. So you have a mix of both together.
You would want to get to the broken down vehicles ASAP and some would need fixing sooner than others (think a car broken down in a dangerous position on a motorway). You would therefore need soft time windows so you can penalise lateness instead of the standard hard time windows supported in most VRP formulations.
Also for you to be able to scale to larger problems, you need an incremental optimiser that can restart from the previous (possibly now infeasible) solution when new jobs are added, vehicle positions are changed etc. This isn't supported out of the box in the open source solvers I know of.
We developed a commercial engine which does the above. We started off using the jsprit library, which supports mixing single stop and pickup delivery problems together. We later had to replace jsprit due to the amount of code we had to override to get it running happily for realtime problems, however jsprit may still prove a useful starting point for you. We discuss some of the early technical obstacles we had to overcome in getting jsprit to handle realtime problems in this white paper.

How to resolve Order and Warehouse bounded contexts dependency?

I am working on DDD project and I am currently focused on two bouned contexts, Orders and Warehouse.
What confuses me is the following situation:
Order keep track of all the placed orders, and warehouse keeps track about all the available inventory. If user places one order for certain product item, that would mean one less item of that product in a warehouse. I am oversimplifying this process, so please bear with me.
Since two domain models are placed inside of a different BC, i am currently relying on eventual consistency ie. after one item has been sold, it would eventually be removed from the warehouse.
That situation unfortunately leads to the problem scenario where another user could simultaneously make another order of the same item, and it would appear as available until eventual consistency kicks is. That is something it is unacceptable by the domain expert.
So IMO I am stuck with two options
merge order and warehouse (at least the part regarding product
inventory, units available in warehouse) into one BC
have Order BC (or microservice if you wish) to be dependent of Warehouse BC (microservice) in order to pull a live product units
available
Which option does actually follows DDD patern? Is there another way out?
You could use a reservation system with a timeout.
Using a messaging analogy: With a broker-style queuing mechanism (such as RabbitMQ) you get a message from the queue and you have control over it until you either acknowledge that it can be removed from the queue or you release it back to the queue.
You could do the same thing in your ordering process. You reserve any items on your order. SO when you add them they have a status of, say, reserving and upon sending some message to reserve the items. If the response comes back you can decide how to proceed. Perhaps you could add any items that cannot be reserved onto a back order or try again later.
There are going to be different ways to approach this. Depending on your business case it may be acceptable to only check availability when someone really accepts the order.
If you domain expert reckons it is totally unacceptable that having this resolved at the end of the process then you could move it to the start. The issue is of course that user A could reserve and never buy thereby losing user B as a customer; whereas only applying the real "taking" of the item at the end of the process is closer to ensuring a purchase. But that is a business decision.
This issue is a really great example of where reality actually is eventually consistent. Is it really the best thing to decline an order if there is no inventory currently in the warehouse - even if there was a replenishment due in the next 20 minutes?
What if the item was actually on the shelf, but the operator hadn't yet keyed it into the system?
Sometimes designers and domain experts assume that people want 100% consistency, when really, users might be willing to accept a delay in confirmation of their order, if it increased the chance that their order would be accepted rather than rejected.
In the case above, why make it the user's job to retry their order N minutes later? In an eventually consistent system, you can accommodate such timing flexibility by including a timeout to retry the attempt to fulfill the order for a period of time before confirming to the client that it really wasn't possible.
There are technical solutions that will give you 100% consistency, but I think really this is not a technical challenge but a cultural/mindset one, changing people's minds about what is possible & acceptable to achieve an what is actually a better outcome.
IMO you can build a PlaceOrderSaga which will ask for inventory availability before placing the order.

View Index is always being rebuilt

As of late, I have encountered a problem with my view index being rebuilt all the time and users are having massive issues with this particular view.
I figured it was due to #Date in my selection formula aswell as one of my column formulas. This way the selection formula would be different every second that passes.
So I figured, since I dont need hours/minutes/seconds in my formulas, I would use #Today. This worked out well for 2-3 days and after that the same problem occured again.
So since the problem is back again, I'm not quite sure if that even causes the problem. When this particular view is open, I have issues in every tab that's open in notes, not only this specific database.
Is this a common/known issue? What can I do to avoid this problem?
Yes, it's a common issue that has been well known since the very early days of Notes more than 20 years ago.
#Date is not a problem on its own. #Now and #Today are both problems.
Using #TextToTime("Today") was a popular workaround that was discovered early on. This hid the problem from the indexer so the server failed to realize that the view was out of date. It doesn't solve the underlying problem, though, which is that the view is trying to do something that views simply aren't designed to do. Views are intended to be static, requiring update only when documents change. Introducing time into a selection or column formula makes them dynamic, which kills that presumption and is a major source of performance problems. Using this workaround requires that the view be fully rebuilt every night. You can do that by setting the view index options to "Manual", and setting up a program document to run an updall command with the -T option for the specific database and view once per night. Note that if your users are spread out across timezones, you'll have to pick one specific time as your standard, and if you have servers spread out across timezones you're going to have a lot of fun figuring out how to make them all show the same documents in the view at all times - but that's common to pretty much all approaches to the problem.
See this IBM Technote for a description of several other options that people have used over the years, with their pros and cons. Also see this article by Andre Guirard, which covers date/time issues in great detail.
I would add that the agent-and-folder solution that they describe in the Technote was generally my preferred approach, but it does have an additional disadvantage that they don't mention: it can eventually lead to an obscure situation where the server throws an error "Folder is larger than supported". This error actually has nothing to do with the size of the folder in documents; it refers to fragmentation of internal structures that occurs as large numbers of documents are moved in and out of the folder over time. It could only be fixed by deleting and re-creating the folder, which you can do in your agent code. I believe this problem may be fixed in more recent versions of Domino, but it caused me a lot of grief back in the Notes 6 and 7 timeframes.

fast parallel picking/checking of some ID (think non-game application of loot lag)

So I am verifying new operational management systems, and one of these OS's sends pick lists to a scale-able number of handheld devices. It sends these using messages, and their pick lists may contain overlapping jobs. So in my virtual world, I need to make sure that two simulated humans don't pick the same job - whenever someone picks a job, all the job lists get refreshed, so that the picked job doesn't appear on anyone else's handheld anymore, but for me the message is still in the queue being handled, so I have to make sure to discard that option.
Basically I have this giant list with a mutex, and the more "people" hitting it faster, the slower I can handle messages, to the point where I'm no longer at real-time, which is bad, because I can't actually validate the system because I can't keep up with the messages. (two guys on the same isle will recognize that one is going to pick one object and the next guy should pick the 2nd item, but I need to check every single job i'm about to pick and see if it has been claimed by someone else already)
I've considered localized binning of the lists, but it actually doesn't solve the problem in the stupid case that breaks it anyway, tons of people working on the same row. Now granted this would probably be confusing for the real people as well, as in real life they need to do the same resolution, but I'm curious what the currently accepted "best" solution to this problem is.
PS - I already am implementing this in c++ and it's fast, fast enough that in any practical test I don't "need" this question answered, it's more because I'm curious that I'm asking.
Thanks in advance!
I see a problem in the design "giant list with (one) mutex". You simply can't provide the whole list in synchronized fashion, if the list size and/or access rate is unlimited. Basic math works against you. So what i would do is a mutexed flag on each job. You can't prevent a job from being displayed on someone's screen, but you can assure that he gets a graceful "no more available" error and THEN the updated list. If you ever wanted to reserve a seat on highly popular gig, you may have witnessed the solution.

What is the best way to search multiple sources simultaneously?

I'm writing a phonebook search, that will query multiple remote sources but I'm wondering how it's best to approach this task.
The easiest way to do this is to take the query, start a thread per remote source query (limiting max results to say 10), waiting for the results from all threads and aggregating the list into a total of 10 entries and returning them.
BUT...which of the remote source is more important if all sources return at least 10 results, so then I would have to do a search on the search results. While this would yield accurate information it seems inefficient and unlikely to scale up well.
Is there a solution commercial or open source that I could use and extend, or is there a clever algorithm I can use that I've missed?
Thanks
John, I believe what you want is federated search. I suggest you check out Solr as a framework for this. I agree with Nick that you will have to evaluate the relative quality of the different sources yourself, and build a merge function. Solr has some infrastructure for this, as this email thread shows.
To be honest I haven't seen a ready solution, but this is why we programmers exist: to create a solution if one is not readily availble :-)
The way I would do it is similar to what you describe: using threads - if this is a web application then ajax is your friend for speed and usability, for a desktop app gui representation is not even an issue.
It sounds like you can't determine or guess upfront which source is the best in terms of reliability, speed & number of results. So you need to setup you program so that it determines best results on the fly. Let's say you have 10 data sources, and therfore 10 threads. When you fire up your threads - wait for the first one to return with results > 0. This is going to be you "master" result. As other threads return you can compare them to your "master" result and add new results. There is really no way to avoid this if you want to provide unique results. You can start displaying results as soon as you have your first thread. You don't have to update your screen right away with all the new results as they come in but if takes some time user may become agitated. You can just have some sort of indicator that shows that more results are available, if you have more than 10 for instance.
If you only have a few sources, like 10, and you limit the number of results per source you are waiting for, to like 10, it really shouldn't take that much time to sort through them in any programming language. Also make sure you can recover if your remote sources are not available. If let's say, you are waiting for all 10 sources to come back to display data - you may be in for a long wait, if one of the sources is down.
The other approach is to f00l user. Sort of like airfare search sites do - where they make you want a few seconds while they collect and sort results. I really like Kayak.com's implementation - as it make me feel like it's doing something unlike some other sites.
Hope that helps.

Resources