Complexities of the following - insertion sort, selection sort, merge sort, radix sort - and explain which one is the best sorting algorithm and why?

I don't believe in a 'best' sorting algorithm. It depends on what you want to do. For instance, bubble sort is really easy to implement and would be the best choice if you just want a quick and dirty way of sorting a short array. On the other hand, for larger arrays, the time complexity really comes into play and you will notice a considerable runtime difference. If you really value memory, then you probably want to compare the space complexities as well.
So the short answer is: IMHO, there's no best sorting algorithm. I'll leave the following table for you to evaluate for yourself what you want to use.
Sorting Algorithm    Avg. Time Complexity    Space Complexity
Quicksort            O(n log n)              O(log n)
Mergesort            O(n log n)              O(n)
Insertion sort       O(n^2)                  O(1)
Selection sort       O(n^2)                  O(1)
Radix sort           O(nk)                   O(n + k)
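To make the table concrete, here is a minimal insertion sort sketch in Python (assuming a plain list of comparable items); the nested loops are where the O(n^2) time comes from, and the in-place shifting is why the extra space stays O(1):

    def insertion_sort(a):
        # Worst-case O(n^2) comparisons/shifts, O(1) extra space:
        # elements are only shifted within the input list itself.
        for i in range(1, len(a)):
            key = a[i]
            j = i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]  # shift larger elements one slot to the right
                j -= 1
            a[j + 1] = key
        return a

    print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]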

Related

When is it good to use the KMP algorithm?

I understand that the KMP algorithm depends on a helper (failure) array that records where prefixes of the pattern also appear as suffixes.
It won't be efficient when the above condition is not fulfilled, i.e. when the helper array contains all zeroes.
Would the runtime still be O(m + n)?
If I am right, what is a better substring algorithm in this case?
To understand when KMP is a good algorithm to use, it's often helpful to ask the question "what's the alternative?"
KMP has the nice advantage that it is guaranteed worst-case efficient. The preprocessing time is always O(n), and the searching time is always O(m). There are no worst-case inputs, no probability of getting unlucky, etc. In cases where you are searching for a very long pattern (large n) inside a really huge string (large m), this may be highly desirable compared to other algorithms like the naive one (which can take time Θ(mn) in bad cases), Rabin-Karp (pathological inputs can take time Θ(mn)), or Boyer-Moore (the worst case can be Θ(mn)). You're right that KMP might not be all that necessary in the case where there aren't many overlapping parts of the string, but the fact that you never need to worry about hitting a bad case is definitely a nice thing to have!
KMP also has the nice property that the processing can be done a single time. If you know you're going to search for the same substring lots and lots of times, you can do the O(n) preprocessing work once and then have the ability to search in any length-m string you'd like in time O(m).
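To illustrate the two phases the answer describes, here is a rough KMP sketch in Python (the function names and failure-table layout are my own, not from the original answer): build_failure is the O(n) preprocessing step, kmp_search is the O(m) scan.

    def build_failure(pattern):
        # O(n) preprocessing: fail[i] is the length of the longest proper
        # prefix of pattern[:i+1] that is also a suffix of it.
        fail = [0] * len(pattern)
        k = 0
        for i in range(1, len(pattern)):
            while k > 0 and pattern[i] != pattern[k]:
                k = fail[k - 1]
            if pattern[i] == pattern[k]:
                k += 1
            fail[i] = k
        return fail

    def kmp_search(text, pattern):
        # O(m) scan: a text character is never re-examined after a mismatch.
        fail, k, hits = build_failure(pattern), 0, []
        for i, c in enumerate(text):
            while k > 0 and c != pattern[k]:
                k = fail[k - 1]
            if c == pattern[k]:
                k += 1
            if k == len(pattern):
                hits.append(i - k + 1)  # match starts at index i - k + 1
                k = fail[k - 1]
        return hits

    print(kmp_search("abababca", "abab"))  # [0, 2]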

How to sort strings in dictionary order?

I have a file of strings and I have to sort them in dictionary order in O(n log n) time or less. I know the standard sorting algorithms and have applied them to numbers, but I have no idea how to sort strings using quicksort or any other sorting algorithm.
Please describe the algorithms, not built-in methods.
For strings, the common suggestion is radix sort.
Its behaviour depends strongly on the alphabet the strings are built from, and its time complexity is O(kn), where n is the number of keys and k is the average key length. Note that it can be misleading to compare this directly with O(n log n) (where n is the number of input elements).
So the smaller k is, the better the radix-sort approach works; that is, it is more effective for shorter keys and smaller radixes. I'll just quote an extended explanation (no need to rephrase it):
The topic of the efficiency of radix sort compared to other sorting algorithms is somewhat tricky and subject to quite a lot of misunderstandings. Whether radix sort is equally efficient, less efficient or more efficient than the best comparison-based algorithms depends on the details of the assumptions made. Radix sort efficiency is O(d·n) for n keys which have d or fewer digits. Sometimes d is presented as a constant, which would make radix sort better (for sufficiently large n) than the best comparison-based sorting algorithms, which all need O(n·log(n)) comparisons. However, in general d cannot be considered a constant. In particular, under the common (but sometimes implicit) assumption that all keys are distinct, d must be at least of the order of log(n), which gives at best (with densely packed keys) a time complexity of O(n·log(n)). That would seem to make radix sort at most equally efficient as the best comparison-based sorts (and worse if keys are much longer than log(n)).
Also, this algorithm uses additional space (its space complexity is O(k + n)), so you should be aware of that, in contrast to in-place comparison sorts such as quicksort or heapsort, which need only O(log n) or O(1) extra space.
You look at the numerical (ASCII) values of the letters and use them to determine the order.
So you compare two words from left to right: if the first letters match, you compare the second letters, and so on.
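As a concrete illustration of the radix-sort suggestion above, here is a hedged MSD (most-significant-digit-first) radix sort sketch in Python, assuming ASCII keys; it recurses on one character position at a time, which is also the left-to-right comparison idea from the second answer:

    def msd_radix_sort(strings, depth=0):
        # Sort ASCII strings in dictionary order; roughly O(k * n) for n keys
        # of average length k, plus O(n + alphabet) extra space per level.
        if len(strings) <= 1:
            return strings
        # Strings that end at this depth come first ("app" before "apple").
        done = [s for s in strings if len(s) <= depth]
        buckets = [[] for _ in range(256)]  # one bucket per ASCII code
        for s in strings:
            if len(s) > depth:
                buckets[ord(s[depth])].append(s)
        result = done
        for bucket in buckets:
            result.extend(msd_radix_sort(bucket, depth + 1))
        return result

    print(msd_radix_sort(["banana", "apple", "band", "bandana", "app"]))
    # ['app', 'apple', 'banana', 'band', 'bandana']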

Does the most efficient solution to some problems require mutable data?

I've been dabbling in Haskell - so still very much a beginner.
I've been thinking about counting the frequency of items in a list. In languages with mutable data structures, this is typically solved using a hash table - a dict in Python or a HashMap in Java, for example. The complexity of such a solution is O(n), assuming the hash table can fit entirely in memory.
In Haskell, there seem to be two (mainstream) choices - to sort the data then group and count it or use a Data.Map. If a sort is used, it dominates the run-time of the solution, so the complexity is O(n log n). Likewise, Data.Map uses a balanced tree, so inserting n elements into it will also have complexity O(n log n).
If my analysis is correct, then I assume that this particular problem is most efficiently solved by resorting to a mutable data structure. Are there other types of problems where this is also true? How in general do people using Haskell approach something like this?
Whether we can implement every algorithm with optimal complexity in a pure language is currently an open question. Nicholas Pippenger has proven that there is a problem that must necessarily incur a log(n) penalty in a pure strict language compared to the optimal algorithm. However, there is a follow-up paper which shows that this problem has an optimal solution in a lazy language. So at the end of the day we really don't know, though it seems that most people think there is an inherent log(n) penalty for some problems, even in lazy languages.
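For reference, the two approaches the question contrasts look roughly like this in Python (a sketch, not Haskell: the dict version is the mutable O(n) hash-table baseline, the sort-and-group version mirrors the O(n log n) sort-then-group idea):

    from itertools import groupby

    def count_with_dict(xs):
        # Mutable hash table: expected O(n).
        counts = {}
        for x in xs:
            counts[x] = counts.get(x, 0) + 1
        return counts

    def count_by_sorting(xs):
        # Sort, then group equal neighbours: O(n log n), dominated by the sort.
        return {k: len(list(g)) for k, g in groupby(sorted(xs))}

    data = ["a", "b", "a", "c", "b", "a"]
    print(count_with_dict(data))   # {'a': 3, 'b': 2, 'c': 1}
    print(count_by_sorting(data))  # {'a': 3, 'b': 2, 'c': 1}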

QuickSelect vs. linear search

I am wondering why QuickSelect is supposed to be such a well-performing algorithm for finding an arbitrary element out of an n-sized, unsorted set. I mean, if you go through all elements one by one until you find the desired one, it takes O(n) comparisons - that's as good as QuickSelect's best case, and it's much easier.
Am I missing something essential here? Is there a case where QuickSelect performs better than linear search?
QuickSelect is, on average, the better choice for finding the k-th smallest (or largest) item in an unsorted array - a different problem from checking whether a given value is present, which is what linear search answers.
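For concreteness, a minimal QuickSelect sketch in Python (random pivot, so average O(n) but worst case O(n^2); this version trades in-place partitioning for readability):

    import random

    def quickselect(a, k):
        # Return the k-th smallest element of a (k = 0 gives the minimum).
        # On average each recursion keeps only one partition, giving O(n) total.
        pivot = random.choice(a)
        lows = [x for x in a if x < pivot]
        pivots = [x for x in a if x == pivot]
        highs = [x for x in a if x > pivot]
        if k < len(lows):
            return quickselect(lows, k)
        if k < len(lows) + len(pivots):
            return pivot
        return quickselect(highs, k - len(lows) - len(pivots))

    print(quickselect([7, 1, 5, 3, 9], 2))  # 5, the third-smallest element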

Dynamic Programming: top down versus bottom up comparison

Can you point me to some dynamic programming problem statements where bottom up is more beneficial than top down? (i.e. simple DP works more naturally but memoization would be harder to implement?)
I find recursion with memoization much easier, and want to solve problems where bottom up is a better/perhaps only feasible approach.
I understand that theoretically both are equivalent, so even something like ease of implementation would count as a benefit.
You will apply bottom-up tabulation or top-down recursion with memoization depending on the problem at hand.
For example, if you have to find the maximum-weight independent set of a path graph, you will use the bottom-up approach, since you have to solve every possible subproblem anyway.
But if you have to solve the knapsack problem, you may want to use top-down recursion with memoization, since only a limited number of subproblems actually need to be solved. Approaching the knapsack problem bottom-up makes the algorithm solve many subproblems that the original problem never uses.
Two things to consider when deciding which approach to use:
Time complexity. Both approaches have the same asymptotic time complexity in general, but because a plain loop is cheaper than recursive function calls, bottom-up can be faster when measured in machine time.
Space complexity (not counting the extra call-stack allocations of top-down). Usually both approaches need to build a table of all sub-solutions, but because bottom-up follows a topological order, its auxiliary space can sometimes be reduced to the size of the problem's immediate dependencies. For example, with fibonacci(n) = fibonacci(n-1) + fibonacci(n-2), we only need to store the past two results; see the sketch after this answer.
That being said, bottom-up is not always the best choice; I will try to illustrate with examples:
(mentioned by #Nikunj Banka) Top-down only solves the sub-problems your solution actually uses, whereas bottom-up might waste time on redundant sub-problems. A silly example would be 0-1 knapsack with 1 item...the run-time difference is O(1) vs O(weight).
You might need to perform extra work to obtain a topological order for bottom-up. In Longest Increasing Path in a Matrix, if we want to process sub-problems after their dependencies, we would have to sort all entries of the matrix in descending order; that's an extra O(nm·log(nm)) pre-processing step before the DP.
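To make the Fibonacci remark above concrete, here is a small sketch of both styles in Python (assuming fib(0) = 0 and fib(1) = 1); the bottom-up version keeps only its two immediate dependencies, so its auxiliary space is O(1):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib_top_down(n):
        # Memoized recursion: O(n) time, O(n) space for the cache and call stack.
        if n < 2:
            return n
        return fib_top_down(n - 1) + fib_top_down(n - 2)

    def fib_bottom_up(n):
        # Tabulation in topological order, keeping only the last two values.
        prev, cur = 0, 1
        for _ in range(n):
            prev, cur = cur, prev + cur
        return prev

    print(fib_top_down(10), fib_bottom_up(10))  # 55 55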
