Time to find the largest element in a Linked list - search

I have an ordered linked list. I want to know the time to find the max element in both cases:
If I maintain a tail pointer to the end of the linked list, and
If I do not do so.

For an ordered linked list:
O(1) if the list is ordered from max to min, since the first node is already the maximum.
O(n) if the list is ordered from min to max, because the maximum is the last node and you have to traverse the whole list to reach it. This drops to O(1) if you also maintain a tail pointer: you just return whatever the tail pointer points to.
For an unordered linked list:
Always O(n), whether or not you have a tail pointer. For an unordered linked list, you have to traverse every node to determine which one is the maximum, and a tail pointer cannot help with that.
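A minimal sketch of all three cases in Python (Node, find_max, and the order flag are hypothetical names, assuming a singly linked list):

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def find_max(head, tail=None, order=None):
    # order is "desc", "asc", or None for an unordered list.
    if order == "desc":
        return head.value        # O(1): the head holds the max
    if order == "asc" and tail is not None:
        return tail.value        # O(1): the tail holds the max
    best = head.value            # O(n): must visit every node
    node = head.next
    while node is not None:
        best = max(best, node.value)
        node = node.next
    return best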

Can I write a hash function say `hashit(a, b, c)` taking no more than O(N) space

Given integers a,b,c such that
-N<=a<=N,
0<=b<=N,
0<=c<=10
can I write a hash function, say hashit(a, b, c), taking no more than O(N) address space?
My naive thought was to write it as
a + 2N*b + 10*2N*N*c
but that's like O(20N*N) space, so it won't suffice for my need.
Let me elaborate on my use case: I want the tuple (a, b, c) as the key of a hashmap. Basically, a, b, c are arguments to a function whose results I want to memoize. In Python, @lru_cache does this perfectly without any issue for N = 1e6, but when I try to write the hash function myself I get a memory overflow. So how does Python do it?
I am working with N on the order of 10^6.
This code works:
from functools import lru_cache

@lru_cache(maxsize=None)
def myfn(a, b, c):
    # some logic
    return 100
But if I write the hash function myself like this, it doesn't. So how does Python do it?
def hashit(a, b, c):
    return a + 2*N*b + 2*N*N*c

def myfn(a, b, c):
    if hashit(a, b, c) in myhashtable:
        return myhashtable[hashit(a, b, c)]
    # some logic
    myhashtable[hashit(a, b, c)] = 100
    return myhashtable[hashit(a, b, c)]
To directly answer your question of whether it is possible to find an injective hash function from a set of size Θ(N^2) to a set of size O(N): it isn't. The very existence of an injective function from a finite set A to a set B implies that |B| >= |A|. This is similar to trying to give a unique number out of {1, 2, 3} to each member of a group of 20 people.
However, do note that hash functions oftentimes have collisions; the hash tables that employ them simply have a method for resolving those collisions. As one simple example for clarification, you could hold an array such that every possible output of your hash function is mapped to an index of this array, and at each index you have a list of elements (so an array of lists, where the array is of size O(N)); in the case of a collision, simply go over all elements in the matching list and compare them (not their hashes) until you find what you're looking for. This is known as chain hashing or chaining.

Some rare manipulations (re-hashing) of the hash table, based on how populated it is (measured through its load factor), can ensure an amortized time complexity of O(1) for element access. This could over time increase your space complexity if you actually try to hold ω(N) values, but note that this is unavoidable: you can't use less space than Θ(X) to hold Θ(X) values without any extra information. (For instance: if you hold 1,000,000 unordered elements where each is a natural number between 1 and 10, then you could simply hold ten counters; but in your case you describe a whole possible set of elements of size 11*(N+1)*(2N+1), so Θ(N^2).)
This method would, however, ensure a space complexity of O(N+K) (equivalent to O(max{N,K})) where K is the amount of elements you're holding; so long as you aren't trying to simultaneously hold Θ(N^2) (or however many you deem to be too many) elements, it would probably suffice for your needs.
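To make chaining concrete, here is a toy sketch in Python (ChainedHashMap and all other names are hypothetical, not production code): it allocates O(N) buckets, uses the tuple (a, b, c) itself as the key, and resolves collisions by comparing keys rather than hashes.

class ChainedHashMap:
    def __init__(self, n_buckets):
        # O(N) space for the bucket array; each bucket is a chain.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Many keys may map to the same bucket; that's the collision.
        return self.buckets[hash(key) % len(self.buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:              # compare keys, not hashes
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

myhashtable = ChainedHashMap(10**6)
myhashtable[(3, 5, 7)] = 100          # the tuple itself is the key

This is also essentially what Python does: @lru_cache stores results in a dict keyed by the argument tuple, the dict hashes that tuple and resolves collisions internally (CPython uses open addressing rather than chaining), and its memory grows with the number of entries actually inserted rather than with the size of the whole (a, b, c) key space. That is why @lru_cache works for N = 1e6 while an injective hashit cannot.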

What is the time/space complexity of zip(*a[::-1])?

If a is a List[List[int]] representing an n×n matrix, what is the time and space complexity of a = zip(*a[::-1])?
My thoughts
Time complexity has to be at least O(n^2) because we touch n^2 elements. I guess that every element is touched exactly twice (reversing/flipping with a[::-1] and transposing with zip(*reversed)).
list(zip(*a[::-1])) returns a copy, so the space complexity would be at least n^2. But zip returns an iterator, hence I'm not sure.
I'm especially uncertain about the space complexity, because I read that zip(*inner) unpacks the inner iterables (source). My guess is that, in addition to the input, it stores O(n) for the pointers to the n inner "unpacked" iterators. But I'm very uncertain about that.
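A small concrete check (a hypothetical 3×3 matrix), assuming the expression is the usual 90°-clockwise rotation idiom:

a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

rows_reversed = a[::-1]        # O(n) extra: new list of n row references
rotated = zip(*rows_reversed)  # lazy: holds n row iterators, O(n) extra
print(list(rotated))           # materializing touches all n^2 elements:
                               # [(7, 4, 1), (8, 5, 2), (9, 6, 3)]

So the zip object itself only costs O(n) extra space (the reversed list of row references plus one iterator per row); the O(n^2) time and space show up only once you consume it, e.g. with list().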

How is heapq.heappushpop more efficient than heappop and heappush in python

In the docs for heapq, it's written that
heapq.heappushpop(heap, item)
Push item on the heap, then pop and return the smallest item from the heap. The combined action runs more efficiently than heappush() followed by a separate call to heappop().
Why is it more efficient?
Also, is it considerably more efficient?
heappop pops the first element, then moves the last element into the first position, then does a sinking operation (sift-down), moving that element down through consecutive exchanges until the heap property is restored.
That is O(log n).
Then heappush places the new element in the last position and bubbles it up (sift-up), like heappop but in reverse.
That is another O(log n).
heappushpop, on the other hand, pops the first element but, instead of moving the last element to the top, places the new element at the top and then does the same sinking operation, which is almost exactly the operation heappop performs.
That is just one O(log n).
So although both approaches are O(log n) overall, it is easy to see that heappushpop is faster than heappop followed by heappush.
heappushpop pushes an element and then pops the smallest element. If the element you're pushing is smaller than the heap's minimum, then there's no need to do any heap operations at all: we already know that the element we're trying to push (being smaller than the heap's minimum) is exactly the one that would be popped if we did it as two separate operations.
This is efficient, isn't it?
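A minimal sketch of that logic using heapq's public API (pushpop_sketch is a hypothetical name; CPython implements heappushpop natively, but the structure is the same):

import heapq

def pushpop_sketch(heap, item):
    # If the incoming item is <= the current minimum, it would be
    # popped straight back out, so return it without touching the heap.
    if heap and heap[0] < item:
        # Otherwise replace the root with the new item and sift down
        # once: a single O(log n) pass instead of two.
        return heapq.heapreplace(heap, item)
    return item

h = [1, 3, 5]
print(pushpop_sketch(h, 0))  # 0: smaller than the min, heap untouched
print(pushpop_sketch(h, 4))  # 1: root replaced by 4, one sift-down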

fast, semi-accurate sort in linux

I'm going through a huge list of files in Linux, the output of a "find" (directory walk). I want to sort the list by filename, but I'd like to begin processing the files as soon as possible.
I don't need the sort to be 100% correct.
How can I do a "partial sort", that might be off some of the time but will output quickly?
This is StackOverflow, not SuperUser, so an algorithm answer should be enough for you.
Try implementing HeapSort. But instead of sorting the full list of names, do the following.
1. Pick a constant M. The smaller it is, the more "off" the result will be and the faster the algorithm will start printing results. In the limiting case where M is equal to the number of all names, it is an exact sorting algorithm.
2. Load the first M elements and heapify() them.
3. Take the lowest element from the heap and print it. Put the next unsorted name into its place, then do siftDown(). Repeat until you run out of unsorted names.
4. Do a standard HeapSort on the elements left in the heap.
This algorithm will be linear in the number of names and will start printing them as soon as the first M names have been read. Step 2 is O(M) == O(1). Step 3 is O(log M) == O(1) and is repeated O(N) times, hence the total is O(N).
The algorithm will try to keep the large elements in the heap as long as possible while pushing the lowest elements out of the heap as quickly as possible, so the output will look almost sorted.
IIRC, a variant of this algorithm (known as replacement selection) is actually what GNU sort does before switching to on-disk MergeSort, in order to keep sorted runs of data as long as possible and minimize the number of on-disk merges.
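A Python sketch of the whole procedure using heapq (semi_sort and m are hypothetical names; m plays the role of M):

import heapq
import sys

def semi_sort(names, m=1024):
    it = iter(names)
    heap = []
    for name in it:              # step 2: load the first M elements...
        heap.append(name)
        if len(heap) >= m:
            break
    heapq.heapify(heap)          # ...and heapify them, O(M)
    for name in it:
        # Step 3: emit the current minimum and put the next unread
        # name in its place with a single O(log M) sift-down.
        yield heapq.heapreplace(heap, name)
    while heap:                  # step 4: drain what's left, sorted
        yield heapq.heappop(heap)

# Pipe the output of find into this, e.g. `find . | python semi_sort.py`.
for name in semi_sort(sys.stdin, m=1024):
    print(name, end="")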

finding element in very big list in less than O(n)

I want to check whether an element exists in a list (a very big one, on the order of 10,000,000 elements) in O(1) instead of O(n). elem x ys on lists takes O(n).
So I want to use another data type/constructor, but it has to be in the Prelude (not Array); any suggestions? And if I have to build my own data type, what should it look like?
I also want to sort a big list of numbers of the same order (10,000,000) and index into an element in the shortest time possible.
The only way to search for an item in a data set in O(1) time is if you already know where it is, but then you don't need to search for it. For unsorted data, search is O(n) time. For sorted data, search is O(log n) time.
You should use either a Bloom filter or a hash table. Neither of them is in the Prelude; moreover, both rely on Array to be available.
The only option left is some kind of tree; I would suggest a heap. It's not hard to implement, and it also gives you sorting for free.
UPDATE: oops! I had forgotten that a heap doesn't provide lookup. A BST is your choice, then.
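For illustration only, here is a sketch of the tradeoffs the answers describe (in Python rather than Haskell, to match this document's other examples; the data and names are hypothetical):

import bisect

data = list(range(0, 20_000_000, 2))   # hypothetical big sorted list
target = 9_999_998

# O(n): linear scan, like elem x ys on a Haskell list.
found_linear = target in data

# O(log n): binary search on sorted data, like a balanced BST lookup.
i = bisect.bisect_left(data, target)
found_bst_like = i < len(data) and data[i] == target

# Expected O(1): hash-based membership, the hashtable suggestion.
found_hash = target in set(data)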
