Elements in beginning or end of an array in Binary Search - search

I have been trying to understand the binary search algorithm, which is quite simple, but one question is bothering me. I might have understood it wrong.
1) When we start searching for an element in an array, we check three scenarios:
i) If the element is in the middle.
ii) If it is greater than the middle element.
iii) If it is less than the middle element.
My question is: I understand this algorithm tries to find the element by dividing the array and using the above checkpoints, but what if the element we are searching for is at the beginning or end of the array? For example, if
a = {12, 14, 15, 16, 17, 18 , 19, 20} ;
and we are looking for the number 12, then why does it have to do all the dividing and checking when it could find it in the first element of the array? Why don't we also check the first and last elements in every binary search iteration instead of only the three scenarios stated above?
Thanks.

Related

Difference between sorted and unsorted array time complexity

So, I'm very new to programming and computer science, and have been trying to understand the concept of time complexity while solving my code. However, when it comes to sorting arrays, I always get confused about a few things:
In my understanding, solving a problem with a sorted array should give the best-case complexity, and an unsorted array the worst case.
What I always get confused about is, how do we actually take advantage of an array being sorted in a problem that involves searching? Meaning, how will this reduce my time complexity? I thought I would have to run the loop the same number of times.
For example, if I have an array and want to find two indices whose values add up to a specific target, will it make a difference in time complexity if the array is sorted or unsorted?
Thanks in advance for helping me out.
Let's take a look at your sample problem: find two numbers whose sum equals a given number.
Let's say you have an unsorted array: [2, 8, 1, 3, 6, 7, 5, 4], and the target is 11.
So you look at the first item, 2, and you know that you have to find the number 9 in the array, if it exists. With the unsorted array, you have to do a linear search to determine if 9 exists in the array.
But if you have a sorted array, [1, 2, 3, 4, 5, 6, 7, 8], you have an advantage. When you see the value 2, you know you need to find 9 in the array. But because the list is sorted, you can use binary search. Instead of having to look at every item in the list, you only have to look at 3 or 4 of them, depending on how the midpoint is rounded:
With the midpoint rounded down, binary search will look at the 4th item, then the 6th, then the 7th and the 8th, and finally determine that 9 isn't in the array.
In short, searching in an unsorted array takes O(n) time: you potentially have to look at every item to find out if what you're looking for is there. A sorted array lets you speed up the search. Instead of having to examine every item, you only have to examine roughly log2(n) items. That makes a huge difference when numbers get large. For example, if your list contains a million items, binary search only has to examine at most 20 of them. Sequential search may have to look at all million.
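To make the probe count concrete, here is a minimal sketch of an iterative binary search that also records which positions it inspects. The function name `binary_search` and the `probes` list are illustrative choices, not something from the answer above, and the midpoint is rounded down, which is only one of several common conventions.

```python
def binary_search(items, target):
    """Search a sorted list; return (index, probes).

    index is -1 when target is absent; probes lists the 0-based
    positions that were inspected, in order.
    """
    lo, hi = 0, len(items) - 1
    probes = []
    while lo <= hi:
        mid = (lo + hi) // 2      # midpoint rounded down (an assumption)
        probes.append(mid)
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1          # target can only be in the upper half
        else:
            hi = mid - 1          # target can only be in the lower half
    return -1, probes


index, probes = binary_search([1, 2, 3, 4, 5, 6, 7, 8], 9)
print(index, probes)  # -1 [3, 5, 6, 7]
```

With this rounding, the search for 9 inspects positions 3, 5, 6 and 7 before giving up; a version that rounds the midpoint up would inspect one position fewer. Either way, the count grows with log2(n) rather than n.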
The advantage of a sorted array for the "target sum" problem is even better: you don't search the array at all. Instead, you start with pointers at the two ends. If the sum is equal to the target, emit that and move both pointers in. If less than the target, increment the lower pointer. Otherwise, decrement the upper pointer. This will find all solutions in O(n) time -- after taking O(n log n) for the sort.
For the case given in the comments, [40, 60, 1, 200, 9, 83, 17] with a target sum of 100, the process looks like this:
Sort array:
[1, 9, 17, 40, 60, 83, 200]
Start your pointers at the ends, 1 + 200
The sum is 201, too large, so decrement the right pointer.
Now looking at 1 + 83. This is too small; increment the left pointer.
Now looking at 9 + 83. This is too small; increment the left pointer.
Now looking at 17 + 83. This is the target; print (17, 83) as a solution
and move *both* pointers.
Now looking at 40 + 60. This is the target; print (40, 60) as a solution
and move *both* pointers.
The pointers have now met (and passed), so you're done.
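The walkthrough above can be sketched in a few lines of Python. This is only an illustration: the name `pairs_with_sum` is made up here, the target of 100 is what the sums in the walkthrough imply, and the sketch assumes the values are distinct (duplicates would need extra handling).

```python
def pairs_with_sum(values, target):
    """Return all pairs in values that add up to target (values assumed distinct)."""
    ordered = sorted(values)              # O(n log n) one-time cost
    lo, hi = 0, len(ordered) - 1
    pairs = []
    while lo < hi:                        # stop once the pointers meet
        total = ordered[lo] + ordered[hi]
        if total == target:
            pairs.append((ordered[lo], ordered[hi]))
            lo += 1                       # move *both* pointers
            hi -= 1
        elif total < target:
            lo += 1                       # sum too small: raise the low end
        else:
            hi -= 1                       # sum too large: lower the high end
    return pairs


print(pairs_with_sum([40, 60, 1, 200, 9, 83, 17], 100))  # [(17, 83), (40, 60)]
```

Each pass through the loop moves at least one pointer, so the scan after sorting is O(n).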
This is a lovely example. In general, sorting the array gives you a variety of options for finding things in the array much faster than checking each element in turn. A simple binary search is O(log n), and there is a variety of ways to tune this for a particular application. At worst, a binary (log base 2) search will work nicely.
However, sorting an arbitrary list costs O(n log n) as overhead; you need to figure this one-time payment into your application's needs. For instance, if your array is some sort of database sorted by a key value (such as a name or ID number), and you have to perform millions of searches from user requests, then it's almost certainly better to sort the database somehow before you do any searches.
If you want a thorough introduction, research "sorting and searching". One excellent reference is a book by that title by Donald Knuth; it's Vol 3 of "The Art of Computer Programming".

Firebase limit_to_last(1) doesn't work as expected

I have data in firebase that is recording temperature like the following:
I will eventually have a number of weeks, the key starts at 1, next week the key will be 2 then 3 etc etc
I wanted to write a query that gives me the data from the last week (The week with the highest numbered key)
I have this line of code in a python script:
rtn = root.child('bedroom').child('weeks').order_by_key().limit_to_last(1).get()
print(rtn)
This is what is printed out:
[None, {'date_time': '2018-06-08 19:38:41.634010', 'temperature': '21'}]
Why is None at the start of the array? Do I assume it is always here? I want to use the data in the second location of the array. But if it isn't always in the second location of the array my code will then break. I thought that query would return an array of size 1.
I think when I was testing I did see the array only being 1 element in size with the json structure as the first element but I cannot confirm this.
When you use numeric keys (like you do), Firebase may interpret your data as an array. To learn more about this, read Best Practices: Arrays in Firebase.
If this array coercion is a problem for your app, I recommend prefixing the number with a fixed string, e.g. "week_1" or even better "week_01" (since that way you can filter on ranges of weeks).
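If the calling code has to cope with either shape in the meantime, one option is to normalize the result before using it. The sketch below is only an assumption about how that could look: `latest_entry` is a hypothetical helper, not part of the Firebase SDK, and it assumes the query result is either a list (possibly padded with None) or a dict whose entries are already in key order.

```python
def latest_entry(result):
    """Return the last non-None entry of a query result, or None if there is none.

    result may be a list (array-coerced, possibly padded with None)
    or a dict keyed by week; dict order is assumed to be key order.
    """
    if isinstance(result, dict):
        values = list(result.values())
    else:
        values = list(result or [])
    entries = [value for value in values if value is not None]
    return entries[-1] if entries else None


rtn = [None, {'date_time': '2018-06-08 19:38:41.634010', 'temperature': '21'}]
print(latest_entry(rtn)['temperature'])  # 21
```

This avoids hard-coding the second position of the list, which is the part of the original script that would break if the shape changed.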

Python: using list comprehension to count first element in list of numbers

I'm trying to teach myself list comprehension in Python, but I find it quite tricky compared to regular loops and it is hard to find good beginner examples of list comprehension.
Using this basic example below, it supplies a list of numbers and asks for sentences generated such as "2 numbers start with 1."
my_list = [232, 379, 985, 384, 129, 197]
2 numbers start with 1
1 number starts with 2
2 numbers start with 3
1 number starts with 9
If I was going to do this in a loop, I might bring back the first digit in each like this and then count them and put them in print statements (this just shows how I might start out in a loop):
for x in range(len(my_list)):
    strList = (str(my_list[x]))
    if strList[0]:
        print(strList[0])
I'm so confused about how to bring back element [0] in list comprehension.
I know there is a sum available in list comprehension, so I'm trying to start like this below to create a count (this isn't right though) and I don't know how to retrieve the first elements back out of this so I can piece together sentences like "2 numbers start with 1":
count = [sum(x) for x in my_list if my_list[0]]
print(count,' numbers start with', start_digit)
Thanks for any help with understanding list comprehension. It looks much better than loops in terms of being more concise so I want to learn it.
Perhaps the reason why you're getting confused here is that this particular problem doesn't seem like something that list comprehension would solve.
If you only need to get the first digits of the items, then list comprehension can do the trick:
start_digits = [str(x)[0] for x in my_list]
Getting the occurrences of each item is a completely different story. You can implement it in a variety of ways, and if you're not against importing modules, you can use collections.Counter to get the occurrence counts.
from collections import Counter
Counter(start_digits)
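Putting the two pieces together, a minimal sketch that produces the sentences from the question might look like this. The variable names `counts`, `noun` and `lines` are illustrative, and sorting by digit is an assumption made so the output matches the order shown in the question.

```python
from collections import Counter

my_list = [232, 379, 985, 384, 129, 197]

# List comprehension: first digit of each number, as a string.
start_digits = [str(x)[0] for x in my_list]

# Counter tallies how often each first digit occurs.
counts = Counter(start_digits)

lines = []
for digit, count in sorted(counts.items()):
    noun = "number starts" if count == 1 else "numbers start"
    lines.append("{} {} with {}".format(count, noun, digit))

print("\n".join(lines))
# 2 numbers start with 1
# 1 number starts with 2
# 2 numbers start with 3
# 1 number starts with 9
```

The list comprehension handles the extraction step; the counting is left to Counter, since a comprehension on its own builds a list rather than a tally.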

Looking for a way to distinguish identical string entries for index use

I am making a function in Python 3.5.2 that reads chemical structures (e.g. CaBr2) and then gives a list with the names of the elements and their coefficients.
The general rundown of how I am doing it: I have a for loop that skips the first letter. Then it appends the previous element when it reaches a capital letter, a number, or the end. I did this with the index of my iteration, and then get the entry with index(iteration)-1 or -2 depending on the specifics. For the given example it would skip C, read a but do nothing, reach B and append the translation of Ca to my name list, and append 1 to my coefficient list.
This works perfectly for structures with unique entries, but with something like CaCl2, the index of the iteration at the second C is not 2 but zero, as index doesn't differentiate between the two. How would I be able to have variables in my function equal to the value at the previous index(es) without running into this problem? Keep in mind that inputs can be of any length, capitalization cannot change, and there could be any number of repeated values.
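One possible direction, offered only as a sketch: `enumerate` yields the actual position of each character during iteration, so repeated letters are not confused the way they are when looking a character up with `index`. The function name `split_formula` is hypothetical, the sketch returns element symbols rather than translated names, and it assumes simple formulas without parentheses.

```python
def split_formula(formula):
    """Split e.g. 'CaCl2' into (['Ca', 'Cl'], [1, 2]).

    Assumes every element starts with a capital letter and that the
    formula contains no parentheses or charges.
    """
    # enumerate() gives the true position of each character, so the
    # second "C" in "CaCl2" is seen at position 2, not 0.
    starts = [pos for pos, char in enumerate(formula) if char.isupper()]

    symbols, counts = [], []
    for start, end in zip(starts, starts[1:] + [len(formula)]):
        chunk = formula[start:end]              # e.g. "Ca" or "Cl2"
        name = chunk.rstrip("0123456789")       # letters only
        number = chunk[len(name):]              # trailing digits, maybe empty
        symbols.append(name)
        counts.append(int(number) if number else 1)
    return symbols, counts


print("CaCl2".index("C"))      # 0, always the first match
print(split_formula("CaCl2"))  # (['Ca', 'Cl'], [1, 2])
```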

Algorithm for approximate search in sorted integer list

Consider an array of integers (assumed to be sorted); I would like to find the array index of the integer that is closest to a given integer in the fastest possible way. In the case where there are multiple possibilities, the algorithm should identify all of them.
Example: consider T=(3, 5, 24, 65, 67, 87, 129, 147, 166), and if the given integer is 144, then the code should identify 147 as the closest integer, and give the array index 7 corresponding to that entry. For the case of 66, the algorithm should identify 65 and 67.
Are there O(1) or at least O(log N) algorithms to do this? The direct search algorithm (binary-search, tree-search, hashing etc.) implementation won't work since those would require perfect matching. Is there any way these can be modified to handle approximate search?
I am developing a C code.
Thanks
Do binary search until you get down to a single element.
If there is a match, walk along your neighbors to find other matches.
If there is no match, look at your immediate neighbors to find the closest match.
Properly implemented binary-search should do the trick -- as long as you identify the moment where your search range decreased to two items only. Then you just pick the closest one. Complexity: O(log n).
I know this is really old, but for other people looking for an answer:
A regular binary search for a target value will of course return -1 if the target was not found.
BUT in this case, Low/Left ends up at the index where the target would have been positioned in the sorted list.
So in this example, the value of Low at the end of the search will be 7.
This means that if 144 were actually inside the array, 147 would be to its right and 129 would be to its left. All that's left to do is check which of 147 and 129 is closer to the target, and return it.
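The question asks about C, but the idea is easy to sketch in Python, where the standard library's `bisect_left` returns the same insertion index that Low ends up at. The function name `closest_indices` is an illustrative choice, and the sketch assumes the values are sorted and distinct.

```python
from bisect import bisect_left


def closest_indices(sorted_values, target):
    """Return the indices of the value(s) closest to target.

    Assumes sorted_values is sorted ascending with distinct entries.
    """
    if not sorted_values:
        return []
    # Insertion point: the index where target would sit in the sorted list.
    pos = bisect_left(sorted_values, target)
    # Only the neighbours on either side of that point can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(sorted_values)]
    best = min(abs(sorted_values[i] - target) for i in candidates)
    return [i for i in candidates if abs(sorted_values[i] - target) == best]


T = (3, 5, 24, 65, 67, 87, 129, 147, 166)
print(closest_indices(T, 144))  # [7]
print(closest_indices(T, 66))   # [3, 4]
```

The binary search is O(log n) and the neighbour check is constant time, so the whole lookup stays O(log n).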
