Add numbers to the beginning of lists - python-3.x

I have a list of lists, say X_train, that looks like this
X_train = [[4,3,1,5], [3,1,6,2], [5,0,49,4], ... , [3,57,3,3]]
I wrote this piece of code
for x in range(0, len(X_train)):
    X_train[x].insert(0, x+1)
For each list in X_train this code inserts the list's index plus 1 at the beginning of that list. That is, running
for x in range(0, len(X_train)):
    X_train[x].insert(0, x+1)
print(X_train)
will produce the following output:
[[1,4,3,1,5],[2,3,1,6,2],[3,5,0,49,4],...,[n,3,57,3,3]]
where n is the number of lists in X_train.
Question: Is there a faster way to do this? I would like to be able to do this for very large lists, e.g. list with millions of sublists (if that's possible).

This is faster in my testing:
X_train = [[n, *l] for n, l in enumerate(X_train, 1)]
Both versions end up copying every element, but the comprehension builds each new list in a single step instead of shifting elements around after an insert, so it saves a constant-factor overhead per list.

To my knowledge, the standard insert method in Python has a time complexity of O(n). Given your current implementation, your algo would have a time complexity of O(m x n) where m is the number of sublists and n is the number of elements in the sublists (I assume here that the number of sublist elements is always the same).
You could use blist instead of the standard lists which has a time complexity of O(log n) for insertions. This means the total time reduces to O(m x log n). It's not that much of an improvement, though.
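For illustration, a minimal sketch of the blist variant (this assumes the third-party blist package, installed with pip install blist; its blist type mirrors the built-in list API):
from blist import blist  # third-party package; not in the standard library

X_train = [blist(l) for l in X_train]  # convert each sublist once
for x in range(len(X_train)):
    X_train[x].insert(0, x + 1)        # O(log n) per insert instead of O(n)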

Related

What's the Big-O notation for this algorithm for printing the prime numbers?

I am trying to figure out the time complexity of the program below.
import math
def prime(n):
    for i in range(2, n+1):
        for j in range(2, int(math.sqrt(i))+1):
            if i % j == 0:
                break
        else:
            print(i)
prime(36)
This program prints the prime numbers up to 36.
My understanding of the above program:
for every i the inner loop runs about sqrt(i) times, and the outer loop runs up to n,
so the Big-O notation is O(n sqrt(n)).
Is my understanding right? Please correct me if I am wrong...
Time complexity measures the increase in the number of steps (basic operations) as the input scales up:
O(1) : constant (hash look-up)
O(log n) : logarithmic in base 2 (binary search)
O(n) : linear (search for an element in unsorted list)
O(n^2) : quadratic (bubble sort)
Determining the exact complexity of an algorithm requires a fair amount of math and algorithms knowledge; any detailed reference on time complexity covers the techniques.
Also keep in mind that these values are considered for very large values of n, so as a rule of thumb, whenever you see nested for loops, think O(n^2).
You can add a steps counter inside your inner for loop and record its value for different values of n, then print the relation in a graph. Then you can compare your graph with the graphs of n, log n, n * sqrt(n) and n^2 to determine exactly where your algorithm is placed.
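For instance, a minimal sketch of that idea (prime_steps is a name introduced here; it instruments the inner loop of the code above so you can record counts for growing n and compare them against n, n log n, n sqrt(n) and n^2):
import math

def prime_steps(n):
    steps = 0
    for i in range(2, n + 1):
        for j in range(2, int(math.sqrt(i)) + 1):
            steps += 1  # one basic operation per inner-loop pass
            if i % j == 0:
                break
    return steps

for n in (100, 1000, 10000, 100000):
    s = prime_steps(n)
    print(n, s, s / (n * math.sqrt(n)))  # compare this ratio against candidate growth rates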

Big-O analysis of permutation algorithm

result = False
def permute(a, l, r, b):
    global result
    if l == r:
        if a == b:
            result = True
    else:
        for i in range(l, r+1):
            a[l], a[i] = a[i], a[l]
            permute(a, l+1, r, b)
            a[l], a[i] = a[i], a[l]
string1 = list("abc")
string2 = list("ggg")
permute(string1, 0, len(string1)-1, string2)
So basically I think that finding each permutation takes n^2 steps (times some constant), and finding all permutations should take n! steps. So does this make it O(n^2 * n!)? And if so, does the n! take over, making it just O(n!)?
Thanks
edit: this algorithm might seem weird for just finding permutations; that is because I'm also using it to test for anagrams between the two strings. I just haven't renamed the function yet, sorry.
Finding each permutation doesn't take O(N^2). Creating each permutation happens in amortized O(1) time. While it is tempting to say this is O(N) because you assign a new element to each index N times per permutation, each permutation shares assignments with other permutations.
When we do:
a[l], a[i] = a[i], a[l]
permute(a, l+1, r, b)
All subsequent recursive calls of permute down the line have this assignment already in place.
In reality, swaps only happen inside the loop iterations of permute, and across the whole recursion there are
Σ_{k=1}^{N-1} N!/k!
iterations (there are N!/(N-d)! calls at recursion depth d, and each call below the root corresponds to one loop iteration of its parent; each iteration performs two swap statements, a constant factor). We can then determine the time complexity to build each permutation using some limit calculus. We take the number of swaps over the total number of permutations as N approaches infinity.
We have:
lim_{N→∞} (Σ_{k=1}^{N-1} N!/k!) / N!
Expanding the sigma:
lim_{N→∞} (N!/1! + N!/2! + ... + N!/(N-1)!) / N!
Dividing each term by N!, the limit of the sum is the sum of the series:
lim_{N→∞} (1/1! + 1/2! + ... + 1/(N-1)!) = e - 1
which is exactly the series for e missing its leading 1. Since our result is a constant, we get that our complexity per permutation is O(1).
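As an empirical sanity check, here is a quick sketch (count_swaps is a name introduced here; it counts both swap statements per loop iteration, so the per-permutation ratio should approach 2(e - 1) ≈ 3.44):
import math

def count_swaps(n):
    swaps = 0
    def permute(a, l, r):
        nonlocal swaps
        if l == r:
            return
        for i in range(l, r + 1):
            a[l], a[i] = a[i], a[l]
            swaps += 1
            permute(a, l + 1, r)
            a[l], a[i] = a[i], a[l]
            swaps += 1
    permute(list(range(n)), 0, n - 1)
    return swaps

for n in range(2, 9):
    print(n, count_swaps(n) / math.factorial(n))  # 2.0, 3.0, 3.33, ... tending to 2*(e - 1)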
However, we're forgetting about this part:
if l == r:
    if a == b:
        result = True
The comparison a == b (between two lists) occurs in O(N). Building each permutation takes amortized O(1), but our comparison at the end, which occurs for each permutation, actually takes O(N). This gives us a time complexity of O(N) per permutation.
That gives you N! permutations times O(N) for each permutation, for a total time complexity of O(N!) * O(N) = O(N * N!).
Your final time complexity doesn't reduce to O(N!), since O(N * N!) grows strictly faster than O(N!), and only constant factors get dropped (the same reason why O(N log N) != O(N)).

I keep timing out calculating the mean of a subset

I have a working block of code, but the online judge on HackerEarth keeps returning a timing error. I'm new to coding, so I don't know the tricks to speed up my code. Any help would be much appreciated!
N, Q = map(int, input().split())
# N is the length of the array, Q is the number of queries
in_list = input().split()
# input is a list of integers separated by a space
array = list(map(int, in_list))
from numpy import mean
means = []
for i in range(Q):
    L, R = map(int, input().split())
    m = int(mean(array[L-1:R]))
    means.append(m)
for i in means:
    print(i)
Any suggestions would be amazing!
You probably need to avoid doing O(N) operations in the loop. Currently both the slicing and the mean call (which has to sum the items in the slice) are that slow. So you need a better algorithm.
I'll suggest that you do some preprocessing on the list of numbers so that you can figure out the sum of the values that would be in the slice (without actually doing a slice and adding them up). By using O(N) space, you can do the calculation of each sum in O(1) time (making the whole process take O(N + Q) time rather than O(N * Q)).
Here's a quick solution I put together, using itertools.accumulate to find a cumulative sum of the list items. I don't actually save the items themselves, as the cumulative sum is enough; note the leading 0 prepended so that the 1-indexed queries need no special case.
from itertools import accumulate

N, Q = map(int, input().split())
# sums[i] holds the sum of the first i values, with a leading 0 so that
# sums[R] - sums[L-1] is the sum of the 1-indexed range L..R
sums = [0] + list(accumulate(map(int, input().split())))
for _ in range(Q):
    L, R = map(int, input().split())
    print((sums[R] - sums[L-1]) // (R - L + 1))  # integer mean, as in the question
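As a quick worked example: with N=5, Q=1, values 1 2 3 4 5 and the query L=2, R=4, sums becomes [0, 1, 3, 6, 10, 15], so the answer is (sums[4] - sums[1]) // (4 - 2 + 1) = (10 - 1) // 3 = 3, the mean of 2, 3, 4.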

Generating triangular numbers using iteration in Haskell

I am trying to write a function in Haskell to generate triangular numbers. I am not allowed to use recursion; I am supposed to use iteration.
Here is my code ...
triSeries 0 = [0]
triSeries n = take n $ iterate (\x -> 0 + x) 1
I know that my function after iterate is wrong.
I have been looking for the right function for hours; any hint, please?
Start by writing out some triangular numbers
T(1) = 1
T(2) = 1 + 2
T(3) = 1 + 2 + 3
An iterative process to generate T(n) is to start from [1..n], repeatedly take the first element of the list, and add it to a running total. In a language with mutable state, you might write:
def tri(n):
    total = 0
    for x in range(1, n + 1):
        total += x
    return total
In Haskell, you can iteratively consume a list of numbers and accumulate state via a fold function (foldl, foldr, or some variant). Hopefully that's enough to get started with.
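For example, one shape this can take (a sketch using scanl1, the fold variant that keeps every intermediate running total):
triSeries :: Int -> [Int]
triSeries n = take n (scanl1 (+) [1..])  -- running sums of 1,2,3,... are T(1), T(2), ...
so that triSeries 5 evaluates to [1,3,6,10,15].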
Maybe Wikipedia could be a hint; it gives the closed form
triangular :: Int -> Int
triangular x = x * (x + 1) `div` 2
triSeries could be something like
triSeries :: Int -> [Int]
triSeries x = map triangular [1..x]
and it works like this:
> triSeries 10
[1,3,6,10,15,21,28,36,45,55]
Talking about iterate: maybe there is some way to use it here, but as John said, foldl would be sufficient. Take a look at this page; what you are looking for is at the very beginning.
It is not clear what is meant by "recursion is not allowed, use iteration": all functions that appear to be "iterative" are recursive inside.
iterate, in all your uses, can only modify the input by a constant, and iterate (+1) 1 is the same as [1..]. Consider using a Data.List function that can combine a number from the infinite range [1..] with the previously computed sum to produce an infinite list of such sums:
T_i = i + T_{i-1}
This is definitely cheaper than x * (x + 1) `div` 2.
Consider also using a Data.List function that can produce an infinite list of finite lists of sums from an infinite list of sums. This is going to be cheaper than computing a list of 10, then a list of 11 repeating the same computation done for the list of 10, and so on.
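For the record, one possible reading of those hints (a sketch; sums and allPrefixes are names introduced here — scanl builds the running sums, inits the prefix lists):
import Data.List (inits)

sums :: [Int]
sums = scanl (+) 1 [2..]         -- T_i = i + T_{i-1}: 1, 3, 6, 10, ...

allPrefixes :: [[Int]]
allPrefixes = tail (inits sums)  -- [[1], [1,3], [1,3,6], ...], produced lazily

triSeries :: Int -> [Int]
triSeries n = take n sums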

Haskell function taking a long time to process

I am doing question 12 of Project Euler, where I must find the first triangle number with 501 divisors. So I whipped up this in Haskell:
divS n = [x | x <- [1..n], n `rem` x == 0]
tri n = (n * (n + 1)) `div` 2
divL n = length (divS (tri n))
answer = [x | x <- [100..], 501 == divL x]
The first function finds the divisors of a number.
The second function calculates the nth triangle number.
The third function finds the number of divisors of the nth triangle number.
The fourth function should find the values of n whose triangle number has 501 divisors.
But so far this has run for a while without returning a result. Is the answer very large, or do I need some serious optimisation to make this work in a realistic amount of time?
You need to use properties of divisor function: http://en.wikipedia.org/wiki/Divisor_function
Notice that n and n + 1 are always coprime, so you can get d(n * (n + 1) / 2) by multiplying previously computed values.
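A sketch of that coprimality trick (divisorsOfTri is a name introduced here; the helper d is just the question's divS-style count, now applied to two numbers about half the size instead of to the full triangle number):
-- n and n+1 share no factors, so after halving the even one,
-- d(T(n)) is the product of the divisor counts of the two parts
divisorsOfTri :: Int -> Int
divisorsOfTri n
  | even n    = d (n `div` 2) * d (n + 1)
  | otherwise = d n * d ((n + 1) `div` 2)
  where
    d m = length [x | x <- [1..m], m `rem` x == 0]  -- naive count, as in divS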
It is probably faster to prime-factorise the number and then use the factorisation to find the divisors, than using trial division with all numbers <= sqrt(n).
The Sieve of Eratosthenes is a classical way of finding primes, which may be modified slightly to find the number of divisors of each natural number. Instead of just marking each non-prime as "not prime", you could make a list of all the primes dividing each number.
You can then use those primes to compute the complete set of divisors, or just the number of them, since that is all you need.
Another variation would be to mark not just multiples of primes, but multiples of all natural numbers. Then you could simply use a counter to keep track of the number of divisors for each number.
You also might want to check out The Genuine Sieve of Eratosthenes, which explains why trial division is way slower than the real sieve.
Last off, you should look carefully at the different kinds of arrays in Haskell. I think it is probably easier to use the ST monad to implement the sieve, but it might be possible to achieve the correct complexity using accumArray, if you can make sure that your update function is strict. I have never managed to get this to work though, so you are on your own here.
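For what it's worth, a sketch of that divisor-count sieve using the lazy accumArray (divisorCounts is a name introduced here; as noted above, element strictness may become a concern for large bounds):
import Data.Array (accumArray, elems)

-- divisorCounts m: the number of divisors of each of 1..m,
-- found by adding 1 at every multiple of every d <= m
divisorCounts :: Int -> [Int]
divisorCounts m = elems $ accumArray (+) 0 (1, m)
    [(mult, 1) | d <- [1 .. m], mult <- [d, 2 * d .. m]]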
If you were using C instead of Haskell, your function would still take a long time.
To make it faster you will need to improve the algorithm, using suggestions from the answers above. I suggest changing the title and question description accordingly; following that, I'll delete this comment.
If you wish, I can spoil the problem by sharing my solution.
For now I'll give you my top-level code:
main =
    print .
    head . filter ((> 500) . length . divisors) .
    map (figureNum 3) $ [1..]
The algorithmic improvement lies in the divisors function. You can further improve it using rawicki's suggestion, but already this takes less than 100ms.
Some optimization tips:
check for divisor candidates only between 1 and sqrt(n); every divisor above that limit is n/d for some divisor d below it, so they come in pairs and you never need to test them directly.
don't build a list of divisors and count the list, but count them directly.
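Putting both tips together, a sketch (divCount is a name introduced here; it would stand in for length (divS n) in the question's code):
divCount :: Int -> Int
divCount n = sum [if x * x == n then 1 else 2  -- each x <= sqrt n pairs with n `div` x
                 | x <- takeWhile (\x -> x * x <= n) [1..]
                 , n `rem` x == 0]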

Resources