output of "s = "0123456789" print(s[2:-1:-1])" slice operator - python-3.x

s = "0123456789"
print(s[2:-1:-1])
I expected the output of the code above to be "210", but it prints nothing.
Please explain why.

Syntax:
sequence[start:stop[:step]]
start:
Optional. Index at which the slice starts (inclusive). Defaults to 0 for a positive step, or to the last index for a negative step.
stop:
Optional. Index at which the slice ends (exclusive). Defaults to len(sequence) for a positive step, or to just before the first item for a negative step.
step:
Optional. Extended slice syntax. Step value of the slice. Defaults to 1.
+---+---+---+---+
|-4 |-3 |-2 |-1 | <= negative indexes
+---+---+---+---+
| A | B | C | D | <= sequence elements
+---+---+---+---+
| 0 | 1 | 2 | 3 | <= positive indexes
+---+---+---+---+
|<- 2:-1:-1 ->| <= extent of the slice: "ABCD"[2:-1:-1] (won't work)
Explanation:
In my example, "ABCD"[2:-1:-1] reads as follows:
Start from index 2 (include that item).
Go up to index -1 (exclude that item), which is the last item, index 3, as you can see in the table above.
Use a step of -1, which means moving in the reverse direction. Moving backwards from index 2 can never reach index 3, so the slice is empty.
So the solution is "ABCD"[2::-1], as someone correctly answered in the comments. This says: start from index 2 and go to whichever end the step points towards; the step is -1 here, so that is the beginning.
The same applies to your question: print(s[2::-1]) prints 210.
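The explanation above can be checked directly. The sketch below uses the built-in slice.indices() method, which reports the concrete (start, stop, step) that Python resolves a slice to for a given length. The variable names empty and reversed_prefix are only illustrative.

```python
s = "0123456789"

# slice.indices(length) resolves negative and omitted values.
# A stop of -1 becomes index 9 (the last item) for a string of length 10.
print(slice(2, -1, -1).indices(len(s)))    # (2, 9, -1)

# An omitted stop with a negative step resolves to "just before index 0",
# which indices() reports as -1.
print(slice(2, None, -1).indices(len(s)))  # (2, -1, -1)

empty = s[2:-1:-1]          # walks backwards from 2 towards 9: nothing to take
reversed_prefix = s[2::-1]  # walks backwards from 2 to the beginning

print(repr(empty))      # ''
print(reversed_prefix)  # 210
```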

Related

SUMPRODUCT function with IF logic

I am attempting to multiply two arrays, but only where array 2 meets a greater-than-or-equal-to criterion. That criterion is >=243.
| Array 1 | Array 2 |
| 5 | 200 |
| 5 | 240 |
| 5 | 280 |
| 5 | 320 |
I have attempted to use the following formula:
=SUMPRODUCT(--(Program!F4:F8>=(VLOOKUP(Results!$C$10,Start!$B$3:$H$8,4,0)*Results!E22)),Program!E4:E8)
Which is simplified to:
=SUMPRODUCT(--(Program!F4:F8>=243),Program!E4:E8)
This returns the number 10, which I assume is because it returns the true values as 1, and then are multiplied by 5 and summed.
How may I fix this to return 3000?
This is the same as Variatus's post except it avoids the, in my opinion, odd choice to multiply the parameters inside the sumproduct function, which handles the multiplication. I've also explicitly converted the first logical array to number.
=SUMPRODUCT(N(F2:F5>=243),E2:E5,F2:F5)
You're almost there. Just add the second array (the one you are testing) to your formula once more, this time as a factor:
=SUMPRODUCT((F2:F5>=243)*(E2:E5)*(F2:F5))
(F2:F5>=243) creates an array of TRUE/FALSE values that act as 1 or 0 when multiplied, and that result must still be multiplied by (F2:F5) itself.
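To see where the 10 and the 3000 come from, here is the same arithmetic as a small Python sketch. It is only an illustration of what the two formulas compute, using the sample values from the question; the variable names are made up for this example.

```python
array_1 = [5, 5, 5, 5]
array_2 = [200, 240, 280, 320]
threshold = 243

# Like =SUMPRODUCT(--(F>=243), E): the test yields 1 or 0, so only
# array 1 is summed for the rows that pass.
without_second_array = sum(a for a, b in zip(array_1, array_2) if b >= threshold)

# Like =SUMPRODUCT(N(F>=243), E, F): array 2 is also included as a factor.
with_second_array = sum(a * b for a, b in zip(array_1, array_2) if b >= threshold)

print(without_second_array)  # 10
print(with_second_array)     # 3000
```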

Extract a substring new column based on a substring based on conditions ideally with Pandas

I got a data set (Excel) with hundreds of entries. In one string column there is most of the information. The information is divided by '_' and typed in by humans. Therefore, it is not possible to work with index positions.
To create a usable data basis it's mandatory to extract information from this column in another column.
The search pattern '*v*' alone is not enough, but combined with the condition that the first character has to be a digit it works.
I tried to get it to work with iterrows, iteritems, str.strip, str.extract and many more, but the best result I got was with a for-loop.
pattern = '_*v*_'
test = []
for i in df['col']:
    # Split the string into substrings
    i = i.split('_')
    for c in i:
        if c.find('x') == 1:
            if c[0].isdigit():
                # print(c)
                test.append(c)
        else:
            # To be able to fix a few rows manually
            test.append(0)
[4]: test = ['22v3', '33v55', '4v2']
#Input
+-----------+-----------+
| col | targetcol |
+-----------+-----------+
| as_22v3 | |
| 33v55_bdd | |
| Ave_4v2 | |
+-----------+-----------+
#Output
+-----------+-----------+
| col       | targetcol |
+-----------+-----------+
| as_22v3   | 22v3      |
| 33v55_bdd | 33v55     |
| Ave_4v2   | 4v2       |
+-----------+-----------+
My code does work, but only for the first few rows. It stops after 36 values and I can't figure out why. There is no error message besides of course that it is not possible to assign the list to a DataFrame series since it has not the same size.
pandas.Series.str.extract should help:
>>> df['col'].str.extract(r'(\d+v+\d+)')
0
0 22v3
1 33v55
2 4v2
import pandas as pd

df = pd.DataFrame({
    'col': ['as_22v3', '33v55_bdd', 'Ave_4v2']
})
df['targetcol'] = df['col'].str.extract(r'(\d+v+\d+)')
EDIT
df = pd.DataFrame({
    'col': ['as_22v3', '33v55_bdd', 'Ave_4v2', '_22 v3', 'space 2,2v3', '2.v3',
            '2.111v999', 'asd.123v77', '1 v7', '123 v 8135']
})
pattern = r'(\d+(\,[0-9]+)?(\s+)?v\d+)'
df['result'] = df['col'].str.extract(pattern)[0]
col result
0 as_22v3 22v3
1 33v55_bdd 33v55
2 Ave_4v2 4v2
3 _22 v3 22 v3
4 space 2,2v3 2,2v3
5 2.v3 NaN
6 2.111v999 111v999
7 asd.123v77 123v77
8 1 v7 1 v7
9 123 v 8135 NaN
You say it stops after 36 values, and that you are processing an Excel file? One thing you could try is to save the data set as a .csv file and read it in with pd.read_csv. Excel files sometimes contain extra characters that are not easily visible.
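If you prefer to keep a loop, the size mismatch mentioned in the question can be avoided by producing exactly one result per row. The following is a minimal sketch using the standard re module instead of pandas; the helper name extract_token, the simplified pattern \d+v\d+, and the extra 'no_match_here' row are assumptions made for illustration.

```python
import re

# Simplified pattern: digits, a single 'v', digits.
PATTERN = re.compile(r'\d+v\d+')

def extract_token(text):
    """Return the first digits-v-digits token in text, or None if absent."""
    match = PATTERN.search(text)
    return match.group(0) if match else None

rows = ['as_22v3', '33v55_bdd', 'Ave_4v2', 'no_match_here']

# One entry per row, so the list always has the same length as the input.
test = [extract_token(row) for row in rows]
print(test)  # ['22v3', '33v55', '4v2', None]
```

Rows that come back as None are the ones that would need fixing by hand.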

Add "invisible" decimal places to end of number?

I am printing a "Table" to the console. I will be using this same table structure for several different variables. However as you can see from Output below, the lines don't all align.
One way to resolve it would be to increase the number of decimal places (e.g. 6.730000 for Standard Deviation) which would push the line into place.
However, I do not want this many decimal places.
Is it possible to add extra 0s to the end of a number, and make these invisible?
I am planning on using this table structure for several variables, and the length of Mean, Stddev, and Median will likely never be more than 6 characters.
EDIT - I would really like to ensure that each value which appears in the table will be 6 characters long, and if it is not 6 characters long, add additional "invisible" zeros.
Input
# Create and structure Table to store descriptive statistics for each variable.
subtitle = "| Mean | Stddev | Median |"
structure = '| {:0.2f} | {:0.2f} | {:0.2f} |'
lines = '=' * len(subtitle)
# Print table.
print(lines)
print(subtitle)
print(lines)
print(structure.format(mean, std, median))
print(lines)
Output:
======================================
| Mean | Stddev | Median |
======================================
| 181.26 | 6.73 | 180.34 |
======================================
Didn't really figure this out - but found a workaround.
I just did the following:
"| {:^6} | {:^6} | {:^6} | {:^6} | {:^6} |"
This keeps the width between | consistent.
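Width and precision can also be combined in one format spec, so the two decimal places are kept while each cell is padded with spaces to a fixed width. The sketch below assumes a cell width of 8 and reuses the sample values from the output in the question.

```python
mean, std, median = 181.26, 6.73, 180.34  # sample values from the question

# ^8 centres the text in 8 characters; .2f keeps two decimal places.
header = "| {:^8} | {:^8} | {:^8} |".format("Mean", "Stddev", "Median")
row = "| {:^8.2f} | {:^8.2f} | {:^8.2f} |".format(mean, std, median)
rule = "=" * len(header)

print(rule)
print(header)
print(rule)
print(row)
print(rule)
```

Because the header and the data row use the same widths, the | separators line up regardless of how many digits each number has, as long as it fits in 8 characters.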

How to justify columns using format function?

I have a working function that takes a list made up of lists and outputs it as a table. I am just missing certain spacing and new lines. I'm pretty new to formatting strings (and python in general.) How do I use the format function to fix my output?
For examples:
>>> show_table([['A','BB'],['C','DD']])
'| A | BB |\n| C | DD |\n'
>>> print(show_table([['A','BB'],['C','DD']]))
| A | BB |
| C | DD |
>>> show_table([['A','BBB','C'],['1','22','3333']])
'| A | BBB | C |\n| 1 | 22 | 3333 |\n'
>>> print(show_table([['A','BBB','C'],['1','22','3333']]))
| A | BBB | C |
| 1 | 22 | 3333 |
What I am actually outputting though:
>>> show_table([['A','BB'],['C','DD']])
'| A | BB | C | DD |\n'
>>> show_table([['A','BBB','C'],['1','22','3333']])
'| A | BBB | C | 1 | 22 | 3333 |\n'
>>> print(show_table([['A','BBB','C'],['1','22','3333']]))
| A | BBB | C | 1 | 22 | 3333 |
I will definitely need to use the format function but I'm not sure how?
This is my current code (my indenting is actually correct but I'm horrible with stackoverflow format):
def show_table(table):
  if table is None:
    table=[]
  new_table = ""
  for row in table:
    for val in row:
      new_table += ("| " + val + " ")
  new_table += "|\n"
  return new_table
You do actually have an indentation error in your function: the line
new_table += "|\n"
should be indented further so that it happens at the end of each row, not at the end of the table.
Side note: you'll catch this kind of thing more easily if you stick to 4 spaces per indent. This and other conventions are there to help you, and it's a very good idea to learn the discipline of keeping to them early in your progress with Python. PEP 8 is a great resource to familiarise yourself with.
The spacing on your "what I need" examples is also rather messed up, which is unfortunate since spacing is the subject of your question, but I gather from this question that you want each column to be properly aligned, e.g.
>>> print(show_table([['10','2','300'],['4000','50','60'],['7','800','90000']]))
| 10   | 2   | 300   |
| 4000 | 50  | 60    |
| 7    | 800 | 90000 |
In order to do that, you'll need to know in advance what the maximum width of each item in a column is. That's actually a little tricky, because your table is organised into rows rather than columns, but the zip() function can help. Here's an example of what zip() does:
>>> table = [['10', '2', '300'], ['4000', '50', '60'], ['7', '800', '90000']]
>>> from pprint import pprint
>>> pprint(table, width=30)
[['10', '2', '300'],
 ['4000', '50', '60'],
 ['7', '800', '90000']]
>>> flipped = list(zip(*table))
>>> pprint(flipped, width=30)
[('10', '4000', '7'),
 ('2', '50', '800'),
 ('300', '60', '90000')]
As you can see, zip() turns rows into columns and vice versa. (Don't worry too much about the * before table right now; it's a bit advanced to explain for the moment. Just remember that you need it.)
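For the curious, here is a short sketch of what the * does in this case: it unpacks the outer list so that each row is passed to zip() as a separate argument. The names unpacked and explicit are only for illustration.

```python
table = [['10', '2', '300'], ['4000', '50', '60'], ['7', '800', '90000']]

# zip(*table) passes each row as its own argument ...
unpacked = list(zip(*table))
# ... which is the same as writing the rows out by hand.
explicit = list(zip(table[0], table[1], table[2]))

print(unpacked == explicit)  # True
print(unpacked[0])           # ('10', '4000', '7')
```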
You get the length of a string with len():
>>> len('800')
3
You get the maximum of the items in a list with max():
>>> max([2, 4, 1])
4
You can put all these together in a list comprehension, which is like a compact for loop that builds a list:
>>> widths = [max([len(x) for x in col]) for col in zip(*table)]
>>> widths
[4, 3, 5]
If you look carefully, you'll see that there are actually two list comprehensions in that line:
[len(x) for x in col]
makes a list with the lengths of each item x in a list col, and
[max(something) for col in zip(*table)]
makes a list with the maximum value of something for each column in the flipped (with zip) table … where something is the other list comprehension.
That's all kinda complicated the first time you see it, so spend a little while making sure you understand what's going on.
Now that you have your maximum widths for each column, you can use them to format your output. In order to do so, though, you need to keep track of which column you're in, and to do that, you need enumerate(). Here's an example of enumerate() in action:
>>> for i, x in enumerate(['a', 'b', 'c']):
...     print("i is", i, "and x is", x)
...
i is 0 and x is a
i is 1 and x is b
i is 2 and x is c
As you can see, iterating over the result of enumerate() gives you two values: the position in the list, and the item itself.
Still with me? Fun, isn't it? Pressing on ...
The only thing left is the actual formatting. Python's str.format() method is pretty powerful, and too complex to explain thoroughly in this answer. One of the things you can use it for is to pad things out to a given width:
>>> "{val:5s}".format(val='x')
'x    '
In the example above, {val:5s} says "insert the value of val here as a string, padding it out to a width of 5 characters". You can also specify the width as a variable, like this:
>>> "{val:{width}s}".format(val='x', width=3)
'x  '
These are all the pieces you need … and here's a function that uses all those pieces:
def show_table(table):
    if table is None:
        table = []
    new_table = ""
    widths = [max([len(x) for x in c]) for c in zip(*table)]
    for row in table:
        for i, val in enumerate(row):
            new_table += "| {val:{width}s} ".format(val=val, width=widths[i])
        new_table += "|\n"
    return new_table
… and here it is in action:
>>> table = [['10','2','300'],['4000','50','60'],['7','800','90000']]
>>> print(show_table(table))
| 10   | 2   | 300   |
| 4000 | 50  | 60    |
| 7    | 800 | 90000 |
I've covered a fair bit of ground in this answer. Hopefully if you study the final version of show_table() given here in detail (as well as the docs linked to throughout the answer), you'll be able to see how all the pieces described earlier on fit together.
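Once the pieces above feel familiar, one possible variation is to build each line with str.join() and pair values with widths using zip() instead of enumerate(). This is only a sketch of an alternative, not a replacement for the function above; the name show_table_compact is made up for this example.

```python
def show_table_compact(table):
    """Hypothetical variant of show_table() that builds rows with str.join()."""
    if not table:
        return ""
    # Widest item in each column, found by flipping rows into columns.
    widths = [max(len(x) for x in col) for col in zip(*table)]
    lines = []
    for row in table:
        # Pad every value to the width of its column.
        cells = ["{:{w}s}".format(val, w=w) for val, w in zip(row, widths)]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


print(show_table_compact([['10', '2', '300'],
                          ['4000', '50', '60'],
                          ['7', '800', '90000']]))
```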

How to use slicing in Python

I am finding slicing in Python quite difficult. Let's say I want to display the first five and last five characters of a phrase; how do I go about it? For example:
words = input("Enter a word ")
slice = words[:2]
print(slice)
You can use negative indices to slice from the end:
>>> s="teststring"
>>>
>>> s[-5:]
'tring'
>>> s[:5]
'tests'
Slice notation has the following general form:
[start:end:step]
One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n, for example:
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1
Read more about slicing https://docs.python.org/2/tutorial/introduction.html#strings
And https://docs.python.org/2.3/whatsnew/section-slices.html
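Applied to the question, a minimal sketch could look like this. A fixed string stands in for the input() call so the example runs on its own, and the names first_five and last_five are only illustrative.

```python
words = "teststring"  # stand-in for input("Enter a word ")

first_five = words[:5]   # from the start up to, but not including, index 5
last_five = words[-5:]   # from the fifth-from-last character to the end

print(first_five, last_five)  # tests tring

# Slices never raise IndexError: a shorter string is simply returned whole.
short = "abc"
print(short[:5], short[-5:])  # abc abc
```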
