Stripping ints from a string in pandas column

I have a column like this:
Age
15-20 years old
20-25 years old
I want this as output:
Age_Min Age_Max
15 20
20 25
I tried str.strip(), but without success so far.
I also tried d[['Age_Min','Age_Max']] = d['Age'].str.split('-', expand=True)
and the result is almost there, except that Age_Max still contains the trailing " years old". Is there a way to keep only the integers and drop the text?
Any tips?

Use Series.str.split with expand=True:
In [858]: out = df['Age'].str.split('-', expand=True).rename(columns={0:'Age_Min', 1: 'Age_Max'})
In [860]: out['Age_Max'] = out['Age_Max'].str.split().str[0]
In [861]: out
Out[861]:
Age_Min Age_Max
0 15 20
1 20 25
Or use a regex (written as a raw string so that \d is not treated as a string escape):
In [870]: out = df['Age'].str.extract(r"(\d*\-?\d+)")[0].str.split('-', expand=True).rename(columns={0:'Age_Min', 1: 'Age_Max'})
In [871]: out
Out[871]:
Age_Min Age_Max
0 15 20
1 20 25
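
The same idea can be sketched without pandas, using only the standard re module. This is a minimal illustration, not the answer's method: the names AGE_RANGE and parse_age_range are hypothetical, and the pattern assumes every value looks like "<min>-<max> ...".

```python
import re

# Two capture groups: the number before the dash and the number after it.
AGE_RANGE = re.compile(r"(\d+)\s*-\s*(\d+)")

def parse_age_range(text):
    """Return (age_min, age_max) as ints, or None if no range is found."""
    match = AGE_RANGE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

ages = ["15-20 years old", "20-25 years old"]
print([parse_age_range(a) for a in ages])  # [(15, 20), (20, 25)]
```

In pandas, the same pattern with named groups, for example df['Age'].str.extract(r'(?P<Age_Min>\d+)-(?P<Age_Max>\d+)'), should produce both columns in one step. The values come back as strings, so a conversion such as .astype(int) is still needed, and it will fail on rows that did not match.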

Related

What do "!" and "." mean in BASIC?

Trying to translate BASIC code written in the 1990's to Python. I keep coming across two symbols, ! (exclamation mark) and . (period). I can't find any documentation online on what they do.
I have the code running but some of the outputs are not as expected - I am wondering if these might be the issue as I previously thought that the period may just be a typo for a multiplication.
Examples (the symbols in question are the "." before (PW+PE) in the first line and the trailing "!" in the second):
QWLOST = (((TW-TDAO)/(TWRT-TDAOR))^1.25)*((VISR/VIS)^0.25).(PW+PE)*DT
TFAVE = (TTO+TBO)/2!

In case anyone else needs to know this in the future:
! - a type suffix that marks the value as single precision (so 2! is the single-precision constant 2)
. - was just a typo for * (multiplication)
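
Under that reading, the two example lines could be translated to Python roughly as follows. This is only a sketch: the function names and lowercase parameter names are assumptions, BASIC's ^ becomes **, and the "." is taken to be a multiplication as stated above.

```python
def qwlost(tw, tdao, twrt, tdaor, visr, vis, pw, pe, dt):
    """QWLOST with the '.' read as '*' and BASIC '^' written as '**'."""
    temperature_ratio = (tw - tdao) / (twrt - tdaor)
    viscosity_ratio = visr / vis
    return (temperature_ratio ** 1.25) * (viscosity_ratio ** 0.25) * (pw + pe) * dt

def tfave(tto, tbo):
    """TFAVE: '2!' is the single-precision constant 2, so a plain float is used."""
    return (tto + tbo) / 2.0

print(tfave(10.0, 20.0))  # 15.0
```

Python floats are double precision, whereas the BASIC program computes in single precision, so the last few digits of a result may differ from the original output even when the translation is correct.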

I tried a few things in bwBasic (in Linux, in case that's relevant!).
bwBASIC: list
10: for i = 1 to 20
20: print i, ., . - i
30: next i
40: print ".="; .
This gave me:
bwBASIC: run
1 20 19
2 20 18
3 20 17
4 20 16
5 20 15
6 20 14
7 20 13
8 20 12
9 20 11
10 20 10
11 20 9
12 20 8
13 20 7
14 20 6
15 20 5
16 20 4
17 20 3
18 20 2
19 20 1
20 20 0
.= 20
Which would suggest that . (in bwBasic in any case) is the max number in a for loop.

How to generate 3 natural numbers that sum to 60 using awk

I am trying to write an awk script that generates 3 natural numbers that sum to 60. I am trying the rand function, but I have a problem getting the numbers to sum to 60.

Here is one way:
awk -v n=60 'BEGIN{srand();a=int(rand()*n);b=int(rand()*(n-a));c=n-a-b;
print a,b,c}'
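The idea is:
generate a random integer a with 0 <= a < 60
generate a random integer b with 0 <= b < 60-a
set c = 60-a-b, which is then always at least 1
Here I set a variable n=60 so that it is easy to change if you need a different sum.
Note that a and b can come out as 0 (see the line "41 0 19" in the sample output).
If we run the same logic 10 times (with 60 hard-coded), we get output like this: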
kent$ awk 'BEGIN{srand();for(i=1;i<=10;i++){a=int(rand()*60);b=int(rand()*(60-a));c=60-a-b;print a,b,c}}'
46 7 7
56 1 3
26 15 19
14 12 34
44 6 10
1 36 23
32 1 27
41 0 19
55 1 4
54 1 5
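
If "natural number" is meant to exclude 0, the same two-step idea can be adapted so that every part is at least 1. The sketch below is in Python rather than awk and is only an illustration; the function name three_parts is hypothetical.

```python
import random

def three_parts(n=60):
    """Return three integers >= 1 that sum to n (requires n >= 3)."""
    a = random.randrange(1, n - 1)   # 1 <= a <= n-2, leaves at least 2 for b and c
    b = random.randrange(1, n - a)   # 1 <= b <= n-a-1, leaves at least 1 for c
    c = n - a - b                    # the remainder, always >= 1
    return a, b, c

for _ in range(5):
    print(*three_parts())
```

As in the awk version, the three numbers are not uniformly distributed over all possible triples: a is drawn first, so b and c tend to be smaller.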

Excel formula to get the count of certain value based on odd/even line

I have this data in Excel.
A            B         C
--------------------------------------
Line Number  Value #1  Value #2
1            21        35
2            21        27
3            21        18
4            10        47
5            50        5
6            37        68
7            10        21
8            75        21
I am trying to count how many times "21" appears on odd line numbers. In this case the answer should be 3. However, neither "IF(MOD(A1:A8,2)=1,COUNTIF(B1:C8,21))" nor "{IF(MOD(A1:A8,2)=1,COUNTIF(B1:C8,21))}" worked, and Google didn't yield anything helpful. Could anyone help me? Thanks!

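This works for odd lines. Note that it assumes the values are in columns A and B and that the sheet row number (ROW) is the line number, so adjust the ranges to your layout:
=SUM(COUNTIF(A:B,21)-SUMPRODUCT((A:B=21)*(MOD(ROW(A:B),2)=0)))
There may be a better way of writing this formula.
Use this to count even lines:
=SUMPRODUCT((A:B=21)*(MOD(ROW(A:B),2)=0))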
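
As a cross-check of the expected result, the same count can be sketched in Python on the data from the question. The names rows and count_on_odd_lines are hypothetical, and the parity test uses the explicit line-number column rather than the sheet row.

```python
# (line number, value #1, value #2) as given in the question
rows = [
    (1, 21, 35),
    (2, 21, 27),
    (3, 21, 18),
    (4, 10, 47),
    (5, 50, 5),
    (6, 37, 68),
    (7, 10, 21),
    (8, 75, 21),
]

def count_on_odd_lines(rows, target=21):
    """Count cells equal to target on rows whose line number is odd."""
    return sum(
        value == target
        for line_no, *values in rows
        if line_no % 2 == 1
        for value in values
    )

print(count_on_odd_lines(rows))  # 3
```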

File Reading problems in Python

When reading a file in Python using
f = open ("filename.txt")
and accessing the data with
f.read(1)
and then finding the position of the stream using
f.tell()
at every step, we expect consecutive positions starting from 0 up to the current position.
The problem I am facing is that f.tell() returns a seemingly random number at some positions and then continues with the normal numbering.
For example, the f.tell() outputs look something like the following:
0
1
2
3
133454568679978
6
7
8...
Any idea why this is happening?
My code:
f = open("temp_mcompress.cpp")
current = ' '
while current != '':
    print(f.tell())
    current = f.read(1)
f.close()
The temp_mcompress.cpp file:
#include <iostream>
int main(int a)
{
}
Output:
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
18446744073709551636
18446744073709551638
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
18446744073709551655
40
41
43
44

It seems I might have found the problem, which may still apply to Python 3.x:
source: http://docs.python.org/2.4/lib/bltin-file-objects.html
tell()
Return the file's current position, like stdio's ftell().
Note: On Windows, tell() can return illegal values (after an fgets())
when reading files with Unix-style line-endings. Use binary mode
('rb') to circumvent this problem.
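
Following that note, a minimal sketch of the loop in binary mode might look like this. The helper name byte_positions and the throwaway demo file are assumptions for illustration; in binary mode read(1) returns bytes, so the end-of-file check compares against b"".

```python
import os
import tempfile

def byte_positions(path):
    """Read a file one byte at a time in binary mode, recording tell() before each read."""
    positions = []
    with open(path, "rb") as f:   # "rb": binary mode, no newline translation
        while True:
            positions.append(f.tell())
            if f.read(1) == b"":  # an empty bytes object means end of file
                break
    return positions

# Demo on a temporary file with Unix-style line endings (hypothetical contents).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"#include <iostream>\nint main()\n{\n}\n")
    demo_path = tmp.name

print(byte_positions(demo_path))
os.remove(demo_path)
```

In binary mode the positions are plain byte offsets, so they should increase by exactly one per read(1) regardless of the line endings in the file.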

How can I align columns where the biggest number or greatest string is the align indicator?

How can I right align (and left align?) a block of numbers or text in vim like this:
from:
45 209 25 1
2 4 2 3
34 5 300 5
34 120 34 12
to this:
45 209  25  1
 2   4   2  3
34   5 300  5
34 120  34 12
That means the biggest number or greatest string in every column doesn't move.
In the first column it is 45+34, in the second column 209+120, in the third column 300 and in the last column 12.

Have a look at the Align plugin; it can do this and much more. A great tool for your utility belt!
After some serious reading of the Vim help I found the correct AlignCtrl mapping.
Visually select the table, e.g. with ggVG, then type \Tsp, i.e. <leader>Tsp.
Then I get this:
45 209  25  1
 2   4   2  3
34   5 300  5
34 120  34 12
From vimhelp:
\Tsp : use Align to make a table separated by blanks |alignmap-Tsp|
(right justified)

You can look into the Tabularize plugin. So if you have something like
45 209 25 1
2 4 2 3
34 5 300 5
34 120 34 12
just select those lines in the visual mode and type :Tab/ and it will format it as
45 209 25 1
2 4 2 3
34 5 300 5
34 120 34 12
Also, it looks like you don't have an equal number of spaces separating the numbers at the moment. So before you use the plugin, replace all the multiple spaces with a single space with the following regex:
%s![^ ]\zs \+! !g

With the Align plugin you can select the rows you want to align and type:
<Leader>Tsp
From Align.txt:
\Tsp : use Align to make a table separated by blanks |alignmap-Tsp|
(right justified)
(The help mentions \ because it is the default leader; if you have changed it to something else, adapt accordingly.)
Trying this on my install, I got the following result:
45 209  25  1
 2   4   2  3
34   5 300  5
34 120  34 12
In my opinion the Align plugin is great, but the "align maps" and the various commands are not really easy to remember.

With the Align and AlignMaps plugins: select using V, then \anum (AlignMaps comes with Align). One advantage of \anum is that it also handles decimal points (commas) and scientific notation.

I think the best thing to do is to first eat all multiple spaces with
:{range}s/ \+/ /g
And then call Tabularize
:Tab / /r1
Or change that r to l.
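
For readers who want to see what the right-justified result amounts to, here is a small Python sketch of the same transformation outside Vim. It is not how either plugin is implemented; the function name right_align is hypothetical, and it assumes columns are separated by runs of whitespace.

```python
def right_align(lines):
    """Right-justify whitespace-separated columns to the width of each column's widest cell."""
    rows = [line.split() for line in lines]
    ncols = max((len(row) for row in rows), default=0)
    widths = [
        max(len(row[i]) for row in rows if i < len(row))
        for i in range(ncols)
    ]
    return [
        " ".join(cell.rjust(widths[i]) for i, cell in enumerate(row))
        for row in rows
    ]

table = [
    "45    209  25  1",
    "2  4   2  3",
    "34   5  300   5",
    "34  120 34     12",
]
print("\n".join(right_align(table)))
```

Changing rjust to ljust gives the left-aligned variant, which corresponds to swapping r for l in the Tabularize call.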
