Python substitute column text - python-3.x

On executing the following line of code, I get TypeError: repl must be a string or callable
ps_df['Name'].str.replace(ps_df['Substitute'].str,'\n'+ps_df['Substitute'].str)
When I changed it to the following,
ps_df['Name'].str.replace(ps_df['Substitute'].str,'\n'+ps_df['Substitute'].str)
I get this error, TypeError: can only concatenate str (not "StringMethods") to str

According to the pandas documentation for pandas.Series.str.replace, the first argument (pat) should be a string or compiled regex, and the second (repl) a string or callable.
You are passing the .str accessor of a Series for both, which is neither a string nor a callable. Hence the errors.
Use apply to replace the strings row by row:
ps_df.apply(lambda x: x.Name.replace(x.Substitute, '\n' + x.Substitute), axis=1)

Related

NoneType object has no attribute groups? [duplicate]

I have a string which I want to extract a subset of. This is part of a larger Python script.
This is the string:
import re
htmlString = '</dd><dt> Fine, thank you. </dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'
From it I want to pull out "Molt bé, gràcies. mohl behh, GRAH-syuhs". For that I use a regular expression with re.search:
SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'
Result = re.search(SearchStr, htmlString)
print Result.groups()
AttributeError: 'NoneType' object has no attribute 'groups'
Since Result.groups() doesn't work, neither do the extractions I want to make (i.e. Result.group(5) and Result.group(7)).
I don't understand why I get this error. The regular expression works in TextWrangler, so why not in Python? I'm a beginner in Python.
You are getting AttributeError because you're calling groups on None, which has no such method.
re.search returning None means the regex couldn't find anything matching the pattern in the supplied string.
When using regex, it is good practice to check whether a match was made:
Result = re.search(SearchStr, htmlString)
if Result:
    print Result.groups()
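A minimal sketch of that check, written in Python 3 syntax (the question itself uses Python 2's print statement). The helper name extract_groups and the sample patterns are made up for illustration:

```python
import re

def extract_groups(pattern, text):
    """Return the captured groups, or None when the pattern does not match."""
    match = re.search(pattern, text)
    if match is None:
        # re.search found nothing, so there is no match object to call .groups() on
        return None
    return match.groups()

print(extract_groups(r"(\d+)-(\d+)", "range 10-20"))     # ('10', '20')
print(extract_groups(r"(\d+)-(\d+)", "no digits here"))  # None
```

Returning None explicitly lets the caller decide what a failed match means, instead of crashing with AttributeError.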
import re
htmlString = '</dd><dt> Fine, thank you. </dt><dd> Molt bé, gràcies. (<i>mohl behh, GRAH-syuhs</i>)'
SearchStr = '(\<\/dd\>\<dt\>)+ ([\w+\,\.\s]+)([\&\#\d\;]+)(\<\/dt\>\<dd\>)+ ([\w\,\s\w\s\w\?\!\.]+) (\(\<i\>)([\w\s\,\-]+)(\<\/i\>\))'
Result = re.search(SearchStr.decode('utf-8'), htmlString.decode('utf-8'), re.I | re.U)
print Result.groups()
It works that way. The string contains non-ASCII characters (é, à), which \w does not match in a Python 2 byte string, so the search fails. You have to decode both the pattern and the string to Unicode and use the re.U (Unicode) flag.
I'm a beginner too and I faced that issue a couple of times myself.
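For comparison, a small sketch of how this behaves in Python 3, where str is already Unicode and \w matches accented letters by default, so no decode step or re.U flag is needed. The character class below is a simplified stand-in for the one in the question, and re.ASCII is used only to mimic the ASCII-only matching of a Python 2 byte string:

```python
import re

text = "Molt bé, gràcies."
pattern = r"[\w\s,.]+"  # simplified stand-in for the question's character class

# Default Python 3 behaviour: \w is Unicode-aware for str patterns
unicode_match = re.search(pattern, text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so the match stops at the first accented letter
ascii_match = re.search(pattern, text, re.ASCII)

print(unicode_match.group())  # Molt bé, gràcies.
print(ascii_match.group())    # Molt b
```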

How to convert string representation of list containing namedtuple into normal list

I am trying to convert the string representation of a list containing namedtuples into a normal list.
I tried ast.literal_eval, json.loads and split.
Error with ast.literal_eval
ValueError: malformed node or string: <_ast.Call object at 0x7f834bef5860>
After adding double quotes to string
SyntaxError: invalid syntax
Error with json.loads()
json.decoder.JSONDecodeError: Expecting value: line 1 column 2 (char 1)
The problem with split is that it changes the data types of the namedtuple's contents.
Please suggest a possible solution.
input :
'[milestone(id=1, amount=1000, curency='inr'), milestone(id=1, amount=1000, curency='inr')]'
type<str>
expected output:
[milestone(id=1, amount=1000, curency='inr'), milestone(id=1, amount=1000, curency='inr')]
type<list>
This looks like a job for eval() (StackOverflow thread about eval):
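from collections import namedtuple
# The namedtuple class must exist under the name used inside the string before eval runs
milestone = namedtuple('milestone', 'id amount currency')
# Named text/result rather than str/list, which would shadow the built-ins
text = "[milestone(id=1, amount=1000, currency='inr'), milestone(id=1, amount=1000, currency='inr')]"
result = eval(text)  # only eval strings you trust
print(result)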
By the way, two r in currency ;)
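Since eval runs arbitrary code, one possible variation is to restrict the names the evaluated string can see. This is only a sketch: the helper name parse_milestones and the sample values are made up, and emptying __builtins__ narrows what the expression can reach but is not a real sandbox.

```python
from collections import namedtuple

milestone = namedtuple("milestone", "id amount currency")

def parse_milestones(text):
    """Evaluate text with only the name `milestone` available to it."""
    # Assumption: the string comes from a source you trust; this is not a security boundary.
    return eval(text, {"__builtins__": {}}, {"milestone": milestone})

text = "[milestone(id=1, amount=1000, currency='inr'), milestone(id=2, amount=500, currency='usd')]"
items = parse_milestones(text)
print(type(items).__name__)  # list
print(items[0].amount)       # 1000
```

The fields keep their original types (int for id and amount, str for currency), which is what split could not preserve.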

Python instructional code to print long strings with line breaks

What's wrong with this code? Except for the print statement, it is the answer code taken directly from a Udacity Python lesson. It suggests br as an HTML line break, but to me that didn't make sense in Python. When I run it, the letters <BR> are printed between every letter of the string.
def breakify(strings):
    return "<br>".join(strings)
print(breakify("Haiku frogs in snow" "A limerick came from Nantucket" "Tetrametric drum-beats thrumming,"))
Output:
H<br>a<br>i<br>k<br>u<br> <br>f<br>r<br>o<br>g<br>s<br> <br>i<br>n<br> <br>s<br>n<br>o<br>w<br>A<br> <br>l<br>i<br>m<br>e<br>r<br>i<br>c<br>k<br> <br>c<br>a<br>m<br>e<br> <br>f<br>r<br>o<br>m<br> <br>N<br>a<br>n<br>t<br>u<br>c<br>k<br>e<br>t<br>T<br>e<br>t<br>r<br>a<br>m<br>e<br>t<br>r<br>i<br>c<br> <br>d<br>r<br>u<br>m<br>-<br>b<br>e<br>a<br>t<br>s<br> <br>t<br>h<br>r<br>u<br>m<br>m<br>i<br>n<br>g<br>,
Adjacent string literals with no commas between them are joined into one string (string literal concatenation), so breakify receives a single long string.
Simply put the strings in a list (or tuple) and separate them with commas.
Example with shorter strings for readability:
print(breakify(["Haiku", "limerick", "drum"]))
Output:
Haiku<br>limerick<br>drum
You got the output you did because str.join takes any iterable, and a string is an iterable. For example:
>>> '.'.join('hello')
'h.e.l.l.o'
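Both effects can be seen side by side in a short sketch (the variable names merged and separate are made up for illustration):

```python
def breakify(lines):
    return "<br>".join(lines)

# Adjacent literals without commas are merged by the parser into one string
merged = "Haiku" "limerick"
print(merged)            # Haikulimerick

# join then iterates over that single string character by character
print(breakify("ab"))    # a<br>b

# With commas inside a list, join sees three separate items
separate = ["Haiku", "limerick", "drum"]
print(breakify(separate))  # Haiku<br>limerick<br>drum
```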

Simple Moving Average on a .cat file with Python 3.6

I'm trying to write a program that performs a moving average on a .cat file with ~500 float values, then saves the result to another file. The code works fine if I pass in an array like x=[1,2,3...] but when I try it with the file I get the error message:
TypeError: unsupported operand type(s) for *: 'float' and '_io.TextIOWrapper'
Could someone please help me?
import numpy as np
def movingaverage(values, window):
    weights = np.repeat(1.0, window) / window
    sma = np.convolve(values, weights, 'valid')
    return sma
with open('Relative_flux.cat', 'r') as f:
    data = movingaverage(f, 3)
    print(data)
f is a file handle, not the contents of the file. The contents must first be read, then converted into an array of floats, before being handed to your function, which expects an array of floats.
Assuming the file is formatted in the way you mention in your comment:
data=movingaverage([float(x) for x in f.read().split()], 3)
read() reads the whole content of the file and returns it as a string.
split() splits the string on any run of whitespace.
[float(x) for x in ...] converts every resulting string to a float, producing a list of floats.
This code will throw an exception if any of the entries in the file cannot be converted to float, or if the format is not consistently floating point numbers separated by whitespaces.
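The read, split and convert steps above can be sketched end to end as follows. To keep the sketch runnable with the standard library only, io.StringIO stands in for the real file and a plain-Python windowed mean stands in for np.convolve(values, weights, 'valid'); the helper names and the sample numbers are made up.

```python
import io

def read_floats(handle):
    """Read whitespace-separated floats from an open text file."""
    return [float(token) for token in handle.read().split()]

def moving_average(values, window):
    """Mean of every full window; plain-Python stand-in for the 'valid' convolution."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# io.StringIO plays the role of open('Relative_flux.cat', 'r') here
fake_file = io.StringIO("1.0 2.0 3.0\n4.0 5.0\n")
values = read_floats(fake_file)
smoothed = moving_average(values, 3)
print(smoothed)  # [2.0, 3.0, 4.0]
```

With the real file, the same read_floats call would go inside the with open(...) block, and its result could be passed to the NumPy-based function from the question instead.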
Your object f is an open file rather than an array of floating point values. You need to read lines from the file and load the floating point values into an array, which depends on the specific file format you're using.

Why is str.translate() returning an error and how can I fix it?

import os
def rename_files():
    file_list = os.listdir(r"D:\360Downloads\test")
    saved_path = os.getcwd()
    os.chdir(r"D:\360Downloads\test")
    for file_name in file_list:
        os.rename(file_name, file_name.translate(None,"0123456789"))
rename_files()
The error message is TypeError: translate() takes exactly one argument (2 given). How can I format this so that translate() does not return an error?
Hope this helps!
os.rename(file_name,file_name.translate(str.maketrans('','','0123456789')))
or
os.rename(file_name,file_name.translate({ ord(i) : None for i in '0123456789' }))
Explanation:
I think you're using Python 3.x with Python 2.x syntax. In Python 3.x the translate() signature is
str.translate(table)
which takes only one argument, unlike Python 2.x, where the translate() signature is
str.translate(table[, deletechars])
which can take more than one argument.
We can build the translation table easily with the str.maketrans function.
In this case, the first two parameters are empty strings, so no character is mapped to another, and the third parameter lists the characters to remove.
We can also build the translation table by hand as a dictionary whose keys are the Unicode code points (ordinals) of the original characters and whose values are the code points of their replacements. To remove a character, its value must be None.
For example, to replace 'A' with 'a' and remove '1' from a string, the dictionary looks like this:
{65: 97, 49: None}
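A short sketch showing both ways of building the table on plain strings, without touching the file system. The helper name strip_digits and the sample file name are made up for illustration:

```python
def strip_digits(name):
    """Remove every ASCII digit from name using a maketrans table."""
    return name.translate(str.maketrans("", "", "0123456789"))

print(strip_digits("48athens.jpg"))  # athens.jpg

# Hand-built table: 'A' (65) becomes 'a' (97), '1' (49) is deleted
table = {ord("A"): ord("a"), ord("1"): None}
print("A1b2".translate(table))       # ab2

# maketrans with three arguments produces the same kind of dictionary
print(str.maketrans("", "", "01"))   # {48: None, 49: None}
```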
