Questions regarding Python replace specific texts - python-3.x

I'm writing a script to scrape another website with Python, and I have run into a problem that I have not been able to solve.
Say I want to replace a particular string with something else.
word_replace_1 = 'dv'
namelist = soup.title.string.replace(word_replace_1,'11dv')
The script works fine when the titles are dv234, dv123, etc.
The output will be 11dv234, 11dv123.
However, if titles like dv234 are mixed with titles like dvab123, the script also turns dvab123 into 11dvab123, even though I never asked for dvab to be replaced. What should I do here?
Also, if the title is a combination of letters, numbers and Korean characters, say DAV123ㄱㄴㄷ,
how should I make it output only DAV123, with a - added between the letters and the numbers?
Python - making a function that would add "-" between letters
This gives me the idea of adding - between all characters, but is there a way to add - only between a letter and a number?
The only way I can think of at the moment is to build a table of replacements, for example something like this
word_replace_3 = 'a1'
word_replace_4 = 'a2'
.......
and then print them out as
namelist3 = soup.title.string.replace(word_replace_3,'a-1').replace(word_replace_4,'a-2')
This is just slow and not efficient. What would be the best method to resolve this?
Thanks.
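One possible approach, sketched here with Python's standard re module, is to make the pattern say what must come right after dv, and to insert the - at a zero-width position between a letter and a digit. The function names prefix_dv and clean_code are hypothetical, and the sketch assumes that the wanted part of a title is always a run of ASCII letters followed directly by digits.

```python
import re


def prefix_dv(title):
    # Match "dv" only at the start of a word and only when a digit follows,
    # so "dv234" changes but "dvab123" is left alone.
    return re.sub(r"\bdv(?=\d)", "11dv", title)


def clean_code(title):
    # Keep the first run of ASCII letters followed by digits (drops the Korean part).
    match = re.search(r"[A-Za-z]+\d+", title)
    if match is None:
        return None
    # Insert "-" at the empty position between a letter and a digit.
    return re.sub(r"(?<=[A-Za-z])(?=\d)", "-", match.group(0))


print(prefix_dv("dv234, dvab123"))   # 11dv234, dvab123
print(clean_code("DAV123ㄱㄴㄷ"))      # DAV-123
```

The lookahead (?=\d) and lookbehind (?<=[A-Za-z]) check the neighbouring characters without consuming them, which is why no table of a1, a2, ... replacements is needed.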

Related

Python-docx: Editing a pre-existing Run

import docx
doc = docx.Document('CLT.docx')
test = doc.paragraphs[12].runs[2].text
print(test)
doc.save(input('Name of docx file? Make sure to add file extension '))
I've been trying to figure out some way to add/edit text in a pre-existing run using python-docx. I've tried test.clear() just to see if I can remove it, but that doesn't seem to work. Additionally, I tried test.add_run('test') and that didn't work either. I know how to add a new run, but it will only be added at the end of the paragraph, which doesn't help me much. Currently, 'print' outputs the text I'd like to alter within the document, "TERMOFINTERNSHIP". Is there something I'm missing?
The text of a run can be edited in its entirety. So to replace "ac" with "abc" you just do something like this:
>>> run.text
"ac"
>>> run.text = "abc"
>>> run.text
"abc"
You cannot simply insert characters at some location; you need to extract the text, edit that str value using Python str methods, and replace it entirely. In a way of thinking, the "editing" is done outside python-docx and you're simply using python-docx for the "before" and "after" versions.
But note that while this is quite true, it's not likely to benefit you much in the general case because runs break at seemingly random locations in a line. So there is no guarantee your search string will occur within a single run. You will need an algorithm that locates all the runs containing any part of the search string, and then allocate your edits accordingly across those runs.
An empty run is valid, so run.text == "" may be a help when there are extra bits in the middle somewhere. Also note that runs can be formatted differently, so if part of your search string is bold and part not, for example, your results may be different than you might want.
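A minimal sketch of the simple case described above, where the search string sits entirely inside one run. The helper name replace_in_runs is hypothetical, and the demo uses stand-in objects with a text attribute instead of a real document so it runs without python-docx installed; it does not handle a search string that is split across several runs.

```python
from types import SimpleNamespace


def replace_in_runs(runs, old, new):
    """Replace `old` with `new` in every run whose text contains it whole.

    Returns the number of runs that were changed.
    """
    changed = 0
    for run in runs:
        if old in run.text:
            # Edit the str outside the run, then assign the whole text back.
            run.text = run.text.replace(old, new)
            changed += 1
    return changed


# Stand-in runs for demonstration only (assumption: real runs expose .text the same way).
demo_runs = [
    SimpleNamespace(text="Length: "),
    SimpleNamespace(text="TERMOFINTERNSHIP"),
    SimpleNamespace(text=" (full time)"),
]
count = replace_in_runs(demo_runs, "TERMOFINTERNSHIP", "12 weeks")
print(count, [run.text for run in demo_runs])
```

With the document from the question, the call would presumably look like replace_in_runs(doc.paragraphs[12].runs, "TERMOFINTERNSHIP", "12 weeks"), followed by doc.save(...).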

How to compare Strings and put it into another program?

I've got a small problem, and before I spend even more time trying to solve it I'd like to know if what I want to do is even possible (and maybe get some input on how to do it ^^).
My problem:
I want to take some text and then split it into different strings at every whitespace (for example "Hello my name is whatever" into "Hello" "my" "name" "is" "whatever").
Then I want to give every string its own variable, so that I get something like a = "Hello", b = "my" and so on. Then I want to compare the strings with other strings (the idea is to get addresses from applications without having to search through them, so I thought I could copy a telephone book to define names and so on) and assign matching input to variables like FirstName, LastName and Street.
Then, and here comes the "I'd like to know if it's possible" part, I want it to put the result into our database. This means I want it to copy the string into a text field and then go to the next field via Tab. I've done something like this before with AutoIT, but I've got no idea how to tell AutoIT what's inside the strings, so I guess it must be done through the program itself.
I've got a little bit of experience with C++, Python and batch files, so it would be nice if anyone could tell me if this can even be done using those languages (and I fear C++ can do it and I'm just too stupid to do so).
Thanks in advance.
Splitting a string is very simple. There is usually a built-in method called .split() which will help you; the exact method varies from language to language.
When you've done a split, the result is an array, and you can then use an index to get the individual strings. For example you'd have:
var str = "Hello, my name is Bob";
var split = str.split(" ");
print split[0]; // is "Hello,"
print split[1]; // is "my" etc
You can also use JSON to return data so you could have an output like
print split["LastName"];
What you're asking for is definitely possible.
Some links that could be useful:
Split a string in C++?
https://code.google.com/p/cpp-json/
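In Python, the split-and-compare idea could be sketched as follows. The function label_words and the name sets are hypothetical stand-ins for the telephone-book data mentioned in the question; the sketch only covers splitting and labelling, not typing the result into another program.

```python
def label_words(text, first_names, last_names):
    """Split `text` on whitespace and label words found in the known-name sets."""
    words = text.split()  # no argument: split on any run of whitespace
    record = {}
    for word in words:
        if word in first_names and "FirstName" not in record:
            record["FirstName"] = word
        elif word in last_names and "LastName" not in record:
            record["LastName"] = word
    return words, record


# Made-up sample data for illustration.
words, record = label_words(
    "Hello my name is Anna Schmidt",
    first_names={"Anna", "Bob"},
    last_names={"Schmidt"},
)
print(words[0])   # Hello
print(record)     # {'FirstName': 'Anna', 'LastName': 'Schmidt'}
```

A dictionary is used instead of separate variables a, b, c so that the fields can be looked up by name, as in record["LastName"].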

Finding a character inside a string in Excel

I want to remove all the characters from a string except whatever is between a certain pair of characters. So for example I have the input Grade:2/2014-2015 and I want the output to be just the grade, 2.
I'm thinking that I need to use the FIND function to grab whatever is between the : and the /. This also needs to work with two-digit values such as 10, but I believe it would as long as the positions given by the FIND function are correct.
Unfortunately I am totally lost when using the FIND function. If there is another function that would work better, I could probably figure it out myself if I knew which function.
It's not particularly elegant but =MID(A1,FIND(":",A1)+1,FIND("/",A1) - FIND(":",A1) - 1) would work.
MID takes the text, a start position and a length; FIND returns the 1-based position of a given character.
Edit:
As pointed out, "Grade:" is fixed length so the following would work just as well:
=MID(A1,7,FIND("/",A1) - 7)
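The arithmetic in the first formula can be checked with a small Python sketch that mimics the 1-based behaviour of FIND and MID. The helpers find, mid and grade are hypothetical names written for this illustration, not Excel or library functions.

```python
def find(needle, text):
    # Like Excel's FIND: 1-based position of the first occurrence.
    return text.index(needle) + 1


def mid(text, start, length):
    # Like Excel's MID: `length` characters beginning at 1-based `start`.
    return text[start - 1:start - 1 + length]


def grade(cell):
    colon = find(":", cell)   # 6 for "Grade:2/2014-2015"
    slash = find("/", cell)   # 8 for "Grade:2/2014-2015"
    # Start one past the colon; the length is the gap between the two markers.
    return mid(cell, colon + 1, slash - colon - 1)


print(grade("Grade:2/2014-2015"))    # 2
print(grade("Grade:10/2014-2015"))   # 10
```

Because the length is computed from the two positions, a two-digit grade is handled the same way as a single digit.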
You could use LEFT() to remove "Grade:"
And then use LEFTB() to remove the year.
Look at this link here. This is the way I would go about it.
=SUBSTITUTE(SUBSTITUTE(C4, "Grade:", ""), "/2014-2015", "")
where C4 is the name of your cell.

Same for loop, giving out two different results using .write()

This is my first time asking a question, so let me know if I am doing something wrong (post-wise).
I am trying to create a function that writes into a .txt file, but I seem to get two very different results between calling it from within a module and writing the same loop in the shell directly. The code is as follows:
def function(para1, para2):
    # para1 is a string that I am searching for within para2; para2 is a list of strings
    # opens a file with a certain naming convention
    with open("str" + para1 + ".txt", 'a', encoding='utf-8') as file:
        n = 0
        for word in para2:
            if word == para1:
                file.write(para2[n-1] + '\n')
                print(para2[n-1])  # intentionally included as part of debugging
            n += 1

function("targetstr", targettext)
# "targetstr" is the phrase I am looking for, targettext is the tokenized text I am
# looking through. This is in the form of a list of strings, that is the output of
# another function, and has already been 'declared' as a variable
When I define this function in the shell, I get the correct words appearing. However, when I call this same function through a module (in the shell), nothing appears in the shell, and the text file shows a bunch of numbers (e.g. 's93161) and no new lines.
I have even gone to the extent of including a print statement right after the declaration of the function in the module, and commented out everything but the print statement, and yet nothing appears in the shell when I call it. However, the numbers still appear in the text file.
I am guessing that there is a problem with how I have defined the parameters or how I am passing the arguments when I call the function.
As a reference, here is the desired output:
‘She
Ashley
there
Kitty
Coates
‘Let
let
that
PS: Sorry if this is not very clear, as I have very limited knowledge of Python terminology.
I have found the solution to the issue. It turns out that I need to close the shell and restart everything before the interpreter picks up the changes made to the function in the module, because an already imported module is not re-read when the file changes. Thanks to those who took a look at the issue, and those who tried to help.
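As an alternative to restarting the shell, the standard library's importlib.reload can re-execute an already imported module. The sketch below demonstrates this with a throwaway module written to a temporary directory; the module name demo_mod and the function greet are hypothetical and only serve the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo from leaving .pyc files behind

# Write a throwaway module to disk and make its directory importable.
tmp_dir = tempfile.mkdtemp()
mod_path = Path(tmp_dir) / "demo_mod.py"
mod_path.write_text("def greet():\n    return 'old'\n")
sys.path.insert(0, tmp_dir)

import demo_mod

before = demo_mod.greet()

# Simulate editing the file while the shell is still open.
mod_path.write_text("def greet():\n    return 'new version'\n")
still_old = demo_mod.greet()      # the running shell still uses the old code

importlib.invalidate_caches()
demo_mod = importlib.reload(demo_mod)
after = demo_mod.greet()          # the edited code is now in use

print(before, still_old, after)
```

Note that names imported with "from module import function" keep pointing at the old object after a reload, so calling through the module (module.function) is the safer habit while experimenting.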

Replacing or substituting in a python string does not work

I could almost solve all of my python problems thanks to this great site, however, now I'm on a point where I need some more and specific help.
I have a string fetched from a database which looks like this:
u'\t\t\tcase <<<compute_type>>>:\n\t\t\t\t{\n\t\t\t\t\tif (curr_i <= 1) Messag...
the string is basically plain c code with unix line endings and supposed to be treated in a way that the values of some specific variables are replaced by something else gathered from a Qt UI.
I tried the following to do the replacing:
tmplt.replace(u"<<<compute_type>>>", str(led_coeffs.compute_type))
where 'led_coeffs' is a namedtuple and its value is an integer. I also tried this:
tmplt = Template(u'\t\t\tcase ${compute_type}:\n\t\t\t\t{\n\t\t\t\t\tif (curr_i <= 1) Messag...)
tmplt.substitute(compute_type = str(led_coeffs.compute_type))
however, both approaches do not work and I have no idea why. Finally I was hoping to get some input here. Maybe the whole approach is not right and any hint on how to achieve the replacing in a good manner is highly appreciated.
Thanks,
Ben
str.replace (and other string methods) don't work in-place, because strings in Python are immutable. The method returns a new string, so you need to assign the result back to the original name for the change to take effect:
tmplt = tmplt.replace(u"<<<compute_type>>>", str(led_coeffs.compute_type))
You could also invent your own kind of templating:
import re
print(re.sub('<<<(.*?)>>>', lambda L, nt=led_coeffs: str(getattr(nt, L.group(1))), your_string))
to automatically lookup attributes on your namedtuple...
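A self-contained sketch of both points, written for Python 3. The namedtuple definition LedCoeffs and the shortened template string are assumptions made for the example; the question does not show them.

```python
import re
from collections import namedtuple
from string import Template

# Hypothetical stand-ins for the objects in the question.
LedCoeffs = namedtuple("LedCoeffs", ["compute_type"])
led_coeffs = LedCoeffs(compute_type=3)
tmplt = "\t\t\tcase <<<compute_type>>>:\n"

# Calling replace without keeping the result changes nothing.
tmplt.replace("<<<compute_type>>>", str(led_coeffs.compute_type))
unchanged = tmplt

# Keeping the returned string is what makes the replacement visible.
replaced = tmplt.replace("<<<compute_type>>>", str(led_coeffs.compute_type))

# Generic variant: look up each <<<name>>> as an attribute of the namedtuple.
templated = re.sub(
    r"<<<(.*?)>>>",
    lambda m: str(getattr(led_coeffs, m.group(1))),
    tmplt,
)

# string.Template.substitute also returns a new string rather than editing in place.
from_template = Template("case ${compute_type}:").substitute(
    compute_type=str(led_coeffs.compute_type)
)

print(repr(unchanged))
print(repr(replaced))
print(repr(templated))
print(repr(from_template))
```

The same applies to the Template attempt in the question: the value returned by substitute has to be stored or used, since the Template object itself is not modified.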
