Trying to replace image extensions like "<filename>.<extension>" to "<filename>_resized.<extension>" - python-3.x

I'm trying to use this Python code, which relies on a regular expression, to take all the image files (of types jpg, png and bmp) in my current folder and add the word "resized" between the filename and the extension.
Input
Batman - The Grey Ghost.png
Mom and Dad - Young.jpg
Expected Output
Batman - The Grey Ghost_resized.png
Mom and Dad - Young_resized.jpg
Query
But my output is not as expected. Somehow the 2nd letter of the extension is getting replaced. I have tried tutorials online, but didn't see one which answers my query. Any help would be appreciated.
Code:
import glob
import re
files=glob.glob('*.[jp][pn]g')+glob.glob('*.bmp')
for x in files:
    new_file = re.sub(r'([a-z|0-9]).([jpb|pnm|ggp])$',r'\1_resized.\2',x)
    print(new_file,' : ',x)
Code Output
Ma image scan - Copy.j_resized.g : Ma image scan - Copy.jpg
Ma image scan.j_resized.g : Ma image scan.jpg
Mom and Dad - Young.j_resized.g : Mom and Dad - Young.jpg
PPF - SBI - 4.j_resized.g : PPF - SBI - 4.jpg
when-youre-a-noob-programmer-and-you-think-your-loop-64102565.p_resized.g : when-youre-a-noob-programmer-and-you-think-your-loop-64102565.png
Sample.b_resized.p : Sample.bmp

Try this:
r'([a-zA-Z0-9_ -]+)\.(bmp|jpg|png)$'
Input:
Batman - The Grey Ghost.png
Output:
Batman - The Grey Ghost_resized.png
See live demo.
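To spell out why the original pattern misbehaves: the unescaped `.` matches any character, and `[jpb|pnm|ggp]` is a character class that matches a single character, not an alternation of three extensions. So for "Copy.jpg" the pattern captures "j", lets `.` consume "p", and captures "g". Below is a minimal sketch that applies the pattern from this answer. The function names are made up for illustration, and the `os.path.splitext` variant is an alternative technique that is not part of the answer.

```python
import os
import re

# Pattern from the answer: a stem, a literal dot, then one of three extensions.
PATTERN = re.compile(r'([a-zA-Z0-9_ -]+)\.(bmp|jpg|png)$')

def add_resized_regex(filename):
    # \1 is the stem, \2 is the extension.
    return PATTERN.sub(r'\1_resized.\2', filename)

def add_resized_splitext(filename):
    # Alternative without a regex: split off the last extension.
    stem, ext = os.path.splitext(filename)
    return f'{stem}_resized{ext}'

for name in ['Batman - The Grey Ghost.png', 'Mom and Dad - Young.jpg', 'Sample.bmp']:
    print(add_resized_regex(name), ' : ', name)
```

The regex version only handles stems made of the characters listed in the class. The `splitext` version does not check the extension at all, so it relies on the glob having filtered the files already.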

Related

Regex to find text & value in large text

As I SSH into CM, run commands and start reading the CLI output, I get the following
back:
# * A lot more output above but been removed *
terminal_output = """
[24;1H [79b[1GCommand: disp sys cust<<[23;0H[0;7m [79b[1G[0m[24;0H [79b[1G[1;0H[0;7m [79b[1G[0m[2;0H [79b[1G[3;1H[0J7[1;1H[0;7mdisplay system-parameters customer-options [0m8[1;65H[0;7mPage 1 of 12[0m[2;33HOPTIONAL FEATURES[4;8HG3 Version: [4;20HV20 [4;50HSoftware Package: [4;68HEnterprise [5;10HLocation: [5;20H2[6;10HPlatform: [6;20H28 [5;51HSystem ID (SID): [5;68H9990093751 [6;51HModule ID (MID): [6;68H1 [8;60HUSED[9;29HPlatform Maximum Ports: [9;53H 81000[9;60H 436[10;35HMaximum Stations: [10;53H 135[10;60H 110[11;27HMaximum XMOBILE Stations: [11;53H 41000[11;60H 0[12;17HMaximum Off-PBX Telephones - EC500: [12;53H 135[12;60H 2[13;17HMaximum Off-PBX Telephones - OPS: [13;53H 135[13;60H 40[14;17HMaximum Off-PBX Telephones - PBFMC: [14;53H 135[14;60H 0[15;17HMaximum Off-PBX Telephones - PVFMC: [15;53H 135[15;60H 0[16;17HMaximum Off-PBX Telephones - SCCAN: [16;53H 0[16;60H 0[17;22HMaximum Survivable Processors: [17;53H 313[17;62H 1[22;9H(NOTE: You must logoff & login to effect the permission changes.)[2;50H[0m
"""
It's full of ANSI escape codes (I think?), which makes the output hard to read. What I'm trying to get back from the text above is the following:
Maximum Stations: 135 110
As far as I understand, a regex is required for this.
The regexes I tried, which did not work:
r'Maximum Stations:\s*(\d+)(\d+)'
r'Maximum Stations: \d+'
If anyone knows how to filter out these ANSI character codes so they don't appear in the final output that'd be great too.
Thank you.
You can try the following:
"(Maximum Stations:)\s\[\d*;\d*H\s*(\d*)\[\d*;\d*H\s*(\d*)"gm
It produces three groups: the first holds the "Maximum Stations:" text, and the other two each hold one of the numbers you wanted to capture. You would have to combine the groups to get your final output.
I don't know if this will be generic enough for your application though.
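In Python, combining the groups could look like the sketch below. It assumes the text looks like the excerpt pasted in the question (shortened here). The optional `\x1b` is an addition of mine, in case the raw stream still carries the ESC byte in front of each `[`, and `\d+` is used instead of `\d*` so that empty captures are not accepted.

```python
import re

# Shortened excerpt of the output from the question, as pasted (no ESC bytes shown).
terminal_output = (
    "[9;60H 436[10;35HMaximum Stations: [10;53H 135[10;60H 110"
    "[11;27HMaximum XMOBILE Stations: [11;53H 41000[11;60H 0"
)

# Label, cursor-position code, first number, cursor-position code, second number.
pattern = re.compile(
    r'(Maximum Stations:)\s*\x1b?\[\d*;\d*H\s*(\d+)\x1b?\[\d*;\d*H\s*(\d+)'
)

match = pattern.search(terminal_output)
# Join the three captured groups with single spaces.
result = ' '.join(match.groups()) if match else None
print(result)  # Maximum Stations: 135 110
```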

plupload, preserve_headers = false, and autorotation issue

I have a jpeg image where the EXIF Orientation flag = 6, or "Rotate 90 CW". Here's the pertinent data from exiftool:
---- ExifTool ----
ExifTool Version Number : 12.44
---- File ----
File Name : orig.jpg
Image Width : 4032
Image Height : 3024
---- EXIF ----
Orientation : Rotate 90 CW
Exif Image Width : 4032
Exif Image Height : 3024
---- Composite ----
Image Size : 4032x3024
Here's how IrfanView presents the image, with auto-rotate turned off:
Using the plupload "Getting Started" script from here, with preserve_headers = false, I get an image without EXIF headers - as expected - but rotated 180 degrees, which is unexpected. Again, the view with IrfanView:
Here's the "resize" snippet from the code:
resize: {
    width: 5000,
    height: 5000,
    preserve_headers: false
}
Is there something I'm doing wrong? I would have expected a 90 CW rotation on upload with the EXIF stripped.
Dan
Edit: I'm using plupload V2.3.9
BUMP
I'm getting the exact same result with plupload using these exif samples on github. I chose landscape_6, because its Orientation is the same as in my example ("Rotate 90 CW", or Orientation tag value 6). Here are the before and after upload views using IrfanView with no autorotate, preserve_headers = false:
Aren't these canonical examples for demonstrating exif properties? Unless I'm missing some fundamental point, plupload is busted. I'd much rather it be the former, and someone can tell me the error of my ways.
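For reference, here is a small sketch of what Orientation = 6 implies for a correctly auto-rotated result. The labels are the ones ExifTool prints for the Orientation tag. `displayed_size` is a hypothetical helper, and nothing here is plupload's API or a statement about what plupload does internally.

```python
# EXIF Orientation tag values with the labels ExifTool prints for them.
ORIENTATION_LABELS = {
    1: 'Horizontal (normal)',
    2: 'Mirror horizontal',
    3: 'Rotate 180',
    4: 'Mirror vertical',
    5: 'Mirror horizontal and rotate 270 CW',
    6: 'Rotate 90 CW',
    7: 'Mirror horizontal and rotate 90 CW',
    8: 'Rotate 270 CW',
}

def displayed_size(width, height, orientation):
    """Size after the orientation is applied; values 5 to 8 swap the axes."""
    if orientation in (5, 6, 7, 8):
        return height, width
    return width, height

# Values from the exiftool dump in the question.
print(ORIENTATION_LABELS[6], displayed_size(4032, 3024, 6))
```

So a correctly rotated upload of the 4032x3024 file should come out 3024 wide and 4032 high. A 180 degree rotation would keep it at 4032x3024, which is one quick way to tell the two cases apart.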

Need help in aligning the content in python for self automation

I am trying to create an anime series search using the tool anilistpython, but I am not able to get rid of the newline characters in the plot field, and I need help aligning the output in a readable format.
Tried code :
from AnilistPython import Anilist
import pandas as pd
import re
# db access online
anilist = Anilist()
# User input
ani_search = anilist.get_anime(input('Enter the Anime Name\t:\t'), manual_select=True)
df = ani_search
# for Genres split
cate = []
for gen in df['genres']:
    cate.append(gen)
cate1 = (' , '.join(cate))
# for Checking Episode
if df['airing_status'] == 'RELEASING':
    print('Ongoing')
    x ='Ongoing'
    y = df['next_airing_ep']
    print(y['episode'])
    y1 = y['episode']
elif df['airing_status'] == 'FINISHED':
    print('Ended')
    x = 'Ended'
    y = df['airing_episodes']
    print(y)
    y1 = y
else:
    print('None')
# print other details
print(f"\nTitle_Name\t:\t{df['name_english']}\nRomji_Title\t:\t{df['name_romaji']}\nPlot\t:\t{re.split('<br>', df['desc'])}\nAiring_Format\t:\t{df['airing_format']}\nStatus\t:\t{x}\nEpisodes_Count\t:\t{y1}\nGenres\t:\t{cate1}\nRating\t:\t{df['average_score']}/100\n")
The output it generated :
Enter the Anime Name : Bleach
1. BLEACH
2. BEACH
3. Akkanbee da
Please select the anime that you are searching for in number: 1
Title_Name : Bleach
Romji_Title : BLEACH
Plot : ["Ichigo Kurosaki is a rather normal high school student apart from the fact he has the ability to see ghosts. This ability never impacted his life in a major way until the day he encounters the Shinigami Kuchiki Rukia, who saves him and his family's lives from a Hollow, a corrupt spirit that devours human souls. \n", '', '\nWounded during the fight against the Hollow, Rukia chooses the only option available to defeat the monster and passes her Shinigami powers to Ichigo. Now forced to act as a substitute until Rukia recovers, Ichigo hunts down the Hollows that plague his town. \n\n\n']
Airing_Format : TV
Status : Ended
Episodes_Count : 366
Genres : Action , Adventure , Supernatural
Rating : 76/100
I am looking for the format to look like this:
Title_Name : Bleach
Romji_Title : BLEACH
Plot : Ichigo Kurosaki is a rather normal high school student apart from the fact he has the ability to see ghosts. This ability never impacted his life in a major way until the day he encounters the Shinigami Kuchiki Rukia, who saves him and his family's lives from a Hollow, a corrupt spirit that devours human souls. Wounded during the fight against the Hollow, Rukia chooses the only option available to defeat the monster and passes her Shinigami powers to Ichigo. Now forced to act as a substitute until Rukia recovers, Ichigo hunts down the Hollows that plague his town.
Airing_Format : TV
Status : Ended
Episodes_Count : 366
Genres : Action , Adventure , Supernatural
Rating : 76/100
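One possible way to get the plot onto a single line, sketched under the assumption that `df['desc']` is a plain string containing `<br>` tags and newlines, as the printed list above suggests. `re.split` returns a list, which is why the brackets and `\n` show up in the output. Replacing the tags and collapsing whitespace gives a single string instead. `clean_description` is a hypothetical helper name, and the sample text is shortened.

```python
import re

def clean_description(desc):
    """Drop <br> tags and collapse newlines and repeated spaces into single spaces."""
    without_tags = re.sub(r'<br\s*/?>', ' ', desc, flags=re.IGNORECASE)
    # str.split() with no argument splits on any run of whitespace.
    return ' '.join(without_tags.split())

# Shortened stand-in for df['desc'], with the same tag and newline layout.
sample = "First paragraph. \n<br><br>\nSecond paragraph. \n\n\n"
print(f"Plot\t:\t{clean_description(sample)}")
```

In the original print statement this would replace `re.split('<br>', df['desc'])` with `clean_description(df['desc'])`.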

Python3 Renaming Files By tkinter Listbox

I want to rename all files in a directory using the names from a tkinter listbox.
Got stuck at this point:
files_list = os.listdir(root.foldername)
print(files_list)
gives me
['1.mp4', '10.mp4', '2.mp4', '3.mp4', '4.mp4', '5.mp4', '6.mp4', '7.mp4', '8.mp4', '9.mp4']
values = [listbox.get(idx) for idx in listbox.curselection()]
And
inlist = (', '.join(values))
print(inlist)
gives me
Lost - 1x01 - Pilot(1), Lost - 1x02 - Pilot(2), Lost - 1x03 - Tabula Rasa, Lost - 1x04 - Walkabout, Lost - 1x05 - White Rabbit, Lost - 1x06 - House Of The Rising Sun, Lost - 1x07 - The Moth, Lost - 1x08 - Confidence Man, Lost - 1x09 - Solitary, Lost - 1x10 - Raised By Another
Now I'm looking for a way to use os.rename to rename the files 1.mp4 through 10.mp4.
Additionally, Python does not come with a built-in natural sort, so 1.mp4 is followed by 10.mp4.
Thank you very much in advance.
For natural sorting take a look at Sorting alphanumeric strings in Python.
Then loop through all files and rename them, e.g.:
for i in range(len(files_list)):
    old_file_name = files_list[i]
    new_file_name = values[i] + '.mp4'
    os.rename(old_file_name, new_file_name)
For assistance in dealing with pathnames see os.path.
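Putting the two hints together, a minimal sketch could look like this. The helper names (`natural_key`, `plan_renames`, `rename_all`) and the 'videos' folder are made up for illustration. The key function is one common way to get natural ordering and is not necessarily the one from the linked question. Paths are joined with the folder so the rename does not depend on the current working directory.

```python
import os
import re

def natural_key(name):
    # Split into digit and non-digit runs so '10.mp4' sorts after '2.mp4'.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', name)]

def plan_renames(folder, files_list, new_titles, extension='.mp4'):
    """Pair naturally sorted files with titles as (old_path, new_path) tuples."""
    ordered = sorted(files_list, key=natural_key)
    return [(os.path.join(folder, old), os.path.join(folder, title + extension))
            for old, title in zip(ordered, new_titles)]

def rename_all(folder, files_list, new_titles):
    for old_path, new_path in plan_renames(folder, files_list, new_titles):
        os.rename(old_path, new_path)

# Dry run: print the planned renames without touching any files.
files_list = ['1.mp4', '10.mp4', '2.mp4', '3.mp4']
values = ['Lost - 1x01 - Pilot(1)', 'Lost - 1x02 - Pilot(2)',
          'Lost - 1x03 - Tabula Rasa', 'Lost - 1x04 - Walkabout']
for old_path, new_path in plan_renames('videos', files_list, values):
    print(old_path, '->', new_path)
```

In the question's program the call would be `rename_all(root.foldername, files_list, values)`. Note that `zip` stops at the shorter list, so files without a matching title are left untouched.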

Scraping youtube playlist

I've been trying to write a Python script that fetches the names of the songs in a playlist whose link is passed in from the terminal, e.g. https://www.youtube.com/watch?v=foE1mO2yM04&list=RDGMEMYH9CUrFO7CfLJpaD7UR85wVMfoE1mO2yM04
I've found out that names could be extracted by using "li" tag or "h4" tag.
I wrote the following code,
import sys
link = sys.argv[1]
from bs4 import BeautifulSoup
import requests
req = requests.get(link)
try:
    req.raise_for_status()
except Exception as exc:
    print('There was a problem:',exc)
soup = BeautifulSoup(req.text,"html.parser")
Then I tried using li-tag as:
i=soup.findAll('li')
print(type(i))
for o in i:
    print(o.get('data-video-title'))
But it printed "None" once for every tag. I believe it is not reaching the li tags that contain the data-video-title attribute.
Then I tried using div and h4 tags as,
for i in soup.findAll('div', attrs={'class':'playlist-video-description'}):
    o = i.find('h4')
    print(o.text)
But again nothing was printed.
import requests
from bs4 import BeautifulSoup
url = 'https://www.youtube.com/watch?v=foE1mO2yM04&list=RDGMEMYH9CUrFO7CfLJpaD7UR85wVMfoE1mO2yM04'
data = requests.get(url)
data = data.text
# name the parser explicitly, as in the question's code
soup = BeautifulSoup(data, "html.parser")
h4 = soup.find_all("h4")
for h in h4:
    print(h.text)
output:
Mike Posner - I Took A Pill In Ibiza (Seeb Remix) (Explicit)
Alan Walker - Faded
Calvin Harris - This Is What You Came For (Official Video) ft. Rihanna
Coldplay - Hymn For The Weekend (Official video)
Jonas Blue - Fast Car ft. Dakota
Calvin Harris & Disciples - How Deep Is Your Love
Galantis - No Money (Official Video)
Kungs vs Cookin’ on 3 Burners - This Girl
Clean Bandit - Rockabye ft. Sean Paul & Anne-Marie [Official Video]
Major Lazer - Light It Up (feat. Nyla & Fuse ODG) [Remix] (Official Lyric Video)
Robin Schulz - Sugar (feat. Francesco Yates) (OFFICIAL MUSIC VIDEO)
DJ Snake - Middle ft. Bipolar Sunshine
Jonas Blue - Perfect Strangers ft. JP Cooper
David Guetta ft. Zara Larsson - This One's For You (Music Video) (UEFA EURO 2016™ Official Song)
DJ Snake - Let Me Love You ft. Justin Bieber
Duke Dumont - Ocean Drive
Galantis - Runaway (U & I) (Official Video)
Sigala - Sweet Lovin' (Official Video) ft. Bryn Christopher
Martin Garrix - Animals (Official Video)
David Guetta & Showtek - Bad ft.Vassy (Lyrics Video)
DVBBS & Borgeous - TSUNAMI (Original Mix)
AronChupa - I'm an Albatraoz | OFFICIAL VIDEO
Lilly Wood & The Prick and Robin Schulz - Prayer In C (Robin Schulz Remix) (Official)
Kygo - Firestone ft. Conrad Sewell
DEAF KEV - Invincible [NCS Release]
Eiffel 65 - Blue (KNY Factory Remix)
OK guys, I have figured out what was happening. My code worked fine; the problem was that I was passing the link as an argument from the terminal and, coincidentally, the link contained symbols that the shell interprets specially, e.g. '&'.
Now I am passing the link as a quoted string in the terminal and everything works fine. Such a dumb yet time-consuming mistake.
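A small sanity check can catch this early. In a POSIX-style shell an unquoted `&` ends the command, so the script only receives the part of the link before it. The sketch below assumes that behaviour and simulates the truncated argument by cutting the link at the first `&`. `playlist_id` is a hypothetical helper.

```python
from urllib.parse import urlparse, parse_qs

def playlist_id(link):
    """Return the value of the 'list' query parameter, or None if it is missing."""
    values = parse_qs(urlparse(link).query).get('list')
    return values[0] if values else None

full = ('https://www.youtube.com/watch?v=foE1mO2yM04'
        '&list=RDGMEMYH9CUrFO7CfLJpaD7UR85wVMfoE1mO2yM04')
# Roughly what sys.argv[1] holds when the link is passed unquoted.
truncated = full.split('&')[0]

print(playlist_id(full))       # the playlist id
print(playlist_id(truncated))  # None: the 'list' parameter never reached the script
```

Checking `playlist_id(sys.argv[1])` at the top of the script and printing a hint to quote the link would make this mistake obvious right away.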

Resources