How to switch the base of a path using pathlib? - python-3.x

I am trying to get a part of a path by removing the base, currently this is what I'm doing:
original = '/tmp/asd/asdqwe/file'
base = '/tmp/asd/'
wanted_part = original.strip(base)
Unfortunately, instead of getting 'asdqwe/file' I'm getting 'qwe/file'; strip behaves in a way I don't understand.
The best solution for my problem would use pathlib.Path, because my function receives its arguments as paths, and its return value is the trimmed part converted back into a Path after a new base path has been prepended.
But if no pathlib solution is available, a string-based one would also be great; currently I'm dealing with a weird bug...

You are misinterpreting how str.strip works. The method will remove all characters specified in the argument from the "edges" of the target string, regardless of the order in which they are specified:
original = '/tmp/asd/asdqwe/file'
base = '/tmp/asd/'
wanted_part = original.strip(base)
print(wanted_part)
# qwe/file
What you probably want is slicing:
wanted_part = original[len(base):]
print(wanted_part)
# asdqwe/file
Or, using pathlib:
from pathlib import Path
original = Path('/tmp/asd/asdqwe/file')
base = Path('/tmp/asd/')
wanted_part = original.relative_to(base)
print(wanted_part)
# asdqwe/file
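Since the goal is to swap the base rather than just drop it, the relative part can be joined onto a new base with the / operator. Below is a minimal sketch, assuming a hypothetical new base of /srv/data (not from the question); PurePosixPath is used only so the result looks the same on every platform, and Path behaves the same way on a POSIX system.

```python
from pathlib import PurePosixPath

original = PurePosixPath('/tmp/asd/asdqwe/file')
base = PurePosixPath('/tmp/asd/')
new_base = PurePosixPath('/srv/data')  # hypothetical target base, for illustration only

# relative_to raises ValueError if original is not located under base
rebased = new_base / original.relative_to(base)
print(rebased)
```

Catching ValueError around relative_to is one way to handle inputs that do not start with the expected base.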

strip removes a set of characters, not a string prefix or suffix, so it keeps removing any of the characters you passed for as long as they appear at either end. Instead, you can test whether the original starts with your base and, if it does, take the remaining characters of the string, i.e. everything after the length of the base.
original = '/tmp/asd/asdqwe/file'
base = '/tmp/asd/'
if original.startswith(base):
    wanted_part = original[len(base):]
    print(wanted_part)
OUTPUT
asdqwe/file
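On Python 3.9 or newer, str.removeprefix combines the startswith check and the slice into one call. The sketch below contrasts it with strip, using the same values as the question; it assumes Python 3.9+ is available.

```python
original = '/tmp/asd/asdqwe/file'
base = '/tmp/asd/'

# strip treats its argument as a set of characters to remove from both ends
stripped = original.strip(base)

# removeprefix (Python 3.9+) removes the exact leading substring, if present
wanted_part = original.removeprefix(base)

print(stripped)
print(wanted_part)
```

If the string does not start with base, removeprefix returns it unchanged, which matches the behaviour of the startswith version when the condition is false.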

Related

Get number from string in Python

I have a string, I have to get digits only from that string.
url = "www.mylocalurl.com/edit/1987"
Now from that string, I need to get 1987 only.
I have been trying this approach,
id = [int(i) for i in url.split() if i.isdigit()]
But I am only getting an empty list, [].
You can use regex and get the digit alone in the list.
import re
url = "www.mylocalurl.com/edit/1987"
digit = re.findall(r'\d+', url)
output:
['1987']
Replace all non-digits with blank (effectively "deleting" them):
import re
num = re.sub(r'\D', '', url)
You aren't getting anything because, by default, the .split() method splits a string on whitespace. Since the URL contains no spaces, nothing is split, and the single resulting item is not made up of digits only. What you can do instead is use a regex capture group. For example:
import re
url = "www.mylocalurl.com/edit/1987"
regex = r'(\d+)'
numbers = re.search(regex, url)
captured = numbers.groups()[0]
If you do not know what regular expressions are, the code is basically saying: using the regex string r'(\d+)', which means "capture a run of one or more digits", search through the url. captured then holds the first captured group, which is the string 1987.
If you don't want to use regex, you can still use your .split() method, but this time with / as the separator, for example url.split('/').
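The split-based approach can be sketched as follows. It assumes the number is always the last segment of the URL, as in the question's example.

```python
url = "www.mylocalurl.com/edit/1987"

# Split on "/" and take the last segment
last_segment = url.split('/')[-1]

# Convert only if the segment consists purely of digits
number = int(last_segment) if last_segment.isdigit() else None
print(number)
```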

Problem with multivariables in string formatting

I have several files in a folder named t_000.png, t_001.png, t_002.png and so on.
I have made a for-loop to import them using string formatting. But when I use the for-loop I got the error
No such file or directory: '/file/t_0.png'
This is the code that I have used. I think I should use multiple %s, but I do not understand how.
for i in range(file.shape[0]):
    im = Image.open(dir + 't_%s.png' % str(i))
    file[i] = im
You need to pad the string with leading zeroes. With the type of formatting you're currently using, this should work:
im = Image.open(dir + 't_%03d.png' % i)
where the format specifier %03d means "format this integer with a width of at least 3 characters, padding any empty space with leading zeroes".
You can also use python's other (more recent) string formatting syntax, which is somewhat more succinct:
im = Image.open(f"{dir}t_{i:03d}.png")
You are not padding the number with zeros, thus you get t_0.png instead of t_000.png.
The recommended way of doing this in Python 3 is via the str.format function:
for i in range(file.shape[0]):
    im = Image.open(dir + 't_{:03d}.png'.format(i))
    file[i] = im
You can see more examples in the documentation.
Formatted string literals are also an option if you are using Python 3.6 or a more recent version, see Green Cloak Guy's answer for that.
Try this:
import os
for i in range(file.shape[0]):
    im = Image.open(os.path.join(dir, f't_{i:03d}.png'))
    file[i] = im
(change: f't_{i:03d}.png' to 't_{:03d}.png'.format(i) or 't_%03d.png' % i for versions of Python prior to 3.6).
The trick was to specify a certain number of leading zeros, take a look at the official docs for more info.
Also, you should replace 'dir + file' with the more robust os.path.join(dir, file), which works regardless of whether dir ends with a directory separator (i.e. '/' for your platform) or not.
Note also that dir is the name of a built-in function in Python (and file was a built-in in Python 2), so you may want to rename your variables to avoid shadowing them.
Also note that if file is a NumPy array, file[i] = im may not work as intended.
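The zero padding and the path joining can also be combined with pathlib. This is only a sketch: the folder /file is taken from the error message in the question, count is a hypothetical number of images, and PurePosixPath is used so the printed paths are the same on every platform.

```python
from pathlib import PurePosixPath

folder = PurePosixPath('/file')  # folder taken from the error message in the question
count = 3  # hypothetical number of images, for illustration only

# {i:03d} pads the integer with leading zeros to a width of 3
paths = [folder / f"t_{i:03d}.png" for i in range(count)]
for p in paths:
    print(p)
```

In real code each path would then be passed on to Image.open, with Path used in place of PurePosixPath.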

I need to pull a specific string from a URL path

I am pulling the following URL from a JSON on the internet.
Example of the string I am working with:
http://icons.wxug.com/i/c/k/nt_cloudy.gif
I need to get just the "nt_cloudy" from the above in order to write the img (already stored) to an epaper display for a weather app. I have tried re.split() but only ever get the full string back, no matter what I split on.
Everything else works, if I manually enter the filename, I can display the image, however the weather conditions change, so I need to pull the name from the JSON. Again, it is only locating the specific string within the full string I am stuck on.
imgurl = weatherinfo['current_observation']['icon_url'] # http://icons.wxug.com/i/c/k/nt_cloudy.gif
img_condition = re.split('\/ |\// |.', imgurl)
image_1 = "/home/pi/epaper/python2/icons/" + img_condition + ".bmp"
Please check this,
import re
imgurl = weatherinfo['current_observation']['icon_url'] # http://icons.wxug.com/i/c/k/nt_cloudy.gif
img_condition = re.split(r'/', imgurl)[-1]
image_1 = "/home/pi/epaper/python2/icons/" + img_condition[:-4] + ".bmp"
If you are confident that the path will always end with the image filename, and won't have a query string after it (e.g., nt_cloudy.gif?foo=bar&x=y&...) you can just use the filesystem path functions from Python's os.path standard module.
https://docs.python.org/3/library/os.path.html
#!/usr/bin/env python
import os
URL = 'http://icons.wxug.com/i/c/k/nt_cloudy.gif'
FILENAME = os.path.basename(URL)
If you are trying to decode a URL that might include a query string, you may prefer to use the urllib.parse module.
https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse
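A minimal sketch of the urllib.parse route, which also drops the file extension so that only nt_cloudy remains. The query string in the example URL is added purely for illustration and is not part of the original URL.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Query string appended for illustration only
imgurl = 'http://icons.wxug.com/i/c/k/nt_cloudy.gif?foo=bar'

# urlparse separates the path from the query string
url_path = urlparse(imgurl).path

# .stem is the final path component without its suffix
img_condition = PurePosixPath(url_path).stem
print(img_condition)
```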
I won't go into detail about why your regular expression isn't working the way you expect, because honestly, hand-crafting regular expressions is overkill for this use-case.
You could use the regular expression below (shown here in JavaScript):
let regex = /(\w+)\.gif/g.exec("http://icons.wxug.com/i/c/k/nt_cloudy.gif")
if(regex != null && regex.length == 2)
    console.log(regex[1]);
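Since the question is about Python, here is a sketch of the same pattern with the re module. It assumes, as the pattern itself does, that the icon is always a .gif file.

```python
import re

imgurl = 'http://icons.wxug.com/i/c/k/nt_cloudy.gif'

# Capture the run of word characters immediately before ".gif"
match = re.search(r'(\w+)\.gif', imgurl)
img_condition = match.group(1) if match else None
print(img_condition)
```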

Matlab sprintf incorrect result using random strings from list

I want to create a string variable using sprintf and a random name from a list (in order to save an image with such a name). A draft of the code is the following:
Names = [{'C'} {'CL'} {'SCL'} {'A'}];
nameroulette = ceil(rand(1)*4)
filename = sprintf('DG_%d.png', Names{1,nameroulette});
But when I check filename, what I get is the text I typed followed not by one of the strings, but by a number that I have no idea where it comes from. For example, if my nameroulette = 1 then filename is DG_67.png, and if nameroulette = 4, filename = 'DG_65.png' . Where does this number come from and how can I fix this problem?
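The %d specifier formats the string's character code as a number (67 is the code for 'C', and 65 is the code for 'A'). To insert the text itself, you just need to change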
filename = sprintf('DG_%d.png', Names{1,nameroulette});
to
filename = sprintf('DG_%s.png', Names{1,nameroulette});
By the way, you may want to have a look at the randi command for drawing random integers.
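The numbers in the question are character codes. The effect is not specific to MATLAB; as a rough analogy only, the Python sketch below reproduces both results by converting the character to its code explicitly with ord.

```python
name = 'C'

# ord gives the numeric code of a single character
code = ord(name)

as_number = 'DG_%d.png' % code  # mirrors what %d produced in the question
as_text = 'DG_%s.png' % name    # %s inserts the text itself
print(as_number)
print(as_text)
```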

matlab iterative filenames for saving

This question is about MATLAB:
I'm running a loop, and each iteration produces a new set of data that I want to save in a new file. I also overwrite old files by changing the name. It looks like this:
name_each_iter = strrep(some_source,'.string.mat','string_new.(j).mat')
What I'm struggling with is the iteration, so that I obtain files:
...string_new.1.mat
...string_new.2.mat
etc.
I was trying with various combination of () [] {} as well as 'string_new.'j'.mat' (which gave syntax error)
How can it be done?
Strings are just vectors of characters. So if you want to iteratively create filenames here's an example of how you would do it:
for j = 1:10,
    filename = ['string_new.' num2str(j) '.mat'];
    disp(filename)
end
The above code will create the following output:
string_new.1.mat
string_new.2.mat
string_new.3.mat
string_new.4.mat
string_new.5.mat
string_new.6.mat
string_new.7.mat
string_new.8.mat
string_new.9.mat
string_new.10.mat
You could also generate all file names in advance using NUM2STR:
>> filenames = cellstr(num2str((1:10)','string_new.%02d.mat'))
filenames =
'string_new.01.mat'
'string_new.02.mat'
'string_new.03.mat'
'string_new.04.mat'
'string_new.05.mat'
'string_new.06.mat'
'string_new.07.mat'
'string_new.08.mat'
'string_new.09.mat'
'string_new.10.mat'
Now access the cell array contents as filenames{i} in each iteration
sprintf is very useful for this:
for ii=5:12
    filename = sprintf('data_%02d.mat',ii)
end
this assigns the following strings to filename:
data_05.mat
data_06.mat
data_07.mat
data_08.mat
data_09.mat
data_10.mat
data_11.mat
data_12.mat
Notice the zero padding. sprintf in general is useful if you want parameterized formatted strings.
For creating a name based on an already existing file, you can use regexp to detect the '_new.(number).mat' part and change the string depending on what regexp finds:
original_filename = 'data.string.mat';
im = regexp(original_filename,'_new.\d+.mat')
if isempty(im) % original file, no _new.(j) detected
    newname = [original_filename(1:end-4) '_new.1.mat'];
else
    num = str2double(original_filename(im(end)+5:end-4));
    newname = sprintf('%s_new.%d.mat',original_filename(1:im(end)-1),num+1);
end
This does exactly that, and produces:
data.string_new.1.mat
data.string_new.2.mat
data.string_new.3.mat
...
data.string_new.9.mat
data.string_new.10.mat
data.string_new.11.mat
when the code above is applied repeatedly, starting with 'data.string.mat' and feeding each new name back in as original_filename.
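For comparison, the same renaming logic can be sketched in Python. This is an illustrative port, not MATLAB code: next_name is a hypothetical helper, and it assumes every filename ends in .mat.

```python
import re

def next_name(filename):
    """Return filename with its '_new.N' counter added or incremented."""
    match = re.search(r'_new\.(\d+)\.mat$', filename)
    if match is None:
        # No counter yet: insert '_new.1' before the '.mat' extension
        return filename[:-len('.mat')] + '_new.1.mat'
    number = int(match.group(1))
    return f"{filename[:match.start()]}_new.{number + 1}.mat"

name = 'data.string.mat'
for _ in range(3):
    name = next_name(name)
    print(name)
```

Unlike the MATLAB pattern above, the dots are escaped here so that they match only a literal dot.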

Resources