I have a directory that contains a .sql file and multiple data files.
/home/barmar/test.dir
├── database.sql
├── table1.unl
├── table2.unl
├── table3.unl
├── table4.unl
├── table5.unl
└── table6.unl
The .sql file contains an unload instruction for every .unl file. The issue I have is that the names of the .unl files do not match the instructions in the .sql file.
Normally the names should be of the form TABLE_TABID.unl. I'm looking for a way to retrieve the names from the .sql file and rename the .unl files accordingly.
The .sql file contains multiple instructions; here's an example of a line that contains a correct name:
{ unload file name = table1_89747.unl number of rows = 8376}
As you can see, the only thing the two names have in common is the table name (table1).
The expected result should be something like this:
/home/barmar/test.dir
├── database.sql
├── table1_89747.unl
├── table2_89765.unl
├── table3_89745.unl
├── table4_00047.unl
├── table5_00787.unl
└── table6_42538.unl
This sed line will generate commands to rename files like table1.unl to names like table1_89747.unl:
sed -n 's/.*name = \([^_]*\)\(_[^.]*\.unl\).*/mv '\''\1.unl'\'' '\''\1\2'\''/p' <database.sql
Assumptions: spaces exist around the = sign, and the filename is of the form FOO_BAR.unl, i.e. the underscore character and the extension are always present.
Sample output:
$ echo '{ unload file name = table1_89747.unl number of rows = 8376}' | sed -n 's/.*name = \([^_]*\)\(_[^.]*\.unl\).*/mv '\''\1.unl'\'' '\''\1\2'\''/p'
mv 'table1.unl' 'table1_89747.unl'
To generate and execute the commands:
eval $(sed -n 's/.*name = \([^_]*\)\(_[^.]*\.unl\).*/mv '\''\1.unl'\'' '\''\1\2'\'';/p' <database.sql | tr -d '\n')
It goes without saying: before running this, make sure your database.sql doesn't contain malicious strings that could lead to renaming files outside the current directory.
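If you'd rather avoid eval on shell code generated from a file, the same idea fits in a short Python sketch (same assumptions as above; it also assumes database.sql and the .unl files sit in the current directory):
import os
import re

# Same assumptions as the sed command: spaces around '=', and the
# name in the .sql always has the form FOO_BAR.unl.
pattern = re.compile(r'name = (([^_]*)_[^.]*\.unl)')

with open('database.sql') as sql:
    for line in sql:
        m = pattern.search(line)
        if m:
            new_name, table = m.group(1), m.group(2)
            old_name = table + '.unl'
            if os.path.exists(old_name):
                os.rename(old_name, new_name)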
I'd like to run fzf file finder (inside vim) on a custom directory, but the directory name varies at runtime.
For example, say vim is started at the root of the project directory. The subdirectories look like:
$ tree -ad
.
├── docs
├── .notes
│ ├── issue_a
│ └── new_feature
├── README
├── src
└── tests
Every time I create a branch, I also create a directory .notes/BRANCH_NAME where I store notes, test results, etc. The .notes directory itself is ignored by git.
I'd like to run FZF on the .notes/BRANCH_NAME directory. The branch name will come from a function (say, using https://github.com/itchyny/vim-gitbranch).
I am able to run fzf on the .notes directory with :FZF .notes, but I don't know how to run it on the branch directory within .notes.
Thanks!
Edit: Added what I'd tried:
I tried saving the output of gitbranch#name() to a variable and then using it to call fzf#run(), but it didn't quite work:
function! NotesDir()
  let branch = gitbranch#name()
  let ndir = ".notes/" . branch
  call fzf#run({'source': ndir})
endfunction
command! NotesDir call NotesDir()
When I run :NotesDir while on branch issue_a, I see the fzf window with an error:
> < [Command failed: .notes/issue_a]
.notes/issue_a indicates that the ndir variable holds the correct notes directory path, but I couldn't figure out how to pass it to fzf.
Looking at the documentation for fzf#run(), it looks like source can be either a string, interpreted as a shell command to execute, or a list, used as-is to populate the plugin's window.
So your mistake was passing a path as source: it was interpreted as an external command, when it should have been either an external command or a list of paths.
It also says that you should at least provide a sink, but some examples omit it, so YMMV.
If I am reading that section correctly, the following approaches should work:
" with a string as external command
call fzf#run({'source': 'find ' .. ndir, 'sink': 'e'})
" with a list
let list = globpath('.', ndir, O, 1)
\ ->map({ _, val -> val->fnamemodify(':.')})
call fzf#run({'source': list, 'sink': 'e'})
NOTE: I don't use that plugin so this is not tested.
I am using the following structure to separate my host_vars into plaintext and encrypted files:
ansible
├── ansible.cfg
├── host_vars
│ ├── host1
│ │ ├── vars
│ │ └── vault
│ └── host2
│ ├── vars
│ └── vault
├── inventory
├── site.yaml
└── vars
└── ansible_vars.yaml
Is there a way, using ansible-vault, to encrypt both files named vault, or do I have to do them one by one?
I'm asking because there are more to come, e.g. in future group_vars directories.
I know this works
ansible-vault encrypt host_vars/host1/vault host_vars/host2/vault
I'm just asking whether there is a more elegant or quicker solution.
Shell expansions offer a lot of possibilities.
Here are two that are interesting in your case:
The asterisk (*) expansion, which is used as a wildcard.
This means that host_vars/*/vault would match both host_vars/host1/vault and host_vars/host2/vault, as well as any others added in the future.
Mind that, if you ever have a more complex folder hierarchy, host_vars/*/vault will only match one folder level (e.g. it won't match host_vars/level1/host1/vault). Multiple folder levels can be matched with a double asterisk (actually named globstar): host_vars/**/vault will match
host_vars/host1/vault as well as host_vars/level1/host1/vault
Brace expansion, on the other hand, offers a more granular set of possibilities. For example, if I have hosts named after distributions, like RedHat[1..5], Ubuntu[1..5] and Debian[1..5], I could target only the Ubuntu and RedHat ones via host_vars/{Ubuntu*,RedHat*}/vault.
Or target only the first three of each with host_vars/{Ubuntu{1..3},RedHat{1..3}}/vault, or the first three of all of them via host_vars/*{1..3}/vault.
As a more practical example: if you were to manage Stack Exchange via Ansible and wanted to encrypt the files for *.stackexchange.com and stackoverflow.com, but not superuser.com or any other Q&A site with its own domain name, then, given hosts named after their DNS names, you could do
ansible-vault encrypt host_vars/{stackoverflow.com,*.stackexchange.com}/vault
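For illustration, the single- vs double-asterisk distinction can be reproduced with Python's glob module (a small sketch using the layout from the question; note that in bash the double-asterisk behavior additionally requires shopt -s globstar):
import glob

# Single asterisk: exactly one directory level.
print(glob.glob('host_vars/*/vault'))
# e.g. ['host_vars/host1/vault', 'host_vars/host2/vault']

# Double asterisk (globstar): any number of levels; Python needs recursive=True.
print(glob.glob('host_vars/**/vault', recursive=True))
# e.g. additionally picks up 'host_vars/level1/host1/vault'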
I will just throw in my quick, super-simple shell script, which worked for my simple use case.
It can surely be improved, but I think it's a good starting point.
You could also use a secret file via the --vault-password-file parameter.
#!/bin/sh
echo "Choose an option:"
echo "(d) Decrypt all Ansible vault files"
echo "(e) Encrypt all Ansible vault files"
read -r option

decrypt() {
    ansible-vault decrypt --ask-vault-pass \
        ansible/*/environments/development/group_vars/*/vault.yaml \
        ansible/*/environments/development/host_vars/*/vault.yaml
}

encrypt() {
    ansible-vault encrypt --ask-vault-pass \
        ansible/*/environments/development/group_vars/*/vault.yaml \
        ansible/*/environments/development/host_vars/*/vault.yaml
}

case $option in
    d)
        decrypt
        ;;
    e)
        encrypt
        ;;
    *)
        echo "Wrong option"
        ;;
esac
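If you'd rather drive the same glob-then-encrypt idea from Python, here is a minimal sketch (the host_vars layout is the one from the question; the vault file names are assumptions):
import glob
import subprocess

# Collect every host_vars/<host>/vault file, one level deep,
# matching the layout shown in the question.
vault_files = glob.glob('host_vars/*/vault')
if vault_files:
    subprocess.run(
        ['ansible-vault', 'encrypt', '--ask-vault-pass', *vault_files],
        check=True,
    )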
I have a directory with many sub-directories, each of which follows the same naming convention: the day's date. Today a folder was made: 2021-04-22
I occasionally need to go through these directories and read a file from one, but once I've read it I don't need to again.
li = []
for root, dirs, files in os.walk(path):
for f in files:
li.append(f)
The list shows me the order in which the files are read, which is alphabetical (lexicographic) order. I know the newest files are going to be towards the bottom because of the naming convention.
How can I start my for loop from the 'end' rather than the 'beginning'?
If this is possible, I'd then exit the loop once my criteria are met; otherwise, what would be the point of starting at the end?
EDIT: My original naming convention was mistyped. It is YYYY-MM-DD. Thank you @null.
To reverse any sequence in Python, wrap it in reversed().
In your code:
li = []
for root, dirs, files in os.walk(path):
    for f in reversed(files):
        li.append(f)
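Since os.walk with topdown=True (the default) honors in-place edits of dirs, you can also steer the walk itself so the newest date directories are visited first, and exit early once your criteria are met. A sketch, where meets_criteria is a hypothetical stand-in for your check:
import os

found = None
for root, dirs, files in os.walk(path):
    dirs.sort(reverse=True)  # visit newest YYYY-MM-DD directories first
    for f in sorted(files, reverse=True):
        if meets_criteria(f):  # hypothetical predicate
            found = os.path.join(root, f)
            break
    if found:
        break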
Suppose you have this tree of directories:
.
├── 1
│ ├── a
│ │ ├── 03-01-2021
│ │ └── 04-22-2021
│ ├── b
│ │ └── 04-21-2021
│ └── c
├── 2
│ ├── a
│ │ └── 05-01-2020
│ ├── b
│ └── c
│ └── 01-01-1966
└── 3
├── a
│ ├── 12-15-2001
│ └── 12-15-2001_blah
├── b
└── c
You can use pathlib with a recursive glob to get your directories. Then use a regex to rearrange the date pattern into ISO 8601 format (YYYY-MM-DD) and sort in reverse:
import re
from pathlib import Path

p = Path('/tmp/test/')

my_glob = '**/[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]*'
my_regex = r'.*/(\d{2})-(\d{2})-(\d{4}).*'

for pa in sorted(
        [pa for pa in p.glob(my_glob) if pa.is_dir()],
        key=lambda pa: re.sub(my_regex, r'\3-\2-\1', str(pa)),
        reverse=True):
    print(pa)
Prints:
/tmp/test/1/a/04-22-2021
/tmp/test/1/b/04-21-2021
/tmp/test/1/a/03-01-2021
/tmp/test/2/a/05-01-2020
/tmp/test/3/a/12-15-2001_blah
/tmp/test/3/a/12-15-2001
/tmp/test/2/c/01-01-1966
The **/ prefix makes the search recursive, and the rest of the pattern:
**/[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]*
ensures that only files and directories matching that naming convention are returned. By adding the test if pa.is_dir() we only look at directories, not files.
The regex:
my_regex = r'.*/(\d{2})-(\d{2})-(\d{4}).*'
re.sub(my_regex, r'\3-\2-\1', str(pa))
removes everything other than the date and rearranges it into ISO 8601 order for the key passed to sorted.
You asked about the default order in which files are returned. Often they come back oldest to newest, but no order is guaranteed: it is OS- and implementation-dependent.
You updated the question to say your directories DO use the YYYY-MM-DD naming convention. If so, just change or remove the regex; the same basic method handles both.
Since files is a list, you can use extended slice notation to reverse it:
li = []
for root, dirs, files in os.walk(path):
    for f in files[::-1]:
        li.append(f)
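Note that os.walk makes no ordering guarantee, so if you depend on the date-based names, it is safer to sort explicitly rather than rely on the returned order (a small variation on the above):
li = []
for root, dirs, files in os.walk(path):
    for f in sorted(files, reverse=True):  # newest YYYY-MM-DD names first
        li.append(f)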
I'm using the 7zip command line interface to extract archives, like so:
7za.exe x -y {path_to_zipfile} -o{path_to_target_folder}
If my zipfile is named my_archive.7z, then I get the following filestructure in the target folder:
🗁 target_folder
└─ 🗁 my_archive
├─ 🗋 foo.png
├─ 🗁 bar
│ ├─ 🗋 baz.txt
│ └─ 🗋 qux.txt
...
However, I don't want the subfolder 🗁 my_archive. I'm looking for flags to apply on the 7zip command such that everything extracts directly in the target folder, without creating the 🗁 my_archive subfolder.
NOTES
I can't replace x with e because the file structure shouldn't be lost (the e flag flattens all files to the top level).
I'm working on a Windows 10 computer, but the solution must also work on Linux.
I'm using the following version: 7-Zip (a) 19.00 (x64)
Some background info: I'm calling 7zip from a Python program, like so:
# Variables:
# 'sevenzip_abspath': absolute path to 7za executable
# 'zipfile_abspath': absolute path to zipped file (`.7z` format)
# 'targetdir_abspath': absolute path to target directory
commandlist = [
    sevenzip_abspath,
    'x',
    '-y',
    zipfile_abspath,
    f'-o{targetdir_abspath}',
]
output = subprocess.Popen(
    commandlist,
    stdout=subprocess.PIPE,
    shell=False,
).communicate()[0]
if output is not None:
    print(output.decode('utf-8'))
I know I could do all kinds of things in Python after the unzipping has finished (move/rename directories, etc.), but that's plan B. First I want to check if there is an elegant solution.
I'd like to stick to 7zip for reasons that would lead us too far here.
You can rename the top-level folder inside the archive to match the target folder before extracting it:
7za rn {path_to_zipfile} my_archive target_folder
This permanently changes the archive. If you don't want that, make a copy first.
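In the Python context of the question, the rename can be issued the same way as the extraction (a sketch; 'my_archive' and 'target_folder' are the example names from above, and the variables follow the question):
import subprocess

# Rename the archive's top-level folder (modifies the archive in place,
# so work on a copy if needed), then extract as before.
subprocess.run(
    [sevenzip_abspath, 'rn', zipfile_abspath, 'my_archive', 'target_folder'],
    check=True,
)
subprocess.run(
    [sevenzip_abspath, 'x', '-y', zipfile_abspath, f'-o{targetdir_abspath}'],
    check=True,
)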
I would like to recursively search "project" directories for a "Feedback Report" folder, and if that folder has no more sub-directories, process its files in a particular manner.
Once the target directory is reached, I want to find the latest feedback report.xlsx in it (the directory will contain many previous versions).
The data is really huge and its directory structure is inconsistent. I believe the following algorithm should bring me close to my desired behavior, but I'm still not sure. I have tried multiple scrappy scripts that convert the tree into a JSON path hierarchy and then parse it, but the inconsistency makes the code huge and unreadable.
The path of the file is important.
The algorithm I would like to implement is:
dictionary_of_files_paths = {}

def recursive_traverse(path):
    # not sure if this is the right base case
    if path.isdir:
        if re.match(dir_name, *eedback*port*) and dir has no sub directory:
            process(path, files)
            return
    for contents in os.listdir(path):
        recursive_traverse(os.path.join(path, contents))
    return

def process(path, files):
    files.filter(filter files only with xlsx)
    files.filter(filter files only that have *eedback*port* in it)
    files.filter(os.path.getmtime > 2016)
    files.sort(key=lambda x: os.path.getmtime(x))
    reversed(files)
    dictionary_of_files_paths[path] = files[0]

recursive_traverse("T:\\Something\\Something\\Projects")
I need guidance before I actually implement this, and I'd like to validate whether it is correct.
Here is another snippet for the path hierarchy that I got from Stack Overflow:
try:
    for contents in os.listdir(path):
        recursive_traverse(os.path.join(path, contents))
except OSError as e:
    if e.errno != errno.ENOTDIR:
        raise
# file
Use pathlib and glob.
Test directory structure:
.
├── Untitled.ipynb
├── bar
│ └── foo
│ └── file2.txt
└── foo
├── bar
│ └── file3.txt
├── foo
│ └── file1.txt
└── test4.txt
Code:
from pathlib import Path

here = Path('.')
for subpath in here.glob('**/foo/'):
    if any(child.is_dir() for child in subpath.iterdir()):
        continue  # skip the current path if it has child directories
    for file in subpath.iterdir():
        print(file.name)
        # process your files here according to whatever logic you need
Output:
file1.txt
file2.txt
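If you also need the "latest feedback report.xlsx" step from the question, here is a sketch combining the same glob approach with the question's filters (the *eedback*port* patterns, the 2016 cutoff, and the root path come straight from the question; treat them as placeholders):
import datetime
from pathlib import Path

cutoff = datetime.datetime(2017, 1, 1).timestamp()  # "newer than 2016"
latest_report_per_dir = {}

for subpath in Path('T:/Something/Something/Projects').glob('**/*eedback*port*'):
    if not subpath.is_dir():
        continue
    if any(child.is_dir() for child in subpath.iterdir()):
        continue  # only leaf "Feedback Report" directories
    reports = [f for f in subpath.glob('*eedback*port*.xlsx')
               if f.stat().st_mtime > cutoff]
    if reports:
        latest_report_per_dir[subpath] = max(
            reports, key=lambda f: f.stat().st_mtime)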