Create a text file containing list of (relative paths to) files in a directory? - linux

Suppose I am in a directory that contains another directory called inside_dir, which holds a huge number of files. I want to create a file listing all the files in inside_dir, each as a path relative to the current directory. That is, if there is a file called file1 inside inside_dir, the corresponding line in the list file should be inside_dir/file1.
Since the number of files is huge, just doing ls inside_dir/* > list.txt won't work: the shell expands the glob into too many arguments and the command fails with an "Argument list too long" error.

find inside_dir -type f > list.txt
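The find command avoids the problem because it receives a single argument (inside_dir) and walks the directory itself, so no glob is expanded by the shell. Below is a minimal sketch in a scratch directory; the file names file2 and sub/file3 are made up for the demo, and -maxdepth is a GNU/BSD find extension rather than POSIX.

```shell
#!/bin/bash
# Scratch directory so the demo does not touch real data.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p inside_dir/sub
touch inside_dir/file1 inside_dir/file2 inside_dir/sub/file3

# Recursive: every regular file under inside_dir, as a relative path.
find inside_dir -type f | sort > list.txt

# Top level only: -maxdepth 1 stops find from descending into sub/.
find inside_dir -maxdepth 1 -type f | sort > list_top.txt

cat list.txt
```

Note that plain find descends into subdirectories, which ls inside_dir/* would not do; add -maxdepth 1 if only the top level is wanted.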

Related

Using regex and cp: cannot stat

I am trying to copy files over from an old file structure where data are stored in folders with improper names to a new (better) structure, but as there are 671 participants I need to copy, I want to use regex in order to streamline it (each participant has files saved in the same format). However, I keep getting a cp: cannot stat error message saying that no file/directory exists. I had assumed this meant that I had missed a / or put "" in the wrong location but I cannot see anything in the code that would suggest it.
My code is as follows (I have added a lot of comments so that other collaborators can understand it):
#!/bin/bash
# This code below copies the initial .nii file.
# These data are copied into my Trial Participant folders.
# Create a variable called parent_folder1 that describes initial mask directory e.g. folders for each participant which contains the files.
parent_folder1="/this/path/here/contains/Trial_Participants"
# The original folders are named according to ClinicalID_scandate_randomdigits e.g. folder 1234567890_20000101_987654.
# The destination folders are named according to TrialIDNumber e.g. LBC100001.
# The .nii files are saved under TrialIDNumber_1_ICV.nii.gz e.g. LBC1000001_1_ICV.nii.gz.
# These files need copied over from their directories into the Trial Participant folders, using the for loop function.
# The * symbol is used as a wildcard.
for i in $(ls -1d "${parent_folder1}"/*_20*); do
lbc=$(ls ${i}/finalMasks/*ICV* | sed 's/^.*\///'); lbc=${lbc:0:9}
cp "${parent_folder1}/${i}"/finalMasks/*_1_ICV.nii.gz /this/path/is/the/destination/path/${lbc}/
done
# This code uses regular expression to find the initial ICV file.
# ls asks for a list, -1 makes each new folder on a new line, d is for directory.
# *_20* refers to the name of the folders. The * covers the ClinicalID, _20* refers to the scan date and random digits.
# I have no idea what the | sed 's/^.*\///' does, but I think it strips the path.
# lbc=${lbc:0:9} is used to keep the ID numbers.
# cp copies the files that are named under TrialIDNumber(replaced by *)_1_ICV.nii.gz to the destination under the respective folder.
So after a bit of fooling around, I changed the code a lot (took out sed as it confuses me), and came up with this that worked. Thanks to those who commented!
# Create a variable called parent_folder1 that describes initial mask directory.
parent_folder1="/original/path/here"
# Iterate over directories in parent_folder1
for i in $(ls -1d "${parent_folder1}"/*_20*); do
    # Extract the base name of the file in the finalMasks directory
    lbc=$(basename $(ls "${i}"/finalMasks/*ICV*))
    # Extract the LBC number from the file name
    lbc=${lbc:0:9}
    # Copy the file to the specific folder
    cp "${i}"/finalMasks/${lbc}_1_ICV.nii.gz /destination/path/here/${lbc}/
done
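The first version failed because ls -1d "${parent_folder1}"/*_20* already prints full paths, so "${parent_folder1}/${i}" repeated the parent directory and pointed at a path that does not exist. As a hedged alternative, the loop can glob directly instead of parsing ls output, which also copes with spaces in names. This is only a sketch: the scratch layout, the dest variable and the sample folder and file names are stand-ins for the real trees.

```shell
#!/bin/bash
# Scratch layout standing in for the real source and destination trees (hypothetical).
demo=$(mktemp -d)
parent_folder1="$demo/original"
dest="$demo/destination"
mkdir -p "$parent_folder1/1234567890_20000101_987654/finalMasks" "$dest/LBC100001"
touch "$parent_folder1/1234567890_20000101_987654/finalMasks/LBC100001_1_ICV.nii.gz"

# Glob directly instead of parsing ls; the trailing / matches directories only.
for dir in "$parent_folder1"/*_20*/; do
    for f in "$dir"finalMasks/*_1_ICV.nii.gz; do
        [ -e "$f" ] || continue    # the glob matched nothing
        name=${f##*/}              # strip the directory part, like basename
        lbc=${name:0:9}            # first nine characters: the trial ID
        cp "$f" "$dest/$lbc/"
    done
done
```

The expansion ${f##*/} does the same job as the sed 's/^.*\///' in the first attempt: it removes everything up to and including the last slash.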

How to link a selected set of files from one directory to another in Linux?

Say for example, in the source directory I have the following files:
abc.r
xyz.sh
pqr.fam
lmn.bim
uvw.r
ttt.sh
Now I need to link only items 1, 2 and 5 (listed above). Most importantly, I need to link all 3 files together (i.e. link all 3 files at the same time).
I know how to link 1 file at a time (ln -s sourceDirectory/fileName targetDirectory/), but not multiple files at once.
I found ways to do this when the file name prefixes have some pattern (for example, link all the files whose names start with the letter "f"), but in my case there is no such pattern. My file names are all different.
Try this:
#!/bin/bash
for file in a.txt b.txt c.txt
do
    ln -s /sourcedir/"${file}" /targetdir/
done
Since you only have a list, you have to iterate through the list.
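As a possible alternative to the loop, ln also accepts several source files in one call when the last argument is an existing directory, which links all of them at once. The sketch below uses the file names from the question in a scratch directory; the src and dst variables are hypothetical stand-ins for the real source and target directories.

```shell
#!/bin/bash
# Scratch directories standing in for the real source and target (hypothetical).
demo=$(mktemp -d)
src="$demo/sourcedir"
dst="$demo/targetdir"
mkdir -p "$src" "$dst"
touch "$src"/abc.r "$src"/xyz.sh "$src"/pqr.fam "$src"/lmn.bim "$src"/uvw.r "$src"/ttt.sh

# One ln call: every argument but the last is a source, the last is the directory.
ln -s "$src"/abc.r "$src"/xyz.sh "$src"/uvw.r "$dst"/

ls -l "$dst"
```

In bash the source list can be shortened with brace expansion, e.g. "$src"/{abc.r,xyz.sh,uvw.r}. Absolute source paths are used here so that the symlinks resolve regardless of where the target directory sits.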

How to rename multiple files while keeping extension based on provided txt file?

I have a folder with many files that look like:
A1_R1.fastq
A2_R1.fastq
A3_R1.fastq
I would like to rename the files based on a text file, keeping the _R1.fastq suffix but changing the A# prefix to a specific sample name (example):
A1_R1.fastq KUG_R1.fastq
A2_R1.fastq AUG_R1.fastq
A3_R1.fastq TRY_R1.fastq
I'd also like an output directory which contains all my newly named .fastq files.
I tried this to no avail (only a few were renamed):
ls *.fastq| paste -d' ' - $PATH/txt | xargs -n2 mv
Thank you.
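One likely cause of the partial renaming is that the order of ls *.fastq does not match the order of the lines in the text file (A10 sorts before A2, for instance), so pasting the two together pairs the wrong names. Since the text file already contains both the old and the new name on each line, a loop can read the pairs directly. This is a sketch only: names.txt and renamed are made-up names for the mapping file and the output directory.

```shell
#!/bin/bash
# Scratch directory with the sample files from the question.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch A1_R1.fastq A2_R1.fastq A3_R1.fastq
# names.txt is a hypothetical name for the two-column mapping file.
printf '%s\n' 'A1_R1.fastq KUG_R1.fastq' \
              'A2_R1.fastq AUG_R1.fastq' \
              'A3_R1.fastq TRY_R1.fastq' > names.txt

mkdir -p renamed
# Read the old and new name from each line; skip entries whose source is missing.
while read -r old new; do
    [ -f "$old" ] || continue
    mv -- "$old" "renamed/$new"
done < names.txt

ls renamed
```

Replacing mv with cp would keep the original files in place while still filling the output directory.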

Replacing files in one folder and all its subdirectories with modified versions in another folder

I have two folders, one called 'modified' and one called 'original'.
'modified' has no subdirectories and contains 4000 wav files each with unique names.
The 4000 files are copies of files from 'original', but 'original' has many subdirectories, and the original wav files are located inside them.
For every wav file in 'modified', I want to replace its same-named counterpart in 'original', wherever it may be.
For example, if one file is called 'sound1.wav' in modified, then I want to find 'sound1.wav' in some subdirectory of 'original' and replace the original there with the modified version.
I run Windows 8 so command prompt or cygwin would be best to work in.
As requested, I've written the python code that does the above. I use the 'os' and 'shutil' modules to first navigate over directories and second to overwrite files.
'C:/../modified' refers to the directory containing the files we have modified and want to use to overwrite the originals.
'C:/../original' refers to the directory containing many sub-directories with files with the same names as in 'modified'.
The code works by listing every file in the modified directory, and for each file, we state the path for the file. Then, we look through all the sub-directories of the original directory, and where the modified and original files share the same name, we replace the original with the modified using shutil.copyfile().
Because we are working with Windows, it was necessary to change the direction of the slashes to '/'.
This is performed for every file in the modified directory.
If anyone ever has the same problem I hope this comes in handy!
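import os
import shutil

# Directory holding the modified files, and the root of the original tree.
modified_dir = 'C:/../modified'
original_dir = 'C:/../original'

for wav in os.listdir(modified_dir):
    modified_file = modified_dir + '/' + wav
    # Walk every sub-directory of the original tree looking for a same-named file.
    for root, dirs, files in os.walk(original_dir):
        for name in files:
            if name == wav:
                # Build the target path and change Windows backslashes to '/'.
                original_file = (root + '/' + name).replace('\\', '/')
                shutil.copyfile(modified_file, original_file)
                print(wav + ' overwritten')
print('Complete')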

How to move and number files?

I am working with Linux and bash.
I have one directory with 100 folders in it, each with a different name.
In each of these 100 folders, there is a file called first.bars (so I have 100 files named first.bars). Although all named first.bars, the files are actually slightly different.
I want to move all these files to one new folder and rename/number them so that I know which file comes from which folder. So the first first.bars file must be renamed to 001.bars, the second to 002.bars, etc.
I have tried the following:
ls -d * >> /home/directorywiththe100folders/list.txt
cat list.txt | while read line;
do cd $line;
mv first.bars /home/newfolder
This does not work because I can't have 100 files with the same name in one folder. So I only need to know how to rename them. The renaming must follow the order of list.txt, because its first line is the folder containing the first file, which is moved and renamed. That file will be called 001.bars.
Try doing this :
$ rename 's/^.*?\./sprintf("%03d.", $c++)/e' *.bar
If you want more information about this command, see this recent response I gave earlier : How do I rename multiple files beginning with a Unix timestamp - imapsync issue
If the rename command is not available,
c=0
for d in /home/directorywiththe100folders/*/; do
    # Zero-pad to three digits; pre-increment so numbering starts at 001
    newfile=$(printf "/home/newfolder/%03d.bars" $(( ++c )) )
    mv "${d}first.bars" "$newfile"
done
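Because the question asks to know which file came from which folder, the loop can also write the number-to-folder mapping as it goes. The following is a sketch under assumptions: the folder names alpha, beta and gamma and the scratch paths are invented for the demo, and the mapping is written to a list.txt inside the new folder.

```shell
#!/bin/bash
# Scratch layout with three sample folders (hypothetical names).
demo=$(mktemp -d)
src="$demo/directorywiththe100folders"
dst="$demo/newfolder"
mkdir -p "$src/alpha" "$src/beta" "$src/gamma" "$dst"
for d in alpha beta gamma; do echo "$d" > "$src/$d/first.bars"; done

c=0
for d in "$src"/*/; do
    [ -f "${d}first.bars" ] || continue      # skip folders without the file
    c=$(( c + 1 ))
    num=$(printf '%03d' "$c")                # 001, 002, ...
    mv "${d}first.bars" "$dst/$num.bars"
    # Record which folder each number came from.
    printf '%s %s\n' "$num" "$(basename "$d")" >> "$dst/list.txt"
done

cat "$dst/list.txt"
```

The glob "$src"/*/ expands in sorted order, so the numbering follows the alphabetical order of the folder names.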
