How to concatenate files split by round robin? - linux

I split a file using split -n r/12 file; now how do I concatenate these 12 files? I've tried cat <files> and paste <files>, but according to diff the result was completely different from the original.
How do I concatenate these 12 files so that cmp/diff will show no differences? Any special arguments for paste/cat to use?

Is round-robin splitting an absolute requirement? If not, you might just split into consecutive sections:
$ split --number=12 file
This creates 12 files:
$ ls x*
xaa xab xac xad xae xaf xag xah xai xaj xak xal
Now you can concatenate them without any difference:
$ cat x* > file.new
$ diff file file.new
But if there is no way around the round-robin requirement, I would write a bash script to re-interleave the lines. It's not pretty; in pseudocode it looks something like this (a runnable sketch follows below):
Create a working directory
Copy all x* files into the working directory
Change to the working directory
Touch the new concatenated file
While any x* file is non-empty:
    Iterate over the files in alphabetical order:
        Remove the first line from the file
        Append that line to the new concatenated file
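Here is a minimal runnable sketch of that pseudocode, assuming the parts are named x* in the current directory and bash 4.1+ (for the {fd}< redirection). It reads one line from each part in turn, nondestructively, so the working-directory steps are not needed:
#!/bin/bash
# Reassemble parts produced by: split -n r/12 file
parts=(x*)
fds=()
for p in "${parts[@]}"; do
    exec {fd}< "$p"              # open each part on its own file descriptor
    fds+=("$fd")
done
while :; do
    got=0
    for fd in "${fds[@]}"; do
        if IFS= read -r -u "$fd" line; then
            printf '%s\n' "$line"    # emit the next line in round-robin order
            got=1
        fi
    done
    (( got )) || break           # stop once every part is exhausted
done > file.new
Afterwards, diff file file.new should show no differences.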

Related

How to link a selected set of files from one directory to another in Linux?

Say for example, in the source directory I have the following files:
1. abc.r
2. xyz.sh
3. pqr.fam
4. lmn.bim
5. uvw.r
6. ttt.sh
Now I need to link only items 1, 2 and 5 from the list above (abc.r, xyz.sh and uvw.r). Most importantly, I need to link all 3 files together (i.e. link all 3 files at the same time).
I know how to link 1 file at a time (ln -s sourceDirectory/fileName targetDirectory/), but not multiple files at once.
I found ways to do this when the file names share some pattern (for example, linking all the files whose names start with the letter "f"), but in my case there is no such pattern; my file names are all different.
Try this:
#!/bin/bash
for file in a.txt b.txt c.txt       # list the files you want to link here
do
    ln -s /sourcedir/"${file}" /targetdir/
done
Since you only have a list, you have to iterate through the list.
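If the list is short, ln also accepts several targets followed by a directory, so the loop can be collapsed into a single call (using the file names from the question and the same placeholder directories):
ln -s /sourcedir/abc.r /sourcedir/xyz.sh /sourcedir/uvw.r /targetdir/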

Read only the first n lines [sublime text]

I've got some files too big to open directly in Sublime Text. Is there any way to open only the first n lines? Something like head in bash? Thanks
If you're on Linux or Mac, or have Cygwin, Git Bash, or similar installed on a Windows machine, check out the split utility, which is part of the coreutils package. It does exactly what it says: it splits input into separate files. It is configurable via command-line options, like every Unix utility. For example, if you wanted to split your input file into separate 10,000-line files starting with notsobigfile and using numeric suffixes ending with .txt, you would run
split -d -l 10000 --additional-suffix=".txt" reallybigfile.txt notsobigfile
and it would output files named notsobigfile00.txt, notsobigfile01.txt, etc. (with -d, numeric suffixes start at 00). If this would generate more than 100 files (00 through 99), just add -a x, where x is the number of suffix digits (the default is 2).
For all the possible options, just read the man page:
man split
If you only want to output the first part of the file, check out the options for the -n/--number flag.
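If all you want is a single file holding the first part, head (which the question mentions) does it in one step; 10000 lines here is an arbitrary choice:
head -n 10000 reallybigfile.txt > notsobigfile.txt
Then open notsobigfile.txt in Sublime Text as usual.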
To figure out how many lines your input file has, run the word-count utility with its lines option:
wc -l reallybigfile.txt

How to copy multiple files with varying version numbers from one directory to another using bash?

I have a folder /home/user/Document/folderpath containing three files: file1-1.1.0.txt, file2-1.1.1.txt and file3-1.1.2.txt,
and another folder named /home/user/Document/backuppath, which currently holds the older versions file1-1.0.0.txt, file2-1.0.1.txt and file3-1.0.2.txt.
The task is to copy the specific files from the folder path to the backup path.
To summarize:
Below is files.txt, where I listed the files that have to be copied:
file1-*.txt
file2-*.txt
Below is the move.sh script that executes the copy:
for file in `cat files.txt`; do cp "/home/user/Document/folderpath/$file" "/home/user/Documents/backuppath/" ; done
For the above script I am getting errors like:
cp: cannot stat '/home/user/Document/folderpath/file1-*.txt': No such file or directory found
cp: cannot stat '/home/user/Document/folderpath/file2-*.txt': No such file or directory found
What I would like to accomplish is to use * in place of the version numbers in the script, since the version numbers may vary in the future.
You have wildcard characters in your files.txt. In your cp command you are quoting the variable; the quotes prevent the wildcards from being expanded, as you can see from the error message.
One obvious possibility is to not use quotes:
cp /home/user/Document/folderpath/$file /home/user/Documents/backuppath/
Or not use a loop at all:
cp $(<files.txt) /home/user/Documents/backuppath/
However, this would of course break if a line in your files.txt were a filename pattern containing whitespace. Therefore, I would recommend a second loop over the expanded pattern:
while read file                        # puts the next line into 'file'
do
    for f in $file                     # this expands the pattern in 'file'
    do
        cp "/home/user/Document/folderpath/$f" /home/user/Documents/backuppath
    done
done < files.txt
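A variant of the same idea that is also safe for matched file names containing spaces (glob results are not word-split) might look like this; nullglob makes a pattern with no matches expand to nothing instead of being passed to cp literally:
#!/bin/bash
shopt -s nullglob                      # unmatched patterns expand to nothing
while IFS= read -r pattern; do
    for f in "/home/user/Document/folderpath/"$pattern; do
        cp "$f" /home/user/Documents/backuppath/
    done
done < files.txt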

Splitting large tar file into multiple tar files

I have a tar file which is 3.1 TB (terabytes) in size.
File name: Testfile.tar
I would like to split this tar file into 2 parts: Testfile1.tar and Testfile2.tar.
I tried the following so far
split -b 1T Testfile.tar "Testfile.tar"
What I get is Testfile.taraa (what is "aa"?)
And I just stopped my command. I also noticed that the output Testfile.taraa doesn't seem to be a tar file when I do ls in the directory; it looks like a text file. Maybe once the full split is completed it will look like a tar file?
The behavior of split is correct; see the man page online: http://man7.org/linux/man-pages/man1/split.1.html
Output pieces of FILE to PREFIXaa, PREFIXab, ...
Don't stop the command; let it run, and then you can use cat to concatenate (join) the pieces back together again.
Examples can be seen here: https://unix.stackexchange.com/questions/24630/whats-the-best-way-to-join-files-again-after-splitting-them
split -b 100m myImage.iso
# later
cat x* > myImage.iso
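With the names from the question, the pieces would be Testfile.taraa, Testfile.tarab, and so on, and they could be rejoined the same way (the output name here is arbitrary):
cat Testfile.tar?? > Testfile.restored.tar
cmp Testfile.tar Testfile.restored.tar    # should report no differences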
UPDATE
Just as clarification, since I believe you have not understood the approach: you split a big file like this to transport it, for example; the pieces are not usable on their own. To use the file again you need to concatenate (join) the pieces back together. If you want each part to be usable by itself, you need to unpack the archive, split the contents into parts, and archive each part separately. With split you are cutting the binary file blindly, so I don't think you can use those pieces as tar files.
You are doing the archiving first and the splitting later. If you want each part to be a tar file, you should split the original data first and then run tar on each part; a sketch follows below.
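As a sketch of that approach, assuming the original data still exists as a directory (dir here is a placeholder): list the files, split the list in two on line boundaries, and create one tar archive per half. Note that the halves are by file count, not by size:
find dir -type f | sort > filelist.txt
split -n l/2 filelist.txt part          # creates partaa and partab
tar -cf Testfile1.tar -T partaa         # -T reads the names from a file (GNU tar)
tar -cf Testfile2.tar -T partab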

Create a text file containing list of (relative paths to) files in a directory?

Suppose I am in a directory. Inside there is another directory called inside_dir containing a huge number of files. I want to create a file containing a list of all the files inside inside_dir, listed as relative paths. That is, if there is a file called file1 inside inside_dir, the corresponding line in the list file should be inside_dir/file1.
Since the number of files is huge, just doing ls inside_dir/* > list.txt won't work: the shell will complain about too many arguments.
find inside_dir -type f > list.txt
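find prints each path relative to the starting point you give it, so the entries come out as inside_dir/file1, exactly as requested. If you want the list in a stable order, pipe it through sort:
find inside_dir -type f | sort > list.txt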
