Extracting inode from ls command - linux

I am trying to extract the inode number from the output of the ls command:
ls -i
Output:
1234 File name with space
2345 File name
To extract it, I tried to use the cut command as follows:
ls -i | cut -d" " -f1
The above command didn't work because the file names contain different numbers of spaces. From the man page I found that passing "-Q" to ls makes it double-quote each file name:
ls -iQ
1234 "File name with space"
2345 "File name"
I can't find a way to make use of this option, though. Any help would be greatly appreciated.
Thanks
Update 1
It looks like the spaces in the file names weren't the cause of the problem after all; it's the width of the inode number. For example:
ls -iQ
2321352 "My Cheat Tables"
507896 "My Data Sources"
Note the leading space before the number " 507896". Because ls right-aligns the numbers, the inode of the first file is in field 1 (-f1) while the inode of the second file is in field 2 (-f2).
Update 2 (my solution)
I found a solution; the command is as follows:
ls -iQ | cut -d'"' -f1
The above prints only the inode (plus a trailing space). However, I would like to know if there is a proper or better way of doing this. Please answer this post if you know the right way. I am new to Unix and would love to learn it properly. Thanks.. :)

If you have an unknown number of leading (or embedded) white spaces, you can use awk to get the desired column:
ls -iQ1 | awk '{print $1}'
Unlike cut, awk treats any run of consecutive white space as a single separator, so you don't need to guess the offset.
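To see the difference on input that mimics the asker's problem (a leading space before the shorter inode number), here is a small demonstration; the sample numbers and the /tmp path are illustrative only:

```shell
# Sample lines mimicking `ls -i` output; note the leading space
# that right-aligns the shorter inode number.
printf ' 507896 My Data Sources\n2321352 My Cheat Tables\n' > /tmp/inode_demo.txt

# cut treats every single space as a field separator, so the leading
# space makes field 1 empty on the first line.
cut -d' ' -f1 /tmp/inode_demo.txt

# awk skips leading and repeated whitespace, so $1 is always the inode.
awk '{print $1}' /tmp/inode_demo.txt
```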
An even better solution is to ask explicitly for the information you need rather than parsing ls output:
find . -type f -printf '%i\n'

You can also use the stat(1) command.
stat -c %i "File name with spaces"
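If the goal is the inode of every file in a directory rather than a single name, stat can be combined with find; this sketch assumes GNU stat's -c format option:

```shell
# Print "<inode> <name>" for each regular file in the current
# directory, safely handling spaces in names (assumes GNU stat).
find . -maxdepth 1 -type f -exec stat -c '%i %n' {} +
```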

Related

Collect only numbers in a file's extension

I need some help. I'm creating a script that goes through a text file line by line and validates it against the images in a folder.
My question is: when I search for images, I only want the number, not the extension.
find /mnt/62-PASTA/01.GERAL/ -mindepth 2 | head -19 | cut -d/ -f6
I get:
111066.jpg
88008538.jpg
11241.jpg
88008563.jpg
116071.PNG
But I want
111066
88008538
11241
88008563
116071
Any help?
A really simple way given the examples shown would be to use cut again to split on .:
find /mnt/62-PASTA/01.GERAL/ -mindepth 2 | head -19 | cut -d/ -f6 | cut -d'.' -f1
What we can do here is use another cut command with . as the delimiter:
cut -d .
This splits each line into fields separated by .. Then we can grab just the first field:
cut -d . -f 1
I think this should work.
Suggesting a sed pipe instead of cut:
find /mnt/62-PASTA/01.GERAL/ -mindepth 2 | sed 's|\.[[:alpha:]]*$||'
For a pure-shell solution, try the following. Briefly: a for loop iterates over the .jpg and .PNG files in the current directory, and bash's pattern substitution replaces everything except digits in each file name with nothing, leaving only the digits.
Running the code from the directory /mnt/62-PASTA/01.GERAL/:
for file in *.jpg *.PNG;
do
echo "${file//[^0-9]/}"
done
Or, with the full path (/mnt/62-PASTA/01.GERAL/) so it can be run from any other directory, try the following code:
for file in /mnt/62-PASTA/01.GERAL/*.jpg /mnt/62-PASTA/01.GERAL/*.PNG;
do
file1="${file##*/}" ##Strip everything up to the last / to keep only the file name.
echo "${file1//[^0-9]/}" ##Remove everything except digits from the file name.
done
Output will be as follows:
111066
11241
88008538
88008563
116071

Automate and looping through batch script

I'm new to batch scripting. I want to iterate through a list and use its contents to replace a string in another file.
ls -l somefile | grep .txt | awk 'print $4}' | while read file
do
toreplace="/Team/$file"
sed 's/dataFile/"$toreplace"/$file/ file2 > /tmp/test.txt
done
When I run the code I get the error
sed: 1: "s/dataFile/"$torepla ...": bad flag in substitute command: '$'
Example of somefile, which contains a list of file paths:
foo/name/xxx/2020-01-01.txt
foo/name/xxx/2020-01-02.txt
foo/name/xxx/2020-01-03.txt
My desired output uses the list of file paths from somefile to replace a string in the contents of file2. Something like this:
This is the directory of locations where data from /Team/foo/name/xxx/2020-01-01.txt ............
I'm not sure if I understand your desired outcome, but hopefully this will help you to figure out your problem:
You have three files in a directory:
TEAM/foo/name/xxx/2020-01-02.txt
TEAM/foo/name/xxx/2020-01-03.txt
TEAM/foo/name/xxx/2020-01-01.txt
And you have another file called to_be_changed.txt, which contains the text This is the directory of locations where data from TO_BE_REPLACED ............ If you want to grab the file names of your three files and insert them into to_be_changed.txt, you can do it with:
while read file
do
filename="$file"
sed "s/TO_BE_REPLACED/${filename##*/}/g" to_be_changed.txt >> changed.txt
done < <(find ./TEAM/ -name "*.txt")
And you will then have made a file called changed.txt which contains:
This is the directory of locations where data from 2020-01-02.txt ............
This is the directory of locations where data from 2020-01-03.txt ............
This is the directory of locations where data from 2020-01-01.txt ............
Is this what you're trying to achieve? If you need further clarification I'm happy to edit this answer to provide more details/explanation.
ls -l somefile | grep .txt | awk 'print $4}' | while read file
No. No, no, nono.
ls -l somefile is only going to show somefile unless it's a directory.
(Don't name a directory "somefile".)
If you mean somefile.txt, please clarify in your post.
grep .txt is going to look through the lines presented for the three characters txt preceded by any character (the dot is a regex wildcard). Since you asked for a long listing of somefile it shouldn't find any, so nothing should be passed along.
awk 'print $4}' contains a typo (the opening { is missing), so awk will exit with a syntax error instead of running.
Keep it simple. What I suspect you meant was
for file in *.txt
Then in
toreplace="/Team/$file"
sed 's/dataFile/"$toreplace"/$file/ file2 > /tmp/test.txt
it's unclear what you expect $file to be - awk's $4 from an ls -l seems unlikely.
Assuming it's the filenames from the for above, then try
sed "s,dataFile,/Team/$file," file2 > /tmp/test.txt
Does that help? Correct me as needed. Sorry if I seem harsh.
Welcome to SO. ;)
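Putting the pieces together, here is a minimal sketch of the corrected loop; it assumes file2 and the *.txt files exist in the current directory, and that one output file per input is wanted (the /tmp output paths are illustrative):

```shell
# Hypothetical corrected version: loop over the .txt files and use a
# comma as the sed delimiter so the slashes in /Team/$file don't clash
# with sed's usual / separator.
for file in *.txt; do
  sed "s,dataFile,/Team/$file," file2 > "/tmp/${file%.txt}.out"
done
```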

Stream File Contents Until Substring Encountered

I was using:
bash $ head -n 2 *.xml | grep (..stuff..)
to stream first 2 lines of all xml files to grep command. However, I realized that this was not reliable for the structure of these files.
What I need instead is to stream start of each xml file until a particular substring (which all these files have) is encountered.
head does not provide that level of granularity. The substring is simply the start of a tag (e.g. something like "< tag start"). I would be grateful for any ideas. Thanks!
If you know the max number of lines you have before the matching string you can do something like this:
# cat testfile
123
9
1
1
2
3
4000
TAG
456
# grep -m 1 -B 10 TAG testfile | grep -v TAG
123
9
1
1
2
3
4000
#
Sounds like you want either of these (using GNU awk for nextfile) depending on if you want the tag line printed or not:
awk '/< tag start/{nextfile} 1' *.xml
awk '1; /< tag start/{nextfile}' *.xml
or less efficiently with any awk:
awk 'FNR==1{f=1} /< tag start/{f=0} f' *.xml
awk 'FNR==1{f=1} f; /< tag start/{f=0}' *.xml
or bringing back some efficiency in this case:
for file in *.xml; do
awk '/< tag start/{exit} 1' "$file"
done
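If GNU awk isn't available, plain sed offers the same two behaviors using standard idioms, applied per file (file.xml is a placeholder name):

```shell
# Print each line up to and including the first line matching the tag:
sed '/< tag start/q' file.xml

# Print only the lines before the match (q quits before p can print it):
sed -n '/< tag start/q;p' file.xml
```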
I appreciate all the responses. I found that I really only needed the content of a single tag, rather than everything from the beginning of the xml files, which simplified the parsing. So, for instance, given:
<mt:myTag LOTSOFSTUFF >
I really only needed LOTSOFSTUFF. So I simply did:
grep -oP "<mt:myTag(.*)>" *.xml | grep_more
and that worked exactly as needed. Thanks again, and sorry I did not realize my use case was simpler than I made it out to be.

Using cut in Linux Mint Terminal more precisely

In the directory /usr/lib on Linux Mint there are, among other things, files named xxx.so.d, where xxx is a name and d is a number. The assignment is to find all files with the .so ending and print their name, xxx. The code I have so far is
ls | grep "\.so\." | cut -d "." -f 1
The problem is that cut cuts some file names short. As an example, there is a file called libgimp-2.0.so.0, where the wanted output would be libgimp-2.0, since that part is in front of .so.
Is there any way to make cut cut at ".so" instead of at the first .?
The answer given by pacholik can give you wrong files (e.g. 'xyz.socket' would appear in your list). To correct his script:
for i in *.so.*; do echo "${i%%.so*}"; done
Another way to do this (easier to read in my opinion) is to use a little Perl:
ls | grep "\.so\." | perl -n0e "print ((split(/\.so/))[0], \"\n\")"
Sorry, I don't think there is a way to use only "cut" as you asked.
for i in *.so*; do echo "${i%.so*}"; done
This is just bash parameter substitution:
http://www.tldp.org/LDP/abs/html/parameter-substitution.html
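The difference between % (shortest suffix match) and %% (longest suffix match) only shows up when .so occurs more than once in the name; a quick sketch:

```shell
name="libgimp-2.0.so.0"
echo "${name%.so*}"    # libgimp-2.0  (.so occurs once, so % and %% agree)

name2="libfoo.so.so"
echo "${name2%.so*}"   # libfoo.so  (shortest matching suffix removed)
echo "${name2%%.so*}"  # libfoo     (longest matching suffix removed)
```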
Just use sed instead:
ls | grep -v "\.socket" | grep "\.so" | sed "s/\.so.*//"
This deletes everything from the first .so onward in each file name (the dots are escaped so they match literally rather than as regex wildcards). Files named xxx.so.so would therefore also work.
Depending on the size of the directory, find could be the best option. As a starting point, give this a try (using sed rather than cut, since cut -d "." -f 1 would truncate names like libgimp-2.0 to libgimp-2):
find . -iname "*.so.*" -exec basename {} \; | sed 's/\.so.*//'
Besides cut there are many other options, such as sed and awk, that can in some cases achieve the same result in a faster way.

Linux command most recent non soft link file

Linux command: I am using the following command, which returns the name of the latest file in the directory.
ls -Art | tail -n 1
When I run this command it returns the most recently changed file, which is actually a soft link. I want to ignore soft links in my result and get the other file names. How can I do that? Any quick help appreciated.
Maybe I can match with a regex; the latest file name is
rum-12.53.2.war
-- Latest file in directory without softlink
ls -ArtL | tail -n 1
-- Latest file without extension
ls -ArtL | sed 's/\(.*\)\..*/\1/' | tail -n 1
The -L option for ls dereferences links, i.e. you'll see the information of the link's target instead of the link itself. Is this what you want? Or would you like to ignore links completely?
If you want to ignore links completely you can use this solution, although I am sure there exists an easier one:
a=$( ls -Artl | grep -v "^l" | tail -1 )
aa=()
for i in $(echo $a | tr " " "\n")
do
aa+=($i)
done
aa_length=${#aa[@]}
echo ${aa[aa_length-1]}
First you store the output of your ls in a variable called a. Grepping for "^l" selects only the symbolic links, and the -v option inverts that selection. So you basically have what you want; the only downside is that you need the -l option for ls, as otherwise there is no leading "l" to grep against. In the second part you split a on spaces and fill an array called aa (sorry for the bad naming). Then you take only the last item in aa, which should be the file name (note that this breaks if the file name itself contains spaces).
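A shorter alternative that avoids parsing ls entirely is to let find skip symlinks and print a sortable timestamp; this sketch assumes GNU find's -printf:

```shell
# Newest entry in the current directory that is not a symlink:
# %T@ is the modification time in seconds, %f the bare file name.
find . -mindepth 1 -maxdepth 1 ! -type l -printf '%T@ %f\n' \
    | sort -n | tail -n 1 | cut -d' ' -f2-
```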
