Find string using grep - string

How can I find all strings in a file which are alphanumeric, may contain either the symbol _ or #, and end in the hex code 0x00? I've tried using grep with the following options but it doesn't seem to work:
-z [a-zA-Z0-9_]*
Update
Here's an example of some of the strings I'm trying to extract, as you can see they end with the hex code 0x00, vary in length and although this specific example doesn't show they can contain 0-9, an underscore (_) or a hash (#).
http://i42.tinypic.com/23kos5w.jpg

When I run this all I get is 'Binary file /cygdrive/d/dump.bin matches'? I'm using grep in cygwin. – Twisted89 Apr 4 at 10:09
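MrJames's answer does not include -a, which is why grep only reports that the binary file matches. Also, putting \w inside brackets simply doesn't work. Since \w already covers the underscore, only # has to be added:
grep -Eaoz "(\w|#)+" FILE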
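As a minimal sketch of the same idea, assuming GNU grep: with -z, grep treats each NUL byte as a line terminator, so anchoring the pattern with $ keeps only the runs that sit directly before a 0x00. The function name extract_nul_strings and the sample bytes are made up for illustration. The output of grep -z is NUL-separated, so tr turns it into one string per line.

```shell
# Hypothetical helper: print every run of [A-Za-z0-9_#] that ends right
# before a NUL byte (or at the very end of the input) in the file $1.
extract_nul_strings() {
    # -z: records end at NUL; -a: treat binary as text; -o: print matches only
    LC_ALL=C grep -zaoE '[[:alnum:]_#]+$' "$1" | tr '\0' '\n'
}

# Made-up sample data: two NUL-terminated strings (the second preceded by
# binary bytes), then trailing bytes that do not end in a valid string.
sample=$(mktemp)
printf 'FOO_1\0\001\002bar#2\0baz!' > "$sample"
extract_nul_strings "$sample"
rm -f "$sample"
```

For the sample data this should list FOO_1 and bar#2 and skip the trailing baz!, which does not end in an allowed character.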

Related

How do I exclude a character in Linux

Write a wildcard to match all files (does not matter the files are in which directory, just ask for the wildcard) named in the following rule: starts with a string “image”, immediately followed by a one-digit number (in the range of 0-9), then a non-digit char plus anything else, and ends with either “.jpg” or “.png”. For example, image7.jpg and image0abc.png should be matched by your wildcard while image2.txt or image11.png should not.
My folder contained these files imag2gh.jpeg image11.png image1agb.jpg image1.png image2gh.jpg image2.txt image5.png image70.jpg image7bn.jpg Screenshot .png
If my command works, it should only display image1agb.jpg image1.png image2gh.jpg image5.png image70.jpg image7bn.jpg
This is the command I used (ls -ad image[0-9][^0-9]*{.jpg,.png}), but I'm only getting image1agb.jpg image2gh.jpg image7bn.jpg, so I'm missing image1.png and image5.png.
ls -ad image[0-9][!0-9]*{.jpg,.png}
Info
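Character ranges like [0-9] work in shell globs (wildcards) as well as in RegEx statements, so they are not the problem. The pattern fails because [!0-9] must match exactly one character between the digit and the suffix, so names such as image1.png, where the extension follows the digit directly, cannot match.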
Possible solution
Pipe the output of ls -a1 to the standard input of the grep command (which does support RegEx).
Use a RegEx statement to make grep filter filenames.
ls -a1 | grep '^image[[:digit:]]\([^[:digit:]].*\)\?\.\(png\|jpg\)$'
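The rule from the question can also be written with plain globs, without grep. The sketch below is one possible way, not the only one: it uses one pattern for names where the extension follows the digit directly and one for names with a non-digit in between. The function name list_images is made up for illustration. Note that image70.jpg is skipped, because the stated rule asks for a non-digit after the single digit.

```shell
# Hypothetical helper: list files in directory $1 that follow the rule
# "image" + one digit + (nothing, or a non-digit and anything) + .jpg/.png
list_images() (
    cd "$1" || exit 1
    for f in image[0-9].jpg image[0-9].png \
             image[0-9][!0-9]*.jpg image[0-9][!0-9]*.png; do
        # An unmatched glob stays as literal text, so skip names that do not exist
        if [ -e "$f" ]; then printf '%s\n' "$f"; fi
    done
)

# Recreate the file names from the question in a scratch directory.
demo=$(mktemp -d)
(cd "$demo" && touch imag2gh.jpeg image11.png image1agb.jpg image1.png \
    image2gh.jpg image2.txt image5.png image70.jpg image7bn.jpg)
list_images "$demo"
rm -rf "$demo"
```

With the files from the question this lists image1.png, image5.png, image1agb.jpg, image2gh.jpg and image7bn.jpg.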

Understanding sed

I am trying to understand how
sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/' /var/log/boot
worked and what the pieces mean. The man page I read just confused me more, and I tried the info page but had no idea how to work it! I'm pretty new to Linux. Debian is my first distro, but it seemed like a rather logical place to start as it is the root of many others and has been around a while, so it is probably doing stuff well and is fairly standardized. I am running Wheezy 64 bit, FYI, if needed.
The sed command is a stream editor, reading its file (or STDIN) for input, applying commands to the input, and presenting the results (if any) to the output (STDOUT).
The general syntax for sed is
sed [OPTIONS] COMMAND FILE
In the shell command you gave:
sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/' /var/log/boot
the sed command is s/\^\[/\o33/g;s/\[1G\[/\[27G\[/ and /var/log/boot is the file.
The given sed command is actually two separate commands:
1. s/\^\[/\o33/g
2. s/\[1G\[/\[27G\[/
The intent of #1, the s (substitute) command, is to replace all occurrences of the two-character text '^[' (a literal caret and bracket, the printable notation for Escape) with the character whose octal value is 033 (the ESC character). The \o33 in the replacement is GNU sed's octal escape, \oNNN. Because the script is in single quotes, bash passes it to sed untouched, so bash's \nnn syntax does not apply here, and sed itself has no \nnn octal escape. The command is therefore correct as written for GNU sed.
Notice the trailing g after the replacement string? It means to perform a global replacement; without it, only the first occurrence on each line would be changed.
The purpose of #2 is to replace the string [1G[ with [27G[ (the backslashes escape the brackets so that they are matched literally). This command has no trailing g, so only the first occurrence on each line is replaced. If a line can contain more than one occurrence, the command needs to be written like this:
s/\[1G\[/\[27G\[/g
Finally, putting all this together, the two sed commands are applied to each line of the /var/log/boot file, and the output has every ^[ converted into an ESC character and the string [1G[ converted to [27G[.
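To see the two substitutions at work without touching the real log, here is a small sketch that feeds one made-up line through the same script and dumps the bytes with od. It assumes GNU sed, where \o33 in the replacement is the octal escape for ESC; the function name fix_boot_log is made up for illustration.

```shell
# Hypothetical helper wrapping the script from the question; reads stdin.
# Assumes GNU sed, where \o33 in the replacement produces the ESC character.
fix_boot_log() {
    sed 's/\^\[/\o33/g;s/\[1G\[/\[27G\[/'
}

# Made-up log line: a literal "^[" followed by the sequence "[1G[".
printf '%s\n' '^[[1G[ ok ] Starting service' | fix_boot_log | od -c
```

In the dump, the caret and bracket should be replaced by a single byte shown as 033, followed by [27G[ instead of [1G[.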

Linux unexpected things with shell and cmake etc

I am facing a strange issue with the cd command and cmake.
The cd command does not work with paths that contain a '-' (minus sign), unless the path is entered by tab expansion, which is not desirable because the path will be provided by an ENV variable.
cmake issue
export SOME_VAR=Some_value_for_this_variable
Now using this in cmake as
set (SOME_OTHER_VAR "$ENV{SOME_VAR}/SUFFIX")
The above should give SOME_OTHER_VAR=Some_value_for_this_variable/SUFFIX, but instead the suffix overwrites the start of the value and the output is SOME_OTHER_VAR=SUFFIXalue_for_this_variable. In other words, Some_v is replaced by SUFFIX, which is not expected.
Please help, as I don't understand what is happening.
You're having some sort of character set issue. There are two different minus signs: the hyphen - (ASCII 45, U+002D) and the real minus sign − (U+2212). It's possible that the filename itself got the non-ASCII minus sign, which you can't easily type with your keyboard. The easiest fix would be to rename the file so that it uses the normal hyphen. Otherwise, you have to convince CMake to understand your Unicode filename. I have no idea if that's easy or hard.
I think your second problem is similar. The environment variable likely has one or more non-printing characters in it (a trailing carriage return is a common culprit), messing up the CMake variables, or at least the display. Try this: from the Linux command prompt, inspect the actual contents of the string.
echo "$SOME_VAR" | od -t c
for an ASCII representation of everything, and/or
echo "$SOME_VAR" | od -t d1
for the byte contents.
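As an illustration of that check, the sketch below assumes the stray character is a trailing carriage return, which is only a guess at the cause: a CR sends the cursor back to the start of the line, so whatever is printed after it overwrites the beginning of the value on screen. The helper name has_carriage_return and the sample value are made up.

```shell
# Hypothetical helper: succeed if the argument contains a carriage return.
has_carriage_return() {
    cr=$(printf '\r')
    case $1 in
        *"$cr"*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Made-up value with a trailing CR, as a file with Windows line endings
# might produce.
SOME_VAR=$(printf 'Some_value_for_this_variable\r')

printf '%s' "$SOME_VAR" | od -c      # the dump shows the CR as \r
if has_carriage_return "$SOME_VAR"; then
    SOME_VAR=$(printf '%s' "$SOME_VAR" | tr -d '\r')   # strip the CR
fi
printf '%s/SUFFIX\n' "$SOME_VAR"
```

Once the carriage return is stripped, the last line prints the value followed by /SUFFIX as expected.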

"grep" offset of ascii string from binary file

I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a (binary) header followed by binary data. Within the binary header is an ascii string 80 characters long. Somewhere along the way, my process of writing the files got a little messed up and I'm trying to debug this problem by inspecting how long each record actually is.
This seems extremely related, but I don't understand perl, so I haven't been able to get the accepted answer there to work. The other answer points to bgrep which I've compiled, but it wants me to feed it a hex string and I'd rather just have a tool where I can give it the ascii string and it will find it in the binary data, print the string and the byte offset where it was found.
In other words, I'm looking for some tool which acts like this:
tool foobar filename
or
tool foobar < filename
and its output is something like this:
foobar:10
foobar:410
foobar:810
foobar:1210
...
e.g. the string which matched and a byte offset in the file where the match started. In this example case, I can infer that each record is 400 bytes long.
Other constraints:
ability to search by regex is cool, but I don't need it for this problem
My binary files are big (3.5Gb), so I'd like to avoid reading the whole file into memory if possible.
grep --byte-offset --only-matching --text foobar filename
The --byte-offset option prints the offset of each matching line.
The --only-matching option makes it print offset for each matching instance instead of each matching line.
The --text option makes grep treat the binary file as a text file.
You can shorten it to:
grep -oba foobar filename
It works in the GNU version of grep, which comes with Linux by default. It won't work in BSD grep (which comes with Mac by default).
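A quick way to try this out is to build a small test file with a known layout. The sketch below assumes GNU grep and a head that supports -c; the helper name find_offsets and the 400-byte record layout are made up to mirror the question. It also adds -F, so that the search term is taken as a fixed string rather than a regex.

```shell
# Hypothetical helper: print "offset:match" for every occurrence of the
# fixed string $1 in file $2 (GNU grep assumed).
find_offsets() {
    LC_ALL=C grep -obaF -- "$1" "$2"
}

# Made-up data: three 400-byte records, each with "foobar" at offset 10.
data=$(mktemp)
for i in 1 2 3; do
    head -c 10 /dev/zero
    printf 'foobar'
    head -c 384 /dev/zero
done > "$data"

find_offsets foobar "$data"
rm -f "$data"
```

The offsets come out as 10, 410 and 810, each followed by :foobar, so the record length of 400 bytes can be read off directly.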
You could use strings for this:
strings -a -t x filename | grep foobar
Tested with GNU binutils.
For example, where in /bin/ls does --help occur:
strings -a -t x /bin/ls | grep -- --help
Output:
14938 Try `%s --help' for more information.
162f0 --help display this help and exit
I wanted to do the same task. Though strings | grep worked, I found gsar was the very tool I needed.
http://tjaberg.com/
The output looks like:
>gsar.exe -bic -sfoobar filename.bin
filename.bin: 0x34b5: AAA foobar BBB
filename.bin: 0x56a0: foobar DDD
filename.bin: 2 matches found

Using wildcards to exclude files with a certain suffix

I am experimenting with wildcards in bash and tried to list all the files that start with "xyz" but do not end with ".TXT", but I am getting incorrect results.
Here is the command that I tried:
$ ls -l xyz*[!\.TXT]
It is not listing the files with names "xyz" and "xyzTXT" that I have in my directory. However, it lists "xyz1", "xyz123".
It seems like adding [!\.TXT] after "xyz*" made the shell look for something that start with "xyz" and has at least one character after it.
Any ideas why it is happening and how to correct this command? I know it can be achieved using other commands, but I am especially interested in knowing why it is failing and whether it can be done using just wildcards.
These commands will do what you want
shopt -s extglob
ls -l xyz!(*.TXT)
shopt -u extglob
The reason why your command doesn't work is that xyz*[!\.TXT], which is equivalent to xyz*[!.TX], means xyz followed by any sequence of characters (*) and finally one character that is not in the set {., T, X}. So it matches 'xyzwhateveryouwant1' or 'xyzwhateveryouwanta', but not 'xyzwhateveryouwant.', 'xyzwhateveryouwantT' or 'xyzwhateveryouwantX'.
EDIT: because the bracket expression has to match one character, the bare name xyz does not match either.
I don't think this is doable with only wildcards.
Your command isn't working because it means:
Match everything that has xyz followed by whatever you want, and it must not end with one of the following characters: ., T or X. The second T doesn't count, because what you have inside [] is read as a set of characters and not as a string, as you thought.
You also don't need to 'escape' the ., since it has no special meaning inside a wildcard.
At least, this is my knowledge of wildcards.
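The behaviour described above can be checked without creating any files, because the shell's case statement uses the same pattern syntax as filename wildcards. This is only a small sketch; the function name matches_pattern is made up, and the names tested are the ones from the question plus xyz.TXT.

```shell
# Hypothetical helper: report whether a name matches the glob xyz*[!.TX],
# which is what xyz*[!\.TXT] boils down to.
matches_pattern() {
    case $1 in
        xyz*[!.TX]) return 0 ;;
        *)          return 1 ;;
    esac
}

for name in xyz xyzTXT xyz1 xyz123 xyz.TXT; do
    if matches_pattern "$name"; then
        printf '%s\n' "$name"
    fi
done
```

Only xyz1 and xyz123 are printed: xyz has no character left for the bracket expression to match, and xyzTXT and xyz.TXT both end in T.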
