Command to list word completions - linux

I once found a built-in command that would take a prefix as an argument and return all words that could complete that word. So for example,
>> COMMAND cali
California
calibrate
calibration
........
Of course it would list a lot more words, in alphanumerical order. It was really useful, and it optionally took a file to search instead of the default one.
I'm not just trying to produce this behavior: there are obviously a million ways to use grep, sed, awk, perl, or INSERT TURING-COMPLETE LANGUAGE HERE to get this. I'm looking for the command.
Unfortunately it's hard to google something when you don't remember the name. It might not have been POSIX standard, but it was definitely a very common Linux utility. Does anyone know what this was called?

Found it: it's called look, and it seems to have been part of Unix since V7. (The man page is dated 1993!)
It does a binary search on the file given as the optional second argument to find all matches, defaulting to /usr/share/dict/words. Because it is a binary search, that file has to be sorted.
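A minimal sketch of the optional file argument, assuming look is installed (it ships with util-linux or bsdmainutils on most distributions). The helper name complete_prefix, the sample word list, and the grep fallback are illustrative assumptions, not part of the answer above.

```shell
#!/bin/sh
# Hypothetical helper: list the words that start with a prefix.
# $1 = prefix, $2 = optional sorted word file (look's own default is used if omitted).
complete_prefix() {
    if command -v look >/dev/null 2>&1; then
        if [ "$#" -ge 2 ]; then
            look "$1" "$2"
        else
            look "$1"
        fi
    else
        # Assumed fallback when look is missing: a linear prefix search with grep.
        # Note that grep treats the prefix as a regular expression.
        grep -- "^$1" "${2:-/usr/share/dict/words}"
    fi
}

# Build a small, already sorted word list and query it.
words=$(mktemp)
printf '%s\n' apple calendar calibrate calibration calico zebra > "$words"
complete_prefix cali "$words"
rm -f "$words"
```

With the sample list this should print the three words beginning with cali. Calling complete_prefix cali with no file falls back to the system dictionary, if one is installed.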

Not really a builtin command, but there is /usr/share/dict/* and grep:
$ grep -i '^Cali' /usr/share/dict/words
Caliban
Calibanism
caliber
calibered
calibogus
calibrate
calibration
calibrator
calibre
Caliburn
...

Related

Is there a special search command for options and commands in man pages (in less)?

Question: using the less command in any Linux shell (I'm using bash, as probably most people do), is there a way to search a file only for its commands or options?
So, to be more precise:
if I want to quickly find the description of one particular option in a man page,
is there a special search syntax to jump straight to the line explaining that specific option?
Example:
if I type:
man less
and I want to quickly find the description of the "-q" option,
is there a search syntax to jump directly to that line?
If I type /-q, it finds all occurrences of "-q" everywhere in the file, so I get around 10-20 hits, of which only one is the one I was looking for.
So I'm just hoping there is a better/quicker way to do this.
(Not too important though :D)
In man, options are generally described with the option name in bold at the start of the line.
So, if you are looking for the option -q, then the search command would be /^\s*-q\>
The regex ^\s*-q\> reads as follows:
^ start of a line
\s* any number of spaces (including none)
-q the option name you are looking for
\> the end of the word
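The same regex can be tried outside of less. The sketch below feeds a few made-up man-page-style lines to grep; it assumes GNU grep, where \s and \> are supported extensions, and the helper name option_lines and the sample lines are hypothetical. The regex library inside less depends on how less was built, so behaviour there may differ slightly.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) the stdin lines that
# begin, after optional whitespace, with the option -q as a whole word.
option_lines() {
    grep -n '^\s*-q\>'
}

option_lines <<'EOF'
Use -q to suppress the terminal bell.
       -q or --quiet or --silent
       -Q or --QUIET or --SILENT
       -qq is not a real option
EOF
```

Only the second sample line is printed: the first does not start with the option, the third differs in case, and in the fourth the q is followed by another word character, so \> does not match.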

Difference between the 3 option syntax for commands in bash

On the Linux command line, one can use either of two ways to pass options to commands. Either we can use the short option format, which uses a single dash followed by a single letter, for example: -o, or the long option format, which uses two consecutive dashes followed by a word, for example: --option. But recently I came across some commands which, to my thinking, use a 'hybrid' of both formats: a single dash followed by a word, for example: -option. Now I'm not talking about commands where you can stick multiple short options together like ls -lisa. I'm talking about options where the word after the single dash is just one option and not multiple short options strung together.
I don't understand why there's a third format. What I know about the Linux command line is that you can have only a short format or a long format. Where did the third format come from?
It's actually confusing because sometimes you cannot be sure if the third format is really a dash followed by one option or a dash followed by multiple short options.
This is not a bash issue. All programs have their own way of handling options/flags. There are many different styles:
the single-letter style with a single hyphen, for example:
ls -l
the mnemonic-style with double dashes, which seems a preference for GNU-stuff, for example, ls --size
the variable=value-style, for example dd if=file of=otherfile
options without dashes, as in tar cvzf archive.tgz
You could even use a + instead of a - (as in date +%m).
etcetera.
It is important to understand that bash just passes these options to the programs/commands. So, in the programs you will generally see:
int main(int argc, char *argv[]){
(C code example). In that case, argv[0] will point to the program name (to simplify things a bit) and argv[1] will point to the first argument. What the program does with those arguments is entirely up to the program.
A quick scan through the built-in commands reveals that the built-ins always seem to use the minus-single letter (-a) for specifying options.
I think you are confusing which component does which part of the parsing.
The command line you type into bash gets parsed twice. First it gets parsed by bash. At this stage, spaces are used to separate the different parameters. Quotes and escapes are being taken into consideration. Wildcards are expanded, and $ variables are substituted.
At the end of this phase, we are left with a command line that has a list of strings, the first of which describes the command to be executed. At this point, bash calls execve, and passes it that list of strings.
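The first parsing phase can be observed with a small function that reports what it receives. This is only a sketch: show_args is a hypothetical helper standing in for any program, and the sample arguments are made up. It assumes a POSIX shell such as sh or bash with the default IFS.

```shell
#!/bin/sh
# Hypothetical helper: print the number of arguments received,
# then each argument in brackets on its own line.
show_args() {
    printf '%d argument(s)\n' "$#"
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

name="two words"
# The shell splits, expands and unquotes first; show_args only sees the result.
show_args -l --size if=file "$name" $name 'a  b'
```

This reports 7 arguments: the quoted "$name" arrives as one string, the unquoted $name is split into two, and -l, --size and if=file are passed through untouched. Nothing in the shell treats the dashes specially.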
The next phase of parsing is optional, and is up to each program to carry out. Most programs call getopt_long, a library function that parses options. The one- and two-dash convention you mention is applied by it (as well as by its older sibling, getopt).
It is, however, up to each program to parse its own parameters. Many programs use getopt_long, which is why you feel, correctly, that it is a standard. Some, however, do not. Those that do not go their own way.
That's just how things are.
For your programs, you should try to use either getopt_long or some compatible solution, as that causes the least amount of confusion for users.
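For shell scripts, the closest built-in counterpart is getopts, which follows the getopt conventions for short options only (it does not handle --long options). A minimal sketch, where the function name parse_opts and the options -v and -o are made up for illustration:

```shell
#!/bin/sh
# Hypothetical option parser: -v is a flag, -o takes a value.
parse_opts() {
    verbose=0
    output=
    OPTIND=1                      # reset so the function can be called repeatedly
    while getopts 'vo:' opt; do
        case $opt in
            v) verbose=1 ;;
            o) output=$OPTARG ;;
            *) return 2 ;;        # unknown option or missing value
        esac
    done
    shift $((OPTIND - 1))         # drop the parsed options, keep the operands
    printf 'verbose=%s output=%s rest=%s\n' "$verbose" "$output" "$*"
}

parse_opts -v -o out.txt file1 file2
```

Grouped short options work as users expect, so parse_opts -vo out.txt file1 file2 is parsed the same way.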

How to grep text for small mistakes

Using standard Unix tools how can I search in a text file or output for a word with maybe 1-2 letters transposed or missed?
For example my input
function addtion(number, increment)
return number+increment
end
function additoin(number, increment)
return number+increment
end
I would like to search for addition, have it match addtion and additoin in my input, and be told about it. Because it's code, checking against a dictionary is out of the question.
Currently cat file.txt | grep "addition" simply yields nothing.
You can play around with the agrep command. It can perform fuzzy, approximate matches.
The following command worked for me:
agrep -2 addition file
You can't do a fuzzy match with standard grep, but if there are specific misspellings you're interested in, you could construct a regular expression that matches those.
For example:
grep 'add[it]*on' file.txt
matches addtion from your example (though not additoin, where the o comes before the final i; a wider class such as add[iot]*n would catch both). But that's probably not general enough for your purposes.
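A quick way to see what such a pattern catches is to run it against the sample from the question. In the sketch below the helper name typo_hits and the second, wider pattern add[iot]*n are illustrative assumptions rather than part of the answer.

```shell
#!/bin/sh
# Hypothetical helper: show matching lines, with line numbers, for a pattern.
# $1 = basic regular expression, $2 = file to search
typo_hits() {
    grep -n -- "$1" "$2"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
function addtion(number, increment)
return number+increment
end
function additoin(number, increment)
return number+increment
end
EOF

echo "add[it]*on:"
typo_hits 'add[it]*on' "$sample"    # matches line 1 only
echo "add[iot]*n:"
typo_hits 'add[iot]*n' "$sample"    # matches lines 1 and 4
rm -f "$sample"
```

Both patterns also match the correct spelling addition, so a hand-written class like this finds candidate names but cannot tell right from wrong.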
A better approach is likely going to be to use some sort of static analysis tool specific to the language the code is in. It might not give you the right spelling, but should be able to tell you where the function name and calls to the function use different spellings.
Try the spell command. Note: You might need a dictionary (usually aspell-en in your distro's repositories).
As the answer says, you should definitely try agrep. In addition, there is a newer and much faster alternative ugrep for fuzzy search. Use -Z2 to allow up to 2 errors:
ugrep -Z2 addition file.txt
An insertion, deletion, or substitution is one error. A transposition (as in additoin) counts as two errors, i.e. two substitutions. Use option -i for case-insensitive search and -w to match whole words.
Try this in a Linux terminal:
grep -rnw "text" ./

Hyphen usage on Linux command options

Until recently, I was under the impression that by convention, all Linux command options were required to be prefixed by a hyphen (-). So for example, the instruction ls -l executes the ls command with the l option (here we can see that the l option is prefixed by a hyphen).
Life was good until I got to the chapter of my Linux for beginners book that explained the ps command. There I learned that I could write something like ps u U xyz where, as far as I can tell, the u and U are options that are not required to be prefixed by a hyphen. Normally, I would have expected to have to write that same command as something like ps -uU xyz to force the usage of a hyphen.
I realize that this is probably a stupid question but I was wondering if there is a particular reason as to why the ps command does not follow what I thought was the standard way of specifying command options (prefixing them with hyphens). Why the variation? Is there a particular meaning to specifying hyphen-less options like that?
There are a handful of old programs on Unix that were written when the conventions were not as widely adopted, and ps is one of them. Another example is tar, although it has since been updated to allow options both with and without the - prefix.
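The tar case is easy to check for yourself. The sketch below creates a throwaway directory, builds the same archive once with the old hyphen-less syntax and once with a hyphen, and lists both; the function name list_archive_both_ways and the file names are made up for illustration, and it assumes a tar that accepts both forms, as the answer describes.

```shell
#!/bin/sh
# Hypothetical demo, run in a subshell so the cd does not leak out.
list_archive_both_ways() (
    workdir=$(mktemp -d) || exit 1
    cd "$workdir" || exit 1
    echo hello > note.txt
    tar cf old-style.tar note.txt      # options without a hyphen
    tar -cf dash-style.tar note.txt    # the same options with a hyphen
    tar tf old-style.tar               # list contents, old style
    tar -tf dash-style.tar             # list contents, hyphen style
    cd / && rm -rf "$workdir"
)

list_archive_both_ways
```

Both listings print note.txt. The same experiment should not be assumed to work for ps, where the hyphenated and hyphen-less forms of a letter can mean different things.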
IMO the best practice is to use hyphens by default. More often than not, commands accept hyphen-prefixed forms for most or all of their flags/options. Happy to be corrected if I am wrong in this instance. I am still new to this myself! :)

How can I view log files in Linux and apply custom filters while viewing?

I need to read through some gigantic log files on a Linux system. There's a lot of clutter in the logs. At the moment I'm doing something like this:
cat logfile.txt | grep -v "IgnoreThis\|IgnoreThat" | less
But it's cumbersome -- every time I want to add another filter, I need to quit less and edit the command line. Some of the filters are relatively complicated and may be multi-line.
I'd like some way to apply filters as I am reading through the log, and a way to save these filters somewhere.
Is there a tool that can do this for me? I can't install new software so hopefully it's something that would already be installed -- e.g., less, vi, something in a Python or Perl lib, etc.
Changing the code that generates the log to generate less is not an option.
Use the &pattern command within less.
From the man page for less:
&pattern
Display only lines which match the pattern; lines which do not
match the pattern are not displayed. If pattern is empty (if
you type & immediately followed by ENTER), any filtering is
turned off, and all lines are displayed. While filtering is in
effect, an ampersand is displayed at the beginning of the
prompt, as a reminder that some lines in the file may be hidden.
Certain characters are special as in the / command:
^N or !
Display only lines which do NOT match the pattern.
^R Don't interpret regular expression metacharacters; that
is, do a simple textual comparison.
Try the multitail tool. As well as letting you view multiple logs at once, I'm pretty sure it lets you apply regex filters interactively.
Based on ghostdog74's answer and the less manpage, I came up with this:
~/.bashrc:
export LESSOPEN='|~/less-filter.sh %s'
export LESS=-R # to allow ANSI colors
~/less-filter.sh:
#!/bin/sh
case "$1" in
*logfile*.log*) sed -f ~/less-filter.sed "$1"   # run the sed script on the file less is opening
;;
esac
~/less-filter.sed:
/deleteLinesLikeThis/d # to filter out lines
s/this/that/ # to change text on lines (useful to colorize using ANSI escapes)
Then:
less logfileFooBar.log.1 -- applies the filter automatically.
cat logfileFooBar.log.1 | less -- to see the log without filtering
This is adequate for now but I would still like to be able to edit the filters on the fly.
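To check a filter file before wiring it into LESSOPEN, the sed script can be run by hand on a few lines. A minimal sketch using the two rules shown above; the helper name apply_filter and the sample log lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a sed script file over stdin,
# roughly what less-filter.sh does for a matching log file.
apply_filter() {
    sed -f "$1"
}

filter=$(mktemp)
cat > "$filter" <<'EOF'
/deleteLinesLikeThis/d
s/this/that/
EOF

printf '%s\n' 'keep this line' 'deleteLinesLikeThis please' 'nothing to change' |
    apply_filter "$filter"
rm -f "$filter"
```

The second input line is dropped and the first comes out as keep that line. Editing the filter file and reopening the log in less picks up the new rules, which is the closest this setup gets to changing filters on the fly.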
See the man page of less. There are some options you can use to search for words, for example. It has a line-editing mode as well.
There's an application by Casstor Software Solutions called LogFilter (www.casstor.com) that can edit Windows/Mac/Linux text files and can easily perform file filtering. It supports multiple filters as well as regular expressions. I think it might be what you're looking for.

Resources