How to generate a random string without repeated characters? - linux

I'm trying to generate a password in Bash that meets the macOS password requirements, one of which is that it can't contain consecutive repeated characters (aa, bb, 44, 00, etc.).
I know I can use openssl rand -base64 or /dev/urandom and use tr -d to manipulate the output string. I use grep -E '(.)\1{1,}' to search for repeated characters, but if I use this regex to delete them (tr -d '(.)\1{1,}'), it deletes the entire string. I even tried tr -s '(.)\1{1,}' to squeeze the characters to a single occurrence, but it still generates repeated characters on some attempts. Is it possible to achieve what I'm trying to do?
P.S.: This is a situation where I can't download any password generator tool such as pwgen. It must be native.

Sorry, I have no Bash at hand to test this, but I'll try to help.
What about iteratively collecting the unique characters, e.g.:
chars=$(openssl rand -base64 32)
pwd=
for (( i=0; i<${#chars}; i++ )); do
    c=${chars:$i:1}
    # append the character only if it is not already part of the password
    if [[ $pwd != *"$c"* ]]; then
        pwd=$pwd$c
    fi
done

The issue might be that the output contains non-printable characters, so the characters are not actually repeated. If you delete everything that is not alphanumeric or punctuation, take the first 30 (by default) of the remaining characters, squeeze repeats, and then keep the first 20 characters of whatever is left, it seems to work:
cat /dev/urandom | tr -dc '[:alnum:][:punct:]' | fold -w ${1:-30} | head -n 1 | tr -s '[:alnum:][:punct:]' | cut -c-20
Output e.g.:
]'Zc,fs6m;wUo%wLIG%K
2O3Ff4dzi30~L.RH8jR0
sU?,WkK]&I;z'|eTSLjY
5gK]\H51i#Rtux.{bdC=
:g"\?5JsjBd1r])2^WR+
;{cR:jY\rIc&Q(2yo:|-
fFykmxvZ|ATX_l6L(8h:
^Sd*,V%9}bWnTYNv"w?'
6foMgbU6:n<*cWj2W=3&
*v39FWmB#LwE5O`a3C36

Is there a specific size requirement? Other required characters?
How about -
openssl rand -base64 20 | sed -E 's/(.)\1+/\1/g'
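To see what the substitution does, here is a minimal sketch on a fixed input. The helper name squeeze_repeats and the sample string are made up for illustration. Note that squeezing shortens the string, so the result can be shorter than the length originally requested.

```shell
# (.)\1+ matches a character followed by one or more copies of itself;
# replacing the whole run with \1 keeps a single copy.
# squeeze_repeats is a hypothetical helper name.
squeeze_repeats() {
    sed -E 's/(.)\1+/\1/g'
}

printf 'aabbbc44d\n' | squeeze_repeats   # prints abc4d
```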

You're getting close. tr doesn't use regex (but does use POSIX character classes).
Either of these will squeeze repeats:
tr -cs '\0'
tr -s '[:graph:][:space:]'
They differ only in how we refer to "all characters". The first is "the complement of null"; the second is all printable and all whitespace characters. There may be a neater way to specify "all characters".
Or using sed:
sed -E 's/(.)\1+/\1/g'
This both squeezes printable characters, and removes white space:
tr -ds '[:space:]' '[:graph:]'
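Example for 32 non-whitespace characters with no consecutive repeats (bytes from /dev/urandom that are neither whitespace nor printable pass through this command unchanged):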
tr -ds '[:space:]' '[:graph:]' < /dev/urandom |
dd bs=32 count=1
Also, this example specifies a list of allowed characters (letters, digits, and _.), then squeezes any repeats:
tr -dc '[:alnum:]_.' < /dev/urandom |
tr -sc '\0' |
dd bs=32 count=1
Example output:
9mCEqrhHPwmq7.1qEky6qn4jqzDpRK7b
Putting dd at the end means we get 32 characters after removing repeats. You may also want to add status=none to hide dd logging on stderr.
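The pieces above can be wrapped into a small function. This is only a sketch under a few assumptions: the name gen_password and the default length of 20 are made up, head -c is used instead of dd to take the final characters, and LC_ALL=C is set because some tr implementations reject random bytes in a UTF-8 locale.

```shell
# gen_password [LEN]: print LEN characters (default 20, an assumed value)
# from the allowed set, with no two identical characters next to each other.
gen_password() {
    len=${1:-20}
    LC_ALL=C tr -dc '[:alnum:]_.' < /dev/urandom |   # keep only allowed characters
        LC_ALL=C tr -s '[:alnum:]_.' |               # squeeze consecutive repeats
        head -c "$len"                               # truncate after squeezing
    echo                                             # finish with a newline
}

gen_password 32
```

Because the truncation happens after the squeeze, the result always has the requested length.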

It's not clear whether you want to forbid only consecutive repeated characters or any repeated character at all. Either way, I don't think it's a good idea, as it makes your passwords weaker and easier to guess. Having said that:
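#!/bin/bash
awk -v N=20 '
{
    # split the line into single characters (an empty separator works in gawk, mawk and BSD awk)
    n = split($0, ch, "")
    out = ""
    for (i = 1; i <= n && length(out) < N; i++) {
        # keep the first occurrence of each character; skip the base64 "=" padding
        if (ch[i] != "=" && !(ch[i] in seen)) {
            seen[ch[i]]
            out = out ch[i]
        }
    }
    # characters are printed in input order, since the order of "for (c in a)" is unspecified
    print out
}
' < <(openssl rand -base64 32)
This generates passwords of up to N characters, without any repeated character, from 32 random bytes (so N should be much smaller than 32).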

Related

How can I truncate a line of text longer than a given length?

How would you go about removing everything after x number of characters? For example, cut everything after 15 characters and add ... to it.
This is an example sentence should turn into This is an exam...
GNU head can count bytes rather than lines:
head -c 15 <<<'This is an example sentence'
Note, however, that head -c only deals with bytes, so it can cut multi-byte characters, such as the UTF-8 encoded umlaut ü, in half.
Bash built-in string indexing works:
str='This is an example sentence'
echo "${str:0:15}"
Output:
This is an exam
And finally something that works with ksh, dash, zsh…:
printf '%.15s\n' 'This is an example sentence'
Even programmatically:
n=15
printf '%.*s\n' "$n" 'This is an example sentence'
If you are using Bash, you can directly assign the output of printf to a variable and save a sub-shell call with:
trim_length=15
full_string='This is an example sentence'
printf -v trimmed_string '%.*s' "$trim_length" "$full_string"
Use sed:
echo 'some long string value' | sed 's/\(.\{15\}\).*/\1.../'
Output:
some long strin...
This solution has the advantage that short strings do not get the ... tail added:
echo 'short string' | sed 's/\(.\{15\}\).*/\1.../'
Output:
short string
So it's one solution for all sized outputs.
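One edge case: a string of exactly 15 characters also matches the pattern above (with .* matching nothing), so it gets the ... tail as well. A possible variation, sketched here with a made-up helper name truncate_line, requires at least one extra character before it truncates:

```shell
# truncate_line MAX: read lines on stdin, cut those longer than MAX
# characters and append "..." (truncate_line is a hypothetical name).
truncate_line() {
    # "..*" needs at least one character beyond the first MAX,
    # so lines of exactly MAX characters are left alone.
    sed "s/\(.\{$1\}\)..*/\1.../"
}

echo 'some long string value' | truncate_line 15   # some long strin...
echo 'exactly 15 char' | truncate_line 15          # exactly 15 char
```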
Use cut:
echo "This is an example sentence" | cut -c1-15
This is an exam
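This selects characters 1-15. The -c option is meant to handle multi-byte characters, whereas -b selects bytes, although GNU cut has historically treated -c the same as -b. See cut(1):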
-b, --bytes=LIST
select only these bytes
-c, --characters=LIST
select only these characters
Awk can also accomplish this:
$ echo 'some long string value' | awk '{print substr($0, 1, 15) "..."}'
some long strin...
In awk, $0 is the current line. substr($0, 1, 15) extracts characters 1 through 15 from $0. The trailing "..." appends three dots.
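Note that the command above appends the dots even to lines shorter than 15 characters. A small sketch that adds them only when something was actually cut off (the function name awk_truncate and the awk variables n and tail are assumptions of this example):

```shell
# awk_truncate MAX: read lines on stdin and shorten those longer than MAX.
awk_truncate() {
    awk -v n="$1" -v tail='...' '{
        if (length($0) > n)
            print substr($0, 1, n) tail   # line was cut: add the tail
        else
            print                         # short line: print unchanged
    }'
}

echo 'some long string value' | awk_truncate 15   # some long strin...
echo 'short string' | awk_truncate 15             # short string
```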
Todd actually has a good answer; however, I chose to change it a little to improve the function and remove unnecessary parts :p
trim() {
    if (( "${#1}" > "$2" )); then
        echo "${1:0:$2}$3"
    else
        echo "$1"
    fi
}
In this version, the first argument is the text itself, the second is the maximum length, and the third is the text appended to strings that get truncated.
No need for variables :)
Using Bash Shell Expansions (No External Commands)
If you don't care about shell portability, you can do this entirely within Bash using a number of different shell expansions in the printf builtin. This avoids shelling out to external commands. For example:
trim () {
    local str ellipsis_utf8
    local -i maxlen
    # use explaining variables; avoid magic numbers
    str="$*"
    maxlen="15"
    ellipsis_utf8=$'\u2026'
    # only truncate $str when longer than $maxlen
    if (( "${#str}" > "$maxlen" )); then
        printf "%s%s\n" "${str:0:$maxlen}" "${ellipsis_utf8}"
    else
        printf "%s\n" "$str"
    fi
}
trim "This is an example sentence." # This is an exam…
trim "Short sentence." # Short sentence.
trim "-n Flag-like strings." # -n Flag-like st…
trim "With interstitial -E flag." # With interstiti…
You can also loop through an entire file this way. Given a file containing the same sentences above (one per line), you can use the read builtin's default REPLY variable as follows:
while read -r; do
    trim "$REPLY"
done < example.txt
Whether or not this approach is faster or easier to read is debatable, but it's 100% Bash and executes without forks or subshells.

How to double all words of latin letters in the string with sed?

I was trying to double all Latin words in a string; each doubled word also needs to be wrapped in parentheses ("lol" -> "(lollol)").
"Word" here means a sequence of Latin letters ([A-Za-z]\+). I have tried a lot of things, like:
1) ls -l /bin | sed "s/[^ ][A-Za-z][^ ]/(&&)/g", but it doubles all sequences, even those containing special symbols and digits.
2) I also had the idea of wrapping all unsuitable words in '|' delimiters:
(ls -l /bin | sed "s/[^ ][^A-Za-z ][^ ]/|&|/g"), then doubling all words without delimiters (I haven't worked out how yet) and removing the '|' delimiters. I realize that using 3 sed commands is inefficient and that the sequence may already contain a '|' symbol (though I know how to solve that problem).
So after a few days of struggle I've decided to ask for help.
Here are some examples:
1) "rwx" -> "(rwxrwx)"
2) "-rwx" -> "-rwx"
3) "jk2l" -> "jk2l"
4) "jkl" -> "(jkljkl)"
Now string examples:
1) "I want 2 sh-w" -> "(II) (wantwant) 2 sh-w"
2) "-rwx but th1-s" -> "-rwx (butbut) th1-s"
This might work for you (GNU sed):
sed -r ':a;s/(^|\s)([[:alpha:]]+)(\s|$)/\1(\2\2)\3/g;ta' file
This matches words that contain only alphabetic characters and replaces them in the desired style, then repeats the substitution to pick up any matches missed in the first pass.
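The reason a second pass is needed: adjacent words share a single space, and once that space has been consumed as the trailing separator of one match it cannot serve as the leading separator of the next. A minimal sketch comparing a single pass with the loop; the helper names are made up, and [[:space:]] stands in for the GNU-specific \s:

```shell
# Both helpers read lines on stdin (hypothetical names for this sketch).
double_once() {
    sed -E 's/(^|[[:space:]])([[:alpha:]]+)([[:space:]]|$)/\1(\2\2)\3/g'
}
double_all() {
    # :a defines a label; ta jumps back to it while substitutions still happen
    sed -E ':a
s/(^|[[:space:]])([[:alpha:]]+)([[:space:]]|$)/\1(\2\2)\3/g
ta'
}

echo 'I want 2 sh-w' | double_once   # (II) want 2 sh-w
echo 'I want 2 sh-w' | double_all    # (II) (wantwant) 2 sh-w
```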
I'm going to suggest awk for this task:
$ cat file
rwx
-rwx
jk2l
jkl
I want 2 sh-w
-rwx but th1-s
$ awk '{ for (i=1; i<=NF; ++i) if ($i ~ /^[A-Za-z]+$/) $i = "(" $i $i ")" }1' file
(rwxrwx)
-rwx
jk2l
(jkljkl)
(II) (wantwant) 2 sh-w
-rwx (butbut) th1-s

how to count occurrence of specific word in group of file by bash/shellscript

I have two text files, 'simple.txt' and 'simple1.txt', with the following data in them:
simple.txt--
hello
hi hi hello
this
is it
simple1.txt--
hello hi
how are you
[]$ tr ' ' '\n' < simple.txt | grep -i -c '\bh\w*'
4
[]$ tr ' ' '\n' < simple1.txt | grep -i -c '\bh\w*'
3
These commands show the number of words that start with "h" in each file, but I want to display the total count for both files, i.e. 7. Can I do this in a single command or shell script?
P.S.: I had to write two commands because tr does not take two file names.
Try this, the straightforward way :
cat simple.txt simple1.txt | tr ' ' '\n' | grep -i -c '\bh\w*'
This alternative requires no pipelines:
$ awk -v RS='[[:space:]]+' '/^h/{i++} END{print i+0}' simple.txt simple1.txt
7
How it works
-v RS='[[:space:]]+'
This tells awk to treat each word as a record.
/^h/{i++}
For any record (word) that starts with h, we increment variable i by 1.
END{print i+0}
After we have finished reading all the files, we print out the value of i.
It is not the case that tr accepts only one filename: it does not accept any filename and always reads from stdin. That's why, even in your solution, you didn't pass a filename to tr but used input redirection.
In your case, I think you can replace tr by fmt, which does accept filenames:
fmt -1 simple.txt simple1.txt | grep -i -c -w 'h.*'
(I also changed the grep a bit, because I personally find it more readable this way, but this is a matter of taste.)
Note that both solutions (mine and your original ones) would count a string consisting of letters and one or more non-space characters - for instance the string haaaa.hbbbbbb.hccccc - as a "single block", i.e. it would only add 1 to the count of "h"-words, not 3. Whether or not this is the desired behaviour, it's up to you to decide.
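If every match should be counted rather than every whitespace-separated block, one possible variation is to let grep print each match on its own line with -o and then count the lines. This is only a sketch: it recreates the two sample files in a temporary directory, and \b and \w are GNU-style extensions that your grep may not support.

```shell
dir=$(mktemp -d)
printf 'hello\nhi hi hello\nthis\nis it\n' > "$dir/simple.txt"
printf 'hello hi\nhow are you\n' > "$dir/simple1.txt"

# -o prints each match on its own line, -h suppresses file name prefixes,
# -i ignores case; wc -l then counts the matches across both files.
count=$(grep -ohi '\bh\w*' "$dir/simple.txt" "$dir/simple1.txt" | wc -l)
echo "$count"   # 7 for the sample data

rm -r "$dir"
```

With this approach a string such as haaaa.hbbbbbb.hccccc contributes 3 to the count instead of 1.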

Find HEX value in file and grep the following value

I have a 2GB file in raw format. I want to search for all occurrences of a specific HEX value "355A3C2F74696D653E" and collect the 28 characters that follow each one.
Example: 355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135
In this case I want the output: "323031312D30342D32365431343A34373A30322D31343A34373A3135" or better: 2011-04-26T14:47:02-14:47:15
I have tried with
xxd -u InputFile | grep '355A3C2F74696D653E' | cut -c 1-28 > OutputFile.txt
and
xxd -u -ps -c 4000000 InputFile | grep '355A3C2F74696D653E' | cut -b 1-28 > OutputFile.txt
But I can't get it working.
Can anybody give me a hint?
As you are using xxd it seems to me that you want to search the file as if it were binary data. I'd recommend using a more powerful programming language for this; the Unix shell tools assume there are line endings and that the text is mostly 7-bit ASCII. Consider using Python:
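#!/usr/bin/env python3
import mmap

needle = b"\x35\x5A\x3C\x2F\x74\x69\x6D\x65\x3E"  # the bytes "5Z</time>"
with open("file_to_search", "rb") as fd:
    # map the file read-only so it is searched without being read into memory at once
    haystack = mmap.mmap(fd.fileno(), length=0, access=mmap.ACCESS_READ)
    i = haystack.find(needle)
    while i >= 0:
        i += len(needle)
        # print the 28 bytes that follow the match
        print(haystack[i:i + 28].decode("ascii", errors="replace"))
        i = haystack.find(needle, i)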
If your grep supports -P parameter then you could simply use the below command.
$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{28}'
323031312D30342D32365431343A
For 56 hex digits (28 bytes):
$ echo '355A3C2F74696D653E323031312D30342D32365431343A34373A30322D31343A34373A3135' | grep -oP '355A3C2F74696D653E\K.{56}'
323031312D30342D32365431343A34373A30322D31343A34373A3135
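Since the hex value 355A3C2F74696D653E is just the ASCII text 5Z</time>, a possible shortcut is to search the raw file directly and skip the hex conversion. The sketch below assumes GNU grep with -P support; the sample file and its contents are made up for the example.

```shell
# Build a small binary sample file (hypothetical data for illustration).
sample=$(mktemp)
printf 'junk\001\377 5Z</time>2011-04-26T14:47:02-14:47:15 more\n' > "$sample"

# -a treats the binary file as text, -o prints only the match,
# \K drops the marker itself, .{28} takes the 28 characters after it.
# LC_ALL=C makes grep work on bytes regardless of the locale.
found=$(LC_ALL=C grep -aoP '5Z</time>\K.{28}' "$sample")
echo "$found"   # 2011-04-26T14:47:02-14:47:15

rm -f "$sample"
```

Keep in mind that grep works line by line, so a large binary file with very few newline bytes may need a lot of memory.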
Why convert to hex first? See if this awk script works for you. It looks for the string you want to match on, then prints the next 28 characters. Only the slash needs to be escaped with a backslash in the pattern (in gawk, \< and \> are word-boundary operators, so < and > must not be escaped).
Adapted from this post: Grep characters before and after match?
I added some blank lines for readability.
VirtualBox:~$ cat data.dat
Thisis a test of somerandom characters before thestringI want5Z</time>2011-04-26T14:47:02-14:47:15plus somemoredata
VirtualBox:~$ cat test.sh
awk '/5Z<\/time>/ {
    match($0, /5Z<\/time>/); print substr($0, RSTART + 9, 28);
}' data.dat
VirtualBox:~$ ./test.sh
2011-04-26T14:47:02-14:47:15
VirtualBox:~$
EDIT: I just realized something. The regular expression may need tweaking (to be non-greedy, etc.), and the awk script needs to be adapted to handle multiple occurrences on one line, as you need. Perhaps some of the folks more up on awk can chime in with improvements, as I am really rusty. It's an approach to consider anyway.

Getting n-th line of text output

I have a script that generates two lines as output each time. I'm really just interested in the second line. Moreover I'm only interested in the text that appears between a pair of #'s on the second line. Additionally, between the hashes, another delimiter is used: ^A. It would be great if I can also break apart each part of text that is ^A-delimited (Note that ^A is SOH special character and can be typed by using Ctrl-A)
output | sed -n '1p' #prints the 1st line of output
output | sed -n '1,3p' #prints the 1st, 2nd and 3rd line of output
your.program | tail -n +2 | cut -d# -f2
should get you 2/3 of the way.
Improving Grumdrig's answer:
your.program | head -n 2 | tail -n 1 | cut -d# -f2
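To cover the remaining part (splitting on ^A), the pipeline can be extended with tr, which accepts the octal escape \001 for SOH. A minimal sketch, with a made-up helper name and made-up sample input:

```shell
# Reads the program output on stdin and prints each ^A-separated piece of
# the text between the #s on line 2, one per line.
# extract_fields is a hypothetical name.
extract_fields() {
    sed -n '2p' |          # keep only the second line
        cut -d '#' -f 2 |  # text between the first pair of #s
        tr '\001' '\n'     # turn each SOH (^A) into a newline
}

printf 'first line\nbefore#a\001b\001c#after\n' | extract_fields
```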
I'd probably use awk for that.
your_script | awk -F# 'NR == 2 && NF == 3 {
    # "\001" is the SOH (^A) control character
    num_tokens=split($2, tokens, "\001")
    for (i = 1; i <= num_tokens; ++i) {
        print tokens[i]
    }
}'
This says
1. Set the field separator to #
2. On lines that are the 2nd line, and also have 3 fields (text#text#text)
3. Split the middle (2nd) field using "^A" as the delimiter into the array named tokens
4. Print each token
Obviously this makes a lot of assumptions. You might need to tweak it if, for example, # or ^A can appear legitimately in the data, without being separators. But something like that should get you started. You might need to use nawk or gawk or something, I'm not entirely sure if plain awk can handle splitting on a control character.
bash:
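read -r                  # skip the first line
read -r line             # read the second line
result="${line#*#}"      # drop everything up to the first #
result="${result%#*}"    # drop everything from the last #
IFS=$'\001' read -r -a result <<< "$result"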
$result is now an array that contains the elements you're interested in. Just pipe the output of the script to this one.
Here's a possible awk solution:
awk -F"#" 'NR==2{
    for(i=2;i<=NF;i+=2){
        n=split($i,a,"\001") # split on SOH
        for(o=1;o<=n;o++) print a[o] # print each piece in order
    }
}' file

Resources