Searching an entire drive for plaintext passwords - linux

I have an encrypted database of about 6000 unique passwords, and I want to search about 1TB of data for any instance of these passwords. I am using Cygwin, but I could have the drive available in a real linux environment if I needed to.
I have a file "ClientPasswords.txt" which contains every unique password only once, one password per line. I am trying to compare every file in my T:/ drive to this list.
I am using this command:
grep -nr -F -f ClientPasswords.txt /cygdrive/t 2> SuspectFiles.txt
My goal is to generate a list of all files, "SuspectFiles.txt", that contain any known password in our password database in plaintext so that we can redact sensitive information from the drive.
Currently, it is getting a ton of false positives, including some that don't seem to match anything in the list. I have already eliminated all passwords that are fewer than 6 characters, can be found in the dictionary (or are otherwise known client names), or are just numbers.
I would like to:
Limit it to a select few filetypes (txt, csv, xls, xlsx, doc, docx, etc.)
Eliminate all compressed files (or find a way to search inside them)
Limit snippet output to prevent dumping entire binary files into the output file.
Anyone done something similar, or know of an easier way to search for these improperly documented passwords from a blacklist? I have also played around with the Windows program "Agent Ransack", but it seems much more limited than grep.
Thanks!

First thing you want to do is make a list of all files on drive T that are of the right type, and output that to a list of target file names. Use a shell script:
for i in txt csv xls xlsx doc docx
do
    find /cygdrive/t -name "*.$i" >> target_file_names.txt
done
Now that you have the target file names, you can search those files for your passwords.
while IFS= read -r target_file
do
    while IFS= read -r pwd
    do
        # -q: only set the exit status; -F: treat the password as a fixed string, not a regex
        grep -qF -- "$pwd" "$target_file" && echo "$target_file has password $pwd"
    done < ClientPasswords.txt
done < target_file_names.txt
Something like that should work. You might have to tweak it a little.
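If you would rather stay inside grep itself, GNU grep (which Cygwin ships) can already cover most of the wish list in one pass: --include restricts the search to certain extensions, -l prints only the names of matching files instead of dumping matched lines, and -I skips binary files. A hedged sketch follows; note that xls/xlsx/doc/docx are binary or zip-based formats, so grep may not see text stored inside them, and compressed archives need a separate step (for example zipgrep from the unzip package, or unpacking them first).
# Assumes GNU grep. -l lists matching file names only, -I skips binary files,
# --include limits the search to the listed extensions. The file list goes to
# SuspectFiles.txt; errors go to a separate file.
grep -rlFI \
     --include='*.txt'  --include='*.csv' \
     --include='*.xls'  --include='*.xlsx' \
     --include='*.doc'  --include='*.docx' \
     -f ClientPasswords.txt /cygdrive/t \
     > SuspectFiles.txt 2> grep_errors.txt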

Related

Bash script to copy file by type

How do I use the file command to copy the files in a directory according to their type? I know I can use file to find the type of a file, but I don't know how to use it in an if condition.
What I want to achieve is this: I need to tidy up my downloads folder. When I run the script, I want the files in that folder to be moved into dedicated folders according to their type. For example, image files should be moved to a folder named "Images", video files to "Videos", executables to "Programs", and so on.
Something like this?
for filename in ./*; do
    case $(file -b -i "$filename") in
        inode/directory* | inode/symlink*)
            echo "$0: skip $filename" >&2
            continue;;
        application/*) dest=Random;;
        image/*)       dest=Images;;
        text/html*)    dest=Webpages;;
        text/plain*)   dest=Documents;;
        video/*)       dest=Videos;;
        *)             dest=Unknown;;
    esac
    mkdir -p "$dest"
    mv "$filename" "$dest/"
done
The mapping of MIME types (the -i option) to your hierarchy of directories isn't entirely straightforward. The application MIME type hierarchy in particular covers a vast number of document types (PDF, Excel, etc.), some of which also have designated types, as well as the completely unspecified generic application/octet-stream. Using something other than MIME types is often even more problematic, as the labels that file prints are free-form human-readable text that can be essentially arbitrary: different versions of the same file format may be detected with differently worded labels, so you might get Evil Empire Insult (tm) format 1997 from one file and Insult 2000 from another with the same extension.
Probably do a test run with file -i ./* and examine the results you get, then update the code above with cases which actually make sense for your specific files.
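One quick way to do that dry run without moving anything is to print the detected MIME type next to each name. This is just a small sketch; the only thing it shares with the answer above is the file -b -i call.
# Dry run: show each file's MIME type so you can decide which case
# patterns make sense before moving anything.
for filename in ./*; do
    printf '%s\t%s\n' "$(file -b -i "$filename")" "$filename"
done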

Making a text file grow by replicating on mac

I am doing some unit tests and I have a small text file (a few kilobytes) and what I would like to do is make a new file out of this where the same text is replicated over and over again for some user specified times. The reason I want to do this is to ensure that my algorithm can handle large files and the results are correct (I can extrapolate the correct results from the tests ran on the smaller text file).
Is there a utility on the mac or linux platform that allows me to do that?
You can use a for loop and concatenate the contents of the file to a temporary file.
COUNT=10          # larger or smaller, depending on how large you want the file
FILENAME=test.txt
# remove the mv command if you do not wish the original file to be overwritten
for i in $(seq 1 "$COUNT"); do
    cat "$FILENAME" >> "$FILENAME.tmp"
done && mv "$FILENAME.tmp" "$FILENAME"
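If the file needs to grow to many times its original size, repeatedly doubling it gets there in far fewer passes than appending one copy per iteration. A small sketch; DOUBLINGS is an illustrative name, and the final size is roughly 2^DOUBLINGS times the original.
FILENAME=test.txt
DOUBLINGS=10      # final size is roughly 2^10 = 1024 times the original

for i in $(seq 1 "$DOUBLINGS"); do
    # concatenate the file with itself, then replace the original
    cat "$FILENAME" "$FILENAME" > "$FILENAME.tmp" && mv "$FILENAME.tmp" "$FILENAME"
done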

Add comments next to files in Linux

I'm interested in simply adding a comment next to my files in Linux (Ubuntu). An example would be:
info user ... my_data.csv Raw data which was sent to me.
info user ... my_data_cleaned.csv Raw data with duplicates filtered.
info user ... my_data_top10.csv Cleaned data with only top 10 values selected for each ID.
So, sort of the way you can comment commits in Git. I don't particularly care about searching on these tags, filtering them, etc.; just seeing them when I list files in a directory. Bonus if the comments/tags follow the document around as I copy or move it.
Most filesystem types support extended attributes where you could store comments.
So for example to create a comment on "foo.file":
xattr -w user.comment "This is a comment" foo.file
The attributes can be copied/moved with the file; just be aware that many utilities require special options to copy the extended attributes.
Then to list files with comments use a script or program that grabs the extended attribute. Here is a simple example to use as a starting point, it just lists the files in the current directory:
#!/bin/sh
ls -1 | while read -r FILE; do
    comment=$(xattr -p user.comment "$FILE" 2>/dev/null)
    if [ -n "$comment" ]; then
        echo "$FILE Comment: $comment"
    else
        echo "$FILE"
    fi
done
The xattr command is really slow and poorly written (it doesn't even return error status) so I suggest something else if possible. Use setfattr and getfattr in a more complex script than what I have provided. Or maybe a custom ls command that is aware of the user.comment attribute.
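As a starting point with those tools, here is a minimal sketch of the same listing. It assumes the Linux attr package (setfattr/getfattr) is installed and uses the same user.comment attribute as above.
# Write a comment:
#   setfattr -n user.comment -v "Raw data which was sent to me." my_data.csv

# List files in the current directory together with their comment, if any:
for f in ./*; do
    comment=$(getfattr --only-values -n user.comment "$f" 2>/dev/null)
    if [ -n "$comment" ]; then
        printf '%s\tComment: %s\n' "$f" "$comment"
    else
        printf '%s\n' "$f"
    fi
done
To keep the attributes when files are copied around on GNU/Linux, cp needs --preserve=xattr, rsync needs -X, and tar needs --xattrs; a plain mv within the same filesystem preserves them automatically.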
This is a moderately serious challenge. Basically, you want to add attributes to files, keep the attributes when the file is copied or moved, and then modify ls to display the values of these attributes.
So, here's how I would attack the problem.
1) Store the information in a SQLite database. You can probably get away with one table. The table should contain the complete path to the file, and your comment. I'd name the database something like ~/.dirinfo/dirinfo.db. I'd store it in a subfolder, because you may find later on that you need other information in this folder. It'd be nice to key on inodes rather than pathnames, but inodes change more often than you'd expect (many editors save by writing a new file and renaming it over the old one, for example). Still, you might be able to do something where you store both the inode and the pathname, and retrieve by pathname only if the retrieval by inode fails, in which case you'd then update the inode information.
2) Write a bash script to create/read/update/delete the comment for a given file (a minimal sketch is given at the end of this answer).
3) Write another bash function or script that works with ls. I wouldn't call it "ls" though, because you don't want to mess with all the command line options that are available to ls. You're going to be calling ls always as ls -1 in your script, possibly with some sort options, such as -t and/or -r. Anyway, your script will call ls -1 and loop through the output, displaying the file name, and the comment, which you'll look up using the script from 2). You may also want to add file size, but that's up to you.
4) Write functions to replace mv and cp (and ln??). These would be wrapper functions that update the information in your table and then call the regular Unix versions of these commands, passing along any arguments received by the functions (i.e. "$@"). If you're really paranoid, you'd also do it for things like scp, which can be used (inefficiently) to copy files locally. Still, it's unlikely you'll catch all the possibilities. What if someone else does a mv on your file, who doesn't have the function you have? What if some script moves the file by calling /bin/mv? You can't easily get around these kinds of issues.
Or if you really wanted to get adventurous, you'd write some C/C++ code to do this. It'd be faster, and honestly not all that much more challenging, provided you understand fork() and exec(). SQLite does have a C API (it is itself a C library), so you'd have to tangle with that too, but since you only have one database and one table, that shouldn't be too challenging.
You could do it in Perl, too, but I'm not sure that it would be much easier in Perl than in bash. Your actual code isn't that complex, and you're not likely to be doing any crazy regex stuff or string manipulations. There are just lots of small pieces to fit together.
Doing all of this is much more work than should be expected for a person answering a question here, but I've given you the overall design. Implementing it should be relatively easy if you follow the design above and can live with the constraints.
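By way of illustration, here is a minimal sketch of the script from step 2 using the sqlite3 command-line tool. The database location and schema are just the ones suggested above, the script name is made up, and quoting of embedded single quotes in paths or comments is deliberately not handled; treat it as a starting point, not a finished tool.
#!/bin/sh
# filecomment: set, get, or delete the comment stored for a file.
# Usage: filecomment set FILE "text" | filecomment get FILE | filecomment del FILE
DB="$HOME/.dirinfo/dirinfo.db"

if [ $# -lt 2 ]; then
    echo "usage: $0 {set|get|del} file [comment]" >&2
    exit 1
fi

mkdir -p "$(dirname "$DB")"
sqlite3 "$DB" 'CREATE TABLE IF NOT EXISTS comments (path TEXT PRIMARY KEY, comment TEXT);'

path=$(readlink -f "$2")   # store absolute paths so lookups are unambiguous

case "$1" in
    set) sqlite3 "$DB" "INSERT OR REPLACE INTO comments (path, comment) VALUES ('$path', '$3');" ;;
    get) sqlite3 "$DB" "SELECT comment FROM comments WHERE path = '$path';" ;;
    del) sqlite3 "$DB" "DELETE FROM comments WHERE path = '$path';" ;;
    *)   echo "unknown action: $1" >&2; exit 1 ;;
esac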

awk/sed/grep command to compare the contents of three files

Hi, I am trying to automate some data entry. I am using a TCP server/client to send filenames around so that another server can go into a repository and pull those files. As part of testing this, I am logging the filenames that are supposed to be sent and the filenames that were actually received, and when a file is received I send a reply back with its filename.
So I have three text files with filenames inside them.
SupposedToSend.txt
Recieved.txt
GotReplyFor.txt
I know that awk could do what I am trying to do, but I am not sure how to set it up. I need to compare the three files for entries that do not exist in the other files; if an entry is missing from any file, I need to know which entry and which file it is missing from.
I could write a program for this, but it would take much longer to write and to run, since these files are getting 5 entries/minute dumped into them.
paste -d '\n' SupposedToSend.txt Recieved.txt GotReplyFor.txt | uniq -c | grep -v '^ *3 '
It's tolerable if you have no errors, but deeply suboptimal otherwise, or if the data in the different files is out of sequence (in which case you might need to sort them first).
Or you could just run diff3 to compare 3 files...
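If the lines can be sorted, comm makes the pairwise differences explicit. A hedged sketch, using sorted working copies so the live log files are left alone:
sort SupposedToSend.txt > sent.sorted
sort Recieved.txt       > recv.sorted
sort GotReplyFor.txt    > reply.sorted

# comm -23 prints the lines that appear only in the first file:
comm -23 sent.sorted recv.sorted     # supposed to be sent, but never received
comm -23 recv.sorted reply.sorted    # received, but no reply was recorded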

Download multiple files, with different final names

OK, what I need is fairly simple.
I want to download LOTS of different files (from a specific server) via cURL, and I want to save each one of them under a specific new filename on disk.
Is there an existing way (parameter, or whatever) to achieve that? How would you go about it?
(If there was an option to input all URL-filename pairs in a text file, one per line, and have cURL process it, that would be ideal.)
E.g.
http://www.somedomain.com/some-image-1.png --> new-image-1.png
http://www.somedomain.com/another-image.png --> new-image-2.png
...
OK, I just figured out a smart way to do it myself.
1) Create a text file with pairs of URL (what to download) and filename (how to save it to disk), separated by a comma (,), one pair per line, and save it as input.txt.
2) Use the following simple Bash script:
while IFS= read -r line; do
    IFS=',' read -ra PART <<< "$line"
    curl "${PART[0]}" -o "${PART[1]}"
done < input.txt
*Haven't thoroughly tested it yet, but I think it should work.
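For what it's worth, curl can also read URL/output pairs from a config file via -K/--config, which avoids the shell loop entirely. The filenames below are just the ones from the question:
# download.config: one url/output pair per download
url = "http://www.somedomain.com/some-image-1.png"
output = "new-image-1.png"

url = "http://www.somedomain.com/another-image.png"
output = "new-image-2.png"
Then run it with: curl --config download.config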
