Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 9 years ago.
I have a large dump of data from an Outlook email account that comes entirely in .msg files. A quick call to Ubuntu's file command revealed that they are Composite Document File V2 Documents (whatever that means). I would really like to be able to read these files as plaintext. Is that possible at all?
Update: It turned out that it wasn't really possible to do what I wanted, large-scale data mining on these kinds of files, which was a bummer. In case you face the same issue, I made a library to address it: https://github.com/Slater-Victoroff/msgReader
The documentation isn't great, but it's a pretty small library, so it should be self-explanatory.
I faced the same problem this morning. I didn't find any information on the file format but it was possible to extract the required information from the file using strings and grep:
strings -e l *.msg | grep pattern
The -e l option (that's a lowercase L) tells strings to look for 16-bit little-endian characters, i.e. UTF-16LE text.
This only works if you can grep the data you need out of the file (i.e. every line you need contains a known string or pattern).
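If strings isn't available, roughly the same idea can be sketched in Python. This is only an approximation of what strings -e l does, not a parser for the .msg format: it looks for runs of printable ASCII characters stored as UTF-16LE. The function name utf16le_strings and the min_len default are my own choices for illustration.

```python
import re
from typing import List


def utf16le_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII text that are stored as UTF-16LE."""
    # A printable ASCII byte (or tab) followed by a zero byte is one
    # UTF-16LE character; require at least min_len of them in a row.
    pattern = re.compile(rb"(?:[\x20-\x7e\t]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Tiny self-contained demo: binary noise around one UTF-16LE string.
    blob = b"\x01\x02" + "Subject: hello".encode("utf-16-le") + b"\xff\xff"
    for text in utf16le_strings(blob):
        if "Subject" in text:  # plays the role of `grep pattern`
            print(text)
```

Like the strings approach, this only finds text that happens to be stored as plain UTF-16LE; non-ASCII characters and compressed bodies are skipped.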
Closed 4 years ago.
I'm currently studying the Bash shell and have come across the command ls -F. I know its function is to append an indicator to each listed item to distinguish between file types: / is appended to directories and * to executable files. I checked the manual page for ls but couldn't find any information on the indicators =, >, @ and |.
Could someone tell me what they represent? It would be even better if you could tell me where to find this kind of information when I need it.
Try info ls, under "What information is listed":
‘-F’
‘--classify’
‘--indicator-style=classify’
Append a character to each file name indicating the file type.
Also, for regular files that are executable, append ‘*’. The file
type indicators are ‘/’ for directories, ‘@’ for symbolic links,
‘|’ for FIFOs, ‘=’ for sockets, ‘>’ for doors, and nothing for
regular files. Do not follow symbolic links listed on the command
line unless the ‘--dereference-command-line’ (‘-H’),
‘--dereference’ (‘-L’), or
‘--dereference-command-line-symlink-to-dir’ options are specified.
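To make the mapping concrete, here is a rough Python sketch of the same classification, based on the passage quoted above. It is not how ls is implemented; classify_suffix is a name I made up, and it uses os.lstat so that symbolic links are reported as links rather than followed.

```python
import os
import stat


def classify_suffix(path: str) -> str:
    """Return the indicator that ls -F would append for this path."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symbolic links
    if stat.S_ISDIR(mode):
        return "/"
    if stat.S_ISLNK(mode):
        return "@"
    if stat.S_ISFIFO(mode):
        return "|"
    if stat.S_ISSOCK(mode):
        return "="
    if stat.S_ISDOOR(mode):  # doors exist on Solaris; False elsewhere
        return ">"
    # Regular file with any execute bit set
    if stat.S_ISREG(mode) and mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        return "*"
    return ""


if __name__ == "__main__":
    for name in sorted(os.listdir(".")):
        print(name + classify_suffix(name))
```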
Closed 6 years ago.
I need to write a script to find out whether a given document is in .doc format.
I am using an Amazon Linux machine, and I tried the Linux file command.
For a given .doc file, file prints the following:
sample_file.doc: Composite Document File V2 Document, No summary info
I found that file reports the same type for Excel 2003 files (.xls).
Which file types (such as doc and xls) fall under Composite Document File V2 Document, and how can I check whether a given file is a .doc file on an Amazon Linux 2012 machine?
It is a Microsoft document format. I used the guide here to convert my files without issues.
Essentially, you can use the unoconv tool to convert the files to a friendlier format.
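For the detection half of the question, here is a heuristic sketch in Python rather than a real parser. It assumes the file is small enough to read into memory, checks the compound file signature, and then searches the raw bytes for well-known stream names, which the container stores as UTF-16LE. The names guess_cfb_kind and STREAM_HINTS are mine, and the hint list is not exhaustive.

```python
from typing import Optional

# Signature found at the start of every compound (OLE2) document.
CFB_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

# Well-known stream names per application (checked in this order).
STREAM_HINTS = {
    "doc": "WordDocument",
    "xls": "Workbook",
    "ppt": "PowerPoint Document",
    "msg": "__substg1.0_",
}


def guess_cfb_kind(data: bytes) -> Optional[str]:
    """Return 'doc', 'xls', 'ppt', 'msg', 'unknown', or None if not a compound file."""
    if not data.startswith(CFB_MAGIC):
        return None
    for kind, stream_name in STREAM_HINTS.items():
        # Heuristic: look for the stream name anywhere in the raw bytes.
        if stream_name.encode("utf-16-le") in data:
            return kind
    return "unknown"


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, guess_cfb_kind(fh.read()))
```

Because it only searches for names, a Word file with an embedded spreadsheet would still be reported as doc (the first hint that matches wins), and very old Excel files that use a different stream name come back as unknown. A proper check would walk the directory entries of the container.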
Closed 3 years ago.
I'm interested in saving a pcap that has network layer name resolution. While it works great within Wireshark, how can I save the capture with the resolved names intact? Having this information would be extremely helpful and would save me a lot of time. I understand from the documentation that it can't be saved within the pcap file (http://www.wireshark.org/docs/wsug_html_chunked/ChAdvNameResolutionSection.html#idp390072124), but is there an alternative way to do so? Does anyone have any solutions to this?
Thanks in advance!
I haven't tried it myself, but in theory the name resolution information can be stored in the pcap-ng file format, which has been Wireshark's default file format since version 1.8. The old pcap file format you cite can't hold it, but pcap-ng defines a dedicated block type (the Name Resolution Block) for IP address <-> name mappings.
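One way to check whether a saved capture actually carries that information is to walk the pcap-ng blocks and count the Name Resolution Blocks (block type 4). The following is a minimal Python sketch under a few assumptions: the file fits in memory, it contains a single section (so one byte order throughout), and count_name_resolution_blocks is just a name I picked. It does not decode the records inside the blocks.

```python
import struct

SHB_MAGIC = b"\x0a\x0d\x0d\x0a"   # Section Header Block type, same in either byte order
BYTE_ORDER_MAGIC = 0x1A2B3C4D     # stored at offset 8 of the Section Header Block
NRB_TYPE = 0x00000004             # Name Resolution Block


def count_name_resolution_blocks(data: bytes) -> int:
    """Count Name Resolution Blocks in a single-section pcap-ng capture."""
    if len(data) < 12 or data[:4] != SHB_MAGIC:
        raise ValueError("not a pcap-ng capture")
    # The byte-order magic tells us how the rest of the section is encoded.
    if struct.unpack_from("<I", data, 8)[0] == BYTE_ORDER_MAGIC:
        endian = "<"
    elif struct.unpack_from(">I", data, 8)[0] == BYTE_ORDER_MAGIC:
        endian = ">"
    else:
        raise ValueError("bad byte-order magic")

    count = 0
    offset = 0
    while offset + 8 <= len(data):
        # Every block starts with its type and its total length in bytes.
        block_type, total_len = struct.unpack_from(endian + "II", data, offset)
        if total_len < 12 or total_len % 4:
            break  # malformed block; stop rather than loop forever
        if block_type == NRB_TYPE:
            count += 1
        offset += total_len
    return count
```

A result of 0 would suggest the names were not written into the file, for example because it was saved in the old pcap format or without name resolution enabled.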
Closed 8 years ago.
I have an Excel file and set a password on it by 'bashing' at the keyboard.
I was basically testing it and pressed everything. That got me thinking: is there a way to crack or remove the password?
I'm not talking about protecting just a sheet; I mean the actual file.
When I try to open the xlsx file, a pop-up box asks for the password. Is there an easy way around it?
No, there isn't. An xlsx file (which is essentially a zip file) uses a far stronger encryption model than earlier Excel formats (e.g. xls). In short, the whole file is encrypted, as opposed to a password hash being embedded in an otherwise readable file.
Your only hope is to write a brute-force cracker that mimics the bashing behaviour you describe (e.g. it's unlikely that you used mixed case).
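As a rough idea of what "mimicking the bashing" could look like, here is a sketch of the candidate-generation side only, for recovering the password on your own file. The alphabet and length limits are assumptions you would tune to how you typed, and try_password is a hypothetical callback: you would have to supply something that actually attempts to decrypt the file, which is not shown here.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Optional


def candidates(alphabet: str, min_len: int, max_len: int) -> Iterator[str]:
    """Yield every string over alphabet, shortest first."""
    for length in range(min_len, max_len + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)


def brute_force(try_password: Callable[[str], bool],
                guesses: Iterable[str]) -> Optional[str]:
    """Return the first guess accepted by try_password, or None."""
    for guess in guesses:
        if try_password(guess):  # hypothetical: True if the file decrypts
            return guess
    return None


if __name__ == "__main__":
    # Demo against a stand-in check; a real run needs a real decrypt attempt.
    secret = "dfa"
    print(brute_force(lambda p: p == secret, candidates("asdf", 1, 3)))
```

The number of candidates grows as len(alphabet) ** length, so restricting the alphabet (say, lowercase keys from one part of the keyboard) matters far more than anything else.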
Closed 9 years ago.
If I run find (Ubuntu, specifically), can I expect it to give me the same order of results every time? (Assuming, of course, that the actual files haven't changed.)
In other words, if I run
$ find foo
and it gives me
bar.txt
foo.txt
can I expect that it will never give me
foo.txt
bar.txt
?
The answer is "probably" but you shouldn't rely on it because any number of things can affect it.
What order do you want the files in? Decide on that and then use a find command (perhaps piped into sort) which reproducibly gets the result you need.
The order of the files is determined by the fine details of the filesystem format and the filesystem driver. You can't rely on it. Depending on the filesystem and operating system, here are things that might change the order:
A file is created or removed in a traversed directory (even if none of the listed files changed).
The files are moved around (e.g. transferred to a different filesystem or restored from backup).
A defragmenter or filesystem check ran and decided to move things around.
If you want a reproducible order, sort the results. find … | sort will do nicely if none of the file names contain newlines.
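If the listing is consumed by a script anyway, the same "sort it yourself" idea can be sketched in Python. This is an illustration, not a drop-in replacement for find: sorted_find is a name I made up, and the order is by code point (similar to what LC_ALL=C sort would give), not locale-aware.

```python
import os
from typing import List


def sorted_find(root: str) -> List[str]:
    """List root and everything beneath it in a reproducible order."""
    results = [root]
    # os.walk yields entries in whatever order the filesystem returns them,
    # so the sort at the end is what makes the result stable.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            results.append(os.path.join(dirpath, name))
    return sorted(results)


if __name__ == "__main__":
    import sys
    for path in sorted_find(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

Since the paths are kept as list items rather than lines of text, file names containing newlines are not a problem here.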