I have a hard disk with 500,000 text files in various formats: .eml, .txt, .emlx, .word, etc.
I also have a text file with 1500 words, and I need to find all the files that contain any of those words. My own script is too slow for the job. Is there any forensic software for OS X that indexes the files and can perform a fast search?
It does not have to be free.
I am having trouble finding any myself.
I don't know of any tools specifically for OS X. But you can always use Boot Camp, VMware, or Oracle VirtualBox to install Windows and run dtSearch. It's not freeware, but it works flawlessly and VERY quickly. It's also super easy to use.
Other than that, you can also use ELK (Elasticsearch, Logstash, Kibana). It's not as easy to use as dtSearch and is kind of a hassle to set up, since it's not a single piece of software. But it's free, scalable, and very well documented. You can find loads of tutorials all over the internet.
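If you would rather keep a script, the slow part is often checking each of the 1500 words against each file in turn. A minimal sketch of the alternative, assuming the terms are single words, one per line, and the files are readable as text (binary formats such as Word documents would need to be converted first): tokenise each file once and test the tokens against a set. The function names here are made up for illustration.

```python
import os
import re

TOKEN = re.compile(r"\w+")

def load_words(path):
    """Return the search terms (one per line) as a lower-cased set."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        return {line.strip().lower() for line in fh if line.strip()}

def file_matches(path, words):
    """True if any token in the file is one of the search terms."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        for line in fh:
            # Set membership is a hash lookup, so the cost depends on the
            # size of the file, not on how many search terms there are.
            if not words.isdisjoint(TOKEN.findall(line.lower())):
                return True
    return False

def find_files(root, words):
    """Walk root and return the sorted paths of all matching files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if file_matches(full, words):
                    hits.append(full)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)
```

Usage would be something like `find_files("/Volumes/disk", load_words("words.txt"))`, where both paths are placeholders. This matches whole words only, case-insensitively.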
Hope it helps.
I'm trying to use TTFs for image rendering. I didn't have any on the Linux box where the application sits, so I took a shot in the dark: I SCPed the TTFs from my local machine to the server and pointed the application at them. I figured this wouldn't work, since my machine is Windows and the box is Linux, and indeed it didn't. My question is: are TTFs OS and OS architecture specific?
No. They are plain data files, and data files are not OS specific (although their use may be).
The one exception I can think of is that in the Bad Old Days, Apple's native file storage format on the Macintosh used two different disk objects: one for 'code' and one for 'data'. Without special software, only the 'code' parts could be seen on other computers, leading to a swift exorcism of this storage format when Apple realized the rest of the world had problems reading their files. Still, it's far from unusual to read messages from confused people who find that extracting an old Mac zip file results in lots of zero-byte files.
As for your problem: since it does not lie in the font file format (there is no reason TTF "cannot work" on your system), it should be either the software you are using (does it actually support TTF fonts?) or, and I consider this more likely, an error while transferring the files that left you with damaged fonts.
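One way to rule out a damaged transfer is to run the same check on both machines and compare the results. The sketch below is only an illustration (the function name is made up): it reads the first four bytes, which in a font file should be one of the known sfnt tags, and computes a SHA-256 digest that should be identical on the Windows machine and on the Linux box.

```python
import hashlib

# First four bytes of the container formats that carry TrueType/OpenType data.
SFNT_TAGS = {
    b"\x00\x01\x00\x00": "TrueType outlines",
    b"true": "TrueType (legacy Apple tag)",
    b"OTTO": "OpenType with CFF outlines",
    b"ttcf": "font collection",
}

def describe_font(path):
    """Return size, detected container kind, and SHA-256 digest of a font file."""
    with open(path, "rb") as fh:
        data = fh.read()
    return {
        "size": len(data),
        # None means the header is not a recognised sfnt tag.
        "kind": SFNT_TAGS.get(data[:4]),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

If the size or digest differs between the two copies, or `kind` comes back as `None`, the file on the server is not the file you started with. If everything matches, look at the software's font support instead.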
Yahoo's "Smush.it" service optimises image size.
I am looking for an alternative way to reach the same goal using a Linux application.
A lot of images need to be processed, and uploading them manually one by one does not seem like a good idea.
How can this be done on Linux?
For JPEGs I use JPEGmini. From what I tested, it gives the best results, keeping the same visible image quality while greatly reducing the size. They have a server version for Linux, which is not cheap and which I have never used.
There's also Mozilla's mozjpeg, which you can use directly from the terminal, but it also reduces image quality.
In some tests I did, mozjpeg gave slightly smaller files than JPEGmini, but with lower image quality.
If you need to reduce PNGs, you could try Trimage or some of the alternatives listed on the same link.
Smush.it's FAQ lists all the tools they are using in their service.
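To avoid handling the images one by one, a small script can walk a directory and call a command-line optimiser per file. The sketch below assumes `jpegtran` and `optipng` are installed; those two tools are my choice for the illustration, not something tested above, and the function names are made up. `jpegtran -optimize` is lossless, but `-copy none` strips metadata such as EXIF.

```python
import os
import subprocess

def build_command(path):
    """Return the optimiser command for a file, or None if it is not an image we handle."""
    ext = os.path.splitext(path)[1].lower()
    if ext in (".jpg", ".jpeg"):
        # Lossless re-encoding of the Huffman tables; metadata is dropped.
        return ["jpegtran", "-copy", "none", "-optimize",
                "-outfile", path + ".tmp", path]
    if ext == ".png":
        # optipng rewrites the file in place.
        return ["optipng", "-quiet", "-o2", path]
    return None

def optimise_tree(root):
    """Run the matching optimiser on every JPEG and PNG below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            cmd = build_command(path)
            if cmd is None:
                continue
            subprocess.run(cmd, check=True)
            if cmd[0] == "jpegtran":
                tmp = path + ".tmp"
                # Keep the new file only if it is actually smaller.
                if os.path.getsize(tmp) < os.path.getsize(path):
                    os.replace(tmp, path)
                else:
                    os.remove(tmp)
```

Try it on a copy of the images first, since the files are overwritten.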
I want to write a desktop program to print Microsoft Office files (doc, docx, xls and xlsx) on a Linux machine. But I don't know how to print them without corrupting the output.
Is there a way to print the file, or convert it to another format, so that it looks 100% the same as it does in Microsoft Office?
The LibreOffice API might be a good place to start, particularly the examples:
http://api.libreoffice.org/
I haven't used the API myself, but I have used OpenOffice/LibreOffice as an alternative to Word for quite a while.
However, you say '100%' the same as in Office? I wouldn't be confident of that. Depending on the document it's likely to be fine, but there are some things which don't seem to convert well. If you're working on Linux, you're not likely to have the same fonts installed as whichever Windows/Mac machine made the document.
If the documents you're processing all share the same or a similar layout/template, and you're able to test a few first, it should be fine. But if you're processing arbitrary Word documents, some may not convert completely without a bit of human input. It depends how much difference you can tolerate. If you want completely consistent printing across platforms, I guess that's what PDFs are for.
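If you go the PDF route, you may not need the API at all: LibreOffice can convert from the command line. The following is a rough sketch, assuming `soffice` is on the PATH and that printing goes through CUPS's `lp`; the function names and the printer argument are placeholders for illustration.

```python
import os
import subprocess

def convert_command(doc_path, out_dir):
    """Command that asks LibreOffice to write a PDF of doc_path into out_dir."""
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, doc_path]

def pdf_path_for(doc_path, out_dir):
    """LibreOffice keeps the base name and swaps the extension for .pdf."""
    base = os.path.splitext(os.path.basename(doc_path))[0]
    return os.path.join(out_dir, base + ".pdf")

def print_command(pdf_path, printer=None):
    """Command that sends the PDF to the default (or the named) CUPS printer."""
    cmd = ["lp"]
    if printer:
        cmd += ["-d", printer]
    return cmd + [pdf_path]

def convert_and_print(doc_path, out_dir, printer=None):
    """Convert one Office document to PDF, then print the PDF."""
    subprocess.run(convert_command(doc_path, out_dir), check=True)
    subprocess.run(print_command(pdf_path_for(doc_path, out_dir), printer),
                   check=True)
```

The conversion is still done by LibreOffice's own layout engine, so the caveats about fonts and fidelity apply to the PDF as well. Inspecting a few converted PDFs before printing in bulk is a cheap way to find out how far off you are.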
I'm a Linux and GNOME user, currently depending mainly on a notebook. Not surprisingly, I am not satisfied with the power situation, so I turned to the power-management tool available for my system (currently Linux Mint 11). It is a really simple GUI (gnome-power-preferences) with very few, very basic features, which I'd love to expand.
I do not intend to work on the low-level features of power management; the states that are currently available are enough (suspend, hibernate, shutdown, do-nothing, monitor-brightness, downspin-hd, etc.). What I really need is a better way to create conditions for setting those states. In the standard native tool the only conditions are time and lid closing, which is extremely limited.
So the question: what are my options for creating scripts, in any language (I'm willing to learn if I don't already know it), that give me wider control over power-management conditions? I was thinking of the following possible settings:
spin down disks immediately after lid closing and cut the connection after n seconds.
don't cut the connection n seconds after lid closing if bandwidth use is greater than x bps
provide more statistical tools based on programs in use, programs in the background, services, etc.
create, save and load profiles that would automatically set monitor brightness, sound volume, wireless power, resource limits, etc., e.g. 'college_ba.pp', 'default_ac.pp'...
adjust brightness based on the illumination in a webcam shot.
suspend or hibernate if webcam shots show no face for n seconds
etc
It may sound impossible and hard. I do not expect to have these things ready to use; as I said, I intend to put in as much manual effort as needed. I just want to avoid low-level work as much as possible by using existing libraries and tools, and I would like everyone to share information about any library, tool or project that comes to mind and deals with any subset of the things I've listed in this question.
This is something I have wanted for a long time, and only now have I realized that this community could help me widen my options. My English is horrible, I know; I learned it online. I'm familiar with C++, C, Python and, lately, bash scripts. Thanks.
Your next step is to learn D-Bus, since most of the tools, both user and system, communicate using it.
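To give a feel for it, here is a small sketch that asks UPower over the system bus whether the lid is closed. It assumes the `dbus-send` command-line tool is installed and that your UPower version exposes the `LidIsClosed` property; the function names are made up, and the reply parsing is based on the usual `variant boolean true` output, so check it against what your system actually prints.

```python
import subprocess

def lid_query_command():
    """dbus-send invocation that reads UPower's LidIsClosed property."""
    return [
        "dbus-send", "--system", "--print-reply",
        "--dest=org.freedesktop.UPower",
        "/org/freedesktop/UPower",
        "org.freedesktop.DBus.Properties.Get",
        "string:org.freedesktop.UPower",
        "string:LidIsClosed",
    ]

def parse_boolean_reply(text):
    """Return True/False from a reply line ending in 'boolean true|false', else None."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[-2] == "boolean":
            return parts[-1] == "true"
    return None

def lid_is_closed():
    """Query the bus once; returns None if the reply could not be parsed."""
    result = subprocess.run(lid_query_command(), capture_output=True,
                            text=True, check=True)
    return parse_boolean_reply(result.stdout)
```

Polling like this is the crude version. A script that subscribes to D-Bus signals, through one of the Python D-Bus bindings, can react to lid or battery events without polling, and the same approach extends to the other conditions on your list.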
Recently, I began developing a driver for an embedded device running Linux.
Until now I have only read about Linux internals.
Having no prior experience in driver development, I am finding it a tad difficult to take my first step.
I have downloaded the kernel source-code (v2.6.32).
I have read (skimmed) Linux Device Drivers (3e).
I read a few related posts here on StackOverflow.
I understand that Linux has a "monolithic" approach.
I have built the kernel (including an existing driver via menuconfig, etc.).
I know the basics of Kconfig and Makefile files, so that should not be a problem.
Can someone describe the structure (i.e. the inter-links) of the various directories in the kernel source code?
In other words, given a source-code file, which other files would it refer to for related code?
(The "#include"s provide a partial idea.)
Could someone please help me get a better idea?
Any help will be greatly appreciated.
Thank you.
Given a C file, you have to look at the functions it calls and data structures it uses, rather than worrying about particular files.
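As a rough first pass, a few lines of script can list what a C file pulls in and which function-like names it uses. This is only a regex-based sketch with made-up names: it also picks up the file's own function definitions and function-like macros, and it can be fooled by comments and strings.

```python
import re

# Matches both #include <...> and #include "..." forms.
INCLUDE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)
# Any identifier directly followed by an opening parenthesis.
CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")
# C keywords that look like calls but are not.
KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}

def summarise_c_source(text):
    """Return (included headers in order, sorted function-like names) for C source text."""
    includes = INCLUDE.findall(text)
    calls = sorted(set(CALL.findall(text)) - KEYWORDS)
    return includes, calls
```

Each name in the second list is something to look up in the kernel tree, and the headers in the first list are usually where the matching declarations and data structures live.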
There are two basic routes to developing your own device driver:
Take a driver that is similar to yours; strip out the code that isn't applicable to your device, and fill in new code for your device.
Start with the very basic pieces of a device driver, and add pieces a little at a time until your device begins to function.
The files that compose your driver will make more sense as you complete this process. Do consider what belongs in each file, but to some extent, dividing a driver among files is more an art than a science. Smaller drivers often fit into just one or two files.
A bit of design may also be good. Consider what your device does, and what your driver will need to do. Based on that, you should be able to map out what functions a device driver will need to have.
I also believe Linux Device Drivers, Third Edition may help you get on your way to driver development.
Linux source files include other files based on what they do, which layer they are in, and which layer of the call stack they access. The Big Picture truly informs how each file is related to the next.
I had to fix a kernel driver once. My biggest tip (if you use vim) is to set it up with ctags so you can jump around the kernel source with ctrl-] every time you see a function you don't understand.