Limiting Kismet log files to a size or duration - Linux

Looking for a solid way to limit the size of Kismet's database files (*.kismet) through the conf files located in /etc/kismet/. The version of Kismet I'm currently using is 2021-08-R1.
The end state would be to limit the file size (10 MB, for example), or to have the database closed after X minutes of logging. Then a new database is created, connected, and written to. This process would continue until Kismet is killed. This way, rather than having one large database, there will be multiple smaller ones.
In the kismet_logging.conf file there are some timeout options, but those are for expunging old entries from the logs. I want to preserve everything that's being captured, but break the logs into segments as the capture process runs.
I'd appreciate anyone's input on how to do this either through configuration settings (some that perhaps don't exist natively in the conf files by default?) or through plugins, or anything else. Thanks in advance!

Two interesting ways:
One could let the old entries be taken out, but reach in with SQL and extract what you want as a time-bound query.
A second way would be to automate the restarting of Kismet... which is a little less elegant... but seems to work (see the sketch after the article link below).
https://magazine.odroid.com/article/home-assistant-tracking-people-with-wi-fi-using-kismet/
If you read that article carefully, there are lots of bits of interesting information in there.
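For the restart approach, here is a minimal sketch in Python. It assumes kismet is on the PATH and your capture source is already configured (for example in kismet_site.conf), so that each restart produces a fresh timestamped .kismet database in the log directory; the interval and command are placeholders to adjust for your setup.

    #!/usr/bin/env python3
    """Rotate Kismet logs by restarting Kismet every N minutes.

    Sketch of the restart-based rotation described above. Assumptions:
    kismet is on PATH, the capture source is configured elsewhere
    (e.g. kismet_site.conf), and each run writes a new timestamped
    .kismet database. INTERVAL_MINUTES and KISMET_CMD are placeholders.
    """
    import signal
    import subprocess
    import time

    INTERVAL_MINUTES = 30      # how much time each .kismet file should cover
    KISMET_CMD = ["kismet"]    # append source/title flags for your setup

    def run_once(seconds):
        proc = subprocess.Popen(KISMET_CMD)
        try:
            time.sleep(seconds)
        finally:
            proc.send_signal(signal.SIGTERM)   # let Kismet close the database cleanly
            try:
                proc.wait(timeout=60)
            except subprocess.TimeoutExpired:
                proc.kill()

    if __name__ == "__main__":
        while True:            # run until this wrapper itself is stopped
            run_once(INTERVAL_MINUTES * 60)

Since each .kismet file is an SQLite database, the first approach (time-bound extraction with SQL) can then be done afterwards with the sqlite3 tool against whichever segment covers the window you care about.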

Related

Best practices for ArangoDB compaction on-demand for file space reclamation

Part of my evaluation of ArangoDB involves importing a few CSV files of over 1M rows into a staging area, then deleting the resulting collections or databases. I will need to do this repeatedly for the production processes I envision.
I understand that the ArangoDB service invokes compaction periodically per this page:
https://docs.arangodb.com/3.3/Manual/Administration/Configuration/Compaction.html
After deleting a database, I waited over 24 hours and no disk space has been reclaimed, so I'm not sure this automated process is working.
I'd like answers to these questions:
What are the default values for the automatic compaction parameters shown in the link above?
Other than observing a change in file space, how do I know that a compaction worked? Is there a log file or other place that would indicate this?
How can I execute a compaction on-demand? All the references I found that discussed such a feature indicated that it was not possible, but they were from several years ago and I'm hoping this feature has been added.
Thanks!
The GET route /_api/collection/{collection-name}/figures contains a sub-attribute compactionStatus in the figures attribute, with the time and message of the last compaction for debugging purposes. There is also some other information in the response that you might be interested in. See whether doCompact is set to true at all.
https://docs.arangodb.com/3.3/HTTP/Collection/Getting.html#return-statistics-for-a-collection
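As a rough sketch (not using the official driver), the check could look like this. The server URL, the _system database, the collection name "staging", and the credentials are assumptions to replace with your own:

    import requests

    # Query the figures endpoint and print the compaction status.
    # Assumed: ArangoDB on localhost:8529, the _system database, a
    # collection called "staging", and root credentials -- adjust all of these.
    BASE = "http://localhost:8529/_db/_system/_api/collection"
    AUTH = ("root", "")

    resp = requests.get(f"{BASE}/staging/figures", auth=AUTH)
    resp.raise_for_status()
    body = resp.json()

    print("doCompact:", body.get("doCompact"))
    print("compactionStatus:", body.get("figures", {}).get("compactionStatus"))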
You can run arangod --help-compaction to see the startup options for compaction, including the default values. This information is also available online in the 3.4 docs:
https://docs.arangodb.com/3.4/Manual/Programs/Arangod/Options.html#compaction-options
The PUT route /_api/collection/{collection-name}/rotate, quoting the documentation directly:
Rotates the journal of a collection. The current journal of the
collection will be closed and made a read-only datafile. The purpose
of the rotate method is to make the data in the file available for
compaction (compaction is only performed for read-only datafiles, and
not for journals)
Saving new data in the collection subsequently will create a new journal file automatically if there is no current journal.
https://docs.arangodb.com/3.3/HTTP/Collection/Modifying.html#rotate-journal-of-a-collection
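And a matching sketch for triggering the rotation over HTTP, with the same assumed server, database, collection name, and credentials as above:

    import requests

    # Rotate the collection's journal so it becomes a read-only datafile
    # that the compactor is allowed to process. Same assumptions as above.
    BASE = "http://localhost:8529/_db/_system/_api/collection"
    AUTH = ("root", "")

    resp = requests.put(f"{BASE}/staging/rotate", auth=AUTH)
    # Expect {"result": true, ...}; an error usually means there is no current journal.
    print(resp.status_code, resp.json())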

dfs.FSNameSystem.BlockCapacity getting reduced eventually

I have a small application that I am running on an 'EMR' cluster with 3 nodes. I have a few gigabytes of CSV data split across multiple files. The application reads the CSV files and converts them into '.orc' files. I have a small program that sequentially and synchronously sends a limited number (fewer than ten) of files as input to the application.
My problem is that after some time the cluster eventually goes down without leaving any trace (or maybe I am looking in the wrong places). After trying various things, I observed in 'ganglia' that dfs.FSNameSystem.BlockCapacity is steadily decreasing.
Is this because of the application or is it the server configuration? Can someone please share if you have any experience with this?

ArangoDB journal logfiles

What are the logfiles in
"arango_instance_database/journals/logfile-xxxxxx.db"
for?
Can I delete them?
How can I reduce their size?
I set
database.maximal-journal-size = 1048576
but those files are still 32M large.
Can I set some directory for them, like
/var/log/...
?
You're referencing the write-ahead logfiles (WAL), which are, at least temporarily, the files your data is kept in.
So it's a very bad idea to remove them on your own, as long as you still like your data to be intact.
The files are used so documents can be written to disk in a continuous fashion. Once the system is idle, the aggregator job will pick the documents from them and move them over into your database files.
You can find interesting documentation of situations where others didn't choose such an architectural approach and wrote data directly into their data files on disk, and of what that then does to your system.
Once all documents in a WAL file have been moved into database files, ArangoDB will remove the allocated space.
Thank you a lot for the reply :-)
So in the case of ArangoDB deployed as a "single instance" I can set:
--wal.suppress-shape-information true
--wal.historic-logfiles 0
Anything else?
How about
--wal.logfile-size
What are the best/common practices for determining its size?
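One way to see which WAL settings the server is actually running with is to read them back over HTTP. A small sketch follows; it assumes ArangoDB with the MMFiles engine on localhost:8529 exposing the /_admin/wal/properties endpoint, and root credentials:

    import requests

    # Read back the WAL configuration the server is actually running with.
    # Assumed: ArangoDB (MMFiles engine) on localhost:8529 exposing
    # /_admin/wal/properties, and root credentials -- adjust as needed.
    AUTH = ("root", "")
    url = "http://localhost:8529/_admin/wal/properties"

    props = requests.get(url, auth=AUTH).json()
    print("logfileSize:", props.get("logfileSize"))            # bytes per WAL file
    print("historicLogfiles:", props.get("historicLogfiles"))  # kept after collection
    print("reserveLogfiles:", props.get("reserveLogfiles"))    # pre-allocated spares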

copy command row size limit in cassandra

Could anyone tell me the maximum size (number of rows or file size) of a CSV file we can load efficiently into Cassandra using the COPY command? Is there a limit? If so, is it a good idea to break the file down into multiple files and load them, or do we have a better option? Many thanks.
I've run into this issue before... At least for me there was no clear statement in any DataStax or Apache documentation of the max size. Basically, it may just be limited by your pc/server/cluster resources (e.g. CPU and memory).
However, in an article by jgong found here it is stated that you can import up to 10MB. For me it was something around 8.5MB. In the docs for Cassandra 1.2 here it's stated that you can import a few million rows and that you should use the bulk loader for anything heavier.
All in all, I do suggest importing via multiple CSV files (just don't make them so small that you're opening/closing files constantly), so that you can keep a handle on the data being imported and find errors more easily. It can happen that after waiting an hour for a file to load it fails and you have to start over, whereas with multiple files you don't need to start over on the ones that have already been imported successfully. Not to mention key duplicate errors.
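A sketch of that split-and-load approach is below. The input filename, the keyspace/table (demo.readings), and the chunk size are all assumptions to change for your cluster; each chunk keeps the header row so the COPY ... WITH HEADER = TRUE option works.

    #!/usr/bin/env python3
    """Split a big CSV into fixed-size chunks and COPY each one with cqlsh.

    Sketch of the "multiple smaller files" suggestion above. SOURCE,
    COPY_TARGET, and ROWS_PER_CHUNK are placeholders for your own data.
    """
    import csv
    import subprocess

    SOURCE = "big_input.csv"       # hypothetical input file
    ROWS_PER_CHUNK = 500_000
    COPY_TARGET = "demo.readings"  # hypothetical keyspace.table

    def write_chunk(header, rows, index):
        name = f"chunk_{index:04d}.csv"
        with open(name, "w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)     # repeat the header in every chunk
            writer.writerows(rows)
        return name

    def load_chunk(name):
        stmt = f"COPY {COPY_TARGET} FROM '{name}' WITH HEADER = TRUE;"
        subprocess.run(["cqlsh", "-e", stmt], check=True)

    with open(SOURCE, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        rows, index = [], 0
        for row in reader:
            rows.append(row)
            if len(rows) >= ROWS_PER_CHUNK:
                load_chunk(write_chunk(header, rows, index))
                rows, index = [], index + 1
        if rows:                        # final partial chunk
            load_chunk(write_chunk(header, rows, index))

Because each chunk is loaded with check=True, a failure stops the loop at that chunk, and the chunks that already loaded don't need to be re-imported.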
Check out CASSANDRA-9303 and CASSANDRA-9302,
and check out Brian's cassandra-loader:
https://github.com/brianmhess/cassandra-loader

How do we get around the Lotus Notes 60 GB database barrier

Are there ways to get around the upper database size limit on Notes databases? We are compacting a database that is still approaching 60 gigs in size. Thank you very much if you can offer a suggestion.
Even if you could find a way to get over the 64GB limit it would not be the recommended solution. Splitting up the application into multiple databases is far better if you wish to improve performance and retain the stability of your Domino server. If you think you have to have everything in the same database in order to be able to search, please look up domain search and multi-database search in the Domino Administrator help.
Maybe some parts of the data is "old" and could be put into one or more archive databases instead?
Maybe you have a lot of large attachments and can store them in a series of attachment databases?
Maybe you have a lot of complicated views that can be streamlined or eliminated and thereby save a lot of space and keep everything in the same database for the time being? (Remove sorting on columns where it is not needed; using "click on column header to sort" is a sure way to increase the size of the view index.)
I'm assuming your database is large because of file attachments as well. In that case look into DAOS - it will store all file attachments on filesystem (server functionality - transparent to clients and existing applications).
As a bonus it finds duplicates and stores them only once.
More here: http://www.ibm.com/developerworks/lotus/library/domino-green/
Just a stab in the dark:
Use the DB2 storage method instead of standard storage on the Domino server?
I'm guessing that 80-90% of that space is taken up by file attachments. My suggestion is to move all the attachments to a file share, provided everyone can access that share, or to an FTP server that everyone can connect to.
It's not ideal because security becomes an issue - now you need to manage credentials to the Notes database AND to the external file share - however it'll be worth the effort from a Notes administrator's perspective.
In the Notes documents, just provide a link to the file. If users are adding these files via a Notes form, perhaps you can add some background code to extract the file from the document after it has been saved, and replace it with a link to that file.
The 64GB is not actually an absolute limit; you can go above that. I've seen 80GB and even close to 100GB, although once you're past 64GB you can get problems at any time. The limit is not actually Notes, it's the underlying file system; I've seen this on AS400. But the great thing about Notes is that if you do get a huge crash you can still access all the documents and pull everything out to new copies using scheduled agents, even if you can no longer get views to open in the client.
Your best bet is regular archiving. If it is file attachments, then anything over two years old doesn't need to be in the main system, just a brief synopsis and a link. You could even have a 5-year archive, a 2-year archive, a 1-year archive, etc. Data will continue to accumulate and has to be managed, irrespective of what platform you use to store it.
If the issue really is large file attachments, I would certainly recommend looking into implementing DAOS on your server / database. It is only available with Domino Server 8.5 and later. On the other hand, if your database contains over 100,000 documents, you may want to look seriously at dividing the data into multiple NSFs - at that number of documents, you need to be very careful about your view design, your lookup code, etc.
Some documented successes with DAOS:
http://www.edbrill.com/ebrill/edbrill.nsf/dx/yet-another-daos-success-story-from-darren-duke?opendocument&comments
If your database is getting to 60GB, don't use a Domino solution; you need to switch to a relational database. You need to archive or move documents across several databases. Although you can get to 60GB, you shouldn't do it. The performance hit for active databases is significant; it is not so much of a problem for static databases.
I would also look at removing any unnecessary views and their indexes. View indexes can occupy 80-90% of your disk space. If you can't remove them, simplify their sorting arrangements/formulas and remove any unnecessary column sorting options. I halved a 50GB database down to 25GB with a few simple changes like this, and virtually no users noticed.
One path could be, for once, to start with the user. Do all the users need to access all that data all the time? If no, it's time to split or archive. If yes, there is probably a flaw in the design of the application.
Technically, I would add to the previous comments a suggestion to check the many options for compaction. Quick and dirty: discard all view indices, but be sure to rebuild at least the one for the default view if you don't want your users to riot. See updall.
One more thing to check: make sure you have checked
[x] Use LZ1 compression for attachments
in db properties.
