How to manage multiline events based on a random field - Logstash

I've been facing a problem related to multiline events lately, and I need a little help with it. My syslog server is sending multi-line events: a single event gathers several lines, and the indicator that ties a particular line to a multi-line event is a random number that identifies a user connection session. Here is a custom-generated log file:
Feb 16 17:29:04 slot1/APM-LTM notice apd[5515]: 01490010:5: 1ec2b273:Username 'cjones'
Feb 16 17:29:04 slot1/APM-LTM warning apd[5515]: 01490106:4: 1ec2b273: AD module: authentication with 'cjones' failed: Preauthentication failed, principal name: cjones@GEEKO.COM. Invalid user credentials. (-1765328360)
Feb 16 17:10:04 slot1/APM-LTM notice apd[5515]: 01490010:5: d8b5a591: Username 'gbridget'
Feb 16 17:10:04 slot1/APM-LTM err apd[5515]: 01490107:3: d8b5a591: AD module: authentication with 'gbridget' failed: Clients credentials have been revoked, principal name: gbridget@GEEKO.COM. User account is locked (-1765328366)
Feb 16 17:29:04 slot1/APM-LTM notice apd[5515]: 01490005:5: 1ec2b273: Following rule 'fallback' from item 'AD Auth' to ending 'Deny'
Feb 16 17:29:04 slot1/APM-LTM notice apd[5515]: 01490102:5: 1ec2b273: Access policy result: Logon_Deny
Above are the lines related to two different connections, identified by the following user sessions: d8b5a591 (user gbridget) and 1ec2b273 (user cjones). The user sessions are the only indicators that tie those lines to two different events, and the lines of the two events are intertwined.
The problem is that I am at a loss as to how to explain this to a grok filter combined with the multiline plugin, knowing that the latter offers too few options. The notions of "previous" and "next" line cannot be applied here, for instance, so the multiline options "pattern" and "what" cannot be used, since the lines of one event are not necessarily consecutive.
I would really appreciate it if someone could shed some light on this and tell me whether it is at least feasible or not.

I don't see those as multi-line events, but as related events. I would load them into Elasticsearch as six separate documents and then query as needed. If you have specific queries that you're trying to perform against this data, you might ask questions about how to perform them against multiple documents.
One alternative would be to use the session id as the document id; then you could update the initial document when new information came in. Supplying your own document IDs isn't recommended (for performance reasons, IIRC), and updating a document involves deleting the old one and inserting a new one, which is also not good for performance.

Related

How to set a `User cap` for a particular domain in Gitlab

Original question:
I want to limit the number of users from a particular domain that can register into my Gitlab instance. I noticed that I could set a "user cap", but it wasn't specific to a domain.
For example:
I want to limit the number of users registered from these domains. 20 users from testdomain1.com and 30 users from testdomain2.com are allowed to sign up. So, if there are already 20 users successfully registered from testdomain1.com, a new user from testdomain1.com will not be allowed to sign up.
What should I do for it?
Edited 2021-11-18:
I added a validation to the User model:
# gitlab/app/models/user.rb
class User < ApplicationRecord
  # ...
  validate :ensure_user_email_count
  # ...

  def email_domain
    # Everything from the '@' to the end of the address, e.g. "@test.com"
    /@.*$/.match(email)[0]
  end

  def ensure_user_email_count
    # select count(*) from users where email like '%@test.com';
    if User.where("email LIKE ?", "%#{email_domain}").count >= 30
      errors.add(email_domain, _('already has 30 registered emails.'))
    end
  end
end
This validation sets a user cap of 30 for every domain, but it still isn't able to set a different "user cap" for a particular domain.
Since the related issue has not received any response yet, I'm trying to implement it myself. It seems that I need to extend the UI of the Admin Settings page and add some related tables to the database in order to set a different "user cap" for each email domain.
The GitLab user cap seems to be per GitLab instance.
So if both your domains reference the same GitLab instance, only one user cap is possible.
But if each of your domains redirects to its own autonomous GitLab instance, then you should be able to set a user cap per domain.
The OP Ann Lin has created the issue 345557 to follow that feature request.
The OP reports:
A particular table is needed to store the caps.
But I don’t have enough time now to modify the UI so I found a simple way to do this:
The "Allowed domains for sign-ups" setting, which is called domain_allowlist in the database, is a text column:
gitlabhq_production=# \d application_settings
...
domain_allowlist | text | | |
...
gitlabhq_production=# select domain_allowlist from application_settings;
domain_allowlist
-------------------
--- +
- testdomain1.com+
- testdomain2.com+
(1 row)
I can modify testdomain1.com to testdomain1.com#30 to store the user cap and use a regex to get the number 30.
I will modify the UI and add the database table later, and I'll create a pull request on GitLab when I'm done.

JHipster auditing: wrong modified_by name

I want to save 10k users from Active Directory and I am using auditing, but I have a bug.
The modified_by column is not filled correctly.
The log output is:
2016-04-04 14:49:27,353 DEBUG [ForkJoinPool.commonPool-worker-3] StateServiceImpl: Request to save userAD, 77879
2016-04-04 14:49:27,354 DEBUG [http-nio-80-exec-6] StateServiceImpl: Request to save userAD, 96459
As you can see, the save randomly runs on different threads. When it runs on ForkJoinPool.commonPool-worker-X, the modified_by column is filled with the name "system"; when it is called from http-nio-80-exec-X, it is filled with the name of the user who is logged in.
Thanks.
Please provide your .yo-rc.json file and JHipster version.
Also indicate how you load your 10k users: through the REST API or something else?
Did you write/modify this code or is it the code generated by JHipster unmodified?
SecurityContextHolder probably uses a ThreadLocal variable to store the current user, so if you start a new thread the security context is not copied to the execution thread, which results in the auditor falling back to the default user: system.

Errbit keeps spamming emails

I'm using Errbit 0-3 stable and it's working really well.
The problem is that sometimes it starts spamming me with emails for the same error but with different hashes, like the following:
Mongo::Error::NoServerAvailable: No server is available matching preference: #<Mongo::ServerSelector::Primary:0x007fdba42891f0 @tag_sets=[], @options={:database=>"db_test", :max_pool_size=>200, :wait_queue_timeout=>5, :write=>{"w"=>0}}, @server_selection_timeout=30>
Mongo::Error::NoServerAvailable: No server is available matching preference: #<Mongo::ServerSelector::Primary:0x007fdbb8148e30 @tag_sets=[], @options={:database=>"db_test", :max_pool_size=>200, :wait_queue_timeout=>5, :write=>{"w"=>0}}, @server_selection_timeout=30>
How can I filter them so that they are grouped into one error only?
There are two ways to deal with this.
Option 1) Catch the errors in your application and scrub the uniqueness out of the error messages before sending them to Errbit (see the sketch at the end of this answer).
Option 2) Errbit supports configurable "fingerprinting" so you can actually tell Errbit what attributes contribute to the uniqueness of error notifications. This can be done system-wide or on individual Errbit apps. In your case, you could toggle off the error message as part of the Error fingerprint.
From the Errbit README:
The way Errbit arranges notices into error groups is configurable. By
default, Errbit uses the notice's error class, error message, complete
backtrace, component (or controller), action and environment name to
generate a unique fingerprint for every notice. Notices with identical
fingerprints appear in the UI as different occurrences of the same
error and notices with differing fingerprints are displayed as
separate errors.
Changing the fingerprinter (under the 'config' menu) applies to all
apps and the change affects only notices that arrive after the change.
If you want to refingerprint old notices, you can run rake
errbit:notice_refingerprint.

Modeling a time-based application in Node.js

I'm developing an auction-style web app, where products are available for a certain period of time.
I would like to know how you would model that.
So far, what I've done is store products in the DB:
{
  ...
  id: p001,
  name: Product 1,
  expire_date: 'Mon Oct 7 2013 01:23:45 UTC',
  ...
}
Whenever a client requests that product, I test *current_date < expire_date*.
If true, I show the product data and, client side, a countdown timer. If the timer reaches 0, I disable the related controls.
But, server side, there are some operations that need to be done even if nobody has requested that product, for example notifying the owner that his product's auction has ended.
I could scan the whole collection of products on each request, but that seems cumbersome to me.
I thought of triggering a routine with cron every n minutes, but I would like to know if you can think of any better solutions.
Thank you!
Some thoughts:
Index the expire_date field. You'll want to if you're scanning for auction items older than a certain date.
Consider adding a second field that is expired (or active) so you can also do other types of non-date searches (as you can always, and should anyway, reject auctions that have expired).
Assuming you add a second field, active for example, you can further limit the scans to only those auction items that are active and beyond the expiration date. Consider a compound index for those cases. (Over time you'll have more and more expired items that you don't need to scan through, for example.)
Yes, you should add a timed task using your favorite technique to scan for expired auctions. There are lots of ways to do this -- your infrastructure will help determine what makes sense.
Keep a local cache of current auction items in memory if possible to make scanning as efficient as possible. There's no reason to hit the database if nothing is expiring.
Again, always check when retrieving from the database to confirm that items are still active, as there could easily be race conditions where items expire while being retrieved for display.
You'll probably want to store the state of status e-mails, etc. in the database so that server restarts are handled properly.
It might be something like:
{
  ...
  id: "p001",
  name: "Product 1",
  expire_date: ISODate("2013-10-07T01:23:45Z"),
  active: true,
  ...
}
// console
db.auctions.ensureIndex({expire_date: -1, active: 1})
// javascript idea:
var theExpirationDate = new Date(2013, 10, 6, 0, 0, 0); // JS months are 0-based, so this is November 6, 2013
db.auctions.find({ expire_date : { "$lte" : theExpirationDate }, active: true })
Scanning the entire collection on each request sounds like a huge waste of processing time.
I would use something like pm2 both to keep track of your main server process and to run periodic tasks with its built-in cron-like functionality.

Strict control over the statement_timeout variable in PostgreSQL

Does anybody know how to limit a user's ability to set variables, specifically statement_timeout?
Regardless of whether I alter the user to have this variable set to a minute, or set it to a minute in the postgresql.conf file, a user can always just type SET statement_timeout TO 0; to disable the timeout completely for that session.
Does anybody know a way to stop this? I know some variables can only be changed by a superuser but I cannot figure out if there is a way to force this to be one of those controlled variables. Alternatively, is there a way to revoke SET from their role?
In my application, this variable is used to keep random users (user registration is open to the public) from using up all the CPU time with (near-)infinite queries. If they can disable it, then I must find a new methodology for limiting the resources available to users. If there is no way to secure this variable, are there other ways of achieving the same goal that you could suggest?
Edit 2011-03-02
The reason the database is open to the public and arbitrary SQL is allowed is that this project is a game played directly in the database. Every player is a database user. Data is locked down behind views, rules, and triggers; CREATE is revoked from PUBLIC and the player role to prevent most alterations to the schema; and SELECT on pg_proc is revoked to protect game-sensitive function code.
This is not some mission critical system I have opened up to the world. It is a weird proof of concept that puts an abnormal amount of trust in the database in an attempt to maintain the entire CIA security triangle within it.
Thanks for your help,
Abstrct
There is no way to override this. If you allow the user to run arbitrary SQL commands, changing the statement_timeout is just the tip of the iceberg anyway... If you don't trust your users, you shouldn't let them run arbitrary SQL - or accept that they can run, well, arbitrary SQL, and have some sort of external monitor that cancels the queries.
Basically you can't do this in plain postgres.
In the meantime, to accomplish your goal you may use some kind of proxy and rewrite/forbid some queries.
There are several solutions for that, e.g.:
db-query-proxy - an article about how it was born (in Russian).
PgBouncer + pgbouncer-rr-patch
The latter contains very useful examples, and it is very simple to do in Python:
import re

def rewrite_query(username, query):
    # Known report queries (as regexes) that should be redirected to
    # pre-aggregated tables.
    q1 = r"SELECT storename, SUM\(total\) FROM sales JOIN store USING \(storeid\) GROUP BY storename ORDER BY storename"
    q2 = r"SELECT prodname, SUM\(total\) FROM sales JOIN product USING \(productid\) GROUP BY prodname ORDER BY prodname"
    if re.match(q1, query):
        new_query = "SELECT storename, SUM(total) FROM store_sales GROUP BY storename ORDER BY storename;"
    elif re.match(q2, query):
        new_query = "SELECT prodname, SUM(total) FROM product_sales GROUP BY prodname ORDER BY prodname;"
    else:
        new_query = query
    return new_query
