Speed up scapy execution - packet sniffing

I'm developing an application that monitors some data in real time.
The application collects data from the network, parses the relevant packets of my protocol and stores them in a database.
When I start the application everything seems to be OK, but lags start to appear a few seconds later.
Checking my database, it seems that some data is not saved while other data is stored (I'm using a packet player to inject packets on my PC; verifying with Wireshark, all the data is there). The data is stored in several tables, and all of the tables have the same issue, which makes me suspect scapy.
Checking the Wireshark statistics, I see about 200 packets per second.
Is there some way to improve its performance?
I'm using the command sniff(iface="Working", filter="port 52000", prn=my_parsing_func, store=False)
PS - I'm using Windows 10, Python 3.7.4
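For reference, a minimal sketch of the kind of pipeline described above, with hypothetical names (parse_my_protocol and store_to_db are placeholders, not taken from the question). Keeping the prn callback cheap and pushing parsing/database writes onto a worker thread is one commonly suggested way to keep sniff() from falling behind, though whether it helps in this particular setup is only an assumption:

    import queue
    import threading

    from scapy.all import sniff

    packet_queue = queue.Queue()

    def parse_my_protocol(pkt):
        # Placeholder for the protocol parsing described in the question.
        return bytes(pkt)

    def store_to_db(record):
        # Placeholder for the database insert described in the question.
        print(len(record))

    def my_parsing_func(pkt):
        # Keep the sniff callback as light as possible: just hand the packet over.
        packet_queue.put(pkt)

    def db_worker():
        while True:
            record = parse_my_protocol(packet_queue.get())
            store_to_db(record)

    threading.Thread(target=db_worker, daemon=True).start()

    # Same call as in the question, with store=False so sniffed packets are not kept in memory.
    sniff(iface="Working", filter="port 52000", prn=my_parsing_func, store=False)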

Related

OpenNMS threshold checks only one server

So I'm trying to configure OpenNMS to check the disk space on my Linux servers.
After some work I got it to check one server through SNMP:
I installed snmpd on the server I'm monitoring, defined a threshold (in fact I use the predefined default one) and connected it to an event that triggers when ns-dskPercent goes too high. Up until here, all went well.
Now I added a second server and installed the same stuff on it. OpenNMS seems to monitor the SNMP daemon and notifies me when the service is down, but it doesn't seem to see the threshold.
When I make changes to the threshold - for example, lowering it to 20% in order to force it to trigger - only the first server sees the change (and also gives a notification that the configuration has changed) and fires the alarm, but the second server doesn't respond.
(These are the notifications I get for the first server:)
High threshold rearmed for SNMP datasource ns-dskPercent on interface
xxx.xxx.xxx.xxx, parms: label="/" ds="ns-dskPercent" description="ns-dskPercent"
value="NaN (the threshold definition has been changed)" instance="1"
instanceLabel="_root_fs" resourceId="node[9].dskIndex[_root_fs]"
threshold="20.0" trigger="1" rearm="75.0" reason="Configuration has been changed"
High threshold exceeded for SNMP datasource ns-dskPercent on interface
xxx.xxx.xxx.xxx, parms: label="/" ds="ns-dskPercent" description="ns-dskPercent"
value="52" instance="1" instanceLabel="_root_fs"
resourceId="node[9].dskIndex[_root_fs]" threshold="20.0" trigger="1" rearm="75.0"
Any ideas why, or how I can make the second server respond as well?
The issue could be based on the source of the data collected. Thresholding in modern versions of OpenNMS (14+) is evaluated inline and in memory as data is collected, so you must ensure that the threshold is evaluated against the exact metrics that the node you are interested in exposes.
File system metrics on Linux systems usually come in one of two forms: MIB-2 host resources table metrics (hrStorageSize, etc., in $OPENNMS_HOME/etc/datacollection/mib2.xml) or net-snmp metrics from the NET-SNMP MIB (ns-dskTotal, etc., in $OPENNMS_HOME/etc/datacollection/netsnmp.xml).
So, first verify that you are getting good data from the new server and that it is, indeed, collecting metrics from the same MIB table that you seek to threshold against.
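As a quick way to check that, here is a small sketch (assuming the net-snmp command-line tools are installed on the OpenNMS host; the host address and community string are placeholders) that walks both candidate tables on the new node. If the table your threshold is defined against comes back empty, the threshold will never be evaluated for that node:

    import subprocess

    HOST = "192.0.2.10"        # placeholder: the second server's address
    COMMUNITY = "public"       # placeholder: your SNMP community string

    TABLES = {
        "HOST-RESOURCES hrStorageTable (mib2.xml)": "1.3.6.1.2.1.25.2.3",
        "net-snmp dskTable (netsnmp.xml)": "1.3.6.1.4.1.2021.9",
    }

    for name, oid in TABLES.items():
        print("==", name, "==")
        # Walk the table the same way the collector would; no rows means the agent
        # is not exposing it, and thresholds against it will never fire.
        result = subprocess.run(
            ["snmpwalk", "-v2c", "-c", COMMUNITY, HOST, oid],
            capture_output=True, text=True,
        )
        print(result.stdout or "(no rows returned)")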

Using Fleck Websocket for 10k simultaneous connections

I'm implementing a WebSocket Secure (wss://) service for an online game where all users will be connected to the service as long as they are playing the game. This will use a high number of simultaneous connections, although the traffic won't be a big problem, as the service is used for chat, storage and notifications, not for real-time data synchronization.
I wanted to use Alchemy-Websockets, but it doesn't support TLS (wss://), so I have to look for another library such as Fleck (or something else).
Alchemy has been tested with a high number of simultaneous connections, but I didn't find similar tests for Fleck, so I need to get some real information from users of Fleck.
I know that Fleck is non-blocking and uses async calls, but I need some real-world information, because it might be abusing threads, the garbage collector, or some other aspect that wouldn't be visible at lower numbers of connections.
I will use C# for the client as well, so I don't need hybi-XX compatibility or fallbacks; I just need scalability and TLS support.
I finally added Mono support to WebSocketListener.
Check here how to run WebSocketListener in Mono.
10K connections is no small thing. WebSocketListener is asynchronous and it scales well. I have done tests with 10K connections and it should be fine.
My tests show that WebSocketListener is almost as fast and scalable as the Microsoft one, and it performs better than Fleck, Alchemy and others.
I ran a test on a Windows machine with a Core 2 Duo E8400 processor and 4 GB of RAM.
The results were not encouraging: it started delaying handshakes after it reached ~1000 connections, i.e. it would take about one minute to accept a new connection.
These results improved when I used XSockets, which reached 8000 simultaneous connections before the same thing happened.
I tried to test on a Linux VPS with Mono, but I don't have enough experience with Linux administration, and a few system settings related to TCP, etc. needed to be changed in order to allow a high number of concurrent connections, so I could only reach ~1000 with the default settings; after that the app crashed (in both the Fleck test and the XSockets test).
On the other hand, I tested node.js, and it seemed simpler to manage a very high number of connections, as node didn't crash when it reached the limits of TCP.
All the tests were echo tests: the server sends the received message back to the client who sent it and to one random other connected client, and each connected client sends a random ~30-character text message to the server at a random interval between 0 and 30 seconds.
I know my tests are not generic enough, and I encourage anyone to run their own tests instead, but I just wanted to share my experience.
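For anyone wanting to reproduce a similar (simplified) echo test, here is a rough sketch of a load-test client using Python's websockets library; the endpoint, client count and the plain echo (without the extra random recipient) are assumptions, not the exact harness used above:

    import asyncio
    import random
    import string
    import time

    import websockets  # pip install websockets

    URI = "wss://example.com:8443"   # placeholder endpoint
    NUM_CLIENTS = 100                # ramp this up towards your target

    async def client(client_id):
        async with websockets.connect(URI) as ws:
            while True:
                # Random ~30-character message at a random 0-30 s interval, as in the test above.
                await asyncio.sleep(random.uniform(0, 30))
                msg = "".join(random.choices(string.ascii_letters, k=30))
                t0 = time.perf_counter()
                await ws.send(msg)
                await ws.recv()   # wait for the echo
                print(f"client {client_id}: round-trip {(time.perf_counter() - t0) * 1000:.1f} ms")

    async def main():
        await asyncio.gather(*(client(i) for i in range(NUM_CLIENTS)))

    asyncio.run(main())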
When we decided to try Fleck, we implemented a wrapper for the Fleck server and a JavaScript client API so that we could send acknowledgment messages back to the server. We wanted to test the performance of the server: message delivery time, percentage of lost messages, etc. The results were pretty impressive for us, and we are currently using Fleck in our production environment.
We have 4000 - 5000 concurrent connections during peak hours. On average, 40 messages are sent per second. The acknowledged message ratio (acknowledged messages / total sent messages) never drops below 0.994. The average round-trip time for messages is around 150 milliseconds (the duration between the server sending a message and receiving its ack). Finally, we have not had any memory-related problems with the Fleck server even under heavy usage.

JMeter TCP Raspberry Pi cache

I have written a simple server in Qt, which responds to TCP requests with a short string (a few bytes); the request and response are constant sets of data. I compiled it on a Raspberry Pi (Arch Linux), then ran it and connected it to my LAN. On my laptop I ran JMeter with a TCP Sampler.
After 5 minutes of responding to 15 threads, the server stays at a constant 80 ms response time. Then, after 8 minutes, the response time starts to fall:
time - avg response time
5 mins - 80 ms
8 mins - 72 ms
10 mins - 44 ms
12 mins and more - 20 ms
And it stays at about 20 ms. Why is that happening? Is there some caching mechanism, or are some random conditions just changing? I can't run the tests again, and I have no idea where the data being sent could possibly be cached.
How many hits per second are you getting?
The TCP sampler will NOT cache the results.
Maybe something in the OS is doing some caching?
The answer may be too late, but I would like to share my experience from a similar situation when I was debugging a performance issue: use a network sniffer to capture the packets, then look at a few sessions to check the response time. That way you can prove or disprove whether the response time reported by the traffic generator is correct and go from there. Good luck.
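As an illustration of that approach, a minimal sketch that reads a capture with scapy and pairs each request with the first response on the same client port; the capture file name and server port are placeholders:

    from scapy.all import rdpcap, IP, TCP

    PCAP = "capture.pcap"    # placeholder: capture taken during the JMeter run
    SERVER_PORT = 1234       # placeholder: the port the Qt server listens on

    pending = {}             # (client ip, client port) -> time of first request payload
    deltas = []

    for pkt in rdpcap(PCAP):
        if IP not in pkt or TCP not in pkt or len(pkt[TCP].payload) == 0:
            continue
        if pkt[TCP].dport == SERVER_PORT:                 # request from JMeter
            pending.setdefault((pkt[IP].src, pkt[TCP].sport), pkt.time)
        elif pkt[TCP].sport == SERVER_PORT:               # response from the Pi
            key = (pkt[IP].dst, pkt[TCP].dport)
            if key in pending:
                deltas.append(float(pkt.time - pending.pop(key)))

    if deltas:
        print(f"{len(deltas)} request/response pairs, average {sum(deltas) / len(deltas) * 1000:.1f} ms")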

Measure performance - components of a round-trip MDX query

I would like to decompose the performance of a round-trip MDX query from a client to Analysis Services and back. In particular, I'm looking to identify/distinguish individual queries and record the time each query takes for:
the XMLA over HTTP message from client to IIS
the XMLA over TCP/IP message from the Data Pump to Analysis Services
the response from Analysis Services to the Data Pump
the response from IIS to the client
I am open to other data-points that would be beneficial to identify bottlenecks in the lifecycle of a query.
My company has tested a mix of software, including periodic SSAS DMV data collection, PerfMon, Flight Recorder, Splunk and SQL Sentry. We are having trouble tying it all together.
One of the main problems you have is that there are probably overlaps in time: msmdpump in IIS can start sending the first bytes to the AS server as soon as the first few bytes of the XMLA from the HTTP request are available, and vice versa, it probably starts sending the response as soon as the first few bytes of the response from the AS server are available.
Actually, the communication between msmdpump and the AS server is a binary version of the XML that is sent between msmdpump and the client, and hence easy to translate without needing information from later in the message. See http://sqlblog.com/blogs/mosha/archive/2005/12/02/analysis-services-2005-protocol-xmla-over-tcp-ip.aspx for some details about the protocol.
To track the times, my approach would be a low-level one: I would run Wireshark (http://www.wireshark.org/) on the computer running IIS and filter to only the HTTP frames between the client and IIS and the frames between the IIS computer and the AS server. The contents of the frames would be more or less irrelevant, but you could see the timestamps of the first and last packets of a request, giving you a rough estimate of the durations of the different communications. Staying on one computer for all network traffic logging also avoids the need to have the clocks of all computers exactly synchronized.
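A rough sketch of the same idea with scapy, for a capture taken on the IIS machine: group data-carrying packets into the two legs by port and report the first and last timestamps of each. Port 80 for the client leg and 2383 for the Analysis Services leg are assumptions; adjust them to your setup, and for a real analysis you would split this further per request:

    from scapy.all import rdpcap, IP, TCP

    PCAP = "iis_capture.pcap"   # placeholder: capture taken on the IIS machine
    CLIENT_PORT = 80            # client <-> IIS (msmdpump) leg; adjust for HTTPS
    AS_PORT = 2383              # IIS <-> Analysis Services leg (default instance port)

    legs = {"client <-> IIS": [], "IIS <-> AS": []}

    for pkt in rdpcap(PCAP):
        if IP not in pkt or TCP not in pkt or len(pkt[TCP].payload) == 0:
            continue
        ports = {pkt[TCP].sport, pkt[TCP].dport}
        if CLIENT_PORT in ports:
            legs["client <-> IIS"].append(float(pkt.time))
        elif AS_PORT in ports:
            legs["IIS <-> AS"].append(float(pkt.time))

    for name, times in legs.items():
        if times:
            print(f"{name}: first {min(times):.6f}, last {max(times):.6f}, span {max(times) - min(times):.3f} s")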

Can bittorrent peers handle seeding large numbers of idle torrents

I'm considering using BitTorrent for a large data dissemination problem where the data source is petascale and users will want up to several terabytes. Some details:
Number of torrents potentially in the millions
Torrent sizes ranging from 100 MB to 100 GB
A stable set of clusters around the world capable of acting as seeders, each holding a large subset of the total torrents (say 60% on average)
A relatively small number of simultaneous users (fewer than 100) wanting to download, on average, a few terabytes of data
I expect the number of active torrents to be small compared to the total available, but quality of service is important, so there must be several seeders for each torrent or some mechanism for launching new seeders.
My question is: can BitTorrent clients handle seeding huge numbers of torrents, most of which are idle? Would I need to stripe torrents across the seeders in a cluster, or could each node seed all the torrents it has access to? Which client would do the best job? Are there any tools for managing clusters of seeders?
I am assuming that trackers can be made to scale to this level.
There are two main problems:
Each torrent (typically) needs to announce to a tracker periodically; this might end up using a significant amount of bandwidth.
The BitTorrent client itself needs to be written in a way that scales to a large number of torrents.
As for the tracker traffic: let's assume you have 1 million torrents. The typical re-announce interval is 30 minutes, but some trackers have it set to 1 hour. Let's be conservative and assume your tracker uses 1-hour announce intervals. You will have to make 1 million GET requests per hour; let's say each request is 400 bytes up and 100 bytes down (assuming most responses will not contain any peers). That's about 111 kB/s up and 28 kB/s down, constantly. That's not so bad, but keep in mind that TCP requires an extra round-trip to establish each connection, so that's another 40 bytes down and 40 bytes up per announce.
This can be mitigated by using only UDP trackers. Then you would only need a single connect message, and you can reuse the connection ID for each announce. Each announce message would then be 100 bytes, and the returned message would be a bit more compact as well, let's assume 60 bytes. That would get you 28 kB/s up and 16 kB/s down, just to keep the torrents announced. For this you would need a client with decent UDP tracker support (one that caches the connection ID, for instance).
Not too bad, assuming that's insignificant compared to the actual data your seeds would send.
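The arithmetic above is easy to re-check; a tiny sketch using the same assumed message sizes and a one-hour announce interval:

    TORRENTS = 1_000_000
    INTERVAL_S = 3600   # one announce per torrent per hour

    def rate_kb_per_s(bytes_per_announce):
        return TORRENTS * bytes_per_announce / INTERVAL_S / 1000

    # HTTP tracker: ~400 bytes up / ~100 bytes down per announce
    print(f"HTTP: {rate_kb_per_s(400):.1f} kB/s up, {rate_kb_per_s(100):.1f} kB/s down")
    # UDP tracker: ~100 bytes up / ~60 bytes down per announce (connection ID reused)
    print(f"UDP:  {rate_kb_per_s(100):.1f} kB/s up, {rate_kb_per_s(60):.1f} kB/s down")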
However, you don't necessarily need to stripe your torrents across separate data centers; you could also use an HTTP server to seed the torrents. All major BitTorrent clients support HTTP seeding, and you wouldn't have to worry about announcing to the tracker (the URL is burned into the .torrent itself).
As for a client that scales well to many torrents, I don't know for sure; I haven't done any measurements. It should be fairly straightforward to just generate a million random torrents and try to load them up.
I have done some optimization work in libtorrent-rasterbar to make it scale well with many torrents; I haven't tried millions, though.
I've written a blog post on this topic, here.
You may be looking for Hekate
It's, at best, in pre-alpha right now, but it's quite nearly what you're describing.
To avoid collapsing under the overhead of useless tracker announces and scrapes numbering in the millions (and that in every announce interval), you have to restrict your seeding clusters to loading only the current working set of items that are being requested right now. Downloaders need to get (download) the .torrent file from a central place anyway, and that could trigger loading it into the seeding clusters. Alternatively, determine activity for a particular info-hash by recognizing announces that do NOT originate from one of your seed clusters.
rTorrent has fast resume (meaning no hashing happens when an appropriately prepared .torrent is loaded) and is controllable via XML-RPC, so you can decommission idle items. That way, a .torrent download can trigger the actual data to be available for the next 24 hours, or for as long as there's activity in the swarm.
The protocol allows for this, but I do not know which clients would scale to millions of torrents. In the worst case, you would have to write your own seed-only client.
The protocol feature most relevant to your use case is that, when a peer connects to another, the connecting peer is supposed to send the torrent's info-hash first. This means that a single listening TCP port can be used to seed an unlimited number of torrents, with almost zero resources used when idle.
This can be found in The BitTorrent Protocol Specification:
If both sides don't send the same value, they sever the connection. The one possible exception is if a downloader wants to do multiple downloads over a single port, they may wait for incoming connections to give a download hash first, and respond with the same one if it's in their list.
I also found the same in this Bittorrent Protocol Specification v1.0:
The initiator of a connection is expected to transmit their handshake immediately. The recipient may wait for the initiator's handshake, if it is capable of serving multiple torrents simultaneously (torrents are uniquely identified by their info_hash).
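To make that concrete, here is a minimal sketch of the receiving side of the handshake (the BitTorrent handshake is 68 bytes: a 19-byte protocol string prefixed by its length, 8 reserved bytes, the 20-byte info_hash and the 20-byte peer ID). The peer-ID value and the served_torrents lookup are placeholders, and a real client would also loop on recv until all 68 bytes have arrived:

    import socket

    served_torrents = {}                  # placeholder: info_hash (20 raw bytes) -> torrent state
    MY_PEER_ID = b"-XX0001-abcdefghijkl"  # placeholder 20-byte peer ID

    def handle_incoming_peer(conn: socket.socket) -> None:
        handshake = conn.recv(68)         # <19><"BitTorrent protocol"><8 reserved><info_hash><peer_id>
        if len(handshake) < 68 or handshake[0] != 19:
            conn.close()
            return
        info_hash = handshake[28:48]      # bytes 28-47 carry the torrent's info_hash
        if info_hash not in served_torrents:
            conn.close()                  # not a torrent we seed
            return
        # Reply with our own handshake for the same info_hash, then continue the peer wire protocol.
        conn.sendall(bytes([19]) + b"BitTorrent protocol" + bytes(8) + info_hash + MY_PEER_ID)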
However, there is one thing that would increase your load, and it is the tracker. With the normal tracker protocol, each client has to periodically announce to the tracker each torrent it has, together with information like how much it has uploaded. With millions of torrents, this would present a somewhat high load. If you were writing your own mass-seed-only client, a separate protocol to announce your seeders to the tracker would be a good idea.
