Python3 scapy/kamene extremely slow

I was trying to use Pcap.net for some PCAP file analysis, which took around five seconds to loop through all available packets in a 1GB pcap file.
I'm now trying to use Scapy on Python3, which for whatever reason is called Kamene, but it's taking literally forever to parse the file, and CPU activity hits 100%, so I'm clearly doing something wrong. Here's the code:
from kamene.all import *
packetCount = 0
with PcapReader("C:\\Testing\\pcap\\maccdc2012_00000.pcap") as reader:
    for packet in reader:
        packetCount += 1
print(packetCount)
When running that, I get:
WARNING: No route found for IPv6 destination :: (no default route?).
This affects only IPv6
<UNIVERSAL><class 'kamene.asn1.asn1.ASN1_Class_metaclass'>
That UNIVERSAL message just gets repeated over and over, and after letting it run for five minutes, I gave up. Does anyone have any idea what is going on? Am I being dumb?
I've tried this both on Ubuntu and within Visual Studio on Windows (both virtualised).

First of all, you’re not using Scapy :/
From https://scapy.net:
An independent fork of Scapy was created from v2.2.0 in 2015, aimed at supporting only Python3 (scapy3k). The fork diverged, did not follow evolutions and fixes, and has had its own life without contributions back to Scapy. Unfortunately, it has been packaged as python3-scapy in some distributions, and as scapy-python3 on PyPI, leading to confusion amongst users. It should not be the case anymore soon. Scapy supports Python3 in addition to Python2 since 2.4.0. Scapy v2.4.0 should be favored as the official Scapy code base. The fork has been renamed as kamene.
Uninstalling kamene and running pip install scapy or pip3 install scapy (or getting it from GitHub) might help.
Once you've done that, you will find tips on how to speed up Scapy starting from 2.4.4 in the Performance section of the docs.
That being said, Scapy isn't designed to handle very large amounts of data (it is aimed at being easy to use rather than fast), so it will probably take some time to process 1 GB anyway :/ Also, Python is slower than languages like C at tasks such as packet dissection; you will probably never match Wireshark's speed in Python.
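For what it's worth, here is the same packet count with the real Scapy, as a minimal sketch (the path is the one from the question; using RawPcapReader is one common speed-up, since it skips dissection entirely when all you need is a count):

from scapy.all import PcapReader
from scapy.utils import RawPcapReader

# Full dissection: every packet is parsed into Scapy layers (slow).
packetCount = 0
with PcapReader("C:\\Testing\\pcap\\maccdc2012_00000.pcap") as reader:
    for packet in reader:
        packetCount += 1
print(packetCount)

# Raw mode: yields (bytes, metadata) tuples without dissecting anything,
# which is typically far faster for simple counting.
rawCount = 0
for pkt_data, pkt_metadata in RawPcapReader("C:\\Testing\\pcap\\maccdc2012_00000.pcap"):
    rawCount += 1
print(rawCount)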

Related

pandarallel package on windows infinite loop bug

So this is not really a question but rather a bug report for the pandarallel package. This is the end of my code:
...
print('Calculate costs NEG...')
for i, group in tqdm(df_mol_neg.groupby('DELIVERY_DATE')):
    srl_slice = df_srl.loc[df_srl['DATE'] == i]
    srl_slice['srl_soll'] = srl_slice['srl_soll'].copy() * -1
    df_aep_neg.loc[df_aep_neg['DATE'] == i, 'SRL_cost'] = srl_slice['srl_soll'].parallel_apply(lambda x: get_cost_of_nearest_mol(group, x)).sum()
What happens here is that instead of running the parallel_apply function, it loops back to the start of my code and repeats it all again. The exact same code works fine on my remote Linux machine, so I have two possible error sources:
Since pandarallel itself already has some difficulties with the Windows OS, it might just be a Windows problem.
The other thing is that I currently use the early-access version of PyCharm (223.7401.13) with the debugger, which might also be a problem source.
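A third possible source I have not ruled out (just a hunch): on Windows, multiprocessing uses the spawn start method, so every worker process re-imports the main module. If the script body is not protected by an if __name__ == '__main__': guard, each worker re-runs the script from the top, which would look exactly like this loop-back. A minimal sketch of the guarded layout:

from pandarallel import pandarallel

def main():
    pandarallel.initialize()
    # ... the groupby / parallel_apply loop from the snippet above ...

if __name__ == '__main__':
    # On Windows, multiprocessing spawns fresh interpreters that re-import
    # this module; the guard keeps workers from re-running the whole script.
    main()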
Other than this bug, I can highly recommend the pandarallel package (at least for Linux users). It's super easy to use, and if you have some cores it can really shave off some time; in my case it shaved off a cool 90%.
(Also, if there is a better way to report bugs, please let me know.)

How to get back raw binary output from python fabric remote command?

I am using Python 3.8.10 and fabric 2.7.0.
I have a Connection to a remote host. I am executing a command such as follows:
resObj = connection.run("cat /usr/bin/binaryFile")
So in theory the bytes of /usr/bin/binaryFile are getting pumped into stdout, but I cannot figure out what wizardry is required to get them out of resObj.stdout and written into a local file with a matching checksum (as in, get all the bytes out of stdout). For starters, len(resObj.stdout) != binaryFile.size. Visually glancing at resObj.stdout and comparing it to /usr/bin/binaryFile via hexdump or similar makes them look roughly similar, but something is going wrong.
May the record show, I am aware that this particular example would be better accomplished with...
connection.get('/usr/bin/binaryFile')
The point though is that I'd like to be able to get arbitrary binary data out of stdout.
Any help would be greatly appreciated!!!
I eventually gave up on doing this with the fabric library and reverted to straight-up paramiko. People give paramiko a hard time for being "too low level", but the truth is that it offers a higher-level API which is pretty intuitive to use. (The root cause, as far as I can tell, is that fabric's run() decodes the output streams to text, which mangles arbitrary binary data.) I ended up with something like this:
from paramiko import SSHClient, AutoAddPolicy

with SSHClient() as client:
    client.set_missing_host_key_policy(AutoAddPolicy())
    client.connect(hostname, **connectKwargs)
    stdin, stdout, stderr = client.exec_command("cat /usr/bin/binaryFile")
In this setup, I can get the raw bytes via stdout.read() (or similarly, stderr.read()).
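To round that out, a minimal sketch of landing those bytes on disk and checksumming them (the local filename and the choice of md5 are mine, purely for illustration):

import hashlib

# stdout from the exec_command call above; .read() returns raw bytes,
# with no text decoding to mangle them.
data = stdout.read()
with open("binaryFile.local", "wb") as f:
    f.write(data)

# Compare against `md5sum /usr/bin/binaryFile` on the remote host.
print(hashlib.md5(data).hexdigest())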
To do other things that fabric exposes, like put and get, it is easy enough to do:
# client from above
with client.open_sftp() as sftpClient:
    sftpClient.put(...)
    sftpClient.get(...)
I was also able to get the exit code, per this SO answer, by doing:
# stdout from above
stdout.channel.recv_exit_status()
The docs for recv_exit_status (https://docs.paramiko.org/en/latest/api/channel.html#paramiko.channel.Channel.recv_exit_status) list a few gotchas that are worth being aware of too.
The moral of the story for me is that fabric ends up feeling like an over-abstraction, while paramiko has an easy-to-use higher-level API as well as the low-level primitives when appropriate.

Python too many subprocesses?

I'm trying to start a lot of Python processes on a single machine.
Here is a code snippet:
import subprocess

fout = open(path, 'w')
p = subprocess.Popen((python_path, module_name), stdout=fout, bufsize=-1)
After about 100 processes I'm getting an error (shown in the PPS below).
I'm running on Windows 10 64-bit with Python 3.5. I already tried splitting the start across two scripts, as well as adding sleep calls, but after a certain number of processes the error shows up. Any idea what that might be? Thanks a lot for any hint!
PS:
Some background: each process opens database connections and makes some requests using the requests package. Then some calculations are done using numpy, scipy, etc.
PPS: I just discovered this error message (raised when calling scipy):
DLL load failed: The paging file is too small for this operation to complete
The issue was solved by reinstalling numpy and scipy and installing MKL.
The strange thing about this error was that it only appeared after a certain number of processes. I would love to hear if anybody knows why this happened!
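In case it helps anyone else hitting the paging-file error with many subprocesses: each process that loads numpy/scipy (and their MKL DLLs) reserves its own chunk of commit memory, so the total grows with every process you start, and capping how many run at once is a simple mitigation. A rough sketch of such a cap (python_path as above; worker.py, the output filenames, and the cap of 50 are all made-up values):

import subprocess

python_path = "python"  # illustrative
jobs = [("out%d.txt" % i, "worker.py") for i in range(200)]  # hypothetical workload
MAX_CONCURRENT = 50  # cap on simultaneous processes

running = []
for path, module_name in jobs:
    fout = open(path, 'w')
    running.append(subprocess.Popen((python_path, module_name), stdout=fout, bufsize=-1))
    if len(running) >= MAX_CONCURRENT:
        running.pop(0).wait()  # block until the oldest process finishes

for p in running:
    p.wait()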

Building a custom small-sized tcpdump executable on the order of 100 to 300 KB

I need to build a small tcpdump for the embedded project I am working on. Since the memory of my embedded device is limited, I need to strip all the unwanted functionality from tcpdump while building it. My target is to make the tcpdump executable smaller than 300 KB. After running strip on the tcpdump binary and disabling package options in configure, I have reached 750 KB. To go further, I want to remove all the protocol-decoding capability of tcpdump; I want tcpdump to have no more than hex-dump capability. Below is an initial list of unwanted protocol decoders to remove:
print-802_11.c
print-802_15_4.c
print-ah.c
print-ahcp.c
print-aodv.c
print-aoe.c
print-ap1394.c
print-atalk.c
print-atm.c
print-babel.c
print-bootp.c
print-bt.c
print-calm-fast.c
print-carp.c
print-cdp.c
print-cfm.c
print-chdlc.c
print-cip.c
print-cnfp.c
print-dccp.c
print-decnet.c
print-dtp.c
print-dvmrp.c
print-eap.c
print-egp.c
print-eigrp.c
print-enc.c
print-esp.c
print-fddi.c
print-forces.c
print-ipx.c
print-isakmp.c
print-isoclns.c
print-juniper.c
print-krb.c
print-lane.c
print-m3ua.c
print-sip.c
print-sl.c
print-sll.c
print-sunatm.c
print-zephyr.c
print-usb.c
print-vjc.c
print-vqp.c
print-timed.c
print-tipc.c
print-token.c
I started to remove these from Makefile.in and to remove the function calls manually in the source code, but then I realized this approach is not scalable.
Is there a better way to do this? Perhaps via configure options?
I am new to this, so please explain.
Is there a better way to do this? Perhaps via configure options?
No, there are no such configure options. You'll have to do it the non-scalable way.
"I want to remove all the protocol decoding capability of tcpdump. I
want the tcpdump to have no more that hex dump capability. [...] Is
there a better way to do this ?"
I think there is, but with a very different approach.
If all you want from tcpdump is:
the capability of specifying an interface,
putting this interface in promiscuous mode or not (or monitor mode if it's a Wi-Fi interface),
applying a capture filter,
and then spitting the output to a file or as hex to stdout,
...you'd be better off writing your own from scratch, using libpcap (which is what tcpdump uses, BTW).
This should be no more than 100-400 lines of C code depending on the options you want; you'll have a very, very small executable, and no more dependencies than tcpdump, which requires libpcap anyway. All the complexity is in the dissection; once you remove all that, what you have is basically... a pcap loop.
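Just to make the shape of that loop concrete, here is the same open/filter/loop skeleton sketched in Python with Scapy (purely illustrative; the interface name and filter are made up, and the real thing would be the C equivalent using pcap_open_live, pcap_compile/pcap_setfilter, and pcap_loop from the tutorial below):

from scapy.all import sniff, hexdump

# Open an interface, apply a BPF capture filter, and hex-dump each packet.
# This is essentially the whole program once dissection is out of the picture.
# (Capturing requires root/administrator privileges.)
sniff(iface="eth0", filter="tcp port 80", prn=hexdump, store=False)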
It's not that hard, and it looks to me like far less work than your approach (and also more interesting work).
There's a tutorial to start with (a 30-60 minute read):
http://www.tcpdump.org/pcap.html
...at the end of this tutorial, you'll already have the core of your program.
And you can find loads of info (and ask questions) in the related SO tags:
https://stackoverflow.com/tags/libpcap/info
https://stackoverflow.com/tags/pcap/info
...and there are about 70 well-written man pages documenting the full pcap API (you'll end up using maybe 10-20 of these).

CentOS: use SNMP to show interface usage

I have an SNMP monitoring box and want to monitor interface utilisation on a clustered database server. I'm trying to work out the correct OID to monitor - I just need SNMP to return the total interface throughput at a given time.
The SNMP box is already configured and will graph it correctly. All the howtos I can find talk about setting up Cacti or MRTG, which is all well and good, but what I need seems simpler, yet I can't seem to find what I'm looking for. The SNMP box is already configured with the correct community name etc., so this should be a really easy one in theory.
Any help very gratefully received
Thanks
When you say "interface utilisation", I assume you mean Ethernet interface utilization. If that assumption is correct, there are a couple OIDs to investigate:
1.3.6.1.2.1.2.2.1.10 - ifInOctets returns the total number of octets received on the interface, including framing characters.
1.3.6.1.2.1.2.2.1.16 - ifOutOctets returns the total number of octets transmitted out of the interface, including framing characters.
1.3.6.1.2.1.31.1.1.1.6 - ifHCInOctets returns the total number of octets received on the interface, including framing characters (this is the 64-bit version of ifInOctets).
1.3.6.1.2.1.31.1.1.1.10 - ifHCOutOctets returns the total number of octets transmitted out of the interface, including framing characters (this is the 64-bit version of ifOutOctets).
Each OID is part of a table and will have an associated index that links it to an interface description (e.g., eth0 or br1).
These OIDs provide a count of octets received and transmitted, so they require a little massaging to get the utilization rates you desire. In the past, when I've monitored these OIDs, I've queried for two values a few seconds apart and then calculated the rate:
(QueryResult2 - QueryResult1) / (SecondsElapsed)
I would guess that Cacti (which I assume you're using, since you tagged your question with it) has some way to calculate rates from SNMP values; however, I've never used it, so I am not positive.
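For illustration, a rough sketch of that polling arithmetic in Python, shelling out to the net-snmp snmpget tool (the hostname "dbserver", community "public", and interface index 2 are all made-up values; this also ignores 32-bit counter wrap, which is one reason the ifHC* OIDs exist):

import subprocess
import time

OID = "1.3.6.1.2.1.2.2.1.10.2"  # ifInOctets for interface index 2 (illustrative)

def query_octets():
    # snmpget output looks like: IF-MIB::ifInOctets.2 = Counter32: 123456789
    out = subprocess.check_output(
        ["snmpget", "-v2c", "-c", "public", "dbserver", OID], text=True)
    return int(out.rsplit(":", 1)[1])

first = query_octets()
time.sleep(10)
second = query_octets()
print("bytes/sec in:", (second - first) / 10)  # (QueryResult2 - QueryResult1) / SecondsElapsed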
One other important note is that the default snmpd.conf included with CentOS may not have these OIDs enabled. If you run snmpwalk on 1.3.6.1.2.1.2 and 1.3.6.1.2.1.31 and receive empty results, edit /etc/snmp/snmpd.conf to configure the SNMP daemon to respond to those OIDs. I can't remember the exact syntax, but I think adding a line like
view all included .1
will enable all available OIDs on the server.
http://namhuy.net/908/how-to-install-iftop-bandwidth-monitoring-tool-in-rhel-centos-fedora.html
Requirements:
libpcap: provides user-level network packet capture and statistics.
libncurses: an API library that enables programmers to write text-based interfaces in a terminal.
gcc: GNU Compiler Collection (GCC) is a compiler system produced by the GNU Project supporting various programming languages.
Install libpcap, libncurses, and gcc via yum:
yum -y install libpcap libpcap-devel ncurses ncurses-devel gcc
Download and install iftop:
wget http://www.ex-parrot.com/pdw/iftop/download/iftop-0.17.tar.gz
tar -zxf iftop-0.17.tar.gz
cd iftop-0.17
./configure
make
make install
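Once installed, a typical invocation (interface name illustrative) is:
iftop -i eth0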
