When tracing shared library functions with funclatency, no function names are displayed

When I use the bcc tool funclatency, the function names are reported as unknown.
It would be helpful if I could track the entry and return values of a number of functions contained in the ibverbs (InfiniBand) library.
I use funclatency to print a latency histogram of the ibverbs functions called by perftest.
https://github.com/iovisor/bcc/tree/master/tools
To send packets between two nodes, I use perftest.
https://github.com/linux-rdma/perftest
To compile the perftest application, I used the following compiler flags:
CFLAGS = -g -Wall -D_GNU_SOURCE -O3 -ggdb3 -O2 -fno-omit-frame-pointer
System:
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
I use funclatency like this:
sudo funclatency-bpfcc ibverbs:ibv_get*
funclatency output:
Tracing 13 functions for "ibverbs:ibv_get*"... Hit Ctrl-C to end.
Function = b'[unknown]' [784402]
nsecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 468 |**************************** |
2048 -> 4095 : 512 |****************************** |
4096 -> 8191 : 664 |****************************************|
8192 -> 16383 : 332 |******************** |
16384 -> 32767 : 14 | |
32768 -> 65535 : 3 | |
65536 -> 131071 : 6 | |
131072 -> 262143 : 0 | |
262144 -> 524287 : 0 | |
524288 -> 1048575 : 0 | |
1048576 -> 2097151 : 0 | |
2097152 -> 4194303 : 0 | |
4194304 -> 8388607 : 0 | |
8388608 -> 16777215 : 1 | |
Here is the source code for the translation of a memory address into a function name:
https://github.com/iovisor/bcc/blob/master/src/python/bcc/__init__.py
def sym(addr, pid, show_module=False, show_offset=False, demangle=True):
    """sym(addr, pid, show_module=False, show_offset=False)
    Translate a memory address into a function name for a pid, which is
    returned. When show_module is True, the module name is also included.
    When show_offset is True, the instruction offset as a hexadecimal
    number is also included in the string.
    A pid of less than zero will access the kernel symbol cache.
    Example output when both show_module and show_offset are True:
        "start_thread+0x202 [libpthread-2.24.so]"
    Example output when both show_module and show_offset are False:
        "start_thread"
    """
    #addr is of type stacktrace_build_id
    #so invoke the bsym address resolver
    typeofaddr = str(type(addr))
    if typeofaddr.find('bpf_stack_build_id') != -1:
        sym = bcc_symbol()
        b = bcc_stacktrace_build_id()
        b.status = addr.status
        b.build_id = addr.build_id
        b.u.offset = addr.offset
        res = lib.bcc_buildsymcache_resolve(BPF._bsymcache,
                                            ct.byref(b),
                                            ct.byref(sym))
        if res < 0:
            if sym.module and sym.offset:
                name,offset,module = (None, sym.offset,
                            ct.cast(sym.module, ct.c_char_p).value)
            else:
                name, offset, module = (None, addr, None)
        else:
            name, offset, module = (sym.name, sym.offset,
                                    ct.cast(sym.module, ct.c_char_p).value)
    else:
        name, offset, module = BPF._sym_cache(pid).resolve(addr, demangle)
    offset = b"+0x%x" % offset if show_offset and name is not None else b""
    name = name or b"[unknown]"
    name = name + offset
    module = b" [%s]" % os.path.basename(module) \
        if show_module and module is not None else b""
    return name + module
How can I read the function name correctly? It should not just return b'[unknown]'!
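To see where the b'[unknown]' label comes from, the formatting tail of the sym() function quoted above can be isolated in a few lines. This is a minimal sketch, not bcc code: format_symbol is a hypothetical helper that takes the (name, offset, module) triple a resolver would hand back, and it only shows that a result whose name is None is printed as b'[unknown]'.

```python
import os


def format_symbol(name, offset, module, show_module=False, show_offset=False):
    """Mimic the last lines of the sym() function quoted above.

    Hypothetical helper for illustration only; `name` and `module` are
    bytes or None, as a symbol resolver might return them.
    """
    # The offset is appended only when a name was actually resolved.
    suffix = b"+0x%x" % offset if show_offset and name is not None else b""
    # A failed lookup (name is None) falls back to the placeholder.
    name = name or b"[unknown]"
    name = name + suffix
    mod = (b" [%s]" % os.path.basename(module)
           if show_module and module is not None else b"")
    return name + mod


unresolved = format_symbol(None, 0x1234, None)
resolved = format_symbol(b"start_thread", 0x202, b"/lib/libpthread-2.24.so",
                         show_module=True, show_offset=True)
print(unresolved)
print(resolved)
```

Read this way, the placeholder appears whenever the resolver returns no name for the address, so the place to look is probably the symbol lookup for the traced process rather than the formatting code.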


Sphinx Results Take Huge Time To Show (Slow Index)

I'm new to Sphinx. I have a simple table, tbl_urls, with two columns (domain_id, url).
I created my index as below to get the domain ID and the number of URLs for any given keyword:
source src2
{
type = mysql
sql_host = 0.0.0.0
sql_user = spnx
sql_pass = 123
sql_db = db_spnx
sql_port = 3306 # optional, default is 3306
sql_query = select id,domain_id,url from tbl_domain_urls
sql_attr_uint = domain_id
sql_field_string = url
}
index url_tbl
{
source = src2
path =/var/lib/sphinx/data/url_tbl
}
indexer
{
mem_limit = 2047M
}
searchd
{
listen = 0.0.0.0:9312
listen = 0.0.0.0:9306:mysql41
listen = /home/charlie/sphinx-3.4.1/bin/searchd.sock:sphinx
log = /var/log/sphinx/sphinx.log
query_log = /var/log/sphinx/query.log
read_timeout = 5
max_children = 30
pid_file = /var/run/sphinx/sphinx.pid
max_filter_values = 20000
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
workers = threads # for RT indexes to work
binlog_path = /var/lib/sphinx/data
max_batch_queries = 128
}
The problem is that the query takes over a minute to return its results:
SELECT domain_id,count(*) as url_counter
FROM url_tbl WHERE MATCH('games')
group by domain_id limit 1000000 OPTION max_matches=1000000;show meta;
+-----------+-------+
| domain_id | url |
+-----------+-------+
| 9900 | 444 |
| 41309 | 48 |
| 62308 | 491 |
| 85798 | 401 |
| 595 | 4851 |
13545 rows in set (3 min 22.56 sec)
+---------------+--------+
| Variable_name | Value |
+---------------+--------+
| total | 13545 |
| total_found | 13545 |
| time | 1.406 |
| keyword[0] | games |
| docs[0] | 456667 |
| hits[0] | 514718 |
+---------------+--------+
The table tbl_domain_urls has 100,821,614 rows.
Dedicated server: HP ProLiant, 2x L5420, 16 GB RAM, 2x 1 TB HDD.
I need help optimizing my query or config settings. I need the results in the lowest time possible, and I would really appreciate any new ideas to test.
Note:
I tried a distributed index to use multiple cores for processing, without any noticeable improvement.

Generators are not stored in memory, so how come I still get the values after the loop has ended?

A loop is created to generate a series of numbers:
import ctypes
g = []
for item1 in range(10):
    print(f'memory address of {item1} = {id(item1)}')
    g.append(id(item1))
Here my loop ends, and the only thing stored is the memory address, not the number.
Checking the values at those memory addresses:
for item in g:
    a = ctypes.cast(item, ctypes.py_object).value
    print(f'values at that memory address = {a}')
Here,
a = ctypes.cast(item, ctypes.py_object).value
gives the value stored at that memory address.
Output:
C:\Users\Admin\Desktop\pythonProject\venv\Scripts\python.exe C:/Users/Admin/Desktop/pythonProject/main.py
memory address of 0 = 140709883771936
memory address of 1 = 140709883771968
memory address of 2 = 140709883772000
memory address of 3 = 140709883772032
memory address of 4 = 140709883772064
memory address of 5 = 140709883772096
memory address of 6 = 140709883772128
memory address of 7 = 140709883772160
memory address of 8 = 140709883772192
memory address of 9 = 140709883772224
My loop has already ended, so as I understand generators the values should no longer be in memory, yet those memory addresses still show the values of the elements from the previous loop:
values at that memory address = 0
values at that memory address = 1
values at that memory address = 2
values at that memory address = 3
values at that memory address = 4
values at that memory address = 5
values at that memory address = 6
values at that memory address = 7
values at that memory address = 8
values at that memory address = 9
Process finished with exit code 0
This is just an implementation detail of CPython, which has prebuilt singletons for the integers -5 through 256. Python holds its own reference to these numbers. Your loop adds and removes a reference, but that original reference remains, keeping the values in the same place in memory.
If you choose a range outside of this sequence, you get a different result.
import ctypes
g = []
for item1 in range(257, 267):
    print(f'memory address of {item1} = {id(item1)}')
    g.append(id(item1))
for item in g:
    a = ctypes.cast(item, ctypes.py_object).value
    print(f'values at that memory address = {a}')
Output
memory address of 257 = 139740170741360
memory address of 258 = 139740170743920
memory address of 259 = 139740170743952
memory address of 260 = 139740170743856
memory address of 261 = 139740170743824
memory address of 262 = 139740170741744
memory address of 263 = 139740170743792
memory address of 264 = 139740170744048
memory address of 265 = 139740170744080
memory address of 266 = 139740170744112
values at that memory address = 139740170743920
values at that memory address = 139740170743952
values at that memory address = 139740170743856
values at that memory address = 139740170743824
values at that memory address = 139740170741744
values at that memory address = 139740170743792
values at that memory address = 139740170744048
values at that memory address = 139740170744080
values at that memory address = 139740170744112
values at that memory address = 266
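The caching can also be observed without reading raw memory. The following is a minimal sketch, assuming CPython (other interpreters may behave differently). The helper same_object is hypothetical, and int() is used to build the values at run time so that the compiler cannot merge equal constants.

```python
def same_object(text):
    """Return True if parsing `text` twice yields the very same int object."""
    first = int(text)
    second = int(text)
    return first is second


# Values inside CPython's small-integer cache are shared singletons.
cached = [same_object("-5"), same_object("0"), same_object("256")]
print(cached)

# Values outside the cache are normally fresh objects on each call, so
# nothing keeps them alive once the last reference to them is dropped.
print(same_object("257"))
```

This is also why reading freed addresses with ctypes, as in the second run above, returns unrelated data: the memory has been reused, and doing so is not safe in general.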
A range is not a generator. It is a class, inheriting from the base class object, that represents a sequence of integers, so it is neither a function nor a generator. If you look through its class definition, you can see an __iter__ method, which tells you that it supports iteration. Under the hood, a for loop calls iter() on the object it loops over, which is why it can iterate over a range.
code
help(range)
output
C:\Users\Admin\PycharmProjects\pythonProject1\venv\Scripts\python.exe C:/Users/Admin/PycharmProjects/pythonProject1/main.py
Help on class range in module builtins:
class range(object)
| range(stop) -> range object
| range(start, stop[, step]) -> range object
|
| Return an object that produces a sequence of integers from start (inclusive)
| to stop (exclusive) by step. range(i, j) produces i, i+1, i+2, ..., j-1.
| start defaults to 0, and stop is omitted! range(4) produces 0, 1, 2, 3.
| These are exactly the valid indices for a list of 4 elements.
| When step is given, it specifies the increment (or decrement).
|
| Methods defined here:
|
| __bool__(self, /)
| self != 0
|
| __contains__(self, key, /)
| Return key in self.
|
| __eq__(self, value, /)
| Return self==value.
|
| __ge__(self, value, /)
| Return self>=value.
|
| __getattribute__(self, name, /)
| Return getattr(self, name).
|
| __getitem__(self, key, /)
| Return self[key].
|
| __gt__(self, value, /)
| Return self>value.
|
| __hash__(self, /)
| Return hash(self).
|
| __iter__(self, /)
| Implement iter(self).
|
| __le__(self, value, /)
| Return self<=value.
|
| __len__(self, /)
| Return len(self).
|
| __lt__(self, value, /)
| Return self<value.
|
| __ne__(self, value, /)
| Return self!=value.
|
| __reduce__(...)
| Helper for pickle.
|
| __repr__(self, /)
| Return repr(self).
|
| __reversed__(...)
| Return a reverse iterator.
|
| count(...)
| rangeobject.count(value) -> integer -- return number of occurrences of value
|
| index(...)
| rangeobject.index(value) -> integer -- return index of value.
| Raise ValueError if the value is not present.
|
| ----------------------------------------------------------------------
| Static methods defined here:
|
| __new__(*args, **kwargs) from builtins.type
| Create and return a new object. See help(type) for accurate signature.
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| start
|
| step
|
| stop
Process finished with exit code 0
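The difference between a range and a generator shows up as soon as each is iterated twice. A short sketch using only built-ins; the variable names are arbitrary:

```python
r = range(3)
gen = (n for n in range(3))

# A range can be iterated any number of times, because each loop asks it
# for a fresh iterator via iter().
first_pass = list(r)
second_pass = list(r)

# A generator is its own iterator and is exhausted after one pass.
gen_first = list(gen)
gen_second = list(gen)

print(first_pass, second_pass)
print(gen_first, gen_second)

# iter() returns a new object for a range, but the generator itself.
print(iter(r) is r, iter(gen) is gen)

# A range also supports sequence operations that a generator lacks.
print(len(r), r[1], 2 in r)
```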

Get values from custom packet in IP option fields in scapy

I want to get the value of some fields in my scapy code for receiving packets, but I don't know how to do it exactly. When I print the value, I get the error that the layer is not defined or AttributeError: 'NoneType' object has no attribute 'proto'
class TELEMETRY(IPOption):
    name = "TELEMETRY"
    option = 31
    fields_desc = [ _IPOption_HDR,
                    ByteField("length", 2),
                    Emph(SourceIPField("src", "dst")),
                    Emph(DestIPField("dst", "127.0.0.1")),
                    ShortEnumField("sport", 20, TCP_SERVICES),
                    ShortEnumField("dport", 80, TCP_SERVICES),
                    ByteEnumField("proto", 0, IP_PROTOS),
                    BitField("timeTaken", 0, 32),
                    BitField("egress_timestamp", 0, 48),
                    BitField("enqQdepth", 0, 19),
                    BitField("deqQdepth", 0, 19),
                    BitField("padding", 0, 2) ]
I can access the IP packet with the code below, but when I try to access the telemetry fields in my custom option,
I get the error AttributeError: 'NoneType' object has no attribute 'proto'
def handle_pkt(pkt):
    ip_src=pkt[IP].src
    ip_dst=pkt[IP].dst
    ip_ver=pkt[IP].version
    ip_id=pkt[IP].id
    telemetry = pkt.getlayer(TELEMETRY)
    print ip_src,ip_dst,ip_ver,ip_id, telemetry.proto
    os.system(" echo %s %s %s %s %s| nc localhost 6666" % (ip_src,ip_dst,ip_ver,ip_id,telemetry.proto))
Here is the result of pkt.show2()
sniffing on h4-eth0
got a packet
###[ Ethernet ]###
dst = 08:00:00:00:02:00
src = ff:ff:ff:ff:ff:ff
type = IPv4
###[ IP ]###
version = 4
ihl = 13
tos = 0x0
len = 172
id = 1
flags =
frag = 0
ttl = 63
proto = tcp
chksum = 0x5efb
src = 192.168.1.1
dst = 192.168.3.3
\options \
|###[ TELEMETRY ]###
| copy_flag = 0
| optclass = control
| option = 31
| length = 32
| src = 192.168.1.1
| dst = 192.168.3.3
| sport = 64314
| dport = 1234
| proto = tcp
| timeTaken = 11
| egress_timestamp= 7740314797
| enqQdepth = 0
| deqQdepth = 0
| padding = 0
###[ TCP ]###
sport = 64314
dport = 1234
seq = 0
ack = 0
dataofs = 5
reserved = 0
flags = S
window = 8192
chksum = 0x2d40
urgptr = 0
options = ''
Any idea would be appreciated. :)
getlayer works when you want to get a sublayer.
In your case, you want to get a layer inside a list (IP.options is a list of layers).
The solution is then:
telemetry = pkt[IP].options[0]
Now, the option list might be empty for a variety of reasons, so you might want to:
if len(pkt[IP].options):
    telemetry = pkt[IP].options[0]
else:
    pass
    # deal with it
Carcigenicate also pointed out that your packet might contain more than one option, so you may want to deal with that too.
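One way to handle both the empty list and several options is to filter the list by type instead of indexing position 0. The following is a minimal sketch, not scapy API: find_options is a hypothetical helper, and FakeOption and FakeTelemetry are stand-ins so that the example runs without scapy installed. In the real handler the list would be pkt[IP].options and the class would be TELEMETRY.

```python
def find_options(options, option_cls):
    """Return every entry of `options` that is an instance of `option_cls`."""
    return [opt for opt in options if isinstance(opt, option_cls)]


# Stand-in classes (assumptions for illustration, not scapy classes).
class FakeOption:
    pass


class FakeTelemetry(FakeOption):
    def __init__(self, proto):
        self.proto = proto


options = [FakeOption(), FakeTelemetry(proto=6), FakeTelemetry(proto=17)]
matches = find_options(options, FakeTelemetry)
protos = [opt.proto for opt in matches]
print(protos)

# An empty option list simply yields no matches instead of an IndexError.
empty = find_options([], FakeTelemetry)
print(empty)
```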

How to add an increasing integer ID to items in a Spark DStream

I am developing a Spark Streaming application where I want to have one global numeric ID per item in my data stream. Having an interval/RDD-local ID is trivial:
dstream.transform(_.zipWithIndex).map(_.swap)
This will result in a DStream like:
// key: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 || 0 | 1 | 2 | 3 | 4 || 0
// val: a | b | c | d | e | f | g | h | i || j | k | l | m | n || o
(where the double bar || indicates the beginning of a new RDD).
What I finally want to have is:
// key: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 || 9 | 10 | 11 | 12 | 13 || 14
// val: a | b | c | d | e | f | g | h | i || j | k | l | m | n || o
How can I do that in a safe and performant way?
This seems like a trivial task, but I find it very hard to preserve state (state = "number of items seen so far") between RDDs. Here are two approaches I tried, both of which update the number of items seen so far (plus the number in the current interval) using updateStateByKey with a bogus key:
val intervalItemCounts = inputStream.count().map((1, _))
// intervalItemCounts looks like:
// K: 1 || 1 || 1
// V: 9 || 5 || 1
val updateCountState: (Seq[Long], Option[ItemCount]) => Option[ItemCount] =
  (itemCounts, maybePreviousState) => {
    val previousState = maybePreviousState.getOrElse((0L, 0L))
    val previousItemCount = previousState._2
    Some((previousItemCount, previousItemCount + itemCounts.head))
  }
val totalNumSeenItems: DStream[ItemCount] = intervalItemCounts.
  updateStateByKey(updateCountState).map(_._2)
// totalNumSeenItems looks like:
// V: (0,9) || (9,14) || (14,15)
// The first approach uses a cartesian product with the
// 1-element state DStream. (Is this performant?)
val increaseRDDIndex1: (RDD[(Long, Char)], RDD[ItemCount]) =>
  RDD[(Long, Char)] =
  (streamData, totalCount) => {
    val product = streamData.cartesian(totalCount)
    product.map(dataAndOffset => {
      val ((localIndex: Long, data: Char),
        (offset: Long, _)) = dataAndOffset
      (localIndex + offset, data)
    })
  }
val globallyIndexedItems1: DStream[(Long, Char)] = inputStream.
  transformWith(totalNumSeenItems, increaseRDDIndex1)
// The second approach uses a take() output operation on the
// 1-element state DStream beforehand. (Is this valid?? Will
// the closure be serialized and shipped in every interval?)
val increaseRDDIndex2: (RDD[(Long, Char)], RDD[ItemCount]) =>
  RDD[(Long, Char)] = (streamData, totalCount) => {
    val offset = totalCount.take(1).head._1
    streamData.map(keyValue => (keyValue._1 + offset, keyValue._2))
  }
val globallyIndexedItems2: DStream[(Long, Char)] = inputStream.
  transformWith(totalNumSeenItems, increaseRDDIndex2)
Both approaches give the correct result (with local[*] master), but I am wondering about performance (shuffle etc.), whether it works in a truly distributed environment and whether it shouldn't be a lot easier than that...
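Stripped of Spark, the bookkeeping both approaches implement is a running offset added to each interval's local index. The following is a plain-Python model of that idea only, with strings standing in for the per-interval RDDs and a hypothetical helper add_global_index. It says nothing about how Spark would shuffle or serialize the work.

```python
def add_global_index(batches):
    """Yield one list of (global_id, item) pairs per input batch."""
    offset = 0  # number of items seen in all previous batches
    for batch in batches:
        # Local index within the batch plus the running offset.
        indexed = [(offset + local, item) for local, item in enumerate(batch)]
        yield indexed
        offset += len(batch)


# Three intervals with 9, 5 and 1 items, as in the diagrams above.
batches = ["abcdefghi", "jklmn", "o"]
result = list(add_global_index(batches))
for indexed in result:
    print(indexed)
```

The offsets applied to the three batches are 0, 9 and 14, which matches the first components of the (0,9) || (9,14) || (14,15) state shown in the question.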

Lining up strings with list object lengths

I've got this function that compares stats that I pull from a txt file, so they are static. I was trying to think of a way to line up the stats of each race/subrace with the names. Here's the source :
def compare():
    print('------ Compare Race Stats ------')
    comp1 = query()
    comp2 = query()
    comp1 = stats(comp1[0],comp1[1])
    comp2 = stats(comp2[0],comp2[1])
    print('{} - {} | {} - {}'
          .format(comp1[0][0],comp1[0][1],comp2[0][0],comp2[0][1]))
    for i in range(len(comp1[1])):
        print('{}{}{}{}'.format(' '*round(len(comp1[0][0])+len(comp1[0][1])/5),comp1[1][i],
                                ' '*round(len(comp2[0][0])+2+len(comp2[0][1])/5),comp2[1][i]))
query() asks what race/subrace you want and returns strings for each. stats() takes the race/subrace names, pulls from the txt file and returns the stats along with the names. The +2 in the second empty-space calculation is my attempt to account for the first print statement (print('{} - {} | {} - {}')); that was a guess. The output doesn't look so bad, and I thought the space calculation was kind of clever (I'm a noob), but I couldn't help wondering what Stack Overflow would have to say. Are there accepted ways of lining up various outputs?
Here's some output, there are 10 different race/subraces:
Elezen - Duskwight | Hyur - Midlander
STR : 20 STR : 21
DEX : 20 DEX : 19
VIT : 19 VIT : 20
INT : 23 INT : 21
MND : 20 MND : 18
PIE : 18 PIE : 21
Mi'Qote - Seekers of the Sun | Mi'Qote - Keepers of the Moon
STR : 21 STR : 18
DEX : 22 DEX : 21
VIT : 20 VIT : 17
INT : 18 INT : 19
MND : 19 MND : 23
PIE : 20 PIE : 22
After a little research I found the rjust and ljust str functions:
def compare():
    print("------ Compare Race Stats ------")
    comp1 = query()
    comp2 = query()
    comp1 = stats(comp1[0], comp1[1])
    comp2 = stats(comp2[0], comp2[1])
    print("{} - {} | {} - {}"
          .format(comp1[0][0], comp1[0][1], comp2[0][0], comp2[0][1]))
    for i in range(len(comp1[1])):
        print("{} {} {}".format(comp1[1][i].rjust(len(comp1[0][0]+comp1[0][1])+3),
                                "|", comp2[1][i]))
Here's the output:
Roegadyn - Hellsguard | Elezen - Duskwight
STR : 20 | STR : 20
DEX : 17 | DEX : 20
VIT : 21 | VIT : 19
INT : 20 | INT : 23
MND : 22 | MND : 20
PIE : 20 | PIE : 18
Mi'Qote - Keepers of the Moon | Roegadyn - Hellsguard
STR : 18 | STR : 20
DEX : 21 | DEX : 17
VIT : 17 | VIT : 21
INT : 19 | INT : 20
MND : 23 | MND : 22
PIE : 22 | PIE : 20
My apologies for asking a question I should have answered myself before posting; the least I can do is give an answer.
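A related option is to let the format mini-language do the padding, with the column width taken from the widest entry rather than computed by hand. The following is a minimal sketch: format_comparison is a hypothetical helper that takes the header and stat strings directly, instead of the results of query() and stats().

```python
def format_comparison(header1, stats1, header2, stats2):
    """Return lines in which the '|' separator sits in the same column."""
    # Pad the left column to the width of its widest entry.
    width = max([len(header1)] + [len(s) for s in stats1])
    lines = ["{:<{w}} | {}".format(header1, header2, w=width)]
    for left, right in zip(stats1, stats2):
        lines.append("{:<{w}} | {}".format(left, right, w=width))
    return lines


lines = format_comparison(
    "Elezen - Duskwight", ["STR : 20", "DEX : 20"],
    "Hyur - Midlander", ["STR : 21", "DEX : 19"],
)
print("\n".join(lines))
```

Using "{:>{w}}" instead of "{:<{w}}" right-aligns the left column, which is the same effect rjust gives in the version above.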
