Sphinx claiming memory is too low and my ids are null - Linux

I am trying to index about 3,000 documents, but here is what I am getting:
[root@domU-12-31-39-0A-19-CB data]# /usr/local/sphinx/bin/indexer --all
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/usr/local/sphinx/etc/sphinx.conf'...
indexing index 'catalog'...
WARNING: Attribute count is 0: switching to none docinfo
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 12288 kb
WARNING: source catalog: skipped 3558 document(s) with zero/NULL ids
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.040 sec, 0 bytes/sec, 0.00 docs/sec
total 1 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 5 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
I have rt_mem_limit = 512M set, so why is it telling me I don't have enough memory?

rt_mem_limit != mem_limit: they are different settings with different purposes.
mem_limit is the value used by indexer during indexing:
http://sphinxsearch.com/docs/current.html#conf-mem-limit
It goes in the 'indexer' section of your config file.
You must have it set too low. Either leave it out entirely (to use the 32M default), or change it to a better value.
But you also have no document ids in your dataset. Check that your sql_query actually works.
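As a sketch (the section and option names follow the Sphinx 2.x docs; the table and column names are made up for illustration), the two relevant pieces of sphinx.conf look like this:

```
indexer
{
    # memory budget for the indexer binary; omitting this line entirely
    # falls back to the 32M default
    mem_limit = 256M
}

source catalog
{
    # the FIRST column returned must be a unique, non-zero document id;
    # rows where it is 0 or NULL are skipped, as in the warning above
    sql_query = SELECT id, title, body FROM products
}
```

Running the sql_query by hand against the database is the quickest way to confirm the id column is actually populated.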

Related

How to record CPU usage of a single app from top command?

I know the 'grep' command can help me do this:
$ top -bd 0.5 -o +%CPU | grep "zoom" > cpu_usage.log
Then I can use a separate Python script to extract the figures, but I also want to grab the timestamp from the first line of top's output. Is there a way to do it? Thank you so much.
After some research on the internet and reading the man pages (man top / man grep), I found that we can add the timestamp by also searching for a keyword from the first line, which contains it.
The first lines of top looks like this:
top - 17:05:43 up 11:08, 1 user, load average: 1.16, 1.05, 1.18
Tasks: 294 total, 1 running, 293 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.6 us, 1.9 sy, 0.0 ni, 94.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 7765.0 total, 1048.7 free, 3520.4 used, 3195.9 buff/cache
MiB Swap: 2048.0 total, 1953.7 free, 94.2 used. 3294.5 avail Mem
Notice that the timestamp is on the first line, which contains "top - ". So I add a second search string, "top - ", to my command, with "\|" between the two patterns telling grep to match lines containing either (or both) of them, like this:
$ top -bd 0.5 -o +%CPU | grep "top - \|zoom" > cpu_usage.log
Done!
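As a variant (a sketch, not the only way to do it): awk can pair the timestamp with each matching process line directly, so no second script is needed. The process name "zoom" comes from the question; the field positions ($3 for the time, $9 for %CPU) assume top's default column layout. Shown here on canned sample lines; in practice you would pipe from top -bd 0.5 -o +%CPU instead of printf.

```shell
# Carry the most recent "top - " timestamp forward in the variable ts,
# then print it next to every line that mentions the process of interest.
printf '%s\n' \
  'top - 17:05:43 up 11:08,  1 user,  load average: 1.16, 1.05, 1.18' \
  ' 2034 user      20   0 4102m 300m  90m S  12.5  3.9   0:42.10 zoom' |
awk '/^top - /{ts=$3} /zoom/{print ts, $9}'
```

On the sample input this prints the timestamp followed by the %CPU value, one line per sample, which is already in a shape the later analysis can consume.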

"vmstat" and "perf stat -a" show different numbers for context-switching

I'm trying to understand the context-switching rate on my system (running on AWS EC2), and where the switches are coming from. Just getting the number is already confusing, as two tools that I know can output such a metric give me different results. Here's the output from vmstat:
$ vmstat -w 2
procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
r b swpd free buff cache si so bi bo in cs us sy id wa st
8 0 0 443888 492304 8632452 0 0 0 1 0 0 14 2 84 0 0
37 0 0 444820 492304 8632456 0 0 0 20 131602 155911 43 5 52 0 0
8 0 0 445040 492304 8632460 0 0 0 42 131117 147812 46 4 50 0 0
13 0 0 446572 492304 8632464 0 0 0 34 129154 142260 49 4 46 0 0
The number is ~140k-160k/sec.
But perf tells something else:
$ sudo perf stat -a
Performance counter stats for 'system wide':
2980794.013800 cpu-clock (msec) # 35.997 CPUs utilized
12,335,935 context-switches # 0.004 M/sec
2,086,162 cpu-migrations # 0.700 K/sec
11,617 page-faults # 0.004 K/sec
...
0.004 M/sec is apparently 4k/sec.
Why is there a disparity between the two tools? Am I misinterpreting something in either of them, or are their context-switch metrics somehow different?
FWIW, I've tried the same thing on a machine running a different workload, and the discrepancy there is even larger, roughly twice as big.
Environment:
AWS EC2 c5.9xlarge instance
Amazon Linux, kernel 4.14.94-73.73.amzn1.x86_64
The service runs on Docker 18.06.1-ce
Some recent versions of perf have a unit-scaling bug in the printing code. Manually compute 12.3M / wall-clock time and see whether that number is sane. (Spoiler: it is, according to the OP's comment.)
https://lore.kernel.org/patchwork/patch/1025968/
Commit 0aa802a79469 ("perf stat: Get rid of extra clock display
function") introduced the bug in mainline Linux 4.19-rc1 or so.
Thus, perf_stat__update_shadow_stats() now saves scaled values of clock events
in msecs, instead of the original nsecs. But while calculating shadow stats we
still treat clock event values as nsecs. This results in wrong shadow stat
values.
Commit 57ddf09173c1 on Mon, 17 Dec 2018 fixed it in 5.0-rc1, eventually being released with perf upstream version 5.0.
Vendor kernel trees that cherry-pick commits for their stable kernels might have the bug or have fixed the bug earlier.
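The manual check suggested above can be done in one awk one-liner, using the numbers from the perf output in the question. The key observation is that perf's cpu-clock total is summed across all CPUs, so wall time is cpu-clock divided by "CPUs utilized":

```shell
# Back-of-the-envelope check with the figures pasted above:
# wall time = total cpu-clock / CPUs utilized, then switches / wall time.
awk 'BEGIN {
    cpu_clock_ms = 2980794.013800   # "cpu-clock (msec)" from perf stat
    cpus         = 35.997           # "CPUs utilized"
    switches     = 12335935         # "context-switches"
    wall_s = cpu_clock_ms / cpus / 1000
    printf "%.1f s wall time, %.0f context switches/sec\n", wall_s, switches / wall_s
}'
```

This lands around 149k switches/sec, squarely inside the ~140k-160k/sec range that vmstat reported, so the raw counter is fine and only perf's printed "M/sec" rate is off.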

Golang: What is etext?

I've started to profile some of my Go 1.2 code and the top item is always something named 'etext'. I've searched around but couldn't find much information about it, other than that it might relate to call depth in goroutines. However, I'm not using any goroutines and 'etext' still takes up 75% or more of the total execution time.
(pprof) top20
Total: 171 samples
128 74.9% 74.9% 128 74.9% etext
Can anybody explain what this is and if there is any way to reduce the impact?
I hit the same problem, then I found this: pprof broken in go 1.2?. To verify that it really is a 1.2 bug, I wrote the following "hello world" program:
package main

import (
	"fmt"
	"testing"
)

func BenchmarkPrintln(t *testing.B) {
	TestPrintln(nil)
}

func TestPrintln(t *testing.T) {
	for i := 0; i < 10000; i++ {
		fmt.Println("hello " + " world!")
	}
}
As you can see, it only calls fmt.Println.
Compile it with "go test -c ."
Run it with "./test.test -test.bench . -test.cpuprofile=test.prof"
View the result with "go tool pprof test.test test.prof"
(pprof) top10
Total: 36 samples
18 50.0% 50.0% 18 50.0% syscall.Syscall
8 22.2% 72.2% 8 22.2% etext
4 11.1% 83.3% 4 11.1% runtime.usleep
3 8.3% 91.7% 3 8.3% runtime.futex
1 2.8% 94.4% 1 2.8% MHeap_AllocLocked
1 2.8% 97.2% 1 2.8% fmt.(*fmt).padString
1 2.8% 100.0% 1 2.8% os.epipecheck
0 0.0% 100.0% 1 2.8% MCentral_Grow
0 0.0% 100.0% 33 91.7% System
0 0.0% 100.0% 3 8.3% _/home/xxiao/work/test.BenchmarkPrintln
The above result was obtained with Go 1.2.1.
Then I did the same thing with Go 1.1.1 and got the following result:
(pprof) top10
Total: 10 samples
2 20.0% 20.0% 2 20.0% scanblock
1 10.0% 30.0% 1 10.0% fmt.(*pp).free
1 10.0% 40.0% 1 10.0% fmt.(*pp).printField
1 10.0% 50.0% 2 20.0% fmt.newPrinter
1 10.0% 60.0% 2 20.0% os.(*File).Write
1 10.0% 70.0% 1 10.0% runtime.MCache_Alloc
1 10.0% 80.0% 1 10.0% runtime.exitsyscall
1 10.0% 90.0% 1 10.0% sweepspan
1 10.0% 100.0% 1 10.0% sync.(*Mutex).Lock
0 0.0% 100.0% 6 60.0% _/home/xxiao/work/test.BenchmarkPrintln
You can see that the 1.2.1 result does not make much sense: Syscall and etext take most of the time. And the 1.1.1 result looks right.
So I'm convinced that it really is a 1.2.1 bug. I switched to Go 1.1.1 in my real project and I'm satisfied with the profiling results now.
I think Mathias Urlichs is right about missing debugging symbols in your cgo code. It's worth noting that some standard packages like net and syscall make use of cgo.
If you scroll down to the bottom of this doc to the section called Caveats, you can see that the third bullet says...
If the program linked in a library that was not compiled with enough symbolic information, all samples associated with the library may be charged to the last symbol found in the program before the library. This will artificially inflate the count for that symbol.
I'm not 100% positive this is what's happening, but I'm betting that this is why etext appears to be so busy; in other words, etext is a collection of various functions that don't have enough symbolic information for pprof to analyze properly.

Can I place my application code on partition in my RAM?

I want to use RAM instead of an SSD. I'm looking for experienced people to give me some advice about this. I want to mount a partition in RAM and put my Rails app into it.
Any ideas?
UPD: I tested the SSD and RAM. I have OS X with 4x4GB Kingston RAM @ 1333 MHz, an Intel Core i3 @ 2.8 GHz, a 120GB OCZ Vertex3 SSD, and a 3TB Seagate ST3000DM001 HDD. My OS is installed on the SSD, and Ruby with its gems lives in my home folder on the SSD. I created a new Rails app with 10,000 product items in SQLite and a controller with this code:
@products = Product.all
Rails.cache.clear
I tested it with ab (Apache Bench).
SSD
Document Length: 719 bytes
Concurrency Level: 4
Time taken for tests: 39.274 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 130600 bytes
HTML transferred: 71900 bytes
Requests per second: 2546.21
Transfer rate: 3325.35 kb/s received
Connection Times (ms)
min avg max
Connect: 0 0 0
Processing: 398 1546 1627
Total: 398 1546 1627
RAM
Document Length: 719 bytes
Concurrency Level: 4
Time taken for tests: 39.272 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 130600 bytes
HTML transferred: 71900 bytes
Requests per second: 2546.33
Transfer rate: 3325.51 kb/s received
Connection Times (ms)
min avg max
Connect: 0 0 0
Processing: 366 1546 1645
Total: 366 1546 1645
HDD
Document Length: 719 bytes
Concurrency Level: 4
Time taken for tests: 40.510 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 130600 bytes
HTML transferred: 71900 bytes
Requests per second: 2468.54
Transfer rate: 3223.92 kb/s received
Connection Times (ms)
min avg max
Connect: 0 0 0
Processing: 1193 1596 2400
Total: 1193 1596 2400
So, I think the bottleneck is that Ruby and the gems live on the SSD and loading those scripts is slow. I will test on a real server, putting all the Ruby scripts into RAM, with more complicated code or a real application.
ps: sorry for my english :)
You are looking for a ramdisk.
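On a Linux server, the usual way to get one is a tmpfs mount; a minimal sketch (the mount point and size are made up) as an /etc/fstab fragment:

```
# /etc/fstab fragment: a 2 GB filesystem that lives entirely in RAM.
# tmpfs contents vanish on reboot, so deploy a COPY of the app there,
# never the only copy.
tmpfs  /mnt/ramdisk  tmpfs  size=2G,mode=0755  0  0
```

After `mount /mnt/ramdisk` you can copy the app over. Note, though, that your own benchmark above shows the SSD and RAM results are nearly identical: once the OS has the files in its page cache, a ramdisk buys almost nothing.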

phpcassa get_range is too slow

I have a column family with 1280 rows.
Each row has 6 columns. I'm trying $cf->get_range('pq_questions','','',1200) and it gets all the rows, but too slowly (about 4-6 seconds).
Column Family: pq_questions
SSTable count: 1
Space used (live): 668363
Space used (total): 668363
Number of Keys (estimate): 1280
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 0
Read Latency: NaN ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 1000
Key cache hit rate: 0.10998439937597504
Row cache capacity: 1000
Row cache size: 1000
Row cache hit rate: 0.0
Compacted row minimum size: 373
Compacted row maximum size: 1331
Compacted row mean size: 574
It is strange, but the read latency in cfstats is NaN ms.
When I run htop on Debian, I see that phpcassa causes most of the load.
I have only one node and use consistency level ONE.
What can cause such slow querying?
I'm guessing you don't have the C extension installed. Without it, a similar query takes 1-2 seconds for me. With it installed, the same query takes about 0.2 seconds.
Regarding the NaN read latency: latencies aren't captured for get_range_slices (get_range in phpcassa).