Multi-CPU-core gzip of a big file on Linux

How can I use all the CPU cores on my server (it has 4 cores, running Debian Linux over OpenVZ) to gzip one big file faster?
I am trying to use these commands, but I cannot put the pieces together.
Get the number of cores:
CORES=$(grep -c '^processor' /proc/cpuinfo)
This to split the big file into smaller pieces:
split -b100 file.big
This to run gzip on multiple cores:
find /source -type f -print0 | xargs -0 -n 1 -P $CORES gzip --best
I don't know if this is the best way to optimize the gzip process for big files.
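For reference, a rough, untested sketch of how those pieces could fit together (the 100M chunk size and part names are just examples): split the file, compress the parts in parallel using the $CORES variable from above, then concatenate the compressed parts, since gzip can decompress concatenated members:
split -b 100M file.big file.part.                    # split into example 100 MB pieces
ls file.part.* | xargs -n 1 -P "$CORES" gzip --best  # compress the pieces in parallel
cat file.part.*.gz > file.big.gz                     # concatenated gzip members form one valid .gz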

Use pigz, a parallel gzip implementation.
Unlike parallel with gzip, pigz produces a single gzip stream.
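A minimal example (assuming pigz is installed; -p sets the number of compression threads):
pigz -9 -p 4 file.big   # writes file.big.gz using 4 threads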

Try GNU Parallel
cat bigfile | parallel --pipe --recend '' -k gzip -9 >bigfile.gz
This will use all your cores to gzip in parallel.
By way of comparison, on my Mac running OSX Mavericks, and using a 6.4GB file on solid state disk, this command
time gzip -9 <bigger >/dev/null
takes 4 minutes 23s and uses 1-2 CPUs at around 50%.
Whereas the GNU Parallel version below
time cat bigger | parallel --pipe --recend '' -k gzip -9 >/dev/null
takes 1 minute 44 seconds and keeps all 8 cores 80+% busy. A very significant difference, with GNU Parallel running in under 40% of the time of the simplistic approach.

Related

Ubuntu terminal - using gnu parallel to read lines in all files in folder

I am trying to count the lines in all the files in a very large folder under Ubuntu.
The files are .gz files and I use
zcat * | wc -l
to count all the lines in all the files, and it's slow!
I want to use multi-core computing for this task and found this about GNU Parallel.
I tried to use this bash command:
parallel zcat * | parallel --pipe wc -l
but not all the cores are working.
I found that job startup might cause major overhead and tried using batching with
parallel -X zcat * | parallel --pipe -X wc -l
without improvement.
How can I use all the cores to count the lines in all the files in a folder, given that they are all .gz files and need to be decompressed before counting the rows (they don't need to be kept uncompressed afterwards)?
Thanks!
If you have 150,000 files, you will likely get problems with "argument list too long". You can avoid that like this:
find . -name \*gz -maxdepth 1 -print0 | parallel -0 ...
If you want the name beside the line count, you will have to echo it yourself, since your wc process will only be reading from its stdin and won't know the filename:
find ... | parallel -0 'echo {} $(zcat {} | wc -l)'
Next, we come to efficiency and it will depend on what your disks are capable of. Maybe try with parallel -j2 then parallel -j4 and see what works on your system.
As Ole helpfully points out in the comments, you can avoid having to echo the filename yourself by using GNU Parallel's --tag option to tag each output line, so this is even more efficient:
find ... | parallel -0 --tag 'zcat {} | wc -l'
Basically the command you are looking for is:
ls *gz | parallel 'zcat {} | wc -l'
What it does is:
ls *gz lists all the gz files on stdout
Pipe it to parallel
Spawn subshells with parallel
Run in said subshells the command inside quotes 'zcat {} | wc -l'
About the '{}', according to the manual:
This replacement string will be replaced by a full line read from the input source
So each line piped to parallel gets fed to zcat.
Of course this is basic; I assume it could be tuned, and the documentation and examples might help.
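For example, if you want one grand total instead of a count per file, the per-file counts could (as a sketch) be summed with awk:
ls *gz | parallel 'zcat {} | wc -l' | awk '{total += $1} END {print total}'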

Performance of wc -l

I ran the following command:
time for i in {1..100}; do find / -name "*.service" | wc -l; done
got 100 lines of output, then:
real 0m35.466s
user 0m15.688s
sys 0m14.552s
I then ran the following command:
time for i in {1..100}; do find / -name "*.service" | awk 'END{print NR}'; done
got 100 lines of output, then:
real 0m35.036s
user 0m15.848s
sys 0m14.056s
I should point out that I had already run find / -name "*.service" just before, so it was cached for both commands.
I expected wc -l to be faster. Why is it not ?
Others have mentioned that you're probably timing find, not wc or awk. Still, there may be interesting differences to explore between wc and awk in their various flavors.
Here are the results I get:
Mac OS 10.10.5 awk 0.16m lines/second
GNU awk/gawk 4.1.4 4.4m lines/second
Mac OS 10.10.5 wc 6.8m lines/second
GNU wc 8.27 11m lines/second
I didn't use find, but instead used wc -l or awk 'END{print NR}' on a large text file (66k lines) in a loop.
I varied the order of the commands and didn't find any deviations large enough to change the rankings I reported.
LC_CTYPE=C had no measurable effect on any of these.
Conclusions:
Don't use the macOS built-in command-line tools except for trivial amounts of data.
GNU wc is faster than GNU awk at counting lines.
I use MacPorts GNU binaries. It would be interesting to see how Homebrew binaries compare. (I'm guessing they'd lose.)
Three things:
Such a small difference is usually not significant:
0m35.466s - 0m35.036s = 0m0.43s or 1.2%
Yet wc -l is faster (10x) than awk 'END{print NR}'.
% time seq 100000000 | awk 'END{print NR}' > /dev/null
real 0m13.624s
user 0m14.656s
sys 0m1.047s
% time seq 100000000 | wc -l > /dev/null
real 0m1.604s
user 0m2.413s
sys 0m0.623s
My guess is that the hard drive cache holds the find results, so after the first run with wc -l, most of the reads needed for find are in the cache. Presumably the difference in times between the initial find with disk reads and the second find with cache reads, would be greater than the difference in run times between awk and wc.
One way to test this is to reboot, which clears the hard disk cache, then run the two tests again, but in the reverse order, so that awk is run first. I'd expect that the first-run awk would be even slower than the first-run wc, and the second-run wc would be faster than the second-run awk.
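On Linux, a way to drop the page cache without a full reboot (requires root; sync first so dirty pages are flushed) is:
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches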

Split a .gz file into multiple 1GB compressed (.gz) files

I have a 250GB gzipped file on Linux and I want to split it in 250 1GB files and compress the generated part files on the fly (as soon as one file is generated, it should be compressed).
I tried using this -
zcat file.gz | split -b 1G - file.gz.part
But this is generating uncompressed file and rightly so. I modified it to look like this, but got an error:
zcat file.gz | split -b 1G - file.gz.part | gzip
gzip: compressed data not written to a terminal. Use -f to force compression.
For help, type: gzip -h
I also tried this, and it did not throw any error, but it did not compress the part files as soon as they were generated. I assume that it will compress each file when the whole split is done (or it may pack all the part files and create a single gz file once the split completes; I am not sure).
zcat file.gz | split -b 1G - file.gz.part && gzip
I read here that there is a filter option, but my version of split is (GNU coreutils) 8.4, hence the filter is not supported.
$ split --version
split (GNU coreutils) 8.4
Please advise a suitable way to achieve this, preferably a one-liner (if possible); a shell (bash/ksh) script will also work.
split supports filter commands. Use this:
zcat file.gz | split - -b 1G --filter='gzip > $FILE.gz' file.part.
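To later reconstruct the original uncompressed data, the parts can simply be concatenated and decompressed, since gunzip handles concatenated gzip members:
cat file.part.*.gz | gunzip > file   # same content as zcat file.gz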
It's definitely suboptimal, but I tried to write it in bash just for fun (I haven't actually tested it, so there may be some minor mistakes):
# 1 GB expressed in 512-byte dd blocks, and in bytes
GB_IN_BLOCKS=`expr 2048 \* 1024`
GB=`expr $GB_IN_BLOCKS \* 512`
# total uncompressed size, and how many 1 GB parts it yields
COMPLETE_SIZE=`zcat asdf.gz | wc -c`
PARTS=`expr $COMPLETE_SIZE \/ $GB`
for i in `seq 0 $PARTS`
do
  # re-read the archive, skip to the i-th 1 GB chunk, and recompress just that chunk
  zcat asdf.gz | dd skip=`expr $i \* $GB_IN_BLOCKS` count=$GB_IN_BLOCKS | gzip > asdf.gz.part$i
done

How do I grep in parallel

I usually use grep -rIn pattern_str big_source_code_dir to find something, but grep is not parallel. How do I make it parallel? My system has 4 cores; if grep could use all the cores, it would be faster.
There will not be a speed improvement if the directory you are searching is stored on an HDD. Hard drives are pretty much single-threaded access units.
But if you really want to do parallel grep, then this website gives two hints of how to do it with find and xargs. E.g.
find . -type f -print0 | xargs -0 -P 4 -n 40 grep -i foobar
The GNU parallel command is really useful for this.
sudo apt-get install parallel # if not available on debian based systems
Then the parallel man page provides an example:
EXAMPLE: Parallel grep
grep -r greps recursively through directories.
On multicore CPUs GNU parallel can often speed this up.
find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
This will run 1.5 job per core, and give 1000 arguments to grep.
In your case it could be:
find big_source_code_dir -type f | parallel -k -j150% -n 1000 -m grep -H -n pattern_str {}
Finally, the GNU parallel man page also provides a section describing the differences between xargs and the parallel command, which should help you understand why parallel seems better in your case:
DIFFERENCES BETWEEN xargs AND GNU Parallel
xargs offer some of the same possibilities as GNU parallel.
xargs deals badly with special characters (such as space, ' and "). To see the problem try this:
touch important_file
touch 'not important_file'
ls not* | xargs rm
mkdir -p "My brother's 12\" records"
ls | xargs rmdir
You can specify -0 or -d "\n", but many input generators are not optimized for using NUL as separator but are optimized for newline as separator. E.g head, tail, awk, ls, echo, sed, tar -v, perl (-0 and \0 instead of \n),
locate (requires using -0), find (requires using -print0), grep (requires user to use -z or -Z), sort (requires using -z).
So GNU parallel's newline separation can be emulated with:
cat | xargs -d "\n" -n1 command
xargs can run a given number of jobs in parallel, but has no support for running number-of-cpu-cores jobs in parallel.
xargs has no support for grouping the output, therefore output may run together, e.g. the first half of a line is from one process and the last half of the line is from another process. The example Parallel grep cannot be
done reliably with xargs because of this.
...
Note that you need to escape special characters in your parallel grep search term, for example:
parallel --pipe --block 10M --ungroup LC_ALL=C grep -F 'PostTypeId=\"1\"' < ~/Downloads/Posts.xml > questions.xml
Using standalone grep, grep -F 'PostTypeId="1"' would work without escaping the double quotes. It took me a while to figure that out!
Also note the use of LC_ALL=C and the -F flag (if you're just searching full strings) for additional speed-ups.
Here are 3 ways to do it, but you can't get line numbers for two of them.
(1) Run grep on multiple files in parallel, in this case all files in a directory and its subdirectories. Add /dev/null to force grep to prepend the filename to the matching line, because you're gonna want to know what file matched. Adjust the number of process -P for your machine.
find . -type f | xargs -n 1 -P 4 grep -n <grep-args> /dev/null
(2) Run grep on multiple files in serial but process 10M blocks in parallel. Adjust the block size for your machine and files. Here are two ways to do that.
# for-loop
for filename in `find . -type f`
do
parallel --pipepart --block 10M -a $filename -k "grep <grep-args> | awk -v OFS=: '{print \"$filename\",\$0}'"
done
# using xargs
find . -type f | xargs -I filename parallel --pipepart --block 10M -a filename -k "grep <grep-args> | awk -v OFS=: '{print \"filename\",\$0}'"
(3) Combine (1) and (2): run grep on multiple files in parallel and process their contents in blocks in parallel. Adjust block size and xargs parallelism for your machine.
find . -type f | xargs -n 1 -P 4 -I filename parallel --pipepart --block 10M -a filename -k "grep <grep-args> | awk -v OFS=: '{print \"filename\",\$0}'"
Beware that (3) may not be the best use of resources.
I've got a longer write-up, but that's the basic idea.

How to obtain the number of CPUs/cores in Linux from the command line?

I have this script, but I do not know how to get the last element in the printout:
cat /proc/cpuinfo | awk '/^processor/{print $3}'
The last element should be the number of CPUs, minus 1.
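For reference, one way (a sketch) to take the last processor index from that awk printout and turn it into a count is:
awk '/^processor/{n=$3} END{print n+1}' /proc/cpuinfo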
grep -c ^processor /proc/cpuinfo
will count the number of lines starting with "processor" in /proc/cpuinfo
For systems with hyper-threading, you can use
grep ^cpu\\scores /proc/cpuinfo | uniq | awk '{print $4}'
which should return (for example) 8 (whereas the command above would return 16)
Processing the contents of /proc/cpuinfo is needlessly baroque. Use nproc which is part of coreutils, so it should be available on most Linux installs.
Command nproc prints the number of processing units available to the current process, which may be less than the number of online processors.
To find the number of all installed cores/processors use nproc --all
On my 8-core machine:
$ nproc --all
8
The most portable solution I have found is the getconf command:
getconf _NPROCESSORS_ONLN
This works on both Linux and Mac OS X. Another benefit of this over some of the other approaches is that getconf has been around for a long time. Some of the older Linux machines I have to do development on don't have the nproc or lscpu commands available, but they have getconf.
Editor's note: While the getconf utility is POSIX-mandated, the specific _NPROCESSORS_ONLN and _NPROCESSORS_CONF values are not.
That said, as stated, they work on Linux platforms as well as on macOS; on FreeBSD/PC-BSD, you must omit the leading _.
Preface:
The problem with the /proc/cpuinfo-based answers is that they parse information that was meant for human consumption and thus lacks a stable format designed for machine parsing: the output format can differ across platforms and runtime conditions; using lscpu -p on Linux (and sysctl on macOS) bypasses that problem.
getconf _NPROCESSORS_ONLN / getconf NPROCESSORS_ONLN doesn't distinguish between logical and physical CPUs.
Here's a sh (POSIX-compliant) snippet that works on Linux and macOS for determining the number of - online - logical or physical CPUs; see the comments for details.
Uses lscpu for Linux, and sysctl for macOS.
Terminology note: CPU refers to the smallest processing unit as seen by the OS. Non-hyper-threading cores each correspond to 1 CPU, whereas hyper-threading cores contain more than 1 (typically: 2) - logical - CPU.
Linux uses the following taxonomy[1], starting with the smallest unit:
CPU < core < socket < book < node
with each level comprising 1 or more instances of the next lower level.
#!/bin/sh
# macOS: Use `sysctl -n hw.*cpu_max`, which returns the values of
# interest directly.
# CAVEAT: Using the "_max" key suffixes means that the *maximum*
# available number of CPUs is reported, whereas the
# current power-management mode could make *fewer* CPUs
# available; dropping the "_max" suffix would report the
# number of *currently* available ones; see [1] below.
#
# Linux: Parse output from `lscpu -p`, where each output line represents
# a distinct (logical) CPU.
# Note: Newer versions of `lscpu` support more flexible output
# formats, but we stick with the parseable legacy format
# generated by `-p` to support older distros, too.
# `-p` reports *online* CPUs only - i.e., on hot-pluggable
# systems, currently disabled (offline) CPUs are NOT
# reported.
# Number of LOGICAL CPUs (includes those reported by hyper-threading cores)
# Linux: Simply count the number of (non-comment) output lines from `lscpu -p`,
# which tells us the number of *logical* CPUs.
logicalCpuCount=$([ $(uname) = 'Darwin' ] &&
                  sysctl -n hw.logicalcpu_max ||
                  lscpu -p | egrep -v '^#' | wc -l)
# Number of PHYSICAL CPUs (cores).
# Linux: The 2nd column contains the core ID, with each core ID having 1 or
# - in the case of hyperthreading - more logical CPUs.
# Counting the *unique* cores across lines tells us the
# number of *physical* CPUs (cores).
physicalCpuCount=$([ $(uname) = 'Darwin' ] &&
                   sysctl -n hw.physicalcpu_max ||
                   lscpu -p | egrep -v '^#' | sort -u -t, -k 2,4 | wc -l)
# Print the values.
cat <<EOF
# of logical CPUs: $logicalCpuCount
# of physical CPUS: $physicalCpuCount
EOF
[1] macOS sysctl (3) documentation
Note that BSD-derived systems other than macOS - e.g., FreeBSD - only support the hw.ncpu key for sysctl, which is deprecated on macOS; I'm unclear on which of the new keys hw.ncpu corresponds to: hw.(logical|physical)cpu_[max].
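On FreeBSD, for instance, that query would presumably be:
sysctl -n hw.ncpu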
Tip of the hat to @teambob for helping to correct the physical-CPU-count lscpu command.
Caveat: lscpu -p output does NOT include a "book" column (the man page mentions "books" as an entity between socket and node in the taxonomic hierarchy). If "books" are in play on a given Linux system (does anybody know when and how?), the physical-CPU-count command may under-report (this is based on the assumption that lscpu reports IDs that are non-unique across higher-level entities; e.g.: 2 different cores from 2 different sockets could have the same ID).
If you save the code above as, say, shell script cpus, make it executable with chmod +x cpus and place it in a folder in your $PATH, you'll see output such as the following:
$ cpus
logical 4
physical 4
[1] Xaekai sheds light on what a book is: "a book is a module that houses a circuit board with CPU sockets, RAM sockets, IO connections along the edge, and a hook for cooling system integration. They are used in IBM mainframes. Further info: http://ewh.ieee.org/soc/cpmt/presentations/cpmt0810a.pdf"
lscpu gathers CPU architecture information from /proc/cpuinfo in human-readable format:
# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
CPU socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 15
Stepping: 7
CPU MHz: 1866.669
BogoMIPS: 3732.83
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
NUMA node0 CPU(s): 0-7
See also https://unix.stackexchange.com/questions/468766/understanding-output-of-lscpu.
You can also use Python! To get the number of physical cores:
$ python -c "import psutil; print(psutil.cpu_count(logical=False))"
4
To get the number of hyperthreaded cores:
$ python -c "import psutil; print(psutil.cpu_count(logical=True))"
8
Here's the way I use for counting the number of physical cores that are online on Linux:
lscpu --online --parse=Core,Socket | grep --invert-match '^#' | sort --unique | wc --lines
or in short:
lscpu -b -p=Core,Socket | grep -v '^#' | sort -u | wc -l
Example (1 socket):
> lscpu
...
CPU(s): 28
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 1
....
> lscpu -b -p=Core,Socket | grep -v '^#' | sort -u | wc -l
14
Example (2 sockets):
> lscpu
...
CPU(s): 56
Thread(s) per core: 2
Core(s) per socket: 14
Socket(s): 2
...
> lscpu -b -p=Core,Socket | grep -v '^#' | sort -u | wc -l
28
Example (4 sockets):
> lscpu
...
CPU(s): 64
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 4
...
> lscpu -b -p=Core,Socket | grep -v '^#' | sort -u | wc -l
32
For the total number of physical cores:
grep '^core id' /proc/cpuinfo |sort -u|wc -l
On multiple-socket machines (or always), multiply the above result by the number of sockets:
echo $(($(grep "^physical id" /proc/cpuinfo | awk '{print $4}' | sort -un | tail -1)+1))
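Putting the two together, a sketch of the full physical-core calculation could look like this:
cores_per_socket=$(grep '^core id' /proc/cpuinfo | sort -u | wc -l)
sockets=$(($(grep "^physical id" /proc/cpuinfo | awk '{print $4}' | sort -un | tail -1)+1))
echo $((cores_per_socket * sockets))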
@mklement0 has quite a nice answer using lscpu. I have written a more succinct version in the comments.
Using getconf is indeed the most portable way; however, the variable has different names on BSD and Linux, so you have to test both, as this gist suggests:
https://gist.github.com/jj1bdx/5746298
(also includes a Solaris fix using ksh)
I personally use:
$ getconf _NPROCESSORS_ONLN 2>/dev/null || getconf NPROCESSORS_ONLN 2>/dev/null || echo 1
And if you want this in Python, you can use the same sysconf value that getconf reads by importing the os module:
$ python -c 'import os; print(os.sysconf(os.sysconf_names["SC_NPROCESSORS_ONLN"]))'
As for nproc, it's part of GNU Coreutils, so it is not available on BSD by default. It uses sysconf() as well, after trying some other methods.
Cross-platform solution for Linux, macOS, and Windows:
CORES=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || sysctl -n hw.ncpu || echo "$NUMBER_OF_PROCESSORS")
If you want this to work on both Linux and OS X, you can do:
CORES=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || sysctl -n hw.ncpu)
It is very simple. Just use this command:
lscpu
You can use one of the following methods to determine the number of physical CPU cores.
Count the number of unique core ids (roughly equivalent to grep -P '^core id\t' /proc/cpuinfo | sort -u | wc -l).
awk '/^core id\t/ {cores[$NF]++} END {print length(cores)}' /proc/cpuinfo
Multiply the number of 'cores per socket' by the number of sockets.
lscpu | awk '/^Core\(s\) per socket:/ {cores=$NF}; /^Socket\(s\):/ {sockets=$NF}; END{print cores*sockets}'
Count the number of unique logical CPUs as used by the Linux kernel. The -p option generates output for easy parsing and is compatible with earlier versions of lscpu.
lscpu -p | awk -F, '$0 !~ /^#/ {cores[$1]++} END {print length(cores)}'
Just to reiterate what others have said, there are a number of related properties.
To determine the number of processors available:
getconf _NPROCESSORS_ONLN
grep -cP '^processor\t' /proc/cpuinfo
To determine the number of processing units available (not necessarily the same as the number of cores). This is hyperthreading-aware.
nproc
I don't want to go too far down the rabbit-hole, but you can also determine the number of configured processors (as opposed to simply available/online processors) via getconf _NPROCESSORS_CONF. To determine the total number of CPUs (offline and online) you'd want to parse the output of lscpu -ap.
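For example, a sketch of such a count (-a includes offline CPUs, and -p emits one non-comment line per CPU):
lscpu -a -p | grep -cv '^#'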
The above answers are applicable to most situations, but if you are in a Docker container environment and your container is limited by CpusetCpus, then you can't actually get the real CPU core count with the methods above.
In this case, you need to do this to get the real CPU cores:
grep -c 'cpu[0-9]' /proc/stat
I also thought cat /proc/cpuinfo would give me the correct answer; however, I recently saw that my ARM quad-core Cortex-A53 system only showed a single core. It seems that /proc/cpuinfo only shows the active cores, whereas:
cat /sys/devices/system/cpu/present
is a better measure of what's there. You can also
cat /sys/devices/system/cpu/online
to see which cores are online, and
cat /sys/devices/system/cpu/offline
to see which cores are offline. The online, offline, and present sysfs entries return the index of the CPUS, so a return value of 0 just means core 0, whereas a return value of 1-3 means cores 1,2, and 3.
See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu
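If you want a count rather than the raw range, the range syntax can be expanded; here is a rough awk sketch (assuming the usual "N" and "N-M" comma-separated forms):
awk -F, '{n=0; for(i=1;i<=NF;i++){split($i,a,"-"); n+=(a[2]==""?1:a[2]-a[1]+1)}; print n}' /sys/devices/system/cpu/present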
In case anybody was wondering, here is what the Python psutil.cpu_count(logical=False) call does on Linux in equivalent shell script:
cat /sys/devices/system/cpu/cpu[0-9]*/topology/core_cpus_list | sort -u | wc -l
And here’s a slightly longer version that falls back to the information from the deprecated thread_siblings_list file if core_cpus_list isn’t available (psutil has this fallback):
cat /sys/devices/system/cpu/cpu[0-9]*/topology/{core_cpus_list,thread_siblings_list} | sort -u | wc -l
The following should give you the number of "real" cores on both a hyperthreaded and non-hyperthreaded system. At least it worked in all my tests.
awk -F: '/^physical/ && !ID[$2] { P++; ID[$2]=1 }; /^cpu cores/ { CORES=$2 }; END { print CORES*P }' /proc/cpuinfo
Not my web page, but this command from http://www.ixbrian.com/blog/?p=64&cm_mc_uid=89402252817914508279022&cm_mc_sid_50200000=1450827902 works nicely for me on CentOS. It will show actual CPUs even when hyperthreading is enabled.
cat /proc/cpuinfo | egrep "core id|physical id" | tr -d "\n" | sed s/physical/\\nphysical/g | grep -v ^$ | sort | uniq | wc -l
Count "core id" per "physical id" method using awk with fall-back on "processor" count if "core id" are not available (like raspberry)
echo $(awk '{ if ($0~/^physical id/) { p=$NF }; if ($0~/^core id/) { cores[p$NF]=p$NF }; if ($0~/processor/) { cpu++ } } END { for (key in cores) { n++ } } END { if (n) {print n} else {print cpu} }' /proc/cpuinfo)
cat /proc/cpuinfo | grep processor
This worked fine. When I tried the first answer, I got 3 CPUs as the output. I know that I have 4 CPUs on the system, so I just did a grep for processor and the output looked like this:
[root@theservername ~]# cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
processor : 2
processor : 3
If it's okay to use Python, then the numexpr module has a function for this:
In [5]: import numexpr as ne
In [6]: ne.detect_number_of_cores()
Out[6]: 8
also this:
In [7]: ne.ncores
Out[7]: 8
To query this information from the command prompt use:
# runs whatever valid Python code given as a string with `-c` option
$ python -c "import numexpr as ne; print(ne.ncores)"
8
Or, more simply, it is possible to get this info from the multiprocessing.cpu_count() function:
$ python -c "import multiprocessing; print(multiprocessing.cpu_count())"
Or, even more simply, use os.cpu_count():
$ python -c "import os; print(os.cpu_count())"
Use the query below to get core details:
[oracle@orahost](TESTDB)$ grep -c ^processor /proc/cpuinfo
8
If you just want to count physical cores, this command did it for me.
lscpu -e | tail -n +2 | tr -s " " | cut -d " " -f 4 | sort | uniq | wc -w
Pretty basic, but seems to count actual physical cores, ignoring the logical count
Fravadona's answer is awesome and correct, but it requires the presence of lscpu. Since it is not present on the system where I need the number of physical cores, I tried to come up with one that relies only on /proc/cpuinfo.
cat /proc/cpuinfo | grep -B2 'core id' | sed 's/siblings.*/'/ | tr -d '[:space:]' | sed 's/--/\n/'g | sort -u | wc -l
It works perfectly, but unfortunately it isn't as robust as Fravadona's, since it will break if
the name or order of the fields inside /proc/cpuinfo changes
grep changes the line separator it inserts (currently --) to some other string.
BUT, other than that, it works flawlessly :)
Here is a quick explanation of everything that is happening
grep -B2 'core id'
get only the lines we are interested (i.e "core id" and the 2 preceding lines)
sed 's/siblings.*/'/
remove the "siblings..." line
tr -d '[:space:]'
delete all whitespace characters
sed 's/--/\n/'g
replace the '--' char, which was inserted by grep, by a line break
sort -u
group by "physical id,core id"
wc -l
count the number of lines
Being a total newbie, I was very pleased with myself when this worked. I never thought I would be able to join the required lines together to group by "physical id" and "core id". It is kind of hacky, but it works.
If any guru knows a way to simplify this mess, please let me know.
Most answers in this thread pertain to logical cores.
Using bash on Ubuntu 18.x, I find this works well to determine the number of physical CPUs:
numcpu="$(lscpu | grep -i 'socket(s)' | awk '{print $(2)}')"
It should work on most Linux distros.
One more answer among the numerous previous ones. It is possible to use cgroups when they are available. The cpuset subsystem provides the list of active CPUs, which can be read from the topmost cgroup of the hierarchy in /sys/fs/cgroup. For example:
$ cat /sys/fs/cgroup/cpuset/cpuset.effective_cpus
0-3
Then the latter needs to be parsed to get the number of active CPUs. The content of this file is a comma-separated list of CPU ranges and single CPU numbers.
Here is an example using tr to break the list into single expressions and using sed to translate the intervals into arithmetic operations passed to expr:
#!/bin/sh
# For test purposes, the CPU sets are passed as parameters
#cpuset=`cat /sys/fs/cgroup/cpuset/cpuset.effective_cpus`
cpuset=$1
ncpu=0
for e in `echo $cpuset | tr ',' ' '`
do
case $e in
# CPU interval ==> Make an arithmetic operation
*-*) op=`echo $e | sed -E 's/([0-9]+)-([0-9]+)/\2 - \1 + 1/'`;;
# Single CPU number
*) op=1;;
esac
ncpu=`expr $ncpu + $op`
done
echo $ncpu
Here are some examples of executions with several flavors of CPU sets:
$ for cpuset in "0" "0,3" "0-3" "0-3,67" "0-3,67,70-75" "0,1-3,67,70-75"
> do
> ncpu.sh $cpuset
> done
1
2
4
5
11
11
dmidecode | grep -i cpu | grep Version
gives me
Version: Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
Version: Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz
This shows the correct socket count; looking up the E5-2667 tells me each socket has 8 cores, so multiplying gives 16 cores across 2 sockets.
Whereas lscpu gives me 20 CPUs, which is totally incorrect; I'm not sure why (the same goes for cat /proc/cpu, which also ends up with 20).
Python 3 also provides a few simple ways to get it:
$ python3 -c "import os; print(os.cpu_count());"
4
$ python3 -c "import multiprocessing; print(multiprocessing.cpu_count())"
4
Summary:
To get the physical CPUs, do this:
grep 'core id' /proc/cpuinfo | sort -u
To get physical and logical CPUs, do this:
grep -c ^processor /proc/cpuinfo
/proc << this is the golden source of any info you need about processes and
/proc/cpuinfo << is the golden source of any CPU information.
Quicker, without forking
This works with almost any shell.
ncore=0
while read line ;do
[ "$line" ] && [ -z "${line%processor*}" ] && ncore=$((ncore+1))
done </proc/cpuinfo
echo $ncore
4
In order to stay compatible with plain POSIX sh, dash, busybox, and others, I've used ncore=$((ncore+1)) instead of ((ncore++)).
Bash version:
ncore=0
while read -a line ;do
[ "$line" = "processor" ] && ((ncore++))
done </proc/cpuinfo
echo $ncore
4
