for and start commands in a batch for parallel and sequential work - multithreading

I have an 8-core CPU with 8 GB of RAM, and I'm writing a batch file that automates the 7-Zip CLI, running through most combinations of parameters and values to compress the same set of files. The goal is to find the combination that produces the smallest possible archive.
This is very time-consuming by nature, especially when the set of files is gigabytes in size. I need a way not only to automate but also to speed up the whole process.
7-Zip supports different compression algorithms. Some are single-threaded only and some are multi-threaded; some need little memory and some need huge amounts, which can even exceed the 8 GB limit. I've already created an automated batch file that runs sequentially and excludes combinations requiring more than 8 GB of memory.
I've split the different compression algorithms into several batch files to simplify the whole process. For example, compression with PPMd into a 7z archive uses 1 thread and up to 1024 MB. This is my current batch:
@echo off
echo mem=1m 2m 3m 4m 6m 8m 12m 16m 24m 32m 48m 64m 96m 128m 192m 256m 384m 512m 768m 1024m
echo o=2 3 4 5 6 7 8 10 12 14 16 20 24 28 32
echo s=off 1m 2m 4m 8m 16m 32m 64m 128m 256m 512m 1g 2g 4g 8g 16g 32g 64g on
echo x=1 3 5 7 9
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
exit
x, s, o and mem are parameters, and what follows each of them are the values that 7z.exe will work with. x and s are of no concern in this case; they set the compression strength and the solid block size for the archive.
That batch works fine, but it is limited to running only 1 instance of 7z.exe at a time. I'm now looking for a way to run more 7z.exe instances in parallel without exceeding 8 GB of RAM or 8 threads at once, whichever comes first, before proceeding to the next ones in the sequence.
How can I improve this? I have some ideas but I don't know how to make them work in a batch. I was thinking of 2 other variables that won't interact with the 7z processes but would control when the next 7z instance starts. One variable would keep track of how many threads are currently in use and another would track how much memory is in use. Could that work?
Edit:
Sorry, I need to add details, I'm new to this posting style. Following this answer - https://stackoverflow.com/a/19481253/2896127 - I mentioned 8 batches were created and that 7z.PPMd batch was one of them. Maybe listing all the batches and how 7z deals with the parameters will give a better insight on the whole issue. I'll start with the simple ones:
7z.PPMd - 1 fully utilized thread and dictionary dependent 32m-1055m memory usage per instance.
7z.BZip2 - 8 fully utilized threads and fixed 109m memory usage per instance.
zip.Bzip2 - 8 partially utilized threads and fixed 336m memory usage per instance.
zip.Deflate - 8 partially utilized threads and fixed 260m memory usage per instance.
zip.PPMd - 8 partially utilized threads and dictionary dependent 280m-2320m memory usage per instance.
What I mean by partially utilized threads is that, while I assign 8 threads to each 7z.exe instance, the algorithm's CPU usage varies unpredictably and is out of my control, but the limit is still there: no more than 8 threads. In the case of 8 fully utilized threads, it means that on my 8-core CPU each instance uses 100% of the CPU.
The most complex ones - 7z.LZMA, 7z.LZMA2, zip.LZMA - will need to be explained in detail but I am running short on time now. I'll be back to edit the LZMA part whenever I have more free time.
Thanks again.
EDIT: Adding in LZMA part.
7z.LZMA - each instance is n-threaded, ranging from 1 to 2:
  1 fully utilized thread, dictionary dependent, 64k to 512m:
    64k dictionary uses 32m memory
    ...
    512m dictionary uses 5407m memory
    excluded range: 768m to 1024m (above the limit of 8192m memory available)
  2 partially utilized threads, dictionary dependent, 64k to 512m:
    64k dictionary uses 38m memory
    ...
    512m dictionary uses 5413m memory
    excluded range: 768m to 1024m (above the limit of 8192m memory available)
7z.LZMA2 - each instance is n-threaded, ranging from 1 to 8:
  1 fully utilized thread, dictionary dependent, 64k to 512m:
    64k dictionary uses 32m memory
    ...
    512m dictionary uses 5407m memory
    excluded range: 768m to 1024m (above the limit of 8192m memory available)
  2 or 3 partially utilized threads, dictionary dependent, 64k to 512m:
    64k dictionary uses 38m memory
    ...
    512m dictionary uses 5413m memory
    excluded range: 768m to 1024m (above the limit of 8192m memory available)
  4 or 5 partially utilized threads, dictionary dependent, 64k to 256m:
    64k dictionary uses 51m memory
    ...
    256m dictionary uses 5677m memory
    excluded range: 384m to 1024m (above the limit of 8192m memory available)
  6 or 7 partially utilized threads, dictionary dependent, 64k to 192m:
    64k dictionary uses 62m memory
    ...
    192m dictionary uses 6965m memory
    excluded range: 256m to 1024m (above the limit of 8192m memory available)
  8 partially utilized threads, dictionary dependent, 64k to 128m:
    64k dictionary uses 72m memory
    ...
    128m dictionary uses 6717m memory
    excluded range: 192m to 1024m (above the limit of 8192m memory available)
zip.LZMA - each instance is n-threaded, ranging from 1 to 8:
  1 fully utilized thread, dictionary dependent, 64k to 512m:
    64k dictionary uses 3m memory
    ...
    512m dictionary uses 5378m memory
    excluded range: 768m to 1024m (above the limit of 8192m memory available)
  2 or 3 partially utilized threads, dictionary dependent, 64k to 512m:
    64k dictionary uses 9m memory
    ...
    512m dictionary uses 5384m memory
    excluded range: 768m to 1024m (above the limit of 8192m memory available)
  4 or 5 partially utilized threads, dictionary dependent, 64k to 256m:
    64k dictionary uses 82m memory
    ...
    256m dictionary uses 5456m memory
    excluded range: 384m to 1024m (above the limit of 8192m memory available)
  6 or 7 partially utilized threads, dictionary dependent, 64k to 256m:
    64k dictionary uses 123m memory
    ...
    256m dictionary uses 8184m (very close to the limit though, I may consider excluding it)
    excluded range: 384m to 1024m (above the limit of 8192m memory available)
  8 partially utilized threads, dictionary dependent, 64k to 128m:
    64k dictionary uses 164m memory
    ...
    128m dictionary uses 5536m memory
    excluded range: 192m to 1024m (above the limit of 8192m memory available)
I'm trying to understand the behaviour of the commands with nul in them. I don't quite understand what's happening during that part, what those symbols ^ > ^&1 "" are meant to say.
2>nul del %lock%!nextProc!
%= Redirect the lock handle to the lock file. The CMD process will =%
%= maintain an exclusive lock on the lock file until the process ends. =%
start /b "" cmd /c %lockHandle%^>"%lock%!nextProc!" 2^>^&1 !cpu%%N! !cmd!
)
set "launch="
Then later on, at the :wait code:
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
EDIT 2 (29-10-2013): Adding the current point of the situation.
After trial and error research, complemented with step by step notes of what's happening, I was able to understand the behaviour above. I simplified the line with start command to this:
start /b /low cmd /c !cmd!>"%lock%!nextProc!"
Though it works, I still don't understand the meaning of 1^>"filename" 2^>^&1 'command'. I know it is related to writing into the file the text that would otherwise be displayed to me. In this case, all of the 7z.exe output ends up in the file. Until the 7z.exe instance finishes its job, nothing is written to the file; the file already exists, yet to the rest of the script it behaves as if it did not. When 7z.exe actually finishes, the file is finalized and from then on it exists for the next part of the script.
Now I can understand the processing behaviour of the suggested script and I'm complementing it with something of my own - I am trying to implement all batches into "one batch do it all" script. In the simplified version, this is it:
echo 8 threads - maxproc=1
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=BZip2:d=%%d:mt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (900k) DO 7z.exe a teste.resultado\%%xx.bzip2.%%tt.%%dd.zip .\teste.original\* -mx=%%x -mm=BZip2:d=%%d -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%w IN (257 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate64.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate64:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%w IN (258 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.deflate.%%tt.%%ww.zip .\teste.original\* -mx=%%x -mm=deflate:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (8) DO for %%d IN (256m 128m 64m 32m 16m 8m 4m 2m 1m) DO for %%w IN (16 15 14 13 12 11 10 9 8 7 6 5 4 3 2) DO 7z.exe a teste.resultado\%%xx.ppmd.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=PPMd:mem=%%d:o=%%w -mmt=%%t
echo 4 threads - maxproc=2
for %%x IN (9) DO for %%t IN (4) DO for %%d IN (256m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
echo 2 threads - maxproc=4
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (2) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
echo 1 threads - maxproc=8
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO 7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t
In short, I want to process all that in the most efficient manner possible. Doing it by deciding how many processes can run at a time would be a way, but then there's also the memory required for each process, so that the sum of all required memory by those processes won't exceed 8192 MB. I got this part working.
@echo off
setlocal enableDelayedExpansion
set "maxMem=8192"
set "maxThreads=8"
:cycle1
set "cycleCount=4"
set "cycleThreads=1"
set "maxProc="
set /a "maxProc=maxThreads/cycleThreads"
set "cycleFor1=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
set "cycleFor2=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO ("
set "cycleFor3=for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO ("
set "cycleFor4=for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO ("
set "cycleCmd1=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
set "cycleCmd2=7z.exe a teste.resultado\%%xx.lzma2.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=lzma2:d=%%d:fb=%%w -mmt=%%t"
set "cycleCmd3=7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"
set "cycleCmd4=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.zip .\teste.original\* -mx=%%x -mm=lzma:d=%%d:fb=%%w -mmt=%%t"
set "tempMem1=5407"
set "tempMem2=5407"
set "tempMem3=1055"
set "tempMem4=5378"
rem set "tempMem1=5407"
rem set "tempMem2=5407"
rem set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
rem set "tempMem4=5378"
set "memSum=0"
if not defined memRem set "memRem=!maxMem!"
for /l %%N in (1 1 %cycleCount%) DO (set "tempProc%%N=")
for /l %%N in (1 1 %cycleCount%) DO (
set memRem
set /a "tempProc%%N=%memRem%/tempMem%%N"
set /a "memSum+=tempMem%%N"
set /a "memRem-=tempMem%%N"
set /a "maxProc=!tempProc%%N!"
call :executeCycle
set /a "memRem+=tempMem%%N"
set /a "memSum-=tempMem%%N"
set /a "maxProc-=!tempProc%%N!"
)
goto :fim
:executeCycle
set "lock=lock_%random%_"
set /a "startCount=0, endCount=0"
for /l %%N in (1 1 %maxProc%) DO set "endProc%%N="
set launch=1
for %%x IN (9) DO for %%t IN (1) DO for %%d IN (512m) DO for %%w IN (273 256 192 128 96 64 48 32 24 16 12 8) DO for %%s IN (on) DO (
set "cmd=7z.exe a teste.resultado\%%xx.lzma.%%tt.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -ms=%%s -m0=LZMA:d=%%d:fb=%%w -mmt=%%t"
if !startCount! lss %maxProc% (
set /a "startCount+=1, nextProc=startCount"
) else (
call :wait
)
set cmd!nextProc!=!cmd!
echo !time! - proc!nextProc!: starting !cmd!
2>nul del %lock%!nextProc!
start /b /low cmd /c !cmd!>"%lock%!nextProc!"
)
set "launch="
:wait
for /l %%N in (1 1 %startCount%) do (
if not defined endProc%%N if exist "%lock%%%N" (
echo !time! - proc%%N: finished !cmd%%N!
if defined launch (
set nextProc=%%N
exit /b
)
set /a "endCount+=1, endProc%%N=1"
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
echo ===
echo Thats all folks!
exit /b
:fim
pause
I have trouble with cycleFor1 and cycleCmd1 in the :cycle1 part. They should replace the for line and the first cmd variable inside :executeCycle so that the script works as I intend. How do I do that?
The other issue I have is with tempMem3. I have logged the memory required while the cycleCmd3 command is running. It is dictionary dependent. tempMem3 and cycleCmd3 are related like this:
for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO
set "tempMem3=1055 799 543 415 287 223 159 127 95 79 63 55 47 43 39 37 35 34 33 32"
So 1024m would use 1055, 768m would use 799, and so on till 1m using 32. I don't know how to translate that into the script.
Any help is appreciated.

I've already posted a robust batch solution that limits the number of parallel processes at Parallel execution of shell processes. That script uses a list of commands that is embedded within the script. Follow the link to see how it works.
I modified that script to generate the commands using FOR loops as per your question. I also set the limit to 8 simultaneous processes.
Your maximum memory is 1g, and you never have more than 8 processes, so I don't see how you could ever exceed 8g. If you increase the maximum memory per process, then you do have to worry about total memory. You will have to add additional logic to keep track of how much memory is being used, and which cpu IDs are available. Note that batch arithmetic is limited to signed 32-bit integers (~2g), so I recommend computing memory used in megabytes.
By default, the script hides the output of the commands. If you want to see the output, then run it with the /O option.
@echo off
setlocal enableDelayedExpansion
:: Display the output of each process if the /O option is used
:: else ignore the output of each process
if /i "%~1" equ "/O" (
set "lockHandle=1"
set "showOutput=1"
) else (
set "lockHandle=1^>nul 9"
set "showOutput="
)
:: Define the maximum number of parallel processes to run.
:: Each process number can optionally be assigned to a particular server
:: and/or cpu via psexec specs (untested).
set "maxProc=8"
:: Optional - Define CPU targets in terms of PSEXEC specs
:: (everything but the command)
::
:: If a cpu is not defined for a proc, then it will be run on the local machine.
:: I haven't tested this feature, but it seems like it should work.
::
:: set cpu1=psexec \\server1 ...
:: set cpu2=psexec \\server1 ...
:: set cpu3=psexec \\server2 ...
:: etc.
:: For this demo force all cpu specs to undefined (local machine)
for /l %%N in (1 1 %maxProc%) do set "cpu%%N="
:: Get a unique base lock name for this particular instantiation.
:: Incorporate a timestamp from WMIC if possible, but don't fail if
:: WMIC not available. Also incorporate a random number.
set "lock="
for /f "skip=1 delims=-+ " %%T in ('2^>nul wmic os get localdatetime') do (
set "lock=%%T"
goto :break
)
:break
set "lock=%temp%\lock%lock%_%random%_"
:: Initialize the counters
set /a "startCount=0, endCount=0"
:: Clear any existing end flags
for /l %%N in (1 1 %maxProc%) do set "endProc%%N="
:: Launch the commands in a loop
set launch=1
echo mem=1m 2m 3m 4m 6m 8m 12m 16m 24m 32m 48m 64m 96m 128m 192m 256m 384m 512m 768m 1024m
echo o=2 3 4 5 6 7 8 10 12 14 16 20 24 28 32
echo s=off 1m 2m 4m 8m 16m 32m 64m 128m 256m 512m 1g 2g 4g 8g 16g 32g 64g on
echo x=1 3 5 7 9
for %%x IN (9) DO for %%d IN (1024m 768m 512m 384m 256m 192m 128m 96m 64m 48m 32m 24m 16m 12m 8m 6m 4m 3m 2m 1m) DO for %%w IN (32 28 24 20 16 14 12 10 8 7 6 5 4 3 2) DO for %%s IN (on) DO (
set "cmd=7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"
if !startCount! lss %maxProc% (
set /a "startCount+=1, nextProc=startCount"
) else (
call :wait
)
set cmd!nextProc!=!cmd!
if defined showOutput echo -------------------------------------------------------------------------------
echo !time! - proc!nextProc!: starting !cmd!
2>nul del %lock%!nextProc!
%= Redirect the lock handle to the lock file. The CMD process will =%
%= maintain an exclusive lock on the lock file until the process ends. =%
start /b "" cmd /c %lockHandle%^>"%lock%!nextProc!" 2^>^&1 !cpu%%N! !cmd!
)
set "launch="
:wait
:: Wait for procs to finish in a loop
:: If still launching then return as soon as a proc ends
:: else wait for all procs to finish
:: redirect stderr to null to suppress any error message if redirection
:: within the loop fails.
for /l %%N in (1 1 %startCount%) do (
%= Redirect an unused file handle to the lock file. If the process is =%
%= still running then redirection will fail and the IF body will not run =%
if not defined endProc%%N if exist "%lock%%%N" (
%= Made it inside the IF body so the process must have finished =%
if defined showOutput echo ===============================================================================
echo !time! - proc%%N: finished !cmd%%N!
if defined showOutput type "%lock%%%N"
if defined launch (
set nextProc=%%N
exit /b
)
set /a "endCount+=1, endProc%%N=1"
) 9>>"%lock%%%N"
) 2>nul
if %endCount% lss %startCount% (
1>nul 2>nul ping /n 2 ::1
goto :wait
)
2>nul del %lock%*
if defined showOutput echo ===============================================================================
echo Thats all folks!

To execute at the same time no more than 8 instances of 7z.exe process you could do this:
@Echo OFF & Setlocal EnableDelayedExpansion
Set /A "pCount=0" & REm Process count
For
...
) DO (
Set /A "pCount+=1"
If !pCount! LEQ 8 (
Start /B 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s
)
)
...
If you want to run each process in a new CMD window instead, replace the Start /B line in my code with this:
CMD /C "Start /w 7z.exe a teste.resultado\%%xx.ppmd.%%dd.%%ww.%%ss.7z .\teste.original\* -mx=%%x -m0=PPMd:mem=%%d:o=%%w -ms=%%s"

Related

Handle EOF with multi-line command in Dockerfile

I'm trying to use sfdisk to create an image file inside a docker container and I can use below command without any problem:
root@c8e9be2eb26f:/# sfdisk bbb_image.img << EOF
> 1M,48M,0xE,*
> ,,,-
> EOF
Checking that no-one is using this disk right now ... OK
Disk bbb_image.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Created a new DOS disklabel with disk identifier 0x34b8e793.
bbb_image.img1: Created a new partition 1 of type 'W95 FAT16 (LBA)' and of size 48 MiB.
bbb_image.img2: Created a new partition 2 of type 'Linux' and of size 975 MiB.
bbb_image.img3: Done.
New situation:
Disklabel type: dos
Disk identifier: 0x34b8e793
Device Boot Start End Sectors Size Id Type
bbb_image.img1 * 2048 100351 98304 48M e W95 FAT16 (LBA)
bbb_image.img2 100352 2097151 1996800 975M 83 Linux
The partition table has been altered.
Syncing disks.
In my Dockerfile, however, the same command doesn't seem to work and produces incomplete results:
RUN sfdisk bbb_image.img << "EOF\n\
1M,48M,0xE,*\n\
,,,-\n\
EOF"
It produces this output in the console, which is wrong:
Step 4/4 : RUN sfdisk bbb_image.img << "EOF\n1M,48M,0xE,*\n,,,-\nEOF\n"
---> Running in afc86ffef92a
Checking that no-one is using this disk right now ... OK
Disk bbb_image.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
>>> Done.
New situation:
ERROR: Service 'test' failed to build: The command '/bin/sh -c sfdisk bbb_image.img << "EOF\n1M,48M,0xE,*\n,,,-\nEOF\n"' returned a non-zero code: 1
I'm not sure how to handle the EOF in the Dockerfile.
Maybe try this one:
RUN sfdisk bbb_image.img << "\n1M,48M,0xE,*\n,,,-\n"
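The build log above shows that the RUN line reaches /bin/sh -c as one line with literal \n sequences, so the here-document never sees real newlines. One possible workaround, sketched here as an assumption rather than a confirmed fix for this image, is to generate the two partition lines with printf and pipe them into sfdisk. The helper name partition_spec is hypothetical and only exists so the generated input can be inspected without running sfdisk.

```shell
#!/bin/sh
# Emit the two sfdisk input lines, each terminated by a real newline.
partition_spec() {
    printf '%s\n' '1M,48M,0xE,*' ',,,-'
}

# In a Dockerfile the same idea fits on a single RUN line, for example:
#   RUN printf '%s\n' '1M,48M,0xE,*' ',,,-' | sfdisk bbb_image.img
# Here the spec is only printed so it can be checked by eye.
partition_spec
```

Because printf writes the newlines itself, nothing depends on how the Dockerfile parser joins continued lines.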

Run program on boot with initramfs

I'm running uClinux on a SmartFusion2 as part of a University team building a small cube satellite. However, I'm not super experienced in Linux kernel, and this issue has had me stumped for a few days. I'm trying to get the SmartFusion to run a program on bootup. Currently, the only .uImage that does this is the test 'hello' file. I'm trying to recreate the process for another program, but am running into some difficulties.
In my hello directory I have the following files: hello.busybox, hello.kernel.M2S, help.txt, hello.uImage, Makefile, hello.initramfs, hello (directory)
In the hello subdirectory (projects/hello/hello):
hello (executable), hello.c, hello.gdb, hello.h, hello.o, Makefile
To try to get the uImage to boot and run a different program, I made a copy of my projects/hello/hello directory and renamed it 'goodbye', with a few minor changes in the .h and .c files for testing purposes. Now I'm trying to get the executable 'hello' in projects/hello/goodbye to run on boot.
My initramfs file originally looked like this:
# This is a very simple, default initramfs
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
nod /dev/tty 0666 0 0 c 5 0
nod /dev/null 0600 0 0 c 1 3
nod /dev/mem 0600 0 0 c 1 1
nod /dev/kmem 0600 0 0 c 1 2
nod /dev/zero 0600 0 0 c 1 5
nod /dev/random 0600 0 0 c 1 8
nod /dev/urandom 0600 0 0 c 1 9
dir /dev/pts 0755 0 0
nod /dev/ptmx 0666 0 0 c 5 2
nod /dev/ttyS0 0666 0 0 c 4 64
nod /dev/ttyS1 0666 0 0 c 4 65
nod /dev/ttyS2 0666 0 0 c 4 66
nod /dev/ttyS3 0666 0 0 c 4 67
nod /dev/ttyS4 0666 0 0 c 4 68
nod /dev/ttyS5 0666 0 0 c 4 69
dir /bin 755 0 0
dir /proc 755 0 0
file /bin/hello ${INSTALL_ROOT}/projects/${SAMPLE}/hello/hello 755 0 0
slink /bin/init hello 777 0 0
I changed the last two lines of the initramfs to read as follows:
file /bin/hello ${INSTALL_ROOT}/projects/${SAMPLE}/hello/goodbye 755 0 0
slink /bin/init hello 777 0 0
But when I try to boot the SmartFusion2 after remaking the uImage, I get this, with the error at the bottom:
Starting kernel ...
Linux version 2.6.33-arm1 (ecenstudent@EE10308) (gcc version 4.4.1 (Sourcery G++ Lite 2010q1-189) ) #38 Thu May 25 09:09:08 MDT 2017
CPU: ARMv7-M Processor [412fc231] revision 1 (ARMv7M)
CPU: NO data cache, 8K instruction cache
Machine: Microsemi M2S
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 16256
Kernel command line: m2s_platform=m2s-fg484-som console=ttyS0,115200 panic=10 ip=10.2.118.102:10.2.118.101:192.168.0.1::m2s-fg484-som:eth0:off ethaddr=3C:FB:96:05:00:53
PID hash table entries: 256 (order: -2, 1024 bytes)
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 64MB = 64MB total
Memory: 64408k/64408k available, 1128k reserved, 0K highmem
Virtual kernel memory layout:
vector : 0x00000000 - 0x00001000 ( 4 kB)
fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
vmalloc : 0x00000000 - 0xffffffff (4095 MB)
lowmem : 0xa0000000 - 0xa4000000 ( 64 MB)
modules : 0xa0000000 - 0x01000000 (1552 MB)
.init : 0xa0008000 - 0xa0012000 ( 40 kB)
.text : 0xa0074bc0 - 0xa0083000 ( 58 kB)
.data : 0xa0084000 - 0xa008cce0 ( 36 kB)
Hierarchical RCU implementation.
NR_IRQS:83
Calibrating delay loop... 132.30 BogoMIPS (lpj=661504)
Mount-cache hash table entries: 512
Switching to clocksource mss_timer2
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
serial8250.0: ttyS0 at MMIO 0x40000000 (irq = 10) is a 16550A
console [ttyS0] enabled
serial8250.1: ttyS1 at MMIO 0x40010000 (irq = 11) is a 16550A
Freeing init memory: 40K
Kernel panic - not syncing: No init found. Try passing init= option to kernel.
Backtrace: no frame pointer
Rebooting in 10 seconds..
Can somebody help explain why this is happening and what I need to do to my initramfs to make it run the proper program on boot? Thanks!!
As it turns out, I was confused about how those two lines worked. When I finally figured it out, they looked like this:
file /bin/hello ${INSTALL_ROOT}/projects/${SAMPLE}/goodbye/hello 755 0 0
slink /bin/init hello 777 0 0
Then it worked as desired, and I was able to apply the same change to other uImages.

isolcpus - binding not working

Sorry, this is a duplicate question from ServerFault. I am posting this here since I didn't get any response there.
Question:
I am using isolcpus to isolate cores. I would like to bind specific threads to cores, but it is not working. The threads are moved to different cores after I bind them.
Cores 13, 14, and 15 are isolated:
$ cat /proc/cmdline
ro root=/dev/mapper/vg0-root rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg0/swap rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=137M@0M rd_NO_DM KEYBOARDTYPE=pc KEYTABLE=us rd_LVM_LV=vg0/root rhgb quiet audit=0 intel_idle.max_cstate=0 console=tty0 console=ttyS1,115200 printk.time=1 processor.max_cstate=1 idle=poll biosdevname=0 isolcpus=13-15
top -H -p `pgrep -u prusr12 Ser` -d 1 shows the output below. 5017 and 5018 should have been bound to 14 and 15, and 5014 and 5016 should have been on 13.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
5017 prusr12 20 0 1312m 1.1g 1.1g R 99.9 0.9 9:53.93 5 Server-3.10.
5018 prusr12 20 0 1312m 1.1g 1.1g R 99.9 0.9 10:08.88 7 Server-3.10.
5014 prusr12 20 0 1312m 1.1g 1.1g S 0.0 0.9 0:00.40 2 Server-3.10.
5016 prusr12 20 0 1312m 1.1g 1.1g S 0.0 0.9 0:01.04 4 Server-3.10.
The command line is this:
sg devuser "taskset -c 13 /releases/3.10.0/bin/Server-3.10.0 -n X -e DEV -p DEFAULT > /logs/ServerDevPR_DEFAULT.out 2>&1 &"
There are 4 threads in the process. I want the main thread to start on 13, hence taskset -c 13. Then two threads are spawned and I bind them to 14 and 15. I see that the threads were bound to 14 and 15, but then they were moved to other cores. pthread_setaffinity_np() is being used to bind the threads to cores.
Log after I bind the threads to 14 and 15:
CpuSet returned by pthread_getaffinity_np() contained:CPU 14
CpuSet returned by pthread_getaffinity_np() contained:CPU 15
System details:
$ uname -a
Linux host123 2.6.32-573.12.1.el6.x86_64 #1 SMP Mon Nov 23 12:55:32 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Stepping: 2
CPU MHz: 3199.847
BogoMIPS: 6399.06
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0-7
NUMA node1 CPU(s): 8-15
What could be going wrong? Thanks for your time.
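One way to cross-check the binding from outside the process is to read each thread's allowed CPU list and last-used CPU from /proc. This is a minimal diagnostic sketch, assuming a Linux /proc that exposes Cpus_allowed_list in the per-thread status file; the function name thread_affinity is hypothetical, and the sketch only reports state, it does not explain the migration.

```shell
#!/bin/sh
# Print, for every thread of a process, the CPUs it may run on and the
# CPU it last ran on (field 39 of /proc/PID/task/TID/stat).
thread_affinity() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        [ -r "$task/status" ] || continue
        tid=${task##*/}
        allowed=$(awk '/^Cpus_allowed_list:/ {print $2}' "$task/status")
        # Strip "pid (comm) " first, because comm may contain spaces;
        # the remainder starts at field 3, so field 39 is column 37.
        last_cpu=$(sed 's/^.*) //' "$task/stat" | awk '{print $37}')
        printf '%s allowed=%s last_cpu=%s\n' "$tid" "$allowed" "$last_cpu"
    done
}

# Example: inspect the current shell; replace $$ with the server's PID.
thread_affinity "$$"
```

If allowed still lists 14 or 15 while last_cpu shows another core, the two values were probably read for different thread IDs, so comparing them against the PIDs in the top output is a reasonable first step.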

What does "file system outputs" mean with time -v?

What is 'file system outputs' counting when using the Linux 'time' command with dd?
It doesn't equal dd 'count' (presumably the number of calls to fwrite?), nor the size of the output in 4096-byte pages (which should be 1024000 in this example).
An example:
> /usr/bin/time -v dd if=/dev/zero of=/tmp/dd.test bs=4M count=1000
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 4.94305 s, 849 MB/s
Command being timed: "dd if=/dev/zero of=/tmp/dd.test bs=4M count=1000"
User time (seconds): 0.00
System time (seconds): 4.72
Percent of CPU this job got: 95%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.94
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 5040
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1322
Voluntary context switches: 32
Involuntary context switches: 15
Swaps: 0
File system inputs: 240
File system outputs: 8192000
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
The command time is printing out values from the rusage struct (see getrusage(2)).
And according to the source:
/*
* We approximate number of blocks, because we account bytes only.
* A 'block' is 512 bytes
*/
static inline unsigned long task_io_get_oublock(const struct task_struct *p)
{
return p->ioac.write_bytes >> 9;
}
So (at least on Linux) "File system outputs" in time output is the total number of bytes written / 512.
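The figure in the example can be checked with the same shift the kernel uses. This is a small arithmetic sketch using the byte count that dd reported above; the variable names are only illustrative.

```shell
#!/bin/sh
# Bytes that dd reported as copied in the example run.
bytes_written=4194304000

# Same operation as task_io_get_oublock(): bytes >> 9, i.e. bytes / 512.
blocks=$(( bytes_written >> 9 ))

echo "File system outputs: $blocks"
```

The result is 8192000, which matches the "File system outputs" line in the time -v output.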

CPU and HDD information

I searched but I found nothing for my problem.
I would like to get, on the Linux command line, information about CPU usage and the local HDDs, formatted exactly like the examples below, for use in my program.
These examples are command line outputs on MS-Windows.
I hope it is possible on Linux, too.
Thank you
wmic logicaldisk where drivetype=3 get caption,freespace,size
Caption FreeSpace Size
C: 135314194432 255953203200
D: 126288519168 128033222656
E: 336546639872 1000194015232
F: 162184503296 1000194015232
wmic cpu get loadpercentage
LoadPercentage
4
You won't find anything exactly like the output you provided.
For disk space, the only option is df:
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 73216256 27988724 41485276 41% /
devtmpfs 8170164 0 8170164 0% /dev
tmpfs 8203680 544 8203136 1% /dev/shm
tmpfs 8203680 12004 8191676 1% /run
tmpfs 5120 4 5116 1% /run/lock
tmpfs 8203680 0 8203680 0% /sys/fs/cgroup
/dev/sdb1 482922 83939 374049 19% /boot
and for cpu you have many more options, e.g.
vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 11865304 149956 1474172 0 0 53 46 126 707 3 0 96 0 0
or top -b | head:
top - 21:48:43 up 54 min, 1 user, load average: 0.13, 0.17, 0.22
Tasks: 188 total, 1 running, 187 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.0 us, 0.4 sy, 0.1 ni, 96.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16407364 total, 11848936 free, 2888844 used, 1669584 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 13230972 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 40544 6440 3780 S 0.0 0.0 0:01.15 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
There is no command that gives you a load percentage of the CPU directly. It cannot be obtained from a single system call (neither on Linux nor on Windows). What you can get is the number of ticks spent so far in each state (user, nice, system, idle, iowait, irq, ...). You read those counters, read them again a certain amount of time later, and calculate the percentage from the difference. That is how all the commands that report CPU percentage work.
Here is a bash script that does this (create a file named, for example, cpu.sh, paste this code into it and execute it to see the results):
_estado()
{
cat /proc/stat | grep "cpu " | sed -e 's/  */:/g' -e 's/^cpu://'
}
_ticksconcretos()
{
echo $1 | cut -d ':' -f $2
}
while true ; do
INICIAL=$(_estado)
sleep 1
FINAL=$(_estado)
UsuarioI=$(_ticksconcretos $INICIAL 1)
UsuarioF=$(_ticksconcretos $FINAL 1)
NiceI=$(_ticksconcretos $INICIAL 2)
NiceF=$(_ticksconcretos $FINAL 2)
SistemaI=$(_ticksconcretos $INICIAL 3)
SistemaF=$(_ticksconcretos $FINAL 3)
idleI=$(_ticksconcretos $INICIAL 4)
idleF=$(_ticksconcretos $FINAL 4)
IOI=$(_ticksconcretos $INICIAL 5)
IOF=$(_ticksconcretos $FINAL 5)
IRQI=$(_ticksconcretos $INICIAL 6)
IRQF=$(_ticksconcretos $FINAL 6)
SOFTIRQI=$(_ticksconcretos $INICIAL 7)
SOFTIRQF=$(_ticksconcretos $FINAL 7)
STEALI=$(_ticksconcretos $INICIAL 8)
STEALF=$(_ticksconcretos $FINAL 8)
InactivoF=$(( $idleF + $IOF ))
InactivoI=$(( $idleI + $IOI ))
ActivoI=$(( $UsuarioI + $NiceI + $SistemaI + $IRQI + $SOFTIRQI + $STEALI ))
ActivoF=$(( $UsuarioF + $NiceF + $SistemaF + $IRQF + $SOFTIRQF + $STEALF ))
TOTALI=$(( $ActivoI + $InactivoI ))
TOTALF=$(( $ActivoF + $InactivoF ))
PORC=$(( ( ( ( $TOTALF - $TOTALI ) - ( $InactivoF - $InactivoI ) ) * 100 / ( $TOTALF - $TOTALI ) ) ))
clear
echo "CPU: $PORC %"
done
For the free space you could use something like this:
df -h -x tmpfs -x devtmpfs | awk -F " " '{print $1 " " $4 " " $2}'
which will give you this output:
Filesystem Free Size
/dev/sda1 16G 25G
/dev/sda5 46G 79G
/dev/sdb8 130G 423G
sda represents the first disk, sda1 the first partition, sda2 the second one, etc. You can add (or change to) $6 inside the print to get the mount points instead of the partitions, change the order, and so on.
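The wmic example in the question reports sizes in bytes, while df reports 1K blocks by default. The following is a minimal sketch, assuming df -P -k style input, that converts the Available and 1K-blocks columns to bytes and prints them under the question's column names. The function name df_bytes is hypothetical, and the sample rows are copied from the df output shown earlier.

```shell
#!/bin/sh
# Read `df -P -k` style lines on stdin and print
# "Caption FreeSpace Size" with both values converted to bytes.
df_bytes() {
    awk 'NR == 1 { print "Caption FreeSpace Size"; next }
         { printf "%s %.0f %.0f\n", $1, $4 * 1024, $2 * 1024 }'
}

# Sample input; on a live system with GNU df it could instead be:
#   df -P -k -x tmpfs -x devtmpfs | df_bytes
df_bytes <<'EOF'
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 73216256 27988724 41485276 41% /
/dev/sdb1 482922 83939 374049 19% /boot
EOF
```

The %.0f format is used instead of %d because some awk implementations clamp %d to 32-bit values, which byte counts for modern disks exceed.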

Resources