Replace numbers smaller than a certain threshold with zero - Linux

I have a large data sheet (see below) in .csv format. I want to replace every number, in each column and row, with zero if it is smaller than a certain value, let's say 0.1.
Could anyone give me a hand? Thanks a lot.
I guess it can be done with sed as in this example
BCC_ACR_CR BCC_ACR_CU BCC_ACR_FE BCC_ACR_MN BCC_ACR_MO
0.2826027 3.89E-12 0.58420346 2.23E-13 0.2105587
0.27986588 3.80E-12 0.58501168 2.27E-13 0.20890705
0.27986588 3.80E-12 0.58501168 2.27E-13 0.20890705
0.27986588 3.80E-12 0.58501168 2.27E-13 0.20890705
0.28038733 3.81E-12 0.58196375 5.88E-05 0.21239142
0.26855376 3.27E-12 0.60364524 2.06E-13 0.11205138
0.27220042 3.28E-12 0.60349573 2.08E-13 0.11530944
0.36294869 3.14E-12 0.50515464 1.64E-13 3.12E-12
0.36294869 3.14E-12 0.50515464 1.64E-13 3.12E-12
0.40837234 3.07E-12 0.47202708 1.73E-13 3.03E-12
0.3643896 3.25E-12 0.50431576 1.63E-13 3.14E-12
0.3643896 3.25E-12 0.50431576 1.63E-13 3.14E-12
0.35885258 3.21E-12 0.50978952 1.64E-13 3.12E-12

Here is one for GNU awk. The field separator is assumed to be a run of spaces, so empty fields are not allowed:
$ gawk -v value=0.1 ' # give the threshold value as a parameter
BEGIN { RS="( +\n?|\n)" } # every field is considered a record
{
ORS=RT # RT holds the text that matched RS; reuse it as the output separator
if($0<value) # comparison
$0=sprintf("0%-" length()-1 "s","") # pad the zero to the width of the original field
}1' file # output
Output:
BCC_ACR_CR BCC_ACR_CU BCC_ACR_FE BCC_ACR_MN BCC_ACR_MO
0.2826027 0 0.58420346 0 0.2105587
0.27986588 0 0.58501168 0 0.20890705
0.27986588 0 0.58501168 0 0.20890705
0.27986588 0 0.58501168 0 0.20890705
0.28038733 0 0.58196375 0 0.21239142
0.26855376 0 0.60364524 0 0.11205138
0.27220042 0 0.60349573 0 0.11530944
0.36294869 0 0.50515464 0 0
0.36294869 0 0.50515464 0 0
0.40837234 0 0.47202708 0 0
0.3643896 0 0.50431576 0 0
0.3643896 0 0.50431576 0 0
0.35885258 0 0.50978952 0 0
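For comparison, the same thresholding can be sketched in Python. This is only a minimal sketch under a few assumptions: the fields are separated by whitespace (as in the sample), any token that does not parse as a number (such as the header names) is passed through unchanged, and zero_below is a hypothetical helper name. Unlike the awk version it does not pad the zero to the original field width.

```python
def zero_below(line, threshold=0.1):
    """Replace every numeric field smaller than threshold with "0"."""
    out = []
    for token in line.split():
        try:
            value = float(token)  # also parses forms like 3.89E-12
        except ValueError:
            out.append(token)     # not a number (e.g. a header name): keep it
            continue
        out.append("0" if value < threshold else token)
    return " ".join(out)


if __name__ == "__main__":
    import sys
    # Usage (assumed file name): python zero_below.py < file
    for raw in sys.stdin:
        print(zero_below(raw))
```

Note that negative numbers are also below the threshold and would be replaced, which matches the comparison used in the awk script.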

Related

How do I view the PDF file downloaded from docusign?

I am downloading the signed documents using the DocuSign getDocument API, but the downloaded file arrives in a weird format (JSON, according to the Content-Type header). What format is this, and how can I convert it to a viewable/readable format?
Here is a snippet from the response:
%PDF-1.5\n%ûüýþ\n%Writing objects...\n4 0 obj\n<<\n/Type /Page\n/Resources 5 0 R\n/Parent 3 0 R\n/MediaBox [0 0 595.32000 841.92000 ]\n/Contents [6 0 R 7 0 R 8 0 R ]\n/Group <<\n/Type /Group\n/S /Transparency\n/CS /DeviceRGB\n>>\n/Tabs /S\n/StructParents 0\n>>\nendobj\n5 0 obj\n<<\n/Font <<\n/F1 9 0 R\n/F2 10 0 R\n/F3 11 0 R\n/F4 12 0 R\n/F5 13 0 R\n>>\n/ExtGState <<\n/GS7 14 0 R\n/GS8 15 0 R\n>>\n/ProcSet [/PDF /Text /ImageB /ImageC /ImageI ]\n/XObject <<\n/X0 16 0 R\n>>\n>>\nendobj\n2 0 obj\n<<\n/Producer (PDFKit.NET 21.1.200.20790)\n/CreationDate (D:20210429103256-07'00')\n/ModDate (D:20210429103256-07'00')\n/Author ()\n/Creator ()\n/Keywords <>\n/Subject ()\n/Title ()\n>>\nendobj\n6 0 obj\n<<\n/Length 4\n>>\nstream\n\n q \nendstream\nendobj\n7 0 obj\n<<\n/Filter /FlateDecode\n/Length 7326\n>>\nstream\nxœ½]ëo\u001c7’ÿnÀÿCã\u0016¸›YDm¾šd\u0007A\u0000É’\u001dçüZ[Þà\u0010ï\u0007Å’ma-É‘FÉæ¿?
What language are you using, and are you using one of the DocuSign SDKs or a raw API call?
When I make a GetDocument call (specifically {{vx}}/accounts/{{accountid}}/envelopes/{{envelopeId}}/documents/combined, for example), the response headers have Content-Type: application/pdf and Content-Disposition: file; filename=file.pdf, and the body of the response is the binary of the PDF file itself.
A snippet of mine begins with this:
%PDF-1.5
%����
%Writing objects...
4 0 obj
<<
/Type /Page
/Parent 3 0 R
/Resources 10 0 R
/MediaBox [0 0 612 792 ]
/Contents [11 0 R 12 0 R 13 0 R 14 0 R 15 0 R 16 0 R 17 0 R ]
/Group <<
/Type /Group
/S /Transparency
/CS /DeviceRGB
So it looks like whatever system you have picking up the response is including \n newlines and potentially other control characters.
You'll want to look at how your tools handle the API response: if you can dump the raw output from DocuSign to a PDF file, that would work, but with the extra formatting being injected it's no longer valid.
I used this piece of code to save the response from the DocuSign getDocument API to a PDF document:
const fs = require('fs');
envelopesApi.getDocument(ACCOUNT_ID, envelope_id, 'combined', function (error, document, response) {
if (document) {
var filename = 'docusign.pdf';
// Write the body as raw bytes; Buffer.from replaces the deprecated new Buffer()
fs.writeFile(filename, Buffer.from(document, 'binary'), function (err) {
if (err) console.log('Error: ' + err);
});
}
});
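The underlying idea is independent of the SDK: the body has to reach the disk as unmodified bytes. A minimal Python sketch of that idea, assuming you already hold the raw response body as bytes (how you obtain it depends on your HTTP client and is not shown; save_pdf is a hypothetical helper name):

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # every PDF file starts with these bytes


def save_pdf(body, path):
    """Write the raw response body to disk unchanged; return bytes written."""
    if not isinstance(body, (bytes, bytearray)):
        # A str here means something already decoded or escaped the body.
        raise TypeError("expected raw bytes, got decoded text")
    if not body.startswith(PDF_MAGIC):
        raise ValueError("body does not start with %PDF-, so it is not a PDF")
    return Path(path).write_bytes(body)
```

If the check fails, the body was probably decoded as text or wrapped in JSON somewhere between DocuSign and your code, which is what the escaped \n sequences in the question suggest.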

Incorrect filling in SVG paths with arcs

I have shapes made of SVG paths, and I want to fill their interiors. But I am having an issue when arcs are involved.
Expected Result
Current Result
HTML
<path class="polygon nofill" d="M 3399.999988793259 1394.9999692097963 A 3220.8276 3220.8276 0 0 0 1474.9999692097967 3319.9999887932586 M 5905.000011783687 1394.9999560225888 A 4839.2884 4839.2884 0 0 0 3399.999988216311 1394.9999560225888 M 5905.000004208143 1395.0000282015517 A 9331.1683 9331.1683 0 0 0 1475.000125700969 3320.0000323260156 " style="stroke-width: 8"/>
<path class="polygon" d="M 4874.999880183425 2775.000054445585 A 2046.6629 2046.6629 0 0 0 3475.000031514362 2775.0000865849615 M 4874.999985343183 2775.0000068346008 A 3369.9299 3369.9299 0 0 0 6504.999793165401 4404.999814656818 M 3474.9999727218064 2775.0001243226507 A 5029.8359 5029.8359 0 0 0 6504.999989335176 4405.000073299978 " style="stroke-width: 8"/>
<path class="polygon" d="M 2329.9999143902046 3650.000039536792 A 1673.1571 1673.1571 0 0 0 1330.000007397411 4650.000002692427 M 4229.9998864060235 3650.0000507333134 A 3670.5179 3670.5179 0 0 0 2330.000020292689 3650.0000757333178 M 1330.0000006966607 4649.999909899181 A 5926.0946 5926.0946 0 0 0 4230.000008277293 3650.0000122597635 " style="fill:url(#pattern);stroke-width: 8"/>
The solutions I could find suggest changing the order in which the path is drawn, but the data is dynamic and I can't know beforehand how the arcs will curve.
It's the way you are drawing the paths. Please take a look at the d attribute of any of your paths. For every side of the shape you first move to a different point (the M command) and then draw the curve. To be able to fill the shapes you draw, you need to move the pointer to one point and then draw the whole shape without lifting the pen.
Below I've rewritten your paths. Please take a look at the d attribute: there are no M commands except the first one, meaning that each shape is drawn continuously, without interruptions.
svg{border:1px solid;width:300px}
<svg id="theSVG" viewBox="1200 1200 5500 3600" >
<path class="polygon nofill" d="M 5905.000011783687 1394.9999560225888
A 4839.2884 4839.2884 0 0 0 3399.999988216311 1394.9999560225888
A 3220.8276 3220.8276 0 0 0 1474.9999692097967 3319.9999887932586
A 9331.1683 9331.1683 0 0 1 5905.000004208143 1395.0000282015517 " style="stroke-width: 8" />
<path class="polygon" d="M 4874.999880183425 2775.000054445585
A 2046.6629 2046.6629 0 0 0 3475.000031514362 2775.0000865849615
A 5029.8359 5029.8359 0 0 0 6504.999989335176 4405.000073299978
A 3369.9299 3369.9299 0 0 1 4874.999985343183 2775.0000068346008"
style="stroke-width: 8"/>
<path class="polygon" d="M 4229.9998864060235 3650.0000507333134
A 3670.5179 3670.5179 0 0 0 2330.000020292689 3650.0000757333178
A 1673.1571 1673.1571 0 0 0 1330.000007397411 4650.000002692427
A 5926.0946 5926.0946 0 0 0 4230.000008277293 3650.0000122597635 " style="stroke-width: 8"/>
</svg>
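Since the question says the data is dynamic, the reordering done by hand above could be automated. Here is a minimal Python sketch, assuming each arc is available as a tuple (start, rx, ry, rotation, large_arc, sweep, end) and that the endpoints of neighbouring arcs coincide only up to a small tolerance; both are assumptions, and chain_arcs and to_path are hypothetical helper names. It chains the arcs end to end, and when an arc has to be traversed backwards it swaps the endpoints and flips the sweep flag, which is what happened to the last arc of the first path above (0 0 0 became 0 0 1).

```python
import math


def reverse(arc):
    """Same geometric arc, drawn from its end to its start."""
    start, rx, ry, rot, large, sweep, end = arc
    return (end, rx, ry, rot, large, 1 - sweep, start)


def near(p, q, tol):
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= tol


def chain_arcs(arcs, tol=1e-2):
    """Order (and reverse where needed) arcs so each starts where the last ended."""
    remaining = list(arcs)
    ordered = [remaining.pop(0)]
    while remaining:
        tail = ordered[-1][6]  # end point of the last chained arc
        for i, arc in enumerate(remaining):
            if near(arc[0], tail, tol):
                ordered.append(remaining.pop(i))
                break
            if near(arc[6], tail, tol):
                ordered.append(reverse(remaining.pop(i)))
                break
        else:
            raise ValueError("arcs do not form a connected outline")
    return ordered


def to_path(arcs):
    """Build a d attribute with a single M command and a closing Z."""
    sx, sy = arcs[0][0]
    parts = [f"M {sx} {sy}"]
    for _, rx, ry, rot, large, sweep, (ex, ey) in arcs:
        parts.append(f"A {rx} {ry} {rot} {large} {sweep} {ex} {ey}")
    parts.append("Z")
    return " ".join(parts)
```

Calling to_path(chain_arcs(arcs)) then yields one continuous subpath that the renderer can fill. The closing Z is optional here, because the last arc already ends (almost) at the starting point.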

MDP Policy Plot for a Maze

I have a 5x5 maze specified as follows.
r = [1 0 1 1 1
1 1 1 0 1
0 1 0 0 1
1 1 1 0 1
1 0 1 0 1];
Where 1's are the paths and 0's are the walls.
Assume I have a function foo(policy_vector, r) that maps the elements of the policy vector to the elements in r. For example 1=UP, 2=Right, 3=Down, 4=Left. The MDP is set up such that the wall states are never realized so policies for those states are ignored in the plot.
policy_vector' = [3 2 2 2 3 2 2 1 2 3 1 1 1 2 3 2 1 4 2 3 1 1 1 2 2]
symbols' = [v > > > v > > ^ > v ^ ^ ^ > v > ^ < > v ^ ^ ^ > >]
I am trying to display my policy decision for a Markov Decision Process in the context of solving a maze. How would I plot something that looks like this? Matlab is preferable but Python is fine.
Even if somebody could just show me how to make a plot like this, I would be able to figure it out from there.
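function policy_plot(policy, r)
% Draw the maze r (1 = path, 0 = wall) with the policy arrow in each path cell.
[row, col] = size(r);
symbols = {'^', '>', 'v', '<'}; % 1 = up, 2 = right, 3 = down, 4 = left
policy_symbolic = get_policy_symbols(policy, symbols);
figure()
hold on
axis([0, col, 0, row]) % x spans the columns, y spans the rows
grid on
cnt = 1; % index into the policy vector; cells are visited row by row
fill([0,0,col,col],[row,0,0,row],'k') % black background marks the walls
for rr = row:-1:1
    for cc = 1:col
        % colour every path cell green, except the goal in the bottom-right corner
        if r(row+1 - rr,cc) ~= 0 && ~(row == row+1 - rr && col == cc)
            fill([cc-1,cc-1,cc,cc],[rr,rr-1,rr-1,rr],'g')
            text(cc - 0.55,rr - 0.5,policy_symbolic{cnt})
        end
        cnt = cnt + 1;
    end
end
% after the loops rr == 1 and cc == col, i.e. the bottom-right (goal) cell
fill([cc-1,cc-1,cc,cc],[rr,rr-1,rr-1,rr],'b')
text(cc - 0.70,rr - 0.5,'Goal')
function [policy_symbolic] = get_policy_symbols(policy, symbols)
% Map each numeric action to its arrow symbol.
policy_symbolic = cell(size(policy));
for ii = 1:length(policy)
    policy_symbolic{ii} = symbols{policy(ii)};
end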
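Since the question says Python is fine too, here is a rough text-only counterpart that is handy for checking the mapping before plotting. It is a minimal sketch under the same assumptions as the MATLAB code: the policy vector is read row by row, the goal is the bottom-right cell, and walls are skipped. render_policy and the '#' and 'G' markers are hypothetical choices, not part of the original.

```python
SYMBOLS = {1: "^", 2: ">", 3: "v", 4: "<"}  # 1=up, 2=right, 3=down, 4=left


def render_policy(policy, maze, wall="#", goal="G"):
    """Return the maze as text, one arrow per path cell."""
    rows, cols = len(maze), len(maze[0])
    lines = []
    for i in range(rows):
        cells = []
        for j in range(cols):
            if maze[i][j] == 0:
                cells.append(wall)  # policy entries for walls are ignored
            elif (i, j) == (rows - 1, cols - 1):
                cells.append(goal)  # assumed goal: bottom-right corner
            else:
                cells.append(SYMBOLS[policy[i * cols + j]])
        lines.append(" ".join(cells))
    return "\n".join(lines)


maze = [
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
]
policy = [3, 2, 2, 2, 3, 2, 2, 1, 2, 3, 1, 1, 1, 2, 3,
          2, 1, 4, 2, 3, 1, 1, 1, 2, 2]

if __name__ == "__main__":
    print(render_policy(policy, maze))
```

With the maze and policy vector from the question, the first printed row is "v # > > v" and the last is "^ # ^ # G".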

High CPU usage through context switches in Akka application

I am maintaining and developing two Akka Scala applications that interface with a Serial device to gather sensor information. The main difference between the two is that one (My CO2 sensor application) uses 1% CPU while the other (My Power sensor application) uses 250% CPU. This is both the case on a Linux machine (Raspberry Pi 3) as well as on my Windows Desktop PC. Code-wise the main difference is that CO2 uses the Serial library (http://fazecast.github.io/jSerialComm/) directly, while the Power sensor app goes through a layer of middleware to convert the In/OutputStreams of the Serial library to Akka Source/Sink as such:
val port = SerialPort.getCommPort(comPort)
port.setBaudRate(baudRate)
port.setFlowControl(flowControl)
port.setComPortParameters(baudRate, dataBits, stopBits, parity)
port.setComPortTimeouts(timeoutMode, timeout, timeout)
val isOpen = port.openPort()
if(!isOpen) {
error(s"Port $comPort could not opened. Use the following documentation for troubleshooting: https://github.com/Fazecast/jSerialComm/wiki/Troubleshooting")
throw new Exception("Port could not be opened")
}
(reactive.streamSource(port.getInputStream), reactive.streamSink(port.getOutputStream))
When I saw this high CPU usage I immediately slapped a Profiler (VisualVM) against it which told me the following:
After googling for Unsafe.park I found the following answer: https://stackoverflow.com/a/29414580/1122834 - Using this information I checked the amount of context switching WITH and WITHOUT my Power sensor app, and the results were very clear about the root cause of the issue:
pi#dex:~ $ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r   b   swpd   free   buff  cache  si  so  bi  bo     in     cs  us  sy  id  wa  st
10  0  32692  80144  71228 264356   0   0   0   5      7      8  38   5  55   2   0
1   0  32692  80176  71228 264356   0   0   0  76  12932  18856  59   6  35   0   0
1   0  32692  80208  71228 264356   0   0   0   0  14111  20570  60   8  32   0   0
1   0  32692  80208  71228 264356   0   0   0   0  13186  16095  65   6  29   0   0
1   0  32692  80176  71228 264356   0   0   0   0  14008  23449  56   6  38   0   0
3   0  32692  80208  71228 264356   0   0   0   0  13528  17783  65   6  29   0   0
1   0  32692  80208  71228 264356   0   0   0  28  12960  16588  63   6  31   0   0
pi#dex:~ $ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r   b   swpd   free   buff  cache  si  so  bi  bo     in     cs  us  sy  id  wa  st
1   0  32692 147320  71228 264332   0   0   0   5      7      8  38   5  55   2   0
0   0  32692 147296  71228 264332   0   0   0  84    963   1366   0   0  98   2   0
0   0  32692 147296  71228 264332   0   0   0   0    962   1347   1   0  99   0   0
0   0  32692 147296  71228 264332   0   0   0   0    947   1318   1   0  99   0   0
As you can see, the number of context switches went down by ~12000 a second just by killing my application. I continued by checking exactly which threads were doing this, and it seems Akka is really eager to do stuff:
Both a comment here and another SO question point towards tweaking the parallelism settings of Akka. I added the following to my application.conf, with little effect.
akka {
log-config-on-start = "on"
actor{
default-dispatcher {
# Dispatcher is the name of the event-based dispatcher
type = Dispatcher
# What kind of ExecutionService to use
executor = "fork-join-executor"
# Configuration for the fork join pool
default-executor {
fallback = "fork-join-executor"
}
fork-join-executor {
# Min number of threads to cap factor-based parallelism number to
parallelism-min = 1
# Parallelism (threads) ... ceil(available processors * factor)
parallelism-factor = 1.0
# Max number of threads to cap factor-based parallelism number to
parallelism-max = 1
}
# Throughput defines the maximum number of messages to be
# processed per actor before the thread jumps to the next actor.
# Set to 1 for as fair as possible.
throughput = 1
}
}
stream{
default-blocking-io-dispatcher {
type = PinnedDispatcher
executor = "fork-join-executor"
throughput = 1
thread-pool-executor {
core-pool-size-min = 1
core-pool-size-factor = 1.0
core-pool-size-max = 1
}
fork-join-executor {
parallelism-min = 1
parallelism-factor = 1.0
parallelism-max = 1
}
}
}
}
This seems to improve the CPU usage (100% -> 65%) but still, the CPU usage is unnecessarily high.
UPDATE 21-11-'16
It would appear the problem is inside my graph. When not running the graph the CPU usage goes down immediately to normal levels. The graph is the following:
val streamGraph = RunnableGraph.fromGraph(GraphDSL.create() { implicit builder =>
import GraphDSL.Implicits._
val responsePacketSource = serialSource
.via(Framing.delimiter(ByteString(frameDelimiter), maxFrameLength, allowTruncation = true))
.via(cleanPacket)
.via(printOutput("Received: ",debug(_)))
.via(byteStringToResponse)
val packetSink = pushSource
.via(throttle(throttle))
val zipRequestStickResponse = builder.add(Zip[RequestPacket, ResponsePacket])
val broadcastRequest = builder.add(Broadcast[RequestPacket](2))
val broadcastResponse = builder.add(Broadcast[ResponsePacket](2))
packetSink ~> broadcastRequest.in
broadcastRequest.out(0) ~> makePacket ~> printOutput("Sent: ",debug(_)) ~> serialSink
broadcastRequest.out(1) ~> zipRequestStickResponse.in0
responsePacketSource ~> broadcastResponse.in
broadcastResponse.out(0).filter(isStickAck) ~> zipRequestStickResponse.in1
broadcastResponse.out(1).filter(!isStickAck(_)).map (al => {
val e = completeRequest(al)
debug(s"Sinking: $e")
e
}) ~> Sink.ignore
zipRequestStickResponse.out.map { case(request, stickResponse) =>
debug(s"Mapping: request=$request, stickResponse=$stickResponse")
pendingPackets += stickResponse.sequenceNumber -> request
request.stickResponse trySuccess stickResponse
} ~> Sink.ignore
ClosedShape
})
streamGraph.run()
When removing the filters from broadcastResponse, the CPU usage goes down to normal levels. This leads me to believe that the zip never happens, and therefore, the graph goes into an incorrect state.
The problem is that Fazecast's jSerialComm library has a number of different time-out modes.
static final public int TIMEOUT_NONBLOCKING = 0x00000000;
static final public int TIMEOUT_READ_SEMI_BLOCKING = 0x00000001;
static final public int TIMEOUT_WRITE_SEMI_BLOCKING = 0x00000010;
static final public int TIMEOUT_READ_BLOCKING = 0x00000100;
static final public int TIMEOUT_WRITE_BLOCKING = 0x00001000;
static final public int TIMEOUT_SCANNER = 0x00010000;
Using the non-blocking read() method (TIMEOUT_NONBLOCKING) results in very high CPU usage when combined with Akka Streams' InputStreamPublisher. To prevent this, simply use TIMEOUT_READ_SEMI_BLOCKING or TIMEOUT_READ_BLOCKING.
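To illustrate why a non-blocking read is so expensive when something keeps calling it in a loop, here is a small language-neutral sketch in Python. It does not use jSerialComm or Akka; a queue stands in for the serial port, and the assumption is that the stream publisher simply retries a read that returned no data. poll_nonblocking, read_blocking and demo are hypothetical names.

```python
import queue
import threading


def poll_nonblocking(q):
    """Spin until data arrives; returns (data, number_of_read_attempts)."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return q.get_nowait(), attempts
        except queue.Empty:
            continue  # nothing yet: immediately try again (burns CPU)


def read_blocking(q, timeout=1.0):
    """Sleep inside the read until data arrives; a single read attempt."""
    return q.get(timeout=timeout), 1


def demo(read, delay=0.05):
    """Deliver one frame after `delay` seconds and read it with `read`."""
    q = queue.Queue()
    threading.Timer(delay, q.put, args=(b"frame",)).start()
    return read(q)
```

demo(read_blocking) returns after exactly one read attempt, while demo(poll_nonblocking) typically performs a very large number of attempts during the same 50 ms; that constant retrying is the kind of busy-wait that shows up as high CPU usage and a high context-switch rate.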

How to extract Curly Braces Vector from Cell String in Matlab

I have the following string (a 1x48 char) in a cell in MATLAB
{ {1 , 0 , 0 } , { 0 , 1 , 0 } , { 0 , 0 , 1 } }.
I am trying to get three separate strings, each on a new line and separated only by spaces, so that the data will look like
1 0 0
0 1 0
0 0 1
I would really appreciate it if anyone has an idea how to convert this in MATLAB.
Thanks
Better to use cell2mat function
In your case you can try something like this,
temp = { {1 , 0 , 0 } , { 0 , 1 , 0 } , { 0 , 0 , 1 } };
out = [cell2mat(temp{1, 1}); cell2mat(temp{1, 2}); cell2mat(temp{1, 3})]
I hope it will help!!
You are looking for strjoin:
a = { {1 , 0 , 0 } , { 0 , 1 , 0 } , { 0 , 0 , 1 } };
b = strjoin(a, ' ')
Then:
b =
1 0 0 0 1 0 0 0 1
Edit: if you want to get new lines involved, you can use
b = [strjoin(a(1), ' '); strjoin(a(2), ' '); strjoin(a(3), ' ');]
Then:
b =
1 0 0
0 1 0
0 0 1
P.S.: strjoin works in MATLAB R2013b. For earlier versions, you can download the strjoin function from here.
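Note that both answers start from a nested cell array, whereas the question describes a single char string. If the input really is text, one option is to parse it with regular expressions. The following is only a sketch of that idea in Python (the same two patterns could be tried with MATLAB's regexp); braces_to_rows is a hypothetical helper name.

```python
import re


def braces_to_rows(text):
    """Turn '{ {1 , 0 , 0 } , { 0 , 1 , 0 } }' into one line per inner group."""
    rows = []
    # Innermost groups: an opening brace, no further braces, a closing brace.
    for group in re.findall(r"\{([^{}]*)\}", text):
        tokens = re.findall(r"[^\s,]+", group)  # drop commas and whitespace
        if tokens:
            rows.append(" ".join(tokens))
    return "\n".join(rows)


if __name__ == "__main__":
    print(braces_to_rows("{ {1 , 0 , 0 } , { 0 , 1 , 0 } , { 0 , 0 , 1 } }."))
```

For the string from the question this prints the three lines 1 0 0, 0 1 0 and 0 0 1.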
