How to nest ETW events - etw

I'm using TraceLogging on Windows 10, recording the events and viewing them with WPA. It works fine but I'd like to nest certain events similar to how stack traces nest.
Basically I am defining custom profiling scopes in my code and I'd like to nest them to make exploration of the data easier.
e.g.:
Depth 0 | Scope 1
  Depth 1 | Scope 2
  Depth 1 | Some event
  Depth 1 | Scope 2 ends | (0.2 ms)
Depth 0 | Scope 1 ends | (0.3 ms)
Right now everything is flat when I drill down by Thread ID.
e.g.:
Depth 0 | Scope 1
Depth 1 | Scope 2
Depth 1 | Some event
Depth 1 | Scope 2 ends | (0.2 ms)
Depth 0 | Scope 1 ends | (0.3 ms)
The obvious benefit of nesting like stacks is that I can expand and collapse on demand to show/hide more information.

Use the ETW ActivityID to group events. In WPA, move the ActivityID column to the first or second position after the time column so that the events are grouped by ActivityID.
This should give you the grouping you want. Experiment with the column order a bit.

Related

EXCEL: How to automatically create groups based on sum being less than X and not greater than Y

I have table in Excel with some information, the main column is Weight (in KG).
I need Excel to group Rows into groups, where each group's sum of Weight (in KG) is less than 24000 kg and greater than 23500 kg.
To do so manually is very time consuming, since there are thousands of rows with different Weight values.
table example:
ID | Weight (KG)
1 | 11360
2 | 22570
3 | 10440
4 | 20850
5 | 9980
6 | 9950
7 | 19930
8 | 9930
9 | 9616
10 | 9580
... and so on
The closest I got to solving the problem is adding 3 new columns: Total, Starts Group and Group Number.
Total function: =IF(SUM(B3+C2)>24000,B3,SUM(B3+C2)) - calculates current sum of Weight values in the current group
Starts group function: =IF(SUM(B3+C2)>24000,B3,SUM(B3+C2)) - checks if current row makes a new group
Group number function: =IF(D3,E2+1,E2) - all rows that contain same number are in the same group
The problem with this is that it only keeps each group below 24000 kg; it does not also ensure that each group is greater than 23500 kg.
It doesn't have to be in Excel, any app/script would work too, it just has to get the job done.
Desired output:
ID | Weight (KG) | Group ID
1 | 11360 | 1
2 | 2570 | 2
3 | 10440 | 1
4 | 20850 | 2
5 | 180 | 2
6 | 1950 | 1
So I want to get groups similar to these:
Group number 1 - Total 23750 kg
Group number 2 - Total 23600 kg
Url to my example table with functions I added:
https://1drv.ms/x/s!Au0UogL2uddbgTFJJ4TzSKLhPFPE?e=r02sPX
You may want to try this for total:
=IF(SUM(B3+C2)>24000;B3;IF(SUM(B3+C2)<=23500;SUM(B3+C2);B3))
edit:
I just saw you pasted the proposal into your sample file. You may need to replace the ; with , due to regional format settings.
The limitation remains:
first priority is <24k and second priority is >=23.5k
If the next row’s value makes the “jump” above 24k you may end up remaining below 23.5k and switching to the next group
edit2:
You may want to look up some optimization models and algorithms for your combination problem before trying to implement it in Excel.
Or try with simple rules, e.g. categorizing your rows such as weight over 20k, 16k, 12k,8k, 4k, 2k, 1k, 500, etc. and try to group/combine them accordingly
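Since the question allows a script instead of Excel, one simple rule-based approach can be sketched in Python. This is a first-fit-decreasing heuristic, not an exact optimizer: it never lets a group exceed the upper limit, but it cannot guarantee that every group reaches the lower limit. The function name `group_weights` and the sample data (taken from the desired-output table in the question) are used for illustration only.

```python
def group_weights(weights, upper=24000):
    """Greedy first-fit-decreasing grouping.

    weights: dict mapping row ID -> weight in kg.
    Returns (groups, totals); groups[i] is the list of row IDs in group i.
    """
    groups = []  # row IDs per group
    totals = []  # running weight per group
    # Place the heaviest rows first, each into the first group with room.
    for row_id, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        for i, total in enumerate(totals):
            if total + weight <= upper:
                groups[i].append(row_id)
                totals[i] += weight
                break
        else:
            # No existing group has room, so open a new one.
            groups.append([row_id])
            totals.append(weight)
    return groups, totals


# Sample data from the desired-output table in the question.
sample = {1: 11360, 2: 2570, 3: 10440, 4: 20850, 5: 180, 6: 1950}
groups, totals = group_weights(sample)

# Report which groups also satisfy the lower limit of 23500 kg.
in_range = [23500 <= t <= 24000 for t in totals]
print(groups, totals, in_range)
```

For this sample both groups end up between 23500 kg and 24000 kg; with real data, any group flagged `False` in `in_range` would still need manual adjustment or a proper optimization model.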

Spark: count events based on two columns

I have a table with events which are grouped by a uid. All rows have the columns uid, visit_num and event_num.
visit_num is an arbitrary counter that occasionally increases. event_num is the counter of interactions within the visit.
I want to merge these two counters into a single interaction counter that keeps increasing by 1 for each event and continues to increase when the next visit has started.
As I only look at the relative distance between events, it's fine if I don't start the counter at 1.
|uid |visit_num|event_num|interaction_num|
| 1 | 1 | 1 | 1 |
| 1 | 1 | 2 | 2 |
| 1 | 2 | 1 | 3 |
| 1 | 2 | 2 | 4 |
| 2 | 1 | 1 | 500 |
| 2 | 2 | 1 | 501 |
| 2 | 2 | 2 | 502 |
I can achieve this by repartitioning the data and using the monotonically_increasing_id like this:
df.repartition("uid")\
.sort("visit_num", "event_num")\
.withColumn("iid", fn.monotonically_increasing_id())
However the documentation states:
The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.
As the id seems to be monotonically increasing by partition this seems fine. However:
I am close to reaching the 1 billion partition/uid threshold.
I don't want to rely on the current implementation not changing.
Is there a way I can start each uid with 1 as the first interaction num?
Edit
After testing this some more, I notice that some of the users don't seem to have consecutive iid values using the approach described above.
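The bit layout quoted from the documentation above can be made concrete with a little arithmetic. The sketch below is plain Python based only on that quoted description (partition ID in the upper 31 bits, record number in the lower 33 bits), not on Spark's source code; `compose_id` and `decompose_id` are hypothetical helper names.

```python
RECORD_BITS = 33  # lower 33 bits hold the record number within a partition


def compose_id(partition_id, record_number):
    """Build an ID the way the quoted documentation describes."""
    return (partition_id << RECORD_BITS) | record_number


def decompose_id(generated_id):
    """Split an ID back into (partition_id, record_number)."""
    return generated_id >> RECORD_BITS, generated_id & ((1 << RECORD_BITS) - 1)


first_in_partition_0 = compose_id(0, 0)  # 0
first_in_partition_1 = compose_id(1, 0)  # 8589934592, i.e. 2**33
third_in_partition_1 = compose_id(1, 2)  # 8589934594
print(first_in_partition_0, first_in_partition_1, third_in_partition_1)
```

Under this layout, IDs are consecutive only within a single partition, and only for rows that sit next to each other in it, which would be consistent with users whose rows do not receive consecutive values.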
Edit 2: Windowing
Unfortunately there are some (rare) cases where more than one row has the same visit_num and event_num. I've tried using the windowing function as below, but because it assigns the same rank to rows with identical values in these columns, this is not really an option.
iid_window = Window.partitionBy("uid").orderBy("visit_num", "event_num")
df_sample_iid=df_sample.withColumn("iid", fn.rank().over(iid_window))
The best solution is the Windowing function with rank, as suggested by Jacek Laskowski.
iid_window = Window.partitionBy("uid").orderBy("visit_num", "event_num")
df_sample_iid=df_sample.withColumn("iid", fn.rank().over(iid_window))
In my specific case some more data cleaning was required but generally, this should work.
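Where duplicate (visit_num, event_num) pairs cannot be cleaned away, a window function that numbers rows consecutively, such as Spark's `row_number()`, may be a better fit than `rank()`, because it does not give tied rows the same value. The plain-Python sketch below only illustrates that per-uid numbering on a small made-up sample; it is not Spark code, and `number_rows` is a hypothetical name. The order among tied rows is arbitrary.

```python
from itertools import groupby
from operator import itemgetter

# Made-up sample rows: (uid, visit_num, event_num); note the duplicate (1, 2, 1).
rows = [
    (2, 2, 1),
    (1, 1, 2),
    (1, 2, 1),
    (1, 1, 1),
    (2, 1, 1),
    (1, 2, 1),
]


def number_rows(rows):
    """Append a counter that restarts at 1 for every uid and never repeats."""
    ordered = sorted(rows)  # orders by uid, then visit_num, then event_num
    numbered = []
    for _uid, group in groupby(ordered, key=itemgetter(0)):
        for n, row in enumerate(group, start=1):
            numbered.append(row + (n,))
    return numbered


for row in number_rows(rows):
    print(row)
```

Here the two identical rows for uid 1 receive the counter values 3 and 4 rather than sharing one value.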

How to dynamically create a cumulative overall total based on a non-cumulative categorical column in excel

Slightly wordy title but here goes
I have a grid in excel which includes 3 columns (media spend, marginal revenue returns & media channel invested in) and I want to create the column below called desired cumulative spend
The grid is structured this way because it represents an optimised spend laydown, ordered by how much of each media channel's budget should be invested until its marginal returns diminish to the point where it should be substituted for another media channel.
It is possible that this substitution can then be reversed back to the original channel if the new channel has a sharply diminishing curve, such that all marginal benefit associated to the new channel diminishes and the total spend level still means it is mathematically sensible to switch back to the original curve (maybe it has a lower base level but reduces less sharply). It is also possible that at the point in which the marginal benefit associated to the new channel diminishes, the best next step is to invest in a third channel.
The desired new spend column has two elements to it:
It is a simple accumulation of spend from row to row when the media channel is constant from row to row.
It is a slightly trickier accumulation of spend when the media channel changes: it then needs to refer back to the last spend level associated with the channel that has been substituted in. For row 4, the logic I am struggling with would need to take the running total from row 3, plus the new spend level for row 4, minus the spend level the last time this channel was used (row 2).
| row | spend | mar return | media | desired cumulative spend |
|-----|-------|------------|-------|--------------------------|
| 1 | £580 | 128 | chan1 | 580 |
| 2 | £620 | 121 | chan1 | 580+(620-580) |
| 3 | £900 | 115.8 | chan2 | 580+(620-580)+900 |
| 4 | £660 | 115.1 | chan1 | 580+(620-580)+900+(660-620) |
| 5 | £920 | 114 | chan2 | 580+(620-580)+900+(660-620)+(920-900) |
| 6 | £940 | 112 | chan2 | 580+(620-580)+900+(660-620)+(920-900)+(940-920) |
If my comment is the correct suggestion, then something like this should do it (£580 is in A2, so the first output is in D2):
D2 =A2
D3 =D2+A3-IF(COUNTIF($C$2:C2,C3),INDEX(A:A,MAX(IF($C$2:C2=C3,ROW($A$2:A2)))))
D3 contains an array formula and must be confirmed with ctrl+shift+enter.
Now you can simply copy down from D3.
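The same logic can be sketched outside Excel. The Python below is an illustration, not a translation the answer provides: it keeps the last spend level seen for each channel and adds only the increase over that level to the running total. The function name `cumulative_spend` is hypothetical; the data is the six-row grid from the question.

```python
def cumulative_spend(rows):
    """rows: list of (spend, channel) in laydown order.

    Returns the running total, where each row contributes its spend minus
    the spend level the last time the same channel appeared (0 if never).
    """
    last_spend = {}  # channel -> most recent spend level
    running = 0
    totals = []
    for spend, channel in rows:
        running += spend - last_spend.get(channel, 0)
        last_spend[channel] = spend
        totals.append(running)
    return totals


grid = [
    (580, "chan1"),
    (620, "chan1"),
    (900, "chan2"),
    (660, "chan1"),
    (920, "chan2"),
    (940, "chan2"),
]
print(cumulative_spend(grid))  # [580, 620, 1520, 1560, 1580, 1600]
```

These values match the expressions in the "desired cumulative spend" column, for example 580+(620-580)+900+(660-620) = 1560 for row 4.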

Can I use BDD by testing low abstraction level code?

I checked several (real-world) BDD examples, but all I have found are e2e tests using Selenium. I was wondering, is it possible to write unit tests with BDD? If so, what should such a unit test look like in Gherkin? I have a hard time imagining what to write in the feature and scenario descriptions, and how to use them to generate documentation, for example for the Java collections framework.
edit
I have found an example here: http://jonkruger.com/blog/2010/12/13/using-cucumber-for-unit-tests-why-not/comment-page-1/
features:
Feature: Checkout
  Scenario Outline: Checking out individual items
    Given that I have not checked anything out
    When I check out item <item>
    Then the total price should be the <unit price> of that item
    Examples:
      | item | unit price |
      | "A" | 50 |
      | "B" | 30 |
      | "C" | 20 |
      | "D" | 15 |
  Scenario Outline: Checking out multiple items
    Given that I have not checked anything out
    When I check out <multiple items>
    Then the total price should be the <expected total price> of those items
    Examples:
      | multiple items | expected total price | notes |
      | "AAA" | 130 | 3 for 130 |
      | "BB" | 45 | 2 for 45 |
      | "CCC" | 60 | |
      | "DDD" | 45 | |
      | "BBB" | 75 | (2 for 45) + 30 |
      | "BABBAA" | 205 | order doesn't matter |
      | "" | 0 | |
  Scenario Outline: Rounding money
    When rounding "<amount>" to the nearest penny
    Then it should round it using midpoint rounding to "<rounded amount>"
    Examples:
      | amount | rounded amount |
      | 1 | 1 |
      | 1.225 | 1.23 |
      | 1.2251 | 1.23 |
      | 1.2249 | 1.22 |
      | 1.22 | 1.22 |
step definitions (ruby):
require 'spec_helper'

describe "Given that I have not checked anything out" do
  # Start every example with a fresh, empty checkout.
  before :each do
    @check_out = CheckOut.new
  end

  [["A", 50], ["B", 30], ["C", 20], ["D", 15]].each do |item, unit_price|
    describe "When I check out an individual item" do
      it "The total price should be the unit price of that item" do
        @check_out.scan(item)
        @check_out.total.should == unit_price
      end
    end
  end

  [["AAA", 130], # 3 for 130
   ["BB", 45], # 2 for 45
   ["CCC", 60],
   ["DDD", 45],
   ["BBB", 75], # (2 for 45) + 30
   ["BABBAA", 205], # order doesn't matter
   ["", 0]].each do |items, expected_total_price|
    describe "When I check out multiple items" do
      it "The total price should be the expected total price of those items" do
        # Scan each character of the string as a separate item.
        individual_items = items.split(//)
        individual_items.each { |item| @check_out.scan(item) }
        @check_out.total.should == expected_total_price
      end
    end
  end
end

class RoundingTester
  include Rounding
end

[[1, 1],
 [1.225, 1.23],
 [1.2251, 1.23],
 [1.2249, 1.22],
 [1.22, 1.22]].each do |amount, rounded_amount|
  describe "When rounding an amount of money to the nearest penny" do
    it "Should round the amount using midpoint rounding" do
      RoundingTester.new.round_money(amount).should == rounded_amount
    end
  end
end
I don't know a way of generating documentation based on this. It is not hopeless, e.g. it is easy to map the Feature: Checkout to the Checkout class. Maybe something similar can be done on the method level. Another possible solution is to write helpers specific to this task.
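To make the example tables above concrete, here is a minimal Python sketch of a checkout that would satisfy them. The class name `CheckOut` and its `scan` and `total` members mirror the names used in the Ruby spec, but the implementation is hypothetical: the pricing rules (unit prices, "3 for 130" on A, "2 for 45" on B) are inferred from the Examples tables and their notes column, not taken from the linked blog post.

```python
from collections import Counter

# Assumed pricing, inferred from the Examples tables.
UNIT_PRICES = {"A": 50, "B": 30, "C": 20, "D": 15}
OFFERS = {"A": (3, 130), "B": (2, 45)}  # item -> (bundle size, bundle price)


class CheckOut:
    def __init__(self):
        self._items = Counter()

    def scan(self, item):
        """Record one scanned item; scan order does not affect the total."""
        self._items[item] += 1

    @property
    def total(self):
        total = 0
        for item, count in self._items.items():
            if item in OFFERS:
                size, price = OFFERS[item]
                total += (count // size) * price  # full bundles at offer price
                count %= size                     # leftovers at unit price
            total += count * UNIT_PRICES[item]
        return total


def total_for(items):
    """Scan every character of `items` and return the resulting total."""
    check_out = CheckOut()
    for item in items:
        check_out.scan(item)
    return check_out.total


print(total_for("BABBAA"))  # 205
```

Each row of the "multiple items" table then becomes a one-line assertion against `total_for`, which is roughly what the Ruby spec does by looping over its array of cases.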
A key idea here is understanding the difference between describing behaviour, and testing. In this context describing behaviour is:
more abstract
easy to read by a wider audience
more focused on what you are doing and why you are doing
less focused on 'how' you are doing something
less exhaustive, we use examples, we don't cover everything
Testing tends to be:
precise
detailed
exhaustive
technical
When you use a BDD tool, e.g. Cucumber to write unit tests you tend to end up with tests that are
verbose
full of technical detail which only a few people can appreciate
very expensive to run
difficult to maintain
So we have different tools and different techniques for different sorts of 'testing'. You get the most bang for your buck by using the right tool for the right job.
While the idea of using one tool for all your testing seems very appealing, in the end it's about as sensible as using one tool to fix your car - try pumping up your tyres with a hammer!
BDD describes systems as a black box. If you have any words in there related to the implementation, it's no longer BDD. Inf3rno posted an example with the correct abstraction.
I always ask myself, if the user interface was gone, would I be able to keep the same feature files? If the use cases were to be carried out over voice commands, would the steps still make sense?
Another way to think about it is that step statements should be facts about a system, not instructions on how to manually test it.
good step definition
Given An account "FOO" with username <username> and password <password>
bad step definition (only applies to ui)
Given I am at the login page
And I enter <username> as the username
And I enter <password> as the password
Full example
Given An account "FOO" with username <username> and password <password>
When Creating an account with username <username> and password BAR
Then An error "Username already in use" is returned
Note that I could implement this last example against the user interface, against the api, but I could also implement it over voice commands ;)

Web parts, dynamically created controls and eventhandlers

What is the best way to display, in a web part, dynamic tables where each cell can cause a postback to display a different set of data?
For example, imagine some financial data:
Table 1: Quarters in year
| Q1 | Q2 | Q3 | Q4 |
Things 1 | 23 | 34 | 44 | 32 |
Things 2 | 24 | 76 | 67 | 98 |
On clicking on the value for Q2, Things 1 (34), this will lead to a second table being displayed instead of Table 1:
Table 2: Weeks in Quarter
| W1 | W2 | W3 | W4 | W5 | W6 | W7 |
SubThings 1 | 231 | 22 | 44 | 22 | 344 | 86 | 12 |
SubThings 2 | 14 | 75 | 47 | 108 | 344 | 86 | 12 |
The problem with my current approach is that I can create Table 1 in CreateChildControls, which wires up the events correctly for all the LinkButtons in the cells.
However, on the postback I need to create Table 1 in CreateChildControls again so that the event handlers are wired up correctly, and because the events fire after CreateChildControls, I only know that I need to change the table after CreateChildControls has run.
Thus, wherever I create Table 2 as a result (since it is after CreateChildControls), I can't get it to wire up events correctly.
Any thoughts?
Regards
Moo
Edit: Solved it.
What you need to do is check in OnPreRender any eventhandler calls, set any flags you need to and then call this.CreateChildControls manually so the new table is created and everything is wired up correctly.
Looks like you are talking about a master/detail situation here. Could you not create two web parts and use web part connection to communicate the required information from table 1, in the first web part to table 2 in the second web part?
J
Just add 2 tables to your web part, hide the second until the first has an element clicked, then set the second table's datasource in the OnClick event handler, set the second grid to visible and the first to hidden...
At Alex's suggestion, here is the answer:
The events need to be wired up before they are raised, so you need to create the same controls in CreateChildControls, allow the event to fire, and then set everything up again afterward.
To do this, first do CreateChildControls identically to the prior page, then check in OnPreRender if any eventhandler calls have been made, set any flags you need to and then call this.CreateChildControls manually with the new setup information so the new table is created and everything is wired up correctly.

Resources