How do I get the Gross Value Added (GVA) of countries/industries from MRIO models?

I want to calculate the Gross Value Added for different countries and industries using multi-regional input-output (MRIO) tables. However, I struggle to find a good explanation of how this is done based on the data available. The definition of the GVA (Gross Value Added) is the output of a country/industry less the intermediate consumption, and it is related to the GDP by:
GVA = GDP + subsidies - taxes
So far, I have used the "extensions" or "satellite accounts" that provide the Value Added (VA) disaggregated across different flows, i.e. example from Exiobase in the picture. The VA is the sum of all 12 to my understanding. However, based on the definition of the GVA, I have subtracted 1-3 since these are taxes (so GVA = sum of line 4-12). To me, this seems like the correct approach, but I have not succeeded in finding an explanation that could confirm/disprove. I also become uncertain due to the naming of the extension, i.e. "value added" sounding like "gross value added". Does anyone know the correct way of doing this?
Finally, in MRIO x is termed "gross output" being the total output to final demand + intermediate consumption:
x = Ax + y (Ax = intermediate, y = demand)
or
x = (I-A)^-1 * y = L*y (L = Leontief inverse/requirement matrix)
Does this mean that I can also derive the GVAs from x by subtracting the intermediate consumption? In my mind, this will just leave me with "y", but there might be another smart way?
Thanks in advance!

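From what I understand, yes you can!
You have to distinguish between summing the intermediate flow matrix Z = A*diag(x) along its rows and along its columns. With the usual convention that Z[i,j] is the flow from sector i to sector j:
x - colSum(Z), i.e. output minus the intermediate inputs bought by each sector, is the GVA.
x - rowSum(Z), i.e. output minus the intermediate sales of each sector, is the total final demand y.
Regarding Exiobase, I don't have a real answer.
I found that summing all lines of the VA (keeping lines 1-3) gives almost the same result as subtracting the intermediate inputs recorded in Z from x.
Which is strange...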
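To make the row/column distinction concrete, here is a minimal sketch on a made-up 2-sector table. It assumes the common convention that Z[i][j] is the flow from sector i (row) to sector j (column); all numbers are invented for illustration and are not taken from Exiobase.

```python
# Tiny 2-sector illustration (all numbers are made up).
# Assumed convention: Z[i][j] = intermediate flow from sector i to sector j.
Z = [[10.0, 20.0],
     [30.0, 40.0]]
y = [70.0, 30.0]  # final demand for the output of each sector

n = len(Z)

# Gross output = intermediate sales (row sum of Z) + final demand.
x = [sum(Z[i]) + y[i] for i in range(n)]

row_sums = [sum(Z[i]) for i in range(n)]                        # sales to other sectors
col_sums = [sum(Z[i][j] for i in range(n)) for j in range(n)]   # inputs bought

# Output minus intermediate sales gives back final demand.
final_demand = [x[i] - row_sums[i] for i in range(n)]

# Output minus intermediate inputs gives value added per sector.
value_added = [x[j] - col_sums[j] for j in range(n)]

print("x  =", x)
print("y  =", final_demand)
print("VA =", value_added)
```

Sector by sector the two vectors differ, but their totals coincide, which is the usual accounting identity between total value added and total final demand.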

Related

A function in excel that outputs a Y or an N if all the values are equal for a specific value

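| Components | Serial Num | Market 1 | Market 2 | Market 3 |
|------------|------------|----------|----------|----------|
| 1234       | 100000000  | N        | Y        | N        |
| 1233       | 100000001  | N        | Y        | Y        |
| 1235       | 100000000  | Y        | N        | N        |
| 1236       | 100000000  | N        | Y        | N        |
| 1231       | 100000001  | Y        | Y        | Y        |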
I have a table with over 1k rows, each with repeating and different serial numbers. The table's market status (Y or N) is based on an XLOOKUP that checks the component number and the component's market status in another tab. The component's market status is accurate. However, since the serial numbers repeat, I'm trying to create a function that can check whether, for example, serial number "100000001" is fully qualified for Market 2. For Market 2 the result would be "Y" because all of the "100000001" rows have a Y in Market 2, but for "100000000" it should be "N" because not all of its rows have "Y". It would be simple if there were a small number of serial numbers, but there are over 1k, so I was trying to find out whether this is even possible with a spreadsheet of this size. In essence I think it would be like an XLOOKUP function, but it would need to find and check the market status for each occurrence of the serial number and then test whether the value is "Y" for each; if they all are, then "Y" would come out, otherwise "N".
I hope this makes sense. It's the first time I post on stackoverflow, as such, definitely feel free to let me know if there's anything I can do to make this question clearer or up to standard with StackOverFlow language.
--
I tried using the =FILTER function, but its output spills, since for each serial I would need space below it to hold the values. From there I would create an =AND(EXACT([range of the values output],"Y")). This would determine whether all the values in the range are equal to Y: if they are, "TRUE" would be the output, if not, "FALSE".
I also tried creating an XLOOKUP function that would look for the smallest value. To do so I would make the Market outputs Y1 and N. What I hoped would happen is that if it came across an "N", it would give me that output first, since all it takes for the market status of the serial to be "N" is one N in its market. The only way it can be a Y is if they're all Y. Nonetheless, this does not work, since XLOOKUP only looks for the value of the serial code and not the market status value.
Hey guys, thank you all for your support! I figured out a solution to the madness haha. I downloaded Ablebits, which can merge rows that contain duplicate values, joining the values of the other columns separated by semicolons. For instance, if I merged all the cells in my example, serial number "100000000" would have the following under each market: Market 1 (N;Y;N), Market 2 (Y;N;Y), Market 3 (N;N;N). From there I created 2 color conditions in this order. 1.) If a cell contains an "N", it is red, indicating that it isn't qualified. 2.) If a cell contains a "Y", it is green (since the first rule already made those that contain an N red, this one only applies to those that have all Y;Y;Y;Y, or just one "Y" if the serial didn't repeat). It does not exactly give me an output of Y or N, but with all the values in a cell we can distinguish which ones have all qualifications and which don't.
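The underlying rule is "a serial number qualifies for a market only if none of its rows has an N there". Below is a minimal sketch of that rule using the example rows from the question; the dictionary keys and the function name are my own choices for illustration. In a worksheet the same test could probably be written with COUNTIFS, for example =IF(COUNTIFS($B:$B,$B2,D:D,"N")=0,"Y","N"), assuming serial numbers are in column B and Market 2 in column D (the column letters are an assumption, not something stated in the question).

```python
# Example rows from the question (keys are illustrative names).
rows = [
    {"component": "1234", "serial": "100000000", "Market 1": "N", "Market 2": "Y", "Market 3": "N"},
    {"component": "1233", "serial": "100000001", "Market 1": "N", "Market 2": "Y", "Market 3": "Y"},
    {"component": "1235", "serial": "100000000", "Market 1": "Y", "Market 2": "N", "Market 3": "N"},
    {"component": "1236", "serial": "100000000", "Market 1": "N", "Market 2": "Y", "Market 3": "N"},
    {"component": "1231", "serial": "100000001", "Market 1": "Y", "Market 2": "Y", "Market 3": "Y"},
]


def fully_qualified(rows, market):
    """Map each serial number to "Y" if every one of its rows has "Y" in `market`, else "N"."""
    result = {}
    for row in rows:
        serial = row["serial"]
        row_ok = row[market] == "Y"
        # A serial stays "Y" only while every row seen so far is "Y".
        still_ok = result.get(serial, "Y") == "Y"
        result[serial] = "Y" if (row_ok and still_ok) else "N"
    return result


for market in ("Market 1", "Market 2", "Market 3"):
    print(market, fully_qualified(rows, market))
```

A single pass over the rows is enough, so a table with a few thousand rows is not a problem for this kind of check.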

Finding the optimal selections of x number per column and y numbers per row of an NxM array

Given an NxM array of positive integers, how would one go about selecting integers so that the sum of the selected values is maximized, with at most x selections in each row and at most y selections in each column? This is an abstraction of a problem I am facing in making NCAA swimming lineups. Each swimmer has a time in every event that can be converted to an integer using the USA Swimming Power Points Calculator (the higher the better). Once those times are converted, I want to assign no more than 3 swimmers per event and no more than 3 races per swimmer, such that the total sum of power scores is maximized. I think this is similar to the weapon-target assignment (WTA) problem, but that problem allows a weapon type to attack the same target more than once (in my case, allowing a single swimmer to race the same event twice), and that does not work for my use case. Does anybody know what this variation on the WTA problem is called, and if so, do you know of any solutions or resources I could look to?
Here is a mathematical model:
Data
Let a[i,j] be the data matrix
and
x: max number of selected cells in each row
y: max number of selected cells in each column
(Note: this is a bit unusual: we normally reserve the names x and y for variables. These conventions can help with readability).
Variables
δ[i,j] ∈ {0,1} are binary variables indicating if cell (i,j) is selected.
Optimization Model
max sum((i,j), a[i,j]*δ[i,j])
sum(j,δ[i,j]) ≤ x ∀i
sum(i,δ[i,j]) ≤ y ∀j
δ[i,j] ∈ {0,1}
This can be fed into any MIP solver.
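To see the objective and the two constraint families in action without installing a solver, here is a brute-force sketch on a made-up 3x3 matrix. It enumerates every 0/1 choice of δ, so it is only usable for tiny arrays; it is an illustration of the model, not a substitute for a MIP solver. The function name and the data are hypothetical.

```python
from itertools import product


def best_selection(a, x, y):
    """Exhaustively search all 0/1 selections delta[i][j].

    Keeps selections with at most x picks per row and at most y picks
    per column, and returns (best value, best delta).
    Exponential in the number of cells: for tiny examples only.
    """
    n, m = len(a), len(a[0])
    best_value, best_delta = -1, None
    for bits in product((0, 1), repeat=n * m):
        delta = [bits[i * m:(i + 1) * m] for i in range(n)]
        # Row constraint: sum over j of delta[i][j] <= x for every i.
        if any(sum(row) > x for row in delta):
            continue
        # Column constraint: sum over i of delta[i][j] <= y for every j.
        if any(sum(delta[i][j] for i in range(n)) > y for j in range(m)):
            continue
        value = sum(a[i][j] * delta[i][j] for i in range(n) for j in range(m))
        if value > best_value:
            best_value, best_delta = value, delta
    return best_value, best_delta


# Made-up scores: rows could be swimmers, columns events.
a = [[5, 1, 3],
     [4, 6, 2],
     [7, 8, 9]]
value, delta = best_selection(a, x=1, y=1)
print(value)
for row in delta:
    print(row)
```

For the real lineup problem (dozens of swimmers and events, limits of 3 and 3) the same objective and constraints would be handed to a MIP solver instead of being enumerated.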

Question(s) regarding computational intensity, prediction of time required to produce a result

Introduction
I have written code to give me a set of numbers in '36 by q' format ( 1<= q <= 36), subject to following conditions:
Each row must use numbers from 1 to 36.
No number must repeat itself in a column.
Method
The first row is generated randomly. Each number in every following row is checked against the above conditions. If a number fails to satisfy one of the given conditions, it doesn't get picked again for that specific place in that specific row. If the code runs out of acceptable values, it starts over again.
Problem
Low q values are fast (say 15, which takes less than a second to compute), but the main objective is q=36. It has been more than 24 hours since it started to run for q=36 on my PC.
Questions
Can I predict the time required by it using the data I have from lower q values? How?
Is there any better algorithm to perform this in less time?
How can I calculate the average number of cycles it requires? (using combinatorics or otherwise).
Can I predict the time required by it using the data I have from lower q values? How?
Usually, you should be able to determine the running time of your algorithm in terms of its input. Refer to big O notation.
If I understood your question correctly, you shouldn't need to spend hours computing a 36x36 matrix satisfying your conditions. Most probably you are stuck in an infinite loop or something similar. It would be clearer if you could share a code snippet.
Is there any better algorithm to perform this in less time?
Well, I tried to do what you described, and it runs in O(q) time (treating the row length of 36 as a constant).

    import random


    def rotate(arr):
        """Return arr shifted cyclically one position to the right."""
        return arr[-1:] + arr[:-1]


    n = 36  # every row uses the numbers 1..n
    q = 36  # number of rows to produce (1 <= q <= n)

    # Random first row: a permutation of 1..n.
    row = random.sample(range(1, n + 1), n)

    # Each further row is the previous row rotated by one position, so a
    # number never lands in the same column twice as long as q <= n.
    res = [row]
    for _ in range(q - 1):
        row = rotate(row)
        res.append(row)

Basically, I pick a random permutation of {1..36} as the first row, then rotate it q-1 times and use each rotation as the next row.
This guarantees both conditions you have mentioned.
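The two conditions from the question can also be checked mechanically. Here is a small sketch of such a check, applied to rows built by rotating a random first row; the helper name is_valid is my own, not something from the original code.

```python
import random


def is_valid(rows, n=36):
    """Check both conditions: each row uses exactly 1..n, no repeats within a column."""
    full = set(range(1, n + 1))
    rows_ok = all(len(r) == n and set(r) == full for r in rows)
    # zip(*rows) iterates over columns; a column is fine if all entries differ.
    cols_ok = all(len(set(col)) == len(rows) for col in zip(*rows))
    return rows_ok and cols_ok


# Build 36 rows as cyclic rotations of one random permutation.
first = random.sample(range(1, 37), 36)
rows = [first[-k:] + first[:-k] for k in range(36)]

print(is_valid(rows))            # rotated rows satisfy both conditions
print(is_valid([first, first]))  # repeating a row violates the column condition
```

Running the same check on the output of the retry-based generator for small q is a cheap way to confirm that it is producing what is intended before waiting on q=36.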
How can I calculate the average number of cycles it requires?( Using combinatorics or otherwise).
If you cannot calculate the computation time in terms of the input (because the code is too complex), then fitting a curve seems to be the right approach.
Or you could create an ML model with iterations as data and the time for each iteration as label and perform linear regression. But that seems to be overkill in your example.
Graph q vs time.
Fit a curve.
Extrapolate to q = 36.
You might want to also graph q vs log(time) as that may give an easier fitted curve.
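As a rough sketch of the fit-and-extrapolate idea, the code below fits a straight line to log(time) against q by least squares and extrapolates to q = 36. The timings are NOT measurements: they are generated from a formula purely as placeholders, and the straight-line-in-log form (exponential growth) is an assumption that should be checked against a plot of the real data.

```python
import math


def fit_log_linear(qs, times):
    """Least-squares fit of log(time) = slope * q + intercept."""
    logs = [math.log(t) for t in times]
    n = len(qs)
    mean_q = sum(qs) / n
    mean_log = sum(logs) / n
    sxx = sum((q - mean_q) ** 2 for q in qs)
    sxy = sum((q - mean_q) * (lg - mean_log) for q, lg in zip(qs, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_q
    return slope, intercept


def extrapolate(q, slope, intercept):
    """Predicted time at q under the fitted log-linear model."""
    return math.exp(slope * q + intercept)


# Placeholder data generated from a formula (replace with measured timings).
qs = [10, 12, 14, 16, 18]
times = [math.exp(0.5 * q - 6) for q in qs]

slope, intercept = fit_log_linear(qs, times)
estimate = extrapolate(36, slope, intercept)
print(slope, intercept, estimate)
```

If the points on the q vs log(time) plot still bend upwards, the growth is faster than exponential and this estimate will be too optimistic.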

How to calculate with the Poisson-Distribution in Matlab?

I've used Excel in the past, but the calculations involving the Poisson distribution took a while, which is why I switched to SQL. I soon realized that SQL might not be a proper solution for statistical problems. Finally I decided to switch to Matlab, but I'm not used to it at all. My problem is the following:
I’ve imported a .csv-table and have two columns with values, let’s say A and B (110 x 1 double)
These values both are the input values for my Poisson-calculations. Since I wanna calculate for at least the first 20 events, I’ve created a variable z=1:20.
When I now calculate, let's say,
New = poisspdf(z,A),
it says something like non-scalar arguments must match in size.
z only has 20 records but A and B both have 110 records. So I've expanded Z = 1:110 and transposed it:
Znew = Z.'
When I now try to execute the actual calculation:
Results = poisspdf(Znew,A).*poisspdf(Znew,B)
I always get only a 110x1 vector, but what I want is a 20x20 matrix for each record of A and B (based on my actual choice of z=1:20; I only changed to z=1:110 because Matlab told me that they need to match in size).
So in this 20x20 matrix, each cell should hold the result of a slightly different calculation (poisspdf(Znew,A).*poisspdf(Znew,B)).
For example in the first cell (1,1) I want to have the result of
poisspdf(0,value of A).*poisspdf(0,value of B),
in cell(1,2): poisspdf(0,value of A).*poisspdf(1,value of B),
in cell(2,1): poisspdf(1,value of A).*poisspdf(0,value of B),
and so on...assuming that it's in the format cell(row, column)
Finally I want to sum up certain parts of each 20x20 matrix and show the result of the summed up parts in new columns.
Is there anybody able to help? Many thanks!
EDIT:
Poisson Matrix in Excel
In Excel there is Poisson-function: POISSON(x, μ, FALSE) = probability density function value f(x) at the value x for the Poisson distribution with mean μ.
In e.g. cell AD313 in the table above there is the following calculation:
=POISSON(0;first value of A;FALSE)*POISSON(0;first value of B;FALSE)
In cell AD314:
=POISSON(1;first value of A;FALSE)*POISSON(0;first value of B;FALSE)
In cell AE313:
=POISSON(0;first value of A;FALSE)*POISSON(1;first value of B;FALSE)
and so on.
I am not sure if I completely understand your question. I wrote this code that might help you:
    clear; clc
    % These are the lambda parameters for the Poisson distributions
    lambdaA = 100;
    lambdaB = 200;
    % Generating Poisson data here
    A = poissrnd(lambdaA,110,1);
    B = poissrnd(lambdaB,110,1);
    % Get the first 20 samples
    zA = A(1:20);
    zB = B(1:20);
    % Perform the calculation
    results = repmat(poisspdf(zA,lambdaA),1,20) .* repmat(poisspdf(zB,lambdaB)',20,1);
    % Sum
    sumFinal = sum(results,2);
Let me know if this is what you were trying to do.
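Reading the question's cell layout literally, the 20x20 table for one pair of means is the outer product of two Poisson probability vectors over the counts 0..19: cell (row i, column j) holds P(i; A) * P(j; B). Below is a language-neutral sketch of that in Python, with made-up means 1.5 and 0.8 standing in for one record of A and B. In MATLAB the equivalent is presumably poisspdf((0:19)', a) * poisspdf(0:19, b) for scalar a and b, repeated for each of the 110 records.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events for mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def joint_table(lam_a, lam_b, size=20):
    """size x size table with entry [i][j] = P(i; lam_a) * P(j; lam_b)."""
    pa = [poisson_pmf(k, lam_a) for k in range(size)]
    pb = [poisson_pmf(k, lam_b) for k in range(size)]
    return [[pa[i] * pb[j] for j in range(size)] for i in range(size)]


# Hypothetical means standing in for one record of A and B.
table = joint_table(1.5, 0.8)

# Example of "summing certain parts": all cells where the row count
# exceeds the column count, the diagonal, and the remaining cells.
size = len(table)
row_greater = sum(table[i][j] for i in range(size) for j in range(size) if i > j)
equal = sum(table[i][i] for i in range(size))
col_greater = sum(table[i][j] for i in range(size) for j in range(size) if i < j)
print(row_greater, equal, col_greater)
```

Looping joint_table over the 110 pairs and storing the three sums per record would give the new result columns described in the question.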

How to produce a table of three inputs to reach a given output? (Excel model)

I have a very detailed excel model to calculate the profitability of a project, that we can call P.
The model has been simplified to compute from 3 unrelated variables. I would like to automatically create a table that shows how inputs A, B and C might vary in order to produce a pre-defined level of profitability, P. For instance, if A = 4 & B = 30, then C must = 2 in order for P to equal 20%. Likewise, if A = 5 & B = 25, then C must = 3 in order for P to equal 20%. A and B should be tested at sensible increments, perhaps 8 intervals each.
A laborious (not scalable) equivalent would be to manually define A and B, then goal-seek C to our pre-defined level of P - we'd then repeat for each combination of A and B at the given intervals and record in a two-way table.
I believe a conventional two-way data table would be practical if the model sitting behind the inputs were greatly simplified; unfortunately this isn't possible.
Thanks to anyone that can lend a hand. Kind regards.
I think the best way to approach this is with a VBA macro and the built-in GoalSeek method, something like this (P is in cell D1 and C in cell C1):

    Range("D1").GoalSeek Goal:=20, _
        ChangingCell:=Range("C1")
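The overall procedure (loop over the A and B grid, goal-seek C for each pair, record the result in a two-way table) can be sketched outside Excel as follows. The profitability function here is a made-up stand-in for the workbook model, and the bisection search assumes P changes monotonically with C over the search interval; neither is taken from the question.

```python
def goal_seek(f, target, lo, hi, tol=1e-9, max_iter=200):
    """Bisection search for c in [lo, hi] with f(c) == target.

    Assumes f is monotonic on [lo, hi]; returns None if the target
    is not bracketed by f(lo) and f(hi).
    """
    f_lo = f(lo) - target
    if f_lo * (f(hi) - target) > 0:
        return None
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid) - target
        if abs(f_mid) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


def profitability(a, b, c):
    """Hypothetical stand-in for the detailed Excel model."""
    return a * c / (b + 10.0)


target = 0.20              # required profitability P
a_values = [4, 5, 6]       # increments to test for A
b_values = [20, 25, 30]    # increments to test for B

# Two-way table: required C for every (A, B) combination.
table = {
    (a, b): goal_seek(lambda c, a=a, b=b: profitability(a, b, c), target, 0.0, 100.0)
    for a in a_values
    for b in b_values
}

for (a, b), c in sorted(table.items()):
    print(a, b, round(c, 4))
```

In VBA the same structure would be two nested loops that write A and B into their input cells, call GoalSeek on the P cell, and copy the resulting C into the output table.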
