How can I generate a list of linked data with SQL - aggregation

I've got a table with this kind of data:
I would like to get this result with a query:
Do you have any idea how to achieve this?
I know that I need to use XMLAGG somewhere for the final concatenation, but I don't know how to group A, B, C and D. The rule is that because 1 has A & B and 2 has B & C, A is linked to C, and so on.
Thanks
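The grouping rule described above (values are linked when they share an id, directly or through a chain of other ids) is a connected-components problem. Below is a minimal Python sketch of that logic using union-find, only to illustrate the grouping step before any concatenation. The sample rows are hypothetical, since the original table is not shown, and the name linked_groups is made up for this example.

```python
from collections import defaultdict


def linked_groups(rows):
    """Group values that are linked, directly or transitively, by sharing an id."""
    parent = {}

    def find(x):
        # Return the representative of x's group, registering x if it is new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_value_for_id = {}
    for row_id, value in rows:
        find(value)
        if row_id in first_value_for_id:
            # Every value under the same id joins the group of that id's first value.
            union(first_value_for_id[row_id], value)
        else:
            first_value_for_id[row_id] = value

    groups = defaultdict(set)
    for value in parent:
        groups[find(value)].add(value)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical sample data: (id, value) pairs, not taken from the original table.
rows = [
    (1, "A"), (1, "B"),
    (2, "B"), (2, "C"),
    (3, "C"), (3, "D"),
    (4, "E"),
    (5, "E"), (5, "F"),
]
groups = linked_groups(rows)
print(groups)
```

Each inner list is one group of linked values; concatenating its members gives the kind of aggregated string the question asks for.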

Related

How to produce a table of three inputs to reach a given output? (Excel model)

I have a very detailed excel model to calculate the profitability of a project, that we can call P.
The model has been simplified to compute from 3 unrelated variables. I would like to automatically create a table that shows how inputs A, B and C might vary in order to produce a pre-defined level of profitability, P. For instance, if A = 4 & B = 30, then C must = 2 in order for P to equal 20%. Likewise, if A = 5 & B = 25, then C must = 3 in order for P to equal 20%. A and B should be tested at sensible increments, perhaps 8 intervals each.
A laborious (not scalable) equivalent would be to manually define A and B, then goal-seek C to our pre-defined level of P - we'd then repeat for each combination of A and B at the given intervals and record in a two-way table.
I believe a conventional two-way data table would be practical if the model sitting behind the inputs were greatly simplified, but unfortunately that isn't possible.
Thanks to anyone that can lend a hand. Kind regards.
I think the best way to approach this is a VBA macro that uses the built-in GoalSeek method, something like this (P is in cell D1):
Range("D1").GoalSeek Goal:=20, _
    ChangingCell:=Range("C1")
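To show the overall loop the question describes (fix A and B, solve for C, record the result in a two-way table), here is a rough Python sketch. It is not the VBA solution itself: the profitability function is a hypothetical stand-in for the spreadsheet model, the grid values are made up, and the goal seek is done by simple bisection, which assumes P increases with C over the search interval.

```python
def profitability(a, b, c):
    # Hypothetical stand-in for the spreadsheet model; not from the original post.
    return (a * c + b / 10.0) / 100.0


def goal_seek_c(a, b, target, lo=0.0, hi=100.0, tol=1e-9):
    """Find c in [lo, hi] where profitability(a, b, c) equals target, by bisection.

    Assumes profitability is continuous and increasing in c on the interval.
    Returns None when the target is not bracketed by lo and hi.
    """
    if profitability(a, b, lo) > target or profitability(a, b, hi) < target:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if profitability(a, b, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical grids: 8 increments each for A and B.
a_values = [3 + 0.5 * i for i in range(8)]
b_values = [20 + 2.5 * i for i in range(8)]

# Two-way table: the C needed for P = 20% at each (A, B) combination.
table = {(a, b): goal_seek_c(a, b, 0.20) for a in a_values for b in b_values}

for a in a_values:
    print(a, [round(table[(a, b)], 3) for b in b_values])
```

In VBA the same structure would be two nested loops that write A and B into their input cells and call GoalSeek once per combination.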

Matching two columns to get the desired value

A C D
12:58:09 12:58:09 400.9
12:58:16 12:58:10 468.0
12:58:20 12:58:11 425.9
12:58:34 12:58:12 432.4
12:58:38 12:58:13 439.3
12:58:49 12:58:14 442.5
12:58:53 12:58:15 445.2
12:58:56 12:58:16 447.2
12:59:00 12:58:17 449.7
12:59:04 12:58:18 450.4
12:59:07 12:58:19 453.9
12:59:11 12:58:20 454.3
I have a data set like this. I want to make a new helper column B that looks up each value of column A in column C and returns the corresponding value from column D. So my B should look like 400.9, 447.2, 454.3, and so on. Can anyone suggest what approach I should use for this problem? Thanks!
Put this in column B and drag it down:
=VLOOKUP(A1,$C$1:$D$100,2,FALSE)
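For anyone who wants to see what the exact-match lookup does outside Excel, here is a small Python sketch using the data from the question. A dictionary plays the role of the C:D lookup range. Times in A that have no match in C come back as None here, where VLOOKUP with FALSE would return #N/A.

```python
# Column C (lookup key) mapped to column D (value), copied from the question.
c_to_d = {
    "12:58:09": 400.9, "12:58:10": 468.0, "12:58:11": 425.9,
    "12:58:12": 432.4, "12:58:13": 439.3, "12:58:14": 442.5,
    "12:58:15": 445.2, "12:58:16": 447.2, "12:58:17": 449.7,
    "12:58:18": 450.4, "12:58:19": 453.9, "12:58:20": 454.3,
}

# Column A, copied from the question.
a_times = [
    "12:58:09", "12:58:16", "12:58:20", "12:58:34", "12:58:38", "12:58:49",
    "12:58:53", "12:58:56", "12:59:00", "12:59:04", "12:59:07", "12:59:11",
]

# Helper column B: exact match of A against C, returning D (None when absent).
b_values = [c_to_d.get(t) for t in a_times]
print(b_values)
```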

How to match the ordering and sorting of multiple columns in Excel

I have data that look like this (going on for many more rows):
What I want to do is:
Match the relationship of C and G to the relationship of I and J.
For example, I:Q1652 matches up with J:Q1662; therefore, C:Q1652 should also match up with G:Q1662.
At the same time, A & B and E & F should maintain their relationships with C and G, respectively
For example, when C:Q1652 and G:Q1662 are being matched, they should carry with them their respective rows/values from columns A & B and E & F.
Please let me know if there's anything more I can clarify! Thanks!
Enter the following formulas in cells K1:N1:
K1: =INDEX(A:A,MATCH($I1,$C:$C,0))
L1: =INDEX(B:B,MATCH($I1,$C:$C,0))
M1: =INDEX(E:E,MATCH($J1,$G:$G,0))
N1: =INDEX(F:F,MATCH($J1,$G:$G,0))

How to check if elements of macro are in another macro

I have a problem that seems simple but turns out to be more involved in practice. In Python, for example, it seems like it would be much more straightforward, but I would really like to learn how to do this in Stata.
Say that I have a big dataset. I have several string variables, S1, S2, and S3. I get a subset of S1 based on some criteria. Let's say that this gets me (after sorting and only the data of interest are displayed):
S1
1 A
2 B
3 C
4 D
5 E
Based on different criteria, I get, for S2:
S2
1 B
2 B
3 C
4 F
For S3:
S3
1 B
2 Long string
What I am interested in doing is to get a list of all of the distinct values across S1, S2, and S3. One way I have thought about doing this is:
1. Save all desired values of S1 into a macro, M1. I didn't figure out how one is able to do this.
2. Save all desired values of S2 into a macro, M2.
3. Check if the values of M2 are in M1. Do not add the values of M2 to M1 that are already in M1, but do add the values of M2 to M1 that are not already there. It seems like this post is similar to how to do this step. (Why is there a : in front of list?)
4. Repeat step 3, except for S3/M3 instead of S2/M2.
This would produce the macro M1 with values:
A B C D E F Long String
Note that I do not need this to be in a macro. If it could be in a matrix or some other way, that would work as well. The important part is to get the information.
Several ways to do this.
Many assumptions made in this example (many things are not clear in your post):
clear
set more off
input ///
str15(s1 s2 s3)
a "b" "b"
b "b" "long string"
c "c" ""
d "f" ""
e "" ""
end
list
stack s*, into(news) clear
bysort news : keep if _n == 1
drop _stack
list
If you want to work your way through, using macros, then help macrolists and help levelsof can aid:
clear
set more off
input ///
str15(s1 s2 s3)
a "b" "b"
b "b" "long string"
c "c" ""
d "f" ""
e "" ""
end
list
local uvalues
foreach var of varlist _all {
    levelsof `var', local(loc`var')
    local uvalues : list uvalues | loc`var'
}
display `"`uvalues'"'
Saying more about how your variables are organized (e.g. one or several files), whether you care or not to destroy the original data set, the treatment of missings, etc. can probably get you an ad hoc answer.
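Since the question mentions Python for comparison, here is roughly what the same "union of distinct values" step looks like there. This is only a sketch on the example data above, with empty strings treated as missing and skipped. The function name distinct_values is made up for this illustration.

```python
# Same example data as in the Stata code above; "" stands for a missing value.
s1 = ["a", "b", "c", "d", "e"]
s2 = ["b", "b", "c", "f", ""]
s3 = ["b", "long string", "", "", ""]


def distinct_values(*columns):
    """Return the distinct non-empty values across all columns, in first-seen order."""
    seen = {}  # dict keys keep insertion order, so this works as an ordered set
    for column in columns:
        for value in column:
            if value != "":
                seen.setdefault(value, None)
    return list(seen)


uvalues = distinct_values(s1, s2, s3)
print(uvalues)
```

Like the macro-list union in Stata, a value is added only when it is not already present, and a multi-word value such as "long string" stays a single element.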

String matching on two columns in [R]

I am looking to match multiple string criteria and then subset the row in R, using grepl to find the match. I have found a nice solution from another post where some specific code is used (but you get the idea): subset(GEMA_EO5, grepl(paste(l, collapse="|"),GEMA_EO5$RefSeq_ID))
I am wondering if it is possible to grepl in two columns, instead of just RefSeq_ID in the example above. That is, in grepl via any other method. In other words, I would like to look for the options in l not just in one column, but in two (or however many). Is this possible?
E.g.: 3 columns, a, b and c. I would like to set the criteria such that T (rows 3 and 4) is selected, despite the format "T I" in (3,b). It should identify both (4,a) and (3,b), hence the link to the previous question. I want it to look in column a AND column b, not one or the other.
a b c
A A C P L
V V B W E E
W T I P J G
T W P J
Here's some demo data to show how this works:
set.seed(1234)
dat <- data.frame(A = sample(letters[1:3],10,TRUE),
B = sample(letters[1:3],10,TRUE))
Using [ to subset makes this a lot more clear in my opinion - we can use grepl to give a logical vector based on a match, and use | to combine two tests (on multiple columns). If you wanted a subset of all the rows that contained an 'a' in either column:
dat.a <- dat[with(dat, grepl("a", A)|grepl("a", B)),]
A B
1 b a
2 b a
3 a c
5 a a
9 a a
