Sorry, I don't know the right terms, but I'll try to explain my task.
In Calc (a spreadsheet program) I have two worksheets with columns like this:
| ID|
| 32|
| 51|
| 51|
| 63|
| 70|
and
| ID|Name |
| 01|name1 |
| 02|name2 |
...
| 69|name69 |
| 70|name70 |
I need to combine these, looking up the name for each ID, like this:
| ID|Name |
| 32|name32 |
| 51|name51 |
| 51|name51 |
| 63|name63 |
| 70|name70 |
I have no idea how to start solving this. Please help!
Thank you #PsysicalChemist, the VLOOKUP function works in Calc too.
I have dataframe like this:
+---+--------------------------------------+-----------+
| | envelopeid | message |
+---+--------------------------------------+-----------+
| 1 | d55edb65-dc77-41d0-bb53-43cf01376a04 | CMN.00002 |
| 2 | d55edb65-dc77-41d0-bb53-43cf01376a04 | CMN.00004 |
| 3 | d55edb65-dc77-41d0-bb53-43cf01376a04 | CMN.11001 |
| 4 | 5cb72b9c-adb8-4e1c-9296-db2080cb3b6d | CMN.00002 |
| 5 | 5cb72b9c-adb8-4e1c-9296-db2080cb3b6d | CMN.00001 |
| 6 | f4260b99-6579-4607-bfae-f601cc13ff0c | CMN.00202 |
| 7 | 8f673ae3-0293-4aca-ad6b-572f138515e6 | CMN.00002 |
| 8 | fee98470-aa8f-4ec5-8bcd-1683f85727c2 | TKP.00001 |
| 9 | 88926399-3697-4e15-8d25-6cb37a1d250e | CMN.00002 |
| 10| 88926399-3697-4e15-8d25-6cb37a1d250e | CMN.00004 |
+---+--------------------------------------+-----------+
I've grouped it with grouped = df.groupby('envelopeid')
I need to drop every group from the dataframe except those whose messages are only CMN.00002, or only CMN.00002 and CMN.00004.
Desired dataframe:
+---+--------------------------------------+-----------+
| | envelopeid | message |
+---+--------------------------------------+-----------+
| 7 | 8f673ae3-0293-4aca-ad6b-572f138515e6 | CMN.00002 |
| 9 | 88926399-3697-4e15-8d25-6cb37a1d250e | CMN.00002 |
| 10| 88926399-3697-4e15-8d25-6cb37a1d250e | CMN.00004 |
+---+--------------------------------------+-----------+
I tried:
(grouped.message.transform(lambda x: x.eq('CMN.00001').any() or (x.eq('CMN.00002').any() and x.ne('CMN.00002' or 'CMN.00004').any()) or x.ne('CMN.00002').all()))
but it does not work properly.
Try:
grouped = df.loc[df['message'].isin(['CMN.00002', 'CMN.00004'])].groupby('envelopeid')
Try this: df[df.message == 'CMN.00002']
So I figured it out: I used filter instead of transform.
outdf = df.groupby('envelopeid').filter(lambda x: tuple(x.message) == ('CMN.00002',) or tuple(x.message) == ('CMN.00002', 'CMN.00004'))
The resulting dataframe contains only the groups whose messages are exactly CMN.00002, or CMN.00002 followed by CMN.00004. This is what I need.
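The tuple comparison above depends on the order of the rows inside each group. If the order cannot be relied on, a set comparison may be safer. The following is only a sketch under stated assumptions: the short envelope IDs ("a" to "f") are made-up stand-ins for the UUIDs in the question, and ALLOWED and keep_group are hypothetical names.

```python
import pandas as pd

# Shortened, hypothetical envelope IDs stand in for the UUIDs in the question.
df = pd.DataFrame(
    {
        "envelopeid": ["a", "a", "a", "b", "b", "c", "d", "e", "f", "f"],
        "message": [
            "CMN.00002", "CMN.00004", "CMN.11001", "CMN.00002", "CMN.00001",
            "CMN.00202", "CMN.00002", "TKP.00001", "CMN.00004", "CMN.00002",
        ],
    },
    index=range(1, 11),
)

# The only message combinations a group may contain (order does not matter).
ALLOWED = [{"CMN.00002"}, {"CMN.00002", "CMN.00004"}]


def keep_group(group):
    """Return True when the group's distinct messages match an allowed set."""
    return set(group["message"]) in ALLOWED


# filter() keeps all rows of every group for which keep_group returns True.
outdf = df.groupby("envelopeid").filter(keep_group)
print(outdf)
```

Group "f" lists CMN.00004 before CMN.00002 and is still kept. Note that a set also ignores repeated messages, so a group containing CMN.00002 twice would pass as well; keep the tuple version if that matters.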
I have a problem with a formula that I can't wrap my head around. When the same Object appears twice, I need the formula to return 1 on the row where the POP number is highest (which is always POP03). That part works, but the problem appears when an Object occurs only once: it should return 1 then as well, and I can't get that to work. What am I missing?
The sample data looks as follows:
+-------+------------+
| POP | Object |
+-------+------------+
| POP02 | B0005-8701 |
| POP02 | B0005-8702 |
| POP02 | B0005-8703 |
| POP02 | B0005-8704 |
| POP02 | B0006-4359 |
| POP02 | LBK-0013 |
| POP03 | LBK-0017 |
| POP02 | LBK-0017 |
| POP03 | LBK-0018 |
| POP02 | LBK-0018 |
| POP03 | LBK-0019 |
| POP02 | LBK-0019 |
| POP03 | LBK-0020 |
| POP02 | LBK-0020 |
| POP03 | LBK-0021 |
| POP02 | LBK-0021 |
+-------+------------+
The formula I use is as follows (POP is in column B and Object in column C):
=IF(C2="";"";IF(C2=C3;IF(Q2<Q3;0;IF(Q2>Q3;1;))))
I would use COUNTIFS like this:
=IF(B2="","",IF(COUNTIFS(C$2:C$20,C2,B$2:B$20,">"&B2)=0,1,""))
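To make the COUNTIFS logic easier to follow, here is a rough Python sketch of the same idea: for each row, count the rows with the same Object and a higher POP, and return 1 only when there are none. The name flag and the shortened sample list are assumptions for illustration, and the plain string comparison only approximates how the spreadsheet compares text.

```python
# A few (POP, Object) rows taken from the sample data in the question.
rows = [
    ("POP02", "B0005-8701"),
    ("POP02", "LBK-0013"),
    ("POP03", "LBK-0017"),
    ("POP02", "LBK-0017"),
    ("POP03", "LBK-0018"),
    ("POP02", "LBK-0018"),
]


def flag(pop, obj, all_rows):
    """Mirror COUNTIFS(C:C, obj, B:B, ">" & pop) = 0 -> 1, else empty string."""
    higher = sum(1 for p, o in all_rows if o == obj and p > pop)
    return 1 if higher == 0 else ""


flags = [flag(pop, obj, rows) for pop, obj in rows]
print(flags)
```

An Object that occurs only once has no row with a higher POP, so it gets a 1 without any special case.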
Say I have a column in a SparkSQL DataFrame like this:
+-------+
| word |
+-------+
| chair |
| lamp |
| table |
+-------+
I want to explode out all the prefixes like so:
+--------+
| prefix |
+--------+
| c |
| ch |
| cha |
| chai |
| chair |
| l |
| la |
| lam |
| lamp |
| t |
| ta |
| tab |
| tabl |
| table |
+--------+
Is there a good way to do this WITHOUT using udfs, or functional programming methods such as flatMap in spark sql? (I'm talking about a solution using the codegen optimal functions in org.apache.spark.sql.functions._)
Technically it is possible but I doubt it will perform any better than a simple flatMap (if performance is the reason to avoid flatMap):
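// Assumes a SparkSession named spark (e.g. spark-shell) and Spark 2.4+ for sequence.
import org.apache.spark.sql.functions.{explode, length, lit, sequence}
import spark.implicits._
val df = Seq("chair", "lamp", "table").toDF("word")
// One row per prefix length 1..length(word), then cut the word to that length.
df.withColumn("len", explode(sequence(lit(1), length($"word"))))
  .select($"word".substr(lit(1), $"len") as "prefix")
  .show()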
Output:
+------+
|prefix|
+------+
| c|
| ch|
| cha|
| chai|
| chair|
| l|
| la|
| lam|
| lamp|
| t|
| ta|
| tab|
| tabl|
| table|
+------+
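For readers who want to check what the sequence, explode, and substr combination computes, the same result can be sketched in plain Python. This is not a Spark solution and does not answer the "no flatMap" constraint; it only illustrates the intended output, and the variable names are assumptions.

```python
words = ["chair", "lamp", "table"]

# For each word, take the first n characters for n = 1..len(word),
# which corresponds to sequence(1, length(word)) followed by substr(1, n).
prefixes = [word[:n] for word in words for n in range(1, len(word) + 1)]

for prefix in prefixes:
    print(prefix)
```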
I have this data:
I need AutoFill to continue it in this order. I changed the cell format to:
Then I used AutoFill, but each number is repeated 9 or 10 times instead of 7:
How can I make AutoFill repeat each number exactly 7 times?
Try the following in the top cell,
'EN-US
=INT((ROW(1:1)-1)/ 7)+1
'RU-RU
=ЦЕЛОЕ((СТРОКА(1:1)-1)/ 7)+1
Fill down as necessary.
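The arithmetic behind the formula can be checked with a short Python sketch: subtracting 1 from the row number, integer-dividing by 7, and adding 1 gives the same value for every block of seven rows. The function name group_number and the 21-row range are assumptions chosen for illustration.

```python
def group_number(row, size=7):
    """Equivalent of INT((ROW - 1) / size) + 1 for a 1-based row number."""
    return (row - 1) // size + 1


# Rows 1..21 should give seven 1s, seven 2s and seven 3s.
values = [group_number(row) for row in range(1, 22)]
print(values)
```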
Just use the formulas:
+--------+
| 1 |
| =A1 |
| =A2 |
| =A3 |
| =A4 |
| =A5 |
| =A6 |
| =A7+1 |
| =A8 |
| =A9 |
| =A10 |
| =A11 |
| =A12 |
| =A13 |
| =A14+1 |
| =A15 |
| =A16 |
| =A17 |
| =A18 |
| =A19 |
| =A20 |
| =A21+1 |
| =A22 |
| =A23 |
| =A24 |
| =A25 |
| =A26 |
| =A27 |
| =A28+1 |
| =A29 |
| =A30 |
| =A31 |
| =A32 |
| =A33 |
| =A34 |
| =A35+1 |
| =A36 |
| =A37 |
| =A38 |
| =A39 |
| =A40 |
| =A41 |
+--------+
Put a 1 in cell A1. In A2 enter =A1 to take the value of the cell above, and continue the same way down to A7; then in A8 enter =A7+1.
This way you get seven 1's, seven 2's and so on.
I don't really know how to search for this question or an appropriate title, so I hope that this will make sense.
I'm trying to build an Excel spreadsheet to keep track of which functions of a piece of software currently have tests written for them. The spreadsheet looks something like the one below, where A-F are placeholders for the tests and 1-5 are placeholders for the functions.
| | A | B | C | D | E | F |
|:-:|---|---|---|---|---|---|
| 1 | X | | | | | X |
| 2 | | | | | | |
| 3 | | X | | | | |
| 4 | | | X | | | |
| 5 | | | | X | X | |
I would like to have another column at the end that would do something like this:
| | A | B | C | D | E | F | Tested? |
|:-:|---|---|---|---|---|---|---------|
| 1 | X | | | | | X | Yes |
| 2 | | | | | | | No |
| 3 | | X | | | | | Yes |
| 4 | | | X | | | | Yes |
| 5 | | | | X | X | | Yes |
where the final column is an IF statement that displays a conditional string based on whether there are any entries in the row. I know that Excel's IF statements work something like this: =IF(A1=10,"YES","NO"), but I can't think how I would construct an IF statement that prints YES or NO based on whether there are any entries at all in the row.
EDIT: To add a little more detail. I've thought about constructing an IF statement like this: =IF(SUM(C3:AI3)>0, "YES", "NO") and this works essentially if I use 1s or 0s instead of X or O but I'd rather use the latter. Or really I'd just rather use strings instead of integers.
You can use following formula:
=IF(COUNTA(A1:F1)>0,"Yes","No")
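You're looking for the ISBLANK function.
ISBLANK tests one cell at a time, so it has to be wrapped to check the whole row. Your solution should be something like this:
=IF(SUMPRODUCT(--NOT(ISBLANK(A1:F1)))>0,"Yes","No")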