Excel formula by combining 2 sheets - excel

I need help generating an Excel report. Can anyone please help me?
I have 2 Excel files. I have tried to paste their contents into the question.
file1: contains 3 columns -- ColA (URLs), ColB (hits, numeric), ColC (some data)
Column A                     Column B   Column C
----------------------------------------------------
$www.example1.com/ab         200        abc
file2: contains 2 columns -- ColA (URLs), ColB (hits, numeric)
URL                          Hits
-----------------------------------------
$www.something.com/dir/abc   1000
$www.example1.com/ab         100
$www.example2.com/cd         50
$www.example1.com/ab         100
Steps:
Take each URL in ColA of file1 and search for it in ColA (URL) of file2.
Suppose the search returns 10 matches. I need the sum of ColB (Hits) over all of those matches in file2,
placed in ColB of file1 for the first result.
Any hints would be helpful. I tried many options, but none of them worked.

This should be possible under the following conditions:
Both files are open
The URLs are written exactly the same way in both files
Then use a formula similar to this example:
=SUMIF([Name of file 2]NameOfSheet!$A$2:$A$6;A2;[Name of file 2]NameOfSheet!$B$2:$B$6)
Where $A$2:$A$6 is the range of URLs in file 2, A2 is the cell with the value in file 1, and $B$2:$B$6 is the range to be summed within file 2. The two ranges should cover the same rows. Depending on your regional settings, the argument separator may be a comma instead of a semicolon.
Hope this helps.
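To check the expected totals outside Excel, the same conditional-sum idea can be sketched in Python. This is only an illustration under stated assumptions: the function name sum_hits is hypothetical, and the rows are the sample data from the question typed in by hand rather than read from the workbooks.

```python
from collections import defaultdict


def sum_hits(rows):
    """Return a dict mapping each URL to the total of its hits."""
    totals = defaultdict(int)
    for url, hits in rows:
        totals[url] += hits
    return dict(totals)


# Sample rows of file 2 from the question: (URL, Hits)
file2 = [
    ("$www.something.com/dir/abc", 1000),
    ("$www.example1.com/ab", 100),
    ("$www.example2.com/cd", 50),
    ("$www.example1.com/ab", 100),
]

totals = sum_hits(file2)

# Look up the URL from file 1; the default of 0 mirrors what SUMIF
# returns when no row matches.
print(totals.get("$www.example1.com/ab", 0))
```

For the sample data, the two rows for $www.example1.com/ab add up to the 200 shown in ColB of file1.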

Related

Find and replace function in Alteryx -How it can be done in Azure Data Flow

I have a "Find and Replace" tool in Alteryx that finds values in a column of CSV file1 and replaces them using a lookup CSV file2, which has 2 columns:
Word and ReplacementWord.
Example:
Address is a column in CSV file1 with values like St.Xyz,NY,100067
And CSV file2 has
Word   ReplacementWord
NY     NewYork
ZBW    Zimbawe etc....
Now the final output should be
Address
St.Xyz,NewYork,100067
Please help.
I reproduced your scenario in my environment. To get the desired output I followed the steps below:
In the data flow activity I took 2 sources.
Source 1 is the file that contains the actual addresses.
Source 2 is the file that contains the country codes with their names.
After that I added a Lookup to merge the files on the country code. In the lookup condition I compared split(Address,',')[2] with column_1 of the 2nd source. The split expression breaks the address string on commas and takes the 2nd value, which is the country code (NY in St.Xyz,NY,100067).
[Screenshot: lookup data preview]
Then I added a Derived Column named Address with the expression replace(Address, split(Address,',')[2], Column_2). It replaces the value we split out in the lookup with the value of Column_2.
[Screenshot: derived column preview]
Then I added a Select and deleted the unwanted columns.
[Screenshot: select preview]
Finally I passed the result to the sink dataset.
[Screenshot: output]
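The split, lookup and replace steps above can be mimicked in a few lines of Python to check the logic on sample values before building the data flow. This is a rough sketch, not the data flow itself: replace_code is a hypothetical helper, and the lookup table is typed in from the question. As far as I know, data flow array indexes start at 1, so split(Address,',')[2] corresponds to parts[1] here.

```python
def replace_code(address, lookup):
    """Replace the second comma-separated field of address using lookup."""
    parts = address.split(",")
    # parts[1] is the 2nd field, i.e. split(Address, ',')[2] in the data flow
    if len(parts) > 1 and parts[1] in lookup:
        parts[1] = lookup[parts[1]]
    return ",".join(parts)


# Word -> ReplacementWord pairs from CSV file 2 in the question
lookup = {"NY": "NewYork", "ZBW": "Zimbawe"}

print(replace_code("St.Xyz,NY,100067", lookup))
```

One difference worth noting: the sketch replaces only the second field, whereas replace() on the whole Address string substitutes every occurrence of the code, including any that happen to appear inside the street name.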

How to select a column from a text file which has no header using python

I have a tabulated text file. When I open it in Python using pandas, it shows only one column, although the file contains many. I've tried pd.DataFrame, sep= '\s*' and sep= '\t', but I can't select a column since there is only one. I've even tried specifying the header, but the header moves to the far right and the whole file is still treated as a single column. I've also tried the .loc method with a specific column number, but it always returns rows. I want to select the first column (A, A), third column (HIS, PRO) and fourth column (0, 0).
I want to get these specific columns and write them to a CSV file.
Here is the code I have used along with some file components.
1) After opening the file using pd:
[599 rows x 1 columns]
2) The file format:
pdb_id: 1IHV
0 radii_filename: MD_threshold: 4
1 A 20 HIS 0 MaximumDistance
2 A 21 PRO 0 MaximumDistance
3 A 22 THR 0 MaximumDistance
Any help will be highly appreciated.
3) code:
import pandas as pd
df= pd.read_table("file_path.txt", sep= '\t')
U= df.loc[:][2:4]
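If anybody gets a file like this, it can be opened and the columns selected using the following code:
cols = [0, 2, 3]  # zero-based indices of the columns to select
with open('file.txt', 'r') as f:
    lines = f.readlines()
result = []
for x in lines:
    fields = x.split()  # split on any run of spaces or tabs
    if len(fields) > max(cols):  # skip header lines with too few fields
        result.append([fields[i] for i in cols])
for w in result:
    s = '\t'.join(w)
    print(s)
Where cols lists the columns you want to select.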
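Since the question asks for CSV output, here is a small sketch of that last step using Python's csv module. It rests on a few assumptions: the function name select_columns is made up for illustration, the sample lines are copied from the question without the pandas index numbers, and the result is written to an in-memory buffer so the example runs without any file on disk.

```python
import csv
import io


def select_columns(lines, cols):
    """Keep the whitespace-separated fields at the given zero-based indices."""
    selected = []
    for line in lines:
        fields = line.split()
        if len(fields) > max(cols):  # skip header lines with too few fields
            selected.append([fields[i] for i in cols])
    return selected


# Sample lines from the question
sample = [
    "pdb_id: 1IHV",
    "radii_filename: MD_threshold: 4",
    "A 20 HIS 0 MaximumDistance",
    "A 21 PRO 0 MaximumDistance",
    "A 22 THR 0 MaximumDistance",
]

rows = select_columns(sample, [0, 2, 3])

# Write to an in-memory buffer; pass a real file object opened with
# newline="" to csv.writer to produce a file on disk instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```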

Trouble of finding matching values from a different workbook excel

I have two Excel files. This is what my data looks like in the first workbook, which could have 2000+ entries, all in General format.
A
1 5001987
2 1458285
3 2506588
4 4745089
5 2540486
.
.
My other Excel file looks like this. It is also in General format, but its data is generated by another tool, which produces output like this.
A
1 ['2506588']
2 ['2540181']
3 ['2553486']
4 ['2540181']
5 ['2540389']
6 ['2553384']
On a specific column somewhere, i have written this function:
=IF(VLOOKUP([outputbarcode.xlsx]Sheet1!$B$4,B2:B1992,2,TRUE),"Y","N")
I simply want it to check whether the value in cell A1 of Excel file 2 exists in Excel file 1: if it does, print Y, if not, N.
Running the function above returns #N/A
Is there something wrong with my function?
In Excel file 2, try:
=IFERROR(IF(INDEX(MATCH(VALUE(MID(A1,3,7)), Sheet1!A:A, 0),)>0, "Y"), "N")
Sheet1 refers to Excel file 1 here. MID(A1,3,7) extracts the 7 digits between the [' and '] characters, and VALUE converts them to a number so they can match the numeric entries. I prefer INDEX and MATCH to VLOOKUP. You can search why.
I suggest that you do an edit/replace and remove those odd characters permanently. Then you won't need the MID() function, but the rest of @Sangbok lee's answer will work fine, and that may help with future operations.
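The cleanup and lookup described in these answers can also be sketched in Python, which makes it easy to see what stripping the odd characters buys you. This is only an illustration: clean_barcode is a hypothetical helper, and the values are a few of the sample rows from the question.

```python
def clean_barcode(raw):
    """Strip the ['...'] wrapper and return the bare digits as an int."""
    return int(raw.strip("[]'\" "))


# Sample values from workbook 1 (plain numbers)
file1 = {5001987, 1458285, 2506588, 4745089, 2540486}

# Sample values from workbook 2 (wrapped in brackets and quotes)
file2 = ["['2506588']", "['2540181']", "['2553486']"]

# "Y" when the cleaned value exists in workbook 1, otherwise "N"
flags = ["Y" if clean_barcode(raw) in file1 else "N" for raw in file2]
print(flags)
```

Converting to int plays the same role as VALUE() in the formula: the text '2506588' only matches the number 2506588 after conversion.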

Compare multiple columns, pull out only cells that appear in every column

I have 10 or so columns in my worksheet. Each column contains about 200 names, and there is no other data on the sheet.
What I'd like to do is create a new column that contains only the names that are common to all the columns. So essentially, compare each cell in each column to all the other cells in all the other columns, and return only the common cells.
For example:
Column1 : name_A, name_C, name_F
Column2: name_C, name_B, name_D
Column3: name_C, name_Z, name_X
So in this example, the new column would only contain name_C, because it's the only value common to all three columns.
Is there any way to do this? My knowledge of Excel is quite poor, and I can't find anything similar to my problem online so I would appreciate any help.
Thanks for reading,
N
Putting everything on a single spreadsheet and creating a pivot table is probably more efficient than the algorithm you have in mind.
Here is my mock-up. I added extra names to demonstrate it better.
Column D(formula) has the easiest version. It lists only the values that appear in all columns, but they appear on the same rows as the corresponding names in column A, with blanks in between and unsorted (giving D(result)).
If you would like all the names to appear at the top, as shown here in column E, you can either sort your table (you will have to re-sort whenever the columns change) or use my solution below:
Get the MoreFunc add-in for Excel (here is the last working download link I found, and here is a good installation walk-through video).
Once that is done, select cells E1:E8, click the formula bar and type the following: =UNIQUEVALUES(IF(COUNTIF(A1:C8,A1:A8)=3,A1:A8,""))
Accept the formula by pressing Ctrl+Shift+Enter (this creates an array formula, and curly braces will appear around it).
    A       B       C       D(formula)                           D(result)  E(result - sorted)
-------------------------------------------------------------------------------------------------------
1 | name_A  name_C  name_C  =IF(COUNTIF($A$1:$C$8,A1)=3,A1,"")              name_m
2 | name_C  name_B  name_Z  =IF(COUNTIF($A$1:$C$8,A2)=3,A2,"")   name_C     name_C
3 | name_F  name_D  name_X  =IF(COUNTIF($A$1:$C$8,A3)=3,A3,"")
4 | name_t  name_o  name_g  =IF(COUNTIF($A$1:$C$8,A4)=3,A4,"")
5 | name_y  name_p  name_h  =IF(COUNTIF($A$1:$C$8,A5)=3,A5,"")
6 | name_u  name_k  name_7  =IF(COUNTIF($A$1:$C$8,A6)=3,A6,"")
7 | name_i  name_5  name_9  =IF(COUNTIF($A$1:$C$8,A7)=3,A7,"")
8 | name_m  name_m  name_m  =IF(COUNTIF($A$1:$C$8,A8)=3,A8,"")   name_m
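For comparison, the "common to every column" idea is a set intersection, which can be sketched in Python as follows. The function name common_names is hypothetical, and the columns are the three-column example from the question.

```python
def common_names(columns):
    """Return the sorted names that appear in every column."""
    if not columns:
        return []
    sets = [set(col) for col in columns]
    return sorted(set.intersection(*sets))


# The example columns from the question
columns = [
    ["name_A", "name_C", "name_F"],
    ["name_C", "name_B", "name_D"],
    ["name_C", "name_Z", "name_X"],
]

print(common_names(columns))
```

Unlike a COUNTIF(...)=3 test, the intersection is not thrown off when a name appears twice in one column, because each column is reduced to a set first.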

Excel - Looping through data till it finds the right one

I'm pulling data from one Excel table into another using an IF statement. I want it to check two fields and if it is a match I want it to print something and if it doesn't then I want it to continue searching. If there is no absolute match then leave the field blank.
I believe I'm running into a syntax issue, but after numerous iterations I still can't get it to pull everything over. Here is my current syntax.
=IF(BM5<>"External","",IF(AND(S5=VLOOKUP(A5,ExternalOnly,5,FALSE),A5=VLOOKUP(A5,ExternalOnly,1,FALSE)),S5,"")
The formula is missing a closing parenthesis. Add one more ')' at the end and see if this works, i.e. try this:
=IF(BM5<>"External","",IF(AND(S5=VLOOKUP(A5,ExternalOnly,5,FALSE),A5=VLOOKUP(A5,ExternalOnly,1,FALSE)),S5,""))
I use this:
INDEX($E$1:$E$7,MATCH(A7,$D$1:$D$7,0))
Here is a sample table to illustrate. The formula is in the cells of column B (B7 in this case). MATCH finds the position of our selected value (A7) in the target list ($D$1:$D$7), and INDEX() then uses that position to return the value from a different column ($E$1:$E$7) of the matched row.
A       B    C    D       E
------  ---  ---  ------  ---
011597  99        012062  3
012062  3         012142  8
012136  3         011597  99
012142  8         012136  3
014157  2         014157  2
011582  87        011582  87
011707  101       011707  101
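The way MATCH and INDEX work together can be mirrored in a short Python sketch: find the position of the key in one list, then read the same position from another. The helper name index_match is made up for illustration, and the lists hold columns D and E of the sample table.

```python
def index_match(key, keys, values):
    """Return the value at the position where key first appears in keys."""
    try:
        return values[keys.index(key)]  # MATCH gives the position, INDEX reads it
    except ValueError:
        return None  # Excel would show #N/A when there is no match


# Columns D and E of the sample table
col_d = ["012062", "012142", "011597", "012136", "014157", "011582", "011707"]
col_e = [3, 8, 99, 3, 2, 87, 101]

# Equivalent of INDEX($E$1:$E$7, MATCH(A7, $D$1:$D$7, 0)) with A7 = "011707"
print(index_match("011707", col_d, col_e))
```

The keys are kept as strings so that the leading zeros survive, which matters if the real cells are stored as text.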
