Using Excel 2013 to extract the country extension from a large series of URLs

I have a large list of URLs that I need to break down by a variety of factors. One factor is whether the URL is for a country other than the US. I have compiled a list of all the extensions I'm searching for within the workbook (e.g., .az, .mx, .nl).
Due to the variety in the URLs, simple extraction of the TLD won't work on all of them. For example:
b2 http://www.domainname.in
b3 http://www.domainname.co.in
b4 http://www.domainname.in/un.htm
b5 https://www.domainname.in/wp.content/_input_3_.txt
I was using =RIGHT(b2,LEN(b2)-FIND("*",SUBSTITUTE(b2,".","*",LEN(b2)-LEN(SUBSTITUTE(b2,".",""))))) to extract the TLD for URLs like the b2 example. However, it does not work for those with other endings. I was considering searching for the third "/", as in example b4, but longer URLs (like b5) prevent that.
Is there a way to identify whether a country extension is present? Furthermore, would it be possible to have the column next to it list what the country extension is, if one exists?
I currently have no working knowledge of macros or VBA.

OK, this is going to be crazy. We know that the country code is attached to the domain name, and is usually the last two letters. The first thing I would do is separate out the domain name. The formula would look something like this
(I put all the URLs in column A, so you'd have to fix the column reference):
=TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1,"//","/"),"/",REPT(" ",999)),COLUMNS($A:B)*999-998,999))
What this does is replace the double // with a single /, then use / as a delimiter and get the second piece, which is the domain name.
Next we need to find the country code of the domain, which you've already done, but I'll use a slightly different formula. If you had just the domain name in column A by itself, you could extract it like this:
=TRIM(RIGHT(SUBSTITUTE($A1,".",REPT(" ",LEN($A1))),LEN($A1)))
Now that we have those two formulas we can replace the $A1 reference of the second formula with the first formula so that we aren't using a bunch of cells to extract the code. The combined formula would look like this:
=TRIM(RIGHT(SUBSTITUTE(TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1,"//","/"),"/",REPT(" ",999)),COLUMNS($A:B)*999-998,999)),".",REPT(" ",LEN(TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1,"//","/"),"/",REPT(" ",999)),COLUMNS($A:B)*999-998,999))))),LEN(TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1,"//","/"),"/",REPT(" ",999)),COLUMNS($A:B)*999-998,999)))))
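For anyone who wants to check the logic outside Excel, here is a rough Python sketch of the same two steps (isolate the host, then take whatever follows its last dot), plus the check against the list of extensions that the question asks for. The function name and the sample extension set are made up for illustration, and it assumes every URL starts with a scheme such as http://, as in the examples.

```python
from urllib.parse import urlsplit

# Hypothetical sample of the extension list kept in the workbook.
COUNTRY_EXTENSIONS = {"az", "mx", "nl", "in"}


def country_extension(url, extensions=COUNTRY_EXTENSIONS):
    """Return the country extension of the URL's host, or "" if none matches."""
    # Step 1: isolate the host (the piece between "//" and the next "/").
    host = urlsplit(url).hostname or ""
    # Step 2: take the text after the last dot of the host.
    last_label = host.rsplit(".", 1)[-1].lower()
    # Step 3: keep it only if it is one of the listed extensions.
    return last_label if last_label in extensions else ""


if __name__ == "__main__":
    for sample in (
        "http://www.domainname.co.in",
        "https://www.domainname.in/wp.content/_input_3_.txt",
    ):
        print(sample, "->", country_extension(sample))
```

Because only the host is inspected, dots later in the path (as in the b5 example) do not affect the result.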
Hope that helps

Related

Extract a Word from String Containing a Specific Multiple Words

I'm setting up a pivot in Excel, and want to extract specific words from a data set of text.
I have tried using the formula below to extract one particular word, but want to nest multiple formulas to extract other words as well.
=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",99)),MAX(1,FIND("Evaluation",SUBSTITUTE(A1," ",REPT(" ",99)))-50),99))
The above formula works but only for one word. I want to create nested formula to search first word or second word or third...
If your goal is to search an array for a substring, see whether that substring matches any word in a list, and if so return the matched substring, as in the post suggested by JvdV, use the formula below, which I have modified.
I recommend adding, in a different worksheet, a table with a list of the words you want to find. Highlight the range of cells, including the header, then Home > Format as Table > pick a table style and give it a name. This table's name is "t_WordsToFind" (so I can easily identify it in other functions later). You may want to put your primary data into a table as well. My go-to name is usually "t_Data". Now, instead of worrying about column numbers/letters, you have the user-friendly column headers you started with, which makes reading the formula much easier. Your table ranges will also automatically expand when additional data is added, so row numbers don't need to be referenced any more either.
If you don't have your data in tables, use this version of the formula, and remember to update your range parameters when data is added. B2 is the first cell to be searched and D2:D4 is the list of words to look for; copy the formula down. I prefer not to use IFERROR, as it hides many different types of errors that I may need to know about, like a misspelled function name. If you simply need an alternative for when no match is returned and your function is valid, I recommend IFNA.
=IFNA(LOOKUP(1,1/COUNTIF(B2,"*"&$D$2:$D$4&"*"),$D$2:$D$4),"")
If you do use tables for your data and lookup lists (you are very wise), here is the formula version to use (below). In this example, [@[Search This Column]] is the equivalent of B2, and t_WordsToFind[Find This] is the table name and column name of the words to look for. It's much more legible, and doesn't need to be copied down or manually expanded in the future.
=IFNA(LOOKUP(1,1/COUNTIF([@[Search This Column]],"*"&t_WordsToFind[Find This]&"*"),t_WordsToFind[Find This]),"")
Even wiser still, assuming this is a perpetual need, would be to use Power Query/Power Pivot, but I don't want to send you into TMI overload.
Also, your pivot table range will be nice and easy, "t_Data".
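To make the wildcard test in the COUNTIF part a little more concrete, here is a small Python sketch of the same idea: check each listed word against the text as a case-insensitive substring. The function name and sample words are invented for illustration. Unlike the formula, which returns a single word, this sketch returns every listed word that matches, so it does not model which one LOOKUP picks when several match.

```python
def find_listed_words(text, words_to_find):
    """Return every listed word that occurs in text, ignoring case."""
    lowered = text.lower()
    # Equivalent in spirit to COUNTIF(cell, "*" & word & "*") > 0 per word.
    return [word for word in words_to_find if word.lower() in lowered]


if __name__ == "__main__":
    words = ["Evaluation", "Review", "Audit"]  # hypothetical word list
    print(find_listed_words("Annual evaluation and audit notes", words))
```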

Use INDEX/MATCH to select formula written as string, and enable it?

I have a price list, currently with 5 different categories of products. Each product has to have two different prices. Depending on the product and the type of price, the calculation will be different. Therefore I've used INDEX/MATCH to find the formula needed, from a table I created.
Below is a screenshot. I wanted to attach the Excel file, but can't seem to work out how.
Question: HOW do I then "run" the formula I fetched? I've tried different suggestions on using EVALUATE, but it doesn't seem to cut it. I've also tried INDIRECT on the whole formula, without success.
I would like to avoid any VBA for this case.
Can anybody provide some insight?
You could, but if I understand properly, the only thing changing in the formulas is the "multiplier" number, so it's better to look up that number instead of the whole formula. The other method (which would use EVALUATE etc.) is not considered "good practice" for a number of reasons.
EDIT:
I didn't see the 2nd varying value (since I was on the SO mobile app), but it's still not an issue since it would just be another target column. You could be thinking of the opposite: sometimes lookups based on multiple criteria can get complicated, but this is a matter of more data, as opposed to adding criteria for the lookup.
VLookup would have been the simplest method, like G2 could have been:
=VLOOKUP(E2, $J$4:$L$8, 2, False)
...to return the second column of range J4:L8 where the first column equals E2. (Then for the next required column, same formula except with 3 instead of 2.)
Since I wasn't sure whether more columns could be added one day, I allowed for that: instead of specifying "column 2 or 3" etc., the formula finds the column dynamically by name. (So the multiplier/factor used in G2 will change if you change the title in G1 to the name of a different column existing in the target data chart.)
For the sake of neatness, as well as the potential for additional columns like G & H, I moved the lookup table to a separate sheet. It can stay out of the way since you won't need to see or change it very often. (If the same chart were going to be referenced by many workbooks, you could even move it to a separate workbook and point all formulas at that, since it's always best to have one copy of identical data instead of many in different workbooks.)
Also to assist with potential future changes (and just to be tidier), instead of referring to the target table range addresses (like "J4:L8" etc.) I named two ranges:
the table of multiplier/factor data can be referred to by its address, or by myMultipliers
the titles of the same table are named myMultiplierTitles (used to match the titles of columns G & H on the original sheet)
Formula
After those changes, the lookup formula in G2 is:
=INDIRECT(VLOOKUP($E2,myMultipliers,MATCH(G$1,myMultiplierTitles,0),FALSE)&ROW())*VLOOKUP($E2,myMultipliers,MATCH(G$1,myMultiplierTitles,0)+1,FALSE)
INDIRECT returns the value of a cell that you refer to by name (text/string) as opposed to directly (as a range). For example:
=INDIRECT("A1")
returns the same as
=A1
...but with INDIRECT we can get the name from elsewhere (a cell, function or formula). So if x="A1" then =INDIRECT(x) returns the same as the 2 above examples.
Your original plan of storing the entire formula in a table as text could probably have been made to work with EVALUATE (INDIRECT on its own only resolves a reference; it does not evaluate formula text), but I think this way is considered better practice, partly because it facilitates easier future expansion.
The formula is longer than it would have been, but that's mostly because it's dynamically reading the field names. And size doesn't matter. :-)
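As a rough illustration of "look up the parameters, not the formula", here is a Python sketch of the same pattern: the table stores, per category and price type, which source column to read and which multiplier to apply, and one fixed calculation uses them. All names and numbers are made up; the real layout is whatever sits in myMultipliers.

```python
# Hypothetical table: category -> price type -> (source column, multiplier).
MULTIPLIERS = {
    "Category A": {"Price 1": ("cost", 1.25), "Price 2": ("cost", 1.5)},
    "Category B": {"Price 1": ("list", 0.5), "Price 2": ("cost", 2.0)},
}


def price(row, price_type, multipliers=MULTIPLIERS):
    """Compute a price from looked-up parameters instead of a stored formula."""
    # Same role as the two VLOOKUP/MATCH calls: find the pair by
    # category (row key) and price type (column title).
    source_column, factor = multipliers[row["category"]][price_type]
    # Same role as INDIRECT(...)*multiplier: read the named column, scale it.
    return row[source_column] * factor


if __name__ == "__main__":
    product = {"category": "Category A", "cost": 100.0, "list": 150.0}
    print(price(product, "Price 1"), price(product, "Price 2"))
```

Adding a new price type then means adding an entry to the table, with no change to the calculation itself.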

Code a huge formula into VBA combining IF and Vlookup

I need to reduce a file I am developing and I know one good way is using VBA. Unfortunately I am not advanced in VBA yet and I am struggling in designing this.
I have a list price organized in three different streams, and I want to combine in one like below:
Stream      Site         Brand Code  Price
Mainstream  Boston       Brand 01    Formula
Midstream   New York     Brand 02
Midstream   Los Angeles  Brand 02
Currently I am using a formula that basically does the following:
=IF(AND(stream="mainstream",Site="Boston"),VLOOKUP(Brandcode,list 1,2,0),IF(AND(stream="midstream",Site="Boston"),VLOOKUP(Brandcode,list 2,2,0),...))
The formula actually works just fine, the problem is I am testing many other conditions than just this one and thus the file is becoming very heavy, so I wanted to create a VBA code to create either a function or a subroutine, but I am struggling to understand how to do it.
Thanks
This can actually be achieved without using VBA by using named ranges instead. Looking at the formula, it's clear that your lookup range varies based on the combination of stream and site.
You can create a Named Range for each of these lookup ranges. To do so, highlight the range of cells that contains the first lookup group (let's say Midstream New York). Next, press CTRL + F3 to open up the Name Manager. Finally, give this group the name MidstreamNewYork. (Note: you cannot include spaces in the name).
Next, you can update your Vlookup function. You no longer have to include the IF(AND... component because your lookups will be dynamic. Let's say you're inputting a formula on Row 2, the formula would be:
=VLOOKUP(C2,INDIRECT(SUBSTITUTE(A2&B2, " ", "")),2,FALSE)
Let's break down the formula. (1) C2 is just the brand code (I assumed it was in column C). (2) The INDIRECT function treats a String as a Range. We are passing in cells A2 and B2, which are "Midstream" and "New York", respectively. We use the SUBSTITUTE function to remove the spaces since they aren't allowed. So now we are looking up a named range called MidstreamNewYork (sound familiar?). (3) The rest of the VLOOKUP is standard: look up the second column, and only match exactly.
Give it a try and let me know if it meets your requirements.
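If it helps to see the idea outside Excel, here is a small Python sketch of the same dispatch: build the range name from stream and site with spaces removed, then look the brand code up in that list. The dictionary contents and function name are invented purely for illustration.

```python
# Hypothetical price lists, one per stream/site combination, keyed the
# same way the named ranges are named (no spaces).
PRICE_LISTS = {
    "MainstreamBoston": {"Brand 01": 10.0, "Brand 02": 12.0},
    "MidstreamNewYork": {"Brand 01": 11.0, "Brand 02": 13.5},
    "MidstreamLosAngeles": {"Brand 01": 11.5, "Brand 02": 14.0},
}


def lookup_price(stream, site, brand_code, price_lists=PRICE_LISTS):
    """Pick the list by name, then look up the brand code in it."""
    # SUBSTITUTE(A2&B2, " ", "") builds the name; INDIRECT resolves it.
    range_name = (stream + site).replace(" ", "")
    # VLOOKUP(..., 2, FALSE): exact match on the brand code.
    return price_lists[range_name][brand_code]


if __name__ == "__main__":
    print(lookup_price("Midstream", "New York", "Brand 02"))
```

A missing list or brand code raises KeyError here, which is roughly where the Excel version would show #REF! or #N/A.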

Looking for an Excel formula combining VLOOKUP and SEARCH functions

I have a spreadsheet I am using for real estate where I want to be able to populate the name of a building based on an apartment's address. I need to create a formula that searches for certain information on a cell (probably the building number) and, based on that, looks at a table with building numbers and corresponding names and returns the appropriate value. I can't use a simple VLOOKUP based on the full address because, since they all contain apartment numbers, every address is unique.
I thought about combining the SEARCH function with LOOKUP, but that is not working for me so far.
Any thoughts on how to accomplish this?
Please try:
=VLOOKUP(VALUE(LEFT(D4,FIND(" ",D4)-1)),W:X,2,0)
copied down to suit.
Good point spotted by @Jerry:
=VLOOKUP(VALUE(LEFT(TRIM(D8),FIND(" ",D8,2)-1)),W:X,2,0)
(though any leading space is not necessarily going to be removed by TRIM).
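The same steps, sketched in Python for anyone who wants to test the logic on a few addresses first: strip stray leading spaces, take everything before the first space as the building number, and look that number up. The building table and function name are hypothetical, and the sketch assumes every address begins with the building number.

```python
# Hypothetical stand-in for the building number/name table in W:X.
BUILDINGS = {123: "Maple Court", 45: "The Elms"}


def building_name(address, buildings=BUILDINGS):
    """Return the building name for an address, or None if not listed."""
    # LEFT(..., FIND(" ", ...) - 1): the text before the first space.
    number_text = address.strip().split(" ", 1)[0]
    # VALUE(...): convert to a number; raises ValueError if not numeric.
    return buildings.get(int(number_text))


if __name__ == "__main__":
    print(building_name("123 Main St Apt 4B"))
```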

Return a Comma-Separated List from an Array Formula

I could probably puzzle this one out in VBA, but would rather not have to use that approach if possible, and it seems like this is something Excel's built-in functions should be able to manage.
I have a spreadsheet we recently stopped printing that we would use to verify data came in from our various sites, and for which days. Column headers correspond to site numbers ($D$1:$AU$1 for this spreadsheet) and rows are for dates ($A$3:$A$24 for the month of December -- we don't do this data validation on weekends). The "B" column contains the date whose information SHOULD have come in, for example, on 12/4, we should receive the information from 12/2 at each site.
In the past, we used column "C" to write which sites were behind. If Site 3 only sent in information from 12/1 on the 4th, we would write a "3" in that column. I'd like to continue this convention, since it's what the office understands. Change Is Bad and all that.
So far I've muddled through on my own and written an array formula that returns {0,0,1,0,...,0} if only site 3 is behind. That formula is =IFERROR(SEARCH(B3,D3:AU3)-SEARCH(B3,D3:AU3),1)
From here it's trivial to do =IF(ISBLANK(D4),"",INDEX(D1:AU1,1,MATCH(1,IFERROR(SEARCH(B3,D3:AU3)-SEARCH(B3,D3:AU3),1),0))) which works great, so long as there's only one site that's behind. If we have more than one site behind, it returns the first value (as expected).
If both sites 3 and 6 are behind, we want to see "3, 6" (or any other human-readable format) but the only solutions I found are to write a custom VBA script to concat an array. I'd rather stay away from custom VBA if humanly possible.
Thanks,
Adam
Try this "array formula" in BA3:
=IFERROR(INDEX($D1:$AU1,SMALL(IF(ISERROR(SEARCH($B3,$D3:$AU3)),COLUMN($D3:$AU3)-COLUMN($D3)+1),COLUMNS($BA3:BA3))),"")
Confirm with CTRL+SHIFT+ENTER and copy across to get all matches.
As you say, you can then concatenate those back into C3.
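For reference, the end result being aimed at (every site whose cell does not contain the expected date, joined into one string) can be sketched in Python like this. The function name and sample data are invented, and the substring test only loosely mirrors SEARCH, which is also case-insensitive.

```python
def sites_behind(expected, site_numbers, received):
    """Return the sites whose cell lacks the expected date, comma-separated."""
    behind = [
        str(site)
        for site, cell in zip(site_numbers, received)
        # ISERROR(SEARCH(expected, cell)): expected text not found in cell.
        if expected not in str(cell)
    ]
    return ", ".join(behind)


if __name__ == "__main__":
    sites = [1, 2, 3, 4, 5, 6]                           # header row
    cells = ["12/2", "12/2", "12/1", "12/2", "12/2", ""]  # one date row
    print(sites_behind("12/2", sites, cells))
```

An empty cell counts as behind here, which matches how SEARCH fails on a blank cell.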

Resources