I am trying to import the following Wikipedia list using the command below, but it's not working, and I am not sure what the correct index should be.
I would like to have the name in one column and the Wikipedia URL in another column if possible.
=ImportHtml("http://en.wikipedia.org/wiki/List_of_Ice_Bucket_Challenge_participants","list", 2)
Also, if possible, I would like to remove the square brackets with the number beside each name.
Thank you
ImportHtml is not sufficiently powerful / advanced for what you want to do.
Use the IMPORTXML function instead. You will need some understanding of XPath to be able to write the queries you need.
Also keep in mind that Google Docs supports only a subset of the XPath functions available. It is similar to XPath 1.0, but some functions still don't work. Don't ask me why...
I have had a go, and written what you need. For reasons I am not going to go into, to be able to give you the reference number, I had to create a separate list of names with and without references.
The formulas are:
A4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a[following-sibling::sup]")
B4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a[following-sibling::sup/a/text()]/@href")
C4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a[following-sibling::sup/a/text()]/@href")
E4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a[not(following-sibling::sup)]")
F4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a[not(following-sibling::sup)]/@href")
EDIT:
If you are not bothered about the reference then you can have all the names in one list.
I've also added an additional query for the image link
New formulas:
A4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a")
B4:
=importxml(A1,"//div[@class='div-col columns column-width']//li/a/@href")
C4:
="http://en.wikipedia.org"&importxml("http://en.wikipedia.org"&B4,"//table[@class='infobox biography vcard' or @class='infobox vcard']//td[@colspan='2']//a[@class='image']/@href")
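For anyone who wants to check the selection logic outside the spreadsheet, here is a minimal Python sketch of the same idea as the A4/B4 queries: take the direct a child of each li inside the column div, so the bracketed reference numbers (which sit in a separate sup element) are never picked up. The sample markup, the names in it and the function name extract_links are made up for illustration, and the real page is not guaranteed to parse as well-formed XML, so treat this as a sketch only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the Wikipedia list markup.
SAMPLE = """
<body>
  <div class="div-col columns column-width">
    <ul>
      <li><a href="/wiki/Person_A">Person A</a><sup><a href="#cite_note-1">[1]</a></sup></li>
      <li><a href="/wiki/Person_B">Person B</a></li>
    </ul>
  </div>
</body>
"""

def extract_links(markup):
    """Return (name, absolute URL) pairs for each list entry."""
    root = ET.fromstring(markup)
    # Same shape as the spreadsheet query: //div[@class=...]//li/a
    # Only direct <a> children of <li> match, so the <sup> citation links are skipped.
    links = root.findall(".//div[@class='div-col columns column-width']//li/a")
    return [(a.text, "http://en.wikipedia.org" + a.get("href")) for a in links]

rows = extract_links(SAMPLE)
for name, url in rows:
    print(name, url)
```

Each tuple corresponds to one row of columns A and B in the sheet.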
I have two lists of products in Excel. Each list will be of varying length each month.
Is there a way to combine the two lists into a third list, with the second list being underneath the first?
I would like to do this avoiding macros.
I imagine this could be done using Dynamic Arrays, but I can't figure it out.
Please see an example below:
Thank you so much in advance.
I have had this problem before and used this tutorial to help me. I attach the example sheet also, which provides the formula that may work for your problem.
See the image below for cell references - then try this:
=IFERROR(INDEX($B$3:$B$7, ROWS(H2:$H$2)), IFERROR(INDEX($D$3:$D$4, ROWS(H2:$H$2)-ROWS($B$3:$B$7)), IFERROR(INDEX($F$3:$F$6, ROWS(H2:$H$2)-ROWS($B$3:$B$7)-ROWS($D$3:$D$4)), "")))
I have managed to find a solution that works for me, where the lists are of variable length.
Using a similar scenario to Mardi-Louise's answer, I am using the following formula in cell F3, and then dragging down:
=IF(B3<>"",B3,OFFSET($D$3,ROW()-COUNTA($B$3:$B$7),0))
Explanation:
So long as List 1 is not finished, it takes the value from List 1.
Once List 1 is finished, it begins at the top of List 2, and uses an offset to move down.
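The logic in that explanation can be sketched in Python as below. This is only an illustration of the idea, not a translation of the exact formula; the function name stacked_value and the sample lists are assumptions, and positions are 0-based here rather than worksheet rows.

```python
def stacked_value(i, list1, list2):
    """Return the i-th (0-based) entry of list1 followed by list2, or "" past the end."""
    if i < len(list1):
        # List 1 is not finished yet: take the value from List 1.
        return list1[i]
    # List 1 is finished: offset into List 2, starting from its top.
    j = i - len(list1)
    return list2[j] if j < len(list2) else ""

list1 = ["apple", "pear"]   # hypothetical List 1
list2 = ["kiwi"]            # hypothetical List 2
combined = [stacked_value(i, list1, list2) for i in range(4)]
print(combined)
```

The offset into the second list is measured relative to the top of the lists, not to an absolute row number, which is what keeps it independent of where the lists sit.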
I'm late to the party, but for anyone still looking for this there's (now) a function for this in Excel 365: vstack(array1;array2;...)
Here is Microsoft's page on it
With the arrays as columns in tables you'll get dynamic lengths. It's also possible to combine vstack() with for example unique().
I benefited from Answer 2, with slightly different syntax. The ROW() function returns the absolute row number of the cell, whereas a position relative to the top of the list is more generally applicable. I found that the following syntax works better, because it measures the output of ROW() from the cell above the top cell of range D3:D8:
=IF(B3<>"",B3,OFFSET($D$2,ROW()-ROW($D$2)-COUNTA($B$3:$B$7),0))
Additionally, the COUNTA function can provide inconsistent results when cells in the range are not based on simple data but on the output of formulas which can be equal to 0 or blank without actually being empty. In that case COUNTIF often works better such as:
=IF(B3<>"",B3,OFFSET($D$2,ROW()-ROW($D$2)-COUNTIF($B$3:$B$7,"<>"&0),0))
I am trying to build a dedicated material table in Excel. We have a few products, and these products require particular materials. I know which materials go into each product and how much of each. I also know how much is sold in each year, and now I want to calculate the required materials for those years. Because the product base is large (>100 products, and thus >100 columns), I would like to use some lookup or index function to automate the multiplication.
As shown in the picture, I tried using a SUMPRODUCT, which was also explained in another question on Stack Overflow. This SUMPRODUCT should multiply all values obtained in one table with the corresponding values in the other. I feel that something is not right about my first two MATCH functions (see picture again).
The code used:
=SUMPRODUCT(INDEX($B$19:$E$22;MATCH(B$2;$A$19:$A$22;0);MATCH(B$10;$B$18:$E$18;0));INDEX($B$3:$E$5;MATCH($A11;$A$3:$A$5;0);MATCH(TRUE;$B$3:$E$5>0;0)))
The image contains some extra info and explanation of the actual need
The reason that it needs a lookup or index is because the products in table 3 are always in another order than what is shown in table 1.
I would like to have this sumproduct as automated as possible, thank you in advance:)
You could try and adapt the below:
Formula in B11:
=SUM(INDEX($B:$B,MATCH($A11,$A$1:$A$5,0)):INDEX($E:$E,MATCH($A11,$A$1:$A$5,0))*TRANSPOSE(B$19:B$22))
Enter it as an array formula with Ctrl+Shift+Enter.
Drag right and down into matrix.
Side-note: Be sure to edit your question to include all relevant information, including your own attempted formula, as text. It is way easier to copy and paste sample data that has been formatted as markdown :)
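To make the intended calculation concrete, here is a small Python sketch of what the formula is meant to compute: for each material and year, sum the amount per unit times the units sold over all products, matching products by name so that their order in the two tables does not matter. The product names, materials, quantities and the function name material_requirements are all made up for illustration.

```python
def material_requirements(usage, sales):
    """usage[product][material] = amount per unit; sales[product][year] = units sold.

    Returns a dict mapping (material, year) to the total amount required.
    """
    totals = {}
    for product, per_year in sales.items():
        for year, units in per_year.items():
            # Look the product up by name, so table order is irrelevant.
            for material, amount in usage.get(product, {}).items():
                key = (material, year)
                totals[key] = totals.get(key, 0) + amount * units
    return totals

# Hypothetical data; note the products are listed in a different order in each table.
usage = {"P1": {"steel": 2, "glue": 1}, "P2": {"steel": 1}}
sales = {"P2": {2020: 10, 2021: 5}, "P1": {2020: 3, 2021: 4}}

req = material_requirements(usage, sales)
print(req)
```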
I've lost days to this problem.
To solve this issue I have amassed 6 sheets of different data with different attempts. I've tried offsets, dynamic ranges, vlookups, counts, countifs, left, right, row, everything. I can't get around it.
What I want to do is transpose data in the following format:
Into its correspondingly named four columns to reach the desired outcome:
Inconsistency of each individual product's data is the issue. Product 9 has two product options and two prices, whilst the tenth product has no description and no options.
I also have the original data in the following format if it is of use.
Any help or resources greatly appreciated.
I have gotten by with Excel for the past few years without learning VBA, so a formula approach is most welcome. That said, I am not massively intimidated by learning the language if necessary, as I am a novice in CSS and HTML and have dabbled in a couple of programming languages.
Edit 1 - A more concise way of viewing the issue is contained in this image where the data is on the left and the outcome is on the right.
Edit 2 - This is a link to a Google Sheet, as requested in the comments below. I have included all relevant information and some other stuff.
EDIT: you've added clarification since I put this together. I'll leave it in, because it is a partial solution and may lead you/someone to a full one.
(it supplies only 1 result if there are multiple matches for a particular product and information type, but handles missing information).
I have a solution. In your second example of the input (product #, type and information columns across the top), insert a new column between B and C. In the first data row of that new column, combine the text from the product # and 'Type' columns, i.e
C2 = A2 & "," & B2
Result of C2 should be "Product 1,Product Name"
Then for your transposed results table, all you need is a valid list of product #'s, and you can use a simple vlookup on C and D columns. You just have to construct a matching search term that combines the required product # with the type you are looking for.
The formula in H2 below is constructed so that you can drag it across and down as necessary.
You'll get N/A for anything that doesn't have an entry, and you'll get whatever vlookup usually does if it finds duplicate matches.
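The helper-column approach can be sketched in Python as follows. It mirrors the behaviour described above: a combined "product,type" key, the first match wins, and missing entries come back as N/A. The sample rows and the function names build_lookup and transpose are assumptions for illustration only.

```python
def build_lookup(rows):
    """Map the combined key 'product,type' to its information value."""
    lookup = {}
    for product, info_type, value in rows:
        key = product + "," + info_type   # same idea as C2 = A2 & "," & B2
        lookup.setdefault(key, value)      # keep only the first match
    return lookup

def transpose(rows, products, types):
    """Build one output row per product, one column per information type."""
    lookup = build_lookup(rows)
    return [[lookup.get(p + "," + t, "N/A") for t in types] for p in products]

# Hypothetical input in the long format (product, type, information).
rows = [
    ("Product 1", "Product Name", "Widget"),
    ("Product 1", "Price", "10"),
    ("Product 2", "Product Name", "Gadget"),
    ("Product 1", "Price", "12"),  # duplicate key: ignored, first match wins
]

table = transpose(rows, ["Product 1", "Product 2"], ["Product Name", "Price"])
print(table)
```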
I'm looking for a way to insert a column based on two criteria, as illustrated below. I have a main table with one row per company, and I want to add a column to this with the city names. However, the lookup table has two rows for some companies - one for "small" and one for "large". I'm only interested in retrieving the cities for companies that have size value "small".
I know that I can achieve this with =SUMIFS if the content of the column was a number instead of text. However, with the cities column consisting of text, I don't know how to proceed. I'd ideally like a solution where I don't have to use a helper column.
Edit: this is just an example of my data. I have hundreds of rows, and the suggested duplicate answer uses INDEX/MATCH, which requires me to give the exact cell location of each condition. This is not the case in my data.
There are a few solutions that I usually use for these tasks. They're not elegant i.e. not a 2-criteria look-up per se, but they get the job done.
Going by your data structure, you have these choices:
Sort your lookup table by size-company, with size in descending order. Thereafter, it's a straightforward vlookup since your big companies are segregated from small ones.
Build a new key consisting of company-size i.e. CONCAT(company,size) and do the vlookup based on this key.
It's not possible with VLOOKUP. See my solution in the picture, which uses an array formula.
Solution using array formulas
Formula in F2: =INDEX($C$1:$C$6;SUM(IF(E2=$A$2:$A$6;1)*IF($B$2:$B$6="small";1)*ROW($C$2:$C$6));1)
Ps: don't forget to confirm the formula with Ctrl+Shift+Enter.
Multi-column lookups are certainly possible, but not using VLOOKUP. You'll need to use INDEX and MATCH. This becomes pretty complex as it combines array formulas with boolean logic. Here's a nice explanation.
https://exceljet.net/formula/index-and-match-with-multiple-criteria
For your example, assuming the company in the desired-result table is in column I:
=INDEX($F$4:$F$5,MATCH(1,(D4:D5=I4)*(E4:E5="small"),0))
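What the MATCH(1, (...)*(...), 0) part does is find the first row where both conditions are true at once. A rough Python sketch of that two-criteria lookup is below; the table contents and the function name lookup_city are invented for illustration.

```python
def lookup_city(table, company, size="small"):
    """Return the city of the first row matching both company and size, else None."""
    for row_company, row_size, city in table:
        # Both conditions must hold, like multiplying the two boolean arrays.
        if row_company == company and row_size == size:
            return city
    return None

# Hypothetical lookup table: (company, size, city)
table = [
    ("Acme", "large", "Berlin"),
    ("Acme", "small", "Oslo"),
    ("Bolt", "small", "Lima"),
]

print(lookup_city(table, "Acme"))
print(lookup_city(table, "Bolt"))
```

No helper column is needed, because the two conditions are tested together row by row.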
I'm basically looking for a $A:$A equivalent for structured table references in Excel.
Say I have this formula:
=INDEX(tChoice,MATCH(OFFSET(tData[@[cm_sex]],-3,0),tChoice[name],0),3)
tData is a table full of raw data (many columns) taken from surveys, so each column is more or less one survey question. tChoice is a smaller table (just a few columns). I want to look up the raw data value in tChoice and return a label based on it (the value-label table is tChoice).
I actually want the tData[@[cm_sex]] reference to auto-increment as I apply the formula in cells to the left (so I cycle through all the columns of the raw data). However, I DON'T want the column tChoice[name] to change, i.e. the column in which to look for a match based on the raw table data.
This is equivalent to writing, say, A:A (which would auto-increment to B:B, C:C, etc) and $A:$A (which would not).
Is there a dollar sign equivalent for structured table references?
P.S.: Of course I can do other things, like increment the whole thing and then search and replace the range, with say tChoice[*] replaced by tChoice[name]. However, it would be cleaner and more efficient to have a proper notation for it.
Didn't find it in the support pages (https://support.office.com/en-us/article/Using-structured-references-with-Excel-tables-f5ed2452-2337-4f71-bed3-c8ae6d2b276e)
user3964075 provided the answer in the comments. I had never seen this before so thanks to him or her for this answer. There's some information out there on the web about absolute structured table references, so I thought I'd summarize what I found.
For your situation you can use tChoice[[name]:[name]]. Specifying a range that's just the one column anchors the column like $ signs do in normal cell references.
If you want to just deal with one row (the one that the formula is in), the anchor looks like this: tChoice[@[name]:[name]].
Now say you want to anchor one cell but have the other be relative, as in this scenario where I'm summing from a to the right, starting with a:a, then a:b, etc:
You can do that with a formula like this, where the first part is absolute and the second is relative:
=SUM(Table1[@[a]:[a]]:Table1[@a])
Note that these formulas must be dragged, not copied. Perhaps there is a keyboard shortcut that does this.
This process is rather clunky compared to just pressing F4, as with a regular cell reference. Jon Acampora has created an add-in that automates this process, as well as two detailed posts on this topic. His first post contains a link to the one with the add-in.