openpyxl: '#' is inserted to formula when saving to file

When I add the following formula to a cell, the cell's value looks correct when printed to the console. However, after I save the file, the formula has '#' inserted right after the '=' (for simplicity, I am providing the output from the console):
>>> from openpyxl import Workbook
>>> from openpyxl.utils import get_column_letter
>>> wb = Workbook()
>>> ws = wb.active
>>> ws['A1'] = '=CONCAT("Week ",TEXT(MID(' + get_column_letter(9) + '1,6,2)+ 1, "##"))'
>>> ws['A1'].value
'=CONCAT("Week ",TEXT(MID(I1,6,2)+ 1, "##"))'
>>> wb.save('formula.xlsx')
>>>
In the 'formula.xlsx' file, the formula looks like this:
=#CONCAT("Week ",TEXT(MID(I1,6,2)+ 1, "##"))
If, however, instead of '=CONCAT()' I specify '=SUM()', for example, it is saved as expected, i.e. without the '#' inserted.
I am using openpyxl 3.0.3 and Python 3.8.
Many thanks
-------- Update --------
I have looked into the XML code of 'formula.xlsx'; but before doing that, I opened it in Excel, copied cell A1 into cell D1, and deleted '#' from the formula in cell D1, after which D1 started showing the correct value while A1 still showed the '#NAME?' error.
So, after my changes in cell D1, the XML code for the sheet showed the following:
<row r="1" spans="1:9" x14ac:dyDescent="0.45">
<c r="A1" t="e"><f ca="1">_xludf.CONCAT("Week ",TEXT(MID(I1,6,2)+ 1, "##"))</f><v>#NAME?</v></c>
<c r="D1" t="str"><f>_xlfn.CONCAT("Week ",TEXT(MID(I1,6,2)+ 1, "##"))</f><v>Week 68</v></c>
<c r="I1"><v>12345678</v></c>
</row>
The _xludf prefix used by openpyxl for CONCAT in cell A1 above is described as "User Defined Function" on https://learn.microsoft.com/en-us/office/client-developer/excel/xludf.
Could it mean that the library did not recognise CONCAT as a standard Excel function, and therefore used _xludf instead of _xlfn for it?
----- End of update ---

As specified in the openpyxl documentation, known formulas are used just by inserting the formula name.
To check whether a formula is known to openpyxl, one can use:
>>> from openpyxl.utils import FORMULAE
>>> "CONCAT" in FORMULAE
False
If the formula isn't known, you need to add _xlfn. just before the formula name, like so:
>>> ws['A1'] = '=_xlfn.CONCAT("Week ",TEXT(MID(' + get_column_letter(9) + '1,6,2)+ 1, "##"))'
It is also mentioned in the documentation:
If you’re trying to use a formula that isn’t known this could be
because you’re using a formula that was not included in the initial
specification. Such formulae must be prefixed with _xlfn. to work.
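A minimal sketch of how the check and the prefix could be automated. The helper name add_xlfn_prefix and its regular expression are illustrative assumptions, not part of openpyxl; the set of known names is passed in, so openpyxl.utils.FORMULAE can be supplied when openpyxl is installed. It only looks at names directly followed by an opening parenthesis and skips quoted strings, so treat it as a starting point rather than a full formula parser.

```python
import re

# Matches either a quoted Excel string literal (left untouched) or a
# name that is immediately followed by "(" (treated as a function name).
_TOKEN = re.compile(r'"(?:[^"]|"")*"|([A-Za-z_][A-Za-z0-9_.]*)(?=\()')


def add_xlfn_prefix(formula, known):
    """Prefix function names missing from `known` with '_xlfn.'.

    `known` is any container of upper-case function names, for example
    openpyxl.utils.FORMULAE. Hypothetical helper, not an openpyxl API.
    """
    def fix(match):
        name = match.group(1)
        if name is None:  # a string literal: return it unchanged
            return match.group(0)
        if name.upper() in known or name.startswith("_xl"):
            return name  # already known, or already prefixed
        return "_xlfn." + name

    return _TOKEN.sub(fix, formula)


# Stand-in for FORMULAE so the example runs without openpyxl installed.
known_names = {"TEXT", "MID", "SUM"}
fixed = add_xlfn_prefix('=CONCAT("Week ",TEXT(MID(I1,6,2)+ 1, "##"))', known_names)
print(fixed)
```

With openpyxl available, passing FORMULAE instead of known_names should give the same kind of result for functions that are missing from the original specification.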

I have the same problem using formulae in Spanish (I'm from Argentina).
When I try to assign something like "=SUMA(A1:A20)" to a cell, it comes out as "=#SUMA(A1:A20)".
I tried the _xlfn. solution, but now it just ends up as
"=#_xlfn.SUMA(A1:A20)"
If/when I find an answer I'll post it here.
SOLVED
If you are using a non-English version of Excel, you still have to assign the cell the English name of the function, e.g. "=SUM(A1:A20)"
Afterwards, when you check the content of the cell in the worksheet, it will have changed to the proper language, in this case the Spanish "=SUMA(A1:A20)"
Caveat: I've only checked it in the Spanish version, but I'm pretty sure it works for all.
Also: If you use another set of characters as separators, for example a comma (,) instead of a dot (.) for decimals, you still need to use the dot when assigning a formula to a cell, for example "= E8 * 0.5". You will see a comma when you check the cell. Using a comma in that string will result in a damaged-file error when opening the xlsx file.

In my case I was trying to write an =ARRED formula, and the character # appeared in the Excel file, as in =#ARRED.
ARRED is the Brazilian Portuguese name of the function and is not recognized, so I replaced it with ROUND and it worked perfectly.
Testing:
>>> from openpyxl.utils import FORMULAE
>>> "ARRED" in FORMULAE
False
>>> "ROUND" in FORMULAE
True
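Both localized-name cases above (SUMA and ARRED) come down to writing the English function name into the cell. A rough sketch of a translation step that could run before assigning the formula; the mapping and the helper name to_english are hypothetical and deliberately tiny, so extend the dictionary with whatever functions you actually use.

```python
import re

# Hypothetical, incomplete mapping from localized names to English names.
LOCAL_TO_ENGLISH = {"SUMA": "SUM", "ARRED": "ROUND"}

# A quoted string literal (kept as-is) or a name directly followed by "(".
_NAME = re.compile(r'"(?:[^"]|"")*"|([A-Za-z_][A-Za-z0-9_.]*)(?=\()')


def to_english(formula, mapping=LOCAL_TO_ENGLISH):
    """Replace localized function names with their English equivalents."""
    def swap(match):
        name = match.group(1)
        if name is None:  # string literal: leave untouched
            return match.group(0)
        return mapping.get(name.upper(), name)

    return _NAME.sub(swap, formula)


print(to_english("=SUMA(A1:A20)"))
print(to_english("=ARRED(E8*0.5,2)"))
```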

Specifying the _xlfn. prefix explicitly in the Python code fixes the problem:
>>> ws['A1'] = '=_xlfn.CONCAT("Week ",TEXT(MID(' + get_column_letter(9) + '1,6,2)+ 1, "##"))'
Thanks goes to Dror Av. for guidance!

Related

How to extract specific text from a sentence in Excel?

I have a database that exports data like this:
How can I get for instance, the Net Rentable Area with the values needed:
E.G.
Net Rentable Area
I tried the TextSplit function but I got a spill.
Please let me know what can be done, thanks!
Also it would be nice to see it working in something such as the Asking Rate, which has a different format.
In cell C2 you can put the following formula:
=1*TEXTSPLIT(TEXTAFTER(A2, B2&" ")," ")
Note: Multiplying by 1 ensures the result will be a number instead of a text.
and here is the output:
If the tokens to find are all words (never interpreted as numbers), you can use the following without having to specify the token to find:
=LET(split, 1*TEXTSPLIT(A2," "), FILTER(split, ISNUMBER(split)))
Under this assumption you can even have the corresponding array version as follows:
=LET(rng, A2:A100, input, FILTER(rng, rng <>""), IFERROR(DROP(REDUCE(0, input,
LAMBDA(acc,text, LET(split, 1*TEXTSPLIT(text," "),
nums, FILTER(split, ISNUMBER(split),""), VSTACK(acc, nums)))),1),"")
)
Note: It uses the trick of creating multiple rows with VSTACK within REDUCE, an idea suggested by @JvdV in this answer. It assumes A1 has the title of the column; if not, you can use A:A instead.
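For readers more comfortable with code, here is a rough Python analogue of the single-cell LET formula above: split on spaces and keep only the tokens that parse as numbers. The function name numbers_in and the sample text are made up for illustration, since the original export is not shown. Unlike Excel's 1*"1,200", Python's float() does not accept thousands separators.

```python
def numbers_in(text):
    """Return the space-separated tokens of `text` that parse as numbers."""
    found = []
    for token in text.split(" "):
        try:
            found.append(float(token))
        except ValueError:
            # Not a number (a word or an empty token): skip it,
            # like FILTER(split, ISNUMBER(split)) does.
            continue
    return found


# Hypothetical sample row; the real data layout is an assumption.
print(numbers_in("Net Rentable Area 12000 15000"))
```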

How/which formula to use, to show combine text results for false condition (for pending task reporting usage)?

Wanted to check if CONCATENATE is the one to use (not sure if my excel has TEXTJOIN), and how to show just the text that has empty value in the cells.
For example in my attachment below, I want the intended result shown like in B2 and B3, where the texts shown with delimiter, when the values are false (empty).
If I were to use CONCATENATE like in Row 10 and Row 11, it's rather manual and it only captures "positive values", as in non-blank cells.
Purpose: To show pending tasks (empty/blank status cells)
Use MID with CONCATENATED IFS:
=MID(IF(C2="","/"&$C$1,"")&IF(D2="","/"&$D$1,"")&IF(E2="","/"&$E$1,"")&IF(F2="","/"&$F$1,"")&IF(G2="","/"&$G$1,"")&IF(H2="","/"&$H$1,""),2,999)
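What that formula computes can be sketched in a few lines of Python: keep the header of every column whose status cell is empty and join the headers with the delimiter. The function name pending_tasks and the sample labels are assumptions for illustration only.

```python
def pending_tasks(headers, statuses, sep="/"):
    """Join the headers whose status cell is empty (i.e. still pending)."""
    return sep.join(h for h, s in zip(headers, statuses) if s == "")


# Hypothetical row: three status columns, two of them still blank.
headers = ["Stat A", "Stat B", "Stat C"]
statuses = ["done", "", ""]
print(pending_tasks(headers, statuses))
```

Because join() only places the delimiter between items, there is no leading or trailing "/" to trim, which is what the MID(..., 2, 999) wrapper takes care of in the formula.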
I would use TEXTJOIN and FILTER if you have the newest version of Excel.
For example: =TEXTJOIN("/",1,FILTER($E$2:$I$2, ISBLANK(E3:I3)))
EDIT: For older versions, a temporary workaround is as follows:
1. Make a temporary array the same size as your original dataframe where each value is determined by a formula such as =IF(ISBLANK(E3), E$2&"/","")
2. Use something like =LEFT(CONCAT(E15:J15), LEN(CONCAT(E15:J15))-1) to get the desired result (where E15:J15 is where I elected to store the first row of the temporary array created in step 1).
I am not sure of your Excel version, but I think this would work in older versions (formatted for readability - will work if you paste it directly into cell B2 and copy down):
=LEFT(CONCAT( INDEX( CHOOSE({1;2;3},$C$1:$H$1,{"/","/","/","/","/","/"},{"","","","","",""}),
INDEX( IF(ISBLANK(C2:H2),{1;2},{3;3}),
MOD(COLUMN(A1:INDEX(1:1,,12))-1,2)+1,
(COLUMN(A1:INDEX(1:1,,12))-1)/2+1 ),
(COLUMN(A1:INDEX(1:1,,12))-1)/2+1 ) ),
SUM(7*ISBLANK(C2:H2))-1 )
Notes
As this is an array formula, you may have to enter it with CTRL + SHIFT + ENTER with an older version of Excel.
The stat labels must all have a length of 6 characters as shown in your post. If not, then they must at least have the same length and the last line SUM(7*ISBLANK(C2:H2))-1 must be changed to replace the 7 with the string length + 1, e.g. a length of 9 would be SUM(10*ISBLANK(C2:H2))-1.
If they don't have the same length, the LEFT( can be removed along with the SUM(10*ISBLANK(C2:H2))-1) at the end. You will end up having a trailing / delimiter at the end. You could fix that for the case of stat F being the last part by changing {"/","/","/","/","/","/"} to {"/","/","/","/","/",""}, but the other cases would still have a trailing /. Another approach is much more complex, but the component SUM(10*ISBLANK(C2:H2))-1) could be shaped to identify what to cut off or maybe a helper column could be built - in any case, let's hope your situation is that the stat labels all have the same length.
The delimiter "/" can be changed, but must always be a single character. If not, then the last line must be changed to SUM( [label length + delimiter length] *ISBLANK(C2:H2))-1.
This formula is fixed to 6 stat columns. If you need it to accommodate more, it is possible by extending the {"/","/","/","/","/","/"} and {"","","","","",""} (one element for each new column) and replacing every 12 with 2 times the number of columns. Also, obviously, the references $C$1:$H$1 and C2:H2 must be changed to read in your new columns.

Excel Replace Using Formula

I'm trying to replace part of a cell's text with text from another cell. The thing is, I need to remove some text from image URLs using an Excel formula.
Image URL : domain.com/images/products/63/63/19787/2/279/image-name.jpg
The text that needs to be removed is 279/. But it needs to be removed at the exact place, like domain.com/xxx/xxx/xxx/xxx/xxx/x/279/image-name.jpg, between the URL and the image name.
I've tried to split it first with this formula:
=MID(A1,FIND("|",SUBSTITUTE(A1,"/","|",LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1)),LEN(A1))
Result after I used the formula:
/279/image-name.jpg
Then I tried using the REPLACE formula to replace text with text from another cell. Before that, I removed the /279 from the result, so it's only /image-name.jpg now.
=REPLACE(A1, FIND(B1,A1), 4, C1)
But it keeps giving me a doubled result at the end of the text, like this:
domain.com/images/products/63/63/19787/2/image-name.jpg/image-name.jpg
result should be
domain.com/images/products/63/63/19787/2/image-name.jpg - without 279/
Is there any problem with the REPLACE formula? Or is there a simpler formula to make it work?
To find the location of the next to last slash:
=FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1))
If the contents of that node will always be three characters, you can use REPLACE to remove four characters (the slash plus the three-character node) starting at that position:
=REPLACE(A1,FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1)),4,"")
If the contents of that node will be a variable number of characters, then we first return the part up to that node, and concatenate with the last node:
=LEFT(A1,FIND(CHAR(1),LEFT(SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1),99))) &
TRIM(RIGHT(SUBSTITUTE(A1,"/",REPT(" ",99)),99))
EDIT Logic added to ensure that the next to last node is equal to 279
If you need to confirm that the next to last node contains 279, you can check it with:
=ISNUMBER(N(FIND("/279/",MID(A1,FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1)),99))=1))
Using that as part of an IF will return the original string if 279 is not the contents of that node, and replace it only if it is:
=IF(ISNUMBER(N(FIND("/279/",MID(A1,FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1)),99))=1)),
LEFT(A1,FIND(CHAR(1),LEFT(SUBSTITUTE(A1,"/",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1),99))) &
TRIM(RIGHT(SUBSTITUTE(A1,"/",REPT(" ",99)),99)),A1)
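The same logic, removing the next-to-last path segment and optionally only when it equals a given value, can be sketched in Python for comparison. The function name drop_penultimate_segment and its expected parameter are illustrative assumptions, not something from the question.

```python
def drop_penultimate_segment(url, expected=None):
    """Remove the next-to-last '/'-separated segment of `url`.

    If `expected` is given, the segment is removed only when it matches;
    otherwise the original string is returned unchanged.
    """
    parts = url.split("/")
    if len(parts) < 3:
        # Fewer than two slashes: there is no node between two slashes.
        return url
    if expected is not None and parts[-2] != expected:
        return url
    return "/".join(parts[:-2] + parts[-1:])


url = "domain.com/images/products/63/63/19787/2/279/image-name.jpg"
print(drop_penultimate_segment(url, expected="279"))
```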
If you happen to have access to TEXTJOIN then use:
=TEXTJOIN("/",,FILTERXML("<t><s>"&SUBSTITUTE(A1,"/","</s><s>")&"</s></t>","//s[not(position() = last()-1)]"))
And if you need to check if it's equal to '279' before removal:
=TEXTJOIN("/",,FILTERXML("<t><s>"&SUBSTITUTE(A1,"/","</s><s>")&"</s></t>","//s[not(position() = last()-1 and .='279')]"))
If you don't have access to TEXTJOIN then you have a fine alternative by @RonRosenfeld.
Another option would be to use REPLACE():
=REPLACE(A1,FIND("|",SUBSTITUTE(A1,"/","|",LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1)),FIND("|",SUBSTITUTE(A1,"/","|",LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))))-FIND("|",SUBSTITUTE(A1,"/","|",LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))-1)),"")
You want to use the SUBSTITUTE() function.
The below finds "/279/image-name.jpg" and replaces it with "/image-name.jpg". The added /image-name.jpg will ensure other instances of /279 will remain unaltered.
A1 value = "domain.com/images/products/63/63/19787/2/279/image-name.jpg"
=SUBSTITUTE(A1,"/279/image-name.jpg","/image-name.jpg")
output:
domain.com/images/products/63/63/19787/2/image-name.jpg

IF function does not work in Excel. Can't see the issue

So for school I'm working with this huge Excel dataset. I'm trying to use Excel 365 to find the word terrorist and return the value 1 with the following code:
=IF'(A22="terrorist", "No", "Yes")
But I keep getting this Excel error:
There is a problem with the function
you type =1+1, cell shows: 2
to get around this type an apostrophe (') first:
you type '=1+1 cell shows =1+1
What's going wrong?
Remove the ' - use simply:
=IF(A22="terrorist", 1, 0)
This will show 1 if A22 = "terrorist", otherwise it will show 0.

Excel formula contains error

I have an error in this excel formula and I can't just figure it out:
=LEFT(B3,FIND(",",B3&",")-1)&","&RIGHT(B3,LEN(B3)-FIND("&",B3&"&")),RIGHT(B3,LEN(B3)-SEARCH("#",SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))&", "&SUBSTITUTE(RIGHT(B3,LEN(B3)-FIND("&",B3&"&")-1),RIGHT(B3,LEN(B3)-SEARCH("#",SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))),""))
It may seem like a big formula, but all it's intended to do is if no ampersand is in a cell, return an empty cell, if no comma but ampersand exists, then return this, for example:
KNUD J & MARIA L HOSTRUP
into this:
HOSTRUP,MARIA L
Otherwise, there is no ampersand but there is a comma so we just return: LEFT(A1,FIND("&",A1,1)-1).
Seems basic, but formula has been giving me error message and doesn't point to the problem.
Your error is here:
=LEFT(B3,FIND(",",B3&",")-1)&","&RIGHT(B3,LEN(B3)-FIND("&",B3&"&")),
At this point, the comma doesn't apply to anything, because the RIGHT function already has matching parentheses.
As far as what you want? Let's break that up into what you actually asked for:
if no ampersand in a cell, return empty cell,
B4=Find("&", B3&"&")
B5=IF(B4>LEN(B3),"",B6)
if no comma but ampersand exists
B6=IF(FIND(",", B3&",")>LEN(B3),B7,B8)
then turn this, for example:
KNUD J & MARIA L HOSTRUP
into this:
HOSTRUP,MARIA L
I'm presuming you mean to put the last whole word? Let's mark the last whole word:
B9=SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ","")))
B10=RIGHT(B9,LEN(B9)-FIND("#",B9))
And the stuff between the ampersand and the last word
B11=TRIM(MID(B9,B4 + 1, LEN(B9)-FIND("#",B9)-1))
Then calculating it is easy
B7=B10&","&B11
Otherwise, there is no ampersand but there is a comma so we just return:
LEFT(A1,FIND("&",A1,1)-1).
Well, if you want that, let's just put that in B8
B8=LEFT(A1,FIND("&",A1,1)-1)
(But I think you actually mean B3 instead of A1)
B8=LEFT(B3,FIND("&",B3,1)-1)
And there you have it (B5 contains the information you're looking for). It took a few cells, but it's easier to debug this way. If you want to collapse it, you can (but doing so is more code, because we can reduce duplication by referencing a previously calculated cell on more than one occasion).
Summary:
B3=<Some Name with & or ,>
B4=FIND("&", B3&"&")
B5=IF(B4>LEN(B3),"",B6)
B6=IF(FIND(",", B3&",")>LEN(B3),B7,B8)
B7=B10&","&B11
B8=LEFT(B3,FIND("&",B3,1)-1)
B9=SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ","")))
B10=RIGHT(B9,LEN(B9)-FIND("#",B9))
B11=TRIM(MID(B9,B4 + 1, LEN(B9)-FIND("#",B9)-1))
When I put in "KNUD J & MARIA L HOSTRUP", I get "HOSTRUP,MARIA" in B5.
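For comparison, the branching in the helper cells (B5, B6, B7, B8) can be sketched in Python. The function name reformat_owner is hypothetical, and the sketch follows the asker's stated rules: for the sample input it keeps everything between the ampersand and the last word, so it returns the target "HOSTRUP,MARIA L" rather than the "HOSTRUP,MARIA" that the cells above produce.

```python
def reformat_owner(name):
    """Mirror the helper-cell logic for a name such as 'A & B LASTNAME'."""
    amp = name.find("&")
    if amp == -1:
        return ""  # B5: no ampersand, return an empty cell
    if "," in name:
        # B8: a comma is present, return the text left of the ampersand
        return name[:amp]
    # B7: no comma, so move the last word to the front
    before_last, _, last = name.rpartition(" ")
    middle = before_last[amp + 1:].strip()  # text between "&" and last word
    return f"{last},{middle}"


print(reformat_owner("KNUD J & MARIA L HOSTRUP"))
```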
