Arcade Labelling New Line and Keep Trailing Zeros

I'm trying to create simple labels in ArcGIS Pro using Arcade (though I'm open to using another language - I haven't had luck doing this particular task in any language).
I have a joined attribute table that has a string field with rates. This field must be a string because it includes values of *** when the rate must be suppressed. The rates have been rounded to one decimal place.
I want to create labels for my map that have county names plus a new line plus the rate. The Arcade code looks like this:
$feature['CNTYNAME'] + "\n" + $feature['Rate']
This of course works fine, but I want to keep trailing zeros on the rates, even if there's no trailing zero in the attribute table.
This works:
$feature['CNTYNAME'] + Text($feature['Rate'], "#,###.0")
However, when I try to add a new line, I lose the trailing zero:
$feature['BOUNDARY_BNDHASH_REGION_COUNTIES.CNTYNAME'] + "\n" + Text($feature['HPV Cancers.csv.Rate'], "#,###.0")
The trailing zeros also disappear if I use the function textformatting.NewLine:
$feature['CNTYNAME'] + textformatting.NewLine + Text($feature['Rate'], "#,###.0")
Thanks for any suggestions!
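One possible direction, sketched here in Python rather than Arcade: because the Rate field is a string, it may need to be converted to a number before a numeric format pattern can pad it with a trailing zero. The function name make_label and the sample values are hypothetical. The sketch leaves the *** marker untouched and formats every other value to one decimal place.

```python
def make_label(county_name, rate_text):
    """Build a two-line label: county name, then the rate with one decimal."""
    rate_text = rate_text.strip()
    try:
        # Remove thousands separators, then convert the string to a number.
        rate = float(rate_text.replace(",", ""))
    except ValueError:
        # Suppressed rates such as "***" are not numeric; show them unchanged.
        return county_name + "\n" + rate_text
    # ",.1f" keeps a thousands separator and exactly one decimal place,
    # so a stored "12" is displayed as "12.0".
    return county_name + "\n" + f"{rate:,.1f}"


print(make_label("Adams", "12"))
print(make_label("Baker", "***"))
```

The same idea may carry over to Arcade by converting the field with Number() before passing it to Text(), though that is untested here.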


Python pd.read_excel with values of mixed decimal comma or point and integers

(I haven't found a solution that would solve my "combined" case entirely)
Even after reading through the existing answers and solutions, I am still facing the obstacles described below.
Obviously, this is not about a few files that would be faster to clean manually, but a steady flow of Excel files, for which a Python script seems to be the perfect tool.
The Excel files I receive have numbers in some columns (like "Unit selling price" or "Sales Amount") stored and displayed by MS Excel in "General" format. This gives an odd result even in Excel itself: values with a decimal sign are shown as strings/text (left-aligned), while "integer" values, i.e. those without any decimal part or sign, are shown as numbers in the same column (right-aligned). But the look is not what matters.
And YES, there is a mixture: rows should normally come with a decimal comma ",", but some rows in the same file may have a decimal point, which is an ERROR for my system, and that is why I am trying to clean it up with a Python script.
In the end I could handle WHICHEVER decimal sign (comma or dot) is used, as long as it is unified across all the files in the specific column(s).
The decimal comma and/or point is one of the things to be managed, and I have tried some working solutions provided on Stack Overflow (THANKS!), like this one: Comma as decimal separator in read_excel for Pandas
But then there are also some rows (i.e. cells in the columns mentioned above) containing a value that is actually an integer (no wonder, some prices might be USD 100 without any cents, right?). I am losing those values if they are shown in Excel as, e.g., 100 instead of 100,00 or 100.00.
Issue 1. Python cannot read values with pd.read_excel and convert them DIRECTLY to float properly without me telling it that there might be a decimal point OR a decimal comma (.astype('float') and float() accept ONLY a decimal point by default).
Issue 2. While solving Issue 1, I cannot make the script smart enough to PROPERLY convert to float those values that are actually integers without any sign or decimal part.
Issue 3. If I read the Excel file directly with pd.read_excel and get the "integers" read in properly (which avoids Issue 2), then I have no way to tell pd.read_excel() that it should read the comma "," as a decimal sign. That is because pd.read_excel("file.xlsx", decimal=',') throws an error saying that 'decimal' is an unknown argument to pd.read_excel(). I have checked multiple times for misspellings etc.
The "conversions" function approach works for the comma/point issue, EXCEPT that all the cells with "strings" equivalent to INT, meaning pure integer figures without any decimal part or sign, are simply returned as nulls/disappear.
I found these kinds of issues on forums dated some years back, but nowhere a firm solution to all of them AT ONCE. Today is JAN 02, 2023, and my pandas version is 1.3.4. I would greatly appreciate "combined" advice on the above. The only way I see now would be a more elaborate regex-on-string approach, but I have a feeling I am missing some more proper solution.
The "integer-like" strings/objects (as Python reads their type) are not properly converted to floats; they are actually lost to null.
I have come up with the following solution, but hope something simpler can be proposed:
import pandas as pd

# df = pd.read_excel("file.xlsx", decimal=',')  # <<< my pandas does NOT recognize decimal=',' as a valid option/argument
df = pd.read_excel("file.xlsx")
for a_column in columns_to_have_fomat_changed:  # <<< that is to avoid wasting time processing columns of no importance. My Excel files come with 150+ columns.
    # df[a_column] = df[a_column].astype('float')  # <<< errors here, since a comma may appear instead of a point
    # df[a_column] = df[a_column].str.replace(",", ".").astype(float)  # <<< here all "integers" will be lost
    for i in df.index:  # <<< cell by cell, to distinguish the "integer" values
        cell = str(df.loc[i, a_column])  # a cell may arrive as a str or as an int
        if "," in cell or "." in cell:
            # <<< Stack Overflow solution working for mixed decimal signs etc.
            df.loc[i, a_column] = pd.to_numeric(cell.replace(',', '.').replace(' ', ''), errors='coerce')
        else:
            # <<< changing "integer" strings into "decimal" strings, then to float
            df.loc[i, a_column] = pd.to_numeric(cell + '.00', errors='coerce')  # <<< now it works without "integers" being lost
I am not sure I understand what you mean by "losing values" in "Then I am losing those values if they are shown in Excel as (i.e.) 100, instead of a 100,00 or 100.00." Perhaps you mean that only one decimal is added at the end.
Anyway, I tried to reproduce your code in a much more efficient way. Looping through the cells of a pandas DataFrame is painfully slow, and everyone advises against it. You can use a function (a lambda function in this answer) with .apply() instead:
import pandas as pd

# Create some sample data based on the description
df = pd.DataFrame(data={"unit_selling_price": ['100,00 ', '92.20 ', '90,00 ', '156'],
                        "sales_amount": ['89.45 ', '91.23 ', '45,458 ', '5784']})
columns_to_have_fomat_changed = ["unit_selling_price", "sales_amount"]
for column in df[columns_to_have_fomat_changed].columns:
    # Replace commas with .
    df[column] = df[column].replace(',', '.', regex=True)
    # Strip white spaces from left and right side of the strings
    df[column] = df[column].str.strip()
    # Convert numbers to numeric
    df[column] = df[column].apply(lambda x: float(x) if '.' in x else float(str(x)+'.00'))
Output:
   unit_selling_price  sales_amount
0               100.0        89.450
1                92.2        91.230
2                90.0        45.458
3               156.0      5784.000
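A possible reason the "integers" get lost is that read_excel may return those cells as Python numbers rather than strings, and string methods applied to such a mixed column yield nulls for the non-string cells. The sketch below is one way to handle both kinds of cell with a single function; to_float is a hypothetical helper name, and it assumes the data contains no thousands separators (a value like 45,458 is read as 45.458).

```python
import math


def to_float(value):
    """Convert a cell that may be a number or a string using ',' or '.' as decimal sign."""
    # Cells that Excel stored as real numbers need no text handling.
    if isinstance(value, (int, float)):
        return float(value)
    # Text cells: drop surrounding and inner spaces, unify the decimal sign.
    text = str(value).strip().replace(" ", "").replace(",", ".")
    try:
        return float(text)
    except ValueError:
        # Anything that is still not a number becomes NaN instead of raising.
        return math.nan


samples = ["100,00 ", "92.20 ", 156, "5784", "n/a"]
print([to_float(cell) for cell in samples])
```

With pandas, the helper could be applied per column, for example df[column] = df[column].map(to_float), which avoids the explicit cell-by-cell loop.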

Spotfire - Extract text based on conditions

I have a column containing a string value as shown in the example below:
ZAE/GER-ERT/HEZ/PDC
The idea is to extract the first trigraph (ZAE in this example) and a second one based on a rule.
The rule is: if a '-' separates two trigraphs, we don't extract them; we just take the first trigraph that comes after a '/' and has no '-' after it.
We then use a - to separate the two results. The aim for the example is: ZAE-HEZ
I would like to get this value in a new calculated column.
I've tried to play with the indexes based on the Find() and ExtractRX() functions, but couldn't make it work.
Thanks in advance !
I am not sure this is the simplest way, but it works for your example (assuming the strings are always alphanumeric in chunks of 3).
You can do it via an intermediate column (for sanity, although you could put the [tmp] formula directly into the final column):
[tmp] as
RXReplace(RXReplace([your_column],'\\w{3}-\\w{3}','','g'),'/+','/','g')
This removes any double trigraph like GER-ERT and then removes any leftover double /
Then the final column splits [tmp] by / and concatenates the first and second item
Concatenate(Split([tmp],'/',1),'-',Split([tmp],'/',2))
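To make the two steps easier to follow outside Spotfire, here is the same logic as a Python sketch using the standard re module. The function name extract_pair is hypothetical, and like the answer it assumes chunks of exactly three word characters and at least two standalone trigraphs in the string.

```python
import re


def extract_pair(value):
    """Return 'first-second' from a slash-separated string, ignoring hyphenated pairs."""
    # Step 1: remove any hyphenated double trigraph such as GER-ERT.
    tmp = re.sub(r"\w{3}-\w{3}", "", value)
    # Step 2: collapse the doubled slashes left behind by the removal.
    tmp = re.sub(r"/+", "/", tmp)
    # Step 3: split on '/' and join the first two remaining items with '-'.
    parts = tmp.split("/")
    return parts[0] + "-" + parts[1]


print(extract_pair("ZAE/GER-ERT/HEZ/PDC"))
```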

Excel: Concatenating cells & text using IF statements

I'm trying to combine several cells of data. My problem is in placing spaces between data and, more importantly NOT putting a space when there's no data so I don't get double spaces. Here's a sample:
=TRIM(M12)&IF(N12<>M12;"-"&TRIM(N12);"")&" "&TRIM(G12)&" "&TRIM(H12)&IF(LEN(I12>0);" "&TRIM(I12)&" ")&TRIM(J12)
The data is start year (M), end year (N), make (G), model (H), body style (I), driveline (J).
For some the values in start year and end year are the same.
&IF(N12<>M12;"-"&TRIM(N12);"")
This works perfectly. If the end year is the same as the start year it does not add a - or space after.
For many rows there is no value in body style.
&IF(LEN(I12>0);" "&TRIM(I12)&" ")
This will print the body style if it's present but it always adds a double space if there is no value in body style.
When I change that reference to:
&IF(LEN(I12>0);"-"&TRIM(I12)&"+")
both the - and + print regardless of what's in I12
I've tried many variations. None work, some throw errors. Probably obviously, I do not know what I'm doing in Excel but I'm thinking there must be a better way of checking the cell I12? I tried >1 with no luck but I'm not sure what to check besides the length of the data within.
The TRIM function not only removes leading and trailing spaces, but also reduces any internal multiple space sequences to a single space. By wrapping the whole formula in TRIM(..) you can ignore the possibility of creating double spaces.
Regarding
When I change that reference to:
&IF(LEN(I12>0);"-"&TRIM(I12)&"+")
both the - and + print regardless of what's in I12
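Note the parenthesis placement: LEN(I12>0) takes the length of the comparison result (TRUE or FALSE), which is never zero, so the IF always takes its first branch. It is also possible that I12 actually contains one or more spaces. Both are fixed by using LEN(TRIM(I12))>0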
Or better, just go ahead and concatenate I12 and let TRIM clean up the spaces.
Note: I'm assuming the IF(LEN(I12>0);"-"&TRIM(I12)&"+") version was just to test that bit of code, so I haven't dealt with adding - and +.
So, your whole formula can become
=TRIM(M17&IF(M17<>N17;"-"&TRIM(N17);"")&" "&G17&" "&H17&" "&I17&" "&J17)
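If you have a version of Excel that supports TEXTJOIN, then you can use the following (written with comma separators; replace them with ; if your regional settings require it, as in the formulas above):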
=TRIM(M16&IF(M16<>N16,"-"&TRIM(N16),"")&" "&TEXTJOIN(" ",TRUE,G16:J16))
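The idea behind both formulas, joining only the non-empty pieces with a single space, can be sketched in Python as follows. The function name build_description and the sample values are hypothetical, and this is only an illustration of the logic, not something Excel runs.

```python
def build_description(start_year, end_year, *parts):
    """Join the year range and the remaining fields with single spaces, skipping blanks."""
    start_year = start_year.strip()
    end_year = end_year.strip()
    # Show "start-end" only when the two years differ.
    years = start_year if end_year == start_year else f"{start_year}-{end_year}"
    # Trim every piece, then drop the ones that are empty after trimming.
    cleaned = [piece.strip() for piece in (years, *parts)]
    return " ".join(piece for piece in cleaned if piece)


# Body style is blank in the first row, so no double space appears.
print(build_description("1999", "2004", "Ford", "Focus", "", "FWD"))
print(build_description("2001", "2001", "Audi", "A4", "Sedan", "AWD"))
```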

data validation with numbers + text

I'm trying to write a custom data validation formula that would only allow values in the following format: a 2-digit year (this can be just 2 numbers), a dash ("-"), then 1 or 2 letters (I would prefer upper case, but would settle for lower case), another dash ("-"), and then a 5-digit number. So the final value looks like: 17-FL-12345 ...or 16-G-00008...
I actually have a bit more, but if I could get the above working, that would be terrific. I don't know if there's a way, but it would be great if additionally I could use custom formatting to get the dashes to appear when they are not entered, i.e., the user enters "17FL12345" and it gets automatically formatted to "17-FL-12345". Finally, and again this isn't a deal breaker either, it would also be great if the last 5 digits were padded with leading zeros, i.e., the user enters 17-G-8 (or just 17G8) and it gets formatted to 17-G-00008.
Can't use VBA unfortunately. Some potential solutions to similar questions I've viewed include:
https://www.mrexcel.com/forum/excel-questions/615799-data-validation-mixed-numeric-text-formula-only.html
Data VAlidation - Text Length & Character Type
Excel : Data Validation, how to force the user to enter a string that is 2 char long?
Try this:
=AND(ISNUMBER(VALUE(LEFT(A1,2))),MID(A1,3,1)="-",OR(ISNUMBER(FIND(MID(A1,4,1),$C$1)),AND(ISNUMBER(FIND(MID(A1,4,1),$C$1)),ISNUMBER(FIND(MID(A1,5,1),$C$1)))),MID(A1,LEN(A1)-5,1)="-",ISNUMBER(VALUE(RIGHT(A1,5))),OR(LEN(A1)=11,LEN(A1)=10),LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))=2,LEN(A1)-LEN(SUBSTITUTE(A1,"+",""))=0,LEN(A1)-LEN(SUBSTITUTE(A1," ",""))=0)
This assumes you want to validate A1. I put the allowed letters in C1.
Edit:
I edited the original function to be more secure: I left out the ISNUMBER(VALUE(...)) part and instead check digit by digit (the formulas below look each character up in C2, which is assumed to hold the allowed digits).
If you want to exceed the 255-character limit, you have to split the function up.
I created 5 functions.
=AND(ISNUMBER(FIND(LEFT(A1),$C$2)),ISNUMBER(FIND(MID(A1,2,1),$C$2)))
=MID(A1,3,1)="-"
=IF(LEN(A1)=10,AND(ISNUMBER(FIND(MID(A1,4,1),$C$1)),MID(A1,5,1)="-"),IF(LEN(A1)=11,AND(ISNUMBER(FIND(MID(A1,4,1),$C$1)),ISNUMBER(FIND(MID(A1,5,1),$C$1)))))
=IF(LEN(A1)=10,MID(A1,5,1)="-",IF(LEN(A1)=11,MID(A1,6,1)="-"))
=IF(LEN(A1)=10,AND(ISNUMBER(FIND(MID(A1,6,1),$C$2)),ISNUMBER(FIND(MID(A1,7,1),$C$2)),ISNUMBER(FIND(MID(A1,8,1),$C$2)),ISNUMBER(FIND(MID(A1,9,1),$C$2)),ISNUMBER(FIND(MID(A1,10,1),$C$2))),IF(LEN(A1)=11,AND(ISNUMBER(FIND(MID(A1,7,1),$C$2)),ISNUMBER(FIND(MID(A1,8,1),$C$2)),ISNUMBER(FIND(MID(A1,9,1),$C$2)),ISNUMBER(FIND(MID(A1,10,1),$C$2)),ISNUMBER(FIND(MID(A1,11,1),$C$2)))))
Set up data validation as on the picture:
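For comparison, the rule being validated can be written compactly as a regular expression. The Python sketch below is only an illustration of the format described in the question (it does not run inside Excel data validation); the names is_valid and normalize are hypothetical, and normalize shows the optional dash insertion and zero padding the question asks about.

```python
import re

# Strict form: two digits, dash, one or two upper-case letters, dash, five digits.
STRICT = re.compile(r"[0-9]{2}-[A-Z]{1,2}-[0-9]{5}")
# Loose form accepted as input: dashes optional, any letter case, one to five digits.
LOOSE = re.compile(r"([0-9]{2})-?([A-Za-z]{1,2})-?([0-9]{1,5})")


def is_valid(value):
    """True only for values already in the strict YY-L-NNNNN / YY-LL-NNNNN form."""
    return STRICT.fullmatch(value) is not None


def normalize(value):
    """Return the strict form for a loosely typed value, or None if it cannot match."""
    match = LOOSE.fullmatch(value.strip())
    if match is None:
        return None
    year, letters, number = match.groups()
    # Upper-case the letters and left-pad the number with zeros to five digits.
    return f"{year}-{letters.upper()}-{number.zfill(5)}"


print(is_valid("17-FL-12345"), normalize("17G8"))
```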

Variable using decimal comma instead of decimal point

I have been using Excel VBA for a while now and I have come across a problem that I have never encountered before. I am using someone else's computer that is set up as Polish, so the decimal separator is set to comma. But even when I had my own computer set up similarly in the past, I did not have any problem.
This current project creates a drawing in Visio.
I have a variable of type Double that is calculated as pgeWidth = 550 / 1.4
What I would expect is that VBA would calculate pgeWidth = 392.857...
However, what VBA is doing is pgeWidth = 392,857... If I put a break in and check the value of pgeWidth, it shows the value with a comma separator. pgeWidth is then assigned to the page width property in Visio, which thinks the page should be 392857.... mm wide, which obviously gives an error.
Why is VBA using a comma decimal separator instead of a point?
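If you want to pass the value on with a point, consider converting the variable to a string and then replacing the comma with a point. Store the result in a String variable (pgeWidthText here is a hypothetical name), because assigning it back to the Double would convert it to a number again using the regional settings:
pgeWidthText = Replace(CStr(pgeWidth), ",", ".")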
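The failure described in the question can be illustrated with a small Python sketch: the number itself carries no separator, only its text form does, and a consumer that understands only the point misreads the comma version. The function parse_point_only is a hypothetical stand-in for such a consumer; it is not how Visio actually parses values.

```python
value = 550 / 1.4                       # the number itself has no separator

invariant_text = f"{value:.3f}"         # text with a decimal point
polish_text = invariant_text.replace(".", ",")  # how a comma-decimal locale would display it


def parse_point_only(text):
    """Hypothetical consumer that keeps digits and '.' and ignores everything else."""
    kept = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    return float(kept)


# The comma is dropped, so the value comes out a thousand times too large.
print(parse_point_only(polish_text))
# The point version survives intact.
print(parse_point_only(invariant_text))
```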
