XSLT 1.0 - Selecting a specific value from a node string

I have this XML structure.
<Organisation>
<Employee type="gorilla">
<Publication>
The Hills and the Sunrise, 1-2013-05, $94,000
</Publication>
<Publication>
Sunny and Rainy on the same day, 1-2013-05, $584,932
</Publication>
</Employee>
<Employee type="gorilla">
<Publication>
The Hills and the Sunrise, 1-2013-05, $94,000
</Publication>
<Publication>
Sunny and Rainy on the same day, 1-2013-05, $584,932
</Publication>
</Employee>
<Employee type="pig">
<pigpen>
</pigpen>
</Employee>
</Organisation>
I would like to know how I can select only the dollar amount at the end of every Publication where the employee type is gorilla, and add these values together to give the total publication income. I would also like to use this total to calculate the average income per gorilla employee.
How could I go about doing this? I'm having trouble finding a function to select just the dollar amounts.
If there isn't a method to do this in XSLT 1, would it be considered good coding form to add more nodes to store the dollar amounts separately?
e.g.
<Publication>
<Description>The Hills and the Sunrise</Description>
<Amount>$94,000</Amount>
</Publication>
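The string handling asked for here can be sketched outside XSLT. The Python example below only illustrates the logic and is not an XSLT answer: it takes the text after the last $ in each Publication, strips the thousands separators, sums the amounts for gorilla employees and divides by the number of gorilla employees. The function names are made up for this sketch. In XSLT 1.0 the closest building blocks would be the XPath 1.0 functions substring-after, translate and count, although sum() only adds node values directly, so totalling the extracted amounts would likely need a recursive template.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
XML = """<Organisation>
<Employee type="gorilla">
<Publication>
The Hills and the Sunrise, 1-2013-05, $94,000
</Publication>
<Publication>
Sunny and Rainy on the same day, 1-2013-05, $584,932
</Publication>
</Employee>
<Employee type="gorilla">
<Publication>
The Hills and the Sunrise, 1-2013-05, $94,000
</Publication>
<Publication>
Sunny and Rainy on the same day, 1-2013-05, $584,932
</Publication>
</Employee>
<Employee type="pig">
<pigpen>
</pigpen>
</Employee>
</Organisation>"""


def publication_amount(text):
    """Return the dollar amount after the last '$' as an integer."""
    # Assumes every Publication ends with an amount such as "$94,000".
    return int(text.rsplit("$", 1)[1].strip().replace(",", ""))


def gorilla_income(xml_text):
    """Return (total income, average income per gorilla employee)."""
    root = ET.fromstring(xml_text)
    gorillas = root.findall("Employee[@type='gorilla']")
    total = sum(
        publication_amount(pub.text)
        for employee in gorillas
        for pub in employee.findall("Publication")
    )
    average = total / len(gorillas) if gorillas else 0.0
    return total, average


total, average = gorilla_income(XML)
print(total, average)
```

Storing the amount in its own Amount element, as suggested in the question, would remove the need for the string splitting step altogether.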

Related

Rating scale 1-10

I have survey data with rating responses of "Excellent", "Very Good", "Good" or "Poor". How do I replace these texts in Excel with the number ranges "9-10", "7-8" or "0-6"?
=CHOOSE(MATCH(A1,{"Excellent","Very Good","Good","Poor"},0),"9-10","7-8","5-6","0-4")
I made up the ranges, since your question lists four options but only three results.
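For readers who want to check the mapping outside Excel, here is a minimal Python sketch of the same lookup. The ranges are the made-up values from the CHOOSE formula above, and the function name is hypothetical. The lookup ignores case, as MATCH does; stripping surrounding spaces is an extra convenience that the formula does not perform.

```python
# Ranges copied from the CHOOSE formula above (the answerer's made-up values).
RATING_RANGES = {
    "excellent": "9-10",
    "very good": "7-8",
    "good": "5-6",
    "poor": "0-4",
}


def rating_to_range(label):
    """Return the number range for a rating label, or None if it is unknown."""
    return RATING_RANGES.get(label.strip().casefold())


print(rating_to_range("Very Good"))
```

Where the Excel formula would return #N/A for an unknown label, this sketch returns None.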
Going the other way (from a number to a label), this can be done with a nested =IF() construction, something like:
=IF(OR(A1=9,A1=10),"Excellent",
IF(OR(A1=7,A1=8),...),
...
)

Search and get row from large single string

Hi, I have a single large string. I need to search it for a set of strings, get the rows that contain them, and create a data frame from those rows.
Large string:
This is democracy’s day.
A day of history and hope.
Of renewal and resolve.
Through a crucible for the ages America has been tested anew and America has risen to the challenge.
Today, we celebrate the triumph not of a candidate, but of a cause, the cause of democracy.
The will of the people has been heard and the will of the people has been heeded.
We have learned again that democracy is precious.
Now I want to search for a few strings in the text above, and my final output dataframe should look like this:
Searching string
democracy’s day
America has been tested
celebrate the triumph
democracy is precious
Thanks in advance
You can create a regex out of your search strings and compare them for a match against the Large String column using extract. Where there's a match, the match string will be the value in the Searching String column, otherwise it will be null. The dataframe can then be filtered on the Searching String value being not null:
import re
import pandas as pd
df = pd.DataFrame({ 'Large String': ["This is democracy's day.", "A day of history and hope.","Of renewal and resolve.","Through a crucible for the ages America has been tested anew and America has risen to the challenge.","Today, we celebrate the triumph not of a candidate, but of a cause, the cause of democracy.","The will of the people has been heard and the will of the people has been heeded.","We have learned again that democracy is precious."] })
search_strings = ["democracy's day", "America has been tested", "celebrate the triumph", "democracy is precious"]
regex = '|'.join(map(re.escape, search_strings))
df['Searching String'] = df['Large String'].str.extract(f'({regex})')
df = df[~df['Searching String'].isna()]
print(df)
Output:
                                        Large String         Searching String
0                           This is democracy's day.          democracy's day
3  Through a crucible for the ages America has be...  America has been tested
4  Today, we celebrate the triumph not of a candi...    celebrate the triumph
6  We have learned again that democracy is precious.    democracy is precious
Note:
We use re.escape on the search strings in case they contain characters that are special in a regex, e.g. . or (.
If one of the search strings is a substring of another, the list should be sorted in order of decreasing length to ensure the longer matches are captured.
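The second note can be illustrated with the standard re module alone. In this sketch the short search string "democracy" is a hypothetical addition (it is not in the question) chosen so that one search string is contained in another, and build_pattern is a made-up helper name. Regex alternation tries the alternatives from left to right, so the shorter string wins unless the longer one is listed first.

```python
import re


def build_pattern(search_strings):
    """Join escaped search strings into one alternation, longest first."""
    ordered = sorted(search_strings, key=len, reverse=True)
    return "|".join(map(re.escape, ordered))


text = "We have learned again that democracy is precious."
# "democracy" is a hypothetical extra search string, used only to show the overlap.
search_strings = ["democracy", "democracy is precious"]

# Unsorted: the shorter alternative is tried first and matches.
naive_pattern = "|".join(map(re.escape, search_strings))
naive_match = re.search(naive_pattern, text).group(0)

# Sorted by decreasing length: the longer alternative is tried first.
sorted_match = re.search(build_pattern(search_strings), text).group(0)

print(naive_match)
print(sorted_match)
```

The string returned by build_pattern can be passed to str.extract in the same way as regex in the answer above.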

Extract value from column with pandas lib (data frame)

Original data frame:
| Date | Detail |
| --- | --- |
| 31/03/22 | I watch Netflix at home with my family 4 hours |
| 01/04/22 | I walk to the market for 3km and I spent 11.54 dollar |
| 02/04/22 | my dog bite me, I go to hospital, spend 29.99 dollar |
| 03/04/22 | I bought a game on steam 7 games spen 19.23 dollar |
Result data frame:
| Date | Detail | Cost |
| --- | --- | --- |
| 31/03/22 | I watch Netflix at home with my family 4 hours | 0 |
| 01/04/22 | I walk to the market for 3km and I spent 11.54 dollar | 11.54 |
| 02/04/22 | my dog bite me, I go to hospital, spend 29.99 dollar | 29.99 |
| 03/04/22 | I bought a game on steam 7 games spen 19.23 dollar | 19.23 |
My question:
If the Detail column does not contain a phrase that begins with sp.. and ends with dollar, the value in the Cost column should be zero.
If the Detail column does contain such a phrase, the value in the Cost column should be the number in the middle of it, between the sp.. word and dollar.
I tried to use a regex, but it returns the first integer found in the column, like this:
| 01/04/22 | I walk to the market for 3km and I spent 11.54 dollar| 3 |
You should be able to use a regex pattern of a form such as:
df['Cost'] = df['Detail'].str.extract(r'sp\D*([\d\.]*)\D*dollar')
This will look for the literal string sp and then any non-digit characters after it. The capture group (denoted by the ()) looks for any digits or period characters, representing the dollar amount. This is what is returned to the Cost column. The final part of the pattern allows any number of non-digit characters after the dollar amount, followed by the literal string dollar.
Rows that have no cost come back as missing values (NaN), which can then be replaced with 0:
df['Cost'] = df['Cost'].fillna(0)
If you want to make any enhancements I used this site to test the regex: https://regexr.com/6ir6o
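One possible enhancement, shown here as a sketch with the standard re module rather than pandas: the pattern above also accepts sp inside another word (for example the sp in hospital) and allows an empty capture. The stricter pattern below is my own assumption about what the question intends, namely a word starting with sp, then a number, then dollar. The names COST_PATTERN and extract_cost are hypothetical.

```python
import re

# Hypothetical stricter variant: a word starting with "sp", whitespace,
# a number with an optional decimal part, whitespace, then "dollar".
COST_PATTERN = re.compile(r"\bsp\w*\s+(\d+(?:\.\d+)?)\s+dollar")


def extract_cost(detail):
    """Return the cost as a float, or 0.0 when no cost phrase is found."""
    match = COST_PATTERN.search(detail)
    return float(match.group(1)) if match else 0.0


# Detail values copied from the question.
details = [
    "I watch Netflix at home with my family 4 hours",
    "I walk to the market for 3km and I spent 11.54 dollar",
    "my dog bite me, I go to hospital, spend 29.99 dollar",
    "I bought a game on steam 7 games spen 19.23 dollar",
]
costs = [extract_cost(d) for d in details]
print(costs)
```

The same pattern string could be passed to str.extract in pandas. Note that str.extract returns the captured text as strings, so a numeric Cost column would need a conversion such as astype(float) afterwards.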

How to extract specific words from a cell in Excel?

I want to extract only these words if they are present in a cell:
{Beijing, New York, Japan}
I have a column with the following data (one entry per row):
Nice city- Beijing, Awesome climate
Fair city- Japan, Cool weather
New York is so nice
All I want is another column which will contain:
Beijing
Japan
New York
Is it possible to do this without VBA?
Is there a formula? I have n entries, one per row.
You can try:
=LOOKUP(2^15,SEARCH({"Beijing","New York","Japan"},A1,1),{"Beijing","New York","Japan"})
You could use a formula such as
=IF(IFERROR(FIND("Beijing",A1),0)=0,"","Beijing")&
IF(IFERROR(FIND("Japan",A1),0)=0,"","Japan")&
IF(IFERROR(FIND("New York",A1),0)=0,"","New York")
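The idea behind both formulas (test each keyword against the cell and return the ones that are found) can be sketched in Python as follows. The function name is made up for this example. The comparison ignores case, which corresponds to SEARCH in the first formula; FIND in the second formula is case-sensitive.

```python
KEYWORDS = ["Beijing", "New York", "Japan"]


def find_keywords(cell, keywords=KEYWORDS):
    """Return the keywords that occur in the cell text, in list order."""
    lowered = cell.casefold()
    return [kw for kw in keywords if kw.casefold() in lowered]


# Rows copied from the question.
rows = [
    "Nice city- Beijing, Awesome climate",
    "Fair city- Japan, Cool weather",
    "New York is so nice",
]
extracted = [", ".join(find_keywords(row)) for row in rows]
print(extracted)
```

A cell that mentions more than one keyword yields all of them here, joined with a comma, and a cell with no keyword yields an empty string.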

Is there a simple way to combine VLOOKUP with "linear interpolation" in Excel?

I'm making an Excel sheet for calculating the z-score for infant weight-for-age (inputs: "Baby Month Age" and "Baby weight"). To do that, I first need to get the LMS parameters for a specific month from the table below.
http://www.who.int/childgrowth/standards/tab_wfa_boys_p_0_5.txt
(For an integer month number, this can be done with VLOOKUP without issue.) For a non-integer month number, I need to use some kind of "linear interpolation" to get approximate LMS data.
The problem is that neither the TREND method nor the VLOOKUP method works for me. TREND does not work because the raw data, such as the L parameter, is not linear, so for the first few months the returned values are far from the existing data. VLOOKUP just finds the closest month's data.
I had to use multiple MATCH and INDEX calls to do the "linear interpolation" myself. However, I wonder whether there is an existing function for that?
My current formula for the L parameter is below:
=MOD([Month Age],1)*(INDEX('WHO BOY AGE WEIGHT'!A:D,MATCH([Month Age],'WHO BOY AGE WEIGHT'!A:A)+1,2)-INDEX('WHO BOY AGE WEIGHT'!A:D,MATCH([Month Age],'WHO BOY AGE WEIGHT'!A:A),2))+INDEX('WHO BOY AGE WEIGHT'!A:D,MATCH([Month Age],'WHO BOY AGE WEIGHT'!A:A),2)
If we assume that months always increment by 1 (no gaps in the month data), you can use a formula like this to interpolate between the two values surrounding the given non-integer value:
=(1-MOD(2.3, 1))*VLOOKUP(2.3,A:S,2)+MOD(2.3, 1)*VLOOKUP(2.3+1,A:S, 2)
This interpolates L(2.3) from L(2) = .197 and L(3) = .1738, resulting in .19004.
You can replace 2.3 with any cell reference. You can also change the lookup column from 2 for L to 3 for M, 4 for S, etc.
As to whether there is a direct "interpolate" function in Excel: not that I know of, although there is a good arsenal of functions for statistical estimation.
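The arithmetic of the formula above can be checked with a short Python sketch. It makes the same assumption as the answer (month data in steps of 1 with no gaps) and uses only the two L values quoted above. The function and variable names are made up for this example.

```python
import math


def interpolate(table, month):
    """Linearly interpolate a parameter between the two surrounding whole months."""
    lower = math.floor(month)
    frac = month - lower
    if frac == 0:
        return table[lower]
    # Weighted average of the values at the lower and upper month.
    return (1 - frac) * table[lower] + frac * table[lower + 1]


# L values for months 2 and 3 as quoted in the answer above.
L_BY_MONTH = {2: 0.197, 3: 0.1738}

print(round(interpolate(L_BY_MONTH, 2.3), 5))
```

The weights (1 - frac) and frac play the role of 1-MOD(2.3, 1) and MOD(2.3, 1) in the Excel formula.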
