How to combine 2 PMML files into 1 file with 2 outputs? - modeling

I have developed two models (classification and regression) and exported them to the PMML exchange format via https://github.com/jpmml/jpmml-xgboost. Both models run fine when I call them in Python. However, I would love to combine both into one file that returns two values: the class probability from the classification model AND the predicted value from the regression model.
I have tried for hours but have not managed to understand the PMML specification well enough.
Does anyone have experience with this and could give me a hint on how to combine the files and pass the values through the combined file? Both models require exactly the same input.
Thank you!
See two mini examples below:
regression model:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_4" xmlns:data="http://jpmml.org/jpmml-model/InlineTable" version="4.4">
<Header>
<Application name="JPMML-XGBoost" version="1.5-SNAPSHOT"/>
<Timestamp>2021-07-27T11:55:26Z</Timestamp>
</Header>
<DataDictionary>
<DataField name="mpg" optype="continuous" dataType="float">
<Value value="NaN" property="missing"/>
</DataField>
<DataField name="IDVAR_REINIG" optype="continuous" dataType="float">
<Value value="NaN" property="missing"/>
</DataField>
</DataDictionary>
<MiningModel functionName="regression" algorithmName="XGBoost (GBTree)" x-mathContext="float">
<MiningSchema>
<MiningField name="mpg" usageType="target"/>
<MiningField name="IDVAR_REINIG"/>
</MiningSchema>
<Targets>
<Target field="mpg" rescaleConstant="0.5"/>
</Targets>
<Segmentation multipleModelMethod="sum">
<Segment id="1">
<True/>
<TreeModel functionName="regression" noTrueChildStrategy="returnLastPrediction" x-mathContext="float">
<MiningSchema>
<MiningField name="IDVAR_REINIG"/>
</MiningSchema>
<Output>
<OutputField name="mpg" optype="continuous" dataType="float" isFinalResult="false" rescaleConstant="0.5"/>
</Output>
<Node score="1.7433707">
<True/>
<Node score="6.1398296">
<SimplePredicate field="IDVAR_REINIG" operator="greaterOrEqual" value="6033.51"/>
</Node>
</Node>
</TreeModel>
</Segment>
</Segmentation>
</MiningModel>
</PMML>
classification model:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PMML xmlns="http://www.dmg.org/PMML-4_4" xmlns:data="http://jpmml.org/jpmml-model/InlineTable" version="4.4">
<Header>
<Application name="JPMML-XGBoost" version="1.5-SNAPSHOT"/>
<Timestamp>2021-07-27T11:54:45Z</Timestamp>
</Header>
<DataDictionary>
<DataField name="mpg" optype="categorical" dataType="integer">
<Value value="0"/>
<Value value="1"/>
</DataField>
<DataField name="IDVAR_REINIG" optype="continuous" dataType="float">
<Value value="NaN" property="missing"/>
</DataField>
</DataDictionary>
<MiningModel functionName="classification" algorithmName="XGBoost (GBTree)" x-mathContext="float">
<MiningSchema>
<MiningField name="mpg" usageType="target"/>
<MiningField name="IDVAR_REINIG"/>
</MiningSchema>
<Segmentation multipleModelMethod="modelChain" missingPredictionTreatment="returnMissing">
<Segment id="1">
<True/>
<MiningModel functionName="regression" x-mathContext="float">
<MiningSchema>
<MiningField name="IDVAR_REINIG"/>
</MiningSchema>
<Output>
<OutputField name="xgbValue" optype="continuous" dataType="float" isFinalResult="false"/>
</Output>
<Segmentation multipleModelMethod="sum">
<Segment id="1">
<True/>
<TreeModel functionName="regression" noTrueChildStrategy="returnLastPrediction" x-mathContext="float">
<MiningSchema>
<MiningField name="IDVAR_REINIG"/>
</MiningSchema>
<Node score="0.0070259375">
<True/>
<Node score="-0.030500757">
<SimplePredicate field="IDVAR_REINIG" operator="greaterOrEqual" value="2240.835"/>
</Node>
</Node>
</TreeModel>
</Segment>
</Segmentation>
</MiningModel>
</Segment>
<Segment id="2">
<True/>
<RegressionModel functionName="classification" normalizationMethod="logit" x-mathContext="float">
<MiningSchema>
<MiningField name="mpg" usageType="target"/>
<MiningField name="xgbValue"/>
</MiningSchema>
<Output>
<OutputField name="probability(0)" optype="continuous" dataType="float" feature="probability" value="0"/>
<OutputField name="probability(1)" optype="continuous" dataType="float" feature="probability" value="1"/>
</Output>
<RegressionTable intercept="0.0" targetCategory="1">
<NumericPredictor name="xgbValue" coefficient="1.0"/>
</RegressionTable>
<RegressionTable intercept="0.0" targetCategory="0"/>
</RegressionModel>
</Segment>
</Segmentation>
</MiningModel>
</PMML>

Create a parent MiningModel element that holds the two existing child model elements. Insert the classification model first, and the regression model second; execute them as a model chain.
By default, this model chain will only display the result fields of the last child model. However, you can export one or more result fields of the first model into "local variables", and then reflect their values wherever needed.
Sample PMML markup skeleton:
<MiningModel>
<Segmentation multipleModelMethod="modelChain">
<Segment id="classification">
<True/>
<MiningModel>
<Output>
<!-- Export the probability value to evaluation context -->
<OutputField name="probability(event)" feature="probability" value="event"/>
</Output>
</MiningModel>
</Segment>
<Segment id="regression">
<True/>
<MiningModel>
<MiningSchema>
<!-- Import the probability value from the evaluation context -->
<MiningField name="probability(event)"/>
</MiningSchema>
<Output>
<!-- Re-export the probability value under a different name -->
<OutputField name="copy(probability(event))" feature="transformedValue">
<FieldRef field="probability(event)"/>
</OutputField>
</Output>
</MiningModel>
</Segment>
</Segmentation>
</MiningModel>

Alternatively, you may employ the "segment referencing" mechanism to access child model outputs from the parent output.
See the description of the OutputField#segmentId attribute in the PMML specification.
Sample PMML markup skeleton:
<MiningModel>
<Segmentation multipleModelMethod="modelChain">
<Segment id="classification"/>
<Segment id="regression"/>
</Segmentation>
<Output>
<!-- Reflect the probability of the "event" class of the classification model -->
<OutputField name="probability(event)" segmentId="classification" feature="probability" value="event"/>
<!-- Reflect the predicted value of the regression model -->
<OutputField name="y" segmentId="regression" feature="predictedValue"/>
</Output>
</MiningModel>
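The segment-referencing skeleton above could be assembled programmatically. The following is a minimal Python sketch, not a complete converter: `build_chain` is a hypothetical helper, the output field names and the parent's functionName="regression" are assumptions, and it only builds the parent MiningModel. Merging the two DataDictionary elements is left out; note that both example files call their target "mpg" with different types, so one of them would probably have to be renamed first.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"  # namespace used by both example files


def q(tag):
    """Return the namespace-qualified tag name."""
    return f"{{{PMML_NS}}}{tag}"


def build_chain(classification_model, regression_model, input_fields):
    """Wrap two MiningModel elements in a parent model chain (hypothetical helper)."""
    # Assumption: the parent takes the function of the last model in the chain.
    parent = ET.Element(q("MiningModel"), functionName="regression")

    schema = ET.SubElement(parent, q("MiningSchema"))
    for name in input_fields:
        ET.SubElement(schema, q("MiningField"), name=name)

    # Parent-level outputs that reflect values from the child segments.
    output = ET.SubElement(parent, q("Output"))
    ET.SubElement(output, q("OutputField"), name="probability(1)",
                  segmentId="classification", feature="probability", value="1")
    ET.SubElement(output, q("OutputField"), name="y",
                  segmentId="regression", feature="predictedValue")

    # Classification first, regression second, executed as a model chain.
    segmentation = ET.SubElement(parent, q("Segmentation"),
                                 multipleModelMethod="modelChain")
    for segment_id, model in (("classification", classification_model),
                              ("regression", regression_model)):
        segment = ET.SubElement(segmentation, q("Segment"), id=segment_id)
        ET.SubElement(segment, q("True"))
        segment.append(model)
    return parent


# Placeholder child models; in practice these would be the MiningModel
# elements parsed from the two exported files.
cls_model = ET.Element(q("MiningModel"), functionName="classification")
reg_model = ET.Element(q("MiningModel"), functionName="regression")
parent = build_chain(cls_model, reg_model, ["IDVAR_REINIG"])
ET.register_namespace("", PMML_NS)
print(ET.tostring(parent, encoding="unicode"))
```

The result still has to be placed inside a PMML root element with a Header and a merged DataDictionary, and it is worth validating with a PMML engine before relying on it.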

Related

XLSX XML cell formatting works in LibreOffice but not MS Excel

I modified the DataTables xlsx export to generate tables with my custom styles, primarily for the background colors. Mine is a mess, but it works. It generates the xlsx file, and in LibreOffice it looks exactly like it should. But in Excel, the cells with Style #3 (FFAAAA) are not filled with a solid yellow background but with a dotted gray background.
The ones with red or white background just work fine everywhere.
The whole xml was reverse engineered from other exports.
Any idea what Excel expects to be different?
<?xml version="1.0" encoding="UTF-8"?>
<styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac">
<numFmts count="7">
<numFmt numFmtId="0" formatCode=""/>
<numFmt numFmtId="1" formatCode="#,##0.00_-\ [$$-45C]"/>
<numFmt numFmtId="2" formatCode="&quot;£&quot;#,##0.00"/>
<numFmt numFmtId="3" formatCode="[$€-2]\ #,##0.00"/>
<numFmt numFmtId="4" formatCode="0.0%"/>
<numFmt numFmtId="5" formatCode="#,##0;(#,##0)"/>
<numFmt numFmtId="6" formatCode="#,##0.00;(#,##0.00)"/>
</numFmts>
<fonts count="2" x14ac:knownFonts="1">
<font>
<sz val="11" />
<name val="undefined" />
<color rgb="FF000000" />
</font>
<font>
<sz val="11" />
<name val="Calibri" />
<color rgb="FF000000" />
<b />
</font>
</fonts>
<fills count="4">
<fill>
<patternFill patternType="none" />
</fill>
<fill>
<patternFill patternType="solid">
<fgColor rgb="FFffeeaa" />
<bgColor indexed="64" />
</patternFill>
</fill>
<fill>
<patternFill patternType="solid">
<fgColor rgb="FFffaaaa" />
<bgColor indexed="65" />
</patternFill>
</fill>
<fill>
<patternFill patternType="solid">
<fgColor rgb="FFffffff" />
<bgColor indexed="66" />
</patternFill>
</fill>
</fills>
<borders count="2">
<border> <left /> <right /> <top /> <bottom /> <diagonal /> </border>
<border diagonalUp="false" diagonalDown="false"> <left style="thin"> <color auto="1" /> </left> <right style="thin"> <color auto="1" /> </right> <top style="thin"> <color auto="1" /> </top> <bottom style="thin"> <color auto="1" /> </bottom> <diagonal /> </border>
</borders>
<cellStyleXfs count="1">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0" />
</cellStyleXfs>
<cellXfs count="5">
<xf numFmtId="0" fontId="0" fillId="0" borderId="0" applyFont="1" applyFill="1" applyBorder="1"/>
<xf numFmtId="0" fontId="1" fillId="0" borderId="1" applyFont="1" applyFill="1" applyBorder="1"/>
<xf numFmtId="0" fontId="1" fillId="1" borderId="1" applyFont="1" applyFill="1" applyBorder="1"/>
<xf numFmtId="0" fontId="1" fillId="2" borderId="1" applyFont="1" applyFill="1" applyBorder="1"/>
<xf numFmtId="0" fontId="1" fillId="3" borderId="1" applyFont="1" applyFill="1" applyBorder="1"/>
</cellXfs>
<cellStyles count="1">
<cellStyle name="Normal" xfId="0" builtinId="0" />
</cellStyles>
<dxfs count="0" />
<tableStyles count="0" defaultTableStyle="TableStyleMedium9" defaultPivotStyle="PivotStyleMedium4" />
</styleSheet>
It seems Excel always overwrites the second fill with patternType="gray125".
I now always keep
<fill>
<patternFill patternType="gray125">
<fgColor rgb="FFffffff" />
<bgColor rgb="FFffffff" />
</patternFill>
</fill>
as the second fill, regardless of whether I actually use it in any style, and add the fills I need after it. Now it works in both LibreOffice Calc and MS Excel.
I hope that helps others as well.
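As an illustration of that layout, here is a minimal Python sketch that generates a fills element with the two leading placeholder fills followed by the custom colors. `build_fills` is a hypothetical helper, and the bgColor value is simply copied from the question's stylesheet; the rest of the stylesheet is not produced here.

```python
import xml.etree.ElementTree as ET


def build_fills(custom_colors):
    """Build a <fills> element whose first two entries are the placeholder fills."""
    fills = ET.Element("fills")

    # Fill 0: no pattern.
    none_fill = ET.SubElement(fills, "fill")
    ET.SubElement(none_fill, "patternFill", patternType="none")

    # Fill 1: the gray125 pattern that Excel appears to force into this slot.
    gray_fill = ET.SubElement(fills, "fill")
    ET.SubElement(gray_fill, "patternFill", patternType="gray125")

    # Custom solid fills start at index 2.
    for argb in custom_colors:
        fill = ET.SubElement(fills, "fill")
        pattern = ET.SubElement(fill, "patternFill", patternType="solid")
        ET.SubElement(pattern, "fgColor", rgb=argb)
        ET.SubElement(pattern, "bgColor", indexed="64")

    fills.set("count", str(len(fills)))
    return fills


fills = build_fills(["FFFFEEAA", "FFFFAAAA", "FFFFFFFF"])
print(ET.tostring(fills, encoding="unicode"))
```

With this layout the first custom color has fillId="2", so the fillId values in cellXfs have to be shifted accordingly.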

Solr Better search result with adjacent query keyword

I have configured solr for my ecommerce application (That mostly contains books data). The search result does not seem to return what I expect.
Following is the configuration.
schema.xml
<field name="namespace" type="string" indexed="true" stored="false" />
<field name="id" type="string" indexed="true" stored="true" />
<field name="productId" type="long" indexed="true" stored="true" />
<field name="skuId" type="long" indexed="true" stored="true" />
<field name="category" type="long" indexed="true" stored="false" multiValued="true" />
<field name="explicitCategory" type="long" indexed="true" stored="false" multiValued="true" />
<field name="searchable" type="text_general" indexed="true" stored="false" />
<dynamicField name="*_searchable" type="text_general" indexed="true" stored="false" />
<dynamicField name="*_i" type="int" indexed="true" stored="false" />
<dynamicField name="*_is" type="int" indexed="true" stored="false" multiValued="true" />
<dynamicField name="*_s" type="string" indexed="true" stored="false" />
<dynamicField name="*_ss" type="string" indexed="true" stored="false" multiValued="true" />
<dynamicField name="*_l" type="long" indexed="true" stored="false" />
<dynamicField name="*_ls" type="long" indexed="true" stored="false" multiValued="true" />
<dynamicField name="*_t" type="text_general" indexed="true" stored="false" />
<dynamicField name="*_txt" type="text_general" indexed="true" stored="false" multiValued="true" />
<dynamicField name="*_b" type="boolean" indexed="true" stored="false" />
<dynamicField name="*_bs" type="boolean" indexed="true" stored="false" multiValued="true" />
<dynamicField name="*_d" type="double" indexed="true" stored="false" />
<dynamicField name="*_ds" type="double" indexed="true" stored="false" multiValued="true" />
<dynamicField name="*_p" type="double" indexed="true" stored="false" />
<dynamicField name="*_dt" type="date" indexed="true" stored="false" />
<dynamicField name="*_dts" type="date" indexed="true" stored="false" multiValued="true" />
<!-- some trie-coded dynamic fields for faster range queries -->
<dynamicField name="*_ti" type="tint" indexed="true" stored="false" />
<dynamicField name="*_tl" type="tlong" indexed="true" stored="false" />
<dynamicField name="*_td" type="tdouble" indexed="true" stored="false" />
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="false" />
<!-- Both field types required for geolocation searches. First stores the
lat and lon components for the "coordinate" FieldType. Second stores
the coordinate. -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
<dynamicField name="*_c" type="coordinate" indexed="true" stored="false"/>
</fields>
<uniqueKey>id</uniqueKey>
<types>
<!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
<!-- boolean type: "true" or "false" -->
<fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" />
<!-- Default numeric field types. For faster range queries, consider the
tint/tlong/tdouble types. -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0" />
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0" />
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0" />
<!-- Numeric field types that index each value at various levels of precision
to accelerate range queries when the number of values between the range endpoints
is large. See the javadoc for NumericRangeQuery for internal implementation
details. Smaller precisionStep values (specified in bits) will lead to more
tokens indexed per value, slightly larger index size, and faster range queries.
A precisionStep of 0 disables indexing at different precision levels. -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0" />
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0" />
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0" />
<!-- The format for this date field is of the form 1995-12-31T23:59:59Z,
and is a more restricted form of the canonical representation of dateTime
http://www.w3.org/TR/xmlschema-2/#dateTime The trailing "Z" designates UTC
time and is mandatory. Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
All other components are mandatory. Expressions can also be used to denote
calculations that should be performed relative to "NOW" to determine the
value, ie... NOW/HOUR ... Round to the start of the current hour NOW-1DAY
... Exactly 1 day prior to now NOW/DAY+6MONTHS+3DAYS ... 6 months and 3 days
in the future from the start of the current day Consult the DateField javadocs
for more information. Note: For faster range queries, consider the tdate
type -->
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0" />
<!-- A Trie based date field for faster date range queries and date faceting. -->
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0" />
<!-- A general text field that has reasonable, generic cross-language defaults:
it tokenizes with StandardTokenizer and down cases. -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory" />
<filter class="solr.LowerCaseFilterFactory" />
</analyzer>
</fieldType>
<!-- A specialized field for geospatial search. If indexed, this fieldType must not be multivalued. -->
<fieldType name="coordinate" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
</types>
solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
<luceneMatchVersion>4.10.3</luceneMatchVersion>
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}" />
<updateHandler class="solr.DirectUpdateHandler2" />
<query>
<maxBooleanClauses>1024</maxBooleanClauses>
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0" />
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0" />
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0" />
<cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" autowarmCount="10"
regenerator="solr.NoOpRegenerator" />
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<queryResultWindowSize>20</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<listener event="newSearcher" class="solr.QuerySenderListener" />
<listener event="firstSearcher" class="solr.QuerySenderListener">
<arr name="queries">
<lst>
<str name="q">static firstSearcher warming in solrconfig.xml</str>
</lst>
</arr>
</listener>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>2</maxWarmingSearchers>
</query>
<requestDispatcher handleSelect="false">
<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" formdataUploadLimitInKB="2048"
addHttpRequestToContext="false"/>
<httpCaching never304="true" />
</requestDispatcher>
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">name_t</str>
</lst>
</requestHandler>
<queryResponseWriter name="json" class="solr.JSONResponseWriter">
<str name="content-type">text/plain; charset=UTF-8</str>
</queryResponseWriter>
For example, when I search for 2 states, it gives me a lot of random results that do not even contain "2 states" in the title.
However, when I search for the phrase "2 States" (in quotes), I do get the relevant results.
I don't want to wrap every search in quotes, since a user might search for some combination like "book by author", which would certainly give 0 results as a phrase search because it won't match the exact phrase.
How can I improve my search so that the most relevant results are listed at the top?
You can use the pf2 and pf3 parameters of the edismax query parser to boost documents where two (pf2) or three (pf3) of your terms appear next to each other in the field.
defType=edismax&pf2=title^4
You also have the pf argument for the regular dismax handler, but that's built on the assumption that all the terms are close together. It might help, but pf2 or pf3 sounds better suited for what you need.
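A request with these parameters could be assembled as in the following Python sketch. The host and core name are hypothetical, the field name_t is taken from the df setting in the question's solrconfig.xml, the pf2 boost of 4 follows the answer, and the pf3 boost of 8 is an arbitrary assumption. The sketch only builds the URL; it does not contact Solr.

```python
from urllib.parse import urlencode, parse_qs


def build_select_url(base_url, user_query, field="name_t"):
    """Build a /select URL that boosts adjacent term pairs and triples."""
    params = {
        "q": user_query,
        "defType": "edismax",   # pf2/pf3 are edismax parameters
        "qf": field,            # field(s) the individual terms are matched against
        "pf2": f"{field}^4",    # boost when two query terms are adjacent
        "pf3": f"{field}^8",    # boost when three query terms are adjacent (assumed value)
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and core name.
url = build_select_url("http://localhost:8983/solr/books", "2 states")
print(url)
```

The user's query stays unquoted, so a search like "book by author" still matches on individual terms, while documents containing the terms side by side are ranked higher.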

How to group properly excel data source with rowspan?

From my understanding, the <groupExpression> tag decides whether a new group is created: whenever the value of the expression inside <groupExpression> changes, a new group starts.
I want my report to look similar to my Excel data source (see below), so I want to group by the ID and Name columns of the Excel file. In my jasperReport.jrxml (see below), the <groupExpression> for Group1 is the ID column of my Excel file. But when I preview the report (see below), the ID and Name columns are not grouped; instead, a null string appears.
How do I group them properly and eliminate the null string?
Excel datasource:
jasperReport.jrxml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Created with Jaspersoft Studio version 6.2.0.final using JasperReports Library version 6.2.0 -->
<!-- 2016-01-26T15:33:41 -->
<jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports http://jasperreports.sourceforge.net/xsd/jasperreport.xsd" name="FirstJasperReport" pageWidth="595" pageHeight="842" columnWidth="555" leftMargin="20" rightMargin="20" topMargin="20" bottomMargin="20" uuid="8b8832df-588e-4202-826e-a6b3efcbd22b">
<property name="com.jaspersoft.studio.data.defaultdataadapter" value="ExcelDataBase"/>
<queryString>
<![CDATA[]]>
</queryString>
<field name="ID" class="java.lang.Integer"/>
<field name="Name" class="java.lang.String"/>
<field name="Title" class="java.lang.String"/>
<field name="Balance" class="java.lang.Integer"/>
<variable name="Balance1" class="java.lang.Integer" resetType="Group" resetGroup="Group1" calculation="Count">
<variableExpression><![CDATA[$F{Balance}]]></variableExpression>
</variable>
<variable name="Balance2" class="java.lang.Integer" resetType="Group" resetGroup="Group1" calculation="Sum">
<variableExpression><![CDATA[$F{Balance}]]></variableExpression>
</variable>
<group name="Group1">
<groupExpression><![CDATA[$F{ID}]]></groupExpression>
<groupHeader>
<band height="30">
<rectangle>
<reportElement x="0" y="0" width="400" height="30" backcolor="#DEFCF2" uuid="de6c2f8d-afa6-45b4-b40e-574f2e07057e"/>
</rectangle>
<textField>
<reportElement x="0" y="0" width="100" height="30" uuid="c028645d-9b29-42d3-b91e-d47f15a5b44a"/>
<textFieldExpression><![CDATA[$F{ID}]]></textFieldExpression>
</textField>
<textField>
<reportElement x="100" y="0" width="100" height="30" uuid="85d2844f-ef91-47a1-9223-c6943a25fe4d"/>
<textFieldExpression><![CDATA[$F{Name}]]></textFieldExpression>
</textField>
</band>
</groupHeader>
......
Preview result:
How to make title1 and title2 appear under the group test1, without the null string (similar to the Excel source file)?
The problem is that the Excel data source passes $F{ID}==null for the second record. This creates the null group (as you can see, the name is null as well).
The easiest way to fix it is to not use rowspan in Excel (include all data in every row of the sheet).
If this is not possible, you need to save the first $F{ID} value and return it whenever $F{ID}==null.
Example:
<variable name="First_ID" class="java.lang.Integer" resetType="Group" resetGroup="Group1" calculation="First">
<variableExpression><![CDATA[$F{ID}]]></variableExpression>
</variable>
In the group expression, return the variable $V{First_ID} if $F{ID}==null:
<group name="Group1">
<groupExpression><![CDATA[$F{ID}==null?$V{First_ID}:$F{ID}]]></groupExpression>
... your groupHeader ....
</group>
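If the sheet can be preprocessed before the report runs, the "include all data" option amounts to copying the last seen ID and Name into the rows left empty by the merged cells. Below is a minimal Python sketch of that idea; `forward_fill` is a hypothetical helper, the rows are plain dictionaries rather than a real Excel reader, and the balance values are made up for illustration.

```python
def forward_fill(rows, keys=("ID", "Name")):
    """Copy the last non-empty value of each key into rows where it is None."""
    filled = []
    last_seen = {}
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        for key in keys:
            if new_row.get(key) is None:
                new_row[key] = last_seen.get(key)
            else:
                last_seen[key] = new_row[key]
        filled.append(new_row)
    return filled


# Rows as a data source would see a sheet with merged ID/Name cells.
rows = [
    {"ID": 1, "Name": "test1", "Title": "title1", "Balance": 10},
    {"ID": None, "Name": None, "Title": "title2", "Balance": 20},
    {"ID": 2, "Name": "test2", "Title": "title3", "Balance": 30},
]
filled = forward_fill(rows)
for row in filled:
    print(row)
```

After this step every row carries its own ID, so a plain $F{ID} group expression no longer sees a null value.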

Add start and end attribute to node and edge in gexf file

I have a .gexf file that contains nodes and edges with IDs and labels. I generated this .gexf file from a .gml file using networkx. Here's the code for that:
import networkx as nx
G = nx.read_gml('data/gml/test.gml') # read in gml file as Graph
nx.write_gexf(G, "output/test.gexf") # write to gexf format
The next thing I want to do is add a start and end attribute to every node and every edge in my file.
So basically, I want this:
<?xml version='1.0' encoding='utf-8'?>
<gexf version="1.1" xmlns="http://www.gexf.net/1.1draft" xmlns:viz="http://www.gexf.net/1.1draft/viz" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/XMLSchema-instance">
<graph defaultedgetype="undirected" mode="static">
<nodes>
<node id="clock" label="clock" />
<node id="beach" label="beach" />
<node id="sun" label="sun" />
<node id="sea" label="sea" />
<node id="sand" label="sand" />
<node id="guitar" label="guitar" />
(...)
</nodes>
<edges>
<edge id="0" source="ice" target="shoe" weight="0.9995600294856769" />
<edge id="1" source="ice" target="toothbrush" weight="0.9992457544219484" />
<edge id="1533" source="snake" target="ant" weight="0.9999144063155566" />
(...)
<edge id="1534" source="mosquito" target="jellyfish" weight="0.9994175606336606" />
<edge id="1535" source="ant" target="star" weight="0.9994226236705537" />
</edges>
</graph>
</gexf>
to look like this (note the dynamic mode and the start and end attributes):
<?xml version='1.0' encoding='utf-8'?>
<gexf version="1.1" xmlns="http://www.gexf.net/1.1draft" xmlns:viz="http://www.gexf.net/1.1draft/viz" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/XMLSchema-instance">
<graph defaultedgetype="undirected" mode="dynamic">
<nodes>
<node id="clock" label="clock" start="2000-02-20" end="2000-02-22" />
<node id="beach" label="beach" start="2000-02-20" end="2000-02-22" />
<node id="sun" label="sun" start="2000-02-20" end="2000-02-22" />
<node id="sea" label="sea" start="2000-02-20" end="2000-02-22" />
<node id="sand" label="sand" start="2000-02-20" end="2000-02-22" />
<node id="guitar" label="guitar" start="2000-02-20" end="2000-02-22" />
(...)
</nodes>
<edges>
<edge id="0" source="ice" target="shoe" weight="0.9995600294856769" start="2000-02-20" end="2000-02-22" />
<edge id="1" source="ice" target="toothbrush" weight="0.9992457544219484" start="2000-02-20" end="2000-02-22" />
<edge id="1533" source="snake" target="ant" weight="0.9999144063155566" start="2000-02-20" end="2000-02-22" />
(...)
<edge id="1534" source="mosquito" target="jellyfish" weight="0.9994175606336606" start="2000-02-20" end="2000-02-22" />
<edge id="1535" source="ant" target="star" weight="0.9994226236705537" start="2000-02-20" end="2000-02-22" />
</edges>
</graph>
</gexf>
Unfortunately, I was not able to find any documentation (neither for networkx nor for pygexf) on how to write a dynamic gexf file and add a start and end attribute to every (already existing) node and edge. Can anyone please help me with this?
UPDATE:
When I use
nx.set_edge_attributes(G, 'start', '2000-02-20')
nx.set_edge_attributes(G, 'end', '2000-02-22')
to set the edge attributes, I get the correct output, e.g.:
<edge id="0" source="great" target="wait" weight="0.998675772419067" start="2000-02-20" end="2000-02-22" />
However, when I do:
nx.set_node_attributes(G, 'start','2000-02-20')
nx.set_node_attributes(G, 'end','2000-02-22')
I get:
<node id="blue" label="blue">
<attvalues>
<attvalue for="0" value="2000-02-20" />
<attvalue for="1" value="2000-02-22" />
</attvalues>
How can I set the start and end attributes within the node tag?
I came across the same problem. NetworkX 2.1 still does not support this, but there is a workaround:
1. Write the .gexf file as usual.
2. Download Gephi 0.9.2 and open the .gexf file.
3. Go to Data Laboratory and press 'Merge Columns'. Select the start and end columns and set 'Merge Strategy' to 'Create time interval'. Your Interval column is now filled with <[start, end]>.
4. Go to File > Export > Graph file... and select File Format: GEXF Files. Your nodes now contain the start and end attributes.
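A scripted alternative to the Gephi round trip is to post-process the written file and set the attributes directly on the XML elements. This is a minimal sketch using only the Python standard library, not something networkx provides; `add_time_interval` is a hypothetical helper, the namespace is the one shown in the question's file, and whether a given reader accepts the result (for example without an explicit time format declaration) is an assumption to verify.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.1draft"  # namespace from the question's file


def add_time_interval(gexf_text, start, end):
    """Set mode="dynamic" and add start/end to every node and edge element."""
    ET.register_namespace("", GEXF_NS)  # keep the default namespace on output
    root = ET.fromstring(gexf_text)

    graph = root.find(f"{{{GEXF_NS}}}graph")
    graph.set("mode", "dynamic")

    for tag in ("node", "edge"):
        for element in root.iter(f"{{{GEXF_NS}}}{tag}"):
            element.set("start", start)
            element.set("end", end)
    return ET.tostring(root, encoding="unicode")


# Small stand-in for the file written by nx.write_gexf.
sample = (
    '<gexf version="1.1" xmlns="http://www.gexf.net/1.1draft">'
    '<graph defaultedgetype="undirected" mode="static">'
    '<nodes><node id="sun" label="sun" /><node id="sea" label="sea" /></nodes>'
    '<edges><edge id="0" source="sun" target="sea" weight="0.99" /></edges>'
    '</graph></gexf>'
)
result = add_time_interval(sample, "2000-02-20", "2000-02-22")
print(result)
```

For a real file, the text would be read from the path passed to nx.write_gexf and the returned string written back to disk.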

FetchXML View to Include Attributes from Nested Link-Entity

I would like to have a view that shows attributes from 3 entities:
Statistics has a lookup to Account and Account has a lookup to Address.
The view is on Statistics and I want attributes from all 3 entities; is this even possible?
The problem is with the GridXML.
I want to include the attribute wl_city in the GridXML.
This is the FetchXML with link-entities:
<fetchxml>
<fetch version="1.0" output-format="xml-platform" mapping="logical">
<entity name="sb_statistics">
<order attribute="sb_amount" descending="false" />
<!-- It is easy to get these into the GridXML -->
<attribute name="sb_debtor" />
<attribute name="sb_date" />
<attribute name="sb_amount" />
<link-entity name="account" from="accountid" to="sb_debtor"
alias="relatedAccount" link-type="outer">
<!-- It is possible to get this into the GridXML
by using the link-entity alias: relatedAccount.wl_towncity -->
<attribute name="wl_towncity" />
<link-entity name="wl_postalcode" from="wl_postalcodeid"
to="wl_postaltowncity" alias="relatedAddress" link-type="outer">
<!-- I have trouble getting this attribute into the GridXML -->
<attribute name="wl_city" />
</link-entity>
</link-entity>
<attribute name="sb_statisticsid" />
</entity>
</fetch>
</fetchxml>
When I change the GridXML as below this error is displayed when the view is opened:
"To use this saved query, you must remove criteria and columns that refer to deleted or non-searchable items"
<layoutxml>
<grid name="resultset" object="10008" jump="sb_name" select="1" preview="1"
icon="1">
<row name="result" id="sb_statisticsid" multiobjectidfield="1">
<cell name="sb_amount" width="100" />
<cell name="sb_date" width="100" />
<cell name="sb_debtor" width="100" />
<cell name="relatedAccount.relatedAddress.wl_city" width="100" />
</row>
</grid>
</layoutxml>
The below GridXML shows this error when the view is opened:
"Unexpected Error An error has occured".
<layoutxml>
<grid name="resultset" object="10008" jump="sb_name" select="1" preview="1"
icon="1">
<row name="result" id="sb_statisticsid" multiobjectidfield="1">
<cell name="sb_amount" width="100" />
<cell name="sb_date" width="100" />
<cell name="sb_debtor" width="100" />
<cell name="relatedAddress.wl_city" width="100" />
</row>
</grid>
</layoutxml>
The GridXML below results in this error being shown when the view is opened:
"To use this saved view, you must remove criteria and columns that refer to deleted or non-searchable columns".
<layoutxml>
<grid name="resultset" object="10008" jump="sb_name" select="1" preview="1"
icon="1">
<row name="result" id="sb_statisticsid" multiobjectidfield="1">
<cell name="sb_amount" width="100" />
<cell name="sb_date" width="100" />
<cell name="sb_debtor" width="100" />
<cell name="wl_city" width="100" />
</row>
</grid>
</layoutxml>
This saved query works, but it only includes attributes from the primary entity and the first link-entity.
<savedquery>
<IsCustomizable>1</IsCustomizable>
<CanBeDeleted>0</CanBeDeleted>
<isquickfindquery>0</isquickfindquery>
<isprivate>0</isprivate>
<isdefault>0</isdefault>
<returnedtypecode>10008</returnedtypecode>
<savedqueryid>{df101ac4-2e4d-e311-9377-005056bd0001}</savedqueryid>
<layoutxml>
<grid name="resultset" object="10008" jump="sb_name" select="1" preview="1"
icon="1">
<row name="result" id="sb_statisticsid" multiobjectidfield="1">
<cell name="sb_amount" width="100" />
<cell name="sb_date" width="100" />
<cell name="sb_debtor" width="100" />
<cell name="relatedAccount.wl_towncity" width="100" />
</row>
</grid>
</layoutxml>
<querytype>0</querytype>
<fetchxml>
<fetch version="1.0" output-format="xml-platform" mapping="logical">
<entity name="sb_statistics">
<order attribute="sb_amount" descending="false" />
<attribute name="sb_debtor" />
<attribute name="sb_date" />
<attribute name="sb_amount" />
<link-entity name="account" from="accountid" to="sb_debtor"
alias="relatedAccount" link-type="outer">
<attribute name="wl_towncity" />
<link-entity name="wl_postalcode" from="wl_postalcodeid"
to="wl_postaltowncity" alias="relatedAddress" link-type="outer">
<attribute name="wl_city" />
</link-entity>
</link-entity>
<attribute name="sb_statisticsid" />
</entity>
</fetch>
</fetchxml>
<LocalizedNames>
<LocalizedName description="Statistics and Address" languagecode="1033" />
</LocalizedNames>
</savedquery>
Is GridXML limited to showing only attributes from the primary entity and the first link-entity?
To the best of my knowledge this is not possible, but I would be happy for someone to prove me wrong.
A limitation of GridXML appears to be that it can only include attributes from the primary entity and the first link-entity, not from any nested link-entities.
It should work when using link-type="inner" for the nested link.
<entity name="sb_statistics">
...
<link-entity name="account" from="accountid" to="sb_debtor"
alias="relatedAccount" link-type="outer">
<attribute name="wl_towncity" />
<link-entity name="wl_postalcode" from="wl_postalcodeid"
to="wl_postaltowncity" alias="relatedAddress" link-type="inner"> <!-- changed to link-type="inner" -->
<attribute name="wl_city" />
</link-entity>
</link-entity>
<attribute name="sb_statisticsid" />
</entity>
I have found no evidence that it can be done. With or without link-type='inner', the designer (in 2013) says: "The relatedAddress.wl_city column is no longer a valid column because it has been deleted as a column option. You need to remove this column and, if you want, add a different one."
It does NOT need multiple dereferences, nor does that work. If you dump the key/value pairs of the AttributeCollection returned by the fetch, you will see that the key is relatedAddress.wl_city, not its parent nor the combination.
As in the UI, it appears the layout is limited to the root and its children only, with no grandchildren or further descendants.
I think it's a little late to answer this question, but maybe someone will come across this post and find it helpful.
The first thing you should know is that FetchXML only returns columns that are not null, so if you query a column that contains no data, FetchXML automatically removes it from the result set.
The second thing is that, if you have different tables with different relationships, the alias name is prefixed to the column name, so in your case relatedAccount.wl_towncity and relatedAddress.wl_city are correct, and relatedAccount.relatedAddress.wl_city is not. In your example, you chained the alias names one after another, which is not correct.
The third thing you should know is that when a nested result is returned, it is typed as object while its actual type is AliasedValue, so you first have to cast the object to AliasedValue. It is then ready to be cast to OptionSetValue. After that, look at .Value, which holds the result you want.
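The naming rule from the second point can be illustrated with a short Python sketch that lists the alias-prefixed column names a FetchXML query would produce. `aliased_columns` is a hypothetical helper that only parses the XML text; it does not talk to Dynamics CRM and says nothing about whether the grid layout accepts those columns.

```python
import xml.etree.ElementTree as ET


def aliased_columns(fetch_xml):
    """Return 'alias.attribute' names for the attributes of every link-entity."""
    root = ET.fromstring(fetch_xml)
    columns = []
    for link in root.iter("link-entity"):  # visits nested link-entities too
        alias = link.get("alias")
        if alias is None:
            continue
        # Only the link-entity's own alias is used, never a chain of parent aliases.
        for attribute in link.findall("attribute"):
            columns.append(f"{alias}.{attribute.get('name')}")
    return columns


fetch_xml = """
<fetch version="1.0" output-format="xml-platform" mapping="logical">
  <entity name="sb_statistics">
    <attribute name="sb_amount" />
    <link-entity name="account" from="accountid" to="sb_debtor"
                 alias="relatedAccount" link-type="outer">
      <attribute name="wl_towncity" />
      <link-entity name="wl_postalcode" from="wl_postalcodeid"
                   to="wl_postaltowncity" alias="relatedAddress" link-type="outer">
        <attribute name="wl_city" />
      </link-entity>
    </link-entity>
  </entity>
</fetch>
"""
columns = aliased_columns(fetch_xml)
print(columns)
```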
I made it work like this, although I still have an issue with unresolved column headers:
<fetch distinct='true'>
<entity name='rdiac_riskobject'>
<attribute name='rdiac_riskobjectid' />
<attribute name='rdiac_name' />
<attribute name='rdiac_riskobjectproduct' />
<link-entity name='rdiac_riskobject_rdiac_propertydetail' from='rdiac_riskobjectid' to='rdiac_riskobjectid' intersect='true'>
<link-entity name='rdiac_propertydetail' alias='pd1' from='rdiac_propertydetailid' to='rdiac_propertydetailid'>
<attribute name='rdiac_valuestring' />
<link-entity name='rdiac_propertysvconfig' from='rdiac_property' to='rdiac_propertyid'>
<filter>
<condition attribute='rdiac_svfield' operator='eq' value='100000000'/>
</filter>
</link-entity>
</link-entity>
</link-entity>
<link-entity name='rdiac_riskobject_rdiac_propertydetail' from='rdiac_riskobjectid' to='rdiac_riskobjectid' intersect='true'>
<link-entity name='rdiac_propertydetail' alias='pd2' from='rdiac_propertydetailid' to='rdiac_propertydetailid'>
<attribute name='rdiac_valuestring' />
<link-entity name='rdiac_propertysvconfig' from='rdiac_property' to='rdiac_propertyid'>
<filter>
<condition attribute='rdiac_svfield' operator='eq' value='100000001'/>
</filter>
</link-entity>
</link-entity>
</link-entity>
</entity>
</fetch>
<grid name='resultset' object='10139' jump='rdiac_riskobjectproduct' select='1' preview='0' icon='1' >
<row name='result' id='rdiac_riskobjectid' >
<cell name='rdiac_riskobjectproduct' width='100' />
<cell name='pd1.rdiac_valuestring' width='200' />
<cell name='pd2.rdiac_valuestring' width='200' />
</row>
</grid>

Resources