Nested triples with units on value in rdflib -> turtle file

I'm currently making a Turtle file in which I have to relate values to units based on qudt.org.
Based on the example data shown below:
data = {
    "objectid_l1": "Bridge_1",
    "defid_l1": "Bridge",
    "objectid_l2": "Deck_1",
    "defid_l2": "Deck",
    "variable": "height",
    "value": "50.0",
    "unit": "M",
}
I wrote the following code, but the output is not what I want:
from rdflib.namespace import RDF
from rdflib import URIRef, BNode, Literal, Namespace, Graph
EX = Namespace('https://example.com/id/')
UNIT = Namespace('http://qudt.org/2.1/vocab/unit/')
BS = Namespace('https://w3id.org/def/basicsemantics-owl#')
for i in data:
    if i['objectid_l1'] != None:
        g.add((
            EX[i['objectid_l1']],
            RDF.type,
            EX[i['defid_l1']]
        ))
        g.add((
            EX[i['objectid_l1']],
            BS['hasPart'],
            EX[i['objectid_l2']]
        ))
        g.add((
            EX[i['objectid_l1']],
            EX[i['variable']],
            EX[i['value']]
        ))
        g.add((
            EX[i['variable']],
            BS['unit'],
            UNIT[i['unit']]
        ))
Output:
ex:Bridge_1
    a ex:Bridge ;
    bs:hasPart ex:Deck_1 ;
    ex:height 50.0 .
50.0 bs:unit unit:M .
As output, I want the unit to be nested together with the value. The desired output is as follows:
ex:Bridge_1
    a ex:Bridge ;
    bs:hasPart ex:Deck_1 ;
    ex:height [
        rdf:value 50.0 ;
        bs:unit unit:M ;
    ] .

There are a few things missing in your code to achieve what you want:
You need to explicitly create the blank node shown in the expected output, using rdflib's BNode, then use it as the subject of the triples that set the height value and the unit.
You have to specify that the height value (50.0) is a literal value.
from rdflib.namespace import RDF, XSD, NamespaceManager
from rdflib import BNode, Literal, Namespace, Graph
EX = Namespace('https://example.com/id/')
UNIT = Namespace('http://qudt.org/2.1/vocab/unit/')
BS = Namespace('https://w3id.org/def/basicsemantics-owl#')
g = Graph()
g.namespace_manager = NamespaceManager(Graph())
g.namespace_manager.bind('unit', UNIT)
g.namespace_manager.bind('bs', BS)
g.namespace_manager.bind('ex', EX)
i = {
    "objectid_l1": "Bridge_1",
    "defid_l1": "Bridge",
    "objectid_l2": "Deck_1",
    "defid_l2": "Deck",
    "variable": "height",
    "value": "50.0",
    "unit": "M",
}
if i['objectid_l1'] is not None:
    g.add((EX[i['objectid_l1']], RDF.type, EX[i['defid_l1']]))
    g.add((EX[i['objectid_l1']], BS['hasPart'], EX[i['objectid_l2']]))
    # The blank node groups the value and its unit.
    bnode = BNode()
    g.add((bnode, RDF.value, Literal(i['value'])))
    g.add((bnode, BS['unit'], UNIT[i['unit']]))
    g.add((EX[i['objectid_l1']], EX[i['variable']], bnode))
g.serialize('test.ttl', format='ttl')
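This should produce the expected output (the value appears as the quoted string "50.0" because the Literal is created without a datatype):
ex:Bridge_1 a ex:Bridge ;
    bs:hasPart ex:Deck_1 ;
    ex:height [ rdf:value "50.0" ;
            bs:unit unit:M ] .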

Replace value of a specific key in json file for multiple files

I am preparing test data for feature testing.
I have 1000 JSON files, all with the structure shown below, in a local folder.
I need to replace the values of two keys wherever they appear and create 1000 different JSON files.
The keys whose values need to be replaced are
"f_ID": "80510926" and "f_Ctb": "10333"
"f_ID": "80510926" appears in all the arrays, but "f_Ctb": "10333" appears only once.
The replacement value can be a running number from 1 to 100 across all files.
Can someone suggest how to do this in Python to create the 1000 files?
{
    "t_A": [
        {
            "f_ID": "80510926",
            "f_Ctb": "10333",
            "f_isPasswordProtected": "False"
        }
    ],
    "t_B": [
        {
            "f_ID": "80510926",
            "f_B_Company_ID": "10333",
            "f_B_ID": "1",
            "f_ArriveDate": "20151001 151535.775"
        },
        {
            "f_ID": "80510926",
            "f_B_Company_ID": "10333",
            "f_B_ID": "1700",
            "f_ArriveDate": "20151001 151535.775"
        }
    ],
    "t_C": [
        {
            "f_ID": "80510926",
            "f_Set_Code": "TRBC ",
            "f_Industry_Code": "10 ",
            "f_InsertDate": "20151001 151535.775"
        }
    ],
    "t_D": [
        {
            "f_ID": "80510926",
            "f_Headline": "Global Reinsurance: Questions and themes into Monte Carlo",
            "f_Synopsis": ""
        }
    ]
}
Here is a solution. Run it in the folder that contains the 1000 JSON files.
It reads all the JSON files, replaces the values of f_ID and f_Ctb with the count, and writes each file under the same file name in the output folder.
import os
import json
all_files = os.listdir()
# Process the JSON files in a stable, sorted order.
json_files_keys = sorted(f for f in all_files if f.endswith(".json"))
OUTPUT_FOLDER = "output"
if not os.path.exists(OUTPUT_FOLDER):
    os.mkdir(OUTPUT_FOLDER)
# Running number; initialised once, outside the loop, so each file gets a new value.
count = 1
for file_name in json_files_keys:
    with open(file_name, "r") as f_read:
        data = json.load(f_read)
    for key, val_list in data.items():
        for nest_dict in val_list:
            # Replace the values in place wherever the keys are present.
            if "f_ID" in nest_dict:
                nest_dict["f_ID"] = count
            if "f_Ctb" in nest_dict:
                nest_dict["f_Ctb"] = count
    count += 1
    with open(f"{OUTPUT_FOLDER}/{file_name}", "w") as output_file:
        output_file.write(json.dumps(data))
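The question asks for a running number from 1 to 100 and stores the original values as strings. A minimal sketch of one way to handle both points is shown here, assuming the number should simply wrap around after 100 and that string values are wanted. The helper name replace_keys and the shortened sample data are hypothetical and only serve the illustration.

```python
import json
from itertools import cycle


def replace_keys(node, new_value, keys=("f_ID", "f_Ctb")):
    """Return a copy of `node` with every value stored under `keys` replaced."""
    if isinstance(node, dict):
        return {
            k: new_value if k in keys else replace_keys(v, new_value, keys)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [replace_keys(item, new_value, keys) for item in node]
    return node  # scalars are returned unchanged


# Hypothetical miniature of the structure from the question.
sample = {
    "t_A": [{"f_ID": "80510926", "f_Ctb": "10333"}],
    "t_B": [{"f_ID": "80510926", "f_B_ID": "1"}],
}

numbers = cycle(range(1, 101))  # 1, 2, ..., 100, then 1 again
first = replace_keys(sample, str(next(numbers)))
second = replace_keys(sample, str(next(numbers)))
print(json.dumps(first))
print(json.dumps(second))
```

Because the function recurses through dicts and lists, it does not depend on the keys sitting exactly one level below the top-level arrays, and it leaves the input data untouched.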

Pandas Dataframe to JSON add JSON Object Name

I have a dataframe that I'm converting to JSON but I'm having a hard time naming the object. The code I have:
j = (df_import.groupby(['Item', 'Subinventory', 'TransactionUnitOfMeasure', 'TransactionType', 'TransactionDate', 'TransactionSourceId', 'OrganizationName'])
     .apply(lambda x: x[['LotNumber', 'TransactionQuantity']].to_dict('records'))
     .reset_index()
     .rename(columns={0: 'lotItemLots'})
     .to_json(orient='records'))
The result I'm getting:
[
    {
        "Item": "000400MCI00099",
        "OrganizationName": "OR",
        "Subinventory": "LAB R",
        "TransactionDate": "2021-08-19 00:00:00",
        "TransactionSourceId": 3000001595xxxxx,
        "TransactionType": "Account Alias Issue",
        "TransactionUnitOfMeasure": "EA",
        "lotItemLots": [
            {
                "LotNumber": "00040I",
                "TransactionQuantity": -5
            }
        ]
    }
]
The result I need (the transactionLines part), but can't figure out:
{
    "transactionLines":[
        {
            "Item":"000400MCI00099",
            "Subinventory":"LAB R",
            "TransactionQuantity":-5,
            "TransactionUnitOfMeasure":"EA",
            "TransactionType":"Account Alias Issue",
            "TransactionDate":"2021-08-20 00:00:00",
            "OrganizationName":"OR",
            "TransactionSourceId": 3000001595xxxxx,
            "lotItemLots":[{"LotNumber":"00040I", "TransactionQuantity":-5}]
        }
    ]
}
Index,Item Number,5 Digit,Description,Subinventory,Lot Number,Quantity,EOM,[Qty],Transaction Type,Today's Date,Expiration Date,Source Header ID,Lot Interface Number,Transaction Source ID,TransactionType,Organization Name
1,000400MCI00099,40,ACANTHUS OAK LEAF,LAB R,00040I,-5,EA,5,Account Alias Issue,2021/08/25,2002/01/01,160200,160200,3000001595xxxxx,Account Alias Issue,OR
Would appreciate any guidance on how to get the transactionLines name in there. Thank you in advance.
It would seem to me you could simply parse the json output, and then re-form it the way you want:
import pandas as pd
import json
data = [{'itemID': 0, 'itemprice': 100}, {'itemID': 1, 'itemprice': 200}]
data = pd.DataFrame(data)
pd_json = data.to_json(orient='records')
new_collection = []  # store our reformed records
# loop over parsed json, and reshape it the way we want
for record in json.loads(pd_json):
    nested = {'transactionLines': [record]}  # matching specs of question
    new_collection.append(nested)
new_json = json.dumps(new_collection)  # convert back to json str
print(new_json)
Which results in:
[
{"transactionLines": [{"itemID": 0, "itemprice": 100}]},
{"transactionLines": [{"itemID": 1, "itemprice": 200}]}
]
Note that of course you could probably do this in a more concise manner, without the intermediate json conversion.
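If the goal is a single object whose "transactionLines" key holds all records, as in the desired output of the question, the parsed list can be wrapped once instead of record by record. The following is a minimal sketch under that assumption; the pd_json string is a hand-written stand-in for what to_json(orient='records') returns, so that the example runs without pandas.

```python
import json

# Stand-in for the string returned by DataFrame.to_json(orient='records').
pd_json = '[{"itemID": 0, "itemprice": 100}, {"itemID": 1, "itemprice": 200}]'

# Parse the records once and nest the whole list under a single key.
wrapped = {"transactionLines": json.loads(pd_json)}
new_json = json.dumps(wrapped)
print(new_json)
```

The same wrapping should apply to the string j from the question, since it is produced with the same orient='records' setting.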

How to append multiple JSON object in a custom list using python?

I have two dictionaries (business and business1). I convert them into JSON strings (a and b), then I append these two JSON objects to a custom list called "all".
Here, the list creation is static; I have to make it dynamic because the number of dictionaries could be arbitrary. The output should keep the same structure.
Here is my code section
Python Code
import json
import something as b
business = {
    "id": "04",
    "target": b.YesterdayTarget,
    'Sales': b.YSales,
    'Achievements': b.Achievement
}
business1 = {
    "id": "05",
    "target": b.YesterdayTarget,
    'Sales': b.YSales,
    'Achievements': b.Achievement
}
# Convert Dictionary to json data
a = str(json.dumps(business, indent=5))
b = str(json.dumps(business1, indent=5))
all = '[' + a + ',\n' + b + ']'
print(all)
Output Sample
[{
     "id": "04",
     "target": 55500000,
     "Sales": 23366927,
     "Achievements": 42.1
},
{
     "id": "05",
     "target": 55500000,
     "Sales": 23366927,
     "Achievements": 42.1
}]
Thanks for your suggestions and efforts.
Try this one.
import ast, re
lines = open(path_to_your_file).read().splitlines()
result = [ast.literal_eval(re.search('({.+})', line).group(0)) for line in lines]
print(len(result))
print(result)
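For the dynamic case described in the question, one option is to keep the dictionaries in a Python list and serialize that list in a single call, rather than concatenating JSON strings by hand. This is a minimal sketch under that assumption; the list name businesses and the literal values (taken from the output sample) are only illustrative.

```python
import json

# Hypothetical input: any number of dictionaries with the same shape.
businesses = [
    {"id": "04", "target": 55500000, "Sales": 23366927, "Achievements": 42.1},
    {"id": "05", "target": 55500000, "Sales": 23366927, "Achievements": 42.1},
]

# Serialise the whole list in one call; json.dumps adds the brackets and commas.
all_json = json.dumps(businesses, indent=5)
print(all_json)
```

New dictionaries can then be added with businesses.append(...) before serializing, and the result stays valid JSON regardless of how many entries there are.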

Changing the values of a map nested in another map

Using HQL queries I've been able to generate the following map, where the keys represent the month number constant defined in java.util.Calendar, and every value is a map:
[
0:[ client_a:[order1, order2, order3]],
1:[ client_b:[order4], client_c:[order5, order6], client_d:[order7]],
2:[ client_e:[order8, order9], client_f:[order10]]
]
order1, order2, ... are instances of a domain class called Order:
class Order {
    String description
    Date d
    int quantity
}
Now I've got that structure containing orders that belong to some specific year, but I don't really care about the Order object itself. I just want the sum of the quantities of all the orders of each month. So the structure should look something like this:
[
0:[ client_a:[321]],
1:[ client_b:[9808], client_c:[516], client_d:[20]],
2:[ client_e:[22], client_f:[10065]]
]
I don't mind if the values are lists of one element or not lists at all. If this is possible, it would be fine anyway:
[
0:[ client_a:321 ],
1:[ client_b:9808, client_c:516, client_d:20 ],
2:[ client_e:22, client_f:10065 ]
]
I know I have to apply something like .sum{it.quantity} to every list of orders to get the result I want, but I don't know how to iterate over them as they are nested within another map.
Thank you.
Here you go:
class Order {
    String description
    Date d
    int quantity
}
def orders = [
    0:[ client_a:[new Order(quantity:1), new Order(quantity:2), new Order(quantity:3)]],
    1:[ client_b:[new Order(quantity:4)], client_c:[new Order(quantity:5), new Order(quantity:6)], client_d:[new Order(quantity:7)]],
    2:[ client_e:[new Order(quantity:8), new Order(quantity:9)], client_f:[new Order(quantity:10)]]
]
def count = orders.collectEntries { k, v ->
    def nv = v.collectEntries { nk, nv ->
        [(nk): nv*.quantity.sum()]
    }
    [(k):(nv)]
}
assert count == [0:[client_a:6], 1:[client_b:4, client_c:11, client_d:7],2:[client_e:17, client_f:10]]
def map = [
    0:[ client_a:[[q: 23], [q: 28], [q: 27]]],
    1:[ client_b:[[q: 50]], client_c:[[q: 100], [q: 58]], client_d:[[q: 90]]],
    2:[ client_e:[[q: 48], [q: 60]], client_f:[[q: 72]]]
]
map.collectEntries { k, v ->
    [ k, v.collectEntries { key, val ->
        [ key, val*.q.sum() ]
    } ]
}
You can also use val.sum { it.q } instead of val*.q.sum().

Perl: String to anonymous array?

SOLVED ALREADY --> See edit 7
At this moment I'm fairly new to Perl, and I'm trying to modify part of an existing page (in Wonderdesk).
The page works by taking the information from the GET URL and parsing it into an SQL query.
Since this is part of a much larger system, I'm not able to modify the code around it, and have to solve it in this script.
A working test I performed:
$input->{help_id} = ['33450','31976'];
When running this, the query that is built looks something like:
select * from table where help_id in(33450,31976)
The part of my code that does not work as expected:
my $callIDs = '33450,31450';
my @callIDs = split(/,/,$callIDs);
my $callIDsearch = \@callIDs;
$input->{help_id} = $callIDsearch;
When running this, the query that is built looks something like:
select * from table where help_id = '33450,31976'
I've tried to debug it, and used Data::Dumper to get the result of $callIDsearch, which appears as [33450, 31450] in my browser.
Can someone give me a hint on how to transform from '123,456' into ['123', '456']?
With kind regards,
Marcel
--===--
Edit:
As requested, minimal code piece that works:
$input->{help_id} = ['123','456']
Code that does not work:
$str = '123,456';
@ids = split(/,/,$str);
$input->{help_id} = \@ids;
--===--
Edit 2:
Source of the question:
The following part of the code is responsible for getting the correct information from the database:
my $input = $IN->get_hash;
my $db = $DB->table('help_desk');
foreach (keys %$input){
    if (/^corr/ and !/-opt$/ and $input->{$_} or $input->{keyword}){
        $db = $DB->table('help_desk','correspondence');
        $input->{rs} = 'DISTINCT help_id,help_name,help_email,help_datetime,help_subject,help_website,help_category,
help_priority,help_status,help_emergency_flag,help_cus_id_fk,help_tech,help_attach';
        $input->{left_join} = 1;
        last;
    }
}
# Do the search
my $sth = $db->query_sth($input);
my $hits = $db->hits;
Now instead of being able to provide a single parameter help_id, I want to be able to provide multiple parameters.
--===--
Edit 3:
query_sth is one of the following two; I have not been able to find out which yet:
$COMPILE{query} = __LINE__ . <<'END_OF_SUB';
sub query {
# -----------------------------------------------------------
# $obj->query($HASH or $CGI);
# ----------------------------
# Performs a query based on the options in the hash.
# $HASH can be a hash ref, hash or CGI object.
#
# Returns the result of a query as fetchall_arrayref.
#
    my $self = shift;
    my $sth = $self->_query(@_) or return;
    return $sth->fetchall_arrayref;
}
END_OF_SUB
$COMPILE{query_sth} = __LINE__ . <<'END_OF_SUB';
sub query_sth {
# -----------------------------------------------------------
# $obj->query_sth($HASH or $CGI);
# --------------------------------
# Same as query but returns the sth object.
#
    shift->_query(@_)
}
END_OF_SUB
Or
$COMPILE{query} = __LINE__ . <<'END_OF_SUB';
sub query {
# -------------------------------------------------------------------
# Just performs the query and returns a fetchall.
#
    return shift->_query(@_)->fetchall_arrayref;
}
END_OF_SUB
$COMPILE{query_sth} = __LINE__ . <<'END_OF_SUB';
sub query_sth {
# -------------------------------------------------------------------
# Just performs the query and returns an active sth.
#
    return shift->_query(@_);
}
END_OF_SUB
--===--
Edit 4: _query
$COMPILE{_query} = __LINE__ . <<'END_OF_SUB';
sub _query {
# -------------------------------------------------------------------
# Parses the input, and runs a select based on input.
#
    my $self = shift;
    my $opts = $self->common_param(@_) or return $self->fatal(BADARGS => 'Usage: $obj->insert(HASH or HASH_REF or CGI) only.');
    $self->name or return $self->fatal('NOTABLE');
    # Clear errors.
    $self->{_error} = [];
    # Strip out values that are empty or blank (as query is generally derived from
    # cgi input).
    my %input = map { $_ => $opts->{$_} } grep { defined $opts->{$_} and $opts->{$_} !~ /^\s*$/ } keys %$opts;
    $opts = \%input;
    # If build_query_cond returns a GT::SQL::Search object, then we are done.
    my $cond = $self->build_query_cond($opts, $self->{schema}->{cols});
    if ( ( ref $cond ) =~ /(?:DBI::st|::STH)$/i ) {
        return $cond;
    }
    # If we have a callback, then we get all the results as a hash, send them
    # to the callback, and then do the regular query on the remaining set.
    if (defined $opts->{callback} and (ref $opts->{callback} eq 'CODE')) {
        my $pk = $self->{schema}->{pk}->[0];
        my $sth = $self->select($pk, $cond) or return;
        my %res = map { $_ => 1 } $sth->fetchall_list;
        my $new_results = $opts->{callback}->($self, \%res);
        $cond = GT::SQL::Condition->new($pk, 'IN', [keys %$new_results]);
    }
    # Set the limit clause, defaults to 25, set to -1 for none.
    my $in = $self->_get_search_opts($opts);
    my $offset = ($in->{nh} - 1) * $in->{mh};
    $self->select_options("ORDER BY $in->{sb} $in->{so}") if ($in->{sb});
    $self->select_options("LIMIT $in->{mh} OFFSET $offset") unless $in->{mh} == -1;
    # Now do the select.
    my @sel = ();
    if ($cond) { push @sel, $cond }
    if ($opts->{rs} and $cond) { push @sel, $opts->{rs} }
    my $sth = $self->select(@sel) or return;
    return $sth;
}
END_OF_SUB
--===--
Edit 5: I've uploaded the SQL module that is used:
https://www.dropbox.com/s/yz0bq8ch8kdgyl6/SQL.zip
--===--
Edit 6:
On request, the dumps (trimmed to only include the sections for help_id):
The result of the modification in Base.pm for the non-working code:
$VAR1 = [
    33450,
    31450
];
The result of the modification in Condition.pm for the non-working code:
$VAR1 = [
    "help_id",
    "IN",
    [
        33450,
        31450
    ]
];
$VAR1 = [
    "cus_username",
    "=",
    "Someone"
];
$VAR1 = [
    "help_id",
    "=",
    "33450,31450"
];
The result of the modification in Base.pm for the working code:
$VAR1 = [
    33450,
    31976
];
The result of the modification in Condition.pm for the working code:
$VAR1 = [
    "help_id",
    "IN",
    [
        33450,
        31976
    ]
];
It looks as if the value gets changed afterwards somehow :S
All I changed for the working/non-working code was to replace:
$input->{help_id} = ['33450','31976'];
With:
$input->{help_id} = [ split(/,/,'33450,31450') ];
--===--
Edit 7:
After reading all the tips, I decided to start over and found that by writing some logs to files, I could break the issue down in more detail.
I'm still not sure why, but it now works, using the same methods as before. I think it was a typo/glitch/bug somewhere in my code.
Sorry to have bothered you all, but I still recommend the points to go to amon due to his tips providing the breakthrough.
I don't have an answer, but I have found a few critical points where we need to know what is going on.
In build_query_cond (Base.pm line 528), an array argument will be transformed into a key IN (...) relation:
if (ref($opts->{$field}) eq 'ARRAY' ) {
    my $add = [];
    for ( @{$opts->{$field}} ) {
        next if !defined( $_ ) or !length( $_ ) or !/\S/;
        push @$add, $_;
    }
    if ( @$add ) {
        push @ins, [$field, 'IN', $add];
    }
}
Interesting bit in sql (Condition.pm line 181). Even if there is an arrayref, an IN test will be simplified to an = test if it contains only a single element.
if (uc $op eq 'IN' || $op eq '=' and ref $val eq 'ARRAY') {
    if (@$val > 1) {
        $op = 'IN';
        $val = '('
            . join(',' => map !length || /\D/ ? quote($_) : $_, @$val)
            . ')';
    }
    elsif (@$val == 0) {
        ($col, $op, $val) = (qw(1 = 0));
    }
    else {
        $op = '=';
        $val = quote($val->[0]);
    }
    push @output, "$col $op $val";
}
Before these two conditions, it would be interesting to insert the following code:
Carp::cluck(Data::Dumper::Dumper(...));
where ... is $opts->{$field} in the first snippet or $cond in the second snippet. The resulting stack trace would allow us to find all subroutines which could have modified the value. For this to work, the following code has to be placed in your main script before starting the query:
use Carp ();
use Data::Dumper;
$Data::Dumper::Useqq = 1; # escape special characters
Once the code has been modified like this, run both the working and not-working code, and print out the resulting query with
print Dumper($result);
So for each of your code snippets, we should get two stack traces and one resulting SQL query.
A shot in the dark... there's a temporary array @callIDs created by this code:
my @callIDs = split(/,/,$callIDs);
my $callIDsearch = \@callIDs;
$input->{help_id} = $callIDsearch;
If some other part of your code modifies @callIDs, even after it's been assigned to $input->{help_id}, that could cause problems. Of course the fact that it's a lexical (my) variable means that any such changes to @callIDs are probably "nearby".
You could eliminate the named temporary array by doing the split like this:
$input->{help_id} = [ split(/,/,$callIDs) ];
I'm not sure I understand exactly why this is happening. It seems that your query builder needs an arrayref of strings. You can use map to do that
my $callIDs = '33450,31450';
my @callIDs = map {$_*1} split(/,/,$callIDs);
$input->{help_id} = \@callIDs;
This code should work
my $callIDs = '33450,31450';
$input->{help_id} = [split ",", $callIDs];
If your code somehow needs the data to be numbers, you can use:
my $callIDs = '33450,31450';
$input->{help_id} = [map 0+$_, split ',', $callIDs];
If the values somehow become numbers and you need strings instead (which should not happen in this case, but is useful advice for future work):
my $callIDs = '33450,31450';
$input->{help_id} = [map ''.$_, split ',', $callIDs];
