Importing LCI database - dealing with unlinked exchanges - brightway

I am having an issue importing the ecoinvent v3.2 database (cut-off) in Brightway.
The steps followed were:
ei32cu = bw.SingleOutputEcospold2Importer(fp, "ecoinvent 3.2 cutoff")
ei32cu.apply_strategies()
All seemed to be going well. However, ei32cu.statistics() revealed that there were many unlinked exchanges:
12916 datasets
459268 exchanges
343020 unlinked exchanges
Type biosphere: 949 unique unlinked exchanges
Of course, the unlinked exchanges prevented the database from being written: ei32cu.write_database() did not work and an "Invalid exchange" error was raised.
My questions:
- How can I fix this?
- How can I access the log file (cited here) that might give me some insights?
- How can I generate a list of the unlinked exchanges (and their related activities)?

It is strange that you have unlinked exchanges with ecoinvent 3.2 cutoff; at least on Python 3, importing 3.2 cutoff should be very smooth. Are you perhaps on Python 2, or not using the latest version of bw2?
- It is difficult to give an answer without looking into the db, but if you are on Python 2, just try with Python 3.
- To check where the log is:
`projects.logs_dir`
- To write the list of unlinked exchanges:
ei32cu.write_excel(only_unlinked=True)  # only_unlinked=False exports the full list of exchanges

I now know why this problem occurred, and the solution is quite simple: in new projects, one needs to run bw2setup() before importing LCI databases.
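For completeness, here is a minimal sketch of the full import workflow with that fix applied (the project name and the path to the ecoinvent datasets directory are placeholders):

import brightway2 as bw

bw.projects.set_current("ecoinvent-import")  # placeholder project name
bw.bw2setup()  # creates the biosphere3 database and LCIA methods; must run before importing

fp = "/path/to/ecoinvent 3.2_cutoff_ecoSpold02/datasets"  # placeholder path
ei32cu = bw.SingleOutputEcospold2Importer(fp, "ecoinvent 3.2 cutoff")
ei32cu.apply_strategies()
ei32cu.statistics()  # should now report 0 unlinked exchanges

if not list(ei32cu.unlinked):
    ei32cu.write_database()
else:
    ei32cu.write_excel(only_unlinked=True)  # export the remaining unlinked exchanges for inspection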

How to copy managed database?

AFAIK there is no REST API providing this functionality directly, so I am using a point-in-time restore for this via a Create request (there are other ways, but they don't guarantee transactional consistency and are more complicated).
Since it is not possible to turn off short-term backup (retention has to be at least 1 day), this should be reliable. I am using the current time for the 'properties.restorePointInTime' property in the request. This works fine for most databases, but one database returns this error (from the async operation request):
"error": {
"code": "BackupSetNotFound",
"message": "No backups were found to restore the database to the point in time 6/14/2021 8:20:00 PM (UTC). Please contact support to restore the database."
}
I know I am not out of range, because if the restore time is before 'earliestRestorePoint' (which can be found with a GET request on the managed database) or in the future, I get a 'PitrPointInTimeInvalid' error instead. Nevertheless, I found some information saying that I shouldn't use the current time but rather the current time minus at most 6 minutes. The same applies via the Azure Portal (where it fails with the same error, by the way), which doesn't allow an input time newer than the current time minus 6 minutes. After a few tries, I found that the current time minus roughly 40 minutes starts to work. But 40 minutes is a lot, and I didn't find any way to determine what time will work other than trying and waiting for the result of the async operation.
My question is: Is there a way to find what is the latest time possible for restore?
Or is there a better way to do ‘copy’ of managed database which guarantees transactional consistency and is reasonably quick?
EDIT:
The issue I was describing was reported to MS. It was occurring when:
- there is a custom time zone format, e.g. UTC + 1 hour, and
- backups are skipped for the source database at the desired point in time because the database is inactive (no active transactions).
This should be fixed as of now (25th of August 2021), and I was not able to reproduce it with the current time minus 10 minutes. I was also told there should be a new API that allows making a copy without using PITR (no sooner than Q1 2022).
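For reference, here is a rough sketch of the restore-as-copy request described above, sent to the ARM REST API from Python with the requests library. The api-version, resource names, region and token handling are placeholders, and the property names are taken from the "Managed Databases - Create Or Update" REST reference, so double-check them against the current docs:

from datetime import datetime, timedelta, timezone
import requests

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
INSTANCE = "<managed-instance-name>"
SOURCE_DB = "<source-db>"
TARGET_DB = "<target-db-copy>"
TOKEN = "<bearer token, e.g. from azure.identity or 'az account get-access-token'>"

url = (
    "https://management.azure.com"
    f"/subscriptions/{SUBSCRIPTION}/resourceGroups/{RESOURCE_GROUP}"
    f"/providers/Microsoft.Sql/managedInstances/{INSTANCE}"
    f"/databases/{TARGET_DB}?api-version=2021-05-01-preview"  # assumed api-version
)

# Use "now minus a safety margin" rather than the current time, per the discussion above.
restore_point = (datetime.now(timezone.utc) - timedelta(minutes=10)).strftime("%Y-%m-%dT%H:%M:%SZ")

body = {
    "location": "<region>",
    "properties": {
        "createMode": "PointInTimeRestore",
        "restorePointInTime": restore_point,
        "sourceDatabaseId": (
            f"/subscriptions/{SUBSCRIPTION}/resourceGroups/{RESOURCE_GROUP}"
            f"/providers/Microsoft.Sql/managedInstances/{INSTANCE}/databases/{SOURCE_DB}"
        ),
    },
}

resp = requests.put(url, json=body, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()
# 201/202 means the async restore was accepted; poll the URL in the
# Azure-AsyncOperation response header to see whether it completes or
# fails with BackupSetNotFound.
print(resp.status_code)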
To answer your first question "Is there a way to find what is the latest time possible for restore?"
Yes. Via SQL. The only way to find this out is by using extended event (XEvent) sessions to monitor backup activity.
The process to start logging the backup_restore_progress_trace extended event and report on it is described here: https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/backup-activity-monitor
Including the SQL here in case the link goes stale.
This is for storing in the ring buffer (max last 1000 records):
CREATE EVENT SESSION [Verbose backup trace] ON SERVER
ADD EVENT sqlserver.backup_restore_progress_trace(
    WHERE (
        [operation_type]=(0) AND (
        [trace_message] like '%100 percent%' OR
        [trace_message] like '%BACKUP DATABASE%' OR [trace_message] like '%BACKUP LOG%'))
)
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB, MEMORY_PARTITION_MODE=NONE,
    TRACK_CAUSALITY=OFF, STARTUP_STATE=ON);

ALTER EVENT SESSION [Verbose backup trace] ON SERVER
STATE = start;
Then to see output of all backup events:
WITH
a AS (SELECT xed = CAST(xet.target_data AS xml)
    FROM sys.dm_xe_session_targets AS xet
    JOIN sys.dm_xe_sessions AS xe
        ON (xe.address = xet.event_session_address)
    WHERE xe.name = 'Verbose backup trace'),
b AS (SELECT
        d.n.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
        ISNULL(db.name, d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)')) AS database_name,
        d.n.value('(data[@name="trace_message"]/value)[1]', 'varchar(4000)') AS trace_message
    FROM a
    CROSS APPLY xed.nodes('/RingBufferTarget/event') d(n)
    LEFT JOIN master.sys.databases db
        ON db.physical_database_name = d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)'))
SELECT * FROM b
NOTE: This tip came to me via Microsoft support when I had the same issue of point-in-time restores failing in what seemed like a random fashion. They do not give any SLA for log backups. I found that on a busy database the log backups seemed to happen every 5-10 minutes, but on a quiet database only hourly. Recovery of a database this way can be slow depending on the number of transaction logs and the amount of activity to replay, etc. (https://learn.microsoft.com/en-us/azure/azure-sql/database/recovery-using-backups)
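Building on the extended event session above, here is a rough sketch (assuming pyodbc, an ODBC driver, and connectivity to the managed instance) that reads the ring buffer and reports the most recent completed backup per database. Treating the newest '100 percent' message as the latest usable restore point is an assumption based on the behaviour described here, not a documented guarantee:

import pyodbc

# Placeholders: point this at the managed instance that hosts the source database.
CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<your-managed-instance>.database.windows.net;"
    "DATABASE=master;UID=<user>;PWD=<password>"
)

# Same ring-buffer query as above, aggregated to the last completed backup per database.
QUERY = """
WITH a AS (
    SELECT xed = CAST(xet.target_data AS xml)
    FROM sys.dm_xe_session_targets AS xet
    JOIN sys.dm_xe_sessions AS xe ON xe.address = xet.event_session_address
    WHERE xe.name = 'Verbose backup trace'),
b AS (
    SELECT d.n.value('(@timestamp)[1]', 'datetime2') AS [timestamp],
           ISNULL(db.name, d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)')) AS database_name,
           d.n.value('(data[@name="trace_message"]/value)[1]', 'varchar(4000)') AS trace_message
    FROM a
    CROSS APPLY xed.nodes('/RingBufferTarget/event') d(n)
    LEFT JOIN master.sys.databases db
      ON db.physical_database_name = d.n.value('(data[@name="database_name"]/value)[1]', 'varchar(200)'))
SELECT database_name, MAX([timestamp]) AS last_completed_backup
FROM b
WHERE trace_message LIKE '%100 percent%'
GROUP BY database_name
"""

with pyodbc.connect(CONN_STR) as conn:
    for database_name, last_completed_backup in conn.cursor().execute(QUERY).fetchall():
        print(database_name, last_completed_backup)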
To answer your second question: "Or is there a better way to do ‘copy’ of managed database which guarantees transactional consistency and is reasonably quick?"
I'd have to agree with Thomas - if you're after guaranteed transactional consistency and speed, you need to look at creating a failover group: https://learn.microsoft.com/en-us/azure/azure-sql/database/auto-failover-group-overview?tabs=azure-powershell#best-practices-for-sql-managed-instance and https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/failover-group-add-instance-tutorial?tabs=azure-portal
A failover group for a managed instance will have a primary server and a failover server with the same user databases on each, kept in sync.
But yes, whether this suits your needs depends on the question Thomas asked: what is the purpose of the copy?

Cognos Analytics - Report does not load when combining two dimensions

If I want to use two dimensions in a crosstab, the report starts to process (no errors) but does not finish loading (even after 30 minutes). When I use each dimension individually, the report is ready after 30 to 60 seconds. Both dimensions have a 1..1 (dimension) to 1..n (fact table) relationship modelled in Cognos Framework Manager. If I run the generated MDX/SQL directly against the database, the query runs indefinitely as well...
Does anyone have an idea why this could happen?
Any help appreciated!
Share the generated MDX/SQL with your data architect and/or database administrator. They should be able to tell you what is wrong with the relationships in your model.

Getting Multiple Last Price Quotes from Interactive Brokers's API

I have a question regarding the Python API of Interactive Brokers.
Can multiple asset and stock contracts be passed into the reqMktData() function to obtain their last prices? (I can set snapshot=True in reqMktData to get the last price. You can assume that I have subscribed to the appropriate data services.)
To put things in perspective, this is what I am trying to do:
1) Call reqMktData, get last prices for multiple assets.
2) Feed the data into my prediction engine, and do something
3) Go to step 1.
When I contacted Interactive Brokers, they said:
"Only one contract can be passed to reqMktData() at one time, so there is no bulk request feature in requesting real time data."
Obviously, one way to get around this is to use a loop, but that is too slow. Another way is multithreading, but that is a lot of work, plus I can't afford the extra expense of a new computer. I am not interested in either option.
Any suggestions?
You can only specify 1 contract in each reqMktData call. There is no choice but to use a loop of some type. The speed shouldn't be an issue as you can make up to 50 requests per second, maybe even more for snapshots.
The speed issue could be that you want too much data (> 50 requests/s) or that you're using an old version of the IB Python API; check connection.py for lock.acquire - I've deleted all of them. Also, if there has been no trade for more than 10 seconds, IB will wait for a trade before sending a snapshot, so test with active symbols.
However, what you should do is request live streaming data by setting snapshot to false and just keep track of the last price in the stream. You can stream up to 100 tickers with the default minimums. You keep them separate by using unique ticker ids.
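As an illustration of that streaming approach, here is a minimal sketch using the native ibapi package (an assumption about your setup: it presumes TWS or IB Gateway is running locally on port 7497 with API access enabled; the symbols and client id are placeholders):

import threading
import time

from ibapi.client import EClient
from ibapi.contract import Contract
from ibapi.wrapper import EWrapper


class LastPriceApp(EWrapper, EClient):
    def __init__(self):
        EClient.__init__(self, self)
        self.last_prices = {}  # ticker id -> most recent trade price

    def tickPrice(self, reqId, tickType, price, attrib):
        # tickType 4 is "Last price" in the IB tick type table
        if tickType == 4:
            self.last_prices[reqId] = price


def make_stock(symbol):
    c = Contract()
    c.symbol = symbol
    c.secType = "STK"
    c.exchange = "SMART"
    c.currency = "USD"
    return c


app = LastPriceApp()
app.connect("127.0.0.1", 7497, clientId=1)
threading.Thread(target=app.run, daemon=True).start()

# One reqMktData call per contract, each with its own ticker id, snapshot=False
for req_id, symbol in enumerate(["AAPL", "MSFT", "IBM"], start=1):
    app.reqMktData(req_id, make_stock(symbol), "", False, False, [])

time.sleep(5)           # let some ticks arrive
print(app.last_prices)  # dict keyed by ticker id with the latest trade prices
app.disconnect()

Because each contract gets its own ticker id, last_prices always holds the most recent trade price per symbol, and the prediction engine can read from it on whatever cycle it likes instead of looping over snapshot requests.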

Should changeman demote delete the components from promotion libraries

I promoted a new JCL member 'TTTTS360' to TST (promotion level 1). I noticed that, since this is a new JCL, the member was created in TTTTTST.E998.JCL(TTTTS360), and similarly an entry was created in the parameter library COMPTST.AAAA.PARMLIB(QEEEEAU).
Now, once I demote my package to level 0 (i.e. development), I still see TTTTTST.E998.JCL(TTTTS360) and COMPTST.AAAA.PARMLIB(QEEEEAU). Shouldn't they be removed? I was expecting them to be removed altogether.
I see the following steps in the ChangeMan job output:
SYSPRINT DEL1CTC
SYSPRINT DEL1JCL
DEL1CTC CHANGEMAN STEP
DELETE QEEEEAU
QEEEEAU WAS DELETED FROM TARGET DATA SET
DEL1JCL CHANGEMAN STEP
DELETE TTTTS360
TTTTS360 WAS DELETED FROM TARGET DATA SET
ChangeMan has the concept of "staging libraries" and "promotion libraries." The former are sometimes referred to as "package datasets" because they are part of your ChangeMan package.
When you promote your package, typically the members from your staging libraries are copied to the corresponding target promotion libraries. When you demote your package, the members that were promoted are deleted from the target promotion libraries.
Your staging libraries aren't cleaned up until after install and baseline have completed as part of your request to install in your production environment. The cleanup may be days or weeks afterward, as a backout requires the staging libraries to be present.
Having said all that, ChangeMan is very configurable as Bruce Martin indicated in his comments. Talk to your ChangeMan Administrator(s) about what behavior you should expect to see.

Performance issues in Sybase 12.5 to Sybase 15 migration

We are in the process of migrating our DB to Sybase 15. The stored procedures that were working fine in Sybase 12.5 perform poorly in Sybase 15. However, when we add 'set merge_join off', Sybase 15 performs faster. Is there any way to use the Sybase 12.5 stored procs as-is in Sybase 15, or with minimal changes? Do we have any alternate ways apart from rewriting the whole stored proc?
I think this depends on how much time and energy you have to investigate Sybase 15 and use its new optimisers.
If this is a small app and you just want it working without clueing up on some or all of the new optimisers, index statistics, datachange, and login triggers, then use either compatibility mode or, maybe better, restrict the optimiser to allrows_oltp, avoiding dss and mix (which would use hash joins and merge joins respectively).
If it's a big system and you have time, I think you should find out about the above, allow at least mix if not dss too, and make sure you:
- have index statistics up to date (it is much more important to have stats on the 2nd and subsequent columns of indexes to optimise correctly for merge and hash joins),
- understand DATACHANGE (to find tables that need stats updates),
- know about login triggers (these can be very useful for configuring some sessions/users down or up optimisation levels - see the sypron website for Rob Verschoor's write-up),
- make sure you've got access to sp_showplan (use a tool, or get sa_role, or use Rob Verschoor's CIS technique to grant it).
The new optimisers are good, but I think it's true to say that they take time and energy to understand and make work. If you don't have time and energy and don't need the extra performance, just stick to allrows_oltp, or even compatibility mode (I don't have experience of the latter, but somehow it seems wrong to me.)
There is a compatibility mode in Sybase 15:
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00967.1550/html/MigrationGuide/CBHJACAF.htm
I would say try to find the root cause of the issue. We too had an issue with one of our procs where the timing went up from 27 minutes to 40 minutes. When diagnosed and fixed, the proc took just 6 minutes to complete (down from the original 27 minutes). The ASE 15 optimizer and query processing are much better than 12.5.
If you don't have time, just set compatibility mode at the session level for this proc:
"set compatibility_mode on"
But do compare the results.
Additionally, if you have time, do try using DBCC (302, 310) and 3604 (for output redirection) to understand why the optimizer is choosing that particular LAVA operator.
Excellent Article by Rob V
The Sybase 15 optimizer uses more join algorithms, i.e. merge join, hash join, nested-loop join, etc., whereas in Sybase 12.5 the most used join algorithm is the nested-loop join.
Apart from switching compatibility mode on (which will use the Sybase 12.5 optimizer and won't give you any benefits of the Sybase 15 optimizer), you can play with various optimization goals.
In your case I suggest you set the optimization goal to "allrows_oltp", which will only use nested-loop joins in your queries, at the server level.
-- server-wide default:
sp_configure 'optimization goal', 0, 'allrows_oltp'
-- session-level setting (overrides server-wide setting):
set plan optgoal allrows_oltp
-- query-level setting (overrides server-wide and session-level settings):
select * from T1, T2 where T1.a = T2.b plan '(use optgoal allrows_oltp)'
allrows_oltp resembles the Sybase 12.5 behaviour very closely and should be tried first, before any other optimization goals.
Note: after setting allrows_oltp, do proper testing to see whether any other query was affected by the change.
More info about optimization goals can be found here
