I need to access about 9 files in a particular SharePoint subfolder for my Power BI visualisation. Each file holds different data and I need them as separate tables.
I tried the approach below because connecting to the SharePoint folder felt really slow.
Sharepoint_Folder_Query > This connects to the SharePoint site and filters for the subfolder. It uses "SharePoint.Files", as I was not able to use SharePoint.Contents because the files are in a subfolder.
File_Query > This references the above "Sharepoint_Folder_Query" and picks up the file it needs. There are 9 File_Query queries, one for each file.
Data_Query > There are again 9 data queries, each referencing its respective "File_Query" and performing additional manipulation on the data. These are the tables used in my visualisations.
My idea was that, since connecting to SharePoint takes a lot of time, I would connect just once and use the reference from then on.
But right now my refresh is taking almost 1 hour. During the refresh I can see each of my queries trying to connect to SharePoint: the message "Waiting for https://..." appears under all of them. I'm not sure what I did wrong.
For comparison, I initially had the files in OneDrive and used just the Data_Query section to create all the visualisations. The refresh then took only 30 seconds.
So my question: is something wrong with my approach? If yes, what? Also, is there a better way to do this and reduce the refresh time?
After performing two successful imports (one for users and one for follow relationships) the usage data view has not updated with the expected values. Does this mean my records were not created from the import?
I was expecting around 50k user records with as many follow relationships.
Currently, I'm creating user records with only their ID set. When I do this via the API's get_or_create I can see the usage update in real time. However, doing this via an import appears to have had no effect. The same goes for follow relationships.
I've noticed that the docs state "An array of users in which each line can have up to 10,000 entries". Does that mean I'm limited to 10k users per instruction?
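If the limit does apply per line, I assume my ~50k users would have to be split across several lines. A minimal Python sketch of that splitting, where the `chunked` helper and the ID format are made up for illustration and I have not confirmed this is what the import expects:

```python
from typing import Iterator, List

# Limit quoted from the docs; ASSUMED here to apply per line of the import.
MAX_ENTRIES_PER_LINE = 10_000


def chunked(ids: List[str], size: int = MAX_ENTRIES_PER_LINE) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each at most `size` entries long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Hypothetical user IDs, only to exercise the helper.
user_ids = [f"user_{i}" for i in range(50_000)]
batches = list(chunked(user_ids))
print(len(batches), [len(b) for b in batches])
```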
Some details about what the issue was.
Automatic import has a config option that disables statistics tracking for the dashboard.
This option exists to support very big imports. However, automatic imports are limited to 300MB by default, so there is no need to disable tracking for them: that amount is manageable without any special care.
So the running configuration was wrong, and statistics tracking is enabled now.
I have a CosmosDB Collection which I'm querying using the REST API.
I'd like to access the total number of documents which match my query. I know I can do a count, but that means two calls, one for the count and a subsequent one to retrieve the actual records.
I would assume this is not possible in a single call, BUT.. the Data Explorer in Azure Portal seems to manage it, so just wondering if anyone has been able to figure out what calls it makes, to get this:
Showing Results 1 - 10
Retrieved document count 342
Retrieved document size 2868425 bytes
Output document count 10
It's the Retrieved Document Count I need - if the portal can do it, there ought to be a way :)
I've tried the Java SDK as well as REST, but can't see any useful options there either.
As so often is the case in this game, asking a question triggers the answer... so apologies in advance.
The answer is to send the x-ms-documentdb-populatequerymetrics header in the request.
The response then gives a whole bunch of useful stuff in x-ms-documentdb-query-metrics.
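For anyone who wants to pull the count out of that response header, here is a minimal Python sketch. It assumes the header value is a semicolon-separated list of `name=value` pairs and that the request header is enabled by setting it to `true`. The `parse_query_metrics` helper is my own, and the sample string simply reuses the numbers from the portal output in the question, so check both against a real response.

```python
from typing import Dict

# Request header named above; "true" is the ASSUMED value that switches it on.
REQUEST_HEADERS = {"x-ms-documentdb-populatequerymetrics": "true"}


def parse_query_metrics(header_value: str) -> Dict[str, float]:
    """Parse an ASSUMED 'name=value;name=value' metrics string into a dict."""
    metrics: Dict[str, float] = {}
    for pair in header_value.split(";"):
        key, sep, value = pair.partition("=")
        if not sep:
            continue  # skip empty or malformed fragments
        try:
            metrics[key.strip()] = float(value)
        except ValueError:
            continue  # skip values that are not numeric
    return metrics


# Illustrative value for x-ms-documentdb-query-metrics (not a captured response).
sample = "retrievedDocumentCount=342;retrievedDocumentSize=2868425;outputDocumentCount=10"
metrics = parse_query_metrics(sample)
print(int(metrics["retrievedDocumentCount"]))
```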
What I would still like to understand is whether this has any performance impact.
In a Node.js application I have to maintain a "who was online in the last N minutes" state. Since there are potentially thousands of online users, I decided for performance reasons not to update my PostgreSQL user table for this task.
I chose to use Redis to manage the online status. It's very easy and efficient.
But now I want to make complex queries to the user table, sorted by the online status.
I was thinking of creating an online table, filled every minute from a Redis snapshot, but I'm not sure it's the best solution.
After the table is filled, will the next query that references the online table take a big hit from the creation or loading of the new indexes?
Does anyone know a better solution?
I had to solve almost exactly this issue, but I took a different approach because I didn't like the problems caused by trying to mix Redis and Postgres.
My solution was to collect the online data in a queue (ZeroMQ in my case, but any queueing system should work, as would a stream processing facility like Amazon Kinesis, the alternative I looked at). I then inserted the data in batches into a second table (not the users table). I don't delete from or update that table; only inserts and queries are allowed.
Doing things this way preserved the ability to do joins between the last online data and the users table without bogging down the database or creating many updates on the user tables. It has the side effect of giving us a lot of other useful data.
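A rough sketch of the insert-only idea, written in Python with sqlite3 standing in for Postgres so that it runs on its own. The table and column names, the `flush_batch` and `users_by_last_seen` helpers, and the integer timestamps are all made up for illustration; this is not the code I actually ran, and the queue consumer that would call `flush_batch` is left out.

```python
import sqlite3
from typing import Iterable, List, Tuple

conn = sqlite3.connect(":memory:")  # stand-in for the real Postgres database
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Append-only: rows are inserted and queried, never updated or deleted.
    CREATE TABLE online_events (user_id INTEGER NOT NULL, seen_at INTEGER NOT NULL);
    CREATE INDEX online_events_user_seen ON online_events (user_id, seen_at);
""")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(1, "alice"), (2, "bob"), (3, "carol")])


def flush_batch(events: Iterable[Tuple[int, int]]) -> None:
    """Insert one batch of (user_id, seen_at) events drained from the queue."""
    with conn:  # one transaction per batch
        conn.executemany(
            "INSERT INTO online_events (user_id, seen_at) VALUES (?, ?)", events)


def users_by_last_seen(since: int) -> List[Tuple[str, int]]:
    """Join users to their latest event at or after `since`, most recent first."""
    return conn.execute("""
        SELECT u.name, MAX(e.seen_at) AS last_seen
        FROM users u
        JOIN online_events e ON e.user_id = u.id
        WHERE e.seen_at >= ?
        GROUP BY u.id, u.name
        ORDER BY last_seen DESC
    """, (since,)).fetchall()


flush_batch([(1, 100), (2, 105)])  # first batch from the queue
flush_batch([(1, 160)])            # later batch: alice seen again
print(users_by_last_seen(100))
```

The users table is only ever read in the join, so the online tracking never writes to it.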
One thing I have thought about when considering other solutions to this problem: your users table is transactional data (OLTP), while the latest online information is really analytics data (OLAP). So if you have a data warehouse, data lake, big data platform, or whatever the term of the week is for storing and querying this type of data, that may be a better solution.
Are there ways to get around the upper database size limit on Notes databases? We are compacting a database that is still approaching 60 gigs in size. Thank you very much if you can offer a suggestion.
Even if you could find a way to get over the 64GB limit it would not be the recommended solution. Splitting up the application into multiple databases is far better if you wish to improve performance and retain the stability of your Domino server. If you think you have to have everything in the same database in order to be able to search, please look up domain search and multi-database search in the Domino Administrator help.
Maybe some of the data is "old" and could be put into one or more archive databases instead?
Maybe you have a lot of large attachments and can store them in a series of attachment databases?
Maybe you have a lot of complicated views that can be streamlined or eliminated and thereby save a lot of space and keep everything in the same database for the time being? (Remove sorting on columns where not needed, using "click on column header to sort" is a sure way to increase the size of the view index.)
I'm assuming your database is large because of file attachments as well. In that case look into DAOS: it stores all file attachments on the file system (it is server functionality, transparent to clients and existing applications).
As a bonus it finds duplicates and stores them only once.
More here: http://www.ibm.com/developerworks/lotus/library/domino-green/
Just a stab in the dark:
Use the DB2 storage method instead of storing the data on the Domino server?
I'm guessing that 80-90% of that space is taken up by file attachments. My suggestion is to move all the attachments to a file share, provided everyone can access that share, or to an FTP server that everyone can connect to.
It's not ideal because security becomes an issue - now you need to manage credentials to the Notes database AND to the external file share - however it'll be worth the effort from a Notes administrator's perspective.
In the Notes documents, just provide a link to the file. If users are adding these files via a Notes form, perhaps you can add some background code to extract the file from the document after it has been saved, and replace it with a link to that file.
The 64GB is not actually an absolute limit; you can go above that. I've seen 80GB and even close to 100GB, although once you're past 64GB you can get problems at any time. The limit is not actually Notes, it's the underlying file system. I've seen this on AS400, but the great thing about Notes is that if you do get a huge crash you can still access all the documents and pull everything out to new copies using scheduled agents, even if you can no longer get views to open in the client.
Your best bet is regular archiving. If it is file attachments, then anything over two years old doesn't need to be in the main system, just a brief synopsis and a link. You could even have a 5-year archive, a 2-year archive, a 1-year archive, etc. Data will continue to accumulate and has to be managed, irrespective of what platform you use to store it.
If the issue really is large file attachments, I would certainly recommend looking into implementing DAOS on your server / database. It is only available with Domino Server 8.5 and later. On the other hand, if your database contains over 100,000 documents, you may want to look seriously at dividing the data into multiple NSFs. At that number of documents, you need to be very careful about your view design, your lookup code, etc.
Some documented successes with DAOS:
http://www.edbrill.com/ebrill/edbrill.nsf/dx/yet-another-daos-success-story-from-darren-duke?opendocument&comments
If your database is getting to 60GB, don't use a Domino solution; you need to switch to a relational database. You need to archive or move documents across several databases. Although you can get to 60GB, you shouldn't do it. The performance hit for active databases is significant. It is not so much of a problem for static databases.
I would also look at removing any unnecessary views and their indexes. View indexes can occupy 80-90% of your disk space. If you can't remove them, simplify their sorting arrangements/formulas and remove any unnecessary column sorting options. I halved a 50GB database down to 25GB with a few simple changes like this, and virtually no users noticed.
One path could be, for once, to start with the user. Do all the users need to access all that data all the time? If not, it's time to split or archive. If so, there is probably a flaw in the design of the application.
Technically, I would add to the previous comments a suggestion to check the many options for compaction. Quick and dirty: discard all view indices, but be sure to rebuild at least the one for the default view if you don't want your users to riot. See updall.
One more thing to check: make sure you have checked
[x] Use LZ1 compression for attachments
in db properties.