Most efficient way to detect changes of a remote CMIS repository?

A remote CMIS repository contains many folders/files.
I am writing software that keeps a local copy of these folders/files in sync.
At first run I just download everything recursively.
At later runs, I check what has changed, and download any changes.
What is the most efficient way to check the remote changes?
(addition/removal of files/folders)
Most efficient = Least bandwidth usage.
I can only use the CMIS protocol, and I cannot run any custom software on the remote server.
My ideas so far:
Idea 1: Re-download everything every time.
Idea 2: Check the root folder's modification date, hoping modification dates are recursive.
Idea 3: Use CMIS search to find all files that are more recent than the last time I synchronized. Problem: that won't tell me which files have been removed.
Any other ideas?
I don't know the CMIS protocol well, so there might be something more convenient.

Using the repository's change log is the right way to go, but realize that not every repository supports this. For example, for Alfresco you must configure the audit sub-system and you must set audit.cmischangelog.enabled=true in alfresco-global.properties.
To find out whether your repo supports changes, look at the results of the repository's getCapabilities response. If you see 'Changes' set to 'None', your repository doesn't support change logs.
Assuming it does, you need to ask the repository for its latest change log token. You can get that from getRepositoryInfo. Save that before you call getContentChanges. Then, on the next call, pass in the token. You'll get the changes made since the token was issued.
So, your code needs to:
1. Check getCapabilities for something other than Changes = None.
2. Save getRepositoryInfo's latestChangeLogToken.
3. The first time you ask, call getContentChanges with no arguments.
4. The next time you ask, call getContentChanges with the last saved token.
5. Process the result set. Each change log entry tells you its type (created, updated, deleted, permissions, etc., see spec for exact values) and provides the cmis:objectId of the changed object.
6. Repeat from step 2.
I have a "cmis-sync" script that does one-way synchronization using this approach implemented in Python. I've tested it against Alfresco as the source and the OpenCMIS InMemory repository as the target. If there is interest I can make it available.
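For illustration only (this is not the cmis-sync script mentioned above), the token-based loop could be sketched in Python roughly as follows. The method names `get_capabilities`, `get_latest_change_log_token` and `get_content_changes` are hypothetical wrappers around the CMIS getRepositoryInfo and getContentChanges services, and `FakeRepo` is an in-memory stand-in so the sketch runs without a server.

```python
class FakeRepo:
    """In-memory stand-in for a CMIS repository (hypothetical, for illustration)."""

    def __init__(self):
        self._log = []  # change events, oldest first

    def record(self, change_type, object_id):
        self._log.append({"changeType": change_type, "objectId": object_id})

    def get_capabilities(self):
        return {"Changes": "objectidsonly"}

    def get_latest_change_log_token(self):
        # In this fake, the token is simply the number of events logged so far.
        return str(len(self._log))

    def get_content_changes(self, token=None):
        # No token: return the whole log. With a token: return later events only.
        start = int(token) if token else 0
        return list(self._log[start:])


def sync_once(repo, saved_token, apply_change):
    """Run one sync pass and return the token to save for the next pass."""
    if repo.get_capabilities().get("Changes", "none").lower() == "none":
        raise RuntimeError("repository does not expose a change log")
    # Save the latest token before fetching changes, as described above.
    # An event that lands in between is seen again on the next pass,
    # so apply_change should tolerate repeats.
    next_token = repo.get_latest_change_log_token()
    for entry in repo.get_content_changes(saved_token):
        apply_change(entry["changeType"], entry["objectId"])
    return next_token
```

A caller would persist the returned token between runs, passing `None` on the very first run and the stored value afterwards.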

A better version of idea 3 is easy to implement, judging from some digging through the CMIS specification you posted.
2.1.11 Change Log
CMIS provides a “change log” mechanism to allow applications to easily discover the set of changes that have occurred to objects stored in the repository since a previous point in time. This change log can then be used by applications such as search services that maintain an external index of the repository to efficiently determine how to synchronize their index to the current state of the repository (rather than having to query for all objects currently in the repository).
Entries recorded in the change log are referred to below as “change events”.
Note that change events in the change log MUST be returned in ascending order from the time when the change event occurred.
Using the tools of your choice, you should be able to do an initial pull of the entire repository and save the time the pull was performed. Subsequent queries to the repository (at an interval of your choosing) are done with the following procedure:
Pull down the CMIS changelog from the repository
Parse all changes created after the previous pull
Perform operations based on the ChangeType enum: for example, if the "deleted" enum is present for an objectID, delete that object locally.
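Since the goal is minimal bandwidth, one possible refinement (an assumption on my part, not something the spec prescribes) is to collapse the change events to the last event per object before downloading anything. The sketch below assumes each event is a dict with `changeType` and `objectId` keys, which is a hypothetical representation; the change type values created, updated, deleted and security are the ones listed in the CMIS spec.

```python
def plan_actions(events):
    """Collapse change events (oldest first) into fetch and delete lists."""
    last = {}
    for event in events:
        if event["changeType"] == "security":
            # Permission changes do not alter content, so nothing to mirror.
            continue
        # Events arrive in ascending time order, so later ones overwrite earlier ones.
        last[event["objectId"]] = event["changeType"]
    to_fetch = [oid for oid, kind in last.items() if kind != "deleted"]
    to_delete = [oid for oid, kind in last.items() if kind == "deleted"]
    return to_fetch, to_delete
```

With this, an object that was created and then deleted between two syncs is never downloaded, and an object updated several times is fetched once.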

Related

How to clone all public repositories from gitlab server?

There is an unstable gitlab server and I am not sure that it will be able to work in the future. Therefore, I want to make a backup copy of all the repositories (projects) that are there.
Cloning the source code will be enough, but it will be great if there is a way to save issues as well. Are there any ways to do this?
It depends on what kind of access you have, but if you don't have administrator access to do a full backup, then the best thing to do is to use a couple of API endpoints to get the information you need and go from there.
Use the Projects API to get a list of all projects accessible to you.
Note the pagination limits.
What you store depends on how you want to get the information.
Store at least the ID number of each.
Filter by membership if you only want the ones you're a member of.
Filter by min_access_level = maintainer (or higher) if you want to export whole projects.
Use the Project export API to trigger a project export for each project you're a member of, and you're a maintainer (or higher).
For all other projects where you have a lower role, or where it's public, you could still use git clone for the repositories by storing the ssh_url_to_repo or http_url_to_repo from the Projects API and running through each.
For all other parts of a project, you could store the JSON version to recreate them later if you want to go through the hassle. For example, for issues, use the Issues API.
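As a rough sketch of the steps above, the following Python pages through the Projects API and builds one git clone command per project. It assumes the v4 REST API with `page`/`per_page` offset pagination and a `PRIVATE-TOKEN` header; `base_url`, `dest_root` and the function names are placeholders chosen for illustration, and error handling is omitted.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, page, token=None, per_page=100, **filters):
    """Fetch one page of /api/v4/projects; filters such as membership='true' are optional."""
    query = urllib.parse.urlencode({"page": page, "per_page": per_page, **filters})
    request = urllib.request.Request(f"{base_url}/api/v4/projects?{query}")
    if token:
        request.add_header("PRIVATE-TOKEN", token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_projects(get_page):
    """Yield projects page by page until an empty page is returned."""
    page = 1
    while True:
        batch = get_page(page)
        if not batch:
            return
        yield from batch
        page += 1


def clone_commands(projects, dest_root):
    """Build one 'git clone' argument list per project, keyed on its namespace path."""
    return [
        ["git", "clone", p["http_url_to_repo"], f"{dest_root}/{p['path_with_namespace']}"]
        for p in projects
    ]
```

Something like `iter_projects(lambda n: fetch_page("https://gitlab.example.com", n))` would supply the project list, and each command can then be run with `subprocess.run(cmd, check=True)`. The same paging loop could be pointed at the Issues API to store issues as JSON.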

How can I change the filename of release files in GitLab?

I need help. I've created a new release of my tool in GitLab and the zip files were created successfully. I can now download them via this URL:
https://gitlab.xxx.de/xxx-development/xxx-helper/-/archive/v1.0.0/xxx-helper-v1.0.0.zip
The problem is that I need to remove the -v1.0.0 from the file name of the zip file, because otherwise a target system creates a folder with the version in its name, which causes major problems. So I need at least this structure:
https://gitlab.xxx.de/xxx-development/xxx-helper/-/archive/v1.0.0/xxx-helper.zip
How can I do this?
The naming of releases is automated, and changing it would require updates to several parts of the GitLab stack/codebase.
Answer: it's technically possible but not simple.
One part serves as a functional example: guest users of private projects are allowed to view the Releases page but are not allowed to view details about the Git repository (in particular, tag names). Because of this, release titles are replaced with a generic title like “Release-1234” for Guest users to avoid leaking tag name information.
This is the URL you are referring to, and asking to change. It uses the generic title.
I can point to some parts of the codebase, but probably not all easily - it would require some significant effort. This is a project. It would also matter if you are using CE or EE.

iPhone - Core Data Model Versioning - versioning after the fact? Issues with project.pbxproj?

I have an app that I have been working on and I did a bunch of changes and then realized later I should have been adding versioning to the Core Data model. So I'm trying to go back and do that now.
Basic information:
I think everything I've done would fall under the lightweight migration feature.
I'm using git
I already have the app in users' hands
My question is: what is the easiest way to do this?
Since I'm using git, could I simply check out the data model from when I submitted it to Apple, create a new version for it, and add my changes? My main fear with this idea is that my project.pbxproj file would be incorrect. Would this be an issue? Is there a way to get around this?
IF I could do this, would I need to recreate my class files or would that be ok (assuming I get it back to being identical to what I currently have).
IF I CAN'T do this, then what can I do? If it's a matter of starting from the last version I pushed to Apple and applying changes, I guess I should look into doing it with git rebase, right?
This has nothing to do with git.
You need to create a new version of your app, provide the new data model, set it for lightweight migration and then release it as an update. Core Data will basically assume that any model without version info is version zero and attempt a migration to the new version.
When the user downloads the update, the automatic migration will trigger the first time the app runs.
Creating a new version means nothing more than changing the version number in the project info. When submitted, that will trigger the upgrade and the migration.

What is the best way to share cruisecontrol configurations on a Linux server?

I have a team that will be using CruiseControl for continuous integration, and CC will be running on a Linux server. More than one team member may need to access the CC configuration itself: changing how tests are run, changing notification rules, etc.
What is the best practice for this?
My first thought was to set up a cc-users group, then make a shared directory somewhere (such as /usr/local, I suppose, or a new directory like /projects), where the directory has r/w for the group.
Am I missing any complications with this plan? Obviously, I've never been in charge of such a project before, otherwise I wouldn't ask such a question.
FWIW, my intention is to have all the cc configuration files under mercurial so we can roll back in case of breakage.
I have version-controlled the whole of the CruiseControl configuration, along with the project-specific config files underneath it. This way, write access can be controlled as required using your source control tool's access control method (in our case Subversion), which provides tracking as well. Whoever needs to make a change can check out the file config.xml into their own workspace, make their changes, and then commit. You may want to consider the same approach.

Does CC.NET detect modification when a build script performs a checkin

I've been doing some research into finally automating our Development builds and still have one nagging question that I'm hoping the StackOverflow community can solve for me.
My understanding is that an IntervalTrigger, when set up properly, will check VSS every X seconds for changes and, if it finds a modified file, will run my tasks. One of my tasks would be to check out the AssemblyInfo files and update the version numbers. After these files are updated they would be checked back into VSS.
Thinking about this solution it doesn't make much sense because in my mind, I'm forcing the check for changed files to true every time the trigger fires. Am I missing something here? Is there a way of doing this without triggering an automatic build on the AssemblyInfo check-in?
You can use a Filtered Source Control Block to exclude certain files from the trigger.
I just posted a bunch about my default build process here which may be of some interest to you: SVN Website Development and Deployment Solution
The way I usually configure my projects with CC.NET is to have two project blocks per solution. One configured as an interval trigger that does nothing more than get the latest from my repository, build the solution, and run unit tests. The other is a schedule trigger that does all the things the other one does, but actually publishes a build. This includes changing version numbers, publishing files, etc. This might work in your case, since the change in version would cause the interval project to trigger, but only once.
Checking the automatically generated AssemblyInfo into the version control system is a bad idea; don't do it. You'll get a lot of noise (50% of all commits!) in your history. Also, it does not give you any new information, since you can always pull this from the VCS. Having your build script autogenerate those files is good practice, but don't push those changes back!
