How to store metadata with Ceph? - node.js

I want to store user files in CephFS. I also need to store some metadata for these files (download date and verification status, for example) and be able to sort them by date or assign them a number. If I use MongoDB for the metadata, I have a synchronization problem: a file can be in the database but not in CephFS, or vice versa.
The file structure in CephFS is as follows:
/{user.id}/{Media collection name}/{media.id}
The media.id is uuidv4.
My idea:
Create a "meta" folder that holds each file's metadata, keyed by file id, but without the date the file was uploaded to CephFS. To get the date, use the data from CephFS itself (I assume it stores the date a file was created and changed, just like any other file system?).
I didn't find anything in the Ceph documentation saying that it stores this metadata as well, so I'm not sure whether this option would work.
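CephFS is a POSIX-compatible file system, so once it is mounted, an ordinary stat call returns each file's size and modification time. The following is a minimal sketch of the "meta" folder idea, assuming the collection directory lives on a mounted CephFS and that one JSON sidecar file per media id is acceptable. The names writeMeta, readMediaInfo and listByDate are hypothetical helpers, not part of any Ceph API.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: stores metadata for one media id in <collection>/meta/<id>.json.
function writeMeta(collectionDir, mediaId, meta) {
  const metaDir = path.join(collectionDir, "meta");
  fs.mkdirSync(metaDir, { recursive: true });
  const target = path.join(metaDir, `${mediaId}.json`);
  const tmp = `${target}.tmp`;
  // Write to a temporary file first, then rename it over the target,
  // so readers never see a half-written metadata file.
  fs.writeFileSync(tmp, JSON.stringify(meta));
  fs.renameSync(tmp, target);
}

// Combines the file system's own timestamps with the sidecar metadata (if any).
function readMediaInfo(collectionDir, mediaId) {
  const stat = fs.statSync(path.join(collectionDir, mediaId));
  const metaPath = path.join(collectionDir, "meta", `${mediaId}.json`);
  const meta = fs.existsSync(metaPath)
    ? JSON.parse(fs.readFileSync(metaPath, "utf8"))
    : null;
  return { mediaId, modifiedAt: stat.mtime, sizeBytes: stat.size, meta };
}

// Lists the media files of one collection, oldest modification time first.
function listByDate(collectionDir) {
  return fs
    .readdirSync(collectionDir, { withFileTypes: true })
    .filter((entry) => entry.isFile()) // skips the "meta" directory
    .map((entry) => readMediaInfo(collectionDir, entry.name))
    .sort((a, b) => a.modifiedAt - b.modifiedAt);
}
```

Note that the modification time only equals the upload date as long as the file is never rewritten after upload. Because the file and its sidecar live on the same file system, a missing sidecar can be detected and handled when the file is read, which narrows the synchronization problem but does not remove it entirely.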

Related

Where are the docs (docx, xls, etc.) physically stored when using ONLYOFFICE?

I assume they are not on the 'document-server'? Is it true that all documents are stored as database entities in a db?
Some information is stored in the db, but the documentserver also keeps information such as change history, changes and current output under documentserver/server/App_data.

Is there any way to read a value from a file and sync it with the value present in the database?

Hi, I have a table that stores all configuration settings, and I also have a hardcoded config file. My problem is that a user who is not a developer may want to access a config value that is not present in the database but is present in the hardcoded file. How can I make this possible?
How can I make sure that my configuration table overrides values that are already present in the hardcoded file, so that the user can access all values in the config file as well as those in the database?
I'm using Node.js, ES6 and Objection.js.
It's a bit hard to tell exactly what you're asking. If you have an ordered set of locations where configuration values can be stored and you want them to be accessed in a particular way such as:
read the value from the database
if value not stored in the database,
then check for value in the config file
Then, you would likely need to do one of these things:
Expose an API for reading the value, where your own logic checks for the value in the right order, and force your clients to use the API rather than read the database or config file directly.
Force the clients to all check for things in the proper order.
Keep the config file up to date with the latest value so an outside developer can always just read the config file to get the latest value. This means that any time a relevant value in the database is updated, you have to update the config file.
Read the config file values on startup into the database so you only have one storage place for the "live" values (the database). Then force all clients to only read config values from the database.
Get rid of the config file entirely and just store things in the database. If you need to change the config, modify the values in the database.
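The first option (a single API that checks the database and then falls back to the config file) could look roughly like the sketch below. It assumes a hypothetical async function loadFromDatabase(key) that resolves to the stored value or undefined, and a plain object fileDefaults holding the hardcoded values; neither name comes from the question.

```javascript
// Hypothetical resolver: loadFromDatabase and fileDefaults are assumptions
// made for illustration, not an existing API.
function createConfigResolver(loadFromDatabase, fileDefaults) {
  return async function getConfig(key) {
    // 1. Read the value from the database.
    const dbValue = await loadFromDatabase(key);
    if (dbValue !== undefined && dbValue !== null) {
      return dbValue;
    }
    // 2. If it is not stored there, fall back to the hardcoded config file.
    return Object.prototype.hasOwnProperty.call(fileDefaults, key)
      ? fileDefaults[key]
      : undefined;
  };
}
```

With this shape, callers only ever use getConfig, so the lookup order is enforced in one place and a database value always overrides the hardcoded one.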
I created a separate file, for example change.js, with a function that holds the hardcoded values as in the config file. Inside my controller I import that function and call something like newConfig.getConfig(), so the values can be obtained dynamically from changes.js. Only if the value cannot be found there do I query the database.

How to prevent saving a file with the same name on CouchDB?

I am using CouchDB with Divan, a C# interfacing library for CouchDB.
A file can be uploaded to CouchDB many times. Every time the file is uploaded the "id" changes, but the "rev" remains the same.
This happens even if all custom attributes defined for the file being uploaded are the same as those of an existing file with the same name on CouchDB.
Is there any way to avoid uploading a file with the same name if all custom attributes are the same? Fetching all files and checking them for a repeated file name could be a way, but it is definitely not preferable because of the time it requires.
Thanking you.
Let's say you have 3 attributes for a file :
name
size in bytes
Date of modification
I see two main possibilities to avoid duplicates in your database.
Client approach
You query the database with a view to check whether a document with the same attributes exists. If it doesn't exist, create it.
User defined id
You could generate an id from the attributes, as this library does.
For example, if my document has these attributes:
"name":"test.txt",
"size":"512",
"lastModified":"2016-11-08T15:44:29.563Z"
You could build a unique id like this :
"_id":"test.txt/2016-11-08T15:44:29.563Z/512"
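A minimal sketch of building such an id is shown below. buildDocId and docUrlPath are hypothetical helper names. The sketch relies on the fact that CouchDB answers with a 409 conflict when a document is created with an _id that already exists and no matching _rev is supplied, so a second upload with identical attributes is rejected by the server. Because the id contains slashes, it has to be URL-encoded when it is used in a request path.

```javascript
// Hypothetical helper: derives a deterministic _id from the three attributes
// in the same order as the example above (name / lastModified / size).
function buildDocId({ name, size, lastModified }) {
  return [name, lastModified, size].join("/");
}

// Hypothetical helper: request path for the document, with the database name
// and the id encoded so that "/" and ":" inside the id survive in a URL.
function docUrlPath(database, docId) {
  return `/${encodeURIComponent(database)}/${encodeURIComponent(docId)}`;
}

const exampleId = buildDocId({
  name: "test.txt",
  size: "512",
  lastModified: "2016-11-08T15:44:29.563Z",
});
// exampleId is "test.txt/2016-11-08T15:44:29.563Z/512"
```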

Keeping elasticsearch in sync with key or versioning

So I have a situation where I receive a lot of large XML files and I want that data synchronised in Elasticsearch.
Current way
Have index_1
When data is updated create blank index_2
Load all of latest data into index_2
Alias to index_2 and delete index_1
Proposed way
Have a synced.xml file which has been synchronised with Elasticsearch
When a new time-dated XML file is available, compare it against synced.xml
If anything is new in the timedated xml file, add just that to ES
Rename timedated xml to synced.xml
This means out of 500,000 items, I only have to add the 5,000 items that have changed for example, not duplicate the 500,000 items.
Question
In a scenario like this, how do I ensure they are synchronised? For example, what happens if Elasticsearch gets wiped: how can I tell my program that it needs to add the whole lot again? Is there a way to use some sort of synchronisation key in Elasticsearch, or perhaps a better approach?
Here is what I recommend...
Add a stored field to your type to store a hash like MD5
Use Scan/Scroll to export the ID and Hash from ES
In your backing dataset export ID and Hash
Use something like MapReduce to "join" on exported ids from each set
Where there are differences via comparing the hash or finding missing keys, index/update
The hash is only useful if you want to detect document changes. This also assumes that either you persist ES's IDs back to your backing store or that you self-assign IDs.
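The comparison step can be sketched in plain Node.js without any Elasticsearch client. The sketch assumes the two exports have already been loaded into Maps of id to hash; md5Of and findIdsToIndex are hypothetical names, and an in-memory comparison stands in for the MapReduce join, which is only reasonable while both id sets fit in memory.

```javascript
const crypto = require("node:crypto");

// Hash of one item's content, as it would be stored in the extra field.
function md5Of(content) {
  return crypto.createHash("md5").update(content, "utf8").digest("hex");
}

// Returns the ids that must be (re)indexed: ids missing from the index
// and ids whose stored hash differs from the backing dataset's hash.
function findIdsToIndex(sourceHashes, indexedHashes) {
  const toIndex = [];
  for (const [id, hash] of sourceHashes) {
    if (indexedHashes.get(id) !== hash) {
      toIndex.push(id);
    }
  }
  return toIndex;
}
```

If the index has been wiped, the exported Map is empty, so every id in the backing dataset is returned and the whole set is indexed again without any special case.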

Redefine folder structure of document library with metadata

I have a problem with my SharePoint document library structure. Currently the document library consists of a folder and sub-folder structure that stores documents by category. Now our client wants to replace this folder structure with a metadata structure.
Can anyone tell me how I can use metadata instead of the folder and sub-folder structure?
Any related articles or links will be appreciated.
Thanks
Sachin
As already stated, you need to use columns for the metadata, preferably through a new Content Type. After creating this Content Type, you need to attach it to the library and convert all documents to it. Lastly, you also need to modify the views of the library, e.g. depending on your metadata you might only want to display certain columns or filter them.
There is an excellent whitepaper from Microsoft on Content Types available here:
http://technet.microsoft.com/en-us/library/cc262729.aspx
You can also read more about content type planning on Technet:
http://technet.microsoft.com/en-us/library/cc262735.aspx
And here's some info about Views:
http://office.microsoft.com/en-us/sharepointtechnology/HA100215771033.aspx
You must define columns for the metadata fields you want to have, create a content type that includes these columns, and assign this content type to your documents.
You might also change the default view of your document library, or create a new view, to make the new metadata columns visible.
