SharePoint for education

I am trying to figure out whether it's possible to create learning material for a course I am teaching, in a book-and-chapters format similar to https://demo.bookstackapp.com/books/dummy-content-book.
I was wondering what the equivalents of Books, Chapters, and Pages would be in SharePoint.
Any suggestions would be greatly appreciated.

I think you can use a SharePoint document library as the Book, divide the book into folders that serve as the Chapters, and create subfolders within those folders for smaller sections, or store content in them directly.
You can set individual permissions on each document library, folder, and subfolder. For example, chapters that have already been taught can be opened up for all students to see, while chapters that have not yet been taught remain hidden.
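If you later want to script that per-folder permission setup instead of clicking through the UI, a rough Python sketch against the standard SharePoint REST endpoints might look like the following. The site URL, folder path, group ID, and bearer token are all placeholders; breakroleinheritance and addroleassignment are genuine REST calls, but verify principal and role IDs against your own tenant (via /_api/web/sitegroups and /_api/web/roledefinitions).

```python
# A rough sketch of scoping permissions to a chapter folder via the
# SharePoint REST API. Site, folder, group ID, and token are placeholders.
import requests

SITE = "https://contoso.sharepoint.com/sites/course"   # placeholder site
FOLDER = "/sites/course/Shared Documents/Chapter 1"    # placeholder folder
HEADERS = {
    "Authorization": "Bearer <token>",  # assumed pre-acquired OAuth token
    "Accept": "application/json;odata=verbose",
}

# Stop the folder inheriting permissions from its library...
requests.post(
    f"{SITE}/_api/web/GetFolderByServerRelativeUrl('{FOLDER}')"
    "/ListItemAllFields/breakroleinheritance"
    "(copyRoleAssignments=false,clearSubscopes=true)",
    headers=HEADERS,
).raise_for_status()

# ...then grant the Students group read access. 1073741826 is
# SharePoint's built-in "Read" role definition ID; the group ID below is
# a hypothetical value for illustration.
STUDENTS_GROUP_ID = 8
requests.post(
    f"{SITE}/_api/web/GetFolderByServerRelativeUrl('{FOLDER}')"
    "/ListItemAllFields/roleassignments/addroleassignment"
    f"(principalid={STUDENTS_GROUP_ID},roledefid=1073741826)",
    headers=HEADERS,
).raise_for_status()
```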

Related

What is mediaItemId in Google Photos API?

I am reading the Google Photos API documentation. I can't find out what mediaItemId is, see for example here:
https://developers.google.com/photos/library/guides/access-media-items#get-media-item
There are some other questions that might be related, but they have no answers:
How to get mediaItemId of a Google photo using its shared URL?
I've not used the API but I'm familiar with other Google services and am a Photos user.
If you consider your experience with photos.google.com, you browse a somewhat unstructured list of all your photos. The Photos (phone|browser) apps do categorize photos by date, but you have to search or filter by other metadata to find the specific photo(s) you're seeking. Or you happy-scroll through years of photos of your cat.
This contrasts with another common metaphor for arranging files in which a hierarchy of folders is used to categorize the content e.g. /photos/cats/2022 but this mechanism is limited because you can only really navigate through one dimension (the folders).
Considerable metadata (type, width|height, creation date etc.) is associated with each photo and it is customary in schemas like this to construct a unique ID for each object. The unique ID is sometimes exposed to the end-user but not necessarily. Identifiers are generally for the system's own purposes.
With Photos, there are public, unique identifiers in the form of URLs for each photo, but the id and the URL, although probably related (perhaps via a hash), aren't obviously derivable from one another.
So, since it's not always possible to specify a photo uniquely by e.g. "The one of my dog where he's wearing sunglasses because of the eclipse", and in the absence of folders, a really powerful alternative (which you'll need to employ) is to search for some subset of the photos and then iterate over the results.
It appears that the Photos service has such a search to which you provide Filters and each of the items in the results will be a MediaItem (uniquely identified by id).
Unlike the file system example above, because Photos does not use a fixed hierarchy, we can view our Photos by filtering them using an extensive set of metadata: photos of cats, taken in 2022, using my phone.
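To make that concrete, here is a minimal Python sketch of the mediaItems:search flow described above, assuming you already hold an OAuth 2.0 access token with a Photos Library scope (obtaining one is not shown). Each returned MediaItem carries the id the documentation calls mediaItemId.

```python
# Minimal sketch: search the Photos library with filters and read each
# MediaItem's id. ACCESS_TOKEN is a placeholder for a real OAuth token.
import requests

ACCESS_TOKEN = "<your-oauth2-access-token>"

body = {
    "pageSize": 25,
    "filters": {
        # Category names come from the API's contentFilter documentation.
        "contentFilter": {"includedContentCategories": ["ANIMALS"]},
        "dateFilter": {
            "ranges": [{
                "startDate": {"year": 2022, "month": 1, "day": 1},
                "endDate": {"year": 2022, "month": 12, "day": 31},
            }]
        },
    },
}

resp = requests.post(
    "https://photoslibrary.googleapis.com/v1/mediaItems:search",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=body,
)
resp.raise_for_status()

for item in resp.json().get("mediaItems", []):
    # "id" is the mediaItemId the documentation refers to; it can then be
    # passed to GET /v1/mediaItems/{mediaItemId} to fetch a single item.
    print(item["id"], item.get("filename"))
```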

English dictionary dump for text analysis

I am looking for an English Dictionary dump for some text analysis in Python.
This would include a word and some of its attributes (noun/verb, its forms, tenses, and probably origin too!). So, I envision these as columns of a data frame.
I have gone through numerous threads where folks have suggested sources, but I believe none of those fulfill the above requirements (some are just word lists; others are words with just meanings). Moreover, they seem non-exhaustive (very small corpora, whereas I am targeting ~500,000 words).
Is there a dump available from authoritative sources like Oxford or Merriam Webster?
Also, there is a PyDictionary module. Is it possible to fetch such a dump from this module?
WordNet is a corpus of words, their synonyms, hyponyms, and meronyms, grouped by synsets and made available for free, given that you follow its license: https://wordnet.princeton.edu/. Since this is a popular choice, you can find this corpus in almost any data format with a little searching. The database contains 155,327 words.
BabelNet is another corpus, which has aggregated WordNet, Wikipedia, and many other sources into a database of 91,218,220 glossary definitions covering many languages: https://babelnet.org/
If you want to use the Oxford dictionary or Merriam-Webster, note that they are commercial products and don't give away their databases with unlimited access. Both have API interfaces you can use with a registered API key.
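As a quick illustration, here is a small Python sketch that pulls words and some of their attributes out of WordNet via NLTK and arranges them as data-frame columns, roughly as the question envisions. Note that WordNet gives you part of speech, definitions, and synonyms, but not tenses or etymology.

```python
# Pull word attributes from WordNet via NLTK into a pandas DataFrame.
# Requires: pip install nltk pandas
from itertools import islice

import nltk
import pandas as pd
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)  # one-time corpus download

rows = []
for synset in islice(wn.all_synsets(), 1000):  # capped for a quick demo
    for lemma in synset.lemmas():
        rows.append({
            "word": lemma.name(),
            "pos": synset.pos(),  # n, v, a, s (satellite adj.), r
            "definition": synset.definition(),
            "synonyms": [l.name() for l in synset.lemmas()],
        })

df = pd.DataFrame(rows)
print(df.head())
```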

how schema.org can help in nlp

I am working on NLP, collecting interest-based data from web pages.
I came across http://schema.org/ as a source said to be helpful for NLP work.
I went through the documentation, from which I can see that it adds additional tag properties to identify HTML tag content.
It may help a search engine to get specific data as per the user's query.
It says: Schema.org provides a collection of shared vocabularies webmasters can use to mark up their pages in ways that can be understood by the major search engines: Google, Microsoft, Yandex and Yahoo!
But I don't understand how it can help me as an NLP person. Generally I parse web page content to process and extract data from it. schema.org may help there, but I don't know how to utilize it.
Any example or guidance would be appreciated.
Schema.org uses the microdata format for representation. People use microdata for text analytics and for extracting curated content. There are numerous applications:
Suppose you want to create a news summarization system. You can use the hNews microformat to extract the most relevant content and perform summarization on it.
Suppose you have a review-based search engine where you want to list products with the most positive reviews. You can use the hReview microformat to extract the reviews, then perform sentiment analysis on them to identify whether a product has negative or positive reviews.
If you want to create a skill-based resume classifier, extract content with the hResume microformat, which can give you various details such as contact information (using the hCard microformat), experience, achievements, education, skills/qualifications, affiliations, publications, and so on. You can then run a classifier on it to group CVs by particular skill sets.
Though schema.org does not help NLP people directly, it provides a platform for performing text processing in a better way.
Check out http://en.wikipedia.org/wiki/Microformat#Specific_microformats to see the various microformats; the same page will give you more details.
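As a concrete starting point for the extraction step, a minimal Python sketch for pulling schema.org microdata out of a page with BeautifulSoup might look like this; the HTML snippet is an invented example.

```python
# Extract schema.org microdata (itemscope/itemtype/itemprop attributes)
# with BeautifulSoup. Requires: pip install beautifulsoup4
from bs4 import BeautifulSoup

html = """
<div itemscope itemtype="http://schema.org/Review">
  <span itemprop="itemReviewed">Acme Phone</span>
  <span itemprop="reviewBody">Great battery life, mediocre camera.</span>
  <span itemprop="reviewRating">4</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
for scope in soup.find_all(attrs={"itemscope": True}):
    item = {"type": scope.get("itemtype")}
    for prop in scope.find_all(attrs={"itemprop": True}):
        item[prop["itemprop"]] = prop.get_text(strip=True)
    print(item)
    # e.g. feed item["reviewBody"] into a sentiment model afterwards
```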
Schema.org is something like a vocabulary or ontology for annotating data, here specifically web pages.
It's a good idea to extract microdata from web pages, but is it really used by web developers? I don't think so; I think the majority of microdata is consumed by companies such as Google or Yahoo.
Finally, you can find data, but not a lot, and it is mainly used by specific types of websites.
What do you want to extract, and for what type of application? You could probably use another data source such as DBpedia or Freebase, for example.
GoodRelations also supports schema.org. You can annotate your content on the fly from the front end based on the various domain contexts defined, so schema.org is very useful for NLP extraction. One can even use it for HATEOAS services, for hypermedia link relations. Metadata (data about data) for any context is good for content and data in general. Alternatives include microformats, RDFa, RDFa Lite, etc.
The more context you have the better, as it will turn your data into smart content and help crawler bots understand the data. It also leads further into the web of data and helps with global queries over resource domains. In the long run, such approaches will help towards domain adaptation of agents for transfer learning on the web, pretty much making the web of pages an externalized unit of a massive commonsense knowledge base. They also help advertising agencies understand publisher sites and better contextualize ad retargeting.
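If you do want to harvest several of these annotation syntaxes in practice, the extruct library (pip install extruct) can extract microdata, RDFa, and JSON-LD in one pass. A rough sketch, with a placeholder URL:

```python
# Extract multiple annotation syntaxes from one page with extruct.
# Requires: pip install extruct requests w3lib
import extruct
import requests
from w3lib.html import get_base_url

url = "https://example.com/some-product-page"  # placeholder URL
resp = requests.get(url)
base_url = get_base_url(resp.text, resp.url)

data = extruct.extract(
    resp.text,
    base_url=base_url,
    syntaxes=["microdata", "rdfa", "json-ld"],
)
for syntax, items in data.items():
    print(syntax, len(items), "items found")
```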

Can I identify intranet page content using Named Entity Recognition?

I am new to Natural Language Processing and I want to learn more by creating a simple project. NLTK was suggested to be popular in NLP so I will use it in my project.
Here is what I would like to do:
I want to scan our company's intranet pages; approximately 3K pages
I would like to parse and categorize the content of these pages based on certain criteria such as: HR, Engineering, Corporate Pages, etc...
From what I have read so far, I can do this with Named Entity Recognition. I can describe entities for each category of pages, train the NLTK solution and run each page through to determine the category.
Is this the right approach? I appreciate any direction and ideas...
Thanks
It looks like you want to do text/document classification, which is not quite the same as Named Entity Recognition, where the goal is to recognize named entities (proper names, places, institutions, etc.) in text. However, proper names might be very good features when doing text classification in a limited domain; for example, a page containing the name of the head engineer is likely to be classified as Engineering.
The NLTK book has a chapter on basic text classification.
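To give a flavour of that chapter's approach, here is a toy Python sketch of document classification with NLTK's Naive Bayes classifier; the training pages and labels are invented placeholders standing in for scraped intranet text.

```python
# Toy document classification with NLTK's Naive Bayes classifier.
# Requires: pip install nltk
import nltk

def bag_of_words(text):
    """Represent a document as a set-of-words feature dict."""
    return {word.lower(): True for word in text.split()}

# Invented placeholder training data: (page text, category) pairs.
train = [
    ("Submit your vacation request to human resources", "HR"),
    ("Benefits enrollment opens next month", "HR"),
    ("Deploy the build to the staging server", "Engineering"),
    ("Code review guidelines for the platform team", "Engineering"),
]

train_set = [(bag_of_words(text), label) for text, label in train]
classifier = nltk.NaiveBayesClassifier.train(train_set)

page = "The release pipeline failed during deployment"
print(classifier.classify(bag_of_words(page)))
```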

SharePoint document collaboration

Is it possible to restrict sections within a document based on the user?
E.g.:
A document contains three sections:
Section 1, Section 2, Section 3
Three users need to contribute to the document's preparation:
User 1 for Section 1
User 2 for Section 2
User 3 for Section 3.
Thanks,
Gunasekaran Sambandhan
SharePoint is only able to set permissions for discrete objects (lists, libraries, sites, documents, etc.) and is unable to segment an individual file.
We do two different things here at work to deal with this type of need. We either create a library for a given document and then have individual files for each section (this is also useful for collaborating on huge documents even if you don't need to restrict access per section), or we create individual libraries for the sections. The latter is the better way to go for security, because it reduces the risk that someone will create a doc and not set permissions.
FWIW: I did a quick check to see whether Word would allow me to apply DRM to sections of a document; the answer is no.
Cheers,
Reeves
SharePoint has no clue what is actually in the document (other than indexing for search).
To do what you want, you would have to break the document up into three different documents and use item-level permissions on each, or put each part in a separate document library that already has permission levels set to correspond to the user(s) you want to be able to contribute to that part.
Hi,
SharePoint limits all collaboration and security features to the document level.
A solution we have used with some success is to have three document libraries, with one user given access to each.
A Quick Part is present in the documents in these libraries, which pulls information from a document library column.
A fourth document library has three lookup columns pulling information from the above three document library columns.
These can be linked into a single document using Quick Parts again.
Kind regards,
