Does Office 365 image search work? If so, how?

Does Office 365 image search work? If so, how? - sharepoint

According to Microsoft ("Image Analysis" in https://techcommunity.microsoft.com/t5/Microsoft-SharePoint-Blog/Enrich-your-SharePoint-Content-with-Intelligence-and-Automation/ba-p/194174, from May 21, 2018), we should be able to search for text within images.
Is this working for you/anyone? If so, I would like to know what you had to do to get it to work.
I have a SharePoint modern team site with PNG images that contain clearly readable text...but search will not find anything. I have requested re-indexing.
I have had a Microsoft Support request (#10638094) open since June 27 with this question/issue, and no one--even after escalation--has been able to answer it.
Based on the article above, it appears that "MediaService" column(s) should be added to the library to support this; however, I can find no such columns in the environment (using PnP export to review).
Naomi Moneypenny and Kathrine Hammervold highlighted this functionality at Ignite 2017 (https://channel9.msdn.com/Events/Ignite/Microsoft-Ignite-Orlando-2017/BRK2181, about 27:00), but it doesn't seem to be available/working (at least not for me).
August 24: So, after research, digging yet further, I have an escalated support ticket at Microsoft (#10638094, unsolved) and there are conversations at https://techcommunity.microsoft.com/t5/Intelligent-Search-Discovery/Search-for-words-in-your-images-in-Office-365/ba-p/135703, https://techcommunity.microsoft.com/t5/Microsoft-SharePoint-Blog/Enrich-your-SharePoint-Content-with-Intelligence-and-Automation/bc-p/236625, and Does Office 365 image search work? If so, how?. I have yet to hear of this functionality working for anyone. I will keep digging, and I will certainly post if I hear anything. J

After some digging, from official it seems already released at the end of 2017. However there is no any related doc or official guide to this Text in image search function.
The 2 way i can think of perform text in image search.
Perform OCR yourself on the image before uploading the image and embed the text in image metadata.
Use support image type like IIRC and TIF that image are recognized.
In your case, you can upload the image and have another column that contains text and apply metadata to the image in a list/ library column.
OneDrive in another hand also has this function. For example, search for things like "cat" and it * should* pull up most pictures you have of cats. Its more likely using tag as label for the image instead of reading the picture it self.
Also, i believe OneNote has its index recognizable text and handwriting. Maybe this can point you to the right directions.
*Microsoft Azure's computer Vision offer service to recognized text in image. Maybe this can help.

"Is this working for you/anyone?" Yes, I responded to this post elsewhere and see it posted here, as well. Unfortunately, I cannot tell you HOW to get it to work or to verify that it is correctly configured. I can only suggest a test for you to see if it is working for you, as it works for me. I have not tested every way in which it could or should work. I have only discovered it working with PNGs I inserted into Wiki Pages in SharePoint Online. Those PNGs are generated using Snag-It to take Screen Captures and I do not see where Snag-It would be doing any OCR on the image to embed anything, etc. OCR is not even in the Snag-It help file, so I believe the PNG files are just simple PNGs. I insert them into the SharePoint Wiki page, which uploads them to the Site Assets library. And, when I search for a word in the image, the image is returned as a result - not the Wiki page. So, suggest you try a simple test of just inserting a PNG with text in it into a Wiki Page and give the index a bit of time to run to see if it works for you.

It seems like the functionality has matured recently. I have been testing it more thoroughly, and I have documented the results in my blog at http://www.collaboration-foundry.com/SharePointImageAnalysis.
Bottom line: It works for me in OneDrive and SharePoint (modern and classis), but I've only seen it work on the out-of-the-box Document content type--which limits custom solutions somewhat.
It's cool functionality when it works. Looking forward to seeing Microsoft build on this.
John

Related

Google Docs: Table of content page numbers

we are currently building an application on the google cloud platform, which generates reports in Google Doc. For them, it is really important to have a table of content ... with page numbers. I know this is a feature request since a few years and there are add-ons (Paragraph Styles +, which didn't work for us) that provide this solution, butt we are considering to build this ourselves. if anybody has a suggestion on how we could start with this, it would be a great help!
thanks,

Best bet is to file a feature request on the product forums.
Currently the only way to do that level of manipulation of a doc to provide a custom TOC is to use Apps Script. It provides access to the document structure sufficient enough to build and insert a basic table of contents, but I'm not sure there's enough to do paging correctly (unless you force a page break on ever page...) There's no method to answer the question of "what page is this element on?"
Hacks like writing to a DOCX and converting don't work because TOCs are recognized for what they and show up without page numbers.
Of course you could write a DOCX or PDF with the TOC as you'd like and upload as a blob rather than as a Google Doc. They can still be viewed in Drive and such.

Google Search by Image API?

for my job, I'm looking into an idea in which people would use Google Search by Image and use any celebrity photo they find. Google would return the results and then on our end, a there'd be a database of professionals showing how to get that specific look.
I'm assuming this is extremely unlikely to do, based on that users could use ANY photo.
So, is there a way that I could have about 100 or so celebrity photos that Google Image results could compare to and then choose the one that is closest.
Basically:
Drag drop photo of Britney Spears
Google searches with that image
Google's results compare the top images with our 100, and selects the closest match.
User gets to see video of how to get Britney Spears look.
I'm not a programmer, but looking for some API or Search by Image extension that could make this remotely possible for the programmers here at my job. Does something like that (a search by image api) exist? The best I could find was just the support page, which is hardly of any help: http://support.google.com/images/bin/answer.py?hl=en&p=searchbyimagepage&answer=1325808

You can easily search by an existing image by inserting this into your address bar:
https://www.google.com/searchbyimage?site=search&sa=X&image_url=YOUR_IMAGE_URL
Example:
https://www.google.com/searchbyimage?site=search&sa=X&image_url=http://cdn.sstatic.net/Sites/stackoverflow/company/img/logos/so/so-icon.png

Sorry to say, but the Google image API is deprecated:
Important: The Google Image Search API has been officially deprecated as of May 26, 2011. It will continue to work as per our deprecation policy, but the number of requests you may make per day may be limited.
Quite sure there are some alternatives (http://www.tineye.com/ and http://mrisa.mage.me.uk)
Update (2013): There is now Google Custom Search which allows image searches.

These answers are quite obsolete, but the question comes up in searches. So, the Google Vision API has the "web detection" feature that does a reverse image search. First 1000 requests per month are free, $3.50/1000 afterwards.

I think Google Web Detection could be a solution for you. Google moved it permanently from Image search

You can do it via www.images.google.com but only from a browser (lets you upload your own image and compares it to similar).
I'm working on doing it from code (not from browser).

I had the same problem and came up with two solutions:
There are a number of APIs that give reverse image search results nowadays. The ones I used are https://reverseimageapi.com and TinEye.com.
As the selected answer mentions, you can easily scrape this information but will almost certainly need rotating proxies to prevent being banned by the search engine. There are plenty of proxy rotation services (Zyte, Oxylabs, ScrapingBee, etc.) to make you life easier.
I ended up going with option 1 due to the upkeep of scraping search engines and elements changing / breaking.

Google docs viewer url parameters

Is there any sort of documentation on exactly what parameters you can put in the url of Google viewer?
Originally, I thought it was just url,embedded,chrome, but I've recently come accross other funny ones like a,pagenumber, and a few others for authentication etc.
Any clues?

One I know is "chrome"
If you've got https://docs.google.com/viewer?........;chrome=true
then you see a fairly heavy UI version of that doc, however with "chrome=false" you get a compact version.
But indeed, I'd like a complete list myself!

I know this question is very old and perhaps you already solved your issue, but for anyone on the internet who might be looking for an answer...
I have been looking for this recently, following a guide I found on GitHub Gist
https://gist.github.com/tzmartin/1cf85dc3d975f94cfddc04bc0dd399be
More specifically, the option to embed a certain page of pdf using
<iframe src="https://docs.google.com/viewer?srcid=[put your file id here]&pid=explorer&efh=false&a=v&chrome=false&embedded=true" width="580px" height="480px"></iframe>
The best I could fing was this article (I suppose from a long time now)
https://weekly-geekly.github.io/articles/111647/index.html
HOWEVER, I tried modifying the attributes and the result was simply a redirect to
https://drive.google.com/file/d/[ID]/edit
https://drive.google.com/file/d/[ID]/preview or
https://drive.google.com/file/d/[ID]/view
AS OF MAY 2020, THIS SOLUTION PROBABLY DOESN'T WORK

I'm also on a quest to discover some of the parameters of the viewer.
the "chrome" parameter doesn't seem to do anything, though. Is this
supposed to be the same as embedded=true?
Parameters I know of:
url= (obviously)
embedded= (obviously)
hl= set language of UI (tooltips)
#:0.page.1 = jump to page 2 (page 1 is numbered 0) - this is unreliable and often requires a refresh after the first load,
defeating the purpose.
That said, when I use the Google Docs viewer on my site, "fit page to
screen" is the default view without any parameters. So maybe I'm
misunderstanding your question.
Source: For convenience, this is a full quote of the sole answer (it is from user k3david) to the crosspost of this question #Doc has posted to the Google support forum in 2011.

You can pass q=whatever to pass a search query to the viewer.

What web photo gallery software meets all my pernickety requirements?

I have a collection of photographs (about 30,000) which I'd like to put online. I've tried doing this before, over the years, with static image galleries, applications such as Gallery2, and self-rolled scripts. None have worked that well, as my requirements are fiddly, but it still seems like this should be a solved problem.
My photos are currently organised into folders named YYYY-MM-DD short album title, using Digikam.
I need a system that:
Is Free software, is essentially feature-complete, and has an active developer community.
Allows new photos and albums to be added and updated automatically with little more manual intervention than rsyncing the source directory on my computer to the web server, and rescanning.
Allows visitors to leave comments
Allows re-captcha or equivalant spam filtering and bulk moderation of these comments.
Reads tags from the IPTC Keywords field.
If it finds a tag named "friends", requires the user to enter a password to view.
If it finds a tag named "family", requires the user to enter a different password to view.
If it finds a tag named "private", does not display the photo at all, or even better, does not upload it to the live web server.
Reads descriptions from the IPTC Caption field.
Creates sane permalinks, e.g. http://example.com/2009/03/28/shortalbumtitle/IMG_0001.jpg
I acknowledge that I may be asking for something that doesn't exist, but I hope it does.
I acknowledge that answers may be something like "use Django and code the bits that don't already exist yourself", in which case do you have any tips? :)
Thanks.

Use Django and code the bits that don't already exist yourself.
Seriously. I was going to write that and was tempted not to when I saw you'd written it yourself, but it really does make the most sense if you have any familiarity with it!
I'd start with django-photologue 2. Get a basic gallery with tagging and comments working. You'll need a couple of pl's optional dependencies.
Then I'd write a custom import wrapper that allows you to rsync to a dir and update your library.
Comments are handled internally (through photologue, I think) but if not, there are plenty of comment apps that "just work". There is a recaptcha script that works as just another form field.
PIL can read IPTC
The URL structure is up to you =)

I'm finally getting around to doing this. I'm using a local python script to extract image metadata (tags, captions and timestamp) using pyexiv2, then rotate the image according to its EXIF orientation tag if appropriate, using PIL, and export a hierarchy of files to a temporary directory, where rsync uploads it to my host, and a remote python script (actually a Django app) imports the metadata into a Django DB.

Creating PDF Invoices - Are there any templating solutions? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
Improve this question
Our company is looking to integrate invoices into a new system we are developing.
We require a solution to create a layout of the invoice and then convert to pdf.
We have considered just laying out the invoice in html/css then converting to pdf.
We have also considered using SVG->PDf conversion.
Both of these solutions integrate well into our existing templating language used for our web application.
Historically we have been a Microsoft based business and used Crystal Reports for such a task but we are looking for an open source Linux solution for this project.
Does any one have any suggestions of an approach or technology we could use for such a task?

Try this... create a blank invoice with Word (or whatever you want) and save it as a PDF.
Then use a PDF library to modify the PDF (insert the text at particular coordinates). We do this in the Microsoft world and it is extremely easy.
The biggest benefit is that we can use our own tools to create and modify the template. If we want to add some static text, we just crank open Word, make the change and save it to a PDF file (that is being used as a template).
For Microsoft, we use iTextSharp which is actually a C# port of the original Java version of iText
Additionally...
You can use Adobe Acrobat to insert fields in the PDF (address, phone, invoice number, line item 1, line item 2, etc...) and then use iText/iTextSharp to populate these fields at run time.
This is, in more detail, what we do... and it is extremely easy.

The normal way is to install (La)TeX (probably already on the linux box) and run pdflatex to get the pdfs. You can also use Apache FOP, if you prefer xslt and xsl-fo.
If the number of invoices to create is low you might want to use open-office (directly or as a toolkit).
If you want high-precision positioning and low-level access, a low-level pdf library (I don't know if iTextSharp works with mono) might be what you want.
I would try out LaTeX first, because it allows you to get results with the least effort.

I've previously produced invoices by templating a PostScript file, and then using Ghostscript's ps2pdf to convert those into PDFs.

We use Reportlab with Python. If you look around there are a load of ready-made forms/invoices/etc.

There are several OSS reporting engines (Jasper Reports, Pentaho and BIRT to name three) that you could use in much the same way as you have historically used Crystal Reports. One of the other posters mentions ReportLab, which is an option if you're using Python or can embed a Python runtime in your application.

Probably the most flexible solution is to create XMLs with invoice data and then by using XSLTs transform the, into PDFs, HTMls, whatever...

It depends on your environment. If you have access to Java, you might look at iText (http://www.lowagie.com/iText/), a library that allows you to generate PDF files on the fly.

There are two steps, if i understood correctly:
1) Creation of PDF template with placeholders to populate data programmatically
2) Populating the PDF template programmatically during run time
For #1, OpenOffice allows creation of PDF templates, which can then be populated programmatically. It's good enough to create simple invoices that doesn't probably involve datagrid/table kind of stuff.
For #2, you already have the answers here - iText, iTextSharp.
Hope this helps!

I love wkhtmltopdf http://code.google.com/p/wkhtmltopdf/

Not sure what your goal is here, but there is an opensource php-library called fpdf, which also has an extension for taking a pre-made pdf as layout and then populate it with more content, generating a new PDF with that info.
However, I would go for a solution that you can integrate nicely into the plattform you're building, but I wouldn't go in a HTML->PDF solution since you won't have any clue about what would fit on a piece of paper regarding sizes in that kind of enviroment, meaning you won't know when you should split the content into two separate templates.

You might also try using XSL:FO. XSL:FO is a documented standard for describing page layout: http://www.w3.org/TR/xsl/#fo-section.
I've had success on two projects creating documents by creating an XML schema that defines the content of the "PDF". I then use the XSD tool (from Microsoft) to generate a class representing this document. I then map my data into that structure, serialize the populated class to XML, along with an XSL stylesheet that defines how that data should be mapped into FO, and pass it to an FO formatter. For formatters, I have use Alt-Soft's Xml2Pdf with success. There are a few others out there. There are some tools available to help create the XSL to FO stylesheet (i.e. stylusstudio and XmlSpy), but I recommend learning the FO constructs as the tools seem to produce bloated stylesheets. FO is comparable to HTML (where a P tag is a BLOCK tag in FO), but can be tricky. This nice thing about FO, is that some formatter support conversion to other formats, such as Word, HTML, etc.
Other options:
iTextSharp (C# port of iText). Just started reading about this. Open source and free. I don't think there is any "templating" supported with this, but I could be wrong about that.
SQL Server Reporting Services. Assuming your invoice data is in, or can be put in, a format that can be read by reporting services (SQL Server, Web Service, etc), define the layout in SSRS and then publish to reporting server. Use SSRS Web Services or query parameter execution to execute the report and have it output as PDF.

This html-2-pdf site may be a helpful starting point: http://maarten.lippmann.us/?p=101
A site a friend of mine built uses a script to churn HTML pages into printable PDFs, too - http://philambdaupsilon.org. Not sure on the exact details of it, but he is an SO user, and I'll send this question to him, too.

Unfortunately, the best system on the market (at present) is passing the HTML & CSS to a ColdFusion server and have that return the rendered PDF. So if money isn't a big concern, this is the quickest to deploy solution that'll render the best results.
I've tried very hard to get FPDF, TCPDF, the R&OS pdf class, and even CodeIgniter's recommendation to work, but nothing with stable output for anything beyond the most basic/bland HTML files.
Honestly, if the ColdFusion solution isn't viable, I'd use html2ps, and then ps2pdf to convert your files into a PDF.
(This is all assuming that you don't want to take the time and design each PDF using the native PDF-creator code in PHP. This is what systems like SugarCRM use. Though its very functional with stable results, the actual creation of each PDF-generator file is a most painful process)

We have used Jasper Reports before. It's not what you'd call user-friendly, but it will talk directly to your database.

html2pdf works very well. You can use this to generate both HTML and PDF reports from the same source.

I'm fiddling with Black Sheep Invoices right now, which is great at first but now I'm having trouble actually getting it to render the PDFs. Lots of installation difficulties--probably a lot easier on your own server but i'm up on a shared host with it. The HTML output and data management portions are well done though, which is something you won't get out of just creating a postscript template. I was hoping to find a reference to a library that has an active development team though (Black Sheep is not being updated at this time).

If you want browser perfect HTML converted to PDF then try commandlineprint
You'll need to install firefox on a linux distro, disable all firefox alerts and then run it through a virtual display. Check this thread for more details.
It's infuriating to get running well but does give you the best results for HTML to PDF conversion I've seen.

OK, a search of Google Code projects turned up Simple Invoices, which is awesome and well maintained.

I use TROFF for my invoices because of its extremely simple textual encoding. The logic is a few lines of Perl. Keeping it simple.

For a Ruby solution, try Prawn: http://prawn.majesticseacreature.com/

I use open office on the server and then generate the XML for the document (just unzip the document and hack away)

Some can use Dhek template editor to define area/placeholder for existing PDF, without altering existing document, and then populate it to generate final doc (e.g. with user values from a form): https://github.com/applicius/dhek .

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string