Add a custom font to Apple News

I have a font in both .otf and .ttf format. I'd like to use it in my Apple News article, but I keep getting this error: Error: Custom font (postscript name=CustomFontName) not available. I know the JSON is correct because it works if I use a standard font. I've included the font in the same folder as article.json, but beyond that I can't find any documentation on how to do this.

I see no mention of custom fonts in the Apple News Format guidelines. It states in the docs:
fontName: The PostScript name of the font to apply, such as GillSans-Bold. You
can reference any font by name. see iOSfonts.com for a list of available fonts.
and elsewhere:
Fonts News supports all iOS system fonts except San Francisco, which
was introduced with iOS 9.
This originally led me to believe that custom fonts aren't supported, but after experimenting with the format I found that my assumption was wrong.
If you copy the PostScript name of the font (see the info panel in OS X Font Book) and place the font file in the same folder as article.json, it works. If you're going to apply emphasis, make sure you include the relevant italic or bold variants as well.
The fonts I used were OpenType/TrueType fonts from Google Fonts. You can get the exact ones from that link. I included the Roman and Italic variants in my folder and referenced the font by calling it "IM_FELL_DW_Pica_Roman" within the Component Text Styles (i.e. paragraph styles). When I then used markdown to mark italic text, it automatically found the italic variant. I didn't need to reference the italic version separately (i.e. within inline text styles).
Here's my code:
{
"version": "1.1",
"identifier": "sketchyTech_Demo",
"title": "My First Article",
"language": "en",
"layout": {},
"components": [{
"role": "title",
"text": "My First Article",
"textStyle": "titleStyle",
"inlineTextStyles": [{
"rangeStart": 3,
"rangeLength": 5,
"textStyle": "redText"
}]
}, {
"role": "body",
"format": "markdown",
"text": "This is just over the minimum amount of _JSON_ required to create a valid article in Apple News Format. If you were to delete the dictionary enclosing this text, you'd be there.",
"textStyle": "bodyStyle"
}],
"componentTextStyles": {
"titleStyle": {
"textAlignment": "center",
"fontName": "HelveticaNeue-Bold",
"fontSize": 64,
"lineHeight": 74,
"textColor": "#000"
},
"bodyStyle": {
"textAlignment": "left",
"fontName": "IM_FELL_DW_Pica_Roman",
"fontSize": 18,
"lineHeight": 26,
"textColor": "#000"
}
},
"textStyles": {
"redText": {
"textColor": "#FF00007F"
}
}
}
(Note: When I tried to add bold, it silently fell back on italic because the bold variant wasn't available.)
One final point: the use of custom fonts appears to be undocumented, even though it works. Unless you've seen confirmation elsewhere, there's no guarantee that Apple will accept custom fonts when you upload.
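Since a missing variant fails silently or with the "not available" error, it can help to list every font the article references before bundling the files. The following is only a minimal sketch: the helper name collect_font_names is hypothetical, and it assumes fonts are referenced solely through "fontName" keys, as in the article above. It does not read PostScript names from the font files themselves.

```python
import json


def collect_font_names(node):
    """Recursively gather every "fontName" value in a parsed article.json."""
    names = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "fontName" and isinstance(value, str):
                names.add(value)
            else:
                names |= collect_font_names(value)
    elif isinstance(node, list):
        for item in node:
            names |= collect_font_names(item)
    return names


# A trimmed-down version of the article shown above.
article = json.loads("""
{
  "componentTextStyles": {
    "titleStyle": {"fontName": "HelveticaNeue-Bold", "fontSize": 64},
    "bodyStyle": {"fontName": "IM_FELL_DW_Pica_Roman", "fontSize": 18}
  }
}
""")

font_names = sorted(collect_font_names(article))
print(font_names)
```

Each name that is not an iOS system font then needs a matching font file next to article.json, plus the italic or bold variants if the text uses emphasis.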

There are some useful tips on how to get the actual name of your custom font in this answer; it isn't always what you'd expect. The same rules apply to the Apple News ecosystem.
Your font definition should look like this:
"fontName": "Raleway-ExtraLight"

Generating synonyms or similar words using BERT word embeddings

I want to generate synonyms or similar words using BERT word embeddings.
For later software integration it has to be done in Java, so I went for easy-bert
(https://github.com/robrua/easy-bert).
It appears I can get word embeddings this way:
try(Bert bert = Bert.load(new File("com/robrua/nlp/easy-bert/bert-uncased-L-12-H-768-A-12"))) {
float[][] embedding = bert.embedTokens("A sequence");
float[][][] embeddings = bert.embedTokens("Multiple", "Sequences");
}
Do you know how I could get similar words from these word embeddings?
Thanks for your help!
I developed a way to do this using Luminoso. I work for them so this is a bit of an ad, but it does exactly what you want it to do.
https://www.luminoso.com/search
Luminoso is really good at understanding conversational text like product reviews, product descriptions, survey results and trouble tickets. It doesn't require ANY kind of training or ontology building and will build a language model around your language. You feed the text for your pages into Luminoso and it will generate a set of synonyms for the concepts used in your text.
As an example project, I built a search over Amazon.com beauty products. I'll just copy a few of the automatically generated synonym sets, covering four concepts. There were 17851 synonyms generated from this dataset.
scent, rose-like, not sickeningly, not nauseating, not overwhelming, herb-y, no sweetness, cucumber-y, not too citrus-y, no gardenia, not lemony, pachouli, vanilla-like, fragarance, not spicy, flowerly, musk, perfume-like, floraly, not cloyingly => scent
recommend, recommende, advice, suggestion, highly recommend, suggest, recommeded, recommendation, recommend this product, reccommended, advise, suggest, indicated, suggestion, advice, agree, recommend, say, considering, mentioned => recommend
bottle, no sprayer, 8-oz, beaker, decanter, push-down, dispenser, pipet, pint, not the bottle, no dropper, keg, gallon, jug, pump-top, liter, half-full, decant, tumbler, vial => bottle
eczema, non-steroidal, ulcerative, dematitis, ecsema, Elidel, dermititis, inflammation, pityriasis, hydrocortizone, dyshidrotic, chickenpox, Stelatopia, perioral, rosacea, dry skin, nummular, ecxema, mild-moderate, ezcema => eczema
There were 800k products in this search index, so the results were large, but this also works on small datasets.
Besides the synonym format shown above, you can also place this directly into Elasticsearch and associate the synonyms for a specific page with that page.
This is a sample of an Elasticsearch index enhanced with the same technology. It's dialed up very high, so too many concepts are added, but it shows how well it finds relationships between concepts.
{"index": {"_index": "amzbeauty", "_type": "_doc", "_id": "130414089X"}}
{"title": "New Benefit Waterproof Automatic Eyeliner Pen - Black - BAD Gal Liner", "text": "Length : 13.5 cm\nColor: Black\n100% Brand new and unused.\nSmudge free.\nFine-tip. Easy to blend and smooth to apply\nCan make fine and bold eyeline with new texture and furnishing.\nProvide rich and consistant colour\nLongwearing and waterproof\nFregrance Free", "primary_concepts": ["not overpoweringly", "concoction", "equipped", "fine-tip", "water-resistant", "luxuriant", "make", "fixture", "☆", "not lengthen", "washable", "not too heady", "blendable", "doesn't collect", "shade", "niche", "supple", "smudge-proof", "sumptuous", "movable", "black", "over-apply", "quick", "silky", "colored", "sweatproof", "opacity", "accomodate", "fuchsia", "furnishes", "meld", "sturdily", "smear", "inch", "mid-back", "chin-length", "smudge", "alredy", "not cheaply", "long-wearing", "eyeline", "texture", "steady", "no-name", "audacious", "easy", "edgy", "is:A", "marketers", "greys", "decadent", "applicable", "Crease-free", "magenta", "free", "itIn", "stay-true", "racy", "application", "glides", "smooth", "sleek", "taupe", "grainy", "dark", "wealthy", "JP7506CF", "gray", "grayish", "width", "newness", "purfumes", "Lancme", "blackish", "easily", "doesn't smudge", "maroon", "blend", "convenient", "smoother", "Moschino", "long-wear", "mauve", "medium-length", "no raccoon", "revamp", "demure", "richly", "white", "brand", "offers", "lenght", "soft", "doesn't smear", "provide", "provides", "unusable", "eye-liner", "unopened", "straightforward", "silky-smooth", "uniting", "compactness", "bold", "fearless", "mix", "indulgent", "brash", "serviceable", "unmarked", "not musky", "constructed", "racoon", "smoothly", "sealant", "merged", "boldness", "reuse", "unused", "long", "Kors", "effortless", "luscious", "stain", "rich", "discard", "richness", "opulent", "short", "consistency", "fine", "sents", "newfound", "fade-resistant", "mixture", "hue", "sassy", "apply", "fragnance", "heathy", "adventurous", "not 
enthusiastic", "longwearing", "fregrance", "non-waterproof", "empty", "lashline", "simple", "newly", "you'r", "combined", "no musk", "mingle", "waterproof", "painless", "pinkish", "thickness", "clump-free", "gos", "consistant", "color", "smoothness", "name-brand", "new", "smudgeproof", "yaaay", "water-proof", "eyemakeup", "not instant", "spidery", "furnish", "tint", "product", "reapply", "not black", "no globs", "imitators", "blot", "cinch", "uncomplicated", "untouched", "length"], "related_concepts": ["eyeliner", "no goofs", "doesn't smear", "pen", "hundreds"]}
{"index": {"_index": "amzbeauty", "_type": "_doc", "_id": "130414643X"}}
{"title": "Goodskin Labs Eyliplex-2 Eye Life and Circle Reducer - 10ml", "text": "Eyliplex-2 is a dual solution that focuses on the problematic eye area. This breakthrough, 24-hour system from the scientists at good skin pharmacy visibly tightens eye areas while reducing dark circles. 0.34 oz. each. 64% of subjects reported younger looking eyes immediately and a 20% reduction in the appearance of dark circles in clinical studies.", "primary_concepts": ["coloration", "Laboratories", "oncology", "cornea", "undereye", "eye", "immediately", "☆", "teen", "dry-skin", "good", "eyelids", "puffiness", "behold", "research", "temperamental", "dermatological", "breakthrough", "study", "store", "nice", "lasik", "instantaneously", "teenaged", "multi", "rheostat", "dermatology", "chemist", "invisibly", "PhD", "pharmacy", "alredy", "not cheaply", "optional", "pharmacist", "Obagi-C", "topic", "supermarket", "reversible", "studies", "Younger", "medically", "report", "thermo", "tightness", "dual", "eliminate", "researcher", "Minimization", "cutaneous", "hydration", "O2", "taupe", "increase", "moisturization", "dark", "preliminary", "excellent", "Quad", "well", "appearance", "dusky", "quickly", "instantly", "CVS", "Dermal", "great", "revolutionary", "biologist", "epidermis", "blackish", "disclosed", "problem", "youngsters", "murky", "scientific", "teenager", "oz", "dark circles", "clinically", "emphasis", "absorption", "skin", "loosen", "intractable", "technological", "reduction", "clinician", "nutritional", "forthwith", "grocer", "scientifically", "swiftly", "examination", "state-of-the-art", "not acne prone", "zone", "decrease", "younger-looking", "excellently", "troublesome", "system", "radius", "tighten", "FDA", "decent", "noticeably", "WD-40", "clearer", "scientist", "saggy", "significantly", "improvement", "Teamine", "interchangeable", "visible", "visable", "no fine line", "shortly", "minimize", "survey", "problematic", "young", "glance", "racoon", "vicinity", "youthful", 
"exacerbated", "focal", "region", "groundbreaking", "reddish", "focus", "reduce", "increments", "nad", "fasten", "area", "soon", "complexion", "squinting", "look", "grocery", "eyliplex-2", "Eyliplex-2", "subsequently", "even-toned", "bothersome", "eyes", "mitigate", "markedly", "philosophy:you", "difficult", "darkish", "bluish", "satisfactory", "darken", "epidermal", "lessen", "appearence", "ocular", "ergonomically", "diminished", "progression", "purplish", "sun-damaged", "Cellex-C", "visibly", "diagnosis", "drugstore", "under-eye", "apothecary", ":-D", "terrific", "clinical", "oz.", "Endocrinology", "time-released", "Nouriva", "tight", "adolescent", "subject", "eyeballs", "sking", "Pro-Retinol", "aggravate", "younger", "shortcomings", "solution", "assess", "promptly", "teenage", "Kinetin", "24-hour", "Mart", "youth", "visibility", "scientists", "taut", "better", "eyesight", "no dark circles", "not reduce", "photoaging", "Pending"], "related_concepts": ["A22", "A82", "Amazon", "daytime", "HK", "nighttime", "smell", "dark circles", "purchased"]}
{"index": {"_index": "amzbeauty", "_type": "_doc", "_id": "1304146537"}}
Luminoso uses word embeddings from ConceptNet, which it also develops, and the technology goes above and beyond what ConceptNet gives you. I'm biased, but every time I've run data through it I'm amazed. It's not free, but it really works with absolutely zero pre-training on the data, and nothing is actually free.
A closely related task is lexical substitution, which is evaluated on the LS07 and LS14 benchmarks.
One paper achieved state-of-the-art results on those benchmarks using BERT.
You may be interested in reading it:
https://www.aclweb.org/anthology/P19-1328.pdf
The authors describe the approach as follows:
"[...] applies dropout to the target word’s embedding for partially masking
the word, allowing BERT to take balanced consideration of the target
word’s semantics and contexts for proposing substitute candidates, and
then validates the candidates based on their substitution’s influence
on the global contextualized representation of the sentence."
I don't know how to reproduce the same result because the implementation is not publicly available. But here is a hint: embedding dropout can be applied to generate substitute candidates.
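Independently of the paper's method, a common baseline for getting similar words from any set of embeddings is a nearest-neighbour search by cosine similarity. The sketch below is not the paper's approach and does not call easy-bert. The three-dimensional vectors are made-up toy values standing in for real embeddings, and the function names are hypothetical. Keep in mind that BERT token embeddings are contextual, so the vector for a word depends on the sentence it was embedded in.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, top_n=2):
    """Return the top_n other words ranked by cosine similarity to `word`."""
    target = vectors[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:top_n]]


# Toy vectors (assumption): real BERT vectors would have 768 dimensions.
toy_vectors = {
    "scent": [0.9, 0.1, 0.0],
    "perfume": [0.8, 0.2, 0.1],
    "bottle": [0.1, 0.9, 0.2],
    "jug": [0.0, 0.8, 0.3],
}

print(most_similar("scent", toy_vectors, top_n=1))
```

The same ranking logic ports directly to Java over the float arrays returned by the embedding call.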

Searching for terms with underscore doesn't return expected results

How can I find a document named "Hola-Mundo_Army.jpg" by searching for Army* (I always need the trailing asterisk)? If I search for Army*, I get zero results. I think the problem is the underscore before the word Army.
But if I search for Mundo_Army*, one result is found, as expected.
docs?api-version=2016-09-01&search=Mundo_Army* <--- 1 result OK
docs?api-version=2016-09-01&search=Army* <--- 0 results and it should find 1 result like the previous search. I always need to use the asterisk at the end.
Thank you!
This is the blob information that I have to search and find:
{
"#search.score": 1,
"content": "{\"azure_cdn\":\"http:\\/\\/dev-dr-documents.azureedge.net\\/localhost-hugo-docs-not-indexed\\/Hola-Mundo_Army.jpg\"}\n",
"source": "dr",
"title": "Hola-Mundo_Army.jpg",
"file_name": "Hola-Mundo_Army.jpg",
"file_type": "Image",
"year_created": "2017",
"client": "LALALA",
"brand": "LELELE",
"description": "HUGO_DEV-TUCUMAN",
"categories": "Clothing and Accessories",
"media": "Online media",
"tags": null,
"channel": "Case Study",
"azuresearch_skipcontent": "1",
"id": "1683",
"metadata_storage_content_type": "application/octet-stream",
"metadata_storage_size": 109,
"metadata_storage_last_modified": "2017-04-26T18:30:35Z",
"metadata_storage_content_md5": "o2yZWelvS/EAukoOhCuuKg==",
"metadata_storage_name": "Hola-Mundo_Army.json",
"metadata_content_encoding": "ISO-8859-1",
"metadata_content_type": "text/plain; charset=ISO-8859-1",
"metadata_language": "en"
}
The best way to troubleshoot cases like this is by using the Analyze API. It will help you understand how your documents and query terms are processed by the search engine. In your case, assuming you are not setting the analyzer property on the field you are searching against, the text Hola-Mundo_Army.jpg is broken down by the default analyzer into the following two terms: hola, mundo_army.jpg. These are the terms that are in your index. That's why, when you are searching for the prefix mundo_army*, the term mundo_army.jpg is matched. Prefix army* doesn't match anything in your index.
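To make the mismatch concrete, here is a rough simulation of the behaviour described above. The regular expression is only an approximation chosen to reproduce the two terms reported in this answer; it is not the real default analyzer, and the function names are hypothetical. The Analyze API remains the authoritative way to see the actual terms.

```python
import re


def approximate_tokens(text):
    """Lowercase the text and split it, keeping '_' and '.' inside a term.

    Rough stand-in for the default analyzer (assumption), not the real thing.
    """
    return re.findall(r"[a-z0-9]+(?:[._][a-z0-9]+)*", text.lower())


def prefix_matches(query, tokens):
    """Return the indexed terms that start with the prefix in `query`."""
    prefix = query.lower().rstrip("*")
    return [token for token in tokens if token.startswith(prefix)]


tokens = approximate_tokens("Hola-Mundo_Army.jpg")
print(tokens)
print(prefix_matches("Mundo_Army*", tokens))
print(prefix_matches("Army*", tokens))
```

Because a prefix query compares against the start of each indexed term, "army" never matches: it only occurs in the middle of the term mundo_army.jpg.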
You can learn more about the default behavior of the search engine and how to customize it from this article: How full text search works in Azure Search

How to position 'Signature tab' using Docusign such that it does not overlap with 'Anchor String'

1) I am currently using Anchor Tagging in my application.
2) The tab definition I am using is as follows
"tabs": {
"signHereTabs": [{
"anchorString": "Please Sign Here:",
"anchorXOffset": "1",
"anchorYOffset": "0",
"anchorIgnoreIfNotPresent": "false",
"anchorUnits": "inches"
}]
}
3) Given that 'anchorXOffset' is always computed from starting point of the 'Anchor string', I am currently facing an issue in which the Anchor string is getting overlapped by the signature tab.
4) This means that depending on the 'font size' of my 'anchor string' the 'signature tag' may or may not overlap the 'anchor string'
5) QUESTION: Is there any way to have 'anchorXOffset' computed from the end point of the 'anchor string'?
If not, is there any way to place the 'signature tab' dynamically with respect to the 'anchor string', so that the font size of the 'anchor string' does not affect the positioning of the 'signature tab' and the 'anchor string' is not overlapped by it?
DocuSign doesn't know the layout or font sizes of the source documents, so the answer to your question is no.
What I'd suggest is that you place the anchor string directly adjacent to the label for the tag. That way if the font size of the label string is changed, the anchor string will be re-positioned too.
I realize that since the anchor string is invisible to the casual source document owner, they might mess up the anchor string when updating the document.
Another solution is to first make the source documents using PDF Form fields. The form fields can then be converted to tags. But using Acrobat or similar to create/manage the source documents is obviously more difficult and expensive than using Word or similar.

How do I find all exact matches within a block of text in Elasticsearch?

I've got an index of hundreds of book titles in Elasticsearch, with documents like:
{"_id": 123, "title": "The Diamond Age", ...}
And I've got a block of freeform text entered by a user. The block of text could contain a number of book titles throughout it, with varying capitalization.
I'd like to find all the book titles in the block of text, so I can link to the specific book pages.
Any idea how I can do this? I've been looking around for exact phrase matches in blocks of text, with no luck.
You need to index the title field as not_analyzed or with the keyword analyzer.
This tells Elasticsearch not to process the field's value at index time, so the stored term is the whole title exactly as written, which lets you run exact-match searches against it.
I would suggest keeping an analyzed version as well as a not_analyzed version, so that you can do both exact searches and analyzed searches. Your mappings would look like this; here I assume the type name is movies.
"mappings":{
"movies":{
"properties":{
"title":{
"type": "string",
"fields":{
"row":{
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
This will give you two fields: title, which contains the analyzed title, and title.row, which contains the exact value indexed with absolutely no processing.
title.row would match if you entered an exact
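As a minimal sketch of how the unprocessed subfield could be queried, the snippet below builds a term query against title.row from the mapping above. The helper name exact_title_query is hypothetical, and the snippet only constructs the request body; it does not send it to a cluster.

```python
import json


def exact_title_query(title):
    """Build a term query that matches title.row only on the exact value."""
    return {"query": {"term": {"title.row": title}}}


body = exact_title_query("The Diamond Age")
print(json.dumps(body, indent=2))
```

A term query is not analyzed, so it matches only when the whole stored value is identical, including capitalization. "the diamond age" would not match a document indexed as "The Diamond Age".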

Signature tags within pdf?

I am evaluating Docusign API to automate document signing process.
I see that we need to add Tabs and Anchor tags and provide the X,Y coordinates/offsets to place the signatures. Is there an easier way to do this? I was wondering if I can embed this information within my document so that the recipient can see it while signing.
Really appreciate any advice.
Thanks
N
With the DocuSign API you have two main methods of positioning your Stick-eTabs. One method is through Absolute positioning, where you use X and Y coordinates to place your tabs at specific locations on the document(s). The other method is through Relative or Anchor Based positioning, where tab placement is based on actual document content.
For instance, you could use Absolute positioning to place a signature tab at a location 200 pixels to the right, and 100 pixels down from the top left of the document using the following (partial) JSON body:
"tabs": {
"signHereTabs": [
{
"xPosition": "200",
"yPosition": "100",
"documentId": "1",
"pageNumber": "1"
}
]
}
On the other hand, if you want to use Relative positioning, you can place any tab at a location based on document content. For instance, if you had the text "Please Sign Here" somewhere in your document, you could place any tab right on or near this text very easily. You could place a signature tab 1 inch to the right, or an initial tab 5 pixels to the left and 10 pixels down, or a date tab 1 cm up and 2 cm to the right, for example. To do this you could use the following JSON to define your tab(s):
"tabs": {
"signHereTabs": [
{
"anchorString": "Please Sign Here:",
"anchorXOffset": "1",
"anchorYOffset": "0",
"anchorIgnoreIfNotPresent": "false",
"anchorUnits": "inches"
}
]
}
The above example would place a signature tab 1 inch to the right and at the same height as the text "Please Sign Here". One common approach that many developers take here is to embed content into the documents themselves such as the string \s1 for example. They additionally set the font color to the same color as the background where the string is placed (usually white) and this in turn makes the string \s1 invisible so that the recipient only sees the DocuSign tab at this location. For more information on this and absolute vs. relative tagging please read the Tab Positioning page on the Stick-eTabs features section.
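The hidden-anchor approach can be sketched as a small helper that builds the tab definition for an embedded string such as \s1. This is only an illustration: the function name anchored_sign_here_tab is hypothetical, the property names are the ones used in the JSON above, and the snippet builds the partial request body without calling the DocuSign API.

```python
import json


def anchored_sign_here_tab(anchor_string, x_offset="0", y_offset="0", units="inches"):
    """Build one signHereTabs entry positioned relative to an anchor string."""
    return {
        "anchorString": anchor_string,
        "anchorXOffset": x_offset,
        "anchorYOffset": y_offset,
        "anchorIgnoreIfNotPresent": "false",
        "anchorUnits": units,
    }


# "\\s1" is the literal text \s1 embedded (in white) in the document.
tabs = {"signHereTabs": [anchored_sign_here_tab("\\s1")]}
print(json.dumps({"tabs": tabs}, indent=2))
```

With zero offsets the tab sits right on the hidden string, so moving the string in the source document moves the signature tab with it.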
