How to create a custom Font Awesome 5 SVG definition?

I want to add a new custom SVG icon to Font Awesome 5 and assume that I need to create a definition for it in the JavaScript file.
var icons = {
"address-book": [448, 512, [], "f2b9", "M436 160c6.627 0 12-5.373 12-12v-40c0-6.627-5.373-12-12-12h-20V48c0-26.51-21.49-48-48-48H48C21.49 0 0 21.49 0 48v416c0 26.51 21.49 48 48 48h320c26.51 0 48-21.49 48-48v-48h20c6.627 0 12-5.373 12-12v-40c0-6.627-5.373-12-12-12h-20v-64h20c6.627 0 12-5.373 12-12v-40c0-6.627-5.373-12-12-12h-20v-64h20zm-228-32c44.183 0 80 35.817 80 80s-35.817 80-80 80-80-35.817-80-80 35.817-80 80-80zm128 232c0 13.255-10.745 24-24 24H104c-13.255 0-24-10.745-24-24v-18.523c0-22.026 14.99-41.225 36.358-46.567l35.657-8.914c29.101 20.932 74.509 26.945 111.97 0l35.657 8.914C321.01 300.252 336 319.452 336 341.477V360z"],
In the example code for "address-book", what does each of the items represent?
448 = width?
512 = height?
[] = ?
f2b9 = ?
Last item = SVG path?

The whole object you would feed to fontawesome.library.add(…iconDefinitions) actually looks like this:
{
  "prefix": "fa",      // probably better to use a custom one
  "iconName": "user",
  "icon": [
    512,               // viewBox width
    512,               // viewBox height
    [],                // ligatures
    "f007",            // unicode codepoint - private use area
    "M962…-112z"       // path
  ]
}
I can't point to documentation supporting this interpretation, but this is how the fields are used in the source code.
The symbol viewBox is always rendered as "0 0 <width> <height>", so no x/y offsets are possible.
I haven't found any JS code that actually renders ligatures, or icons that define them, so I am not sure what the contents of that array would be. As they would be used in a search order, it is probably meant to take a list of icon names or codepoints. If this is related to the desktop ligature support, it is moot anyway outside DTP applications with the .otf files installed.
Unicode codepoints are used for the CSS pseudo-elements method and should be unique. All Font Awesome codepoints seem to be at U+F000 and above, so the range U+E000…U+EFFF looks like a good choice for custom entries.
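Putting this together, a minimal sketch of registering such a custom definition through the JS API could look like the following. The package name, prefix, and icon name are assumptions for illustration; adjust them to however you load Font Awesome:
// Untested sketch: register a custom icon definition and render it.
import { library, dom } from '@fortawesome/fontawesome-svg-core';

const faMyIcon = {
  prefix: 'fac',            // custom prefix, so it cannot clash with the built-in "fa" icons
  iconName: 'my-icon',      // hypothetical icon name
  icon: [
    448,                    // viewBox width
    512,                    // viewBox height
    [],                     // ligatures
    'e001',                 // codepoint from the private use area below U+F000
    'M436 160c6.627 0 …'    // SVG path data
  ]
};

library.add(faMyIcon);
dom.watch();                // replaces <i class="fac fa-my-icon"></i> elements with inline SVG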

Related

Trying to get more than 4 symbols from one key in my custom xkb keyboard layout

I have no problem getting four symbols per key using the right Alt key, but I run into trouble when trying to get the level-5 switch working, as well as when enabling a key to toggle between two separate layouts. I may not be understanding the concepts correctly, but these were what I interpreted to be two different ways of getting more than four symbols per key when reading this article on the Arch Wiki.
This is a sample layout excerpt of what I am using, annotated with what works and what does not. These are random symbols for the D key, not necessarily my end goal, but just placeholders for 16 symbols spread across two toggled layouts:
// /usr/share/X11/xkb/symbols/ocd
partial modifier_keys alphanumeric_keys
xkb_symbols "ocd" {
include "us(basic)"
// do I actually need this?
name[Group1] = "English (ocd)";
key.type[Group1] = "ONE_LEVEL";
// hide extra European keyboard key from GNOME graphical layout depiction
replace key <LSGT> {[ Shift_L ]}; // this works
// try to toggle between layouts for one key
replace key <RWIN> {[ ISO_Next_Group ]}; // does NOT work
key.type[Group1] = "EIGHT_LEVEL_SEMIALPHABETIC";
// First row
key <AE07> {[ 7, ampersand, paragraph, section ]}; // ¶ §
// Third row (2 layouts to toggle, w/ 8 symbols in layout 1)
key <AC03> {
[ d, D, grave, asciitilde,
exclam, at, numbersign, dollar
],
[ Greek_delta, Greek_DELTA, percent, asciicircum,
ampersand, asterisk, parenleft, parenright
]
};
// do these additional includes have to be at the END?
include "level3(ralt_switch)" // this works
include "level5(rctrl_switch)" // does NOT work
};
Is four symbols the most I can hope for, or am I on the right track?

Fails to parse Hebrew text from PDF using iText 7 with .NET

I am trying to read a PDF file with several pages, using iText 7 on .NET Core 2.1.
The following is my code:
Rectangle rect = new Rectangle(0, 0, 1100, 1100);
LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
inputStr = PdfTextExtractor.GetTextFromPage(pdfDocument.GetPage(i), strategy);
inputStr gets the following string:
"\u0011\v\u000e\u0012\u0011\v\f)(*).=*%'\f*).5?5.5*.\a \u0011\u0002\u001b\u0001!\u0016\u0012\u001a!\u0001\u0015\u001a \u0014\n\u0015\u0017\u0001(\u001b)\u0001)\u0016\u001c*\u0012\u0001\u001d\u001a \u0016* \u0015\u0001\u0017\u0016\u001b\u001a(\n,\u0002>&\u00...
and in the Text Visualizer it looks like this:
)(*).=*%'*).5?5.5*. !!
())* * (
,>&2*06) 2.-=9 )=&,

2..*0.5<.?
.110
)<1,3
  2.3*1>?)10/6
 (& >(*,1=0>>*1?

  2.63)&*,..*0.5
  206)&13'?*9*<
  *-5=0>
?*&..,?)..*0.5
It looks like I am unable to resolve the encoding, or there is a specific custom encoding at the PDF level that I cannot read/parse.
Looking at the Document Properties, under Fonts it says the following: [screenshot of the font list]
Any ideas how I can parse the document correctly?
Thank you
Yaniv
Analysis of the shared files
file1_copyPasteWorks.pdf
The font definitions here have an invalid ToUnicode entry:
/ToUnicode/Identity-H
The ToUnicode value is specified as
A stream containing a CMap file that maps character codes to Unicode values
(ISO 32000-2, Table 119 — Entries in a Type 0 font dictionary)
Identity-H is a name, not a stream.
Nonetheless, Adobe Reader interprets this name and, apparently for any name starting with Identity-, assumes the text encoding for the font to be UCS-2 (essentially UTF-16). As this indeed is the case for the character codes used in the document, copy & paste works, even if for the wrong reasons. (Without this ToUnicode value, Adobe Reader also returns nonsense.)
iText 7, on the other hand, first follows the Encoding value when mapping to Unicode, with unexpected results.
Thus, in this case Adobe Reader arrives at a better result by interpreting meaning into an invalid piece of data.
file2_copyPasteFails.pdf
The font definitions here have valid but incomplete ToUnicode maps which only contain entries for the Western European characters used, but not for the Hebrew ones. They don't have Encoding entries.
Both Adobe Reader and iText 7 here trust the ToUnicode map and, therefore, cannot map the Hebrew glyphs.
How to parse
file1_copyPasteWorks.pdf
In the case of this file the "problem" is that iText 7 applies the Encoding map. Thus, for decoding the text, one can temporarily replace the Encoding map with an identity map:
for (int i = 1; i <= pdfDocument.GetNumberOfPages(); i++)
{
    PdfPage page = pdfDocument.GetPage(i);
    // Temporarily override each font's Encoding with the identity map
    PdfDictionary fontResources = page.GetResources().GetResource(PdfName.Font);
    foreach (PdfObject font in fontResources.Values(true))
    {
        if (font is PdfDictionary fontDict)
            fontDict.Put(PdfName.Encoding, PdfName.IdentityH);
    }
    string output = PdfTextExtractor.GetTextFromPage(page);
    // ... process output ...
}
This code shows the Hebrew characters for your file 1.
file2_copyPasteFails.pdf
Here I don't have a quick work-around. You may want to analyze multiple PDFs of that kind. If they all encode the Hebrew characters the same way, you can create your own ToUnicode map from that and inject it into the fonts like above.
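A rough, untested sketch of what such an injection could look like, reusing the font resources loop from above; the code range and the Unicode start value are placeholders you would have to replace with the mappings you derive:
// Hypothetical sketch: build a minimal ToUnicode CMap for the mappings you
// determined and put it into each font dictionary before extracting the text.
string cmap =
    "/CIDInit /ProcSet findresource begin\n" +
    "12 dict begin\n" +
    "begincmap\n" +
    "/CMapName /Custom-ToUnicode def\n" +
    "/CMapType 2 def\n" +
    "1 begincodespacerange <0000> <FFFF> endcodespacerange\n" +
    "1 beginbfrange\n" +
    "<0021> <003A> <05D0>\n" +  // placeholder: map these codes onto Hebrew letters starting at U+05D0
    "endbfrange\n" +
    "endcmap\n" +
    "CMapName currentdict /CMap defineresource pop\n" +
    "end end";

foreach (PdfObject font in fontResources.Values(true))
{
    if (font is PdfDictionary fontDict)
        fontDict.Put(PdfName.ToUnicode,
            new PdfStream(System.Text.Encoding.ASCII.GetBytes(cmap)));
}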

Extract Arabic text using iTextSharp gets numbers only?

I try to extract Arabic text from a PDF file, but it extracts only numbers and the result looks like this:
: 7234569 1439/08/07 : : 1 2375173941 14 08 6 39266 1050672243 2280 30 400 24 415 24 15 720 30 402 30 499 14 07 1 610117038085 0 1069508677 0 :
My code:
public static string GetTextFromAllPages(string pdfPath)
{
    PdfReader reader = new PdfReader(pdfPath);
    string result = null;
    //for (int i = 1; i <= reader.NumberOfPages; i++)
    result = PdfTextExtractor.GetTextFromPage(reader, 1, new LocationTextExtractionStrategy());
    return result;
}
Any help, please?
The embedded font for Arabic glyphs in your PDF contains this ToUnicode CMap:
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
endcmap
CMapName currentdict /CMap defineresource pop
end
end
According to ISO 32000-1, section 9.10.3 ToUnicode CMaps:
It shall use the beginbfchar, endbfchar, beginbfrange, and endbfrange operators to define the mapping from character codes to Unicode character sequences expressed in UTF-16BE encoding.
Unfortunately your CMap does not use these operators at all and, therefore, does not define any mappings to Unicode.
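For comparison, a ToUnicode CMap that does define mappings contains bfchar/bfrange sections between the codespacerange block and endcmap, for example (the character codes here are made up; <0660> is the first Arabic-Indic digit):
1 beginbfchar
<0003> <0020>
endbfchar
1 beginbfrange
<0010> <0019> <0660>
endbfrange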
Furthermore, the font has an Encoding of Identity-H and its descendant CIDFont has a ROS of Adobe-Identity-0, which means that character code, CID, and GID values are equal for a character but does not imply any mapping of them to Unicode.
Thus, the font is missing the information required for text extraction according to ISO 32000-1 section 9.10.2 Mapping Character Codes to Unicode Values.
(In such a situation text extractors can only guess, and such guesswork usually only works for a special type of documents the extractor is optimized for. You might want to try to enhance iText to be able to guess correctly in your case but that will require you to study the PDF specification, the iText text extraction code, and your sample files in detail.)
By the way, a good first test whether text extraction is feasible is to open the PDF in Adobe Reader and to copy and paste the text in question to an editor or word processor. If this does not work (and in the case at hand it does not work), chances are that the file does have incomplete or misleading information for text extraction (or none at all).

How to create a Chrome extension that will search for text in the source and alter formatting

I am new here... I am wondering if anyone could help point me in the right direction.
I am looking to create a Chrome extension that searches a page for a number of different strings (one example: "(410)" or "(1040)" without the quotes) and highlight these so they're easier to see.
To explain a little further why I need this: I frequently work out of a queue with other coworkers, and there are specific things I need to focus on but I can ignore the rest of the questions on the page. So it would be helpful if my particular items were highlighted.
Thank you!
Edit: an example of how the source code works:
<td class="col-question">28 (510). <span id="ctl00_PlaceHolderMain_ctl01_ContentCheckList_ctl28_Label1" title=" <p>
<td class="col-question">49 (1150). <span id="ctl00_PlaceHolderMain_ctl01_ContentCheckList_ctl49_Label1" title="<p>
etc etc etc... there are around 100 numbers in parentheses I would want highlighted, and probably another 100 that I wouldn't want highlighted.
Okay, to start off I will show you how to inject the code into the page(s) you want; we will get to selecting the correct numbers in a bit. I will be using jQuery throughout this example. It isn't strictly necessary, but I feel it may make things a bit easier.
First we declare a content script in our manifest as well as host permissions for the page you are injecting into:
"content_scripts": [
{
"matches": ["http://www.domain.com/page.html"],
"js": ["jquery.js","highlightNumbers.js"],
"css": ["highlight.css"]
}],
"permissions": ["http://www.domain.com/*"]
This will place our code in the page we are trying to change. Now, you said that there are about 100 different numbers you would want to highlight, and I will assume these are specific numbers that don't match any pattern, so the only way to select all of them is to make an explicit list of the numbers to highlight.
highlightNumbers.js
// This array will contain all of the numbers you want to highlight,
// in no particular order
var numberArray = [670, 710, 820, 1000 /* , ... */];
numberArray.forEach(function(v){
    // Without knowing exactly what the page looks like I will just show you
    // how to highlight just the numbers in question, but you could easily
    // similarly highlight surrounding text as well
    var num = "(" + v + ")";
    // Select the '<td>' that contains the number we are looking for
    // (quote the argument so the parentheses don't confuse the selector parser)
    var td = $('td.col-question:contains("' + num + '")');
    // Make sure that this number exists on the page
    if(td.length > 0){
        // Now that we have it we need to single out the number and wrap it
        var newHtml = td.html().replace(num, '<span class="highlight-num">' + num + '</span>');
        td.html(newHtml);
    }
    // Now instead of '(1000)' we have
    // '<span class="highlight-num">(1000)</span>'
    // We will color it in the css file
});
Now that we have singled out all of the numbers that are important, we need to color them. You can, of course, use whatever color you want, but for the sake of the example I will be using a bright green.
highlight.css
span.highlight-num {
    background-color: rgb(100, 255, 71);
}
This should color all of the numbers that you put in the array in the js file. Let me know if there are any problems with it as I can't exactly test it.
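To illustrate with the markup from your edit: assuming 510 is in numberArray, the script turns
<td class="col-question">28 (510). …
into
<td class="col-question">28 <span class="highlight-num">(510)</span>. …
and the CSS rule then gives the wrapped number the green background.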

String replacement in LaTeX

I'd like to know how to replace parts of a string in LaTeX. Specifically, I'm given a measurement (like 3pt, 10mm, etc.) and I'd like to remove the units of that measurement (so 3pt --> 3, 10mm --> 10, etc.).
The reason why I'd like a command to do this is in the following piece of code:
\newsavebox{\mybox}
\sbox{\mybox}{Hello World!}
\newlength{\myboxw}
\newlength{\myboxh}
\settowidth{\myboxw}{\usebox{\mybox}}
\settoheight{\myboxh}{\usebox{\mybox}}
\begin{picture}(\myboxw,\myboxh)
\end{picture}
Basically I create a savebox called mybox. I insert the words "Hello World" into mybox. I create new lengths called myboxw and myboxh. I then get the width and height of mybox and store them in myboxw/h. Then I set up a picture environment whose dimensions correspond to myboxw/h. The trouble is that myboxw returns something of the form "132.56pt", while the input to the picture environment has to be dimensionless: "\begin{picture}(132.56,132.56)".
So, I need a command which will strip the units of measurement from a string.
Thanks.
Use the following trick (\the produces characters of catcode 12, so the p and t that delimit \removedim's argument must also be given catcode 12):
{
\catcode`p=12 \catcode`t=12
\gdef\removedim#1pt{#1}
}
Then write:
\edef\myboxwnopt{\expandafter\removedim\the\myboxw}
\edef\myboxhnopt{\expandafter\removedim\the\myboxh}
\begin{picture}(\myboxwnopt,\myboxhnopt)
\end{picture}
Consider the xstring package at https://www.ctan.org/pkg/xstring.
The LaTeX kernel (latex.ltx) already provides \strip@pt, which you can use to strip the pt unit from a length. Additionally, there's no need to create a length for the width and/or height of a box; \wd<box> returns the width, while \ht<box> returns the height:
\documentclass{article}
\makeatletter
\let\stripdim\strip@pt % User interface for \strip@pt
\makeatother
\begin{document}
\newsavebox{\mybox}
\savebox{\mybox}{Hello World!}
\begin{picture}(\stripdim\wd\mybox,\stripdim\ht\mybox)
\put(0,0){Hello world}
\end{picture}
\end{document}
