panoramic text visualization with seadragon (or other)?

i'm working on a visualization of a text file with 100k lines, max 1k characters/line, as one large, navigable image.
similar to the bleak house example in blaise's ted talk demo of seadragon, but even simpler -- basically just the view from cat filename.txt, but with a view that's zoomed out so that the whole file is initially visible (each line fitting on the page width, without wordwrap) and can then be zoomed in on.
is this currently possible with seadragon? if not, any ideas on how i can attempt it?
(oh, and including hyperlinks would be great -- but i don't expect that'll be possible.)

Not sure you'll be able to go up to 100k lines but the OpenZoom SDK with the Flash Text Engine could get you pretty far.
Example
http://gasi.ch/go/openzoom-fte/
Source Code
http://github.com/openzoom/sdk/raw/be50b3f1062e68d88dcf20f412e6fdb9b7320ed3/examples/flex/flash-text-engine/src/FlashTextEngine.mxml
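For a sense of scale, here is a rough back-of-the-envelope sketch in Python of how large the rendered image from the question would be. The 8x16 pixel glyph cell and the 256 pixel tile size are assumptions made for illustration, not values from the question or from any particular tool. The level count uses the usual Deep Zoom pyramid rule, where the top level index is ceil(log2(max(width, height))).

```python
import math

def deep_zoom_estimate(lines, max_chars, cell_w=8, cell_h=16, tile=256):
    """Estimate image size, pyramid levels, and full-resolution tile count.

    cell_w, cell_h and tile are assumed values; adjust them for your font
    and tiling tool.
    """
    width = max_chars * cell_w           # pixels needed for the longest line
    height = lines * cell_h              # pixels needed for all lines
    max_level = math.ceil(math.log2(max(width, height)))
    levels = max_level + 1               # levels are numbered 0..max_level
    tiles_full = math.ceil(width / tile) * math.ceil(height / tile)
    return width, height, levels, tiles_full

width, height, levels, tiles = deep_zoom_estimate(100_000, 1_000)
print(f"{width} x {height} px, {levels} levels, {tiles} tiles at full resolution")
```

Under these assumptions the full-resolution layer is far too tall to hold as a single bitmap, which is why a tiled pyramid (or rendering text on demand, as in the OpenZoom example) is the practical route.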

A second vote for the answer to this question... I'd like to use seadragon or other technology to achieve a similar effect.

Related

how to create timelines that have multiple streams that merge and fork off each other?

I found this image / solution online. the source is included. it's a template that uses draw.io to create multiple timelines that look like merged code.
we have many projects that look like this. does anyone know how to do this programmatically in Excel?
The goal is to be able to feed data into a program that generates this timeline.
any help is awesome.

ok - so after reaching out to a couple of folks / reading online - I am going with an Excel line graph with really large markers
and so, I get the following type. I am going to play with the style and see if I can make the straight lines into curved lines
I can now overlay text as key markers.
thanks

How can I edit a DXF in node.js?

I'd like to make a custom lasered label from a user's input on a website. I have a template dxf file and I'd like to replace placeholder text with the user input. My problem is the dxf file format is very unreadable in its text format. Is there any way to make sense of the numeric data? If not are there any other formats (svg, etc) that would be easier to work with?
EDIT: The reason I've found it unreadable in terms of text is that the program (Solidworks) converted the text to curves. At this point I'm trying to figure out how to prevent that.

AutoDesk was nice enough to document DXF syntax in great detail. Spend a couple hours understanding the documentation from the link below, and I think you will find it quite easy to parse and edit using code.
To just replace some placeholder text, it should be just as simple as reading the DXF file into a string (a dxf file is no different than a txt file), performing a text replace operation and saving it back to file. Just make sure that your placeholder text is very unique and is not contained in any of the key words in the document below (otherwise your DXF file will get corrupted). Something like "PlaceHolderText" will do the trick.
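The replace step described above can be sketched as follows. This is in Python rather than node.js, purely for illustration; the same three steps (read, replace, write) carry over directly. The function name and the file names in the usage comment are hypothetical. The newline check is there because DXF is line-oriented, so a line break in the user's text would shift every following line.

```python
from pathlib import Path

def fill_placeholder(template_text, placeholder, user_text):
    """Return the DXF text with the placeholder swapped for user_text."""
    if placeholder not in template_text:
        raise ValueError("placeholder not found in template")
    if "\n" in user_text or "\r" in user_text:
        # A line break would corrupt the line-by-line DXF layout.
        raise ValueError("user text must be a single line")
    return template_text.replace(placeholder, user_text)

# Hypothetical usage with assumed file names:
# template = Path("label_template.dxf").read_text()
# Path("label_out.dxf").write_text(
#     fill_placeholder(template, "PlaceHolderText", "Hello"))
```

This only helps if the placeholder survives in the file as a text string; if the exporting program has converted the text to curves, there is nothing to search for.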
http://images.autodesk.com/adsk/files/autocad_2012_pdf_dxf-reference_enu.pdf
Edit: More Info
I do a lot of work with AutoDesk Inventor, which is in direct competition with SolidWorks, so they are effectively the same tool. We were faced with a similar problem of needing to place text onto sheet metal flat pattern DXFs that came out of Inventor in order to identify the part, but Inventor simply could not do it (see, exactly the same!). One of our developers had the idea to place a very precise geometry punch onto the flat pattern. After the DXF was generated, he wrote some code that parsed the DXF file and replaced the geometry with a text entity. More specifically, we used a triangle with each side length defined to something like the 7th decimal place. You can then use one of the vertices of the triangle to position the text, including rotation. This process would be automatic, so once you write the code with the help of the document above (which won't take long), it will just work. If your engraver can handle text the way you want it, I'd say this is a very good solution. We generate hundreds of parts every day using this code. Hope this helps.
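To make the parsing idea a bit more concrete, here is a minimal Python sketch (not the code from the answer, which isn't shown). It relies on the DXF text layout from the reference above: the file is a sequence of two-line pairs, a numeric group code followed by its value. For a TEXT entity, group 0 is the entity type, group 1 the string, groups 10 and 20 the insertion point, and group 50 the rotation. The function names are made up for this example, and the sketch ignores everything except TEXT entities.

```python
def parse_pairs(dxf_text):
    """Split DXF text into (group_code, value) pairs."""
    lines = dxf_text.splitlines()
    return [(int(lines[i]), lines[i + 1].strip())
            for i in range(0, len(lines) - 1, 2)]

def text_entities(pairs):
    """Collect the string, position, and rotation of each TEXT entity."""
    entities, current = [], None
    for code, value in pairs:
        if code == 0:
            # Group 0 starts a new entity (or section marker).
            if current is not None:
                entities.append(current)
            current = ({"text": None, "x": None, "y": None, "rotation": 0.0}
                       if value == "TEXT" else None)
        elif current is not None:
            if code == 1:
                current["text"] = value
            elif code == 10:
                current["x"] = float(value)
            elif code == 20:
                current["y"] = float(value)
            elif code == 50:
                current["rotation"] = float(value)
    if current is not None:
        entities.append(current)
    return entities

sample = "0\nSECTION\n2\nENTITIES\n0\nTEXT\n10\n1.5\n20\n2.0\n1\nPlaceHolderText\n50\n90.0\n0\nENDSEC\n0\nEOF\n"
print(text_entities(parse_pairs(sample)))
```

Finding a marker triangle would follow the same pattern, matching on the geometry entities and their vertex coordinates instead of on TEXT.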

add a duplicate (hidden) text layer to a pdf for extra searching

My problem:
I have a pdf with lots of roman characters with complex diacritical marks (e.g., ṣ, ś, ṝ, ǎ, etc.). To make it easier to search within the pdf, I would like to add an additional layer, much as one does with hocr, where the same text is present without the diacritics.
When using full-text search engines I can index multiple terms at the same position (vector) - I would like to achieve the same effect here.
I have read lots about adding a hocr layer to scanned images, but I really just want to duplicate the text layer, pass it through a script that strips the diacritics (straightforward enough) and then adds it back in as a hidden but searchable layer.
Anyone have any suggestions? (Solutions involving any platform, language, library or toolchain will be useful!)
Thanks :)
Edit: please let me know if the question is unclear.

Well I have a (slightly ugly and hackish) solution, so I thought I'd share it.
I'm using PDFMiner to extract the text, along with the co-ordinates. Then I'm using ReportLab to write the normalized versions of the text to a new pdf, in exactly the same position, as hidden text. To make the positions line up properly, I found I had to use exactly the same font, so I've used a combination of FontForge and MuPDF to extract the required font(s) from the original pdf.
Finally, having created the new pdf, I'm using pdftk to merge it with the original.
It works pretty well, but has the downside that copying text out of the pdf results in the normalized text being copied too. But this is acceptable for my present purposes, and I can't see any way around it. The pdf spec. doesn't really support my objective, and so I don't imagine I can do better than this hackish solution.
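For anyone wanting to try the same approach, the normalization step could look something like this in Python. It is a sketch using only the standard library, not the exact script used above: it decomposes each character and drops the combining marks.

```python
import unicodedata

def strip_diacritics(text):
    """Remove combining marks, e.g. turning 'ṣ' into 's'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_diacritics("ṣ ś ṝ ǎ"))
```

Letters that are not built from a base letter plus marks (for example "ø" or "ß") pass through unchanged, so those would need an explicit mapping if they occur in the text.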

I have written something similar to add searchable text by OCR'ing images and converting it to PDF in C#. I used QuickPDF from www.quickpdf.com to create hidden white text objects on top of the image and this worked reasonably well.
In your case QuickPDF would allow you to extract the text strings along with bounding boxes and font details. You could then normalize your text and create the invisible text objects using the existing font and position information and then save it out to a new file.
This would basically give you the same PDF as you have now and also give you both the original and normalised text as you are getting now.
QuickPDF is a commercial library. If your solution works well for you, then there is no use buying a commercial engine. The nice thing, though, is that it only requires one SDK, so it would be worth a look if you had more than a few PDFs to convert.

beamer includegraphics with screenshots

I'm using the LaTeX-Beamer class for making presentations. Every once in a while I need to include screenshots. Those graphics are pixel-based, of course. I use includegraphics like this:
\begin{figure}
\includegraphics[width= \paperwidth]{img/analyzer.png}
\end{figure}
or usually something like this:
\begin{figure}
\includegraphics[width= 0.8\linewidth]{img/analyzer.png}
\end{figure}
This leads to pretty bad readability of the contained text, so I'm asking for your best practices: How would you include screenshots containing text, considering that I will produce the output PDF with pdflatex?
EDIT: I suppose I'm looking for something like a 1:1 presentation of the image within beamer. However, [scale = 1.0] doesn't achieve what I'm looking for.

Your best bet is to scale the image outside of LaTeX for inclusion, and include it in a 1:1 ratio. The scaling done by graphics packages in LaTeX isn't going to be anywhere near as good as what other tools can do. LaTeX (TeX) has limited floating-point arithmetic capabilities, whereas an external tool can use sophisticated algorithms to get the scaling better.
Another option is to use only a part of the screenshot, the one you want to concentrate on.
Edit: If you can change the font size before taking the screenshot, that's another option—just increase the font size for the screenshots.
Of course, you can combine the two methods.
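To pick the size to scale to, a small worked calculation may help. This Python sketch assumes the slides are shown full screen on a display whose horizontal resolution you know, and that the image width is given as a fraction of \paperwidth (with \linewidth the slide margins shrink the result a little). The 12.8 cm paper width is beamer's default 4:3 size; the function names are made up for this example.

```python
BEAMER_PAPER_WIDTH_CM = 12.8  # beamer's default 4:3 paper width

def target_pixel_width(display_width_px, fraction_of_paper_width=1.0):
    """Pixel width at which one image pixel maps to one screen pixel."""
    return round(display_width_px * fraction_of_paper_width)

def effective_ppi(image_width_px, fraction_of_paper_width=1.0):
    """Pixels per inch of the image as placed on the default-size slide."""
    placed_width_in = BEAMER_PAPER_WIDTH_CM * fraction_of_paper_width / 2.54
    return image_width_px / placed_width_in

# Example: image at 0.8\paperwidth, shown on an assumed 1024 px wide projector.
print(target_pixel_width(1024, 0.8))
```

Resizing the screenshot to that width in an image editor before including it means the viewer does not have to resample it again at display time, provided the assumptions about the display hold.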

I have done exactly what you do and, e.g., defined
\newcommand{\screenshot}[1]{\centerline{%
\includegraphics[height=7.8cm,transparent]{#1}}} % 7.8in
which worked with whatever style I was using at the time. The files included with this macro were all PNGs created with one of the usual Linux screen capture tools.
Edit: You may have to play with the size (height and width) of your input files. It came out rather nice for me (and this was from a presentation in 2006).

How about scaling it as follows:
\includegraphics[scale=0.5]{images/myimage.jpg}
This works for me.

Have you tried converting the image to an .eps or .pdf file and using that file in LaTeX?
Maybe also try latex, dvips and ps2pdf.
The problem might be the viewer: on Linux I use Document Viewer or ePDFViewer, and the output is much worse than in Adobe Reader or Acrobat, which I use on Windows...

Tools for displaying text, powerpoint style, in linux

I have a problem where I need a way to display a repeating series of "images" on a computer monitor. Specifically, given a series of text files, I'd like a way to display the contents of said files on a screen in a way much like a powerpoint would.
My current thoughts are to find some tool that will take in a text file of some format, and then output an image which contains the text from the file. Then I'd put it in a directory and have some Slideshow program continuously go between the images in that directory. It's a very hacky solution, obviously.
So, does anyone know of tools that would do such a thing? Or is there a better way to do this? I've looked into the library libgd2, but it doesn't seem to support text-wrapping for images, which is something I'd need.
Thanks!

MagicPoint is a tool for displaying presentations. Presentations are written in a simple plain text file format, much like HTML.
You could easily generate the MagicPoint file automatically and then run it and display the presentation. You can also generate HTML, PS or PDF from the presentation and display that.

Are you looking for a PowerPoint equivalent for Linux? OpenOffice?

have you tried some magic scripting with TeX?
a chain like
tex file.tex && dvips file.dvi -o file.ps && gs -sDEVICE=jpeg -r150 -o output.jpg file.ps
and define some TeX-Macros?

Showoff's pretty cool. It uses Markdown-formatted slides to create a simple little Sinatra app that you run (with showoff serve), and then view in a browser.

Docutils. See http://docutils.sourceforge.net/docs/user/slide-shows.html
The text syntax is reStructuredText

another idea:
text2gif

To complement the suggestions given by others, if you were going to write a program to do this, it would probably be more efficient to just render the text to the screen directly, rather than converting it to images first. It could probably be done using a canvas or text box component in a full-screen window on whatever window manager you are using (e.g. KDE or Gnome).
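If you go the route of rendering the text yourself, the wrapping and paging can be handled before anything is drawn. Here is a minimal Python sketch using the standard textwrap module. The width and lines-per-slide defaults are arbitrary assumptions, and the function name is made up for this example; the actual drawing (canvas, text box, terminal) is left out.

```python
import textwrap

def paginate(text, width=60, lines_per_slide=15):
    """Wrap text to the given width and split it into slides of lines."""
    wrapped = []
    for line in text.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep it as a blank.
        wrapped.extend(textwrap.wrap(line, width=width) or [""])
    return [wrapped[i:i + lines_per_slide]
            for i in range(0, len(wrapped), lines_per_slide)]

for number, slide in enumerate(paginate("aaa bbb ccc ddd\n\neee", 7, 2), 1):
    print(f"--- slide {number} ---")
    print("\n".join(slide))
```

Each slide is then just a list of lines, which can be shown in turn on a timer by whatever display component you choose.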

I give presentations with Opera's @media projection CSS support. On http://talks.webconverger.com/ you can find a template and an example which you can load in Opera's full screen mode and start sliding through.
So besides letting you write in a familiar language (HTML), it makes it dead easy to share the slides and even get your audience to look at the slides as you're going through them.
If you are looking for something more flashy, there are tools on the Web to generate animations and what not, and again you would simply use a full screen browser to play it back to your audience.
