Convert Google Docs to Jekyll Markdown - google-docs

How can I convert a Google Doc that contains images and tables into a Markdown file that can be published as a post using Jekyll?
Is it possible to first export the Google Doc as a PDF and then convert the PDF to Markdown? What will happen to the images and tables in that case?

May 2018 Update
The script originally suggested in this answer appears to no longer work and has not been updated for 5 years.
An alternative solution (which is based on the old script) can be found at https://github.com/evbacher/gd2md-html
I tried it out, and it works pretty well.
Previous Answer
You can use a Google Script to do the conversion for you!
This one will let you convert to .md, and it will email you the converted file. I've tested it and it works fine. It works with basic tables, and if you have images in the doc, it will attach them to the email.
Instructions for installing are at the same link, in the GitHub description, but I've pasted them here for ease of access:
Add the script:
Open your Google Drive document (http://drive.google.com)
Tools -> Script Manager > New
Select "Blank Project", then paste this code in and save.
Clear the myFunction() default empty function and paste the contents of converttomarkdown.gapps into the code editor
File -> Save
Run the script:
Tools > Script Manager
Select "ConvertToMarkdown" function.
Click Run button (First run will require you to authorize it. Authorize and run again)
Converted doc with images attached will be emailed to you. Subject will be "[MARKDOWN_MAKER]...".
Good luck!

You can export as HTML. Jekyll can serve static HTML files.
Btw, "standard" Markdown doesn't have tables. There are implementations that support them, but I'm afraid you'll have to convert the tables by hand to the right format, which will be implementation dependent. I don't know about Jekyll; it may be easiest to just use HTML tables within the Markdown text.
You could create a new theme based on the HTML export. The export should contain the stylesheet embedded in a <style> tag within the HTML document. It's not really easy to create new themes, but it is doable. Or, if you just want the content and don't mind using whatever Jekyll theme you already have, you can cut out the stylesheet part and keep only the HTML.

Another option would be to change how files are delimited in Excel on your computer. This guide can help you do that (http://www.howtogeek.com/howto/21456/export-or-save-excel-files-with-pipe-or-other-delimiters-instead-of-commas/)
Then every time you copy and paste from Excel into a Markdown file for Jekyll, you automatically have the pipes. All you need to do is add a row of dashes to separate the header row from the rest of the table.
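If you would rather not change the Excel settings, the same pipes-plus-dashes layout can be produced by a small script. This is only a minimal sketch: it assumes the cells were copied from Excel as tab-separated text, that the first row is the header, and that your Jekyll Markdown processor accepts pipe tables (kramdown does). The function name tsv_to_markdown_table is made up for this example.

```python
def tsv_to_markdown_table(tsv_text: str) -> str:
    """Turn tab-separated rows (first row = header) into a pipe table."""
    rows = [
        line.split("\t")
        for line in tsv_text.strip("\n").splitlines()
        if line.strip()  # skip completely empty lines
    ]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Pad short rows so every row has the same number of cells.
    rows = [row + [""] * (width - len(row)) for row in rows]

    def render(cells):
        return "| " + " | ".join(cell.strip() for cell in cells) + " |"

    header, *body = rows
    # The row of dashes is what separates the header from the body.
    separator = "| " + " | ".join("---" for _ in range(width)) + " |"
    return "\n".join([render(header), separator, *(render(r) for r in body)])


if __name__ == "__main__":
    print(tsv_to_markdown_table("Name\tQty\nApple\t3\nPear\t5\n"))
```

Cells that themselves contain a pipe character are not escaped here, so those would still need fixing by hand.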

Google Docs -> docx to Markdown -> md
I looked far and wide, and I believe the best way to do this is with Pandoc.
It works on all platforms (check their incredible website). After downloading the Google Doc as a .docx file, the command you are looking for, run from cmd or PowerShell on Windows, is:
pandoc input_filename.docx -s -o output.md
Pro Tip:
Pandoc can also extract all of the images in your document into a folder of your choice and insert image tags into the Markdown at the correct places, referencing those images by relative path. The command is:
pandoc --extract-media ./your_custom_folder input_filename.docx -o output_filename.md
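Pandoc's output is plain Markdown, while a Jekyll post also needs YAML front matter at the top and a file name of the form YYYY-MM-DD-title.md inside _posts. The sketch below shows one way to add that. The helper names, the "post" layout, and the slug rule are my own assumptions for illustration, not something Pandoc or Jekyll provides.

```python
import datetime
import re


def jekyll_post_filename(title: str, date: datetime.date) -> str:
    """Build a _posts file name such as 2018-05-01-my-first-post.md."""
    # Assumed slug rule: lowercase, runs of other characters become one dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{date.isoformat()}-{slug}.md"


def add_front_matter(markdown_body: str, title: str, date: datetime.date,
                     layout: str = "post") -> str:
    """Prepend YAML front matter to Markdown produced by pandoc."""
    # Escape backslashes and quotes so the title stays valid quoted YAML.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    front_matter = "\n".join([
        "---",
        f"layout: {layout}",
        f'title: "{safe_title}"',
        f"date: {date.isoformat()}",
        "---",
        "",
    ])
    return front_matter + "\n" + markdown_body.lstrip("\n")


if __name__ == "__main__":
    day = datetime.date(2018, 5, 1)
    print(jekyll_post_filename("My First Post!", day))
    print(add_front_matter("# Heading\n\nBody text.\n", "My First Post!", day))
```

You would read output.md, pass its contents through add_front_matter, and write the result to _posts under the generated file name. Image paths written by --extract-media are relative, so they may need adjusting to wherever your site serves images from.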

Related

Is there a Linux command line utility for getting random data to work with from the web?

I am a Linux newbie and I often find myself working with a bunch of random data.
For example: I would like to work on a sample text file to try out some regular expressions or read some data into gnuplot from some sample data in a csv file or something.
I normally do this by copying and pasting passages from the internet, but I was wondering whether there is some combination of commands that would let me do this without leaving the terminal. I was thinking about using something like the curl command, but I don't exactly know how it works...
To my knowledge there are websites that host content. I would simply like to access that content and store it on my computer.
In conclusion, and as a concrete example, how would I copy a random passage from a website and store it in a file on my system using only the command line? Maybe you can point me in the right direction. Thanks.
You could redirect the output of a curl command into a file e.g.
curl https://run.mocky.io/v3/5f03b1ef-783f-439d-b8c5-bc5ad906cb14 > data-output
Note that I've mocked the data in Mocky, which is a nice website for quickly mocking an API.
I normally use "Project Gutenberg" which has 60,000+ books freely downloadable online.
So, if I want the full text of "Peter Pan and Wendy" by J.M. Barrie, I'd do:
curl "http://www.gutenberg.org/files/16/16-0.txt" > PeterPan.txt
If you look at the page for that book, you can see how to get it as HTML, plain text, ePUB or UTF-8.
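If you later want the same download step inside a script rather than at the shell prompt, the curl redirect has a close standard-library equivalent in Python. This is a minimal sketch; download_to_file is a name invented here, and it does no retries or error handling.

```python
import shutil
import urllib.request


def download_to_file(url: str, destination: str) -> None:
    """Save the body of `url` to `destination`, like `curl URL > file`."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # Stream in chunks so large books are not held in memory at once.
        shutil.copyfileobj(response, out)


if __name__ == "__main__":
    download_to_file("http://www.gutenberg.org/files/16/16-0.txt", "PeterPan.txt")
```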

Update linked excel path in PowerPoint via Python

I want to automate the creation of a PowerPoint presentation by linking template charts to some Excel files. Updating the values in the Excel files changes the PowerPoint slides automatically. I have created my PowerPoint template and linked its charts to data in sample Excel files.
I want to send the folder with the PowerPoint and Excel files to someone else, but this will break the links to the Excel files because the paths change (the paths are not relative). I can edit the paths manually via the "Edit Links to Files" option under the File menu, but this is tedious because there are numerous charts linked to multiple files.
I want to update the links via Python code using the python-pptx package.
Please help!
There's no API support for this in the current version of python-pptx.
You would need to modify the underlying XML directly, perhaps using python-pptx internals as a starting point and using lxml calls on the appropriate element objects. If you search on "python-pptx workaround function" you will find some examples.
Another thing to consider is modifying the XML by cruder but still possibly effective means by accessing the XML files in the .pptx package directly (the .pptx file is a Zip archive of largely XML files) and using regular expressions or perhaps a command line tool like sed or awk to do simple text substitution.
Either way you're going to need to want it pretty badly, depending on your Python skill level. You'll also of course need to discover just which strings in which parts of the XML are the ones that need changing. opc-diag can be helpful for that, but it's a bit of detective work even with the best tools.
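As a rough illustration of the "cruder" Zip-level route, here is a minimal sketch using only the standard library. It assumes the linked workbook paths show up as plain strings inside the package's .rels parts, which you would need to confirm for your own file (for example with opc-diag). The function name and the prefix arguments are hypothetical, and the old prefix must be given exactly as it appears in the XML, including any URL escaping such as %20.

```python
import zipfile


def relink_external_targets(src_pptx: str, dst_pptx: str,
                            old_prefix: str, new_prefix: str) -> int:
    """Copy a .pptx, replacing old_prefix with new_prefix in every .rels part.

    Writes to dst_pptx (which must differ from src_pptx) and returns the
    number of parts that were changed.
    """
    changed = 0
    with zipfile.ZipFile(src_pptx) as src, \
            zipfile.ZipFile(dst_pptx, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            # Assumption: link targets live in relationship (.rels) parts.
            if item.filename.endswith(".rels"):
                text = data.decode("utf-8")
                if old_prefix in text:
                    text = text.replace(old_prefix, new_prefix)
                    changed += 1
                data = text.encode("utf-8")
            # Every other part is copied through byte for byte.
            dst.writestr(item, data)
    return changed


if __name__ == "__main__":
    n = relink_external_targets(
        "template.pptx", "relinked.pptx",
        "file:///C:/Users/me/report/", "file:///D:/shared/report/",
    )
    print(f"updated {n} relationship part(s)")
```

Work on a copy and open the result in PowerPoint to check it. If the function returns 0, the strings you are looking for are stored somewhere else or in a different form.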

How to add ton of text (using TextView) to content (Android studio)?

I've just started programming in Android Studio and need to add a ton of text, but if I just paste all the text into the text field of the TextView component, I get a mess. I tried inserting it in the .xml code and correcting the paragraphs, but it still doesn't come out right. So several questions arose:
What is the most correct way to add a ton of text to the content? (Please go into detail.)
How can I make the text display correctly, with paragraphs etc. preserved?
All of these questions are about a ton of text.
Thank you very much in advance!
You can put all your text in a .doc file, format it the way you want, and then load this file in your code and output it to your TextView.
You need to make sure you either import this file as an asset in your project or load it from the device storage.
Please check this SO post: How to read .doc file?
And check this to see How to import file to assets

Photo not loading in markdown python

I've recently begun coding for my degree, and for a project I am submitting my work as a PDF created in Jupyter so that my code can be seen. It all works within Jupyter, but when I export to PDF, the image that I have embedded in Markdown doesn't load. All that loads in Microsoft Edge is a small black box with a white cross in it, and in Chrome there is a small image of mountains in two pieces. I am not sure where I'm going wrong. My image is written like this:
<img src="files/masterbiaspic.png" />
And I don't know how to fix it.
I really don't have a wide knowledge of code so please be simple with your answers.
Kind regards and happy new year,
E
You appear to be using raw HTML to insert your images into your document. What you may not know is that most Markdown parsers do not look at the contents of raw HTML, they simply pass it through unaltered. However, raw HTML is not understood by the PDF file format, and in fact, when converting to PDF, there is no clean way to convert raw HTML to PDF without also parsing the HTML (which is beyond the scope of Markdown parsers). Therefore, if you want to output to PDF, you should only use pure Markdown (without any raw HTML). That way the parser can easily convert everything to a proper format for PDF output.
As it turns out, Markdown includes its own syntax for images (see the documentation for details). Try this:
![alt text](files/masterbiaspic.png)
By doing that, Jupyter Notebook will know about the image and should import it into the PDF properly.
It could be that the above will not resolve the problem. It depends on which method is used to convert to PDF. Some tools may take the HTML output of Markdown and convert that to PDF, which would mean you have a different problem entirely.
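If a notebook has many such tags, the rewrite from raw HTML to Markdown image syntax can be done with a small helper instead of by hand. This is only a sketch: the function name and the placeholder alt text are made up, and the pattern handles simple img tags with a double-quoted src, not arbitrary HTML.

```python
import re

# Matches simple tags such as <img src="files/pic.png" /> (double quotes only).
IMG_TAG = re.compile(r'<img\s+[^>]*?src="([^"]+)"[^>]*?/?>', re.IGNORECASE)


def img_tags_to_markdown(text: str, alt: str = "image") -> str:
    """Replace each simple <img> tag with Markdown image syntax."""
    return IMG_TAG.sub(lambda m: f"![{alt}]({m.group(1)})", text)


if __name__ == "__main__":
    print(img_tags_to_markdown('<img src="files/masterbiaspic.png" />'))
```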

Search through scripts of (multiple) cimplicity screens

We are using Cimplicity to operate some installations at our plant. The frontend consists of a lot of .cim files, which are the screens presented to the operator. These files are built with 'cimedit', which is basically a graphical click and drag program with which you can assemble the screens. Each object you drag onto the screen has the option to run a script, which brings me to my problem.
Because each screen contains a lot of small scripts and functions it is hard to keep track of what does what. For example I'm trying to figure out where a certain table from my database is being accessed or updated. Since the files all seem to be compressed (or so) I can't use a regular 'search the contents of this file' search.
So far I've tried searching with Windows search, with the content option enabled, and I also tried the compression option. Neither worked. That makes sense because, like I said, the files seem to be compressed, so the actual script is not stored in plain text.
So, my question in short:
How do I search all the scripts of (preferably multiple) cimplicity screens?
Any tips on how to search compressed files are also very much appreciated.
While searching for a better Windows search tool, I stumbled upon this Super User post: https://superuser.com/questions/26593/best-way-to-confidently-search-files-and-contents-in-windows-without-using-an
The post recommends Agent Ransack, and it is actually possible to search through the .cim files with this tool.

Resources