Split comma-separated parameters in LaTeX - string

I am trying to build a command similar to LaTeX's \cite{}, one that accepts a comma-separated list of parameters, like this:
\cite{Wall91, Schwartz93}
I would like to pass each item in the comma-separated list which the parameter represents to another command and return the concatenation of the individual results. I imagine it to be something like this:
\newcommand{\mycite}[1]{%
  \@for\var:=\split{#1} do{%
    \processCitation{\var}%
  }%
}
Literature on string manipulation, variables, and looping in LaTeX would be great!
Also: Is there a way to join the individual results using commas again?
Thanks!

Using Roberto's link I arrived at this solution:
\makeatletter
% Functional foreach construct
% #1 - Function to call on each comma-separated item in #3
% #2 - Parameter to pass to function in #1 as first parameter
% #3 - Comma-separated list of items to pass as second parameter to function #1
\def\foreach#1#2#3{%
  \@test@foreach{#1}{#2}#3,\@end@token%
}
% Internal helper function - Eats one input
\def\@swallow#1{}
% Internal helper function - Checks the next character after #1 and #2 and
% continues loop iteration if \@end@token is not found
\def\@test@foreach#1#2{%
  \@ifnextchar\@end@token%
    {\@swallow}%
    {\@foreach{#1}{#2}}%
}
% Internal helper function - Calls #1{#2}{#3} and recurses
% The magic of splitting the third parameter occurs in the pattern matching of the \def
\def\@foreach#1#2#3,#4\@end@token{%
  #1{#2}{#3}%
  \@test@foreach{#1}{#2}#4\@end@token%
}
\makeatother
Usage example:
% Example-function used in foreach, which takes two params and builds hrefs
\def\makehref#1#2{\href{#1/#2}{#2}}
% Using foreach by passing #1=function, #2=constant parameter, #3=comma-separated list
\foreach{\makehref}{http://stackoverflow.com}{2409851,2408268}
% Will in effect do
\href{http://stackoverflow.com/2409851}{2409851}\href{http://stackoverflow.com/2408268}{2408268}

Related

How to get pandoc lua filter avoid counting words using this pattern in code blocks inside Rmarkdown file?

This is a follow-up question to this post. What I want to achieve is to avoid counting words in headers and inside code blocks that have this pattern:
```{r label-name}
all code words not to be counted.
```
Rather than this pattern:
```
{r label-name}
all code words not to be counted.
```
Because when I use the latter pattern I lose font-lock fontification in the Rmarkdown buffer in Emacs, so I always use the first one.
Consider this MWE:
MWE (MWE-wordcount.Rmd)
# Results {-}
## Topic 1 {-}
This is just a random text with a citation in markdown (\@ref(fig:pca-scree)).
Below is a code block.
```{r pca-scree, echo = FALSE, fig.align = "left", out.width = "80%", fig.cap = "Scree plot with parallel analysis using simulated data of 100 iterations (red line) suggests retaining only the first 2 components. Observed dimensions with their eigenvalues are shown in green."}
knitr::include_graphics("./plots/PCA_scree_parallel_analysis.png")
```
## Topic 2 {-}
<!-- todo: a comment that needs to be avoided by word count hopefully-->
The result should be 17 words only, not counting words in code blocks, comments, or Markdown markup (like the headers).
I followed the method explained here to get pandoc to count the words using a Lua filter. In short, I did these steps:
from command line:
mkdir -p ~/.local/share/pandoc/filters
Then created a file there named wordcount.lua with this content:
-- counts words in a document
words = 0

wordcount = {
  Str = function(el)
    -- we don't count a word if it's entirely punctuation:
    if el.text:match("%P") then
      words = words + 1
    end
  end,
  Code = function(el)
    _, n = el.text:gsub("%S+", "")
    words = words + n
  end,
}

function Pandoc(el)
  -- skip metadata, just count body:
  pandoc.walk_block(pandoc.Div(el.blocks), wordcount)
  print(words .. " words in body")
  os.exit(0)
end
I put the following elisp code in the scratch buffer and evaluated it:
(defun pandoc-count-words ()
  (interactive)
  (shell-command-on-region (point-min) (point-max)
                           "pandoc --lua-filter wordcount.lua"))
From inside the MWE Markdown file (MWE-wordcount.Rmd) I issued M-x pandoc-count-words and I get the count in the minibuffer.
Using the first pattern I get 62 words.
Using the second pattern I get 22 words, more reasonable.
This method successfully avoids counting words inside a comment.
Questions
How do I get the Lua filter code to avoid counting words when using the first pattern rather than the second?
How do I get the Lua filter to avoid counting words in the headers (##)?
I would also appreciate it if the answer explained how the Lua code works.
This is a fun question; it combines quite a few technologies. The most important here is R Markdown, and we need to look under the hood to understand what's going on.
One of the first steps in R Markdown processing is to parse the document, find all R code blocks (marked by the {r ...} pattern), execute those blocks, and replace the blocks with the evaluation results. The modified input text is then passed to pandoc, which parses it into an abstract document tree (AST). That AST can be examined or modified with a filter before pandoc writes the document in the target format.
This is relevant because it is R Markdown, not pandoc, that recognizes input of the form
``` {r ...}
# code
```
as code blocks, while pandoc parses them as inline code that is identical to ` {r ...} # code `, i.e., all newlines in the code are ignored. The reason for this lies in pandoc's attribute parsing and the overloading of the ` character in Markdown syntax.¹
This gives us the answer to your first question: we can't! The two code snippets look exactly the same by the time they reach the filter in pandoc's AST; they cannot be distinguished. However, we get proper code blocks with newlines if we run R Markdown's knitr step to execute the code.
So one solution could be to make the wordcount.lua filter part of the R Markdown processing step, but to run the filter only when the COUNT_WORDS environment variable is set. We can do that by adding this snippet to the top of the filter file:
if not os.getenv 'COUNT_WORDS' then
  return {}
end
See the R Markdown cookbook on how to integrate the filter.
I'm leaving out the second question, because this answer is already quite long and that subquestion is worth a separate post.
¹: pandoc would recognize this as a code block if the r was preceded by a dot, as in
``` {.r}
# code
```

Use split function to split recursively in python

I have written a piece of code which takes a keyword and splits a name using that specific keyword. Since split returns a list, I now want to check each element of the returned list individually and split it again using a different keyword (if it exists). This time, however, I don't want another sublist to be returned; instead, the elements should be extended into the same list.
Code below:
def get_comb_drugs(keyword, name):
    if keyword in name:
        name = name.split(keyword)
    return name
print(get_comb_drugs(", polymer with", "Acetaldehyde, polymer with ammonia and formaldehyde"))
The output I get is:
['Acetaldehyde', ' ammonia and formaldehyde']
However, I want to split ' ammonia and formaldehyde' again using the " and " keyword, and the exact output I want is:
['Acetaldehyde', ' ammonia', 'formaldehyde']
Guide me in achieving the desired result.
You can use re.split instead with an alternation pattern:
import re
separators = [', polymer with', ' and ']
re.split('|'.join(separators), 'Acetaldehyde, polymer with ammonia and formaldehyde')
This returns:
['Acetaldehyde', ' ammonia', 'formaldehyde']
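If you would rather stay with plain str.split, here is a minimal sketch along the same lines (the helper name split_on_keywords is just for illustration) that applies each keyword in turn and flattens the results into one list:
def split_on_keywords(name, keywords):
    # Split name on each keyword in turn, extending the same flat list.
    parts = [name]
    for keyword in keywords:
        next_parts = []
        for part in parts:
            next_parts.extend(part.split(keyword))
        parts = next_parts
    return parts

print(split_on_keywords("Acetaldehyde, polymer with ammonia and formaldehyde",
                        [", polymer with", " and "]))
# ['Acetaldehyde', ' ammonia', 'formaldehyde']
Note that with the re.split approach, a separator containing regex metacharacters would need to be wrapped in re.escape before joining; the str.split version sidesteps that concern.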

Function prints None when i call it using dictionary

I made a dictionary switcher as follows:
switcher = {
    0: linked_list,
    1: queue,
    2: stack
}
and I used switcher[key]() to just call a function.
The function runs as normal, but the issue is that it prints None before taking input in the while loop of the called function, in this case linked_list():
while c != 2:
    c = int(input(print("Enter operation\n1.Insert beg\n2.Exit")))
    if c == 1:
        some code
I have tried using a return statement and a lambda, but it still prints None. Also, I am not printing the given function.
That happens because what you are writing to standard output is not your menu as a string but the object returned by the print function.
The print call is unnecessary: the argument passed to input is written to standard output by default, while print itself returns None, which is what ends up being shown as the prompt.
Therefore:
while c != 2:
    c = int(input("Enter operation\n1.Insert beg\n2.Exit\n"))
    if c == 1:
        some code
is enough (with an extra newline after the Exit option for better readability).
See the official documentation about the input function here.
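For completeness, a minimal runnable sketch of the dispatcher with the corrected prompt (the linked_list body below is only a placeholder, not your real implementation):
def linked_list():
    c = 0
    while c != 2:
        # The menu string goes straight to input(), which writes it to
        # standard output as the prompt; no separate print() is needed.
        c = int(input("Enter operation\n1.Insert beg\n2.Exit\n"))
        if c == 1:
            print("insert at beginning...")  # placeholder for the real logic

switcher = {0: linked_list}
switcher[0]()  # runs the menu loop without printing "None" first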

Updating dictionary - Python

total = 0
line = input()
line = line.upper()
names = {}
(tag, text) = parseLine(line)  # initialize
while tag != "</PLAY>":  # test
    if tag == '<SPEAKER>':
        if text not in names:
            names.update({text})
I seem to get this far and then draw a blank. This is what I'm trying to figure out. When I run it, I get:
ValueError: dictionary update sequence element #0 has length 8; 2 is required
Make an empty dictionary
Which I did.
(its keys will be the names of speakers and its values will be how many times s/he spoke)
Within the if statement that checks whether a tag is <SPEAKER>
If the speaker is not in the dictionary, add him to the dictionary with a value of 1
I'm pretty sure I did this right.
If he already is in the dictionary, increment his value
I'm not sure.
You are close, the big issue is on this line:
names.update({text})
You are trying to make a dictionary entry from a string using {text}. That expression actually creates a set containing one string; dict.update then treats it as a sequence of key-value pairs and tries to unpack the string itself as a pair, but the string is too long: 8 characters instead of the required 2.
To add a new entry do this instead:
names.update({text:1})
This will set the initial value.
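To see the difference concretely, here is a small reproduction sketch (the speaker name MERCUTIO is just an illustrative 8-character key, not from your data):
names = {}
speaker = "MERCUTIO"

# names.update({speaker}) passes a set holding one 8-character string;
# update() tries to unpack that string as a (key, value) pair and raises:
#   ValueError: dictionary update sequence element #0 has length 8; 2 is required

names.update({speaker: 1})  # a one-entry dict instead: key -> initial count
print(names)                # {'MERCUTIO': 1}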
Now, it seems like this is homework, but you've put in a bit of effort already, so while I won't answer the question I'll give you some broad pointers.
The next step is checking whether a key already exists in the dictionary. Python dictionaries have a get method that retrieves a value from the dictionary based on the key. For example:
> names = {'romeo': 1}
> print(names.get('romeo'))
1
But get will return None if the key doesn't exist:
> names = {'romeo': 1}
> print(names.get('juliet'))
None
get also takes an optional second argument, which is returned as the default value when the key is missing:
> names = {'romeo': 2}
> print(names.get('juliet', 1))
1
Also note that your loop as it stands will never end, as you only set tag once:
(tag, text) = parseLine(line)  # initialize
while tag != "</PLAY>":  # test
    # you need to set tag in here
    # and have an escape clause if you run out of file
The rest is left as an exercise for the reader...

how use struct.pack for list of strings

I want to write a list of strings to a binary file. Suppose I have a list of strings mylist. Assume the items of the list have a '\t' at the end, except the last one, which has a '\n' at the end (to help me recover the data later). Example: ['test\t', 'test1\t', 'test2\t', 'testl\n']
For a numpy ndarray, I found the following script that worked (got it from here numpy to r converter):
binfile = open('myfile.bin', 'wb')
for i in range(mynpdata.shape[1]):
    binfile.write(struct.pack('%id' % mynpdata.shape[0], *mynpdata[:, i]))
binfile.close()
Does binfile.write automatically parse all the data if the variable has a * in front of it (as in the *mynpdata[:,i] example above)? Would this work with a list of integers in the same way (e.g. *myIntList)?
How can I do the same with a list of strings?
I tried it on a single string using this (which I found somewhere on the net):
oneString = 'test'
oneStringByte = bytes(oneString,'utf-8')
struct.pack('I%ds' % (len(oneString),), len(oneString), oneString)
but I couldn't understand why the %d within 'I%ds' above is filled in with (len(oneString),) instead of len(oneString) as in the ndarray example, and also why both len(oneString) and oneString are passed.
Can someone help me with writing a list of strings (if necessary, assuming it is written to the same binary file where I wrote out the ndarray)?
There's no need for struct. Simply join the strings and encode them using either a specified or an assumed text encoding in order to turn them into bytes.
''.join(L).encode('utf-8')
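As a rough sketch of how that fits the question's example list (reusing the myfile.bin name from the question; the read-back half assumes the '\t'/'\n' markers described above):
mylist = ['test\t', 'test1\t', 'test2\t', 'testl\n']

# Write: the tab/newline markers are already part of each string,
# so a plain join is enough before encoding to bytes.
with open('myfile.bin', 'wb') as binfile:
    binfile.write(''.join(mylist).encode('utf-8'))

# Read back and recover the items (the markers are stripped here;
# re-append them if the exact original list is needed).
with open('myfile.bin', 'rb') as binfile:
    text = binfile.read().decode('utf-8')
recovered = text.rstrip('\n').split('\t')
print(recovered)  # ['test', 'test1', 'test2', 'testl']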
