Tcl - How to replace ? with -

(You'd think this would be easy, but I'm stumped.)
I'm converting an iOS note to a text file, and the note contains "0." and "?" whenever there is a list or bullet.
This was a bulleted list
? item 20
? Item 21
? Item 22
I'm having a lot of trouble replacing the "?".
I don't want to replace a legitimate question mark at the end of a sentence,
but I want to replace the "?" bullets with "-" (preferably anywhere in the line, not just at the beginning).
I tried these searches, with no luck:
set line "? item 20"
set index_bullet [string first "(\s|\r|\n)(\?)" $line]
set index_bullet [string first "(!\w)(\?)" $line]
set index_bullet [string first ^\? $line]
This works, but it would match any question mark
set index_bullet [string first \? $line]
Does anyone know what I'm doing wrong?
How do I find and replace only question mark bullets with a "-"?
Thank you very much in advance

If you're really wanting to replace a question mark where you've got a regular expression that describes the rule, the regsub command is the right way. (The string first command finds literal substrings only. The string match command uses globbing rules.) In this case, we'll use the -all option so that every instance is replaced:
set line "? item 20"
set replaced [regsub -all {(\s|^)\?(\s)} $line {\1-\2}]
puts "'$line' --> '$replaced'"
# Prints: '? item 20' --> '- item 20'
The main trick to using regular expressions in Tcl is, as much as possible, to keep REs and their replacements in braces so that you can use Tcl metacharacters (e.g., backslash or square brackets) without having to fiddle around a lot.
Also, \s by default will match a newline.
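As a rough sketch of how that regsub behaves on a whole multi-line note (the sample text below is made up for illustration), bullets at the start of the string or after whitespace are replaced, while a question mark that ends a sentence is left alone:

```tcl
# Hypothetical sample text: three bullets and one ordinary question
set note "? item 20\n? Item 21\nIs this a list?\n  ? indented item"

# A "?" counts as a bullet only if it follows start-of-string or whitespace
# and is itself followed by whitespace
set replaced [regsub -all {(\s|^)\?(\s)} $note {\1-\2}]
puts $replaced
```

Because \s also matches the newline before each bullet, no -line switch is needed here.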

It seems likely that a character used to indicate a list item is the first character on the line or the first character after optional whitespace. To match a question mark at the beginning of a line:
string match {\?*} $line
or
string match \\?* $line
The braces or doubled backslash keep the question mark from being treated as a string match metacharacter.
To find a question mark after optional whitespace:
string match {\?*} [string trimleft $line]
The command returns 1 if it finds a match, and 0 if it doesn't.
To do this with string first, use
if {[string first ? [string trimleft $line]] eq 0} ...
but in that case, keep in mind that the index returned from string first isn't the true location of the question mark. (Use == instead of eq if you have an older Tcl.)
When you have determined that the line contains a question mark in the first non-whitespace position, a simple
set line [regsub {\?} $line -]
will perform a single substitution regardless of where it is.
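Putting the two steps together might look like the sketch below. The proc name replace_bullet is hypothetical, chosen only for illustration; it assumes a bullet is a "?" in the first non-whitespace position.

```tcl
# Hypothetical helper: swap a leading "?" bullet for "-", leave other lines alone
proc replace_bullet {line} {
    # Only treat "?" as a bullet when it is the first non-whitespace character
    if {[string match {\?*} [string trimleft $line]]} {
        # Without -all, regsub replaces just the first "?" in the line
        return [regsub {\?} $line -]
    }
    return $line
}

puts [replace_bullet "  ? item 21"]
puts [replace_bullet "Is this a list?"]
```

The first call should print the line with its bullet replaced and the indentation preserved; the second line comes back unchanged because its "?" is not in the leading position.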
Documentation: regsub, string, Syntax of Tcl regular expressions

I figured it out.
I did it in two steps:
1) First find the "?"
set index_bullet [string first "\?" $line]
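2) Then filter out any "?" that is not a bullet, i.e. one that directly follows a word character. (string first only finds literal substrings, so this check needs regexp.)
set is_question_mark [regexp {\w\?} $line]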
I have a solution, but please post if you have a better way of doing this.
Thanks!

Related

TCL: How to remove all letters/numbers from a string?

I am using the Tcl programming language and trying to remove all the letters or numbers from a string. From this example, I know a general way to remove all the letters from a string (e.g. set s abcdefg0123456) is
set new_s [string trim $s "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"]
If I want to remove all numbers from a string in general, I can do
set new_s [string trim $s "0123456789"]
Is there a more straightforward way to remove all letters/numbers?
I also noticed that if I want to remove only some of the digits (e.g. 012) instead of all of them, the following does NOT work.
set new_s [string trim $s "012"]
Can someone explain why?

Use regular expressions:
set s abcdefg0123456
regsub -all {\d+} $s {} new_s ;# Remove all digits
regsub -all {[[:alpha:]]+} $s {} new_s ;# Remove all letters

To answer your other question: string trim (and string trimleft and string trimright as “half” versions) removes a set of characters from the ends of a string (and returns the new string; it's a pure functional operation). It doesn't do anything to the interior of the string. It doesn't know anything about patterns. The default set of characters removed is “whitespace” (spaces, newlines, tabs, etc.)
When you do:
set new_s [string trim $s "012"]
You are setting the removal set to 0, 1 and 2, but it is still only the ends that get removed. Thus it will leave x012101210y entirely alone, but turn 012101210 into the empty string.
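A small sketch of the difference, using the string from the question. The regsub and string map lines are one possible way to remove the characters everywhere, not the only one:

```tcl
set s abcdefg0123456

# string trim only strips from the two ends; $s starts with "a" and ends
# with "6", neither of which is in the removal set, so nothing changes
set trimmed [string trim $s "012"]
puts $trimmed

# Here every character at both ends is in the set, so everything goes
set emptied [string trim 012101210 "012"]
puts "'$emptied'"

# To remove 0, 1 and 2 wherever they occur, use a pattern or a mapping
set via_regsub [regsub -all {[012]} $s {}]
set via_map    [string map {0 "" 1 "" 2 ""} $s]
puts "$via_regsub $via_map"
```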

Detecting a line with open curly brackets

I am parsing a Tcl file line by line and searching for lines with open curly braces so that I can merge them with the next line and read them.
I am struggling to get a single regex to do this. My concern is lines with a closing }, which can be skipped.
Example:
MATCH: test_command -switch1 {
NO MATCH: single_command
NO MATCH: test_tcl -switch2 {arg1 }
Please help with the regex to get the result. I tried this:
% set a "test_command -swithc1 {bye }"
test_command -swithc1 {bye }
% regexp "{" $a match
1
#0 is expected
This is not my intention. I want a match only for lines with an unclosed open curly brace.
% set b "test_command -swithc1 {hi"
test_command -swithc1 {hi
% regexp "{" $b match
1
#1 was expected
I'm looking for a regex that will give 0 for the $a and 1 for $b

You really shouldn't be using a regular expression for that; there's a Tcl command specifically for this sort of thing: info complete. Here's how to use it:
set accumulator ""
while {![eof $inputChannel]} {
    # Note well: you *must* add the newline
    append accumulator [gets $inputChannel] "\n"
    if {[info complete $accumulator]} {
        handleCompleteChunk $accumulator
        set accumulator ""
    }
}
This handles various types of bracket matching and the intricacies of backslash sequences, but just to check whether the “line” is complete. (It's also the core of how Tcl's REPL works, except that uses the Tcl C API equivalents.)
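To see what info complete reports for the lines in the question, here is a minimal sketch that works on an in-memory list of lines instead of a channel (the list and the chunks variable are assumptions made for the example):

```tcl
# The open brace is backslash-escaped only so it can sit inside double quotes
puts [info complete "test_command -switch1 \{"]    ;# 0: brace still open
puts [info complete "test_tcl -switch2 {arg1 }"]   ;# 1: braces balanced
puts [info complete "single_command"]              ;# 1: nothing to close

# Same accumulation idea as above, applied to a list of lines
set lines [list "test_command -switch1 \{" "    arg1" "\}" "single_command"]
set chunks {}
set accumulator ""
foreach line $lines {
    append accumulator $line "\n"
    if {[info complete $accumulator]} {
        lappend chunks $accumulator
        set accumulator ""
    }
}
puts [llength $chunks]
```

The first three lines are merged into one chunk and single_command forms the second.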

You could try a couple "lookarounds", one to say "I see a left bracket" and one to say "I don't see a right bracket":
(?!.*\})(?=.*\{)
https://regex101.com/r/p8bbsF/1/
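Tcl's regular expressions support these lookahead constraints, so the pattern can be tried directly. A quick sketch with the two strings from the question (spelling of the switch normalised):

```tcl
# Braces inside the pattern are backslash-escaped so Tcl's own brace
# matching does not count them
set pattern {(?!.*\})(?=.*\{)}

set a "test_command -switch1 {bye }"
set b "test_command -switch1 \{hi"

puts [regexp $pattern $a]   ;# 0: a closing brace follows the open brace
puts [regexp $pattern $b]   ;# 1: open brace with no closing brace after it
```

This only looks at which brace characters appear on the line and does not count nesting, so a line such as "{ {x}" would be reported as 0 even though one brace is still open.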

Tab on Expect String concatenation

I'm kind of a novice with Expect, but I can't get past a problem I have with a logging/monitoring script I'm writing.
I've spent hours googling on why I can't get this to work:
puts $redirect [concat "${time}\t" "${context}\t" "$id\t" "${eventtype}" "${eventstatus}\t" "${eventcontext}" ]
The \t char ( it does not work even with other \chars ) is not showing up. No matter how and where I place it, I've tried different stuff:
puts $redirect [concat "${time}" "\t" "${context}" [...] ]
puts $redirect [concat "${time}\t" "${context}" [...] ]
puts $redirect [concat "${time}" "\t${context}" [...] ]
puts $redirect [concat "${time}" \t "${context}" [...] ]
*where redirect is set redirect [open $logfile a]
*where [...] are other strings I'm concatenating, in the same way.
From http://tcl.tk/man/tcl8.5/TclCmd/Tcl.htm#M10
[5] Argument expansion.
If a word starts with the string “{*}” followed by a non-whitespace character, then the leading “{*}” is removed and the
rest of the word is parsed and substituted as any other word. After
substitution, the word is parsed as a list (without command or
variable substitutions; backslash substitutions are performed as is
normal for a list and individual internal words may be surrounded by
either braces or double-quote characters), and its words are added to
the command being substituted. For instance, “cmd a {*}{b [c]} d
{*}{$e f "g h"}” is equivalent to “cmd a b {[c]} d {$e} f "g h"”.
[6] Braces.
If the first character of a word is an open brace (“{”) and rule [5] does not apply, then the word is terminated by the matching close
brace (“}”). Braces nest within the word: for each additional open
brace there must be an additional close brace (however, if an open
brace or close brace within the word is quoted with a backslash then
it is not counted in locating the matching close brace). No
substitutions are performed on the characters between the braces
except for backslash-newline substitutions described below, nor do
semi-colons, newlines, close brackets, or white space receive any
special interpretation. The word will consist of exactly the
characters between the outer braces, not including the braces
themselves.
Ironically, I can get this to work:
puts $redirect [concat "${time}\n" "-\t${context}" [...] ]
If I put a char before the TAB, it works, but I can't use it.
Ex output: 2016-06-01 15:43:12 - macro
Wanted output: 2016-06-01 15:43:12 macro
I've tried building the string with append, but it seems to eat pieces of the string, as if a maximum buffer size were being hit. Is that possible?
Am I missing something?
Thanks in advance.

That is what concat does. It eats whitespace.
From the documentation for concat:
This command joins each of its arguments together with spaces after trimming leading and trailing white-space from each of them. If all the arguments are lists, this has the same effect as concatenating them into a single list. It permits any number of arguments; if no args are supplied, the result is an empty string.
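A short sketch of that trimming, using two of the variable names from the question with made-up values:

```tcl
set time "2016-06-01 15:43:12"
set context "macro"

# concat trims the trailing tab from the first argument, then joins with a space
set joined [concat "${time}\t" "${context}"]
puts [string equal $joined "$time $context"]

# Plain string interpolation keeps the tab exactly where it was written
set direct "${time}\t${context}"
puts [string equal $direct "$time\t$context"]
```

This also explains why "-\t${context}" worked in the question: the tab is no longer at the edge of the argument, so it is not trimmed.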

@Etan explained why it's not working for you.
An alternate way to code that is to use format
puts $redirect [format "%s\t%s\t%s\t%s%s\t%s" $time $context $id $eventtype $eventstatus $eventcontext]
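If every field should be separated by a tab (an assumption; the format string above deliberately has no tab between eventtype and eventstatus), another option is to build a list and join it. The values here are placeholders:

```tcl
# Placeholder values standing in for the real log fields
set time "2016-06-01 15:43:12"
set context "macro"
set id 42

# list keeps each field intact; join inserts exactly one tab between fields
set record [join [list $time $context $id] "\t"]
puts $record
```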

Substring extraction in TCL

I'm trying to extract a sequence of characters from a string in TCL.
Say, I have "blahABC:blahDEF:yadamsg=abcd".
I want to extract the substring starting with "msg=" until I reach the end of the string.
Or rather I am interested in extracting "abcd" from the above example string.
Any help is greatly appreciated.
Thanks.

Regular expressions are the tool for this kind of task.
The general syntax in Tcl is:
regexp ?switches? exp string ?matchVar? ?subMatchVar subMatchVar ...?
A simple solution for your task would be:
set string blahblah&msg=abcd&yada
# Match a =, then zero or more characters that are not an &, then one &. Bracing the pattern with {} is necessary because of the special-character clash between Tcl and re_syntax.
set exp {=([^&]*)&}
# -> is an idiom. In principle it is the variable containing the whole match, which is thrown away; only the submatch is used.
regexp $exp $string -> subMatch
puts $subMatch
A nice tool to experiment and play with regexps is Visual Regexp (http://laurent.riesterer.free.fr/regexp/). I'd recommend downloading it and playing with it.
The relevant man pages are re_syntax, regexp and regsub.
Joachim
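For the exact string in the question, where "msg=" runs to the end of the string, a sketch along the same lines could be as follows (the variable names are illustrative):

```tcl
set s "blahABC:blahDEF:yadamsg=abcd"

# Capture everything after "msg=" up to the end of the string
if {[regexp {msg=(.*)$} $s -> value]} {
    puts $value
}

# The same result without a regular expression: locate "msg=" literally
set idx [string first "msg=" $s]
if {$idx >= 0} {
    # Skip the 4 characters of "msg=" itself
    set tail [string range $s [expr {$idx + 4}] end]
    puts $tail
}
```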

Another approach: split the query parameter using & as the separator, find the element starting with "msg=" and then get the text after the =
% set string blahblah&msg=abcd&yada
blahblah&msg=abcd&yada
% lsearch -inline [split $string &] {msg=*}
msg=abcd
% string range [lsearch -inline [split $string &] {msg=*}] 4 end
abcd

Code
proc value_of {key matches} {
    # Find the captured key name; its value is the next list element
    set index [lsearch -exact $matches $key]
    if {$index != -1} {
        return [lindex $matches $index+1]
    }
    return ""
}
set x "blahABC:blahDEF:yadamsg=abcd:blahGHI"
set matches [regexp -all -inline {([a-zA-Z]+)=([^:]*)} $x]
puts [value_of "yadamsg" $matches]
Output:
abcd

Powershell Search String in File Containing $

I want to search the following string in a text file: $$u$$
I tried select-string, get-content, .Contains. It seems to me it's not possible.
I used this for the search as a variable: $ToSearch = "'$'$u'$'$"
It always gives false result.

This is because most search filters rely on regex, where the $ symbol must be escaped.
Without knowing more about what you're trying to accomplish I can't give much of an example, but here is one:
'this is a $$u$$ test' -replace '\$',''
The '\' escapes the character, so that it is matched literally.
Edit: Per comment
$Val = 'this is a $$u$$ test'
$Val | Select-String '\$+\w\$+' -Quiet
The -Quiet switch returns a true/false result rather than the matching line.

The $ is a reserved character in regex, which is complicating your search.
The default search pattern type for Select-String is regex, but you can also specify -SimpleMatch, which treats the pattern as a literal string and eliminates the need to escape the $: just use the string you want to match, '$$u$$'. If you prefer wildcards, Select-String has no wildcard switch, but the -like operator accepts a wildcard pattern with * at either end of the match: '*$$u$$*'.
