Pascal: reading a line of text into separate strings

Basically, a line looks like this: 'number number text text text', with spaces separating the fields. The numbers are fine, because readln() splits them at the space, but it reads the three text fields as one string. How can I read them into separate strings?

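If anybody faces this problem, here's a really easy solution I just found: read the whole remaining text into one string s. Then find the first space with spacepos := pos(' ', s), copy everything after it into a second string with copy(s, spacepos+1, 200), and cut it off the first string with delete(s, spacepos, 200). Repeat on the second string to split off the third word, and voilà.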
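
The same find-the-space, copy, and cut idea can be sketched in Python. This is only an illustration, not the poster's Pascal code: the names split_at_first_space and parse_line and the sample line are made up here, and it assumes the fields are separated by exactly one space.

```python
def split_at_first_space(s):
    """Split s at its first space, mirroring the pos/copy/delete steps."""
    spacepos = s.find(" ")          # plays the role of pos(' ', s)
    if spacepos == -1:              # no space left: nothing to split off
        return s, ""
    # s[:spacepos] is what remains after delete, s[spacepos + 1:] is the copy
    return s[:spacepos], s[spacepos + 1:]


def parse_line(line, text_fields=3):
    """Parse 'number number text text text' into two ints and a list of strings."""
    first, rest = split_at_first_space(line.strip())
    second, rest = split_at_first_space(rest)
    words = []
    for _ in range(text_fields - 1):
        word, rest = split_at_first_space(rest)
        words.append(word)
    words.append(rest)              # whatever remains is the last text field
    return int(first), int(second), words


# Hypothetical sample line, not taken from the question
print(parse_line("12 7 red green blue"))
```

Splitting one field at a time keeps the logic close to the Pascal steps; in Python alone, line.split() would do the same job in one call.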

Related

How to split a text into characters and get a text including every nth character? (Maybe in Excel)

Here's the problem: I have a text and I want to split it into n separate texts, each made of the characters whose positions are the same mod n. For example, with the text "hfhshsseekbfe..." and n=5, the first one is "hsb..." (1st, 6th, 11th characters of the original), the second one is "fsf..." (2nd, 7th, 12th characters of the original), and so on. It would probably be simpler to write a C++ program that reads the file, extracts the needed characters (modulo n) and writes them to a new .txt file. But I'm not a coder: I did some coding for my Numerical Analysis course, but it didn't involve any strings. So maybe Excel has a way to do it?
In an Excel sheet, with Excel 365 you can combine the SEQUENCE() and MID() functions, like this:
=TEXTJOIN("",TRUE,MID(A1,SEQUENCE(LEN(A1),,1,5),1))
For the second one, just change the start parameter of SEQUENCE() to 2:
=TEXTJOIN("",TRUE,MID(A1,SEQUENCE(LEN(A1),,2,5),1))
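
For comparison, the same every-nth-character extraction can be sketched outside Excel with Python slicing. The function name split_mod_n is an assumption made for this example; the sample string is the one from the question.

```python
def split_mod_n(text, n):
    """Return n strings; string i holds characters i, i+n, i+2n, ... (0-based)."""
    return [text[start::n] for start in range(n)]


parts = split_mod_n("hfhshsseekbfe", 5)
print(parts[0])  # hsb
print(parts[1])  # fsf
```

The slice text[start::n] takes every n-th character beginning at index start, which corresponds to the start and step arguments of SEQUENCE() in the formulas above.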

What is the difference between these two tab-delimited .txt files that is causing .split("\t") to properly separate values from one but not the other?

I have two Japanese word frequency reports that were compiled from different sources. Each line contains a word and its number of occurrences, delimited by a tab. I also have a Python script that is supposed to split each line into those two values using .split("\t"). The second value is then converted into an integer, which is where the error comes from:
ValueError: invalid literal for int() with base 10: '\ufeff29785713'
This is only occurring for data from the second file.
Upon testing to see if converting the number to a float would work (or change the error), the result was this:
ValueError: could not convert string to float: '\ufeff29785713'
Is this a result of the tabs or numerals in the second file perhaps not technically being the same character and not delimiting properly, causing unwanted characters in the latter value (or perhaps not splitting at all)? Both files are UTF-8 encoded.
Shorter version of first file (working)
Shorter version of second file
Honestly, I'm not a Python dev at all, but given that your second array element contains a stray character (\ufeff, the byte order mark), you could try removing it after you split and before you convert to a number:
x[1] = x[1].replace('\ufeff', '')
Here x is the name of the array you split your line into. The replace operation has no effect on the first file, because \ufeff is not present there.
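
A minimal sketch of that cleanup, assuming each line has exactly one tab. The function name parse_line and the sample line are made up for illustration and are not from the asker's files.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark


def parse_line(line):
    """Split 'word<TAB>count' and return (word, count) with any BOM removed."""
    word, count = line.rstrip("\n").split("\t")
    return word.replace(BOM, ""), int(count.replace(BOM, ""))


# Hypothetical sample line with a stray BOM in front of the number
print(parse_line("の\t\ufeff29785713\n"))
```

If the BOM only ever appears at the very start of the file, opening it with encoding="utf-8-sig" removes it while reading, so no replace is needed.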

Combining two lines into 1 in PowerShell

An elderly family member recorded a memoir over the past few years, using Windows Notepad, so each file (by year) is simple text. I am tasked with normalizing the documents as much as possible, for a later print run. The problem I'm struggling with is how to handle each chapter title. A single text file can contain multiple chapter entries. Some chapter titles are very simple to get, for example:
Chapter 1
text
text
text.
chapter two
text
text
But she wasn't always so neat. Some of her documents contain lines like
" chapter
three
"
with leading and trailing spaces and even a CarriageReturn/LineFeed between.
I cannot get the syntax to manage the "chapter three" situation. Here's what I have done so far:
$charstr = ' chapter
three
text here
more text
'
#remove leading spaces
$charstr2 = $charstr.trim()
#find and replace chapter to all caps and start on a new line
$charstr2.Replace("chapter ", "`nCHAPTER ")
I'd sure appreciate some assistance with how to normalize that multi-line text string into a format like "CHAPTER three" (ideally, I will UPPER() the chapter number as well, like "CHAPTER THREE").
I've tried using \s, as in
$charstr2 = $charstr.trim() -replace '\s+',
but I'm obviously doing something wrong.
Thanks!
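
One possible approach, sketched in Python rather than PowerShell: match "chapter", any whitespace including a line break, and the following word in a single regular expression. The pattern, the function name normalize_chapters and the assumption that a chapter title is "chapter" plus exactly one word or number are all made up for this illustration.

```python
import re

# Assumed heading shape: optional blanks, "chapter", whitespace (which may
# include a line break), one word or number, optional blanks, end of line.
CHAPTER = re.compile(r"^[ \t]*chapter\s+(\w+)[ \t]*$", re.IGNORECASE | re.MULTILINE)


def normalize_chapters(text):
    """Rewrite chapter headings, even ones split over two lines, as 'CHAPTER X'."""
    text = text.replace("\r\n", "\n")  # Notepad files typically use CRLF endings
    return CHAPTER.sub(lambda m: "CHAPTER " + m.group(1).upper(), text)


sample = " chapter \r\nthree \r\ntext here\r\nmore text\r\n"
print(normalize_chapters(sample))
```

Because \s+ also matches line breaks, the heading is found whether "three" sits on the same line or the next one. The same pattern should carry over to PowerShell's -replace operator, but a line containing only "chapter" followed by a one-word line of body text would also be merged, so the result needs checking against the real files.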

How can I detect "excessive spaces" in a string?

I'm making a simple Android game in Lua, and one of the setup steps is to set a word (or sentence; basically, a string) entered by the player. The "word" may contain spaces, but I want to forbid the player from entering a string with two or more spaces in a row, like "fly  bird".
I tried using string.match(word, "  "), string.match(word, "%s%s") and string.match(word, "%s+%s+"), and none of these worked. Somehow the last one always "detects" a double space, whether or not the string has one.
What can I do to detect if there are multiple spaces in a row in a string? (Just detect, not replace, so I can send a warning message to the player.)
If it's exactly two spaces you are interested in, simply use find:
word:find('  ')
It returns the start and end indices of the first occurrence of two consecutive spaces, or nil if there is none.
input = input:gsub("%s+", " ")
The above code takes the input and replaces every run of whitespace with a single space.
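
For comparison, the detect-only check can be sketched in Python. The function name has_repeated_spaces is an assumption for this example; the regular expression \s{2,} plays the role of the Lua pattern "%s%s".

```python
import re


def has_repeated_spaces(word):
    """Return True if word contains two or more whitespace characters in a row."""
    return re.search(r"\s{2,}", word) is not None


double = "fly" + " " * 2 + "bird"   # two spaces between the words
single = "fly bird"                 # one space between the words
print(has_repeated_spaces(double))  # True
print(has_repeated_spaces(single))  # False
```

This only reports whether a run of spaces exists and leaves the input unchanged, so the game can show a warning instead of silently fixing the string.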

Comma separators in Fortran

I have come across the following issue with Fortran: when reading a character array (or any list, really) from a data file with fmt=*, both blanks outside quotes AND commas are treated as delimiters between the elements of the array/list. The fact that commas act as delimiters is a big problem for me.
So the question is: do you know of any option or compiler directive in Fortran that makes commas in input files count as ordinary characters rather than delimiters, with blanks as the only delimiters? As a specific example, I would like that reading a record like:
x,y,z
with:
read (7,*) adummy
would result in adummy (a scalar character variable) getting the value x,y,z not x.
Any help would be most welcome.
The solution is to specify a format that matches your data record, i.e. use the character edit descriptor (A) in the format:
read(7,fmt='(A)')adummy
will result in adummy having value x,y,z, assuming it is a variable of sufficient length.
However, this method does not treat blanks as delimiters either. If you want commas read as ordinary characters but blanks kept as delimiters, the common approach is to read the whole record into a character variable and split it into separate variables afterwards.
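
The read-the-whole-record-then-split idea can be sketched in Python for illustration. The function name split_on_blanks and the sample record are assumptions made here; a Fortran version would scan the character variable for blanks in a similar way.

```python
def split_on_blanks(record):
    """Split a record on runs of blanks only; commas stay inside the fields."""
    return record.split()


# Hypothetical record: three blank-separated fields, two of them with commas
fields = split_on_blanks("x,y,z  1,5 abc")
print(fields)
```

Since only whitespace is treated as a separator, "x,y,z" comes back as a single field, which is the behaviour the question asks for.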
