How can I check how long a line is in Python? - python-3.x

The plan is for the user to paste a huge list that will be divided into many lines. The whole input will be stored as "raw_text". I want to take out the first line, save it as "line_to_convert", analyse it, and then start again with the next line. For that I need to know how long a line is.

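Does len(raw_text.split("\n")[0]) work for you? (With no arguments, split() splits on any whitespace, so raw_text.split()[0] would give you the first word rather than the first line.)
In general, what you need is str.split().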
str.split(sep=None, maxsplit=-1)
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified or -1, then there is no limit on the number of splits (all possible splits are made).
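As a minimal sketch of the "take the first line, analyse it, start again" loop from the question: str.partition("\n") cuts the text at the first newline without splitting the whole input each time. The function name pop_first_line is made up for this example; raw_text and line_to_convert are the names from the question.

```python
def pop_first_line(raw_text):
    """Return (first_line, remaining_text), cutting at the first newline."""
    line, _, rest = raw_text.partition("\n")
    # Drop a trailing carriage return in case the text uses Windows line endings.
    return line.rstrip("\r"), rest


raw_text = "first item\nsecond, longer item\nthird\n"
while raw_text:
    line_to_convert, raw_text = pop_first_line(raw_text)
    # len() gives the number of characters in the line.
    print(len(line_to_convert), repr(line_to_convert))
```

If you only need the lengths and not the step-by-step loop, [len(line) for line in raw_text.splitlines()] would also do.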

Related

Replace specific characters in a paragraph of text based on its numbered position with randomly generated characters

I've dabbled a bit with JavaScript years ago but I couldn't quite grasp the logic behind it. I still have some understanding of the basics but not enough to achieve what I'd like to. I don't have the time to research how to write the code myself, but if you could point me to already-coded, individual functions which achieve the results I'm looking for, perhaps I could play around with them and then ask for further help after that when needed.
I've got a paragraph of text (it could be anything) about 300 characters long, including spaces, capitalization, and punctuation. I would like a function which generates a random number based on the number of characters in the paragraph (i.e. the function counts the characters so that the generated number is never higher than the number of characters in the paragraph) and then replaces the character at that position with a randomly generated character taken from a list of characters which appear in the paragraph (e.g. a-z, A-Z, and punctuation).
For example, if the number generated is 34, then the 34th character (whatever it may be) will be replaced by whatever character is randomly generated.
And finally, a function to input how many times this process should repeat, e.g. 10 times, 100 times, etc. before stopping, and one can view how the resulting paragraph of text has changed.
Any suggestions will be appreciated. Thanks.
Sorry, I've not tried anything yet as I'd like to get advice, first.
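One possible starting point, sketched in Python rather than JavaScript to match the main question on this page; the logic is only a few lines and should port directly. The name mutate_text and its parameters are made up for this example, and the pool of replacement characters is assumed to be "every distinct character that occurs in the paragraph", spaces included.

```python
import random


def mutate_text(text, repetitions, seed=None):
    """Replace a randomly chosen character `repetitions` times.

    Replacement characters are drawn from the characters already present
    in `text` (an assumption about what the question means by the list).
    """
    if not text:
        return text
    rng = random.Random(seed)      # pass a seed to get reproducible results
    pool = sorted(set(text))       # distinct characters of the paragraph
    chars = list(text)             # strings are immutable, so edit a list
    for _ in range(repetitions):
        position = rng.randrange(len(chars))   # always a valid index
        chars[position] = rng.choice(pool)
    return "".join(chars)


paragraph = "Hello, world! This is only a short sample paragraph."
for count in (10, 100):
    print(count, mutate_text(paragraph, count))
```

Positions here are zero-based, so the "34th character" of the question corresponds to index 33.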

How to Automatically add thousand separators for every number in a string?

How can I create a thousand separator for every number in my string?
So for example this string:
string = "123456,78+1234"
should be displayed as:
TextView = "123.456,78+1.234"
And the string should be editable, so the thousand separators should adapt when I remove or add a digit.
I have already read all the posts I could find about it, but I could never find an up-to-date working answer. So I would be really grateful for your help!
Your question contains two sub-questions:
A. You want to add thousand separators to a string which contains a group of numbers.
B. You want it to change.
And the answers are:
A: In your example there's , as a delimiter, so you need to split the string using this delimiter to an array of strings.
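Then iterate over them and insert a dot before every group of three digits, counting from the right; you can also use String.format("%,d", substr.toLong()), which uses the default locale's grouping separator (pass a locale such as Locale.GERMANY to get dots).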
Lastly, append all of the strings back together with , as the separator.
B: This one can be done in different ways. You may store the original string somewhere and observe it, so when it changes it goes to the function which does A, and use the function result the way you like (which I suppose is to be set in a TextView).
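To illustrate part A, here is a rough sketch in Python (not Android code) that uses a regular expression instead of splitting on the comma. It assumes, as in the example string, that "," is the decimal separator and "." the thousand separator, and it treats any run of digits that does not directly follow a digit or a comma as an integer part. The function names are made up for this example.

```python
import re

# A run of digits that is not preceded by a digit or by the decimal comma.
INTEGER_PART = re.compile(r"(?<![\d,])\d+")


def group_digits(digits, sep="."):
    """Insert `sep` between groups of three digits, counting from the right."""
    head = len(digits) % 3 or 3
    parts = [digits[:head]]
    parts += [digits[i:i + 3] for i in range(head, len(digits), 3)]
    return sep.join(parts)


def add_separators(text):
    """Group the integer part of every number found in `text`."""
    return INTEGER_PART.sub(lambda m: group_digits(m.group()), text)


def reformat(text):
    """For part B: drop the old separators first, then group again."""
    return add_separators(text.replace(".", ""))


print(add_separators("123456,78+1234"))   # 123.456,78+1.234
```

For part B, calling something like reformat on the current text after every edit keeps the separators in the right place, since the old dots are removed before the new ones are added.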

How can I separate consecutive strings without any delimiters?

My input data is a VCF (Variant Call Format) file. Each line that I am interested in looks like this:
chrI 22232 DEL00BED N <DEL> . PASS SUPP=1159;SUPP_VEC=11111111111111111111111011111111111
I want to count the presence (1) of a specific deletion at a specific position (22232) supported by n samples. For this reason I looked at the SUPP_VEC= values; however, I don't know how to split each value because 1) it is a string and 2) it doesn't have delimiters. How could I add a space between every character? Or how could I split/count the values from SUPP_VEC= in Python 3?
I was also curious to know what SUPP means. I found one line with SUPP=2 and checked in Excel whether the presence (1) / absence (0) values in SUPP_VEC add up to the value of SUPP; however, I could only count one 1 instead of two. Perhaps somebody knows what SUPP means.
The reason for my procedure is to have a frequency table for a specific deletion type.
I hope I made myself clear.
Thank you in advance.
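A string in Python can be iterated character by character, so no delimiter is needed. The sketch below assumes the INFO field is the 8th whitespace-separated column, as in the line shown above; the function name supp_vec_flags and the shortened sample line are made up for this example.

```python
def supp_vec_flags(vcf_line):
    """Return the SUPP_VEC value of one VCF line as a list of 0/1 integers."""
    info = vcf_line.split()[7]                 # 8th column holds KEY=VALUE pairs
    fields = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    # Iterating over a string yields its characters one at a time.
    return [int(ch) for ch in fields["SUPP_VEC"]]


# Shortened, made-up line in the same layout as the one in the question.
line = "chrI 22232 DEL00BED N <DEL> . PASS SUPP=3;SUPP_VEC=1101"
flags = supp_vec_flags(line)
print(flags)                       # one entry per sample
print(sum(flags))                  # number of samples with a 1
print(" ".join(map(str, flags)))   # the same values separated by spaces
```

If you only need the count, fields["SUPP_VEC"].count("1") on the raw string gives the same number without building a list.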

How can I calculate how many characters trimStart removes

I have a string, and I need to calculate the number of spaces that I remove when I do trimStart.
For example, I have the following string \t\t \tabcs
So there are two tabs, two spaces, and another tab that will be removed by trimStart (the rest are non-whitespace characters).
I need to know how many spaces will be removed. Since I don't know how wide a \t is, I can't just count it as a single character.
(My purpose is to calculate the column shift of a string due to the trimming action. Obviously, comparing the lengths before and after the trim will not give me the desired result.)
Do you have any ideas?
Thanks!
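The width of a tab is not a property of the string; it depends on the tab stops of whatever displays it, so it has to be passed in as a setting. Under that assumption, here is a small sketch in Python (str.lstrip plays the role of trimStart, and str.expandtabs expands each tab to the next tab stop). The function name and the default width of 4 are made up for this example.

```python
def trim_start_shift(text, tab_width=4):
    """Return (characters_removed, columns_removed) for a left trim.

    `tab_width` is an assumed display setting, not something stored in `text`.
    """
    stripped = text.lstrip()
    leading = text[:len(text) - len(stripped)]   # the whitespace that gets removed
    # expandtabs pads each tab to the next multiple of tab_width.
    return len(leading), len(leading.expandtabs(tab_width))


# Two tabs, two spaces, one more tab, then the text.
print(trim_start_shift("\t\t  \tabcs", tab_width=4))   # (5, 12)
```

The first number is what comparing lengths before and after the trim gives you; the second is the column shift for the chosen tab width.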

Bash get string between 2 6-digit numbers

I have a UTF-8-BOM encoded text file full of lines, most of which start with a 6-10-digit number (the number increases every line) followed by a string.
I want to get each of those "lines" (including the number) to process further in my bash script.
It'd be easy to do by just using a for loop with sed -n '$line\p', but unfortunately some of the strings I need have line breaks as part of them, so I need a way of extracting the string between two 6+ digit numbers (including the first number) which mark a new line.
An example of 3 "lines":
123456\tA random string here
123567\t another string
this time
it goes over
multiple lines
124567\t a normal string again
What I need:
123456\tA random string here
,
123567\t another string
this time
it goes over
multiple lines
and
124567\t a normal string again
A few things:
The strings are not surrounded with "" unfortunately
All numbers the strings contain are <6 digits long, so a >=6 digit number is always the start of a new string line
The number increases, so the number before the string is always lower than the one behind
I'd like to convert all special characters like tabs or line breaks to \t or \n
I need to get the byte length later in the script, so a string must keep its length
I'm still new here, so if I put this in the wrong place or if it was already answered, tell me!
I hope the "UTF-8-BOM encoded" is not a trap.
Here is my proposal if it is not.
bash-3.1$ sed -En '/^[0-9]{6,10}/!{:a;H;n;/^[0-9]{6,10}/!ba;x;s/\n/\\n/g;s/\t/\\t/g;p};/^[0-9]{6,10}/{x;s/\t/\\t/g;1!p;x;h;z;}' input.txt
Output for sample input (with a newline at the end):
123456\tA random string here
123567\t another string\nthis time\nit goes over\nmultiple lines
124567\t a normal string again
I assumed that the relevant 6-10 digits also always are at the start of a line,
otherwise it gets trickier.
Note:
The string length will increase by 1 for each newline \n or tabulator \t;
because the requested "\n" and "\t" are two characters each.
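If calling Python from the bash script is an option, the same grouping can be sketched as below. This is an alternative to the sed one-liner, not a translation of it, and it makes the same assumption that the 6-10 digits always sit at the start of a line. The function names are made up for this example.

```python
import re

# A record starts with a line that begins with 6 to 10 digits.
RECORD_START = re.compile(r"^\d{6,10}")


def split_records(text):
    """Group physical lines into records; continuation lines join the previous record."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()                      # ignore the final newline of the file
    records = []
    for line in lines:
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += "\n" + line   # keep the real newline for now
    return records


def escape(record):
    """Show tabs and newlines as the two-character sequences \\t and \\n."""
    return record.replace("\t", "\\t").replace("\n", "\\n")


sample = ("123456\tA random string here\n"
          "123567\t another string\nthis time\nit goes over\nmultiple lines\n"
          "124567\t a normal string again\n")
for record in split_records(sample):
    # Byte length is taken from the unescaped record, so it is not inflated.
    print(len(record.encode("utf-8")), escape(record))
```

When reading the real file, opening it with encoding="utf-8-sig" strips the BOM, so the first record still starts with its digits.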

Resources