How to evaluate escape character when reading a text file in Python

I have a text file that looks like this:
["RUN DATE: 2/08/18 9:00:24 USER:XXXXXX DISPLAY: MENULIST PROG NAME: MH4567 PAGE 1\nMENU: ADCS00 Visual Basic Things
Service\n 80 Printer / Message Control\n 90 Sign Off\nSelection or
command\n===>____________________________________________________________________________\n____________________________________________________________________________
____\n F3=Exit F4=Prompt F9=Retrieve F12=Previous\n 80 CALL PGM(GUCMD)\nAUTHORIZED: DOAPROCESS FDOAPROCES FOESUPR FPROGRAMMR OESUPR PROGRAMMER\n
90 SIGNOFF\nAUTHORIZED: DOAPROCESS FDOAPROCES FOESUPR FPROGRAMMR OESUPR PROGRAMMER\n", "RUN DATE: 5/09/19 9:00:24 USER:XXXXXX DISPLAY:
MENULIST PROG NAME: MH4567 PAGE 2\nMENU: APM001 Accounts Payable Menu\n MENU
OPTIONS DISPLAY PROGRAMS\n 1 Radar Processing 30 Vendor\n 2 Prepaid Processing
31 Prepaid\n"]
I want to treat every '\n' as a line break and then print the result.
I tried this code, but it didn't work:
import json
contents = open("file.txt", "r")
Lines = contents.readlines()
for l in Lines:
    print(l)
This prints the physical line breaks of the .txt file as new lines, but it leaves the literal '\n' sequences in the text untouched.

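Replace the two-character sequence \n (a backslash followed by n) with a real newline after reading the file:
with open('./sample.txt', 'r') as file_:
    txt = file_.read().replace('\\n', '\n')
print(txt)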
Output
"RUN DATE: 2/08/18 9:00:24 USER:XXXXXX DISPLAY: MENULIST PROG NAME: MH4567 PAGE 1
MENU: ADCS00 Visual Basic Things
Service
80 Printer / Message Control
90 Sign Off
Selection or
command
===>____________________________________________________________________________
____________________________________________________________________________
____
F3=Exit F4=Prompt F9=Retrieve F12=Previous
80 CALL PGM(GUCMD)
AUTHORIZED: DOAPROCESS FDOAPROCES FOESUPR FPROGRAMMR OESUPR PROGRAMMER
90 SIGNOFF
AUTHORIZED: DOAPROCESS FDOAPROCES FOESUPR FPROGRAMMR OESUPR PROGRAMMER
", "RUN DATE: 5/09/19 9:00:24 USER:XXXXXX DISPLAY:
MENULIST PROG NAME: MH4567 PAGE 2
MENU: APM001 Accounts Payable Menu
MENU
OPTIONS DISPLAY PROGRAMS
1 Radar Processing 30 Vendor
2 Prepaid Processing
31 Prepaid
"

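If the file really is a JSON list of strings, as the brackets and quotes in the sample suggest (this is an assumption), the json module can decode the \n escapes itself. A minimal sketch, where load_pages is a hypothetical helper name and the sample string is an abridged copy of the data above:

```python
import json

def load_pages(raw_text):
    """Parse text shaped like a JSON list of strings and return the decoded strings."""
    # strict=False lets the parser accept raw line breaks inside string literals,
    # which matters if the file is physically wrapped in the middle of a string.
    return json.loads(raw_text, strict=False)

# Abridged sample: the doubled backslash keeps \n as two literal characters,
# exactly as it would appear inside the text file.
sample = (
    '["RUN DATE: 2/08/18 9:00:24 USER:XXXXXX\\nMENU: ADCS00 Visual Basic Things\\n", '
    '"RUN DATE: 5/09/19 9:00:24 USER:XXXXXX\\nMENU: APM001 Accounts Payable Menu\\n"]'
)

pages = load_pages(sample)
for page in pages:
    print(page)
```

Unlike the replace() approach, this also drops the surrounding quotes and commas, since each list element comes back as its own string. It fails with json.JSONDecodeError if the file is not valid JSON.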
Related

SED style Multi address in Python?

I have an app that parses multiple Cisco show tech files. These files contain the output of multiple router commands in a structured way. Here is a snippet of show tech output:
`show clock`
20:20:50.771 UTC Wed Sep 07 2022
Time source is NTP
`show callhome`
callhome disabled
Callhome Information:
<SNIPET>
`show module`
Mod Ports Module-Type Model Status
--- ----- ------------------------------------- --------------------- ---------
1 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
2 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
3 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
4 52 16x10G + 32x10/25G + 4x100G Module N9K-X96136YC-R ok
21 0 Fabric Module N9K-C9504-FM-R ok
22 0 Fabric Module N9K-C9504-FM-R ok
23 0 Fabric Module N9K-C9504-FM-R ok
<SNIPET>
My app currently uses both SED and Python scripts to parse these files. I use SED to scan the show tech file for the output of a specific command, and once I find it, I stop SED. This way I don't need to read the whole file (these files can get very big). This is a snippet of my SED script:
sed -E -n '/`show running-config`|`show running`|`show running config`/{
p
:loop
n
p
/`show/q
b loop
}' $1/$file
As you can see, I am using a multi-address range in SED. My question is: how can I achieve something similar in Python? I have tried several combinations of the DOTALL and MULTILINE flags, but I can't get the result I expect. For example, I can match the command I'm looking for, but the Python regex keeps going to the end of the file after the first match.
I am looking for something like this:
sed -n '/`show clock`/,/`show/p'
I would like the match to stop and print the results as soon as it sees `show again. I hope that makes sense. Thank you for reading and for your help.

You can use nested loops.
import re
def process_file(filename):
    with open(filename) as f:
        for line in f:
            if re.search(r'`show running-config`|`show running`|`show running config`', line):
                print(line, end='')  # the line already ends with a newline
                for line1 in f:
                    print(line1, end='')
                    if re.search(r'`show', line1):
                        return  # stop at the next command header
Both loops consume the same file iterator, so the inner for loop starts at the line after the one processed by the outer loop.
You can also do it with a single loop using a flag variable.
import re
def process_file(filename):
    in_show = False
    with open(filename) as f:
        for line in f:
            if in_show:
                print(line, end='')
                if re.search(r'`show', line):
                    return  # stop at the next command header
            elif re.search(r'`show running-config`|`show running`|`show running config`', line):
                in_show = True
                print(line, end='')  # print the header line that opens the range
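For the pure-regex route the question asks about, a lazy quantifier is what usually stops a DOTALL match at the first following header instead of running to the end of the file. A minimal sketch, assuming the whole file fits in memory; extract_section is a hypothetical name, and the sample text is taken from the show tech excerpt in the question:

```python
import re

def extract_section(text, command):
    """Return the text from the `command` header through the next `show header line."""
    pattern = re.compile(
        # ^ anchors at line starts (MULTILINE); .*? spans lines (DOTALL) but
        # stops at the first later line that begins with `show.
        r"^`" + re.escape(command) + r"`.*?^`show[^\n]*",
        re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(text)
    return match.group(0) if match else None

sample = (
    "`show clock`\n"
    "20:20:50.771 UTC Wed Sep 07 2022\n"
    "Time source is NTP\n"
    "`show callhome`\n"
    "callhome disabled\n"
    "`show module`\n"
)

print(extract_section(sample, "show clock"))
```

Like sed -n '/`show clock`/,/`show/p', this includes the closing header line. Two differences are worth knowing: it returns None when no later `show line exists (sed would print to the end of the file), and it needs the full text in memory, so the line-by-line loops above remain the better fit for very large files.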

How to capture words spread across multiple lines separated by any whitespace (newline, space, tab)

import re
c = """
class_monitor std4:
Name: xyz
Roll number: 123
Age: 9
Badge: Blue
class_monitor std5:
Name: abc
Roll number: 456
Age: 10
Badge: Red
"""
I want to print the name, roll number, and age for std4, and the name, roll number, and badge for std5.
pat = (class_monitor)(.*4:)(\n|\s|\t)*(Name:)(.*)(\s|\n|\t)*(Roll number:)(.*)(\s|\n|\t)*(Age:)(.*)(\s|\n|\t)*(Badge:)(.*)
In pythex, it matches the respective std if I change the second group from (.*4:) to (.*5:).
However, it does not work when I run it in a script. Am I missing something here?
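The failing script is not shown, so the cause can only be guessed. Two common culprits are writing the pattern without raw-string quotes and calling re.match, which only matches at the very start of the string (c begins with a newline). A minimal sketch that sidesteps both, using re.finditer and named groups; the group names and the records dictionary are illustrative choices, not part of the question:

```python
import re

c = """
class_monitor std4:
Name: xyz
Roll number: 123
Age: 9
Badge: Blue
class_monitor std5:
Name: abc
Roll number: 456
Age: 10
Badge: Red
"""

# \s already covers spaces, tabs, and newlines, so (\n|\s|\t)* reduces to \s*.
pat = re.compile(
    r"class_monitor\s+std(?P<std>\d+):\s*"
    r"Name:\s*(?P<name>.*)\s*"
    r"Roll number:\s*(?P<roll>.*)\s*"
    r"Age:\s*(?P<age>.*)\s*"
    r"Badge:\s*(?P<badge>.*)"
)

# finditer scans the whole string, so both std blocks are found in one pass.
records = {m.group("std"): m.groupdict() for m in pat.finditer(c)}

print(records["4"]["name"], records["4"]["roll"], records["4"]["age"])
print(records["5"]["name"], records["5"]["roll"], records["5"]["badge"])
```

Capturing the std number instead of hard-coding 4 or 5 means the pattern does not need to be edited per class.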

python3: Counting repeated occurrence in a list

Each line contains a special timestamp, the caller number, the receiver number, the duration of the call in seconds, and the rate per minute in cents at which the call was charged, all separated by ";". I read the file into a list instead of a dictionary to access the elements, but I'm not sure how to count the number of calls originating from a given phone. The file contains thousands of calls and looks like this:
timestamp;caller;receiver;duration;rate per minute
1419121426;7808907654;7807890123;184;0.34
1419122593;7803214567;7801236789;46;0.37
1419122890;7808907654;7809876543;225;0.31
1419122967;7801234567;7808907654;419;0.34
1419123462;7804922860;7809876543;782;0.29
1419123914;7804321098;7801234567;919;0.34
1419125766;7807890123;7808907654;176;0.41
1419127316;7809876543;7804321098;471;0.31
+--------------++---+---------+--------+
| Phone number || # |Duration |  Due   |
+--------------++---+---------+--------+
|(780) 123 4567||384|55h07m53s|$ 876.97|
|(780) 123 6789||132|17h53m19s|$ 288.81|
|(780) 321 4567||363|49h52m12s|$ 827.48|
|(780) 432 1098||112|16h05m09s|$ 259.66|
|(780) 492 2860||502|69h27m48s|$1160.52|
|(780) 789 0123||259|35h56m10s|$ 596.94|
|(780) 876 5432||129|17h22m32s|$ 288.56|
|(780) 890 7654||245|33h48m46s|$ 539.41|
|(780) 987 6543||374|52h50m11s|$ 883.72|
list =[i.strip().split(";") for i in open("calls.txt", "r")]
print(list)

Here is a simple solution.
First, use with when opening the file. It is a handy shortcut that closes the file for you, much like wrapping the work in try...finally. Consider this:
lines = []
with open("test.txt", "r") as f:
    for line in f:
        lines.append(line.strip().split(";"))
print(lines)
counters = {}
# loop over the rows, then over the fields inside each row
for line in lines:
    for number in line:
        # very basic way to count occurrences
        if number not in counters:
            counters[number] = 1
        else:
            counters[number] += 1
# keep only keys longer than 5 characters; adjust the threshold as needed
counters = {elem: counters[elem] for elem in counters.keys() if len(elem) > 5}
print(counters)

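This should get you started:
import csv
import collections
Call = collections.namedtuple("Call", "duration rate time")
calls = {}
with open('path/to/file') as infile:
    # the fields are separated by semicolons, not commas
    for time, nofrom, noto, dur, rate in csv.reader(infile, delimiter=';'):
        if not time.isdigit():
            continue  # skip a header row, if present
        # setdefault stores the new dict/list, so the appended call is kept
        calls.setdefault(nofrom, {}).setdefault(noto, []).append(Call(dur, rate, time))
for nofrom, logs in calls.items():
    for noto, callist in logs.items():
        print(nofrom, "called", noto, len(callist), "times")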
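To count only the calls that originate from each phone, the tally can be keyed on the caller column alone. A minimal sketch, assuming the file starts with the header row shown in the question; summarize_calls is a hypothetical name, and the sample holds the first three data rows from the question:

```python
import csv
import io
from collections import defaultdict

sample = """timestamp;caller;receiver;duration;rate per minute
1419121426;7808907654;7807890123;184;0.34
1419122593;7803214567;7801236789;46;0.37
1419122890;7808907654;7809876543;225;0.31
"""

def summarize_calls(lines):
    """Return {caller: {"calls": count, "seconds": total duration}}."""
    totals = defaultdict(lambda: {"calls": 0, "seconds": 0})
    # DictReader takes the column names from the header row.
    for row in csv.DictReader(lines, delimiter=";"):
        entry = totals[row["caller"]]
        entry["calls"] += 1
        entry["seconds"] += int(row["duration"])
    return dict(totals)

summary = summarize_calls(io.StringIO(sample))
for caller, stats in sorted(summary.items()):
    print(caller, stats["calls"], "calls,", stats["seconds"], "seconds")
```

With a real file, pass the open file object instead of io.StringIO(sample). The amount due is left out on purpose, because the question does not say how partial minutes are billed.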

Search multiline error log for error code and then some of its parameters on Linux

What command would give me the output I need for each instance of an error code in a very large log file? Each record in the file is delimited by a begin line (SR) and an end line (EN), each followed by a number of characters, for example:
SR 120
1414760452 0 1 Fri Oct 31 13:00:52 2014 2218714 4
GROVEMR2 scn
../SrxParamIF.m 284
New Exam Started
EN 120
The 5th field is the error code, 2218714 in the previous example.
I thought of just grepping for the error code with -A to output some lines after it, then picking what I need from that rather than parsing the entire file. That seems easy, but my grep/awk/sed skills aren't at that level.
I'd like output like the one shown below ONLY when error 2274021 is encountered, as in the following example.
The output should resemble what this command produces: egrep 'Coil:|Connector:|Channels faulted:| First channel:' ERRORLOG | less
Part of input file of interest:
Mon Nov 24 13:43:37 2014 2274021 1
AWHMRGE3T NSP
SCP:RfHubCanHWO::RfBias 4101
^MException Class: Unknown Severity: Unknown
Function: RF: RF Bias
PSD: VIBRANT Coil: Breast SMI Scan: 1106/14
Coil Fault - Short Circuit
A multicoil bias fault was detected.
.
Connector: Port 1 (P1)
Channels faulted: 0x200
First channel: 10 of 32, counting from 1
Fault value: -2499 mV, Channel: 10->
Output:
Coil: Breast SMI
Connector: Port 1 (P1)
Channels faulted: 0x200
First channel: 10 of 32, counting from 1
Thanks in advance for any pointers!

Try the following, adapting it as needed:
#!/usr/bin/perl
use strict;
use warnings;
$/ = "\nEN ";          # records are separated by "\nEN "
my $error = 2274021;   # the error code to look for
while (<>) {           # for each record
    next unless /\b$error\b/;   # skip records without the error code
    for my $line (split /\n/, $_) {
        print "$line\n" if $line =~ /Coil:|Connector:|Channels faulted:|First channel:/;
    }
    print "====\n";
}
Is this what you need?
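The same record-by-record idea can be sketched in Python, reading one line at a time so a very large log never has to sit in memory. This is an illustration, not a drop-in tool: matching_lines and select_fields are hypothetical names, and like the Perl version it prints whole matching lines rather than trimming them down to the field. The sample lines are copied from the question.

```python
import re

WANTED = re.compile(r"Coil:|Connector:|Channels faulted:|First channel:")

def select_fields(record, code):
    """Return the wanted lines of one record, or nothing if the code is absent."""
    if any(code.search(item) for item in record):
        return [item for item in record if WANTED.search(item)]
    return []

def matching_lines(lines, error_code):
    """Yield wanted lines from every record that mentions error_code."""
    code = re.compile(r"\b" + re.escape(str(error_code)) + r"\b")
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("EN "):      # end-of-record marker
            yield from select_fields(record, code)
            record = []
        else:
            record.append(line)
    yield from select_fields(record, code)  # trailing record without an EN line

sample = [
    "SR 120",
    "1414760452 0 1 Fri Oct 31 13:00:52 2014 2218714 4",
    "GROVEMR2 scn",
    "New Exam Started",
    "EN 120",
    "Mon Nov 24 13:43:37 2014 2274021 1",
    "PSD: VIBRANT Coil: Breast SMI Scan: 1106/14",
    "Coil Fault - Short Circuit",
    "Connector: Port 1 (P1)",
    "Channels faulted: 0x200",
    "First channel: 10 of 32, counting from 1",
    "Fault value: -2499 mV, Channel: 10->",
]

for hit in matching_lines(sample, 2274021):
    print(hit)
```

For a real log, pass the open file object, for example matching_lines(open_file, 2274021) inside a with open("ERRORLOG") block.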

Python 3: Can't properly encode and print a downloaded string with /xXX literals

So here's the problem. I want to, for example, download and print a list of all possible languages from
https://www.fanfiction.net/game/Pok%C3%A9mon/
(visible under the 'Filters' button).
In the HTML, it's represented as the following series of options:
<option value='17' >Svenska<option value='31' >čeština<option value='10' >Русский
<option value='39' >देवनागरी<option value='38' >ภาษาไทย<option value='5' >中文<option value='6' >日本語
I download it using the urllib.request package:
def getByUrl(self,url):
    response = urllib.request.urlopen(url)
    html = response.read()
    return html
and then, I try to display it like this:
#staticmethod
def fromCollection_getPossibleLanguages(self,pageContent):
    parsedHtml = BeautifulSoup(pageContent)
    possibleMatches = parsedHtml.findAll('select',{'name':'languageid','class':'filter_select'})
    possibleMatches = possibleMatches[0].findAll('option')
    for match in possibleMatches:
        print(str(match.text.encode('unicode')) + " - " + str(match.get('value')))
However, all my attempts to play with the .encode() function (e.g. passing 'utf-8' or 'unicode' as the argument) have failed to display anything more than, for example:
b'\xd0\xa0\xd1\x83\xd1\x81\xd1\x81\xd0\xba\xd0\xb8\xd0\xb9' - 10
I'm displaying it in the Mac OS X terminal and in Eclipse's console view, with the same result in both.

You don't need to encode at all. BeautifulSoup has already decoded the response bytes to Unicode values, and print() can take care of the rest here.
However, the page is malformed, as there are no closing </option> tags. This can confuse the standard HTML parser. Install lxml or the html5lib package, and the page can be parsed correctly:
parsedHtml = BeautifulSoup(pageContent, 'lxml')
or
parsedHtml = BeautifulSoup(pageContent, 'html5lib')
Next, you can select the <option> tags with one CSS selector:
possibleMatches = parsedHtml.select('select[name=languageid] option')
for match in possibleMatches:
    print(match.text, "-", match.get('value'))
Demo:
>>> possibleMatches = soup.select('select[name=languageid] option')
>>> for match in possibleMatches:
...     print(match.text, "-", match.get('value'))
...
Language - 0
Bahasa Indonesia - 32
Català - 34
Deutsch - 4
Eesti - 41
English - 1
Español - 2
Esperanto - 22
Filipino - 21
Français - 3
Italiano - 11
Język polski - 13
LINGUA LATINA - 35
Magyar - 14
Nederlands - 7
Norsk - 18
Português - 8
Română - 27
Suomi - 20
Svenska - 17
čeština - 31
Русский - 10
देवनागरी - 39
ภาษาไทย - 38
中文 - 5
日本語 - 6
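To see why the original attempt printed b'\xd0\xa0...', it helps to look at the round trip between str and bytes on its own. A small sketch using only the standard library; the byte string is the one from the question, and the variable names are arbitrary:

```python
# The bytes printed in the question are the UTF-8 encoding of one language name.
raw = b'\xd0\xa0\xd1\x83\xd1\x81\xd1\x81\xd0\xba\xd0\xb8\xd0\xb9'

text = raw.decode('utf-8')   # bytes -> str
print(text)

# Encoding goes the other way (str -> bytes), and str() of a bytes object
# returns its repr, which is where the b'\x..' output comes from.
shown = str(text.encode('utf-8'))
print(shown)
```

So .encode() moves away from printable text rather than toward it. Note also that 'unicode' is not a codec name in Python 3; passing it to .encode() raises LookupError.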
